Migrate Data with Debezium

Debezium is a self-hosted distributed platform that can read data from a variety of sources and import it into Kafka. You can use Debezium to migrate data to CockroachDB from another database that is accessible over the public internet.

As of this writing, Debezium supports the following database sources:

MongoDB
MySQL
PostgreSQL
SQL Server
Oracle
Db2
Cassandra
Vitess (incubating)
Spanner (incubating)
JDBC (incubating)

Note:

Migrating with Debezium requires familiarity with Kafka. Refer to the Debezium documentation for information on how Debezium is deployed with Kafka Connect.

Before you begin

Complete the following items before using Debezium:

Configure a secure, publicly accessible CockroachDB cluster running the latest v23.2 production release with at least one SQL user. Make a note of the credentials for the SQL user.
Install and configure Debezium, Kafka Connect, and Kafka. This documentation assumes you have already added data from your source database to a Kafka topic.

Migrate data to CockroachDB

Once all of the prerequisite steps are completed, you can use Debezium to migrate data to CockroachDB.

To write data from Kafka to CockroachDB, use the Confluent JDBC Sink Connector. First, use the following Dockerfile to create a custom image with the JDBC driver:

FROM quay.io/debezium/connect:latest
ENV KAFKA_CONNECT_JDBC_DIR=$KAFKA_CONNECT_PLUGINS_DIR/kafka-connect-jdbc

# Driver and connector versions; override with --build-arg to pin specific releases
ARG POSTGRES_VERSION=latest
ARG KAFKA_JDBC_VERSION=latest

# Deploy PostgreSQL JDBC Driver
RUN cd /kafka/libs && curl -sO https://jdbc.postgresql.org/download/postgresql-$POSTGRES_VERSION.jar

# Deploy Kafka Connect JDBC
RUN mkdir $KAFKA_CONNECT_JDBC_DIR && cd $KAFKA_CONNECT_JDBC_DIR &&\
   curl -sO https://packages.confluent.io/maven/io/confluent/kafka-connect-jdbc/$KAFKA_JDBC_VERSION/kafka-connect-jdbc-$KAFKA_JDBC_VERSION.jar

Create the JSON configuration file that you will use to create the sink:

{
   "name": "pg-sink",
   "config": {
       "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
       "tasks.max": "10",
       "topics": "{topic.example.table}",
       "connection.url": "jdbc:postgresql://{host}:{port}/{database}?sslmode=require",
       "connection.user": "{username}",
       "connection.password": "{password}",
       "insert.mode": "upsert",
       "pk.mode": "record_value",
       "pk.fields": "id",
       "database.time_zone": "UTC",
       "auto.create": true,
       "auto.evolve": false,
       "transforms": "unwrap",
       "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState"
   }
}

Specify the Connection URL in JDBC format. For information about where to find the CockroachDB connection parameters, see Connect to a CockroachDB Cluster.
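
As an illustration of the JDBC format, the following minimal Python sketch assembles a connection URL from its parts. The function name `cockroach_jdbc_url` and the example values (`localhost`, port `26257`, database `defaultdb`) are assumptions for illustration only; substitute the parameters for your own cluster. The path segment after the port names the database to connect to.

```python
from urllib.parse import quote, urlencode


def cockroach_jdbc_url(host: str, port: int, database: str, sslmode: str = "require") -> str:
    """Return a PostgreSQL-style JDBC URL for a CockroachDB cluster.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Percent-encode the database name so special characters stay valid in the URL path.
    path = quote(database, safe="")
    # Encode query parameters such as sslmode=require.
    query = urlencode({"sslmode": sslmode})
    return f"jdbc:postgresql://{host}:{port}/{path}?{query}"


# Example values are assumptions; use your cluster's host, port, and database.
url = cockroach_jdbc_url("localhost", 26257, "defaultdb")
```

The resulting string is what you would place in the `connection.url` field of the sink configuration.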

To create the sink, POST the JSON configuration file to the Kafka Connect /connectors endpoint. Refer to the Kafka Connect API documentation for more information.
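
One way to send that request is sketched below using only the Python standard library. This is a sketch under stated assumptions, not the only approach: the function names are hypothetical, and the Kafka Connect address (`http://localhost:8083`) and file name (`pg-sink.json`) in the usage comment are placeholders for your deployment.

```python
import json
import urllib.request


def build_create_request(connect_url: str, config_path: str) -> urllib.request.Request:
    """Build a POST request that registers the connector described in config_path."""
    with open(config_path, "rb") as f:
        body = f.read()
    # Fail early with a clear error if the file is not valid JSON.
    json.loads(body)
    return urllib.request.Request(
        url=connect_url.rstrip("/") + "/connectors",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def create_sink(connect_url: str, config_path: str) -> dict:
    """Send the request and return the parsed JSON response from Kafka Connect."""
    request = build_create_request(connect_url, config_path)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (assumed address and file name):
#   create_sink("http://localhost:8083", "pg-sink.json")
```

Separating request construction from sending makes it possible to inspect the URL, headers, and body before anything is sent to the Kafka Connect worker.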
