佰聆 (Bailing) Practice: OceanBase Data Migration Guide

About the author: Zhang Jie, big data development engineer.

 

I. Background

By working through the migration of data from MySQL to OceanBase, get familiar with the various methods and tools for migrating business table structures and data into an OceanBase database.

II. Exercises

1. Use mysqldump to synchronize MySQL table structures and data into an OceanBase MySQL tenant.

2. Use DataX to configure offline synchronization of at least one table from MySQL to an OceanBase MySQL tenant.

3. Use DataX to configure offline synchronization of at least one table from OceanBase to CSV, and from CSV back to OceanBase.

4. Use Canal to configure incremental synchronization from MySQL to an OceanBase MySQL tenant.

III. Data Migration with mysqldump

1. Environment preparation

· Install tpcc-mysql

wget http://imysql.com/wp-content/uploads/2014/09/tpcc-mysql-src.tgz
gunzip tpcc-mysql-src.tgz
tar xf tpcc-mysql-src.tar
cd tpcc-mysql/src 
make


· Load the test data into the tpcc database in MySQL

[root@ob1 tpcc-mysql]# ./tpcc_load 172.20.10.10:3306 tpcc root 123456 1


· View the tables in the test database

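For example, the tables created and populated by tpcc_load can be listed directly (table set assumes the standard tpcc-mysql schema):

# list the tables in the tpcc database
mysql -h 172.20.10.10 -P3306 -uroot -p123456 -e "SHOW TABLES FROM tpcc;"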

2. Export the table structures with mysqldump

mysqldump -h 127.1 -uroot -P3306 -p123456 -d tpcc --compact > tpcc_ddl.sql


3. Export the table data with mysqldump

mysqldump -h 127.1 -uroot -P3306 -p123456 -t tpcc > tpcc_data.sql


4. Import the table structures into OceanBase

mysql -h 172.20.10.10 -u root@sys#obdemo -P 2883 -p123456 -c -A tpcc
source tpcc_ddl.sql


5. Import the table data into OceanBase

·source tpcc_data.sql


· Verify that the data has been synchronized

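A simple spot check is to compare row counts between the source MySQL and the OceanBase tenant; the table names below assume the standard tpcc-mysql schema:

-- on the source MySQL
SELECT COUNT(*) FROM tpcc.warehouse;
SELECT COUNT(*) FROM tpcc.customer;

-- on the OceanBase MySQL tenant (same connection as above)
SELECT COUNT(*) FROM tpcc.warehouse;
SELECT COUNT(*) FROM tpcc.customer;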

IV. Data Migration with DataX

1. Environment preparation

wget http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
tar -zxvf datax.tar.gz


2. Delete the hidden files under the plugin directories (the bundled ._* files otherwise cause plugin-loading errors)

cd /usr/local/soft/datax/datax/plugin/reader && rm -rf .*
cd /usr/local/soft/datax/datax/plugin/writer && rm -rf .*

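If running rm -rf .* inside those directories feels risky, a narrower alternative (a sketch; the path assumes the extraction directory used above) is to delete only the ._* files shipped in the tarball:

# remove only the macOS "._*" resource files that break plugin loading
find /usr/local/soft/datax/datax/plugin -type f -name "._*" -print -exec rm -f {} +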

3. Generate the MySQL-to-OceanBase job configuration template

python ./bin/datax.py -r mysqlreader -w oceanbasev10writer > job/mysql2ob.json


4. Edit the job configuration

vim job/mysql2ob.json
{
    "job": {
        "setting": {
            "speed": {
                "channel": 4
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.1
            }
        },
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "column": ["*"],
                        "connection": [
                            {
                                "table": ["student"],
                                "jdbcUrl": ["jdbc:mysql://172.20.10.10:3306/test?useUnicode=true&characterEncoding=utf8"]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "oceanbasev10writer",
                    "parameter": {
                        "obWriteMode": "insert",
                        "column": ["*"],
                        "preSql": ["truncate table student"],
                        "connection": [
                            {
                                "jdbcUrl": "||_dsc_ob10_dsc_||obdemo:sys||_dsc_ob10_dsc_||jdbc:oceanbase://172.20.10.10:2883/test?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
                                "table": ["student"]
                            }
                        ],
                        "username": "root",
                        "password": "123456",
                        "writerThreadCount": 10,
                        "batchSize": 1000,
                        "memstoreThreshold": "0.9"
                    }
                }
            }
        ]
    }
}

5. Run the job

· Start the synchronization job

python ./bin/datax.py ./job/mysql2ob.json


· The synchronization completes

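Besides the summary DataX prints, the result can be verified on the OceanBase side; the table and connection below follow the job configuration above:

# count the rows landed in the OceanBase MySQL tenant (via obproxy)
mysql -h 172.20.10.10 -P2883 -uroot@sys#obdemo -p123456 -e "SELECT COUNT(*) FROM test.student;"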

V. CSV Export and Import for OceanBase with DataX

1. Export data from OceanBase to CSV

· Generate the DataX job template

python bin/datax.py -r oceanbasev10reader -w txtfilewriter > job/oceanbase2csv.json


· Change the permissions of the JSON file

chmod 777 oceanbase2csv.json


· Edit the job configuration

vim job/oceanbase2csv.json
{
    "job": {
        "setting": {
            "speed": {
                "channel": 4
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.1
            }
        },
        "content": [
            {
                "reader": {
                    "name": "oceanbasev10reader",
                    "parameter": {
                        "column": ["stu_no","stu_name","stu_addr","stu_class"],
                        "connection": [
                            {
                                "jdbcUrl": ["||_dsc_ob10_dsc_||obdemo:sys||_dsc_ob10_dsc_||jdbc:oceanbase://172.20.10.10:2883/test?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true"],
                                "table": ["student"]
                            }
                        ],
                        "password": "123456",
                        "username": "root",
                        "where": ""
                    }
                },
                "writer": {
                    "name": "txtfilewriter",
                    "parameter": {
                        "dateFormat": "yyyy-MM-dd hh:mm:ss",
                        "fieldDelimiter": ",",
                        "fileName": "ob2csv",
                        "path": "/usr/local/soft/mysql_dump",
                        "writeMode": "truncate"
                    }
                }
            }
        ]
    }
}

· Run the job

python bin/datax.py job/oceanbase2csv.json


· Check the result

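txtfilewriter writes its output under path, using fileName as a prefix and appending a suffix of its own, so the exported rows can be inspected with something like:

# the exported file name starts with the configured fileName prefix
ls -l /usr/local/soft/mysql_dump/
cat /usr/local/soft/mysql_dump/ob2csv*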

2. Import CSV data into OceanBase

· Generate the DataX job template

python bin/datax.py -w oceanbasev10writer -r txtfilereader > job/csv2oceanbase.json


· Change the permissions of the JSON file

chmod 777 csv2oceanbase.json


· Edit the job configuration

vim job/csv2oceanbase.json
{
    "job": {
        "setting": {
            "speed": {
                "channel": 4
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.1
            }
        },
        "content": [
            {
                "reader": {
                    "name": "txtfilereader",
                    "parameter": {
                        "path": ["/usr/local/soft/mysql_dump"],
                        "fileName": "ob2csv",
                        "encoding": "UTF-8",
                        "column": ["*"],
                        "dateFormat": "yyyy-MM-dd hh:mm:ss",
                        "nullFormat": "\\N",
                        "fieldDelimiter": ","
                    }
                },
                "writer": {
                    "name": "oceanbasev10writer",
                    "parameter": {
                        "obWriteMode": "insert",
                        "column": ["stu_no","stu_name","stu_addr","stu_class"],
                        "preSql": ["truncate table student"],
                        "connection": [
                            {
                                "jdbcUrl": "||_dsc_ob10_dsc_||obdemo:sys||_dsc_ob10_dsc_||jdbc:oceanbase://172.20.10.10:2883/test?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
                                "table": ["student"]
                            }
                        ],
                        "username": "root",
                        "password": "123456",
                        "writerThreadCount": 10,
                        "batchSize": 100,
                        "memstoreThreshold": "0.9"
                    }
                }
            }
        ]
    }
}

· Run the job

python bin/datax.py job/csv2oceanbase.json


· Check the result


· Note:

path must point to a directory that contains only the file(s) to be imported; if other files are present there, the job fails with an error.

VI. Incremental Synchronization from MySQL to OceanBase with Canal

1. MySQL environment preparation

· Enable binlog

[mysqld]
log-bin=mysql-bin   # enable binlog
binlog-format=ROW   # use ROW format
server_id=1         # required for MySQL replication; must not clash with canal's slaveId

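After restarting MySQL so the settings take effect, confirm that binlog is enabled and in ROW format:

-- run on the source MySQL
SHOW VARIABLES LIKE 'log_bin';
SHOW VARIABLES LIKE 'binlog_format';
SHOW MASTER STATUS;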

· Grant account privileges

I used the root account here; alternatively, create a dedicated canal account and grant it the required privileges:

CREATE USER canal IDENTIFIED BY 'canal';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
-- GRANT ALL PRIVILEGES ON *.* TO 'canal'@'%';
FLUSH PRIVILEGES;

2. Deploy canal-admin

· Download and extract canal-admin

wget https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.admin-1.1.5.tar.gz
tar -zxvf canal.admin-1.1.5.tar.gz

· Edit the configuration file

cd canal-admin/conf
vim application.yml

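As a reference, conf/application.yml in canal-admin 1.1.5 has roughly the following shape; point spring.datasource at the MySQL instance that will hold the canal_manager metadata database (the address, credentials, and web port below are illustrative):

server:
  port: 8080                       # web console port (the shipped default is 8089)
spring:
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8

spring.datasource:
  address: 172.20.10.10:3306       # MySQL that stores the canal_manager schema
  database: canal_manager
  username: root
  password: 123456
  driver-class-name: com.mysql.jdbc.Driver
  url: jdbc:mysql://${spring.datasource.address}/${spring.datasource.database}?useUnicode=true&characterEncoding=UTF-8&useSSL=false
  hikari:
    maximum-pool-size: 30
    minimum-idle: 1

canal:
  adminUser: admin
  adminPasswd: admin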

· Initialize the metadata database

mysql -uroot -P3306 -p123456
source conf/canal_manager.sql


· Start canal-admin

cd canal/canal-admin/bin/ && ./startup.sh


· Access the canal-admin web console

http://127.0.0.1:8080/
Username: admin
Password: 123456


3. Deploy canal-deployer

· Download and extract canal-deployer

wget https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.deployer-1.1.5.tar.gz
tar -zxvf canal.deployer-1.1.5.tar.gz

· Edit the configuration file

When canal-admin is used to manage the server and instances, conf/canal_local.properties replaces conf/canal.properties. The manager address in conf/canal.properties needs to be updated; the other parameters can keep their default values.

mv canal_local.properties canal.properties && vim canal.properties

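A sketch of the relevant part of conf/canal.properties (originally canal_local.properties); the key change is pointing canal.admin.manager at the canal-admin service (the address and web port below are illustrative and must match your canal-admin deployment):

# ip to register with canal-admin (leave empty to auto-detect)
canal.register.ip =

# canal-admin address and credentials
canal.admin.manager = 172.20.10.10:8080
canal.admin.port = 11110
canal.admin.user = admin
canal.admin.passwd = 4ACFE3202A5FF5CF467898FC58AAB1D615029441

# register this server with the admin automatically
canal.admin.register.auto = true
canal.admin.register.cluster =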

· Start the canal-deployer service

cd ~/canal/canal-deployer/bin/ && ./startup.sh


4. Configure the canal server and canal instance through the canal-admin web console

· Create and start the canal server


· Create and start the canal instance


· Load the instance template and adjust the corresponding parameters

#################################################
## mysql serverId , v1.0.26+ will autoGen
canal.instance.mysql.slaveId=1000

# enable gtid use true/false
canal.instance.gtidon=false

# position info
canal.instance.master.address=172.20.10.10:3306
canal.instance.master.journal.name=
canal.instance.master.position=
canal.instance.master.timestamp=
canal.instance.master.gtid=

# rds oss binlog
canal.instance.rds.accesskey=
canal.instance.rds.secretkey=
canal.instance.rds.instanceId=

# table meta tsdb info
canal.instance.tsdb.enable=true
#canal.instance.tsdb.url=jdbc:mysql://127.0.0.1:3306/canal_tsdb
#canal.instance.tsdb.dbUsername=canal
#canal.instance.tsdb.dbPassword=canal

#canal.instance.standby.address =
#canal.instance.standby.journal.name =
#canal.instance.standby.position =
#canal.instance.standby.timestamp =
#canal.instance.standby.gtid=

# username/password
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
canal.instance.connectionCharset = UTF-8
# enable druid Decrypt database password
canal.instance.enableDruid=false
#canal.instance.pwdPublicKey=MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBALK4BUxdDltRRE5/zXpVEVPUgunvscYFtEip3pmLlhrWpacX7y7GCMo2/JM6LeHmiiNdH1FWgGCpUfircSwlWKUCAwEAAQ==

# table regex
canal.instance.filter.regex=.*\\..*
# table black regex
canal.instance.filter.black.regex=
# table field filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)
#canal.instance.filter.field=test1.t_product:id/subject/keywords,test2.t_company:id/name/contact/ch
# table field black filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)
#canal.instance.filter.black.field=test1.t_product:subject/product_image,test2.t_company:id/name/contact/ch

# mq config
canal.mq.topic=example
# dynamic topic route by schema or table regex
#canal.mq.dynamicTopic=mytest1.user,mytest2\\..*,.*\\..*
canal.mq.partition=0
# hash partition config
#canal.mq.partitionsNum=3
#canal.mq.partitionHash=test.table:id^name,.*\\..*
#################################################

 

5. Deploy canal-adapter

· Download and extract canal-adapter

wget https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.adapter-1.1.5.tar.gz
tar -zxvf canal.adapter-1.1.5.tar.gz

· Edit the configuration file

cd ~/canal/canal-adapter/conf/ && vim application.yml


server:
  port: 8081
spring:
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8
    default-property-inclusion: non_null

canal.conf:
  mode: tcp #tcp kafka rocketMQ rabbitMQ
  flatMessage: true
  zookeeperHosts:
  syncBatchSize: 1000
  retries: 0
  timeout:
  accessKey:
  secretKey:
  consumerProperties:
    # canal tcp consumer
    canal.tcp.server.host: 172.20.10.10:11111
    canal.tcp.zookeeper.hosts:
    canal.tcp.batch.size: 500
    canal.tcp.username: admin
    canal.tcp.password: 4ACFE3202A5FF5CF467898FC58AAB1D615029441

  canalAdapters:
  - instance: mysqldb # canal instance Name or mq topic name
    groups:
    - groupId: group1
      outerAdapters:
      - name: logger
      - name: rdb
        key: oceanbase1
        properties:
          jdbc.driverClassName: com.mysql.jdbc.Driver
          jdbc.url: jdbc:mysql://172.20.10.10:2883/test?useUnicode=true
          jdbc.username: root@sys#obdemo
          jdbc.password: 123456

· Configure the mapping file

cd ~/canal/canal-adapter/conf/rdb/ && vim mytest_user.yml

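A minimal sketch of the rdb mapping file, assuming mirror-db mode so that DDL such as CREATE TABLE is replicated along with row changes; destination, groupId, and outerAdapterKey must match the instance name, groupId, and rdb key configured above:

dataSourceKey: defaultDS
destination: mysqldb        # canal instance name
groupId: group1             # groupId from application.yml
outerAdapterKey: oceanbase1 # rdb adapter key from application.yml
concurrent: true
dbMapping:
  mirrorDb: true            # mirror the whole schema, including DDL
  database: test            # source database to mirror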

· Start the canal-adapter service

cd ~/canal/canal-adapter/bin/ && ./startup.sh


6. Synchronization test

· Create a test table in MySQL

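For example, run something like the following on the source MySQL (the table name and columns are illustrative), then check that the same table and rows appear in the OceanBase tenant:

-- on the source MySQL (database "test" is the one being synchronized)
CREATE TABLE test.teacher (
  id   INT PRIMARY KEY,
  name VARCHAR(50)
);
INSERT INTO test.teacher VALUES (1, 'tom');

-- then on the OceanBase MySQL tenant
SHOW TABLES FROM test;
SELECT * FROM test.teacher;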

· The table is created in OceanBase by the synchronization


· Check the logs


