当前位置:网站首页>DataX 的使用
DataX 的使用
2022-08-02 14:05:00 【boyzwz】
一、DataX 的部署
1、上传 datax 压缩包并解压
tar -zxvf datax.tar.gz -C /usr/local/soft/
2、自检,执行命令(在datax目录下)
[[email protected] datax]# python ./bin/datax.py ./job/job.json
安装成功
二、DataX 的使用
MySQL写入MySQL
1、生成模板命令
[[email protected] datax]# python ./bin/datax.py -r mysqlreader -w mysqlwriter
2、根据读写的数据源,获取json模板;可根据官网修改 json 完成数据的同步
{
"job": {
"content": [{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": [
"id",
"name",
"age",
"gender",
"clazz",
"last_mod"
],
"connection": [{
"jdbcUrl": ["jdbc:mysql://master:3306/student"],
"table": ["student"]
}],
"password": "123456",
"username": "root"
}
},
"writer": {
"name": "mysqlwriter",
"parameter": {
"column": [
"id",
"name",
"age",
"gender",
"clazz",
"last_mod"
],
"connection": [{
"jdbcUrl": "jdbc:mysql://master:3306/student2?useUnicode=true&characterEncoding=utf8",
"table": ["student2"]
}],
"preSql": [
"truncate table student2"
],
"password": "123456",
"username": "root",
"writeMode": "insert"
}
}
}],
"setting": {
"speed": {
"channel": "5"
}
}
}
}
3、执行
[[email protected] dataxjsons]# datax.py mysql2mysql.json
MySQL写入HDFS
1、生成模板
[[email protected] dataxjsons]# python /usr/local/soft/datax/bin/datax.py -r mysqlreader -w hdfswriter
2、修改
{
"job": {
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": ["*"],
"connection": [
{
"jdbcUrl": ["jdbc:mysql://master:3306/student"],
"table": ["student"]
}
],
"password": "123456",
"username": "root"
}
},
"writer": {
"name": "hdfswriter",
"parameter": {
"column": [
{
"name": "col1",
"type": "int"
},
{
"name": "col2",
"type": "String"
},
{
"name": "col3",
"type": "int"
},
{
"name": "col4",
"type": "String"
},
{
"name": "col5",
"type": "String"
},
{
"name": "col6",
"type": "Date"
}
],
"defaultFS": "hdfs://master:9000",
"fieldDelimiter": ",",
"fileName": "msql2hdfs",
"fileType": "text",
"path": "/shujia/bigdata17/datax/",
"writeMode": "append"
}
}
}
],
"setting": {
"speed": {
"channel": "1"
}
}
}
}
3、执行
[[email protected] dataxjsons]# datax.py mysql2hdfs.json
MySQL同步数据到Hive
1、hive建表
CREATE EXTERNAL TABLE IF NOT EXISTS student2(
id BIGINT,
name STRING,
age INT,
gender STRING,
clazz STRING,
last_mod STRING
)
comment '学生表'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
2、生成模板
[[email protected] dataxjsons]# python /usr/local/soft/datax/bin/datax.py -r mysqlreader -w hdfswriter
3、修改
{
"job": {
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": ["*"],
"connection": [
{
"jdbcUrl": ["jdbc:mysql://master:3306/student"],
"table": ["student"]
}
],
"password": "123456",
"username": "root"
}
},
"writer": {
"name": "hdfswriter",
"parameter": {
"column": [
{
"name": "id",
"type": "bigint"
},
{
"name": "name",
"type": "string"
},
{
"name": "age",
"type": "INT"
},
{
"name": "gender",
"type": "string"
},
{
"name": "clazz",
"type": "string"
},
{
"name": "last_mod",
"type": "string"
}
],
"defaultFS": "hdfs://master:9000",
"fieldDelimiter": ",",
"fileName": "student2",
"fileType": "text",
"path": "/user/hive/warehouse/bigdata17.db/student2/",
"writeMode": "append"
}
}
}
],
"setting": {
"speed": {
"channel": "1"
}
}
}
}
4、执行
[[email protected] dataxjsons]# datax.py mysql2hive.json
向Hive中同步数据,即向HDFS上Hive表目录下同步数据
增量同步,可在where后添加条件("where": "id > 7")
Mysql向HBase同步数据
1、hbase创建表
hbase(main):003:0> create 'datastudent','info'
2、生成模板
[[email protected] dataxjsons]# python /usr/local/soft/datax/bin/datax.py -r mysqlreader -w hbase11xwriter
3、修改
{
"job": {
"content": [{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": [
"id",
"name",
"age",
"gender",
"clazz",
"last_mod"
],
"connection": [{
"jdbcUrl": ["jdbc:mysql://master:3306/student"],
"table": ["student"]
}],
"password": "123456",
"username": "root"
}
},
"writer": {
"name": "hbase11xwriter",
"parameter": {
"column": [{
"index": 1,
"name": "info:name",
"type": "string"
},
{
"index": 2,
"name": "info:age",
"type": "int"
},
{
"index": 3,
"name": "info:gender",
"type": "string"
},
{
"index": 5,
"name": "info:last_mod",
"type": "string"
}
],
"encoding": "utf-8",
"hbaseConfig": {
"hbase.zookeeper.quorum": "master:2181,node1:2181,node2:2181"
},
"mode": "normal",
"rowkeyColumn": [{
"index": 0,
"type": "string"
},
{
"index": -1,
"type": "string",
"value": "_"
},
{
"index": 4,
"type": "string"
}
],
"table": "datastudent"
}
}
}],
"setting": {
"speed": {
"channel": "5"
}
}
}
}
4、执行
[[email protected] dataxjsons]# datax.py mysql2hbase.json
HBase同步数据到MySQL
1、生成模板
python /usr/local/soft/datax/bin/datax.py -r hbase11xreader -w mysqlwriter
2、修改
{
"job": {
"content": [
{
"reader": {
"name": "hbase11xreader",
"parameter": {
"column": [
{
"name": "rowkey",
"type": "string"
},
{
"name": "info: name",
"type": "string"
},
{
"name": "info: age",
"type": "int"
},
{
"name": "info: gender",
"type": "string"
},
{
"name": "info: last_mod",
"type": "string"
},
],
"encoding": "utf-8",
"hbaseConfig": {
"hbase.zookeeper.quorum": "master:2181,node1:2181,node2:2181"
},
"mode": "normal",
"table": "datastudent"
}
},
"writer": {
"name": "mysqlwriter",
"parameter": {
"column": [
"id",
"name",
"age",
"gender",
"last_mod"
],
"connection": [
{
"jdbcUrl": "jdbc:mysql://master:3306/student?useUnicode=true&characterEncoding=utf8",
"table": ["student_copy1"]
}
],
"password": "123456",
"username": "root",
"writeMode": "append"
}
}
}
],
"setting": {
"speed": {
"channel": "5"
}
}
}
}
3、执行
[[email protected] dataxjsons]# datax.py hbase2mysql.json
可在MySQL中查看从hbase同步过来的数据
边栏推荐
猜你喜欢
随机推荐
getUserProfile接口不显示用户性别和地区
MarkDown syntax summary
mysql常用函数
drf view component
存储系统Cache(知识点+例题)
二进制乘法运算
Raj delivery notes - separation 第08 speak, speaking, reading and writing
Flask framework in-depth two
二级指针,数组指针,指针数组和函数指针
Visual Studio配置OpenCV之后,提示:#include<opencv2/opencv.hpp>无法打开源文件
ToF相机从Camera2 API中获取DEPTH16格式深度图
verilog学习|《Verilog数字系统设计教程》夏宇闻 第三版思考题答案(第十章)
drf routing component Routers
科创知识年度盛会,中国科创者大会8月6日首场开幕!
C语言日记 5、7setprecision()问题
VS Code远程开发及免密配置
C语言一维数组练习——将一个字符串中的某个字符替换成其它字符
芝诺悖论的理解
每周招聘|PostgreSQL专家,年薪60+,高能力高薪资
MySQL知识总结 (四) 事务