Using DataX
2022-08-02 14:05:00 【boyzwz】
I. Deploying DataX
1. Upload the DataX archive and extract it
tar -zxvf datax.tar.gz -C /usr/local/soft/
2. Self-check: run the following command from the datax directory
python ./bin/datax.py ./job/job.json
If the sample job finishes without errors, the installation succeeded.
II. Using DataX
MySQL to MySQL
1. Generate a template
python ./bin/datax.py -r mysqlreader -w mysqlwriter
2. The command prints a JSON template for the chosen reader and writer; edit it according to the official documentation to define the sync job
{
"job": {
"content": [{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": [
"id",
"name",
"age",
"gender",
"clazz",
"last_mod"
],
"connection": [{
"jdbcUrl": ["jdbc:mysql://master:3306/student"],
"table": ["student"]
}],
"password": "123456",
"username": "root"
}
},
"writer": {
"name": "mysqlwriter",
"parameter": {
"column": [
"id",
"name",
"age",
"gender",
"clazz",
"last_mod"
],
"connection": [{
"jdbcUrl": "jdbc:mysql://master:3306/student2?useUnicode=true&characterEncoding=utf8",
"table": ["student2"]
}],
"preSql": [
"truncate table student2"
],
"password": "123456",
"username": "root",
"writeMode": "insert"
}
}
}],
"setting": {
"speed": {
"channel": "5"
}
}
}
}
3. Run the job (from the dataxjsons directory)
datax.py mysql2mysql.json
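Writing the job file by hand makes it easy for the reader and writer column lists to drift apart. The following is a minimal sketch, not part of DataX, that builds the same MySQL to MySQL job from a single column list. The function name `build_mysql_to_mysql_job` and its parameters are hypothetical; the keys and values mirror the job file shown above.

```python
import json


def build_mysql_to_mysql_job(columns, src_url, src_table, dst_url, dst_table,
                             username, password, channel=5):
    """Return a DataX job dict that copies `columns` from src_table to dst_table."""
    reader = {
        "name": "mysqlreader",
        "parameter": {
            "column": list(columns),
            # In the job file above, the reader's jdbcUrl is a list.
            "connection": [{"jdbcUrl": [src_url], "table": [src_table]}],
            "username": username,
            "password": password,
        },
    }
    writer = {
        "name": "mysqlwriter",
        "parameter": {
            # Same list as the reader, so the two sides cannot get out of sync.
            "column": list(columns),
            # In the job file above, the writer's jdbcUrl is a plain string.
            "connection": [{"jdbcUrl": dst_url, "table": [dst_table]}],
            "preSql": ["truncate table " + dst_table],
            "username": username,
            "password": password,
            "writeMode": "insert",
        },
    }
    return {
        "job": {
            "content": [{"reader": reader, "writer": writer}],
            "setting": {"speed": {"channel": str(channel)}},
        }
    }


job = build_mysql_to_mysql_job(
    columns=["id", "name", "age", "gender", "clazz", "last_mod"],
    src_url="jdbc:mysql://master:3306/student",
    src_table="student",
    dst_url="jdbc:mysql://master:3306/student2?useUnicode=true&characterEncoding=utf8",
    dst_table="student2",
    username="root",
    password="123456",
)
job_json = json.dumps(job, indent=2)  # text that could be saved as mysql2mysql.json
```

Saving `job_json` to a file gives a job equivalent to the hand-written one, and changing the column list in one place updates both sides.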
MySQL to HDFS
1. Generate a template
python /usr/local/soft/datax/bin/datax.py -r mysqlreader -w hdfswriter
2. Edit the template
{
"job": {
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": ["*"],
"connection": [
{
"jdbcUrl": ["jdbc:mysql://master:3306/student"],
"table": ["student"]
}
],
"password": "123456",
"username": "root"
}
},
"writer": {
"name": "hdfswriter",
"parameter": {
"column": [
{
"name": "col1",
"type": "int"
},
{
"name": "col2",
"type": "String"
},
{
"name": "col3",
"type": "int"
},
{
"name": "col4",
"type": "String"
},
{
"name": "col5",
"type": "String"
},
{
"name": "col6",
"type": "Date"
}
],
"defaultFS": "hdfs://master:9000",
"fieldDelimiter": ",",
"fileName": "msql2hdfs",
"fileType": "text",
"path": "/shujia/bigdata17/datax/",
"writeMode": "append"
}
}
}
],
"setting": {
"speed": {
"channel": "1"
}
}
}
}
3. Run the job (from the dataxjsons directory)
datax.py mysql2hdfs.json
MySQL to Hive
1. Create the Hive table
CREATE EXTERNAL TABLE IF NOT EXISTS student2(
id BIGINT,
name STRING,
age INT,
gender STRING,
clazz STRING,
last_mod STRING
)
COMMENT 'student table'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
2. Generate a template
python /usr/local/soft/datax/bin/datax.py -r mysqlreader -w hdfswriter
3. Edit the template
{
"job": {
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": ["*"],
"connection": [
{
"jdbcUrl": ["jdbc:mysql://master:3306/student"],
"table": ["student"]
}
],
"password": "123456",
"username": "root"
}
},
"writer": {
"name": "hdfswriter",
"parameter": {
"column": [
{
"name": "id",
"type": "bigint"
},
{
"name": "name",
"type": "string"
},
{
"name": "age",
"type": "INT"
},
{
"name": "gender",
"type": "string"
},
{
"name": "clazz",
"type": "string"
},
{
"name": "last_mod",
"type": "string"
}
],
"defaultFS": "hdfs://master:9000",
"fieldDelimiter": ",",
"fileName": "student2",
"fileType": "text",
"path": "/user/hive/warehouse/bigdata17.db/student2/",
"writeMode": "append"
}
}
}
],
"setting": {
"speed": {
"channel": "1"
}
}
}
}
4. Run the job (from the dataxjsons directory)
datax.py mysql2hive.json
Syncing data into Hive amounts to writing files into the Hive table's directory on HDFS.
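Because the hdfswriter block has to agree with the Hive table (same columns, same delimiter, the table's directory as the path), it can be derived from the table definition. This is a rough sketch under two assumptions: the table lives in a non-default database under the default warehouse root `/user/hive/warehouse`, and it is a comma-delimited text table like the one created above. The helper name `hive_table_to_hdfswriter` is hypothetical.

```python
def hive_table_to_hdfswriter(db, table, columns, default_fs,
                             warehouse="/user/hive/warehouse"):
    """Build an hdfswriter block that writes into a Hive table's directory.

    `columns` is a list of (name, type) pairs in the same order as the
    Hive table definition.
    """
    return {
        "name": "hdfswriter",
        "parameter": {
            "column": [{"name": name, "type": col_type} for name, col_type in columns],
            "defaultFS": default_fs,
            # Must match FIELDS TERMINATED BY in the Hive DDL.
            "fieldDelimiter": ",",
            "fileName": table,
            # Must match STORED AS TEXTFILE in the Hive DDL.
            "fileType": "text",
            # Assumed layout: <warehouse>/<db>.db/<table>/
            "path": "{}/{}.db/{}/".format(warehouse, db, table),
            "writeMode": "append",
        },
    }


writer = hive_table_to_hdfswriter(
    db="bigdata17",
    table="student2",
    columns=[
        ("id", "bigint"),
        ("name", "string"),
        ("age", "int"),
        ("gender", "string"),
        ("clazz", "string"),
        ("last_mod", "string"),
    ],
    default_fs="hdfs://master:9000",
)
```

If the warehouse root or the table location has been changed in the Hive configuration, pass the real directory instead of relying on the assumed layout.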
For incremental sync, add a where condition to the mysqlreader parameter block, for example "where": "id > 7".
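One way to apply the where condition without editing the file by hand is to patch a loaded job in Python. The sketch below is illustrative only: `with_where` is a hypothetical helper, and `base_job` is a cut-down stand-in for the job file, keeping only the reader fields needed to show the idea.

```python
import copy

# Cut-down stand-in for the job file; a real job also needs the writer and setting blocks.
base_job = {
    "job": {
        "content": [{
            "reader": {
                "name": "mysqlreader",
                "parameter": {
                    "column": ["*"],
                    "connection": [{
                        "jdbcUrl": ["jdbc:mysql://master:3306/student"],
                        "table": ["student"],
                    }],
                },
            },
        }],
    },
}


def with_where(job, condition):
    """Return a copy of `job` whose first reader filters rows by `condition`."""
    patched = copy.deepcopy(job)  # leave the original job untouched
    patched["job"]["content"][0]["reader"]["parameter"]["where"] = condition
    return patched


incremental_job = with_where(base_job, "id > 7")
```

The boundary value (7 here) would normally come from the largest id already synced, which has to be tracked outside DataX.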
MySQL to HBase
1. Create the HBase table
hbase(main):003:0> create 'datastudent','info'
2. Generate a template
python /usr/local/soft/datax/bin/datax.py -r mysqlreader -w hbase11xwriter
3. Edit the template
{
"job": {
"content": [{
"reader": {
"name": "mysqlreader",
"parameter": {
"column": [
"id",
"name",
"age",
"gender",
"clazz",
"last_mod"
],
"connection": [{
"jdbcUrl": ["jdbc:mysql://master:3306/student"],
"table": ["student"]
}],
"password": "123456",
"username": "root"
}
},
"writer": {
"name": "hbase11xwriter",
"parameter": {
"column": [{
"index": 1,
"name": "info:name",
"type": "string"
},
{
"index": 2,
"name": "info:age",
"type": "int"
},
{
"index": 3,
"name": "info:gender",
"type": "string"
},
{
"index": 5,
"name": "info:last_mod",
"type": "string"
}
],
"encoding": "utf-8",
"hbaseConfig": {
"hbase.zookeeper.quorum": "master:2181,node1:2181,node2:2181"
},
"mode": "normal",
"rowkeyColumn": [{
"index": 0,
"type": "string"
},
{
"index": -1,
"type": "string",
"value": "_"
},
{
"index": 4,
"type": "string"
}
],
"table": "datastudent"
}
}
}],
"setting": {
"speed": {
"channel": "5"
}
}
}
}
4. Run the job (from the dataxjsons directory)
datax.py mysql2hbase.json
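The rowkeyColumn setting in the job above joins source field 0 (id), the constant "_" (index -1), and source field 4 (clazz) into one rowkey. The following sketch imitates that concatenation in plain Python so the resulting key shape is easy to see. It is an illustration of the configuration's intent, not DataX code, and the sample row is made up.

```python
def build_rowkey(record, rowkey_columns):
    """Concatenate rowkey parts in the order they are configured.

    An entry with index -1 contributes its constant "value";
    any other entry contributes the source field at that index.
    """
    parts = []
    for column in rowkey_columns:
        if column["index"] == -1:
            parts.append(column["value"])
        else:
            parts.append(str(record[column["index"]]))
    return "".join(parts)


# Same rowkeyColumn list as in the job file above.
rowkey_columns = [
    {"index": 0, "type": "string"},
    {"index": -1, "type": "string", "value": "_"},
    {"index": 4, "type": "string"},
]

# Made-up sample row in reader column order:
# id, name, age, gender, clazz, last_mod
record = [1001, "Alice", 22, "F", "class1", "2022-08-02"]

rowkey = build_rowkey(record, rowkey_columns)
print(rowkey)  # prints 1001_class1
```

This also explains why clazz (index 4) does not appear in the writer's column list: in this job it is stored only as part of the rowkey.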
HBase to MySQL
1. Generate a template
python /usr/local/soft/datax/bin/datax.py -r hbase11xreader -w mysqlwriter
2. Edit the template
{
"job": {
"content": [
{
"reader": {
"name": "hbase11xreader",
"parameter": {
"column": [
{
"name": "rowkey",
"type": "string"
},
{
"name": "info:name",
"type": "string"
},
{
"name": "info:age",
"type": "int"
},
{
"name": "info:gender",
"type": "string"
},
{
"name": "info:last_mod",
"type": "string"
}
],
"encoding": "utf-8",
"hbaseConfig": {
"hbase.zookeeper.quorum": "master:2181,node1:2181,node2:2181"
},
"mode": "normal",
"table": "datastudent"
}
},
"writer": {
"name": "mysqlwriter",
"parameter": {
"column": [
"id",
"name",
"age",
"gender",
"last_mod"
],
"connection": [
{
"jdbcUrl": "jdbc:mysql://master:3306/student?useUnicode=true&characterEncoding=utf8",
"table": ["student_copy1"]
}
],
"password": "123456",
"username": "root",
"writeMode": "insert"
}
}
}
],
"setting": {
"speed": {
"channel": "5"
}
}
}
}
3. Run the job (from the dataxjsons directory)
datax.py hbase2mysql.json
The rows synced from HBase can then be queried in MySQL.
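Hand-edited job files pick up small syntax slips easily, such as a stray comma before a closing bracket. As a minimal sketch, the check below parses a job file's text with Python's standard json module and confirms that each content entry names a reader and a writer. The function `check_job_text` is hypothetical and only covers these basic points; it does not validate plugin parameters.

```python
import json


def check_job_text(text):
    """Return (ok, message) for the text of a DataX job file."""
    try:
        job = json.loads(text)
    except json.JSONDecodeError as exc:
        return False, "invalid JSON at line {}: {}".format(exc.lineno, exc.msg)
    if not isinstance(job, dict) or not isinstance(job.get("job"), dict):
        return False, "top-level 'job' object is missing"
    content = job["job"].get("content")
    if not isinstance(content, list) or not content:
        return False, "job.content is missing or empty"
    for i, item in enumerate(content):
        for side in ("reader", "writer"):
            plugin = item.get(side) if isinstance(item, dict) else None
            if not isinstance(plugin, dict) or "name" not in plugin:
                return False, "content[{}] has no {} name".format(i, side)
    return True, "ok"


good = '{"job": {"content": [{"reader": {"name": "hbase11xreader"}, "writer": {"name": "mysqlwriter"}}]}}'
bad = '{"job": {"content": [{"reader": {"name": "hbase11xreader"},}]}}'  # stray comma

good_result = check_job_text(good)
bad_result = check_job_text(bad)
```

For a file on disk, read its contents with `open(path).read()` and pass the text to `check_job_text` before invoking datax.py.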