Use Sqoop to export ADS layer data to MySQL
2022-07-02 12:15:00 【Small base o_ O】
Background
- Use Sqoop to export ADS layer data to MySQL
- When using sqoop export, add --columns to avoid some strange errors
- Use regular expressions to get the field names
Process
- ADS-layer tables are not partitioned, not compressed, and stored in row format
- The CREATE TABLE SQL for the ADS layer is kept in a separate file; whenever a table changes, the CREATE TABLE statement in that file must be updated too
- Table name: Hive tables in the ADS layer have the ads_ prefix; the corresponding MySQL table is created without the prefix
- Fields: field names and field order must be the same in the ADS-layer table and the MySQL table; wrap field names in backticks (`)
- Iterate over the ADS-layer CREATE TABLE statements and use regular expressions to extract the table name and all field names
- Assemble the Sqoop command
Code
ADS-layer CREATE TABLE statements (ADS Floor construction table .sql)
-- Hive CREATE TABLE statement: wrap field names in backticks; the table name does not need to be wrapped
CREATE EXTERNAL TABLE ads_purchase_order_info (
    `prch_order_id` BIGINT COMMENT 'Purchase order header id',
    `exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
    `insert_time` STRING COMMENT 'Data insertion date'
) COMMENT 'Purchasing information';
MySQL CREATE TABLE statement
CREATE TABLE purchase_order_info (
    `prch_order_id` BIGINT COMMENT 'Purchase order header id',
    `exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
    `insert_time` TEXT COMMENT 'Data insertion date',
    PRIMARY KEY (`prch_order_id`)
) COMMENT 'Purchasing information';
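The rule that field names and field order must match between the Hive table and the MySQL table can be checked mechanically. The following is a minimal sketch, assuming each column definition starts its own line with a backticked name, as in the two statements above. The helper names `column_names` and `same_columns` are hypothetical and are not part of the author's code.

```python
import re

# The two statements from this post, with the trailing semicolons dropped.
HIVE_DDL = """CREATE EXTERNAL TABLE ads_purchase_order_info (
`prch_order_id` BIGINT COMMENT 'Purchase order header id',
`exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
`insert_time` STRING COMMENT 'Data insertion date'
) COMMENT 'Purchasing information'"""

MYSQL_DDL = """CREATE TABLE purchase_order_info (
`prch_order_id` BIGINT COMMENT 'Purchase order header id',
`exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
`insert_time` TEXT COMMENT 'Data insertion date',
PRIMARY KEY (`prch_order_id`)
) COMMENT 'Purchasing information'"""


def column_names(ddl):
    """Return backticked names that begin a line, in order.

    Anchoring at the start of a line skips the backticked name inside
    PRIMARY KEY (...), which is not a column definition.
    """
    return re.findall(r'^\s*`([^`]+)`', ddl, flags=re.MULTILINE)


def same_columns(hive_ddl, mysql_ddl):
    """True when both statements declare the same columns in the same order."""
    return column_names(hive_ddl) == column_names(mysql_ddl)


print(column_names(MYSQL_DDL))
print(same_columns(HIVE_DDL, MYSQL_DDL))
```

Running a check like this before the export catches a renamed or reordered column early, instead of letting it surface as a Sqoop error.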
Python
class Sqoop(Shell):
    # Shell is the author's own base class and is not shown in this post;
    # the code below relies on its get(), [] lookup and sh_cmd_and_alert().

    def sqoop(self, cmd):
        # Collapse all whitespace so the multi-line template becomes a one-line command
        return self.sh_cmd_and_alert(' '.join(cmd.split()))

    def sqoop_export(self, mysql_tb, export_dir, columns='', update_mode='allowinsert', update_key='prch_order_id'):
        """
        --columns      defaults to all columns; passing it explicitly is recommended to avoid some inexplicable bugs
        --update-mode  defaults to updateonly; can be changed to allowinsert
        --update-key   the anchor column(s) for updates; separate multiple columns with commas
        """
        # NOTE: update_mode and update_key are accepted and documented, but the template
        # as posted does not pass --update-mode / --update-key to Sqoop.
        return self.sqoop(r'''
            {sqoop} export
            --connect jdbc:mysql://{host}:{port}/{database}
            --username '{username}' --password '{password}'
            --table {table} --num-mappers 1
            --input-fields-terminated-by '\001'
            --input-null-string '\\N' --input-null-non-string '\\N'
            --export-dir '{export_dir}' {columns}
        '''.format(
            sqoop=self.get('sqoop', 'sqoop'),
            host=self.get('mysql_host', 'localhost'),
            port=self.get('mysql_port', '3306'),
            database=self['mysql_db'],
            username=self.get('mysql_user', 'root'),
            password=self['mysql_pwd'],
            table=mysql_tb,
            export_dir=export_dir,
            columns=columns,
        ))
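from re import findall, sub

# get_sqoop, read_sql_file and EXPORT_DIR_PREFIX are the author's own helpers and are not shown in this post
s = get_sqoop()
for ads_ddl in read_sql_file('ADS Floor construction table .sql').split(';')[:-1]:
    # Every backticked name in the statement is treated as a field name
    columns = '--columns ' + ','.join(findall('`([^`]+)`', ads_ddl))
    hive_tb = findall(r'CREATE EXTERNAL TABLE (\S+)', ads_ddl)[0]
    # Strip only the leading ads_ prefix; str.replace would also remove any later "ads_" in the name
    mysql_tb = sub(r'^ads_', '', hive_tb)
    print(s.sqoop_export(mysql_tb, EXPORT_DIR_PREFIX + hive_tb, columns))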
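The parsing and command-building steps can also be tried on their own, without the author's Shell, get_sqoop and read_sql_file helpers. This is a sketch under stated assumptions, not the author's implementation. The names `parse_ads_ddl` and `build_export_args` and the value of `EXPORT_DIR_PREFIX` are hypothetical. Connection flags are left out, and nothing is executed. The sketch differs from the script above in two ways: it removes `--` line comments before matching, so a stray backtick in a comment cannot shift the field list, and it passes `--update-key` and `--update-mode`, using the first column as the key purely for illustration.

```python
import re

# Hypothetical HDFS location of the ADS tables; adjust to the real warehouse path.
EXPORT_DIR_PREFIX = '/warehouse/ads/'

# In practice this text would come from the SQL file; here it is inlined.
ADS_DDL = """
-- Hive CREATE TABLE statement, fields wrapped with the ` symbol
CREATE EXTERNAL TABLE ads_purchase_order_info (
`prch_order_id` BIGINT COMMENT 'Purchase order header id',
`exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
`insert_time` STRING COMMENT 'Data insertion date'
) COMMENT 'Purchasing information';
"""


def parse_ads_ddl(ddl):
    """Return (hive_table, mysql_table, columns), or None if no table is declared."""
    # Drop `--` line comments first (assumes "--" never occurs inside a COMMENT string).
    ddl = re.sub(r'--[^\n]*', '', ddl)
    match = re.search(r'CREATE\s+EXTERNAL\s+TABLE\s+(\S+)', ddl)
    if match is None:
        return None
    hive_tb = match.group(1)
    mysql_tb = re.sub(r'^ads_', '', hive_tb)  # remove the prefix only
    columns = re.findall(r'`([^`]+)`', ddl)
    return hive_tb, mysql_tb, columns


def build_export_args(mysql_tb, export_dir, columns, update_key, update_mode='allowinsert'):
    """Build a `sqoop export` argument list (connection flags omitted, not executed)."""
    return [
        'sqoop', 'export',
        '--table', mysql_tb,
        '--export-dir', export_dir,
        '--columns', ','.join(columns),
        '--update-key', update_key,
        '--update-mode', update_mode,
    ]


for statement in ADS_DDL.split(';'):
    parsed = parse_ads_ddl(statement)
    if parsed is None:
        continue  # empty tail after the last semicolon
    hive_tb, mysql_tb, columns = parsed
    # Assumption for this sketch: the first column is the update key.
    args = build_export_args(mysql_tb, EXPORT_DIR_PREFIX + hive_tb, columns, update_key=columns[0])
    print(' '.join(args))
```

Building an argument list rather than one formatted string keeps each value separate, so it can be inspected or handed to a process launcher without extra quoting.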