Use Sqoop to export ADS layer data to MySQL
2022-07-02 12:15:00 【Small base o_ O】
Background
- Use Sqoop to export ADS layer data to MySQL
- When using sqoop export, add --columns to avoid some strange errors
- Use regular expressions to get the field names
Workflow
- ADS layer tables are not partitioned, not compressed, and use row storage
- The CREATE TABLE SQL for the ADS layer is kept in a separate file; if a table changes, the CREATE TABLE statement in that file must be updated
- Table name: Hive tables in the ADS layer have the ads_ prefix; the corresponding MySQL table is created without the prefix
- Fields: the field names and field order of the ADS layer table and the MySQL table must be consistent; wrap field names in backticks
- Traverse the ADS layer CREATE TABLE statements and use regular expressions to get the table name and all field names
- Build the Sqoop command
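The regular-expression step above can be sketched as follows. This is a minimal illustration, not the author's code: the function name parse_ads_ddl and the sample DDL are assumptions made for the example, and the sketch strips "--" line comments first because a lone backtick inside a comment would otherwise be paired with the backtick of the first field name.

```python
import re

def parse_ads_ddl(ddl):
    """Return (hive_table, mysql_table, columns) for one Hive CREATE TABLE statement.

    Hypothetical helper for illustration only.
    """
    # Drop "--" line comments so a stray backtick in a comment is never
    # mistaken for the start of a field name.
    ddl = re.sub(r'--[^\n]*', '', ddl)
    match = re.search(r'CREATE\s+EXTERNAL\s+TABLE\s+([\w.]+)', ddl)
    if match is None:
        raise ValueError('no CREATE EXTERNAL TABLE statement found')
    hive_tb = match.group(1)
    # Remove only the leading ads_ prefix to get the MySQL table name.
    mysql_tb = hive_tb[len('ads_'):] if hive_tb.startswith('ads_') else hive_tb
    # Every backtick-wrapped name that remains is a field name, in table order.
    columns = re.findall(r'`([^`]+)`', ddl)
    return hive_tb, mysql_tb, columns

SAMPLE_DDL = """
-- field names use the ` symbol
CREATE EXTERNAL TABLE ads_purchase_order_info (
    `prch_order_id` BIGINT COMMENT 'Purchase order header id',
    `exfactory_total_price` DOUBLE COMMENT 'Total ex factory price',
    `insert_time` STRING COMMENT 'Data insertion date'
) COMMENT 'Purchasing information'
"""

hive_tb, mysql_tb, columns = parse_ads_ddl(SAMPLE_DDL)
print(hive_tb, mysql_tb, '--columns ' + ','.join(columns))
```

Slicing off the prefix instead of replacing every occurrence of ads_ keeps table names intact when the same substring appears again later in the name.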
Code
ADS layer CREATE TABLE statements (file: ADS Floor construction table .sql)
-- Hive CREATE TABLE statement; field names are wrapped in backticks, the table name does not need to be wrapped
CREATE EXTERNAL TABLE ads_purchase_order_info (
    `prch_order_id` BIGINT COMMENT 'Purchase order header id',
    `exfactory_total_price` DOUBLE COMMENT 'Total ex factory price',
    `insert_time` STRING COMMENT 'Data insertion date'
) COMMENT 'Purchasing information';
MySQL CREATE TABLE statement
CREATE TABLE purchase_order_info (
    `prch_order_id` BIGINT COMMENT 'Purchase order header id',
    `exfactory_total_price` DOUBLE COMMENT 'Total ex factory price',
    `insert_time` TEXT COMMENT 'Data insertion date',
    PRIMARY KEY (`prch_order_id`)
) COMMENT 'Purchasing information';
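Because the export relies on the Hive and MySQL tables having the same field names in the same order, it can help to check that before exporting. The sketch below is one possible check, not part of the original workflow; column_names and same_columns are hypothetical names. It only counts backtick-wrapped names that begin a line, so the name inside PRIMARY KEY (...) is not counted as a column.

```python
import re

def column_names(ddl):
    """Backtick-wrapped names that start a line, i.e. column definitions only."""
    return re.findall(r'^\s*`([^`]+)`', ddl, flags=re.MULTILINE)

def same_columns(hive_ddl, mysql_ddl):
    """True when both statements declare the same fields in the same order."""
    return column_names(hive_ddl) == column_names(mysql_ddl)

HIVE_DDL = """CREATE EXTERNAL TABLE ads_purchase_order_info (
    `prch_order_id` BIGINT,
    `exfactory_total_price` DOUBLE,
    `insert_time` STRING
)"""

MYSQL_DDL = """CREATE TABLE purchase_order_info (
    `prch_order_id` BIGINT,
    `exfactory_total_price` DOUBLE,
    `insert_time` TEXT,
    PRIMARY KEY (`prch_order_id`)
)"""

if not same_columns(HIVE_DDL, MYSQL_DDL):
    raise SystemExit('Hive and MySQL column lists differ')
```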
Python
# Shell is a helper base class defined elsewhere (not shown); it supplies
# get(), item access for settings, and sh_cmd_and_alert().
class Sqoop(Shell):
    def sqoop(self, cmd):
        # Collapse all whitespace so the multi-line template becomes one command line
        return self.sh_cmd_and_alert(' '.join(cmd.split()))

    def sqoop_export(self, mysql_tb, export_dir, columns='', update_mode='allowinsert', update_key='prch_order_id'):
        """
        --columns: defaults to all columns; passing it explicitly is recommended to avoid hard-to-explain bugs
        --update-mode: defaults to updateonly; can be changed to allowinsert
        --update-key: the anchor column(s) used for updates; separate multiple columns with commas
        """
        # update_key and update_mode were accepted but never passed to Sqoop; they are now part of the command
        return self.sqoop(r'''
            {sqoop} export
            --connect jdbc:mysql://{host}:{port}/{database}
            --username '{username}' --password '{password}'
            --table {table} --num-mappers 1
            --input-fields-terminated-by '\001'
            --input-null-string '\\N' --input-null-non-string '\\N'
            --export-dir '{export_dir}'
            --update-key {update_key} --update-mode {update_mode}
            {columns}
        '''.format(
            sqoop=self.get('sqoop', 'sqoop'),
            host=self.get('mysql_host', 'localhost'),
            port=self.get('mysql_port', '3306'),
            database=self['mysql_db'],
            username=self.get('mysql_user', 'root'),
            password=self['mysql_pwd'],
            table=mysql_tb,
            export_dir=export_dir,
            columns=columns,
            update_key=update_key,
            update_mode=update_mode,
        ))
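from re import findall

# get_sqoop(), read_sql_file() and EXPORT_DIR_PREFIX are defined elsewhere (not shown)
s = get_sqoop()
# Every statement ends with ';', so the piece after the final ';' holds no statement and is dropped
for ads_ddl in read_sql_file('ADS Floor construction table .sql').split(';')[:-1]:
    # Each backtick-wrapped name is a field name, in table order
    columns = '--columns ' + ','.join(findall('`([^`]+)`', ads_ddl))
    hive_tb = findall(r'CREATE EXTERNAL TABLE (\S+)', ads_ddl)[0]
    # Remove only the first ads_ (the prefix), not later occurrences inside the name
    mysql_tb = hive_tb.replace('ads_', '', 1)
    print(s.sqoop_export(mysql_tb, EXPORT_DIR_PREFIX + hive_tb, columns))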
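For readers without the Shell helper class, the same export command can be assembled as an argument list using only the standard library. This is a sketch under stated assumptions rather than the author's implementation: build_sqoop_export, the database name, the export directory and the password-file path are all hypothetical, and --password-file is swapped in for --password so the password does not appear in the printed command.

```python
import shlex

def build_sqoop_export(table, export_dir, columns, update_key=None,
                       update_mode='allowinsert', host='localhost', port=3306,
                       database='example_db', username='root',
                       password_file='/user/example/mysql.pwd'):
    """Return the sqoop export command as an argument list (hypothetical helper)."""
    argv = [
        'sqoop', 'export',
        '--connect', 'jdbc:mysql://{}:{}/{}'.format(host, port, database),
        '--username', username,
        '--password-file', password_file,  # assumption: the source uses --password
        '--table', table,
        '--num-mappers', '1',
        '--input-fields-terminated-by', r'\001',
        '--input-null-string', r'\\N',
        '--input-null-non-string', r'\\N',
        '--export-dir', export_dir,
        '--columns', ','.join(columns),
    ]
    if update_key:
        # Only switch to update/upsert behaviour when an anchor column is given.
        argv += ['--update-key', update_key, '--update-mode', update_mode]
    return argv

argv = build_sqoop_export(
    'purchase_order_info',
    '/example/warehouse/ads_purchase_order_info',  # hypothetical HDFS path
    ['prch_order_id', 'exfactory_total_price', 'insert_time'],
    update_key='prch_order_id',
)
# Dry run: print the command, quoted for a shell, instead of executing it.
print(' '.join(shlex.quote(arg) for arg in argv))
```

Passing a list like this to subprocess.run(argv, check=True) avoids shell quoting problems with values such as \001 and \\N.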