Use Sqoop to export ADS-layer data to MySQL
2022-07-02 12:15:00 【Small base o_ O】
Background
- Use Sqoop to export ADS-layer data to MySQL
- Add --columns when running sqoop export to avoid some strange errors
- Use regular expressions to get the field names
Process
- ADS-layer tables are not partitioned, not compressed, and use row-oriented storage
- The ADS-layer CREATE TABLE statements are kept in a separate SQL file; whenever a table changes, its statement in that file must be updated
- Table names: Hive tables in the ADS layer have the ads_ prefix; the corresponding MySQL table is created without the prefix
- Fields: field names and field order must be the same in the ADS-layer table and the MySQL table; wrap field names in backticks
- Iterate over the ADS-layer CREATE TABLE statements and use regular expressions to get the table name and all field names
- Generate the Sqoop command
Code
ADS-layer CREATE TABLE statements (ADS Floor construction table .sql)
-- Hive CREATE TABLE statement: wrap field names in backticks; the table name does not need to be wrapped
CREATE EXTERNAL TABLE ads_purchase_order_info (
    `prch_order_id` BIGINT COMMENT 'Purchase order header id',
    `exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
    `insert_time` STRING COMMENT 'Data insertion date'
) COMMENT 'Purchasing information';
MySQL CREATE TABLE statement
CREATE TABLE purchase_order_info (
    `prch_order_id` bigint COMMENT 'Purchase order header id',
    `exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
    `insert_time` text COMMENT 'Data insertion date',
    PRIMARY KEY (`prch_order_id`)
) COMMENT 'Purchasing information';
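The requirement that the Hive and MySQL tables share field names and field order can be checked mechanically before exporting. The following is a minimal sketch, not part of the original post: the helper names `column_names` and `same_columns` are hypothetical, and it assumes each column definition starts its own line with a backtick-quoted name, as in the two statements above.

```python
import re

# A column definition is a line that starts with a backtick-quoted name.
# Anchoring at the line start skips constraint lines such as PRIMARY KEY (`id`).
COLUMN_RE = re.compile(r'^[ \t]*`([^`]+)`', re.MULTILINE)


def column_names(ddl):
    """Return the column names of one CREATE TABLE statement, in declaration order."""
    return COLUMN_RE.findall(ddl)


def same_columns(hive_ddl, mysql_ddl):
    """True when both statements declare the same column names in the same order."""
    return column_names(hive_ddl) == column_names(mysql_ddl)


HIVE_DDL = """
CREATE EXTERNAL TABLE ads_purchase_order_info (
    `prch_order_id` BIGINT COMMENT 'Purchase order header id',
    `exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
    `insert_time` STRING COMMENT 'Data insertion date'
) COMMENT 'Purchasing information'
"""

MYSQL_DDL = """
CREATE TABLE purchase_order_info (
    `prch_order_id` bigint COMMENT 'Purchase order header id',
    `exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
    `insert_time` text COMMENT 'Data insertion date',
    PRIMARY KEY (`prch_order_id`)
) COMMENT 'Purchasing information'
"""

if __name__ == '__main__':
    print(column_names(HIVE_DDL))
    print(same_columns(HIVE_DDL, MYSQL_DDL))
```

Only names and order are compared here; type compatibility (for example STRING versus text) is not checked.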
Python
class Sqoop(Shell):
    def sqoop(self, cmd):
        return self.sh_cmd_and_alert(' '.join(cmd.split()))

    def sqoop_export(self, mysql_tb, export_dir, columns='', update_mode='allowinsert', update_key='prch_order_id'):
        """
        --columns: defaults to all columns; passing it explicitly is recommended to avoid some hard-to-explain bugs
        --update-mode: defaults to updateonly; can be changed to allowinsert
        --update-key: the anchor column(s) for updates; separate multiple columns with commas
        Note: update_mode and update_key are accepted but are not added to the command below.
        """
        return self.sqoop(r'''
            {sqoop} export
            --connect jdbc:mysql://{host}:{port}/{database}
            --username '{username}' --password '{password}'
            --table {table} --num-mappers 1
            --input-fields-terminated-by '\001'
            --input-null-string '\\N' --input-null-non-string '\\N'
            --export-dir '{export_dir}'
            {columns}
        '''.format(
            sqoop=self.get('sqoop', 'sqoop'),
            host=self.get('mysql_host', 'localhost'),
            port=self.get('mysql_port', '3306'),
            database=self['mysql_db'],
            username=self.get('mysql_user', 'root'),
            password=self['mysql_pwd'],
            table=mysql_tb,
            export_dir=export_dir,
            columns=columns,
        ))
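The docstring above mentions --update-key and --update-mode, which are real sqoop export options. As a rough sketch of how they could be passed, the command can be built as an argument list. The helper `build_export_args` and all sample values are hypothetical and are not the author's code; by default no update key is sent, so the plain insert behaviour is kept unless the caller asks for an upsert.

```python
import shlex


def build_export_args(jdbc_url, username, password, table, export_dir,
                      columns=None, update_key=None, update_mode='allowinsert'):
    """Return a sqoop export command as a list of arguments (hypothetical helper)."""
    args = [
        'sqoop', 'export',
        '--connect', jdbc_url,
        '--username', username,
        '--password', password,
        '--table', table,
        '--num-mappers', '1',
        # Same literal values that the single-quoted shell strings in the post deliver to sqoop
        '--input-fields-terminated-by', r'\001',
        '--input-null-string', r'\\N',
        '--input-null-non-string', r'\\N',
        '--export-dir', export_dir,
    ]
    if columns:
        args += ['--columns', ','.join(columns)]
    if update_key:
        # Only add the update options when the caller names a key column
        args += ['--update-key', update_key, '--update-mode', update_mode]
    return args


if __name__ == '__main__':
    # All values below are made-up sample values
    demo = build_export_args(
        'jdbc:mysql://localhost:3306/demo', 'root', 'secret',
        'purchase_order_info', '/warehouse/ads/ads_purchase_order_info',
        columns=['prch_order_id', 'exfactory_total_price', 'insert_time'],
        update_key='prch_order_id',
    )
    # Quote each argument so the printed line can be pasted into a shell
    print(' '.join(shlex.quote(arg) for arg in demo))
```

A list like this can also be handed directly to subprocess.run, which avoids shell quoting issues when a password contains special characters.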
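from re import findall

s = get_sqoop()
for ads_ddl in read_sql_file('ADS Floor construction table .sql').split(';')[:-1]:
    # Every backtick-quoted name in the Hive statement is a field name
    columns = '--columns ' + ','.join(findall('`([^`]+)`', ads_ddl))
    hive_tb = findall(r'CREATE EXTERNAL TABLE (\S+)', ads_ddl)[0]
    # Strip only the leading ads_ prefix; replace('ads_', '') would also remove later occurrences
    mysql_tb = hive_tb[len('ads_'):] if hive_tb.startswith('ads_') else hive_tb
    print(s.sqoop_export(mysql_tb, EXPORT_DIR_PREFIX + hive_tb, columns))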
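The loop above relies on helpers that the post does not show (get_sqoop, read_sql_file, EXPORT_DIR_PREFIX). The parsing step on its own can be sketched in a self-contained way as follows. The function names and the sample SQL are hypothetical, and the sketch assumes that statements are separated by ';' and that no COMMENT string contains a ';' or a backtick.

```python
import re

TABLE_RE = re.compile(r'CREATE\s+EXTERNAL\s+TABLE\s+([^\s(]+)', re.IGNORECASE)
COLUMN_RE = re.compile(r'`([^`]+)`')
PREFIX = 'ads_'


def strip_sql_comments(sql):
    """Drop whole-line '--' comments so a stray backtick or ';' in them cannot confuse the parsing."""
    return '\n'.join(
        line for line in sql.splitlines() if not line.lstrip().startswith('--')
    )


def parse_ads_ddl(sql_text):
    """Yield (hive_table, mysql_table, columns) for each CREATE EXTERNAL TABLE statement."""
    for statement in strip_sql_comments(sql_text).split(';'):
        match = TABLE_RE.search(statement)
        if match is None:
            continue  # empty chunk after the last ';' or an unrelated statement
        hive_tb = match.group(1)
        # Remove the prefix only when it is at the start of the name
        mysql_tb = hive_tb[len(PREFIX):] if hive_tb.startswith(PREFIX) else hive_tb
        yield hive_tb, mysql_tb, COLUMN_RE.findall(statement)


SAMPLE_SQL = """
-- Hive DDL; field names are wrapped in the ` symbol
CREATE EXTERNAL TABLE ads_purchase_order_info (
    `prch_order_id` BIGINT COMMENT 'Purchase order header id',
    `insert_time` STRING COMMENT 'Data insertion date'
) COMMENT 'Purchasing information';
CREATE EXTERNAL TABLE ads_loads_info (
    `load_id` BIGINT
);
"""

if __name__ == '__main__':
    for hive_tb, mysql_tb, columns in parse_ads_ddl(SAMPLE_SQL):
        print(hive_tb, mysql_tb, '--columns ' + ','.join(columns))
```

Stripping comment lines first matters because a single backtick inside a comment would otherwise pair with the first field's opening backtick and shift every later match.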