Use Sqoop to export ADS-layer data to MySQL
2022-07-02 12:15:00 【Small base o_ O】
Background
- Use Sqoop to export ADS-layer data to MySQL
- When using sqoop export, add --columns to avoid some strange errors
- Use regular expressions to get the field names
Process
- ADS-layer tables are not partitioned, not compressed, and use row-based storage
- The ADS-layer CREATE TABLE statements live in a separate SQL file; whenever a table changes, its statement in that file must be updated too
- Table names: ADS-layer Hive tables carry the ads_ prefix; the corresponding MySQL table is created without the prefix
- Fields: the field names and field order of the ADS-layer table and the MySQL table should be identical, and field names are wrapped in backticks
- Iterate over the ADS-layer CREATE TABLE statements and use regular expressions to get the table name and all field names
- Pass them to the Sqoop command
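The parsing step above can be sketched as a small standalone function. This is a minimal sketch, assuming each statement follows the conventions listed (backtick-wrapped field names, an unwrapped table name followed by a space, and the ads_ prefix). The name `parse_ads_ddl` and the sample statement are illustrative and not part of the original code.

```python
import re

def parse_ads_ddl(ddl):
    """Return (hive_table, mysql_table, columns) for one Hive CREATE TABLE statement."""
    # Table name: the first non-space token after CREATE [EXTERNAL] TABLE
    hive_tb = re.search(r'CREATE\s+(?:EXTERNAL\s+)?TABLE\s+(\S+)', ddl).group(1)
    # Field names: every identifier wrapped in backticks, in declaration order
    columns = re.findall(r'`([^`]+)`', ddl)
    # MySQL table name: the Hive name without the leading ads_ prefix
    mysql_tb = re.sub(r'^ads_', '', hive_tb)
    return hive_tb, mysql_tb, columns

# Hypothetical sample statement, for illustration only
sample = """
CREATE EXTERNAL TABLE ads_demo_info (
`id` BIGINT COMMENT 'primary key',
`name` STRING COMMENT 'display name'
) COMMENT 'demo table'
"""
print(parse_ads_ddl(sample))
```

Anchoring the prefix pattern with `^` strips only a leading ads_, so a table name that happens to contain "ads_" elsewhere is left intact.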
Code
ADS-layer CREATE TABLE statements (ADS Floor construction table .sql)
-- Hive CREATE TABLE statement; field names are wrapped in backticks, the table name is not wrapped
CREATE EXTERNAL TABLE ads_purchase_order_info (
`prch_order_id` BIGINT COMMENT 'Purchase order header id',
`exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
`insert_time` STRING COMMENT 'Data insertion date'
) COMMENT 'Purchasing information';
MySQL CREATE TABLE statement
CREATE TABLE purchase_order_info (
`prch_order_id` bigint COMMENT 'Purchase order header id',
`exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
`insert_time` text COMMENT 'Data insertion date',
PRIMARY KEY (`prch_order_id`)
) COMMENT 'Purchasing information';
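Since the export relies on both tables declaring the same fields in the same order, a quick check before exporting can catch drift between the two statements. The following is a minimal sketch, assuming one field per line with each definition starting with a backtick-wrapped name, as in the two statements above. The helper name `declared_columns` is hypothetical.

```python
import re

def declared_columns(ddl):
    """Field names declared in a CREATE TABLE statement, in order.

    Only lines that start with a backtick-wrapped name count as field
    definitions, so key clauses such as PRIMARY KEY (`id`) are skipped.
    """
    return re.findall(r'^[ \t]*`([^`]+)`', ddl, flags=re.MULTILINE)

hive_ddl = """
CREATE EXTERNAL TABLE ads_purchase_order_info (
`prch_order_id` BIGINT COMMENT 'Purchase order header id',
`exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
`insert_time` STRING COMMENT 'Data insertion date'
) COMMENT 'Purchasing information'
"""

mysql_ddl = """
CREATE TABLE purchase_order_info (
`prch_order_id` bigint COMMENT 'Purchase order header id',
`exfactory_total_price` DOUBLE COMMENT 'Total ex-factory price',
`insert_time` text COMMENT 'Data insertion date',
PRIMARY KEY (`prch_order_id`)
) COMMENT 'Purchasing information'
"""

hive_cols = declared_columns(hive_ddl)
mysql_cols = declared_columns(mysql_ddl)
if hive_cols != mysql_cols:
    raise ValueError('Field mismatch: %s vs %s' % (hive_cols, mysql_cols))
print('Fields match:', hive_cols)
```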
Python
class Sqoop(Shell):
    def sqoop(self, cmd):
        # Collapse all whitespace so the multi-line template becomes one command line
        return self.sh_cmd_and_alert(' '.join(cmd.split()))

    def sqoop_export(self, mysql_tb, export_dir, columns='', update_mode='allowinsert', update_key='prch_order_id'):
        """
        --columns defaults to all columns; passing it explicitly is recommended to avoid some inexplicable bugs
        --update-mode defaults to updateonly and can be changed to allowinsert
        --update-key is the anchor column for updates; multiple columns are separated by commas
        """
        return self.sqoop(r'''
            {sqoop} export
            --connect jdbc:mysql://{host}:{port}/{database}
            --username '{username}' --password '{password}'
            --table {table} --num-mappers 1
            --input-fields-terminated-by '\001'
            --input-null-string '\\N' --input-null-non-string '\\N'
            --export-dir '{export_dir}'
            --update-key {update_key} --update-mode {update_mode}
            {columns}
        '''.format(
            sqoop=self.get('sqoop', 'sqoop'),
            host=self.get('mysql_host', 'localhost'),
            port=self.get('mysql_port', '3306'),
            database=self['mysql_db'],
            username=self.get('mysql_user', 'root'),
            password=self['mysql_pwd'],
            table=mysql_tb,
            export_dir=export_dir,
            update_key=update_key, update_mode=update_mode,  # previously accepted but never passed to sqoop
            columns=columns,
        ))
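from re import findall

# get_sqoop, read_sql_file and EXPORT_DIR_PREFIX are defined elsewhere and not shown here
s = get_sqoop()
# Every statement ends with ';', so the last split element is empty and is dropped
for ads_ddl in read_sql_file('ADS Floor construction table .sql').split(';')[:-1]:
    # Field names are the backtick-wrapped identifiers, in declaration order
    columns = '--columns ' + ','.join(findall('`([^`]+)`', ads_ddl))
    hive_tb = findall(r'CREATE EXTERNAL TABLE (\S+)', ads_ddl)[0]
    # Strip only the leading ads_ prefix; str.replace would also remove 'ads_' inside the name
    mysql_tb = hive_tb[len('ads_'):] if hive_tb.startswith('ads_') else hive_tb
    print(s.sqoop_export(mysql_tb, EXPORT_DIR_PREFIX + hive_tb, columns))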
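To inspect the generated command before anything runs, a dry-run builder can be handy. This is a minimal sketch rather than the author's implementation: `build_export_cmd`, the connection defaults, and the export directory are hypothetical placeholders. It uses the flags from the export template and its docstring, and leaves quoting to `shlex.join` (Python 3.8+) instead of hand-written single quotes.

```python
import shlex

def build_export_cmd(table, export_dir, columns, *, host='localhost', port='3306',
                     database='demo_db', username='root', password='secret',
                     update_key=None, update_mode='allowinsert'):
    """Assemble a `sqoop export` argument list without running it."""
    args = [
        'sqoop', 'export',
        '--connect', f'jdbc:mysql://{host}:{port}/{database}',
        '--username', username,
        '--password', password,
        '--table', table,
        '--num-mappers', '1',
        '--input-fields-terminated-by', r'\001',
        '--input-null-string', r'\\N',
        '--input-null-non-string', r'\\N',
        '--export-dir', export_dir,
        '--columns', ','.join(columns),
    ]
    # Update flags are only added when an anchor column is given
    if update_key:
        args += ['--update-key', update_key, '--update-mode', update_mode]
    return args

# Placeholder values, for illustration only
cmd = build_export_cmd(
    'purchase_order_info',
    '/warehouse/ads/ads_purchase_order_info',
    ['prch_order_id', 'exfactory_total_price', 'insert_time'],
    update_key='prch_order_id',
)
print(shlex.join(cmd))
```

Keeping the command as a list also makes it possible to pass it to `subprocess.run` without going through a shell, which avoids quoting problems with passwords that contain spaces or quotes.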