Exploring the native incremental import feature of Sqoop 1.4.4
2022-07-03 12:37:00 【Brother Xing plays with the clouds】
Original idea
Incremental import can be implemented without using Sqoop's native incremental feature at all: a shell script computes a fixed time range from the current time and then assembles the Sqoop command.
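A minimal sketch of that script-only approach, assuming GNU date and a one-hour window. The names WINDOW_HOURS and build_sqoop_cmd are hypothetical; the connection string, table, and LASTMODIFIED column are borrowed from the experiment later in this post. The script only prints the command instead of running it:

```shell
#!/usr/bin/env bash
# Sketch only: assemble a Sqoop import command for a fixed time window.
# WINDOW_HOURS and build_sqoop_cmd are illustrative names, not part of Sqoop.

# Print a Sqoop command that imports rows with start <= LASTMODIFIED < end.
build_sqoop_cmd() {
  local start="$1" end="$2"
  local fmt='YYYY-MM-DD HH24:MI:SS'
  local where="LASTMODIFIED >= TO_DATE('${start}', '${fmt}') AND LASTMODIFIED < TO_DATE('${end}', '${fmt}')"
  # -P makes Sqoop prompt for the password instead of taking it on the command line.
  printf 'sqoop import --connect %s --username %s -P --table %s --hive-import --hive-table %s --where "%s"\n' \
    'jdbc:oracle:thin:@192.168.0.138:1521:orcl' 'HIVE' 'FBI_SQOOPTEST' 'INCRETEST' "$where"
}

# Window: the last WINDOW_HOURS hours, ending now (GNU date syntax assumed).
WINDOW_HOURS=1
now_epoch="$(date '+%s')"
end_ts="$(date -d "@${now_epoch}" '+%Y-%m-%d %H:%M:%S')"
start_ts="$(date -d "@$((now_epoch - WINDOW_HOURS * 3600))" '+%Y-%m-%d %H:%M:%S')"

build_sqoop_cmd "$start_ts" "$end_ts"
```

With this approach the script itself is responsible for making consecutive windows line up; if a run is skipped or delayed, rows can be missed, which is the gap the native feature described next is meant to close.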
Introduction to the native incremental import feature
Sqoop provides a native incremental import feature, controlled by the following three key arguments:
| Argument | Description |
|---|---|
| --check-column (col) | Specifies the "check column" used to determine which rows fall within the incremental import. The column should not be a character type; a numeric or date column is preferable. |
| --incremental (mode) | Specifies the incremental mode: append, or lastmodified (this mode suits common needs better). |
| --last-value (value) | Specifies the upper bound of the check column from the previous import. If the check column is a last-modified timestamp, --last-value is the time at which the import was last executed. |
Combined with the saved jobs mechanism, the --last-value field is updated automatically each time the incremental job is executed again. Scheduled with cron or Oozie, this gives a true incremental update.
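The bookkeeping behind this can be pictured with a toy model in plain shell. This is not Sqoop code: run_once is a hypothetical function, and the timestamps are the ones that appear in the experiment log further down. Each run imports the half-open range from the stored value up to the current time, then stores the current time for the next run:

```shell
#!/usr/bin/env bash
# Toy model of how a saved incremental job advances --last-value.
# Not Sqoop itself; run_once is an illustrative name.

# Initial lower bound, as given when the job is created.
last_value='2014-08-27 13:00:00'

# Simulate one execution at time "$1": import [last_value, now), then store now.
run_once() {
  local now="$1"
  echo "import rows where LASTMODIFIED >= '${last_value}' AND LASTMODIFIED < '${now}'"
  last_value="$now"   # the upper bound of this run becomes the next lower bound
}

run_once '2014-08-27 17:36:23'
run_once '2014-08-27 17:47:19'
run_once '2014-08-27 17:59:34'
```

Because the ranges are half-open and each upper bound is reused as the next lower bound, consecutive runs in this model neither overlap nor leave gaps.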
Experiment: creating and executing an incremental job
Create the incremental job:
~/Sqoop/sqoop-1.4.4/bin$ sqoop job --create incretest -- import --connect jdbc:oracle:thin:@192.168.0.138:1521:orcl --username HIVE --password hivefbi --table FBI_SQOOPTEST --hive-import --hive-table INCRETEST --incremental lastmodified --check-column LASTMODIFIED --last-value '2014/8/27 13:00:00'
14/08/27 17:29:37 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/08/27 17:29:37 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
14/08/27 17:29:37 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
14/08/27 17:29:37 WARN tool.BaseSqoopTool: It seems that you've specified at least one of following:
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --hive-home
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --hive-overwrite
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --create-hive-table
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --hive-table
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --hive-partition-key
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --hive-partition-value
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --map-column-hive
14/08/27 17:29:37 WARN tool.BaseSqoopTool: Without specifying parameter --hive-import. Please note that
14/08/27 17:29:37 WARN tool.BaseSqoopTool: those arguments will not be used in this session. Either
14/08/27 17:29:37 WARN tool.BaseSqoopTool: specify --hive-import to apply them correctly or remove them
14/08/27 17:29:37 WARN tool.BaseSqoopTool: from command line to remove this warning.
14/08/27 17:29:37 INFO tool.BaseSqoopTool: Please note that --hive-home, --hive-partition-key,
14/08/27 17:29:37 INFO tool.BaseSqoopTool: hive-partition-value and --map-column-hive options are
14/08/27 17:29:37 INFO tool.BaseSqoopTool: are also valid for HCatalog imports and exports
Execute the job:
~/Sqoop/sqoop-1.4.4/bin$ ./sqoop job --exec incretest
Note the SQL statement that appears in the log:
14/08/27 17:36:23 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(ID), MAX(ID) FROM FBI_SQOOPTEST WHERE ( LASTMODIFIED >= TO_DATE('2014/8/27 13:00:00', 'YYYY-MM-DD HH24:MI:SS') AND LASTMODIFIED < TO_DATE('2014-08-27 17:36:23', 'YYYY-MM-DD HH24:MI:SS') )
Here the lower bound of LASTMODIFIED is the value specified in the job-creation statement, and the upper bound is the current time, 2014-08-27 17:36:23.
Verification:
hive> select * from incretest;
OK
2 lion 2014-08-27
Time taken: 0.085 seconds, Fetched: 1 row(s)
Then I inserted a new row into the Oracle table and executed the job again:
~/Sqoop/sqoop-1.4.4/bin$ ./sqoop job --exec incretest
The SQL statement shown in the log:
14/08/27 17:47:19 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(ID), MAX(ID) FROM FBI_SQOOPTEST WHERE ( LASTMODIFIED >= TO_DATE('2014-08-27 17:36:23', 'YYYY-MM-DD HH24:MI:SS') AND LASTMODIFIED < TO_DATE('2014-08-27 17:47:19', 'YYYY-MM-DD HH24:MI:SS') )
Here the lower bound of LASTMODIFIED is the upper bound from the previous execution of the job. In other words, for incremental import jobs, Sqoop's saved jobs mechanism automatically records the time of the last execution and assigns it to --last-value for the next execution. So we only need crontab to run the job on a schedule; the --last-value stored in the job is updated automatically by the saved jobs mechanism, which gives a true incremental import.
The row newly added to the Oracle table above was successfully inserted into the Hive table.
I then added another row to the Oracle table and executed the job once more. The behavior is the same: the log shows that the previous upper bound automatically becomes the lower bound of this import:
14/08/27 17:59:34 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(ID), MAX(ID) FROM FBI_SQOOPTEST WHERE ( LASTMODIFIED >= TO_DATE('2014-08-27 17:47:19', 'YYYY-MM-DD HH24:MI:SS') AND LASTMODIFIED < TO_DATE('2014-08-27 17:59:34', 'YYYY-MM-DD HH24:MI:SS') )
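The crontab scheduling mentioned above could be done through a small wrapper script. The following is a sketch under stated assumptions: the script name run_incretest.sh, its location, the log path, and the function run_saved_job are all hypothetical; only the `sqoop job --exec incretest` invocation comes from the experiment. As far as I know, a saved job does not store the password unless the metastore is configured to record it, so an unattended run also needs the password to be made available in some way.

```shell
#!/usr/bin/env bash
# Hypothetical cron wrapper (run_incretest.sh); paths and names are assumptions.
# Example crontab entry, running at the top of every hour:
#   0 * * * * /home/hadoop/bin/run_incretest.sh

SQOOP_BIN="${SQOOP_BIN:-$HOME/Sqoop/sqoop-1.4.4/bin/sqoop}"
LOG_FILE="${LOG_FILE:-/tmp/sqoop_incretest.log}"

# Execute a saved job by name and append its output to the log file.
run_saved_job() {
  local job_name="$1"
  echo "$(date '+%Y-%m-%d %H:%M:%S') starting job ${job_name}" >> "$LOG_FILE"
  "$SQOOP_BIN" job --exec "$job_name" >> "$LOG_FILE" 2>&1
}

# Only attempt the run when the sqoop executable is actually present.
if command -v "$SQOOP_BIN" > /dev/null 2>&1; then
  run_saved_job incretest
else
  echo "sqoop not found at ${SQOOP_BIN}; nothing executed" >&2
fi
```

Keeping the schedule in crontab and the import window in the saved job means the cron entry never has to carry any timestamps itself.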