当前位置:网站首页>Exploration of sqoop1.4.4 native incremental import feature
Exploration of sqoop1.4.4 native incremental import feature
2022-07-03 12:37:00 【Brother Xing plays with the clouds】
Original ideas
To implement incremental import , It's completely possible to not use Sqoop The native incremental feature of , Use only shell The script generates a fixed time range based on the current time , Then joining together Sqoop Command statement .
Introduction to native incremental import features
Sqoop Provides the feature of native incremental import , It contains the following three key parameters :
Argument | Description |
---|---|
--check-column (col) | To specify a “ Flag column ” Used to determine the data range of incremental import , The column cannot be of character type , It's better to be numeric or date ( This is easy to understand ). |
--incremental (mode) | Specify incremental mode , contain “ Append mode ” append and “ Last modification mode ” lastmodified ( This mode is more suitable for common needs ). |
--last-value (value) | Appoint “ Flag column ” Upper bound of last import . If “ Flag column ” Is the last modification time , be --last-value Is the time when the import script was last executed . |
combination Saved Jobs Mechanism , You can schedule incremental updates repeatedly Job when --last-value Automatic update assignment of fields , combining cron perhaps oozie Time scheduling for , It can realize real incremental update .
experiment : The incremental job Creation and execution of
Create incremental updates job:
[email protected]:~/Sqoop/sqoop-1.4.4/bin$ sqoop job --create incretest -- import --connect jdbc:Oracle:thin:@192.168.0.138:1521:orcl --username HIVE --password hivefbi --table FBI_SQOOPTEST --hive-import --hive-table INCRETEST --incremental lastmodified --check-column LASTMODIFIED --last-value '2014/8/27 13:00:00'
14/08/27 17:29:37 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/08/27 17:29:37 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
14/08/27 17:29:37 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
14/08/27 17:29:37 WARN tool.BaseSqoopTool: It seems that you've specified at least one of following:
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --hive-home
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --hive-overwrite
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --create-hive-table
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --hive-table
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --hive-partition-key
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --hive-partition-value
14/08/27 17:29:37 WARN tool.BaseSqoopTool: --map-column-hive
14/08/27 17:29:37 WARN tool.BaseSqoopTool: Without specifying parameter --hive-import. Please note that
14/08/27 17:29:37 WARN tool.BaseSqoopTool: those arguments will not be used in this session. Either
14/08/27 17:29:37 WARN tool.BaseSqoopTool: specify --hive-import to apply them correctly or remove them
14/08/27 17:29:37 WARN tool.BaseSqoopTool: from command line to remove this warning.
14/08/27 17:29:37 INFO tool.BaseSqoopTool: Please note that --hive-home, --hive-partition-key,
14/08/27 17:29:37 INFO tool.BaseSqoopTool: hive-partition-value and --map-column-hive options are
14/08/27 17:29:37 INFO tool.BaseSqoopTool: are also valid for HCatalog imports and exports
perform Job:
[email protected]:~/Sqoop/sqoop-1.4.4/bin$ ./sqoop job --exec incretest
Notice what appears in the log SQL sentence :
14/08/27 17:36:23 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(ID), MAX(ID) FROM FBI_SQOOPTEST WHERE ( LASTMODIFIED >= TO_DATE('2014/8/27 13:00:00', 'YYYY-MM-DD HH24:MI:SS') AND LASTMODIFIED < TO_DATE('2014-08-27 17:36:23', 'YYYY-MM-DD HH24:MI:SS') )
among ,LASTMODIFIED The lower bound of is create job Specified in the statement of , The upper bound is current time 2014-08-27 17:36:23.
verification :
hive> select * from incretest;
OK
2 lion 2014-08-27
Time taken: 0.085 seconds, Fetched: 1 row(s)
Then I asked Oracle Insert a piece of data in :
Execute it again :
[email protected]:~/Sqoop/sqoop-1.4.4/bin$ ./sqoop job --exec incretest
Displayed in the log SQL sentence :
14/08/27 17:47:19 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(ID), MAX(ID) FROM FBI_SQOOPTEST WHERE ( LASTMODIFIED >= TO_DATE('2014-08-27 17:36:23', 'YYYY-MM-DD HH24:MI:SS') AND LASTMODIFIED < TO_DATE('2014-08-27 17:47:19', 'YYYY-MM-DD HH24:MI:SS') )
among ,LASTMODIFIED The lower bound of is the last execution of this job The upper bound of , in other words ,Sqoop Of “Saved Jobs” Mechanism for incremental import classes Job, The last execution time is automatically recorded , And automatically assign the time to the next execution --last-value Parameters ! in other words , We just need to pass crontab Set regular execution of this job that will do ,job Medium --last-value Will be “Saved Jobs” The mechanism is automatically updated to achieve real incremental import .
above Oracle The newly added data in the table is successfully inserted Hive In the table .
Again to oracle Add a new piece of data in the table , Perform the task again job, The situation remains the same , The log shows that the previous upper bound automatically becomes the lower bound of this import :
14/08/27 17:59:34 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(ID), MAX(ID) FROM FBI_SQOOPTEST WHERE ( LASTMODIFIED >= TO_DATE('2014-08-27 17:47:19', 'YYYY-MM-DD HH24:MI:SS') AND LASTMODIFIED < TO_DATE('2014-08-27 17:59:34', 'YYYY-MM-DD HH24:MI:SS') )
边栏推荐
- Summary of development issues
- If you can't learn, you have to learn. Jetpack compose writes an im app (I)
- idea将web项目打包成war包并部署到服务器上运行
- Apache Mina开发手册
- What is more elegant for flutter to log out and confirm again?
- Take you to the installation and simple use tutorial of the deveco studio compiler of harmonyos to create and run Hello world?
- Tensorflow binary installation & Failure
- 347. Top k high frequency elements
- Adult adult adult
- Implement verification code verification
猜你喜欢
Day 1 of kotlin learning: simple built-in types of kotlin
1-2 project technology selection and structure
Itext7 uses iexternalsignature container for signature and signature verification
【ArcGIS自定义脚本工具】矢量文件生成扩大矩形面要素
TOGAF认证自学宝典V2.0
Sword finger offer06 Print linked list from end to end
Take you to the installation and simple use tutorial of the deveco studio compiler of harmonyos to create and run Hello world?
Sword finger offer09 Implementing queues with two stacks
Sword finger offer04 Search in two-dimensional array [medium]
Public and private account sending prompt information (user microservice -- message microservice)
随机推荐
RedHat5 安装Socket5代理服务器
CNN MNIST handwriting recognition
The future of cloud computing cloud native
adb push apk
wpa_ cli
网上炒股开户安不安全?谁给回答一下
Sword finger offer04 Search in two-dimensional array [medium]
Swagger
error: expected reference but got (raw string)
Wechat applet pages always report errors when sending values to the background. It turned out to be this pit!
Public and private account sending prompt information (user microservice -- message microservice)
Flutter: self study system
DEJA_ Vu3d - 054 of cesium feature set - simulate the whole process of rocket launch
2020-10_ Development experience set
20. Valid brackets
手机号码变成空号导致亚马逊账号登陆两步验证失败的恢复网址及方法
剑指Offer04. 二维数组中的查找【中等】
[combinatorics] permutation and combination (summary of permutation and combination content | selection problem | set permutation | set combination)
剑指Offer10- I. 斐波那契数列
Sword finger offer07 Rebuild binary tree