Installation and use of sqoop
2022-07-01 16:34:00 【It addict】
I. Introduction to Sqoop
Purpose: Sqoop is a data exchange tool; it moves data back and forth between MySQL/Oracle and HDFS.
II. How Sqoop Works
Principle: the sqoop command you write is translated into a MapReduce job, and that MapReduce job connects to the different data stores and carries out the transfer.
III. Installing Sqoop
Prepare the installation package in advance.
sqoop download link
Extraction code: 4tkp
Unpack and install
Upload the installation package to the /software directory, then extract it into /opt:
tar -zxvf sqoop-1.4.6-cdh5.14.2.tar.gz -C /opt
Go into /opt and rename the sqoop directory:
cd /opt/
mv sqoop-1.4.6-cdh5.14.2/ sqoop
Configure environment variables
vi /etc/profile
export SQOOP_HOME=/opt/sqoop
export PATH=$SQOOP_HOME/bin:$PATH
Let the configuration file take effect
source /etc/profile
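A quick sanity check (assuming the /opt/sqoop path used above) that the new variables are active in the current shell:
echo $SQOOP_HOME      # should print /opt/sqoop
which sqoop           # should resolve to /opt/sqoop/bin/sqoop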
Modify the configuration file
cd sqoop/conf
mv sqoop-env-template.sh sqoop-env.sh
vi sqoop-env.sh
export HADOOP_COMMON_HOME=/opt/hadoop
export HADOOP_MAPRED_HOME=/opt/hadoop
export HIVE_HOME=/opt/hive
export ZOOKEEPER_HOME=/opt/zookeeper
export ZOOCFGDIR=/opt/zookeeper
export HBASE_HOME=/opt/hbase
Copy the two prepared jar files into the /opt/sqoop/lib directory.
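As a concrete sketch, assuming the two jars are the MySQL JDBC driver and a java-json jar downloaded into /software (adjust the file names to whatever versions you actually have):
cp /software/mysql-connector-java-5.1.38-bin.jar /opt/sqoop/lib/   # JDBC driver used by the MySQL imports below
cp /software/java-json.jar /opt/sqoop/lib/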
Verify the installation:
sqoop help
If the command prints the list of available Sqoop commands, the installation succeeded.
Running sqoop 1.4.x may print warnings such as "Warning: ... does not exist! HCatalog jobs will fail."
To silence them, go into the bin directory and edit configure-sqoop:
cd /opt/sqoop/bin
vi configure-sqoop
Comment out the following lines:
## Moved to be a runtime check in sqoop.
#if [ ! -d "${HCAT_HOME}" ]; then
# echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
# echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi
#if [ ! -d "${ACCUMULO_HOME}" ]; then
# echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
# echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
#fi
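With those blocks commented out, running a harmless command should no longer produce the HCatalog/Accumulo warnings:
sqoop version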
IV. Using Sqoop
1. MySQL -> HDFS
Prepare the SQL script and put it in a known directory (here /tmp/retail_db.sql).
Preparation: create the database and tables in MySQL:
mysql> create database sqoop;
mysql> use sqoop;
mysql> source /tmp/retail_db.sql
mysql> show tables;
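Before importing, it can be worth confirming that Sqoop can actually reach MySQL (assuming the root/root credentials used below):
sqoop list-databases --connect jdbc:mysql://localhost:3306/ --username root --password root
sqoop list-tables --connect jdbc:mysql://localhost:3306/sqoop --username root --password root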
Use sqoop to import the customers table into HDFS. The options, one at a time:
sqoop import
--connect jdbc:mysql://localhost:3306/sqoop    (the MySQL database to read from)
--driver com.mysql.jdbc.Driver                 (JDBC driver class)
--table customers                              (the MySQL table)
--username root                                (MySQL user name)
--password root                                (MySQL password)
--target-dir /tmp/customers                    (target HDFS path)
--m 3                                          (number of map tasks)
The complete command on one line:
sqoop import --connect jdbc:mysql://localhost:3306/sqoop --driver com.mysql.jdbc.Driver --table customers --username root --password root --target-dir /tmp/customers --m 3
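When the job finishes, the data can be checked on HDFS; with --m 3 there should be three part files under the target directory:
hdfs dfs -ls /tmp/customers
hdfs dfs -cat /tmp/customers/part-m-00000 | head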
Filter with a where clause:
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--table orders \
--where "order_id<500" \
--username root \
--password root \
--target-dir /data1/retail_db/orders \
--m 3
Filter by columns:
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop1 \
--driver com.mysql.jdbc.Driver \
--table emp \
--columns "EMPNO,ENAME,JOB,HIREDATE" \
--where "SAL>2000" \
--username root \
--password root \
--delete-target-dir \
--target-dir /data1/sqoop1/emp \
--m 3
Import with a free-form query (note the escaped \$CONDITIONS: the backslash keeps the shell from expanding it, and Sqoop fills it in at run time with the split condition for each mapper):
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--query "select * from orders where order_status != 'CLOSED' and \$CONDITIONS" \
--username root \
--password root \
--split-by order_id \
--delete-target-dir \
--target-dir /data1/retail_db/orders \
--m 3
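A rough way to spot-check the result, assuming the import above succeeded (no CLOSED orders should have been copied):
hdfs dfs -cat /data1/retail_db/orders/part-m-* | grep -c CLOSED    # expect 0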
Incremental (append) import:
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--table orders \
--username root \
--password root \
--incremental append \
--check-column order_date \
--last-value '2014-07-24 00:00:00' \
--target-dir /data1/retail_db/orders \
--m 3
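At the end of an incremental run Sqoop prints the next --last-value to use; a sketch of the follow-up run (the timestamp below is only a placeholder, substitute the value Sqoop reported):
sqoop import --connect jdbc:mysql://localhost:3306/sqoop --driver com.mysql.jdbc.Driver --table orders --username root --password root --incremental append --check-column order_date --last-value '2014-07-25 00:00:00' --target-dir /data1/retail_db/orders --m 3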
2. Creating a job
Create the job; note that there must be a space between the standalone -- and import:
sqoop job \
--create mysqlToHdfs \
-- import \
--connect jdbc:mysql://localhost:3306/sqoop \
--table orders \
--username root \
--password root \
--incremental append \
--check-column order_date \
--last-value '0' \
--target-dir /data1/retail_db/orders \
--m 3
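By default a saved job does not store the password, so sqoop job --exec prompts for it, which gets in the way of unattended runs such as the cron schedule below. One possible workaround, sketched here with an example file path and job name, is to keep the password in a protected local file and create the job with --password-file:
echo -n "root" > /root/.mysql.pwd    # -n: no trailing newline, which would otherwise corrupt the password
chmod 400 /root/.mysql.pwd
sqoop job --create mysqlToHdfsPwd -- import --connect jdbc:mysql://localhost:3306/sqoop --table orders --username root --password-file file:///root/.mysql.pwd --incremental append --check-column order_date --last-value '0' --target-dir /data1/retail_db/orders --m 3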
List the saved jobs:
sqoop job --list
Run the job:
sqoop job --exec mysqlToHdfs
Scheduled execution with cron (the schedule and the command go on a single crontab line):
crontab -e
* 2 */1 * * sqoop job --exec mysqlToHdfs
3. Importing data into Hive
First create the database in Hive:
hive -e "create database if not exists retail_db;"
If the target path already exists the import will fail; delete the existing directory first:
hdfs dfs -rmr hdfs://hadoop1:9000/user/root/orders1
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--table orders \
--username root \
--password root \
--hive-import \
--create-hive-table \
--hive-database retail_db \
--hive-table orders1 \
--m 3
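A quick check, assuming the hive CLI is on the PATH, that the data arrived in the retail_db database:
hive -e "select count(*) from retail_db.orders1;"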
Importing data into a Hive partition
Drop the existing Hive table first (it lives in the retail_db database):
hive -e "drop table if exists retail_db.orders;"
Import:
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--query "select order_id,order_status from orders where order_date>='2013-11-03' and order_date<'2013-11-04' and \$CONDITIONS" \
--username root \
--password ok \
--delete-target-dir \
--target-dir /data1/retail_db/orders \
--split-by order_id \
--hive-import \
--hive-database retail_db \
--hive-table orders \
--hive-partition-key "order_date" \
--hive-partition-value "2013-11-03" \
--m 3
Note: the partition column cannot also be imported as an ordinary column of the table (which is why the query above selects only order_id and order_status, not order_date).
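To confirm that the partition was actually created (its name comes from --hive-partition-key/--hive-partition-value above):
hive -e "show partitions retail_db.orders;"    # expect order_date=2013-11-03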
4. Importing data into HBase
1. Create the table in HBase:
create 'products','data','category'
2. Sqoop import:
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--username root \
--password ok \
--table products \
--hbase-table products \
--column-family data \
--m 3
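To spot-check the imported rows, a small scan from the hbase shell:
echo "scan 'products', {LIMIT => 3}" | hbase shell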
5. Exporting data from HDFS to MySQL
1. Create the table in MySQL:
create table customers_demo as select * from customers where 1=2;
2. Upload the data to HDFS:
hdfs dfs -mkdir /customerinput
hdfs dfs -put customers.csv /customerinput
3. Export the data:
sqoop export \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--username root \
--password root \
--table customers_demo \
--export-dir /customerinput \
--m 1
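To verify the export, count the rows that arrived in MySQL (assuming the same root/root credentials as above):
mysql -uroot -proot -e "select count(*) from sqoop.customers_demo;"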
6. Writing a sqoop options file
1. Write the options file job_01.opt (one option or value per line):
import
--connect
jdbc:mysql://localhost:3306/sqoop
--driver
com.mysql.jdbc.Driver
--table
customers
--username
root
--password
root
--target-dir
/data/retail_db/customers
--delete-target-dir
--m
3
2. Execute the script
sqoop --options-file job_01.opt
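After the script runs, the imported files should appear under the target directory named in the options file:
hdfs dfs -ls /data/retail_db/customers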