Installation and Use of Sqoop
2022-07-01 16:34:00 【It addict】
1. Sqoop Introduction
Role: Sqoop is a data exchange tool that moves data in both directions between relational databases (MySQL/Oracle) and HDFS.
2. Sqoop Principle
Each sqoop command is translated into a MapReduce job; MapReduce connects to the source and target systems and performs the actual transfer.
3. Sqoop Installation
Prepare the installation package (sqoop-1.4.6-cdh5.14.2.tar.gz) in advance.
sqoop link (extraction code: 4tkp)
Upload and extract
Upload the installation package to the /software directory, then extract it:
tar -zxvf sqoop-1.4.6-cdh5.14.2.tar.gz -C /opt
Enter /opt and rename the sqoop directory:
cd /opt/
mv sqoop-1.4.6-cdh5.14.2/ sqoop
Configure environment variables
vi /etc/profile
export SQOOP_HOME=/opt/sqoop
export PATH=$SQOOP_HOME/bin:$PATH
Let the configuration file take effect
source /etc/profile
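As a quick sanity check of the two export lines, you can confirm that sqoop's bin directory now sits at the front of PATH (a minimal sketch: it only verifies the variable expansion, not the installation itself):

```shell
# Reproduce the profile additions and confirm /opt/sqoop/bin leads PATH.
export SQOOP_HOME=/opt/sqoop
export PATH=$SQOOP_HOME/bin:$PATH
echo "$PATH" | cut -d: -f1   # prints /opt/sqoop/bin
```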
Modify the configuration file
cd sqoop/conf
mv sqoop-env-template.sh sqoop-env.sh
vi sqoop-env.sh
export HADOOP_COMMON_HOME=/opt/hadoop
export HADOOP_MAPRED_HOME=/opt/hadoop
export HIVE_HOME=/opt/hive
export ZOOKEEPER_HOME=/opt/zookeeper
export ZOOCFGDIR=/opt/zookeeper
export HBASE_HOME=/opt/hbase
Copy the two prepared jar packages into the /opt/sqoop/lib directory.
Verify the installation:
sqoop help
If the list of available commands is printed, the installation succeeded.
Running Sqoop 1.4.x prints "Warning: ... does not exist! HCatalog jobs will fail."
To suppress the warnings, edit the startup script:
cd /opt/sqoop/bin
vi configure-sqoop
Comment out the following checks:
## Moved to be a runtime check in sqoop.
#if [ ! -d "${HCAT_HOME}" ]; then
# echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
# echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi
#if [ ! -d "${ACCUMULO_HOME}" ]; then
# echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
# echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
#fi
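Commenting out the script works; an alternative sketch (my assumption, not from the original post) is to point the checked variables at any existing directory before running sqoop, since configure-sqoop only tests that the directory exists:

```shell
# Any existing directory satisfies the -d test in configure-sqoop,
# so the warnings are skipped without editing the script.
export HCAT_HOME=${HCAT_HOME:-/opt/sqoop}
export ACCUMULO_HOME=${ACCUMULO_HOME:-/opt/sqoop}
echo "$HCAT_HOME"
```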
4. Sqoop Usage
1、MySQL->HDFS
Prepare the SQL script and place it in a known directory (here /tmp).
Preparation: create the database and tables in MySQL:
mysql> create database sqoop;
mysql> use sqoop;
mysql> source /tmp/retail_db.sql
mysql> show tables;
Use sqoop to import the customers table into HDFS. Option by option:
sqoop import
--connect jdbc:mysql://localhost:3306/sqoop    # source MySQL database
--driver com.mysql.jdbc.Driver                 # JDBC driver class
--table customers                              # source table
--username root                                # MySQL username
--password root                                # MySQL password
--target-dir /tmp/customers                    # target HDFS path
-m 3                                           # number of map tasks
As a single runnable line:
sqoop import --connect jdbc:mysql://localhost:3306/sqoop --driver com.mysql.jdbc.Driver --table customers --username root --password root --target-dir /tmp/customers -m 3
Filter rows with where
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--table orders \
--where "order_id<500" \
--username root \
--password root \
--target-dir /data1/retail_db/orders \
-m 3
Filter columns with columns
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop1 \
--driver com.mysql.jdbc.Driver \
--table emp \
--columns "EMPNO,ENAME,JOB,HIREDATE" \
--where "SAL>2000" \
--username root \
--password root \
--delete-target-dir \
--target-dir /data1/sqoop1/emp \
-m 3
Import with a free-form query (note the escaped \$CONDITIONS inside double quotes)
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--query "select * from orders where order_status != 'CLOSED' and \$CONDITIONS" \
--username root \
--password root \
--split-by order_id \
--delete-target-dir \
--target-dir /data1/retail_db/orders \
-m 3
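The \$CONDITIONS escape matters: inside double quotes the shell would otherwise expand $CONDITIONS (an unset variable) to an empty string before sqoop ever sees it. A small sketch of the difference:

```shell
# Escaped: sqoop receives the literal token and substitutes each
# mapper's split-range predicate for it at run time.
GOOD="select * from orders where order_status != 'CLOSED' and \$CONDITIONS"
# Unescaped: the shell expands the (unset) variable to nothing.
BAD="select * from orders where order_status != 'CLOSED' and $CONDITIONS"
echo "$GOOD"   # ends with: and $CONDITIONS
echo "$BAD"    # the token is gone
```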
Incremental (append) import
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--table orders \
--username root \
--password root \
--incremental append \
--check-column order_date \
--last-value '2014-07-24 00:00:00' \
--target-dir /data1/retail_db/orders \
-m 3
2、Creating a job
Note: when creating a job there must be a space between the bare -- and import.
sqoop job \
--create mysqlToHdfs \
-- import \
--connect jdbc:mysql://localhost:3306/sqoop \
--table orders \
--username root \
--password root \
--incremental append \
--check-column order_date \
--last-value '0' \
--target-dir /data1/retail_db/orders \
-m 3
List jobs
sqoop job --list
Execute the job
sqoop job --exec mysqlToHdfs
Scheduled execution
Use crontab; the five schedule fields and the command must sit on one line:
crontab -e
* 2 */1 * * sqoop job --exec mysqlToHdfs
3、Importing data into Hive
First create the database in Hive:
hive -e "create database if not exists retail_db;"
If the target path already exists the import reports an error, so delete the existing directory first:
hdfs dfs -rm -r hdfs://hadoop1:9000/user/root/orders1
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--table orders \
--username root \
--password root \
--hive-import \
--create-hive-table \
--hive-database retail_db \
--hive-table orders1 \
-m 3
Importing into a Hive partition
Drop the existing Hive table:
drop table if exists orders;
Import:
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--query "select order_id,order_status from orders where order_date>='2013-11-03' and order_date<'2013-11-04' and \$CONDITIONS" \
--username root \
--password ok \
--delete-target-dir \
--target-dir /data1/retail_db/orders \
--split-by order_id \
--hive-import \
--hive-database retail_db \
--hive-table orders \
--hive-partition-key "order_date" \
--hive-partition-value "2013-11-03" \
-m 3
Note: the partition field cannot also be imported as an ordinary column, which is why order_date is absent from the select list above.
4、Importing data into HBase
1. Create the table in HBase
create 'products','data','category'
2. Import with sqoop
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--username root \
--password ok \
--table products \
--hbase-table products \
--column-family data \
-m 3
5、Exporting data from HDFS to MySQL
1. Create the table in MySQL (where 1=2 copies the schema but no rows)
create table customers_demo as select * from customers where 1=2;
2. Upload the data
hdfs dfs -mkdir /customerinput
hdfs dfs -put customers.csv /customerinput
3. Export the data
sqoop export \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--username root \
--password root \
--table customers_demo \
--export-dir /customerinput \
-m 1
6、Writing a sqoop options file
1. Write the options file job_01.opt (one option or value per line):
import
--connect
jdbc:mysql://localhost:3306/sqoop
--driver
com.mysql.jdbc.Driver
--table
customers
--username
root
--password
root
--target-dir
/data/retail_db/customers
--delete-target-dir
-m
3
2. Execute the script
sqoop --options-file job_01.opt
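One pitfall worth demonstrating: an options file is parsed one token per line, so an option and its value may not share a line. A minimal sketch that generates such a file (the /tmp/job_01.opt path is illustrative):

```shell
# Each line holds exactly one token: either an option flag or its value.
cat > /tmp/job_01.opt <<'EOF'
import
--connect
jdbc:mysql://localhost:3306/sqoop
--table
customers
EOF
grep -c '^--' /tmp/job_01.opt   # prints 2 (two option flags)
```

sqoop --options-file /tmp/job_01.opt would then run the import, assuming a reachable cluster.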