Installation and use of sqoop
2022-07-01 16:34:00 【It addict】
I. Introduction to Sqoop
Purpose: Sqoop is a data exchange tool that transfers data in both directions between MySQL/Oracle and HDFS.
How it works: a sqoop command is translated into a MapReduce job, and that job connects to the data sources and performs the transfer.
II. How Sqoop Works

III. Installing Sqoop
Prepare the installation package in advance
sqoop download link
Extraction code: 4tkp
Unpack and install
Put the installation package in the /software directory
tar -zxvf sqoop-1.4.6-cdh5.14.2.tar.gz -C /opt
Go to /opt and rename the sqoop directory
cd /opt/
mv sqoop-1.4.6-cdh5.14.2/ sqoop
Configure environment variables
vi /etc/profile
export SQOOP_HOME=/opt/sqoop
export PATH=$SQOOP_HOME/bin:$PATH
Make the configuration take effect
source /etc/profile
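A quick way to confirm that the profile change took effect is to check that $SQOOP_HOME/bin is on PATH. The sketch below assumes the /opt/sqoop location used above and sets the variables itself so that it runs standalone; in a real session you would only run the check after source /etc/profile. The variable name path_ok is illustrative.

```shell
# Same values as the lines added to /etc/profile above.
export SQOOP_HOME=/opt/sqoop
export PATH=$SQOOP_HOME/bin:$PATH

# Wrap PATH in colons so the match works for the first, middle, or last entry.
case ":$PATH:" in
  *":$SQOOP_HOME/bin:"*) path_ok=yes ;;
  *)                     path_ok=no ;;
esac
echo "SQOOP_HOME/bin on PATH: $path_ok"
```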
Modify the configuration file
cd sqoop/conf
mv sqoop-env-template.sh sqoop-env.sh
vi sqoop-env.sh
export HADOOP_COMMON_HOME=/opt/hadoop
export HADOOP_MAPRED_HOME=/opt/hadoop
export HIVE_HOME=/opt/hive
export ZOOKEEPER_HOME=/opt/zookeeper
export ZOOCFGDIR=/opt/zookeeper
export HBASE_HOME=/opt/hbase
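Because sqoop-env.sh only records paths, a typo here tends to surface later as a confusing runtime error. The following is a minimal sketch that checks whether each configured directory exists; the function name check_homes is hypothetical, and the paths are the ones listed above, so adjust them to your own layout.

```shell
# check_homes NAME=dir ...: print one line per argument and
# return non-zero if any of the directories is missing.
check_homes() {
  missing=0
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first '='
    dir=${pair#*=}     # text after the first '='
    if [ -d "$dir" ]; then
      echo "ok: $name=$dir"
    else
      echo "missing: $name=$dir"
      missing=1
    fi
  done
  return "$missing"
}

# Paths copied from sqoop-env.sh above.
check_homes \
  HADOOP_COMMON_HOME=/opt/hadoop \
  HADOOP_MAPRED_HOME=/opt/hadoop \
  HIVE_HOME=/opt/hive \
  ZOOKEEPER_HOME=/opt/zookeeper \
  ZOOCFGDIR=/opt/zookeeper \
  HBASE_HOME=/opt/hbase || echo "Fix the missing paths before running sqoop."
```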
Copy the two prepared jar packages into the /opt/sqoop/lib directory
Verify the installation
sqoop help
If the list of available commands is printed, the installation succeeded
Running sqoop 1.4.5 reports "Warning: does not exist! HCatalog jobs will fail."
Go to the bin directory and edit configure-sqoop
cd /opt/sqoop/bin
vi configure-sqoop
Comment out the following lines
## Moved to be a runtime check in sqoop.
#if [ ! -d "${HCAT_HOME}" ]; then
# echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
# echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi
#if [ ! -d "${ACCUMULO_HOME}" ]; then
# echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
# echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
#fi
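The same edit can be scripted with sed instead of vi. The sketch below works on a temporary file that contains only the two checks (a stand-in for bin/configure-sqoop, not the real file), so it is safe to run anywhere. Applying it to the real file is left to the reader, and keeping a backup copy first is advisable.

```shell
# Stand-in for the relevant part of bin/configure-sqoop.
demo=$(mktemp)
cat > "$demo" <<'EOF'
## Moved to be a runtime check in sqoop.
if [ ! -d "${HCAT_HOME}" ]; then
  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
fi
if [ ! -d "${ACCUMULO_HOME}" ]; then
  echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
  echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
fi
EOF

# Prefix every line from each "if" through its closing "fi" with '#'.
sed -e '/^if \[ ! -d "\${HCAT_HOME}"/,/^fi/ s/^/#/' \
    -e '/^if \[ ! -d "\${ACCUMULO_HOME}"/,/^fi/ s/^/#/' \
    "$demo" > "$demo.new"

cat "$demo.new"
```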
IV. Using Sqoop
1. MySQL -> HDFS
Prepare the SQL script and put it in a directory you know
Preparation: create the database and tables in MySQL
mysql> create database sqoop;
mysql> use sqoop;
mysql> source /tmp/retail_db.sql
mysql> show tables;

Use sqoop to import the customers table into HDFS
sqoop import
--connect jdbc:mysql://localhost:3306/sqoop // MySQL database
--driver com.mysql.jdbc.Driver // JDBC driver class
--table customers // MySQL table
--username root // MySQL user name
--password root // password
--target-dir /tmp/customers // target HDFS path
-m 3 // number of map tasks
The // annotations above are explanations only and are not part of the command. The command to run is:
sqoop import --connect jdbc:mysql://localhost:3306/sqoop --driver com.mysql.jdbc.Driver --table customers --username root --password root --target-dir /tmp/customers -m 3
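The map-task option asks for three parallel map tasks. According to the Sqoop user guide, Sqoop reads the minimum and maximum of the split column (the primary key by default) and gives each map task an evenly sized slice of that range. The arithmetic below is a simplified illustration of that idea, not Sqoop's actual splitter code; the function name print_splits, the column name id, and the sample values (0 to 1000 with 4 tasks, as in the user guide's example) are chosen for illustration.

```shell
# print_splits MIN MAX N: print N half-open ranges [lo, hi) that together cover MIN..MAX.
print_splits() {
  min=$1; max=$2; n=$3
  size=$(( (max - min) / n ))
  i=0
  while [ "$i" -lt "$n" ]; do
    lo=$(( min + i * size ))
    if [ "$i" -eq $(( n - 1 )) ]; then
      hi=$(( max + 1 ))        # the last range is extended to include MAX
    else
      hi=$(( lo + size ))
    fi
    echo "task $i: WHERE id >= $lo AND id < $hi"
    i=$(( i + 1 ))
  done
}

print_splits 0 1000 4
```

If the key values are unevenly distributed, the slices hold very different numbers of rows, which is the case where choosing another column with --split-by can help.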
Filter with where
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--table orders \
--where "order_id<500" \
--username root \
--password root \
--target-dir /data1/retail_db/orders \
-m 3
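The import commands in this section pass the password on the command line with --password, which leaves it in the shell history and the process list. Sqoop also accepts --password-file (or -P to prompt interactively). A minimal sketch of preparing such a file follows; the HDFS path /user/root/sqoop.pwd is an assumption for illustration, and the hdfs and sqoop lines are left as comments because they need a running cluster.

```shell
# Write the password without a trailing newline;
# a newline would become part of the password.
pwfile=$(mktemp)
printf '%s' 'root' > "$pwfile"
chmod 400 "$pwfile"            # readable by the owner only

# Then, on a machine with HDFS and sqoop available (paths are hypothetical):
#   hdfs dfs -put "$pwfile" /user/root/sqoop.pwd
#   hdfs dfs -chmod 400 /user/root/sqoop.pwd
#   sqoop import ... --password-file /user/root/sqoop.pwd
echo "password file has $(wc -c < "$pwfile" | tr -d ' ') bytes"
```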
Filter with columns
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop1 \
--driver com.mysql.jdbc.Driver \
--table emp \
--columns "EMPNO,ENAME,JOB,HIREDATE" \
--where "SAL>2000" \
--username root \
--password root \
--delete-target-dir \
--target-dir /data1/sqoop1/emp \
-m 3
Import with a query statement
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--query "select * from orders where order_status!='CLOSED' and \$CONDITIONS" \
--username root \
--password root \
--split-by order_id \
--delete-target-dir \
--target-dir /data1/retail_db/orders \
-m 3
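One detail matters when the --query text is wrapped in double quotes: the shell expands $CONDITIONS itself before sqoop sees it, and since that variable is normally unset, the placeholder silently disappears. Escaping the dollar sign as \$CONDITIONS keeps it literal. The snippet below only prints the two strings to show the difference; the variable names expanded and literal are illustrative.

```shell
CONDITIONS=""   # an unset variable expands to the same empty string

# Unescaped: the shell replaces $CONDITIONS with an empty string.
expanded="select * from orders where order_status!='CLOSED' and $CONDITIONS"

# Escaped: sqoop receives the literal placeholder it needs.
literal="select * from orders where order_status!='CLOSED' and \$CONDITIONS"

echo "unescaped: $expanded"
echo "escaped:   $literal"
```

Wrapping the whole query in single quotes also keeps the placeholder literal, but then any single-quoted SQL strings inside the query have to be written differently.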
Incremental append import
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--table orders \
--username root \
--password root \
--incremental append \
--check-column order_date \
--last-value '2014-07-24 00:00:00' \
--target-dir /data1/retail_db/orders \
-m 3
2. Create a job
When creating a job, note that there must be a space between -- and import
sqoop job \
--create mysqlToHdfs \
-- import \
--connect jdbc:mysql://localhost:3306/sqoop \
--table orders \
--username root \
--password root \
--incremental append \
--check-column order_date \
--last-value '0' \
--target-dir /data1/retail_db/orders \
-m 3
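List jobs
sqoop job --list
Execute the job
sqoop job --exec mysqlToHdfs
Scheduled execution
crontab -e
* 2 */1 * * sqoop job --exec mysqlToHdfs
With * in the minute field, this entry fires every minute from 02:00 to 02:59 each day; put 0 in the minute field to run it once a day at 02:00.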
3. Import data into Hive
First create the database in Hive
hive -e "create database if not exists retail_db;"
If the target path already exists, an error is reported, so delete the existing directory first
hdfs dfs -rm -r hdfs://hadoop1:9000/user/root/orders1
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--table orders \
--username root \
--password root \
--hive-import \
--create-hive-table \
--hive-database retail_db \
--hive-table orders1 \
-m 3
Import data into a Hive partition
Drop the Hive table
drop table if exists orders;
Import
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--query "select order_id,order_status from orders where order_date>='2013-11-03' and order_date <'2013-11-04' and \$CONDITIONS" \
--username root \
--password ok \
--delete-target-dir \
--target-dir /data1/retail_db/orders \
--split-by order_id \
--hive-import \
--hive-database retail_db \
--hive-table orders \
--hive-partition-key "order_date" \
--hive-partition-value "2013-11-03" \
-m 3
Note: the partition column cannot also be imported into the table as an ordinary column, which is why the query above does not select order_date
4. Import data into HBase
1. Create the table in HBase
create 'products','data','category'
2. Import with sqoop
sqoop import \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--username root \
--password ok \
--table products \
--hbase-table products \
--column-family data \
-m 3
5. Export data from HDFS to MySQL
1. Create the table in MySQL
create table customers_demo as select * from customers where 1=2;
2. Upload the data
hdfs dfs -mkdir /customerinput
hdfs dfs -put customers.csv /customerinput
3. Export the data
sqoop export \
--connect jdbc:mysql://localhost:3306/sqoop \
--driver com.mysql.jdbc.Driver \
--username root \
--password root \
--table customers_demo \
--export-dir /customerinput \
-m 1
6. Write a sqoop script
1. Write the script job_01.opt (each option and each value on its own line)
import
--connect
jdbc:mysql://localhost:3306/sqoop
--driver
com.mysql.jdbc.Driver
--table
customers
--username
root
--password
root
--target-dir
/data/retail_db/customers
--delete-target-dir
-m
3
2. Execute the script
sqoop --options-file job_01.opt
