Basic file operations in the shell (cutting, sorting, and deduplication)
2022-07-03 00:33:00 【Dreamy channeling】
Using the standard command-line tools available in the shell, you can process large text files and cover everyday data-processing needs.
1. The cut command: cutting
cut processes text column by column, which makes it well suited to extracting data from large files.
The basic syntax is cut [option] filename
Parameters
cut -f selects fields (columns) by number;
cut -c splits by character;
cut -b splits by byte; multi-byte character boundaries are ignored by default, and -n asks cut not to split multi-byte characters (GNU cut accepts -n but ignores it);
cut -d sets a custom field delimiter (the default is the tab character);
n- as a range means from column n to the end of the line;
n-m as a range means from column n to column m;
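The options above can be tried on a single line of text. The following is a minimal sketch: the sample line and the variable names are made up for illustration and do not come from the original article.

```shell
# Hypothetical sample line: three colon-separated fields.
line='alice:1001:staff'

# -d sets the delimiter, -f picks fields: here fields 1 through 2.
first_two=$(printf '%s\n' "$line" | cut -d ':' -f 1-2)

# n- means "from field n to the end of the line".
from_second=$(printf '%s\n' "$line" | cut -d ':' -f 2-)

# -c picks characters by position: characters 1 through 5.
first_chars=$(printf '%s\n' "$line" | cut -c 1-5)

# -b picks bytes; for plain ASCII text this gives the same result as -c.
first_bytes=$(printf '%s\n' "$line" | cut -b 1-5)

echo "$first_two"     # alice:1001
echo "$from_second"   # 1001:staff
echo "$first_chars"   # alice
echo "$first_bytes"   # alice
```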
Demo
1) Cutting by delimiter
The original file is shown below.
To get the first two columns, run cut info.text -d " " -f 1-2. The -d " " option sets a custom delimiter, so each line is split on spaces.
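Since the screenshot of the file is not available here, the sketch below uses made-up contents for info.text to show what the command does. It writes the options before the file name, which is the more portable order; the temporary directory is only there so that no existing file is overwritten.

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical contents for info.text (not the data from the article).
cat > info.text <<'EOF'
zhang san 18
li si 20
wang wu 22
EOF

# Split each line on single spaces and keep fields 1 and 2.
result=$(cut -d " " -f 1-2 info.text)
echo "$result"
```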
2) Cutting out the PID of bash
Look up the bash processes in the virtual machine, as shown in the figure below.
Run ps -aux | grep bash | head -n 1 | cut -d " " -f 8. This finds the bash processes, takes the first line, splits it on spaces, and extracts column 8. Because cut treats every single space as a delimiter, the column number depends on how many spaces pad the ps output. The result is shown in the figure below.
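One way to avoid counting padding spaces is to squeeze runs of spaces first, so that the PID is always the second field. This is an alternative sketch, not the method used in the article, and the ps output line below is a made-up example.

```shell
# Hypothetical line in the column layout printed by ps aux (USER PID %CPU ...).
ps_line='root        1234  0.0  0.1 115548  2100 pts/0    Ss   10:00   0:00 -bash'

# tr -s ' ' squeezes each run of spaces into a single space,
# so the PID is field 2 regardless of how the columns are padded.
pid=$(printf '%s\n' "$ps_line" | tr -s ' ' | cut -d ' ' -f 2)

echo "$pid"   # 1234
```

On a live system the same idea would read ps aux | grep bash | head -n 1 | tr -s ' ' | cut -d ' ' -f 2.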
2. The sort command: sorting
sort sorts the lines of a file and writes the result to standard output, or to a specified file through redirection.
The basic syntax is sort [option] filename
Parameters
sort -n sorts by numeric value;
sort -r sorts in reverse order;
sort -t sets the field separator used when sorting (by default, fields are separated by blanks);
sort -k specifies the column or columns to sort on;
sort -o saves the sorted result to the specified file;
sort -u outputs each line only once, i.e. removes duplicates;
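The options above can be combined as in the following sketch. The file scores.txt and its contents are hypothetical, and LC_ALL=C is set only to make the ordering the same on every machine.

```shell
# Use the C locale so the ordering is the same everywhere.
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data: name and score separated by a colon.
cat > scores.txt <<'EOF'
tom:9
ann:100
bob:25
ann:100
EOF

# -t sets the separator, -k picks the key field, -n compares numerically.
numeric=$(sort -t ':' -k 2,2 -n scores.txt)

# -r reverses the order.
reversed=$(sort -t ':' -k 2,2 -n -r scores.txt)

# -u drops repeated lines; -o writes the result to a file.
sort -u -o unique.txt scores.txt
unique=$(cat unique.txt)

echo "$numeric"
echo "$reversed"
echo "$unique"
```

Without -n the scores would be compared as text, so 100 would sort before 25 and 9.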
Demo
1) Sorting
The original file is shown below.
Run sort -t " " -k2n,2 infodata.txt to sort by the second column in ascending numeric order. Note that the key should state both the column where it starts and the column where it ends (here 2,2). The result is shown below.
The result above contains duplicate rows. How can they be removed?
Add -uk1,2 to the command. The full command is sort -t " " -k2n,2 -uk1,2 infodata.txt, and the result is shown below.
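With -u, sort treats lines as duplicates when all of the given keys compare equal, and prints only one line from each such group. The sketch below runs the full command on made-up contents for infodata.txt, since the original data is only available as a screenshot.

```shell
# Use the C locale so the ordering is the same everywhere.
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical contents for infodata.txt: a name and a numeric score.
cat > infodata.txt <<'EOF'
tom 90
ann 85
tom 90
bob 85
ann 85
EOF

# Key 1: field 2, compared numerically. Key 2: fields 1 through 2, as text.
# -u keeps one line from each group whose keys all compare equal.
deduped=$(sort -t " " -k2n,2 -uk1,2 infodata.txt)
echo "$deduped"
```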
How can the duplicate rows be printed?
Use sort infodata.txt | uniq -dc, which prints one copy of each repeated line together with the number of times it occurs. The result is shown below.
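As a rough illustration of what this pipeline prints, here is a sketch on made-up contents for infodata.txt (the article's own data is not reproduced here).

```shell
# Use the C locale so the ordering is the same everywhere.
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical contents for infodata.txt.
cat > infodata.txt <<'EOF'
tom 90
ann 85
tom 90
bob 70
ann 85
EOF

# sort makes equal lines adjacent; uniq -d keeps one copy of each repeated
# line, and -c prefixes it with the number of occurrences.
dups=$(sort infodata.txt | uniq -dc)
echo "$dups"
```

Each output line starts with a count padded by leading spaces, followed by the repeated line; the width of the padding can differ between uniq implementations.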
3. The uniq command: deduplication
uniq works line by line, comparing adjacent lines and removing repeats. It only deduplicates text that is already sorted, so it is usually combined with the sort command.
The basic syntax is uniq [option] filename
Parameters
uniq -c prefixes each line with the number of times it occurs;
uniq -d shows only the duplicated lines, one copy of each;
uniq -u shows only the lines that are not repeated;
uniq -i ignores case when comparing;
uniq -f N skips the first N fields when comparing; fields are separated by whitespace;
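The sketch below shows why uniq needs sorted input, and how -i interacts with sorting. The sample data is made up, and pairing uniq -i with sort -f (which folds case while sorting) is a choice made for this example rather than something the article prescribes.

```shell
# Use the C locale so the ordering is the same everywhere.
export LC_ALL=C

# Hypothetical input with a repeated value that is not adjacent.
data='apple
banana
apple
Banana'

# Without sorting, no two neighbouring lines match, so nothing is removed.
unsorted_count=$(printf '%s\n' "$data" | uniq | wc -l)

# After sorting, equal lines are adjacent and uniq can collapse them.
sorted_unique=$(printf '%s\n' "$data" | sort | uniq)

# -i ignores case; sort -f folds case so such lines end up adjacent.
ignore_case_count=$(printf '%s\n' "$data" | sort -f | uniq -i | wc -l)

echo "$unsorted_count"
echo "$sorted_unique"
echo "$ignore_case_count"
```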
Demo
1) Sorting and deduplicating
To show only the lines that appear exactly once, run sort infodata.txt | uniq -u. The result is shown below.
For text files with line numbers, use the -f option to skip the leading line-number field and compare only the fields that follow.
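A minimal sketch of the line-number case follows, on made-up data. It assumes the lines are first sorted on the second field onward (sort -k 2), so that lines with equal content become adjacent before uniq -f 1 compares them.

```shell
# Use the C locale so the ordering is the same everywhere.
export LC_ALL=C

# Hypothetical numbered text: a line number, then the content.
data='1 apple
2 banana
3 apple
4 cherry'

# sort -k 2 orders by the content after the line number, and
# uniq -f 1 skips the first field (the line number) when comparing.
result=$(printf '%s\n' "$data" | sort -k 2 | uniq -f 1)
echo "$result"
```

Of two lines with the same content, uniq keeps the first one it reads, so the surviving line carries the line number that sorts first.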
In testing, deduplication with sort did not seem to work for the last line (a repeated last line was not removed). Verify this again in real use.