awk, one of the three swordsmen of text processing
2022-07-07 12:57:00 【LC181119】
1. awk: how it works and basic usage
- AWK: the original awk, from AT&T Bell Laboratories
- NAWK: New awk, the upgraded version of AWK from AT&T Bell Laboratories
- GAWK: GNU AWK. Most GNU/Linux distributions ship with GAWK, which is fully compatible with AWK and NAWK
https://www.gnu.org/software/gawk/manual/gawk.html
awk is typically used to:
- Process text
- Produce formatted text reports
- Perform arithmetic operations
- Perform string operations
awk [options] 'program' var=value file…
awk [options] -f programfile var=value file…
A program usually consists of three parts:
- BEGIN block
- General pattern-matching blocks
- END block
Common options:
- -F "separator": the field separator used for input; the default separator is any run of whitespace characters
- -v var=value: variable assignment
Program format:
pattern{action statements;..}
Separators, fields, and records:
- Each line is split by the separator into fields (columns), labelled $1,$2...$n; these are called field identifiers, and $0 is the whole line. Note: this $ does not have the same meaning as the $ of a shell variable
- Each line of a file is called a record
- If the action is omitted, the default action is print $0
Common kinds of action:
- Output statements: print, printf
- Expressions: arithmetic, comparison expressions, etc.
- Compound statements: statements grouped together
- Control statements: if, while, etc.
- Input statements
Control statements:
- { statements;… }: compound statement
- if(condition) {statements;…}
- if(condition) {statements;…} else {statements;…}
- while(condition) {statements;…}
- do {statements;…} while(condition)
- for(expr1;expr2;expr3) {statements;…}
- break
- continue
- exit
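The three kinds of block described above can be seen together in one small program. This is a minimal sketch on made-up inline data (the fruit names and quantities are hypothetical), not output from a real system:

```shell
# Hypothetical sample data: one "name quantity" record per line.
report=$(printf 'apple 3\nbanana 5\ncherry 7\n' | awk '
BEGIN  { print "fruit report" }             # runs once, before any input is read
$2 > 4 { big++ }                            # runs only for records whose field 2 exceeds 4
       { total += $2 }                      # no pattern: runs for every record
END    { print "total=" total, "big=" big } # runs once, after the last record
')
printf '%s\n' "$report"
```

The rules are tried in order for each record, so a record can trigger more than one action.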
2. The print action
print item1, item2, ...
- Items are separated by commas
- An item can be a string or a number, a field of the current record, a variable, or an awk expression
- If the items are omitted, the statement is equivalent to print $0
- Literal strings must be enclosed in double quotes (" "); variables and numbers need no quotes
[[email protected]_0_10_centos logs]# awk '{print $1}' nginx.access.log-20200428|sort |uniq -c |sort -nr|head -3
5498 122.51.38.20
2161 117.157.173.214
953 211.159.177.120
[[email protected] ~]#awk '{print $1}' access_log |sort |uniq -c|sort -nr|head
4870 172.20.116.228
3429 172.20.116.208
2834 172.20.0.222
2613 172.20.112.14
2267 172.20.0.227
2262 172.20.116.179
2259 172.20.65.65
1565 172.20.0.76
1482 172.20.0.200
1110 172.20.28.145
[[email protected] ~]# df | awk -F"[[:space:]]+|%" '{print $5}'
Use
0
0
1
0
3
19
1
0
Example: extract the IP address from the output of ifconfig
[[email protected] ~]# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.0.85 netmask 255.255.255.0 broadcast 10.0.0.255
inet6 fe80::20c:29ff:fe3d:d1e7 prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:3d:d1:e7 txqueuelen 1000 (Ethernet)
RX packets 24590 bytes 25224965 (24.0 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 12793 bytes 4232673 (4.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[[email protected] ~]# ifconfig eth0 | sed -n "2p"
inet 10.0.0.85 netmask 255.255.255.0 broadcast 10.0.0.255
[[email protected] ~]# ifconfig eth0 | sed -n "2p" | awk '{print $2}'
10.0.0.85
[[email protected] ~]# ifconfig eth0 | awk '/netmask/{print $2}'
10.0.0.85
[[email protected] ~]# ifconfig eth0 | awk 'NR==2{print $2}'
10.0.0.85
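The difference between separating print items with a comma and writing them side by side is easy to miss. A small sketch, assuming a hypothetical space-separated input line:

```shell
# Hypothetical input line; fields are separated by spaces.
line='alice 1001 /bin/bash'

# A comma between items inserts the output field separator (one space by default).
with_comma=$(echo "$line" | awk '{print $1, $2}')

# Items written side by side are concatenated with nothing in between.
joined=$(echo "$line" | awk '{print $1 $2}')

# Literal text must be quoted; field references and numbers are not.
labelled=$(echo "$line" | awk '{print "user:" $1, 2 * $2}')

printf '%s\n' "$with_comma" "$joined" "$labelled"
```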
3. awk variables
3.1 Common built-in variables
- FS: input field separator; the default is whitespace; equivalent to the -F option
Example:
[[email protected] ~]#awk -v FS=":" '{print $1FS$3}' /etc/passwd |head -n3
root:0
bin:1
daemon:2
- OFS: output field separator; the default is a single space
Example:
[[email protected] ~]#awk -v FS=':' '{print $1,$3,$7}' /etc/passwd|head -n1
root 0 /bin/bash
[[email protected] ~]#awk -v FS=':' -v OFS=':' '{print $1,$3,$7}' /etc/passwd|head -n1
root:0:/bin/bash
- RS: input record separator; specifies what marks the end of a record on input (a newline by default)
awk -v RS=' ' '{print }' /etc/passwd
- ORS: output record separator; printed after each output record in place of the newline
awk -v RS=' ' -v ORS='###' '{print $0}' /etc/passwd
- NF: number of fields
# When referencing awk variables, do not put $ in front of the variable name
[[email protected] ~]#awk -F: '{print NF}' /etc/fstab
[[email protected] ~]#awk -F: '{print $(NF-1)}' /etc/passwd
[[email protected] ~]#ls /misc/cd/BaseOS/Packages/*.rpm |awk -F"." '{print $(NF-1)}'|sort |uniq -c
389 i686
208 noarch
1060 x86_64
- NR: Record number
[[email protected] ~]#awk '{print NR,$0}' /etc/issue /etc/centos-release
1 \S
2 Kernel \r on an \m
3
4 CentOS Linux release 8.1.1911 (Core)
- FNR: record number, counted separately for each file
awk '{print FNR}' /etc/fstab /etc/inittab
[[email protected] ~]#awk '{print NR,$0}' /etc/issue /etc/redhat-release
1 \S
2 Kernel \r on an \m
3
4 CentOS Linux release 8.0.1905 (Core)
[[email protected] script40]#awk '{print FNR,$0}' /etc/issue /etc/redhat-release
1 \S
2 Kernel \r on an \m
3
1 CentOS Linux release 8.0.1905 (Core)
- FILENAME: Current filename
[[email protected] ~]#awk '{print FILENAME}' /etc/fstab
[[email protected] ~]#awk '{print FNR,FILENAME,$0}' /etc/issue /etc/redhat-release
1 /etc/issue \S
2 /etc/issue Kernel \r on an \m
3 /etc/issue
1 /etc/redhat-release CentOS Linux release 8.0.1905 (Core)
- ARGC: Number of command line arguments
[[email protected] ~]#awk '{print ARGC}' /etc/issue /etc/redhat-release
3
3
3
3
[[email protected] ~]#awk 'BEGIN{print ARGC}' /etc/issue /etc/redhat-release
3
- ARGV: array holding the command-line arguments; each one is accessed as ARGV[0], ARGV[1], ...
[[email protected] ~]#awk 'BEGIN{print ARGV[0]}' /etc/issue /etc/redhat-release
awk
[[email protected] ~]#awk 'BEGIN{print ARGV[1]}' /etc/issue /etc/redhat-release
/etc/issue
[[email protected] ~]#awk 'BEGIN{print ARGV[2]}' /etc/issue /etc/redhat-release
/etc/redhat-release
[[email protected] ~]#awk 'BEGIN{print ARGV[3]}' /etc/issue /etc/redhat-release
[[email protected] ~]#
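Several of the built-in variables above can be combined in one command. The sketch below uses two hypothetical passwd-style records given inline, so the result does not depend on the files of any particular host:

```shell
# Hypothetical passwd-style records.
data='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin'

# FS splits input on ":", OFS joins output items with "-",
# NR is the record number, NF the field count, $NF the last field.
summary=$(printf '%s\n' "$data" | awk -v FS=':' -v OFS='-' '{print NR, NF, $1, $NF}')
printf '%s\n' "$summary"
```

Note that NF without $ is the number of fields, while $NF is the content of the last field.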
3.2 Custom variables
- -v var=value
- Defined directly in the program
[[email protected] ~]#awk -v test1=test2="hello,gawk" 'BEGIN{print test1,test2}'
test2=hello,gawk
[[email protected] ~]#awk -v test1=test2="hello1,gawk" 'BEGIN{test1=test2="hello2,gawk";print test1,test2}'
hello2,gawk hello2,gawk
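A common use of -v is to hand a shell variable to awk. A minimal sketch, assuming a hypothetical threshold and made-up "name number" records; the names min and count are chosen only for illustration:

```shell
threshold=1000     # a shell variable (hypothetical value)

# -v copies the shell value into the awk variable "min" before BEGIN runs;
# "count" is created inside the program the first time it is used.
result=$(printf 'alice 1001\nbob 500\ncarol 2000\n' |
  awk -v min="$threshold" '$2 >= min {count++} END {print count, "of", NR}')
echo "$result"
```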
4. The printf action
printf "FORMAT", item1, item2, ...
- FORMAT must be specified
- No newline is added automatically; give the newline explicitly with \n
- FORMAT must contain a format specifier for each item that follows
Format specifiers:
- %s: string
- %d, %i: decimal integer
- %f: floating-point number
- %e, %E: scientific notation
- %c: a single character (a numeric argument is taken as its character code)
- %g, %G: scientific or floating-point notation
- %u: unsigned integer
- %%: a literal %
Modifiers:
- #[.#]: the first number sets the display width, the second the precision after the decimal point, e.g. %3.1f
- -: left-align (right-aligned by default), e.g. %-15s
- +: show the sign of a number, e.g. %+d
awk -F: '{printf "%s",$1}' /etc/passwd
awk -F: '{printf "%s\n",$1}' /etc/passwd
awk -F: '{printf "%20s\n",$1}' /etc/passwd
awk -F: '{printf "%-20s\n",$1}' /etc/passwd
awk -F: '{printf "%-20s %10d\n",$1,$3}' /etc/passwd
awk -F: '{printf "Username: %s\n",$1}' /etc/passwd
awk -F: '{printf "Username: %sUID:%d\n",$1,$3}' /etc/passwd
awk -F: '{printf "Username: %25sUID:%d\n",$1,$3}' /etc/passwd
awk -F: '{printf "Username: %-25sUID:%d\n",$1,$3}' /etc/passwd
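Width, alignment, and precision are easiest to see on fixed input. The sketch below formats two hypothetical "name:number" records; the third column ($2 / 3) exists only to show the %6.2f precision:

```shell
# %-10s: left-aligned in 10 columns; %5d: right-aligned integer in 5 columns;
# %6.2f: 6 columns wide with 2 digits after the decimal point.
table=$(printf 'root:0\ndaemon:2\n' |
  awk -F: '{printf "%-10s|%5d|%6.2f\n", $1, $2, $2 / 3}')
printf '%s\n' "$table"
```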
5. Operators
Arithmetic operators:
x+y, x-y, x*y, x/y, x^y, x%y
-x: negate the value
+x: convert a string to a number
Assignment operators:
=, +=, -=, *=, /=, %=, ^=, ++, --
[[email protected] ~]#awk 'BEGIN{i=0;print i++,i}'
0 1
[[email protected] ~]#awk 'BEGIN{i=0;print ++i,i}'
1 1
Comparison operators:
==, !=, >, >=, <, <=
[[email protected] ~]#seq 10 | awk 'NR%2==0'
2
4
6
8
10
[[email protected] ~]#seq 10 | awk 'NR%2==1'
1
3
5
7
9
Pattern-matching operators:
~: whether the left side matches the pattern on the right (contains a match)
!~: whether the left side does not match
[[email protected] ~]#awk -F: '$0 ~ /root/{print $1}' /etc/passwd
[[email protected] ~]#awk -F: '$0 ~ "^root"{print $1}' /etc/passwd
[[email protected] ~]#awk '$0 !~ /root/' /etc/passwd
[[email protected] ~]#awk '/root/' /etc/passwd
[[email protected] ~]#awk -F: '/r/' /etc/passwd
[[email protected] ~]#awk -F: '$3==0' /etc/passwd
[[email protected] ~]#df | awk -F"[[:space:]]+|%" '$0 ~ /^\/dev\/sd/{print $5}'
51
92
[[email protected] ~]#ifconfig eth0 | awk 'NR==2{print $2}'
10.0.0.8
Logical operators:
And: &&
Or: ||
Not: !
[[email protected] ~]#awk 'BEGIN{print !i}'
1
[[email protected] ~]#awk -v i=10 'BEGIN{print !i}'
0
[[email protected] ~]#awk -v i=-3 'BEGIN{print !i}'
0
[[email protected] ~]#awk -v i=0 'BEGIN{print !i}'
1
[[email protected] ~]#awk -v i=abc 'BEGIN{print !i}'
0
Conditional expression:
selector?if-true-expression:if-false-expression
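The conditional expression picks one of two values in a single statement. A small sketch on hypothetical "name:uid" records, where the 1000 boundary is only an assumption for the example:

```shell
# kind becomes "regular" when field 2 is at least 1000, otherwise "system".
kinds=$(printf 'root:0\nbin:1\nalice:1000\n' |
  awk -F: '{ kind = ($2 >= 1000) ? "regular" : "system"; print $1, kind }')
printf '%s\n' "$kinds"
```

Parenthesizing the selector keeps the expression readable and avoids surprises with operator precedence.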
6. Patterns (PATTERN)
- No pattern: the empty pattern matches every line
[[email protected] ~]#awk -F: '{print $1,$3}' /etc/passwd
- /regular expression/: only lines that match the regular expression are processed; the expression must be enclosed in / /
- relational expression: the line is processed only when the result is true
- line ranges: a range of lines, written as /pat1/,/pat2/
- Line numbers cannot be given directly, but a range can be specified indirectly through the variable NR
- BEGIN/END patterns
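Each kind of pattern listed above can be tried on the same input. This is a sketch on five hypothetical lines; the variable names are chosen only for illustration:

```shell
lines='alpha 1
beta 2
gamma 3
delta 4
epsilon 5'

# Regular expression: lines starting with b or d (default action prints $0).
by_regex=$(printf '%s\n' "$lines" | awk '/^[bd]/')
# Relational expression: lines whose second field is greater than 3.
by_relation=$(printf '%s\n' "$lines" | awk '$2 > 3 {print $1}')
# Range: from the first line matching /^beta/ through the next matching /^delta/.
by_range=$(printf '%s\n' "$lines" | awk '/^beta/,/^delta/ {print $1}')
# Line numbers, expressed indirectly through NR.
by_number=$(printf '%s\n' "$lines" | awk 'NR>=2 && NR<=3 {print $1}')
# BEGIN runs before the input, END after it.
counted=$(printf '%s\n' "$lines" | awk 'BEGIN {print "start"} END {print NR, "lines"}')

printf '%s\n' "$by_regex" "$by_relation" "$by_range" "$by_number" "$counted"
```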
7. Conditional statement: if-else
if(condition){statement;…}[else statement]
if(condition1){statement1}else if(condition2){statement2}else if(condition3){statement3}...... else {statementN}
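A chained if / else if / else might look like the following. The names, scores, and grade boundaries are hypothetical:

```shell
grades=$(printf 'ann 95\nbob 72\ncid 40\n' | awk '{
  if ($2 >= 90)      { grade = "A" }
  else if ($2 >= 60) { grade = "B" }
  else               { grade = "F" }
  print $1, grade
}')
printf '%s\n' "$grades"
```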
8. Conditional statement: switch
switch(expression) {case VALUE1 or /REGEXP/: statement1; case VALUE2 or /REGEXP2/: statement2; ...; default: statementn}
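The switch statement is a gawk extension, so the sketch below calls gawk explicitly and does nothing when gawk is not installed. As in C, execution falls through to the next case unless a break is given. The input words are hypothetical:

```shell
if command -v gawk >/dev/null 2>&1; then
  actions=$(printf 'start\nstop\nfoo\n' | gawk '{
    switch ($1) {
    case "start":                  # a case can be a constant value ...
      print $1 ": starting"
      break
    case /^st(op|atus)$/:          # ... or a regular expression
      print $1 ": stop or status"
      break
    default:
      print $1 ": unknown"
    }
  }')
  printf '%s\n' "$actions"
fi
```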
9. while loop
while (condition) {statement;…}
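A typical use of while is to walk over the fields of a record. A minimal sketch on a hypothetical three-word line:

```shell
# Print each field together with its length.
field_lengths=$(echo 'one three eleven' | awk '{
  i = 1
  while (i <= NF) {
    print $i, length($i)
    i++
  }
}')
printf '%s\n' "$field_lengths"
```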
10. do-while loop
do {statement;…}while(condition)
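Unlike while, a do-while loop runs its body at least once, because the condition is tested after the body. A small sketch that adds the integers from 1 to 100:

```shell
# The body runs first; the condition is checked at the end of each pass.
sum=$(awk 'BEGIN { i = 1; do { sum += i; i++ } while (i <= 100); print sum }')
echo "$sum"
```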
11. for loop
for(expr1;expr2;expr3) {statement;…}
for(variable assignment;condition;iteration process) {for-body}
Special usage: iterate over the elements of an array (var is set to each index in turn)
for(var in array) {for-body}
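Both forms of for are sketched below on hypothetical input. The order in which for(var in array) visits the indexes is not guaranteed, so the second command sorts its output:

```shell
# Counting form: add the integers from 1 to 5.
total=$(awk 'BEGIN { for (i = 1; i <= 5; i++) total += i; print total }')

# Array form: k takes each index of seen in turn.
counts=$(printf 'a\nb\na\nc\na\n' |
  awk '{ seen[$1]++ } END { for (k in seen) print k, seen[k] }' | sort)

printf '%s\n' "$total" "$counts"
```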
12. continue and break
continue: skip the rest of the current iteration and start the next one
break: leave the enclosing loop
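The two statements can be contrasted in one loop. A minimal sketch that collects odd numbers and stops once the counter passes 7:

```shell
odds=$(awk 'BEGIN {
  for (i = 1; i <= 10; i++) {
    if (i % 2 == 0) continue      # skip the rest of this iteration
    if (i > 7) break              # leave the loop entirely
    out = out sep i; sep = " "    # append i, separated by spaces
  }
  print out
}')
echo "$odds"
```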
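13. next
next ends the processing of the current record early and moves on to the next record; the remaining rules are skipped for the current record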
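A common use of next is to drop unwanted records in an early rule so that later rules never see them. A sketch on hypothetical input containing a comment line and a blank line:

```shell
kept=$(printf '# comment\nfoo 1\n\nbar 2\n' | awk '
/^#/ || NF == 0 { next }      # skip comment lines and blank lines
{ print NR ": " $1 }          # reached only by the remaining records
')
printf '%s\n' "$kept"
```

NR still counts the skipped records, which is why the surviving lines are numbered 2 and 4.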
14. Arrays
array_name[index-expression]
- awk arrays are associative, so they can store key/value pairs
- The index-expression can be any string; string literals must be enclosed in double quotes
- If a referenced array element does not yet exist, awk creates it automatically and initializes it to the empty string
- To test whether an element exists in an array, use the form "index in array"
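The points above can be checked with a short counting example. The request methods used as input are hypothetical:

```shell
methods=$(printf 'GET\nPOST\nGET\n' | awk '
{ count[$1]++ }                      # the element is created on first use
END {
  # "in" tests for existence without creating the element.
  if ("PUT" in count) { print "PUT seen" } else { print "PUT not seen" }
  n = 0
  for (k in count) n++               # count the indexes that exist
  print n " distinct methods"
  print "GET=" count["GET"], "POST=" count["POST"]
}')
printf '%s\n' "$methods"
```

Had the test been written as count["PUT"] != "", the reference itself would have created an empty element and the array would hold three indexes.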
15. awk functions
awk functions are divided into built-in functions and user-defined functions
https://www.gnu.org/software/gawk/manual/gawk.html#Functions
15.1 Common built-in functions
- Numeric functions:
rand(): returns a random number between 0 and 1
srand(): used together with rand() to seed the random number generator
int(): returns the integer part
- String functions:
length([s]): returns the length of the given string
sub(r,s,[t]): searches string t for the pattern r and replaces the first match with s
gsub(r,s,[t]): searches string t for the pattern r and replaces every match with s
split(s,array,[r]): splits string s on the separator r and stores the pieces in array; the first index is 1, the second is 2, …
- Calling shell commands from awk:
system("cmd")
- Time functions:
https://www.gnu.org/software/gawk/manual/gawk.html#Time-Functions
systime(): returns the current time as the number of seconds since 1970-01-01
strftime(): formats a time according to the given format
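The string and numeric functions can be tried on a fixed string. The sketch below uses a sample timestamp as plain text; rand(), systime(), and strftime() are left out because their results change from run to run:

```shell
builtins=$(awk 'BEGIN {
  s = "2022-07-07 12:57:00"
  print length(s)                            # number of characters in s
  t = s; sub(/-/, "/", t); print t           # only the first "-" is replaced
  u = s; n = gsub(/-/, "/", u); print n, u   # gsub returns the number of replacements
  m = split(s, parts, /[- :]/)               # split on "-", space, or ":"
  print m, parts[1], parts[m]                # piece count, first piece, last piece
  print int(7 / 2)                           # fractional part is dropped
}')
printf '%s\n' "$builtins"
```

sub and gsub modify their third argument in place, which is why the string is copied into t and u first.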
15.2 Custom functions
function name ( parameter, parameter, ... ) {
statements
return expression
}
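Following the template above, a user-defined function might look like this. The function name max and the input numbers are hypothetical:

```shell
larger=$(printf '3 9\n12 4\n' | awk '
function max(a, b) {
  return (a > b) ? a : b
}
{ print max($1 + 0, $2 + 0) }   # adding 0 forces a numeric comparison
')
printf '%s\n' "$larger"
```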
16. awk scripts
awkfile var=value var2=value2... Inputfile
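A sketch of the calling convention above, assuming the program is stored in a temporary file and run with awk -f. The file contents, the variable name min, and the shebang path are all hypothetical; the shebang only matters if the file is made executable and run directly:

```shell
script=$(mktemp)    # hypothetical script file
input=$(mktemp)     # hypothetical input file

cat > "$script" <<'EOF'
#!/usr/bin/awk -f
# List the names whose third field is at least "min".
BEGIN { FS = ":" }
$3 >= min { print $1 }
EOF
printf 'root:x:0\nalice:x:1000\n' > "$input"

# A var=value argument placed before a file name is assigned just before that file is read.
users=$(awk -f "$script" min=1000 "$input")
echo "$users"
rm -f "$script" "$input"
```

Because the assignment happens after BEGIN has run, min is not yet set inside the BEGIN block; use -v when the value is needed there.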