awk, one of the three swordsmen of text processing
2022-07-07 12:57:00 【LC181119】
1. How awk works and basic usage
- AWK: the original awk, from AT&T Bell Labs
- NAWK: new awk, the upgraded version of AWK from AT&T Bell Labs
- GAWK: GNU AWK. GNU/Linux distributions commonly ship GAWK, which is compatible with AWK and NAWK
https://www.gnu.org/software/gawk/manual/gawk.html
- Text processing
- Output formatted text report
- Perform arithmetic operations
- Perform string operations
awk [options] 'program' var=value file…
awk [options] -f programfile var=value file…
- BEGIN block
- General statement blocks, selected by pattern matching
- END block
- -F "separator": the field separator used for input; by default, fields are separated by runs of whitespace
- -v var=value: assign a variable
pattern{action statements;..}

- Each piece of a record, split by the separator, is a field (also called a column). Fields are referenced as $1, $2, ... $n, and $0 is the whole record. Note: $ here does not mean the same thing as $ in a shell variable
- Each line of a file is called a record
- If action is omitted, the default action is print $0
- Output statements: print, printf
- Expressions: arithmetic, comparison expressions, etc.
- Compound statements: grouped statements
- Control statements: if, while, etc.
- Input statements
- { statements;… }: a compound statement
- if(condition) {statements;…}
- if(condition) {statements;…} else {statements;…}
- while(condition) {statements;…}
- do {statements;…} while(condition)
- for(expr1;expr2;expr3) {statements;…}
- break
- continue
- exit
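The three kinds of blocks can be seen together in a small sketch. The sample data and the function name demo_blocks are made up for illustration; only standard awk features are used.

```shell
# demo_blocks: hypothetical helper; the name:score records are invented sample data.
demo_blocks() {
  printf 'alice:90\nbob:55\ncarol:72\n' |
  awk -F: '
    BEGIN    { print "name score" }          # runs once, before any input is read
    $2 >= 60 { print $1, $2; passed++ }      # runs for every record that matches the pattern
    END      { print "passed:", passed }     # runs once, after the last record
  '
}
demo_blocks
```

Each input line is one record, -F: splits it into $1 and $2, and the middle rule fires only for records whose second field is at least 60.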
2. The print action
print item1, item2, ...
- Items are separated by commas
- An item can be a string or a number: a field of the current record, a variable, or an awk expression
- If item is omitted, print is equivalent to print $0
- Literal strings must be enclosed in double quotes (" "); variables and numbers need no quotes
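A minimal sketch of these rules, using one invented passwd-style record (demo_print is a hypothetical name):

```shell
# demo_print: hypothetical helper showing how print treats its items.
demo_print() {
  printf 'root:x:0\n' | awk -F: '{
    print $1, $3          # comma: items are joined with the output separator (a space by default)
    print $1 $3           # no comma: the items are concatenated
    print "user:", $1     # literal text has to be quoted
    print                 # no items: same as print $0
  }'
}
demo_print
```

The difference between the first two lines is the usual source of surprise: without the comma, awk performs string concatenation instead of printing two separate items.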
[root@host logs]# awk '{print $1}' nginx.access.log-20200428 | sort | uniq -c | sort -nr | head -3
5498 122.51.38.20
2161 117.157.173.214
953 211.159.177.120
[root@host ~]# awk '{print $1}' access_log | sort | uniq -c | sort -nr | head
4870 172.20.116.228
3429 172.20.116.208
2834 172.20.0.222
2613 172.20.112.14
2267 172.20.0.227
2262 172.20.116.179
2259 172.20.65.65
1565 172.20.0.76
1482 172.20.0.200
1110 172.20.28.145
[root@host ~]# df | awk -F"[[:space:]]+|%" '{print $5}'
Use
0
0
1
0
3
19
1
0
Example: extract the IP address from the output of ifconfig
[root@host ~]# ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.0.85 netmask 255.255.255.0 broadcast 10.0.0.255
inet6 fe80::20c:29ff:fe3d:d1e7 prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:3d:d1:e7 txqueuelen 1000 (Ethernet)
RX packets 24590 bytes 25224965 (24.0 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 12793 bytes 4232673 (4.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[root@host ~]# ifconfig eth0 | sed -n "2p"
inet 10.0.0.85 netmask 255.255.255.0 broadcast 10.0.0.255
[root@host ~]# ifconfig eth0 | sed -n "2p" | awk '{print $2}'
10.0.0.85
[root@host ~]# ifconfig eth0 | awk '/netmask/{print $2}'
10.0.0.85
[root@host ~]# ifconfig eth0 | awk 'NR==2{print $2}'
10.0.0.85
3. awk variables
3.1 Common built-in variables
- FS: input field separator; the default is whitespace; equivalent to the -F option
Example:
[root@host ~]# awk -v FS=":" '{print $1FS$3}' /etc/passwd | head -n3
root:0
bin:1
daemon:2
- OFS: output field separator; the default is a single space
Example:
[root@host ~]# awk -v FS=':' '{print $1,$3,$7}' /etc/passwd | head -n1
root 0 /bin/bash
[root@host ~]# awk -v FS=':' -v OFS=':' '{print $1,$3,$7}' /etc/passwd | head -n1
root:0:/bin/bash
- RS: input record separator; sets what ends a record on input (a newline by default)
awk -v RS=' ' '{print }' /etc/passwd
- ORS: output record separator; output records end with this string instead of a newline
awk -v RS=' ' -v ORS='###' '{print $0}' /etc/passwd
- NF: number of fields
# When referencing an awk variable, do not put $ before its name
[root@host ~]# awk -F: '{print NF}' /etc/fstab
[root@host ~]# awk -F: '{print $(NF-1)}' /etc/passwd
[root@host ~]# ls /misc/cd/BaseOS/Packages/*.rpm | awk -F"." '{print $(NF-1)}' | sort | uniq -c
389 i686
208 noarch
1060 x86_64
- NR: record number
[root@host ~]# awk '{print NR,$0}' /etc/issue /etc/centos-release
1 \S
2 Kernel \r on an \m
3
4 CentOS Linux release 8.1.1911 (Core)
- FNR: record number, counted separately for each file
awk '{print FNR}' /etc/fstab /etc/inittab
[root@host ~]# awk '{print NR,$0}' /etc/issue /etc/redhat-release
1 \S
2 Kernel \r on an \m
3
4 CentOS Linux release 8.0.1905 (Core)
[root@host script40]# awk '{print FNR,$0}' /etc/issue /etc/redhat-release
1 \S
2 Kernel \r on an \m
3
1 CentOS Linux release 8.0.1905 (Core)
- FILENAME: name of the current file
[root@host ~]# awk '{print FILENAME}' /etc/fstab
[root@host ~]# awk '{print FNR,FILENAME,$0}' /etc/issue /etc/redhat-release
1 /etc/issue \S
2 /etc/issue Kernel \r on an \m
3 /etc/issue
1 /etc/redhat-release CentOS Linux release 8.0.1905 (Core)
- ARGC: number of command-line arguments
[root@host ~]# awk '{print ARGC}' /etc/issue /etc/redhat-release
3
3
3
3
[root@host ~]# awk 'BEGIN{print ARGC}' /etc/issue /etc/redhat-release
3
- ARGV: array holding the command-line arguments: ARGV[0], ARGV[1], ...
[root@host ~]# awk 'BEGIN{print ARGV[0]}' /etc/issue /etc/redhat-release
awk
[root@host ~]# awk 'BEGIN{print ARGV[1]}' /etc/issue /etc/redhat-release
/etc/issue
[root@host ~]# awk 'BEGIN{print ARGV[2]}' /etc/issue /etc/redhat-release
/etc/redhat-release
[root@host ~]# awk 'BEGIN{print ARGV[3]}' /etc/issue /etc/redhat-release

[root@host ~]#
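The record-related variables are easier to compare on small, known input. The following is a sketch only: demo_vars and the two temporary files are hypothetical, and mktemp is assumed to be available.

```shell
# demo_vars: hypothetical helper; the two files are throwaway sample data.
demo_vars() {
  dir=$(mktemp -d)
  printf 'a b\nc d e\n' > "$dir/one.txt"
  printf 'f\n' > "$dir/two.txt"
  # NR keeps counting across files, FNR restarts at 1 for each file, NF is the field count.
  awk '{ print NR, FNR, NF }' "$dir/one.txt" "$dir/two.txt"
  # RS=";" splits the input into records at each semicolon; ORS="|" ends each output record.
  printf 'x;y;z' | awk -v RS=';' -v ORS='|' '{ print $0 }'
  echo
  rm -rf "$dir"
}
demo_vars
```

The first command prints one line per record (1 1 2, 2 2 3, 3 1 1), which makes the difference between NR and FNR visible as soon as the second file starts.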
3.2 Custom variables
- -v var=value
- Defined directly in the program
[root@host ~]# awk -v test1=test2="hello,gawk" 'BEGIN{print test1,test2}'
test2=hello,gawk
[root@host ~]# awk -v test1=test2="hello1,gawk" 'BEGIN{test1=test2="hello2,gawk";print test1,test2}'
hello2,gawk hello2,gawk
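A short sketch of the ways a variable can be set, including the var=value operand form from the command syntax in section 1. The names demo_custom_vars, greeting and n are invented for the example.

```shell
# demo_custom_vars: hypothetical helper comparing three ways to set a variable.
demo_custom_vars() {
  # 1) -v assigns the variable before the BEGIN block runs.
  awk -v greeting="hello" 'BEGIN { print greeting }'
  # 2) Assignment inside the program itself.
  awk 'BEGIN { greeting = "hi"; print greeting }'
  # 3) A var=value operand is assigned only when awk reaches it in the argument list,
  #    so it is still empty in BEGIN but set while the input ("-" is standard input) is read.
  printf 'x\n' | awk 'BEGIN { print "[" n "]" } { print n }' n=5 -
}
demo_custom_vars
```

The third form is the reason -v is usually preferred when the value is needed in a BEGIN block.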
4. The printf action
printf "FORMAT", item1, item2, ...
- FORMAT must be specified
- printf does not add a newline automatically; give \n explicitly
- FORMAT needs one format specifier for each item that follows
Format specifiers:
- %s: string
- %d, %i: decimal integer
- %f: floating point
- %e, %E: scientific notation
- %c: a single character (a numeric argument is taken as a character code)
- %g, %G: scientific or floating-point notation
- %u: unsigned integer
- %%: a literal %
Modifiers:
- #[.#]: the first number sets the display width; the second sets the precision after the decimal point, e.g. %3.1f
- -: left-align (right-aligned by default), e.g. %-15s
- +: show the sign of a number, e.g. %+d
awk -F: '{printf "%s",$1}' /etc/passwd
awk -F: '{printf "%s\n",$1}' /etc/passwd
awk -F: '{printf "%20s\n",$1}' /etc/passwd
awk -F: '{printf "%-20s\n",$1}' /etc/passwd
awk -F: '{printf "%-20s %10d\n",$1,$3}' /etc/passwd
awk -F: '{printf "Username: %s\n",$1}' /etc/passwd
awk -F: '{printf "Username: %sUID:%d\n",$1,$3}' /etc/passwd
awk -F: '{printf "Username: %25sUID:%d\n",$1,$3}' /etc/passwd
awk -F: '{printf "Username: %-25sUID:%d\n",$1,$3}' /etc/passwd
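Because /etc/passwd differs from system to system, the effect of width, alignment and precision is easier to check on fixed input. A minimal sketch, with invented data and a hypothetical function name:

```shell
# demo_printf: hypothetical helper; name:number pairs are sample data.
demo_printf() {
  printf 'root:0\ndaemon:2\n' |
  # %-10s  left-aligned string, padded to 10 characters
  # %5d    right-aligned integer, 5 characters wide
  # %6.2f  floating point, 6 characters wide, 2 digits after the decimal point
  awk -F: '{ printf "%-10s|%5d|%6.2f\n", $1, $2, $2 / 3 }'
}
demo_printf
```

The | characters are only there to make the padding visible in the output.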
5. Operators
Arithmetic operators:
- x+y, x-y, x*y, x/y, x^y, x%y
- -x: negate
- +x: convert a string to a number
Assignment operators:
=, +=, -=, *=, /=, %=, ^=, ++, --
[root@host ~]# awk 'BEGIN{i=0;print i++,i}'
0 1
[root@host ~]# awk 'BEGIN{i=0;print ++i,i}'
1 1
Comparison operators:
==, !=, >, >=, <, <=
[root@host ~]# seq 10 | awk 'NR%2==0'
2
4
6
8
10
[root@host ~]# seq 10 | awk 'NR%2==1'
1
3
5
7
9
Pattern-matching operators:
- ~: whether the left side matches the pattern on the right (a match anywhere in the string is enough)
- !~: whether the left side does not match
[root@host ~]# awk -F: '$0 ~ /root/{print $1}' /etc/passwd
[root@host ~]# awk -F: '$0 ~ "^root"{print $1}' /etc/passwd
[root@host ~]# awk '$0 !~ /root/' /etc/passwd
[root@host ~]# awk '/root/' /etc/passwd
[root@host ~]# awk -F: '/r/' /etc/passwd
[root@host ~]# awk -F: '$3==0' /etc/passwd
[root@host ~]# df | awk -F"[[:space:]]+|%" '$0 ~ /^\/dev\/sd/{print $5}'
51
92
[root@host ~]# ifconfig eth0 | awk 'NR==2{print $2}'
10.0.0.8
Logical operators:
- &&: and
- ||: or
- !: not (negation)
[root@host ~]# awk 'BEGIN{print !i}'
1
[root@host ~]# awk -v i=10 'BEGIN{print !i}'
0
[root@host ~]# awk -v i=-3 'BEGIN{print !i}'
0
[root@host ~]# awk -v i=0 'BEGIN{print !i}'
1
[root@host ~]# awk -v i=abc 'BEGIN{print !i}'
0
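The examples above cover only negation. A small sketch of && and || used as patterns follows; demo_logic is a hypothetical name and seq is assumed to be available, as elsewhere in this article.

```shell
# demo_logic: hypothetical helper; each awk program is a bare pattern, so matching lines are printed.
demo_logic() {
  seq 1 10 | awk '$1 > 3 && $1 % 2 == 0'   # both conditions must hold
  seq 1 10 | awk '$1 < 2 || $1 > 9'        # either condition is enough
  seq 1 3  | awk '!($1 == 2)'              # negation of a comparison
}
demo_logic
```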
Conditional expression (ternary operator):
selector?if-true-expression:if-false-expression
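As a sketch of the conditional expression, the following labels accounts by numeric ID. The data, the threshold of 1000 and the name demo_ternary are assumptions made for the example.

```shell
# demo_ternary: hypothetical helper; name:uid pairs are sample data.
demo_ternary() {
  printf 'root:0\nalice:1000\nbin:1\n' |
  awk -F: '{ kind = ($2 >= 1000) ? "regular user" : "system user"; print $1, kind }'
}
demo_ternary
```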
6. Patterns (PATTERN)
- Not specified: the empty pattern, which matches every line
[root@host ~]# awk -F: '{print $1,$3}' /etc/passwd
- /regular expression/: only lines that match the regular expression are processed; the expression must be enclosed in / /
- relational expression: the line is processed only when the expression evaluates to true
- line ranges: a range of lines
- Line numbers cannot be used directly, but the NR variable can specify them indirectly
- BEGIN/END patterns
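The pattern types listed above can be tried on small inputs. This is a sketch with invented data; demo_patterns is a hypothetical name, and the /START/,/END/ form is the standard awk range pattern.

```shell
# demo_patterns: hypothetical helper; every program is a bare pattern, so the default action (print $0) applies.
demo_patterns() {
  seq 1 10 | awk 'NR >= 2 && NR <= 4'                     # line numbers, expressed through NR
  printf 'a\nSTART\nb\nEND\nc\n' | awk '/START/,/END/'    # range: from a START line through the next END line
  printf 'root\nbin\nroot2\n' | awk '/^root/'             # regular expression pattern
}
demo_patterns
```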
7. Conditional: if-else
if(condition){statement;…}[else statement]
if(condition1){statement1}else if(condition2){statement2}else if(condition3){statement3}...... else {statementN}
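A minimal sketch of a chained if / else if / else, grading invented scores (demo_if and the thresholds are assumptions for illustration):

```shell
# demo_if: hypothetical helper; one score per input line.
demo_if() {
  printf '95\n72\n40\n' | awk '{
    if ($1 >= 90)      { grade = "A" }
    else if ($1 >= 60) { grade = "pass" }
    else               { grade = "fail" }
    print $1, grade
  }'
}
demo_if
```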
8. Conditional: switch
switch(expression) {case VALUE1 or /REGEXP/: statement1; case VALUE2 or /REGEXP2/: statement2; ...; default: statementn}
9. Loop: while
while (condition) {statement;…}
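A typical use of while is walking over the fields of a record. A sketch with invented numbers (demo_while is a hypothetical name):

```shell
# demo_while: hypothetical helper; sums the fields of each input line.
demo_while() {
  printf '1 2 3\n10 20\n' | awk '{
    i = 1; sum = 0
    while (i <= NF) { sum += $i; i++ }   # $i is the i-th field of the current record
    print sum
  }'
}
demo_while
```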
10. Loop: do-while
do {statement;…}while(condition)
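The practical difference from while is that the body runs before the condition is tested. In this sketch (demo_do_while is a hypothetical name) the condition is false from the start, yet the body still runs once:

```shell
# demo_do_while: hypothetical helper; the condition i < 3 is already false when the loop starts.
demo_do_while() {
  awk 'BEGIN {
    i = 5
    do { print "body ran with i =", i; i++ } while (i < 3)
  }'
}
demo_do_while
```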
11. Loop: for
for(expr1;expr2;expr3) {statement;…}
for(variable assignment;condition;iteration process) {for-body}
Special usage: iterating over the elements of an array
for(var in array) {for-body}
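Both forms in one sketch; demo_for and the variable names are hypothetical. Note that for-in yields the array indices, and awk does not promise any particular order for them.

```shell
# demo_for: hypothetical helper showing the counted form and the for-in form.
demo_for() {
  awk 'BEGIN {
    for (i = 1; i <= 3; i++) total += i      # counted loop: 1 + 2 + 3
    print "total:", total
    split("a b c", parts, " ")               # parts[1]="a", parts[2]="b", parts[3]="c"
    count = 0
    for (k in parts) count++                 # k takes each index, in unspecified order
    print "elements:", count
  }'
}
demo_for
```

When output order matters, a counted loop over 1..n is the safer choice for arrays produced by split().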
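12. continue and break
continue
break
Unlike the shell builtins of the same name, awk's continue and break take no loop-count argument; both act on the innermost enclosing loop.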
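A sketch of both statements inside one loop (demo_loop_control is a hypothetical name):

```shell
# demo_loop_control: hypothetical helper; prints the odd numbers up to 7.
demo_loop_control() {
  awk 'BEGIN {
    for (i = 1; i <= 10; i++) {
      if (i % 2 == 0) continue   # skip the rest of this iteration for even numbers
      if (i > 7) break           # leave the loop at the first odd number above 7
      print i
    }
  }'
}
demo_loop_control
```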
13. next
next stops processing the current record immediately and moves on to the next one.
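A common use of next is to discard uninteresting records early so that later rules never see them. A sketch with invented input (demo_next is a hypothetical name):

```shell
# demo_next: hypothetical helper; skips comment and blank lines.
demo_next() {
  printf '# comment\nvalue1\n\nvalue2\n' | awk '
    /^#/ || /^$/ { next }        # skipped records never reach the rule below
    { print NR ": " $0 }         # NR still counts the skipped lines
  '
}
demo_next
```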
14. Arrays
array_name[index-expression]
- Arrays provide key/value (associative) storage
- index-expression can be any string; a literal string must be enclosed in double quotes
- If a referenced array element does not exist yet, awk creates it automatically and initializes its value to the empty string
- To test whether an element exists in an array, use the form "index in array"
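Counting occurrences is the classic key/value use of an awk array. The sketch below uses invented request methods as keys; demo_array and count are hypothetical names.

```shell
# demo_array: hypothetical helper; counts how often each first field occurs.
demo_array() {
  printf 'GET\nPOST\nGET\nGET\n' | awk '
    { count[$1]++ }                       # the element is created on first reference
    END {
      print "GET:", count["GET"]
      print "POST:", count["POST"]
      # "in" tests for existence without creating the element
      if ("PUT" in count) print "PUT seen"; else print "PUT not seen"
    }
  '
}
demo_array
```

Testing with count["PUT"] == "" instead would work, but it would also create the element as a side effect, which is why the in form is recommended.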
15. awk functions
awk functions are divided into built-in and user-defined functions
https://www.gnu.org/software/gawk/manual/gawk.html#Functions
15.1 Common built-in functions
- Numeric functions:
rand(): returns a random number between 0 and 1
srand(): used together with rand(); seeds the random number generator
int(): returns the integer part
- String functions:
length([s]): returns the length of the given string
sub(r,s,[t]): searches string t for the pattern r and replaces the first match with s
gsub(r,s,[t]): searches string t for the pattern r and replaces every match with s
split(s,array,[r]): splits string s using r as the separator and stores the pieces in array; the first index is 1, the second is 2, …
- Calling shell commands from awk:
system("cmd")
- Time functions
https://www.gnu.org/software/gawk/manual/gawk.html#Time-Functions
systime(): the current time as the number of seconds since January 1, 1970
strftime(): formats a time according to the given format
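The deterministic functions above can be checked on a fixed string. This is a sketch: demo_builtins and the sample date string are invented, and rand(), systime() and strftime() are left out because their output changes from run to run.

```shell
# demo_builtins: hypothetical helper exercising length, sub, gsub, split, int and system.
demo_builtins() {
  awk 'BEGIN {
    s = "2022-07-07"
    print length(s)
    t = s; sub(/-/, "/", t);  print t          # only the first "-" is replaced
    u = s; gsub(/-/, "/", u); print u          # every "-" is replaced
    n = split(s, parts, "-"); print n, parts[1], parts[3]
    print int(3.9), int(-3.9)                  # int() truncates toward zero
  }'
  # system() runs a shell command; note the double quotes around the command string.
  awk 'BEGIN { rc = system("echo from-shell"); print "rc:", rc }'
}
demo_builtins
```

sub() and gsub() modify their target in place and return the number of replacements, which is why the string is copied into t and u before the call.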
15.2 Custom functions
function name ( parameter, parameter, ... ) {
statements
return expression
}
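A minimal sketch of a user-defined function following the template above. The function max2 and the wrapper demo_function are hypothetical names.

```shell
# demo_function: hypothetical helper; prints the larger field of each line.
demo_function() {
  printf '3 9\n12 5\n' | awk '
    # max2: returns the larger of its two arguments
    function max2(a, b) { return (a > b) ? a : b }
    { print max2($1, $2) }
  '
}
demo_function
```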
16. awk scripts
awkfile var=value var2=value2... Inputfile
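A sketch of keeping the program in a file and passing a variable on the command line. The file names filter.awk and data.txt, the variable min and the wrapper demo_script are all invented for the example. The script is run here with awk -f; running it directly as awkfile would additionally need an interpreter line such as #!/usr/bin/awk -f (the path is an assumption and varies by system) and execute permission.

```shell
# demo_script: hypothetical helper; writes a small awk program to a temporary file and runs it.
demo_script() {
  dir=$(mktemp -d)
  cat > "$dir/filter.awk" <<'EOF'
# filter.awk (hypothetical): print the first field of records whose second field is at least "min"
$2 >= min { print $1 }
EOF
  printf 'a 1\nb 5\nc 9\n' > "$dir/data.txt"
  # min=5 is a var=value operand; it is assigned before data.txt is read.
  awk -f "$dir/filter.awk" min=5 "$dir/data.txt"
  rm -rf "$dir"
}
demo_script
```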