当前位置:网站首页>awk notes
awk notes
2022-07-30 20:31:00 【666QAQ】
快速上手
Built-in variables and printing
NF
(可以理解为Number of fields)- 内建变量,表示当前行的字段数量
{print NF, $1, $NF}
NR
(可以理解为Number of rows)- 内建变量,Indicates the number of lines read so far.
- 打印文本
- 在 print 语句中, The text surrounded by double quotes will and fields, along with the result of the operation.
{print "total pay for"}
- printf
{ printf("total pay for %s is $%.2f\n", $1, $2 * $3) }
{ printf("%-8s $%6.2f\n", $1, $2 * $3) }
- 输出排序
awk '{ printf("%6.2f %s\n", $2 * $3, $0) }' emp.data | sort -n
选择
Print those that pay over 50 的$2 * $3 > 50 { printf("$%.2f for %s\n", $2 * $3, $1) }
Print all the first fields areSusie 的行:$1 == "Susie"
通过正则表达式, 打印所有包含 Susie 的行/Susie/
模式组合
Patterns can be combined using parentheses and logical operators, 逻辑运算符包括 &&, ||, 和 !.
打印那些 $2 至少为 4, 或者 $3 至少为 20 的行:
$2 >= 4 || $3 >= 20
两个条件都满足的行只输出一次. Contrast this program with the program below, It contains two modes,If a row satisfies both conditions, It will be printed twice:
$2 >= 4
$3 >= 20
打印不满足($2 小于 4, 并且 $3 也小于 20)的行
!($2 < 4 && $3 < 20)
数据验证
Real data always has errors. Check that the data has reasonable values, 格式是否正确, This task is often referred to as data validation (data
validation), 在这一方面 awk Is a very good tool.
数据验证在本质上是否定: Lines with expected attributes are not printed, Instead, the suspicious line is printed. The next program uses the comparison mode, 将
5 Article sanity test is applied to emp.data 的每一行:
NF != 3 {
print $0, "number of fields is not equal to 3" }
$2 < 3.35 {
print $0, "rate is below minimum wage" }
$2 > 10 {
print $0, "rate exceeds $10 per hour" }
$3 < 0 {
print $0, "negative hours worked" }
$3 > 60 {
print $0, "too many hours worked" }
如果数据没有错误, 就不会有输出.
BEGIN 与 END
特殊的模式 BEGIN 在第一个输入文件的第一行之前被匹配, END The last line of the last input file is processed
之后匹配. 这个程序使用 BEGIN 打印一个标题:
BEGIN {
print "NAME RATE HOURS"; print "" }
{
print }
可以在同一行放置多个语句, 语句之间用分号分开.注意 print ""
打印一个空行, 它与一个单独的 print
并
不相同, The latter prints the current line
计算
awk
Can be used to perform simple math or string calculations,And you can use built-in variables、Custom variables to calculate and store data.
在 awk
中, User-created variables can be used without prior declaration.
计数
用一个变量 emp Calculation work hours exceeded 15 number of employees per hour:
$3 > 15 {
emp = emp + 1 }
END {
print emp, "employees worked more than 15 hours" }
Sum and Average
利用 NR to calculate the average return:
{
pay = pay + $2 * $3 }
END {
print NR, "employees"
print "total pay is", pay
print "average pay is", pay / NR
}
很明显, printf Can be used to produce more aesthetically pleasing output. This program has a potential bug: An uncommon case is NR 的值为
0, The program will try to 0 作除数, 此时 awk
就会产生一条错误消息.
字符串拼接
A new string can be generated by combining old strings; This operation is called splicing (concatenation).
{
names = names $1 " " }
END {
print names }
打印最后一行
虽然在 END 动作里, NR 的值被保留了下来, 但是 $0 却不会.
打印文件最后一行
{
last = $0 }
END {
print last }
内建函数
awk There are also built-in functions for computing other values. 求平方根, 取对数, 随机数, Except for these math functions, There are other functions for manipulating text. 其中之一是 length
, It is used to count the number of characters in a string.
Calculate the length of each person's name:
{
print $1, length($1) }
行,Count of words and characters
使用 length, NF 与 NR 计算行, Number of words and characters, 为方便起见, We treat each field as a word
{
nc = nc + length($0) + 1
nw = nw + NF
}
END {
print NR, "lines,", nw, "words,", nc, "characters" }
流程控制语句
Awk provided for decision making if-else 语句, 以及循环语句, All of these come from C 语言. They can only be used in actions(Action) 里.
if else
$2 > 6 {
n = n + 1; pay = pay + $2 * $3 }
END {
if (n > 0)
print n, "employees, total pay is", pay,
"average pay is", pay/n
else
print "no employees are paid more than $6/hour"
}
while
一个 while Contains a conditional judgment and a loop body. 当条件为真时, 循环体执行.
{
i = 1
while (i <= $3) {
printf("\t%.2f\n", $1 * (1 + $2) ^ i)
i = i + 1
}
}
for
Most loops include initialization, 测试, 增值, 而 for statement compresses all three into one line.
{
for (i = 1; i <= $3; i = i + 1)
printf("\t%.2f\n", $1 * (1 + $2) ^ i)
}
数组
一个简单的例子.
The following program displays the input data in reverse row order..The first action puts the input lines into an array line
in the next element;
也就是说, Put in the first line line[1]
, Put in the second line line[2]
, 依次类推..END
Action with one while
循环, Printing starts from the last element of the array, Print until the first element.
{
line[NR] = $0}
END {
i = NR
while(i > 0){
print line[i]
i = i-1
}
}
用forThe loop implements the equivalent program:
{
line[NR] = $0 } # remember each input line
END {
for (i = NR; i > 0; i = i - 1)
print line[i]
}
实用“一行”手册
虽然 awk
Very complex programs can be written, But many useful programs are not much more complicated than what we've seen so far.
Here are some collections of small programs, There should be some reference value for readers. Most are variants of the programs we've already discussed.
- Enter the total number of rows for the row
END { print NR }
- 打印第10行
NR == 10
- Print the last field of each input line
{ print $NF }
- Print the last field of the last line
{ field = $NF } END { print field }
- Excessive number of print fields4个的输入行
NF>4
- 打印最后一个字段值大于4的输入行
$NF >4
- Print the sum of the number of fields belonging to the input line
{ nf = nf + NF } END { print nf }
- 打印包含Beth的行的数量
/Beth/ { nlines = nlines + 1 } END { print nlines }
- Print the first field with the largest value,and the line containing it(假设$1总是正的)
$1 > max {
max = $1; maxline = $0 }
END {
print max, maxline }
- Print lines that contain at least one field
NF > 0
- 打印长度超过80个字符的行
length($0) > 80
- Prepend each line with its field number
{ print NF, $0 }
- 打印每一行的第1与第2个字段,但顺序相反
{print $2, $1}
- Swap the first of each line1与第2个字段,并打印该行
{ temp = $1; $1 = $2; $2 = temp; print}
- Replace the first field of each line with the line number
{ $1 = NR; print }
- Printing removes the first2line after fields
{ $2 = ""; print }
- Print the fields of each row in reverse order
{ for(i=NF; i>0; i=i-1) printf("%s ", $i) printf("\n") }
- Print the sum of all field values for each row
{ sum = 0 for(i=1; i<= NF; i=i+1) sum = sum + $i print sum }
- Add up all field values for all rows
{ for (i = 1; i <= NF; i = i + 1) sum = sum + $i } END { print sum }
- Replace each field of each row with its absolute value
{ for (i = 1; i <= NF; i = i + 1) if ($i < 0) $i = -$i print }
系统学习
- 待办
边栏推荐
猜你喜欢
MySQL 高级(进阶) SQL 语句 (一)
MySQL (2)
使用map函数,对list中的每个元素进行操作 好像不用map
MySQL----多表查询
倾斜文档扫描与字符识别(opencv,坐标变换分析)
Mysql索引特性(重要)
推荐系统:概述【架构:用户/物品特征工程---->召回层---->排序层---->测试/评估】【冷启动问题、实时性问题】
如何解决gedit 深色模式下高亮文本不可见?
MYSQL JDBC图书管理系统
Recommended system: cold start problem [user cold start, item cold start, system cold start]
随机推荐
Mysql索引特性(重要)
vookloop函数怎么用?vlookup函数的使用方法介绍
Recommendation System - Sorting Layer - Model (1): Embedding + MLP (Multilayer Perceptron) Model [Deep Crossing Model: Classic Embedding + MLP Model Structure]
英文字母间隔突然增大(全角与半角转换)
从离线到实时对客,湖仓一体释放全量数据价值
Scala类中的属性
多线程的互斥锁应用RAII机制
18.客户端会话技术Cookie
C语言中指针没那么难~ (1)【文章结尾有资料】
【视频】极值理论EVT与R语言应用:GPD模型火灾损失分布分析
历史上的今天:Win10 七周年;微软和雅虎的搜索协议;微软发行 NT 4.0
KEIL问题:【keil Error: failed to execute ‘C:\Keil\ARM\ARMCC‘】
excel数字如何转换成文本?excel表格数据转换成文本的方法
Recommendation system-model: FNN model (FM+MLP=FNN)
HarmonyOS笔记-----------(三)
Apple Silicon配置二进制环境(一)
Oblique document scanning and character recognition (opencv, coordinate transformation analysis)
Recommendation System - Sorting Layer: Sorting Layer Architecture [User and Item Feature Processing Steps]
c语言:操作符详解
MySQL 删除表数据,重置自增 id 为 0 的两个方式