A statistics problem solved with shell scripts
2022-06-27 13:28:00 【User 3147702】
1. Problem description
1.1. Input format:
a. The input consists of several lines of data. Each line has 3 columns, separated by \t.
b. The first column is attribute 1, the second column is attribute 2, and the third column is attribute 3.
c. Within a single calculation, the set of possible values of each attribute is fixed. For example, attribute 1 can only be 0, 1, 2 or 4, and attribute 2 can only be 29, 35, 55 or 70.
d. The set of possible values may differ between calculations. For example, in the first calculation attribute 1 can only be 0, 1, 2 or 4; in the second calculation it gains one more possible value, 5, so it can only be 0, 1, 2, 4 or 5.
1.2. Example input:
0	29	50
1	35	60
0	29	60
1.3. Output:
Flag	29	35
0	2	0
1	0	1
1.4. Interpretation of the output:
a. Flag is a constant and is printed as is.
b. Apart from Flag, the first row lists all possible values of attribute 2.
c. Apart from Flag, the first column lists all possible values of attribute 1.
d. Every other cell is a count. For example, the 2 in the second row, second column means that the input contains 2 rows in which attribute 1 is 0 and attribute 2 is 29.
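The specification above amounts to a cross-tabulation of attribute 1 against attribute 2. As a compact illustration of the expected behavior (a sketch, not the solution developed in the following sections), the whole table can be produced in one awk pass. The function names crosstab and sorted_keys are hypothetical, and the sketch assumes numeric attribute values and tab-separated input on standard input.

```shell
#!/bin/bash
# crosstab: hypothetical helper, not part of the original scripts.
# Reads tab-separated rows on stdin and prints the Flag table.
crosstab() {
    awk -F'\t' '
    # Copy the keys of set into out[1..n] in ascending numeric order
    # (insertion sort, since POSIX awk has no built-in sort).
    function sorted_keys(set, out,    k, n, i, j, t) {
        n = 0
        for (k in set) out[++n] = k
        for (i = 2; i <= n; i++) {
            t = out[i]
            for (j = i - 1; j >= 1 && out[j] + 0 > t + 0; j--)
                out[j + 1] = out[j]
            out[j + 1] = t
        }
        return n
    }
    {
        count[$1 "\t" $2]++   # rows per (attribute 1, attribute 2) pair
        rows[$1] = 1          # distinct values of attribute 1
        cols[$2] = 1          # distinct values of attribute 2
    }
    END {
        nrow = sorted_keys(rows, r)
        ncol = sorted_keys(cols, c)
        line = "Flag"
        for (j = 1; j <= ncol; j++) line = line "\t" c[j]
        print line
        for (i = 1; i <= nrow; i++) {
            line = r[i]
            for (j = 1; j <= ncol; j++)
                line = line "\t" (count[r[i] "\t" c[j]] + 0)   # missing pairs count as 0
            print line
        }
    }'
}

# Usage: the example input from section 1.2
printf '0\t29\t50\n1\t35\t60\n0\t29\t60\n' | crosstab
```

Run on the example input, this prints the three-line table from section 1.3. Keeping the distinct row and column values in separate arrays is what allows cells with no matching input rows to be filled with 0.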
2. Solution code
2.1. main.sh
#!/bin/bash
bash count.sh > output.txt &&
bash count_result.sh > result.txt &&
cat result.txt
echo
2.2. count.sh
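#!/bin/bash
# Sort so that rows with the same (attribute 1, attribute 2) pair are adjacent.
sort -k 1,2 input.txt > output_a.txt &&
b="<![INITED]>";    # previous pair; the sentinel means no row has been read yet
i=1;                # number of rows seen for the current pair
while read line; do
    y=`echo $line | awk '{ print $1"\t"$2 }'`;
    if [ "$b" != "<![INITED]>" ]; then
        if [ "$y" != "$b" ]
        then
            # The pair changed: print the finished pair and its count.
            echo -e "${b}\t${i}";
            i=1;
        else
            i=$((i+1))
        fi
    fi
    b=$y;
done < output_a.txt
# Print the last pair and its count.
echo -e "${b}\t${i}";
2.3. count_result.sh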
#!/bin/bash
echo -e "Flag\t\c"
# Collect the distinct values of attribute 1 (output.txt is sorted by it).
i=0
x="<![INITED]>"
while read line; do
    b=`echo -e $line | awk '{print $1}'`
    if [ "$b" != "$x" ]; then
        array_1[$((i))]=$b
        x=${array_1[$((i))]}
        i=$((i+1))
    fi
done < output.txt
# Collect the distinct values of attribute 2 for the header row.
sort -k 2 output.txt > out2.txt
i=0
x="<![INITED]>"
while read line; do
    b=`echo -e $line | awk '{print $2}'`
    if [ "$b" != "$x" ]; then
        array_2[$((i))]=$b
        x=${array_2[$((i))]}
        i=$((i+1))
    fi
done < out2.txt
for var in ${array_2[@]}; do
    echo -e $var"\t\c"
done
echo
# Print one table row per value of attribute 1.
# i is the current column index, j is the current row index.
echo -e ${array_1[0]}"\t\c"
i=0
j=0
while read line; do
    e1=`echo -e $line | awk '{ print $1 }'`
    e2=`echo -e $line | awk '{ print $2 }'`
    e3=`echo -e $line | awk '{ print $3 }'`
    if [ $e1 != ${array_1[$((j))]} ]; then
        # Attribute 1 changed: pad the rest of the row with 0 and start the next row.
        while [ $((i)) -lt ${#array_2[@]} ]; do
            echo -e 0"\t\c"
            i=$((i+1))
        done
        i=0
        j=$((j+1))
        echo
        echo -e ${array_1[$((j))]}"\t\c"
    fi
    # Print 0 for every column before the one matching attribute 2, then the count.
    while [ 1 ]; do
        if [ $e2 == ${array_2[$((i))]} ]; then
            echo -e $e3"\t\c"
            break;
        elif [ $e2 -gt ${array_2[$((i))]} ]; then
            echo -e 0"\t\c"
            i=$((i+1))
        fi
    done
    i=$((i+1))
done < output.txt
# Pad the last row with 0.
while [ $((i)) -lt ${#array_2[@]} ]; do
    echo -e 0"\t\c"
    i=$((i+1))
done
rm -rf out*.txt
3. Defects
Reading a file with while read line; do ... done is inefficient here, because every iteration forks at least one awk process through command substitution. For inputs of 100,000 lines or more the run time is unacceptable, so this approach had to be abandoned.
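To illustrate the cost, the per-pair counting done by count.sh can be expressed with standard filters that read the file in bulk instead of starting awk for every line. This is a sketch, not the author's improvement (which uses awk and is shown in the next section). The function name count_pairs is hypothetical, and the sketch assumes attribute values contain no spaces.

```shell
#!/bin/bash
# count_pairs: hypothetical helper, not part of the original scripts.
# $1: tab-separated input file.
# Prints attr1<TAB>attr2<TAB>count for each distinct pair.
count_pairs() {
    cut -f 1,2 "$1" |   # keep the first two columns
        sort |          # make identical pairs adjacent
        uniq -c |       # prefix each distinct pair with its row count
        awk '{ print $2 "\t" $3 "\t" $1 }'   # move the count to the last column
}

# Usage: count_pairs input.txt > output.txt
```

Each stage of the pipeline runs as one process for the whole file, so the number of processes no longer grows with the number of input lines.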
4. awk improvement
4.1. awk_main.sh
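# First calculation: count rows per pair of columns 2 and 4.
awk -F"\t" -v a=2 -v b=4 -f count.awk zeyu_test_input > out.txt &&
sort -k 1,2 out.txt > output.txt &&
# count_result.sh uses bash arrays, so run it with bash rather than sh.
bash count_result.sh > result.txt &&
echo >> result.txt
echo >> result.txt
# Second calculation: count rows per pair of columns 2 and 5.
awk -F"\t" -v a=2 -v b=5 -f count.awk zeyu_test_input > out.txt &&
sort -k 1,2 out.txt > output.txt &&
bash count_result.sh >> result.txt &&
cat result.txt &&
echo
4.2. count.awk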
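# a and b are column numbers passed in with -v.
# Count the input rows for each distinct ($a, $b) pair.
{
    y = $a "\t" $b;
    A[y]++;    # an unset element starts at 0, so no explicit initialization is needed
}
END {
    for (k in A) {
        print k "\t" A[k];
    }
}
4.3. count_result.sh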
#!/bin/bash
echo -e "Flag\t\c"
# Collect the distinct values of the first key column (output.txt is sorted by it).
i=0
x="<![INITED]>"
while read line; do
    b=`echo $line | awk '{print $1}'`
    if [ "$b" != "$x" ]; then
        array_1[$((i))]=$b
        x=${array_1[$((i))]}
        i=$((i+1))
    fi
done < output.txt
# Collect the distinct values of the second key column for the header row.
sort -k 2 output.txt > out2.txt
i=0
x="<![INITED]>"
while read line; do
    b=`echo $line | awk '{print $2}'`
    if [ "$b" != "$x" ]; then
        array_2[$((i))]=$b
        x=${array_2[$((i))]}
        i=$((i+1))
    fi
done < out2.txt
for var in ${array_2[@]}; do
    echo -e $var"\t\c"
done
echo -e "total\t"
# Print one table row per value of the first key column, ending with the row total.
# i is the current column index, j is the current row index.
echo -e ${array_1[0]}"\t\c"
i=0
j=0
x_t=0    # running total of the current row
while read line; do
    e1=`echo $line | awk '{ print $1 }'`
    e2=`echo $line | awk '{ print $2 }'`
    e3=`echo $line | awk '{ print $3 }'`
    if [ $e1 != ${array_1[$((j))]} ]; then
        # The row key changed: pad the row with 0, print its total, start the next row.
        while [ $((i)) -lt ${#array_2[@]} ]; do
            echo -e 0"\t\c"
            i=$((i+1))
        done
        i=0
        j=$((j+1))
        echo $x_t
        x_t=0
        echo -e ${array_1[$((j))]}"\t\c"
    fi
    # Print 0 for every column before the matching one, then the count.
    while [ 1 ]; do
        if [ $e2 == ${array_2[$((i))]} ]; then
            echo -e $e3"\t\c"
            y_t[$((i))]=$(($((${y_t[$((i))]}))+$e3));    # running total of column i
            x_t=$((x_t+e3))
            break;
        elif [ $e2 -gt ${array_2[$((i))]} ]; then
            echo -e 0"\t\c"
            i=$((i+1))
        fi
    done
    i=$((i+1))
done < output.txt
# Pad the last row with 0 and print its total.
while [ $((i)) -lt ${#array_2[@]} ]; do
    echo -e 0"\t\c"
    i=$((i+1))
done
echo $x_t
# Print the column totals and the grand total.
i=0
echo -e "total\t\c"
for var in ${y_t[@]}; do
    echo -e $var"\t\c"
    i=$((i+var))
done
echo $i
rm -rf out*.txt
5. Execution results
Processing 1,620,590 rows of data takes 16 seconds. The execution result is shown in the figure:
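A result like this can be sanity-checked on a small input first. The sketch below condenses the counting stage of section 4 (count.awk followed by sort) into one function, so that the effect of the column variables a and b is easy to see. The function name count_by_columns is hypothetical, and plain sort is used here in place of sort -k 1,2.

```shell
#!/bin/bash
# count_by_columns: hypothetical wrapper, not part of the original scripts.
# $1, $2: 1-based column numbers to pair up; $3: tab-separated input file.
count_by_columns() {
    awk -F'\t' -v a="$1" -v b="$2" '
        { A[$a "\t" $b]++ }                       # count rows per pair of columns a and b
        END { for (k in A) print k "\t" A[k] }    # for-in order is unspecified
    ' "$3" | sort                                 # so sort the result afterwards
}

# Usage, mirroring the two runs in awk_main.sh:
#   count_by_columns 2 4 zeyu_test_input
#   count_by_columns 2 5 zeyu_test_input
```

Passing the column numbers with -v keeps the awk program itself unchanged between calculations, which is the same design choice count.awk makes.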