Machine learning - Data Science Library - day two
2022-07-01 12:04:00 【weixin_ forty-five million six hundred and forty-nine thousand 】
Draw a scatter plot
Suppose you have scraped the daily maximum daytime temperatures in Beijing for March and October 2016 (stored in lists a and b respectively). How can you find out how the temperature changes over time (by day)?
a = [11,17,16,11,12,11,12,6,6,7,8,9,12,15,14,17,18,21,16,17,20,14,15,15,15,19,21,22,22,22,23]
b = [26,26,28,19,21,17,16,19,18,20,20,19,22,23,17,20,21,20,22,15,11,15,5,13,17,10,11,13,12,13,6]
from matplotlib import pyplot as plt
from matplotlib import font_manager
# matplotlib.rc() returns None, so create a FontProperties object to pass to the text calls
# (the font is only needed when the labels contain Chinese characters)
my_font = font_manager.FontProperties(family='Microsoft YaHei',weight='bold')
y_3 = [11,17,16,11,12,11,12,6,6,7,8,9,12,15,14,17,18,21,16,17,20,14,15,15,15,19,21,22,22,22,23]
y_10 = [26,26,28,19,21,17,16,19,18,20,20,19,22,23,17,20,21,20,22,15,11,15,5,13,17,10,11,13,12,13,6]
x_3 = range(1,32)
# Shift October to the right so the two months do not overlap on the x-axis
x_10 = range(51,82)
# Set the figure size
plt.figure(figsize=(20,8),dpi=80)
# Use scatter() to draw the scatter plot; this is the only difference from the earlier line chart
plt.scatter(x_3,y_3,label="March")
plt.scatter(x_10,y_10,label="October")
# Adjust the x-axis ticks
_x = list(x_3)+list(x_10)
_xtick_labels = ["March {}".format(i) for i in x_3]
_xtick_labels += ["October {}".format(i-50) for i in x_10]
plt.xticks(_x[::3],_xtick_labels[::3],fontproperties=my_font,rotation=45)
# Add the legend
plt.legend(loc="upper left",prop=my_font)
# Add axis labels and a title
plt.xlabel("Time",fontproperties=my_font)
plt.ylabel("Temperature",fontproperties=my_font)
plt.title("Title",fontproperties=my_font)
# Show the figure
plt.show()
Running results:
More use cases for scatter plots
- Exploring the relationship between different conditions (dimensions)
- Observing how dispersed or clustered the data is
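A trend suggested by a scatter plot can also be checked numerically. The following is a minimal sketch, assuming that a straight least-squares line is a reasonable summary of each month; the helper name trend_slope is hypothetical and is not part of matplotlib.

```python
def trend_slope(xs, ys):
    """Return the least-squares slope of ys against xs (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


y_3 = [11,17,16,11,12,11,12,6,6,7,8,9,12,15,14,17,18,21,16,17,20,14,15,15,15,19,21,22,22,22,23]
y_10 = [26,26,28,19,21,17,16,19,18,20,20,19,22,23,17,20,21,20,22,15,11,15,5,13,17,10,11,13,12,13,6]
days = list(range(1, 32))

slope_march = trend_slope(days, y_3)
slope_october = trend_slope(days, y_10)
print("March slope (temperature units per day):", round(slope_march, 2))
print("October slope (temperature units per day):", round(slope_october, 2))
```

A positive slope means the temperature tends to rise over the month, and a negative slope means it tends to fall.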
Draw a bar graph
Suppose you have the top 20 films at the 2017 mainland China box office (list a) and their box-office takings (list b). How can you show the data more intuitively?
a = ["Wolf Warrior 2","Fast & Furious 8","Kung Fu Yoga","Journey to the West: The Demons Strike Back","Transformers 5: The Last Knight","Dangal","Pirates of the Caribbean 5: Dead Men Tell No Tales","Kong: Skull Island","xXx: Return of Xander Cage","Resident Evil 6: The Final Chapter","Duckweed","Despicable Me 3","The Taking of Tiger Mountain","Buddies in India","Logan","Spider-Man: Homecoming","Wu Kong","Guardians of the Galaxy Vol. 2","Some Like It Hot","The Mummy"]
b=[56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,6.86,6.58,6.23] Unit: 100 million yuan
The bar() method draws a bar chart; the width parameter controls the thickness of the bars.
from matplotlib import pyplot as plt
from matplotlib import font_manager
# matplotlib.rc() returns None, so create a FontProperties object to pass to the text calls
my_font = font_manager.FontProperties(family='Microsoft YaHei',weight='bold')
a = ["Wolf Warrior 2","Fast & Furious 8","Kung Fu Yoga","Journey to the West: The Demons Strike Back","Transformers 5: The Last Knight","Dangal","Pirates of the Caribbean 5: Dead Men Tell No Tales","Kong: Skull Island","xXx: Return of Xander Cage","Resident Evil 6: The Final Chapter","Duckweed","Despicable Me 3","The Taking of Tiger Mountain","Buddies in India","Logan","Spider-Man: Homecoming","Wu Kong","Guardians of the Galaxy Vol. 2","Some Like It Hot","The Mummy"]
b=[56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,6.86,6.58,6.23]
# Set the figure size
plt.figure(figsize=(20,15),dpi=80)
# Draw the bar chart
plt.bar(range(len(a)),b,width=0.3)
# Use the film titles as the x-axis tick labels
plt.xticks(range(len(a)),a,fontproperties=my_font,rotation=90)
# Save the figure to a file
plt.savefig("./movie.png")
plt.show()
Running results:
However, the long titles on the x-axis are hard to read, so swap the horizontal and vertical axes:
use the barh() method to draw a horizontal bar chart; the height parameter controls the thickness of the bars.
# Draw a horizontal bar chart
from matplotlib import pyplot as plt
from matplotlib import font_manager
# matplotlib.rc() returns None, so create a FontProperties object to pass to the text calls
my_font = font_manager.FontProperties(family='Microsoft YaHei',weight='bold')
a = ["Wolf Warrior 2","Fast & Furious 8","Kung Fu Yoga","Journey to the West: The Demons Strike Back","Transformers 5: The Last Knight","Dangal","Pirates of the Caribbean 5: Dead Men Tell No Tales","Kong: Skull Island","xXx: Return of Xander Cage","Resident Evil 6: The Final Chapter","Duckweed","Despicable Me 3","The Taking of Tiger Mountain","Buddies in India","Logan","Spider-Man: Homecoming","Wu Kong","Guardians of the Galaxy Vol. 2","Some Like It Hot","The Mummy"]
b=[56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,6.86,6.58,6.23]
# Set the figure size
plt.figure(figsize=(20,8),dpi=80)
# Draw the horizontal bar chart
plt.barh(range(len(a)),b,height=0.3,color="orange")
# Use the film titles as the y-axis tick labels
plt.yticks(range(len(a)),a,fontproperties=my_font)
# Add a light grid
plt.grid(alpha=0.3)
plt.show()
Running results:
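barh() places the first item at the bottom of the y-axis, so a list sorted from largest to smallest ends up with the largest bar at the bottom. The following is a minimal sketch of one way to handle this, assuming you want the largest value at the top of the chart; the placeholder titles and the variable names are illustrative, and only the three values come from list b.

```python
titles = ["Film A", "Film B", "Film C"]
takings = [56.01, 26.94, 17.53]

# Pair each value with its title and sort ascending by value,
# so the largest value gets the highest y position.
pairs = sorted(zip(takings, titles))
sorted_takings = [value for value, _ in pairs]
sorted_titles = [title for _, title in pairs]
positions = list(range(len(sorted_titles)))

for pos, title, value in zip(positions, sorted_titles, sorted_takings):
    print(pos, title, value)
```

The sorted lists can then be passed to plt.barh(positions, sorted_takings, ...) and plt.yticks(positions, sorted_titles, ...) in the same way as above.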
[Exercise] Suppose you know the box-office takings of the films in list a on three days: 2017-09-14 (b_14), 2017-09-15 (b_15), and 2017-09-16 (b_16). To show each film's own takings and how they compare with those of the other films, how can the data be presented more intuitively?
a = ["War for the Planet of the Apes","Dunkirk","Spider-Man: Homecoming","Wolf Warrior 2"]
b_16 = [15746,312,4497,319]
b_15 = [12357,156,2045,168]
b_14 = [2358,399,2358,362]
from matplotlib import pyplot as plt
from matplotlib import font_manager
# matplotlib.rc() returns None, so create a FontProperties object to pass to the text calls
my_font = font_manager.FontProperties(family='Microsoft YaHei',weight='bold')
a = ["War for the Planet of the Apes","Dunkirk","Spider-Man: Homecoming","Wolf Warrior 2"]
b_16 = [15746,312,4497,319]
b_15 = [12357,156,2045,168]
b_14 = [2358,399,2358,362]
bar_width = 0.2
# Shift each day's bars by one bar width so they sit side by side
x_14 = list(range(len(a)))
x_15 = [i+bar_width for i in x_14]
x_16 = [i+bar_width*2 for i in x_14]
# Set the figure size
plt.figure(figsize=(20,8),dpi=80)
plt.bar(x_14,b_14,width=bar_width,label="September 14")
plt.bar(x_15,b_15,width=bar_width,label="September 15")
plt.bar(x_16,b_16,width=bar_width,label="September 16")
# Add the legend
plt.legend(prop=my_font)
# Put the x-axis ticks under the middle bar of each group
plt.xticks(x_15,a,fontproperties=my_font)
plt.show()
Running results:
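The offsets used above (x_14, x_15, x_16) follow a simple pattern that generalises to any number of series. The following is a minimal sketch, assuming categories are spaced one unit apart as in the exercise; the helper name grouped_bar_positions is hypothetical and is not a matplotlib function.

```python
def grouped_bar_positions(n_categories, n_series, bar_width):
    """Return (positions, tick_centers) for side-by-side bars (hypothetical helper)."""
    # positions[s][c] is the x coordinate of series s in category c
    positions = [[c + s * bar_width for c in range(n_categories)]
                 for s in range(n_series)]
    # Centre each tick under the middle of its group of bars
    tick_centers = [c + bar_width * (n_series - 1) / 2
                    for c in range(n_categories)]
    return positions, tick_centers


positions, tick_centers = grouped_bar_positions(4, 3, 0.2)
x_14, x_15, x_16 = positions
print(x_15)
print(tick_centers)
```

With three series the tick centres coincide with the middle series, which is why the exercise passes x_15 to plt.xticks(). Keeping n_series * bar_width below 1 prevents neighbouring groups from overlapping.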
More use cases for bar charts
- Quantity statistics
- Frequency statistics (for example, market saturation)
Draw a histogram
Use the hist() method to draw a histogram.
Suppose you have the running times of 250 films (in list a) and want to see how they are distributed (for example, the number and frequency of films that run between 100 and 120 minutes). How should the data be presented?
a=[131, 98, 125, 131, 124, 139, 131, 117, 128, 108, 135, 138, 131, 102, 107, 114, 119, 128, 121, 142, 127, 130, 124, 101, 110, 116, 117, 110, 128, 128, 115, 99, 136, 126, 134, 95, 138, 117, 111,78, 132, 124, 113, 150, 110, 117, 86, 95, 144, 105, 126, 130,126, 130, 126, 116, 123, 106, 112, 138, 123, 86, 101, 99, 136,123, 117, 119, 105, 137, 123, 128, 125, 104, 109, 134, 125, 127,105, 120, 107, 129, 116, 108, 132, 103, 136, 118, 102, 120, 114,105, 115, 132, 145, 119, 121, 112, 139, 125, 138, 109, 132, 134,156, 106, 117, 127, 144, 139, 139, 119, 140, 83, 110, 102,123,107, 143, 115, 136, 118, 139, 123, 112, 118, 125, 109, 119, 133,112, 114, 122, 109, 106, 123, 116, 131, 127, 115, 118, 112, 135,115, 146, 137, 116, 103, 144, 83, 123, 111, 110, 111, 100, 154,136, 100, 118, 119, 133, 134, 106, 129, 126, 110, 111, 109, 141,120, 117, 106, 149, 122, 122, 110, 118, 127, 121, 114, 125, 126,114, 140, 103, 130, 141, 117, 106, 114, 121, 114, 133, 137, 92,121, 112, 146, 97, 137, 105, 98, 117, 112, 81, 97, 139, 113,134, 106, 144, 110, 137, 137, 111, 104, 117, 100, 111, 101, 110,105, 129, 137, 112, 120, 113, 133, 112, 83, 94, 146, 133, 101,131, 116, 111, 84, 137, 115, 122, 106, 144, 109, 123, 116, 111,111, 133, 150]
from matplotlib import pyplot as plt
a=[131, 98, 125, 131, 124, 139, 131, 117, 128, 108, 135, 138, 131, 102, 107, 114, 119, 128, 121, 142, 127, 130, 124, 101, 110, 116, 117, 110, 128, 128, 115, 99, 136, 126, 134, 95, 138, 117, 111,78, 132, 124, 113, 150, 110, 117, 86, 95, 144, 105, 126, 130,126, 130, 126, 116, 123, 106, 112, 138, 123, 86, 101, 99, 136,123, 117, 119, 105, 137, 123, 128, 125, 104, 109, 134, 125, 127,105, 120, 107, 129, 116, 108, 132, 103, 136, 118, 102, 120, 114,105, 115, 132, 145, 119, 121, 112, 139, 125, 138, 109, 132, 134,156, 106, 117, 127, 144, 139, 139, 119, 140, 83, 110, 102,123,107, 143, 115, 136, 118, 139, 123, 112, 118, 125, 109, 119, 133,112, 114, 122, 109, 106, 123, 116, 131, 127, 115, 118, 112, 135,115, 146, 137, 116, 103, 144, 83, 123, 111, 110, 111, 100, 154,136, 100, 118, 119, 133, 134, 106, 129, 126, 110, 111, 109, 141,120, 117, 106, 149, 122, 122, 110, 118, 127, 121, 114, 125, 126,114, 140, 103, 130, 141, 117, 106, 114, 121, 114, 133, 137, 92,121, 112, 146, 97, 137, 105, 98, 117, 112, 81, 97, 139, 113,134, 106, 144, 110, 137, 137, 111, 104, 117, 100, 111, 101, 110,105, 129, 137, 112, 120, 113, 133, 112, 83, 94, 146, 133, 101,131, 116, 111, 84, 137, 115, 122, 106, 144, 109, 123, 116, 111,111, 133, 150]
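# Compute the number of bins
d = 3 # Bin width
# (max - min) should be divisible by d so that the bin edges line up with the x ticks
num_bins = (max(a)-min(a))//d
print(max(a),min(a),max(a)-min(a))
print(num_bins)
# Set the figure size
plt.figure(figsize=(20,8),dpi=80)
# density=True plots frequency density instead of raw counts
plt.hist(a,num_bins,density=True,stacked=True)
# Set the x-axis ticks at the bin edges
plt.xticks(range(min(a),max(a)+d,d))
plt.grid()
plt.show()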
Running results:
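The bin arithmetic that the code above relies on can be written out by hand. The following is a minimal sketch, assuming integer data and a bin width that divides the data range evenly; the helper names bin_edges and count_per_bin and the short sample list are illustrative, not part of the tutorial.

```python
def bin_edges(data, d):
    """Return equally spaced bin edges of width d covering the data."""
    low, high = min(data), max(data)
    span = high - low
    if span % d != 0:
        # Otherwise the bins would not have width d
        raise ValueError("range %d is not divisible by bin width %d" % (span, d))
    return [low + i * d for i in range(span // d + 1)]


def count_per_bin(data, edges):
    """Count how many values fall into each bin."""
    d = edges[1] - edges[0]
    counts = [0] * (len(edges) - 1)
    for value in data:
        # The last bin also includes its right edge, so clamp the index
        index = min((value - edges[0]) // d, len(counts) - 1)
        counts[index] += 1
    return counts


sample = [100, 102, 105, 107, 110, 112]
edges = bin_edges(sample, 3)
counts = count_per_bin(sample, edges)
print(edges)
print(counts)
```

If the range is not divisible by the chosen width, pick a different d or pass explicit edges to plt.hist() through its bins argument.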
The tutorial passes the normed parameter to hist(), but with a current version of matplotlib the program fails with: AttributeError: 'Rectangle' object has no property 'normed'
A web search showed that the library has been updated and this parameter is no longer available, so replace normed in the code with density. The author also added stacked=True, although that parameter only affects plots with several datasets. After this change, the code runs successfully.
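With density=True, each bar height is the bin count divided by the total number of values times the bin width, so the bars together cover an area of 1. The following is a minimal sketch of that normalisation; the helper name to_density and the sample counts are illustrative.

```python
def to_density(counts, bin_width):
    """Convert raw bin counts to densities (hypothetical helper)."""
    total = sum(counts)
    return [c / (total * bin_width) for c in counts]


counts = [2, 1, 1, 2]
bin_width = 3
density = to_density(counts, bin_width)
# Total area of the bars: sum of height * width
area = sum(h * bin_width for h in density)
print(density)
print(area)
```

This is why the y-axis of the density histogram shows small fractions instead of film counts.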
More use cases for histograms
- Distribution of users' ages
- Distribution of user clicks over a period of time
- Distribution of users' active times
Summary of the matplotlib workflow
- Clarify the problem
- Choose the type of chart
- Prepare the data
- Draw the chart and refine it