Machine learning - Data Science Library - day two
2022-07-01 12:04:00 【weixin_ forty-five million six hundred and forty-nine thousand 】
Draw a scatter plot
Suppose you have scraped the daily high temperatures for Beijing in March and October 2016 (stored in lists a and b respectively). How can you find out how the air temperature changes over time (by day)?
a = [11,17,16,11,12,11,12,6,6,7,8,9,12,15,14,17,18,21,16,17,20,14,15,15,15,19,21,22,22,22,23]
b = [26,26,28,19,21,17,16,19,18,20,20,19,22,23,17,20,21,20,22,15,11,15,5,13,17,10,11,13,12,13,6]
from matplotlib import pyplot as plt
from matplotlib import font_manager
# matplotlib.rc() returns None, so build a FontProperties object to pass as fontproperties/prop
my_font = font_manager.FontProperties(family='Microsoft YaHei',weight='bold')
y_3 = [11,17,16,11,12,11,12,6,6,7,8,9,12,15,14,17,18,21,16,17,20,14,15,15,15,19,21,22,22,22,23]
y_10 = [26,26,28,19,21,17,16,19,18,20,20,19,22,23,17,20,21,20,22,15,11,15,5,13,17,10,11,13,12,13,6]
x_3 = range(1,32)
x_10 = range(51,82)
# Set the figure size
plt.figure(figsize=(20,8),dpi=80)
# Draw the scatter plot with scatter(); this is the only difference from the earlier line chart
plt.scatter(x_3,y_3,label="March")
plt.scatter(x_10,y_10,label="October")
# Adjust the x-axis ticks
_x = list(x_3)+list(x_10)
_xtick_labels = ["March {}".format(i) for i in x_3]
_xtick_labels += ["October {}".format(i-50) for i in x_10]
plt.xticks(_x[::3],_xtick_labels[::3],fontproperties=my_font,rotation=45)
# Add the legend
plt.legend(loc="upper left",prop=my_font)
# Add axis labels and a title
plt.xlabel("Time",fontproperties=my_font)
plt.ylabel("Temperature",fontproperties=my_font)
plt.title("Title",fontproperties=my_font)
# Show the figure
plt.show()
Running results:
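The x-axis handling in the code above is easy to miss: October is plotted at x = 51 to 81 so that it sits to the right of March, and 50 is subtracted again when the labels are built. Below is a minimal sketch of just that step, with no plotting. The names positions, labels and shown are illustrative and do not come from the original code.

```python
x_3 = range(1, 32)     # March 1-31 plotted at x = 1..31
x_10 = range(51, 82)   # October 1-31 shifted right to x = 51..81

positions = list(x_3) + list(x_10)
labels = ["March {}".format(i) for i in x_3]
# Undo the offset of 50 so the label shows the real day of the month
labels += ["October {}".format(i - 50) for i in x_10]

# Keep every third tick so that the labels do not overlap
shown = list(zip(positions[::3], labels[::3]))
print(shown[0], shown[-1])
```

Because positions and labels are sliced with the same step, each label stays attached to its own x value.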
More application scenarios of scatter plots
- Showing the relationship between different conditions (dimensions)
- Observing how dispersed or clustered the data is
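One way to put a number on the "relationship between dimensions" is a correlation coefficient. The sketch below is an addition and not part of the original tutorial. It computes the Pearson correlation between the day of the month and the temperature for both lists, using a hypothetical helper named pearson. A positive value suggests that temperatures rise as the month goes on, and a negative value suggests that they fall.

```python
from math import sqrt

y_3 = [11,17,16,11,12,11,12,6,6,7,8,9,12,15,14,17,18,21,16,17,20,14,15,15,15,19,21,22,22,22,23]
y_10 = [26,26,28,19,21,17,16,19,18,20,20,19,22,23,17,20,21,20,22,15,11,15,5,13,17,10,11,13,12,13,6]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

days = list(range(1, 32))
r_march = pearson(days, y_3)
r_october = pearson(days, y_10)
print("March:", round(r_march, 2), "October:", round(r_october, 2))
```

The sign of each coefficient should match what the scatter plot shows: an upward cloud for March and a downward one for October.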
Draw a bar graph
Suppose you have the top 20 films at the 2017 mainland China box office (list a) and their box-office figures (list b). How can you present the data more intuitively?
a = ["Wolf Warrior 2","The Fate of the Furious","Kung Fu Yoga","Journey to the West: The Demons Strike Back","Transformers: The Last Knight","Dangal","Pirates of the Caribbean: Dead Men Tell No Tales","Kong: Skull Island","xXx: Return of Xander Cage","Resident Evil: The Final Chapter","Duckweed","Despicable Me 3","The Taking of Tiger Mountain","Buddies in India","Logan","Spider-Man: Homecoming","Wu Kong","Guardians of the Galaxy Vol. 2","Some Like It Hot","The Mummy",]
b=[56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,6.86,6.58,6.23] # unit: 100 million yuan
Use the bar() method to draw a bar chart; the width parameter controls how thick the bars are.
from matplotlib import pyplot as plt
from matplotlib import font_manager
# matplotlib.rc() returns None, so build a FontProperties object to pass as fontproperties
my_font = font_manager.FontProperties(family='Microsoft YaHei',weight='bold')
a = ["Wolf Warrior 2","The Fate of the Furious","Kung Fu Yoga","Journey to the West: The Demons Strike Back","Transformers: The Last Knight","Dangal","Pirates of the Caribbean: Dead Men Tell No Tales","Kong: Skull Island","xXx: Return of Xander Cage","Resident Evil: The Final Chapter","Duckweed","Despicable Me 3","The Taking of Tiger Mountain","Buddies in India","Logan","Spider-Man: Homecoming","Wu Kong","Guardians of the Galaxy Vol. 2","Some Like It Hot","The Mummy",]
b=[56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,6.86,6.58,6.23]
# Set the figure size
plt.figure(figsize=(20,15),dpi=80)
# Draw the bar chart
plt.bar(range(len(a)),b,width=0.3)
# Use the film titles as the x-axis tick labels
plt.xticks(range(len(a)),a,fontproperties=my_font,rotation=90)
# Save the figure to a file
plt.savefig("./movie.png")
plt.show()
Running results:
However, the x-axis labels are hard to read this way. You can swap the horizontal and vertical axes instead:
use the barh() method to draw a horizontal bar chart; the height parameter controls how thick the bars are.
# Draw a horizontal bar chart
from matplotlib import pyplot as plt
from matplotlib import font_manager
# matplotlib.rc() returns None, so build a FontProperties object to pass as fontproperties
my_font = font_manager.FontProperties(family='Microsoft YaHei',weight='bold')
a = ["Wolf Warrior 2","The Fate of the Furious","Kung Fu Yoga","Journey to the West: The Demons Strike Back","Transformers: The Last Knight","Dangal","Pirates of the Caribbean: Dead Men Tell No Tales","Kong: Skull Island","xXx: Return of Xander Cage","Resident Evil: The Final Chapter","Duckweed","Despicable Me 3","The Taking of Tiger Mountain","Buddies in India","Logan","Spider-Man: Homecoming","Wu Kong","Guardians of the Galaxy Vol. 2","Some Like It Hot","The Mummy",]
b=[56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,6.86,6.58,6.23]
# Set the figure size
plt.figure(figsize=(20,8),dpi=80)
# Draw the horizontal bar chart
plt.barh(range(len(a)),b,height=0.3,color="orange")
# Use the film titles as the y-axis tick labels
plt.yticks(range(len(a)),a,fontproperties=my_font)
plt.grid(alpha=0.3)
plt.show()
Running results:
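One detail of horizontal bar charts: by default the y-axis runs upward, so the first item in the list ends up at the bottom of the chart. If the largest value should appear at the top, the data can be sorted in ascending order before calling barh(). The sketch below shows only the sorting step. The helper order_for_barh and the film names are hypothetical, and the three values are the first three entries of list b.

```python
def order_for_barh(names, values):
    """Sort names and values together so the largest value comes last.

    With the default upward y-axis, the last bar is drawn at the top.
    """
    pairs = sorted(zip(values, names))          # ascending by value
    sorted_names = [name for _, name in pairs]
    sorted_values = [value for value, _ in pairs]
    return sorted_names, sorted_values

names, values = order_for_barh(["film A", "film B", "film C"],
                               [56.01, 26.94, 17.53])
print(names)
print(values)
```

The returned lists can then be passed to plt.barh() and plt.yticks() in place of a and b.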
[Exercise] Suppose you know the box office of the films in list a on three days: 2017-09-14 (b_14), 2017-09-15 (b_15) and 2017-09-16 (b_16). To show each film's own box office as well as how it compares with the other films, how can you present the data more intuitively?
a = ["War for the Planet of the Apes","Dunkirk","Spider-Man: Homecoming","Wolf Warrior 2"]
b_16 = [15746,312,4497,319]
b_15 = [12357,156,2045,168]
b_14 = [2358,399,2358,362]
from matplotlib import pyplot as plt
from matplotlib import font_manager
# matplotlib.rc() returns None, so build a FontProperties object to pass as fontproperties/prop
my_font = font_manager.FontProperties(family='Microsoft YaHei',weight='bold')
a = ["War for the Planet of the Apes","Dunkirk","Spider-Man: Homecoming","Wolf Warrior 2"]
b_16 = [15746,312,4497,319]
b_15 = [12357,156,2045,168]
b_14 = [2358,399,2358,362]
bar_width = 0.2
x_14 = list(range(len(a)))
x_15 = [i+bar_width for i in x_14]
x_16 = [i+bar_width*2 for i in x_14]
# Set the figure size
plt.figure(figsize=(20,8),dpi=80)
# Draw one set of bars per day, each shifted right by one bar width
plt.bar(x_14,b_14,width=bar_width,label="September 14")
plt.bar(x_15,b_15,width=bar_width,label="September 15")
plt.bar(x_16,b_16,width=bar_width,label="September 16")
# Add the legend
plt.legend(prop=my_font)
# Put the x-axis ticks under the middle bar of each group
plt.xticks(x_15,a,fontproperties=my_font)
plt.show()
Running results:
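The offsets x_14, x_15 and x_16 above follow one pattern: series number s is shifted right by s bar widths, and the tick goes in the middle of the group. A minimal sketch of that pattern for any number of series follows. The helper grouped_positions is hypothetical and not part of the original code.

```python
def grouped_positions(n_groups, n_series, bar_width):
    """Return the x positions of every series plus the centred tick positions."""
    positions = [
        [group + series * bar_width for group in range(n_groups)]
        for series in range(n_series)
    ]
    # Centre of a group = midpoint between its first and last bar
    ticks = [group + bar_width * (n_series - 1) / 2 for group in range(n_groups)]
    return positions, ticks

positions, ticks = grouped_positions(n_groups=4, n_series=3, bar_width=0.2)
print(positions[0])   # plays the role of x_14
print(ticks)          # with three series this coincides with x_15
```

For the bars of neighbouring groups not to overlap, n_series * bar_width should stay below 1, which holds here (3 * 0.2 = 0.6).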
More application scenarios of bar charts
- Counting quantities
- Counting frequencies (for example, market saturation)
Draw a histogram
Use the hist() method to draw a histogram.
Suppose you have the running times of 250 films (in list a) and want to find out how these running times are distributed (for example, the number and frequency of films that run 100 to 120 minutes). How would you present the data?
a=[131, 98, 125, 131, 124, 139, 131, 117, 128, 108, 135, 138, 131, 102, 107, 114, 119, 128, 121, 142, 127, 130, 124, 101, 110, 116, 117, 110, 128, 128, 115, 99, 136, 126, 134, 95, 138, 117, 111,78, 132, 124, 113, 150, 110, 117, 86, 95, 144, 105, 126, 130,126, 130, 126, 116, 123, 106, 112, 138, 123, 86, 101, 99, 136,123, 117, 119, 105, 137, 123, 128, 125, 104, 109, 134, 125, 127,105, 120, 107, 129, 116, 108, 132, 103, 136, 118, 102, 120, 114,105, 115, 132, 145, 119, 121, 112, 139, 125, 138, 109, 132, 134,156, 106, 117, 127, 144, 139, 139, 119, 140, 83, 110, 102,123,107, 143, 115, 136, 118, 139, 123, 112, 118, 125, 109, 119, 133,112, 114, 122, 109, 106, 123, 116, 131, 127, 115, 118, 112, 135,115, 146, 137, 116, 103, 144, 83, 123, 111, 110, 111, 100, 154,136, 100, 118, 119, 133, 134, 106, 129, 126, 110, 111, 109, 141,120, 117, 106, 149, 122, 122, 110, 118, 127, 121, 114, 125, 126,114, 140, 103, 130, 141, 117, 106, 114, 121, 114, 133, 137, 92,121, 112, 146, 97, 137, 105, 98, 117, 112, 81, 97, 139, 113,134, 106, 144, 110, 137, 137, 111, 104, 117, 100, 111, 101, 110,105, 129, 137, 112, 120, 113, 133, 112, 83, 94, 146, 133, 101,131, 116, 111, 84, 137, 115, 122, 106, 144, 109, 123, 116, 111,111, 133, 150]
from matplotlib import pyplot as plt
a=[131, 98, 125, 131, 124, 139, 131, 117, 128, 108, 135, 138, 131, 102, 107, 114, 119, 128, 121, 142, 127, 130, 124, 101, 110, 116, 117, 110, 128, 128, 115, 99, 136, 126, 134, 95, 138, 117, 111,78, 132, 124, 113, 150, 110, 117, 86, 95, 144, 105, 126, 130,126, 130, 126, 116, 123, 106, 112, 138, 123, 86, 101, 99, 136,123, 117, 119, 105, 137, 123, 128, 125, 104, 109, 134, 125, 127,105, 120, 107, 129, 116, 108, 132, 103, 136, 118, 102, 120, 114,105, 115, 132, 145, 119, 121, 112, 139, 125, 138, 109, 132, 134,156, 106, 117, 127, 144, 139, 139, 119, 140, 83, 110, 102,123,107, 143, 115, 136, 118, 139, 123, 112, 118, 125, 109, 119, 133,112, 114, 122, 109, 106, 123, 116, 131, 127, 115, 118, 112, 135,115, 146, 137, 116, 103, 144, 83, 123, 111, 110, 111, 100, 154,136, 100, 118, 119, 133, 134, 106, 129, 126, 110, 111, 109, 141,120, 117, 106, 149, 122, 122, 110, 118, 127, 121, 114, 125, 126,114, 140, 103, 130, 141, 117, 106, 114, 121, 114, 133, 137, 92,121, 112, 146, 97, 137, 105, 98, 117, 112, 81, 97, 139, 113,134, 106, 144, 110, 137, 137, 111, 104, 117, 100, 111, 101, 110,105, 129, 137, 112, 120, 113, 133, 112, 83, 94, 146, 133, 101,131, 116, 111, 84, 137, 115, 122, 106, 144, 109, 123, 116, 111,111, 133, 150]
# Compute the number of bins
d = 3 # bin width
num_bins = (max(a)-min(a))//d
print(max(a),min(a),max(a)-min(a))
print(num_bins)
# Set the figure size
plt.figure(figsize=(20,8),dpi=80)
# density=True normalises the bars so that their total area is 1
plt.hist(a,num_bins,density=True,stacked=True)
# Set the x-axis ticks
plt.xticks(range(min(a),max(a)+d,d))
plt.grid()
plt.show()
Running results:
The tutorial passes the normed argument, but the program then fails with: AttributeError: 'Rectangle' object has no property 'normed'
A search on Baidu showed that the library has since been updated and this argument is no longer available. Replacing normed with density in the code and adding stacked=True made it run successfully. (stacked only matters when several datasets are plotted together; for a single dataset, density=True on its own should be enough.)
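To make clear what density=True does, here is a pure-Python sketch of the same calculation: the range is split into equal-width bins, the values in each bin are counted, and each count is divided by (number of values * bin width) so that the total area of the bars is 1. The helper density_histogram and the sample data are illustrative, and the binning only approximates what hist() does internally.

```python
def density_histogram(data, bin_width):
    """Return (counts, densities, actual_width) for equal-width bins."""
    low, high = min(data), max(data)
    num_bins = (high - low) // bin_width
    width = (high - low) / num_bins      # actual width after splitting evenly
    counts = [0] * num_bins
    for value in data:
        index = int((value - low) / width)
        index = min(index, num_bins - 1) # the maximum falls into the last bin
        counts[index] += 1
    densities = [count / (len(data) * width) for count in counts]
    return counts, densities, width

sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 7]
counts, densities, width = density_histogram(sample, bin_width=2)
print(counts)
print(sum(d * width for d in densities))
```

With the film data above, max(a) - min(a) is 78 and d is 3, so the range divides evenly into 26 bins and the bin edges line up with the x-axis ticks.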
More application scenarios of histograms
- The age distribution of users
- The distribution of user clicks over a period of time
- The distribution of user active time
Summary of the matplotlib workflow
- Clarify the problem
- Choose the type of chart
- Prepare the data
- Draw the chart and refine it