Day45. data analysis practice (1): supermarket operation data analysis
2022-06-13 04:08:00 【Little cute of Jingjing family】
Table of contents
- Day45. Data analysis practice (1): Supermarket operation data analysis
- Preface
- One. Reading data
- Two. See which categories of goods are popular
- Three. Which goods are selling well
- Four. Proportion of sales in different stores
- Five. Peak period of supermarket passenger flow
- Summary
Preface
This article analyzes a supermarket's operation data to get an intuitive picture of how the supermarket has been doing recently, and to see whether the data yields anything useful for improving or optimizing the current way of operating, including sales methods and customer management, so that the supermarket can improve its business. It turns out to be quite interesting.
One. Reading data
We put the data in a table and then read it with pandas (everything here is run in jupyter notebook).
Because the data is already fairly clean, no cleaning or other preprocessing was done after reading, but that does not mean cleaning is unimportant.
# Import library
import pandas as pd
# Read the data, parsing the "Closing time" column as datetimes
data = pd.read_csv('Supermarket operation data.csv', encoding='gbk', parse_dates=["Closing time"])
# Look at the first five rows of data
data.head()
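As a small self-contained sketch of what this read step does, the snippet below parses an in-memory CSV instead of the real file. The three rows and the exact column set are made up for illustration; only `pd.read_csv` and its `parse_dates` parameter come from the code above. It also adds a quick missing-value count, which is one cheap way to check the "data is clean" assumption.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV file (made-up rows).
csv_text = """Order ID,Closing time,sales,unit price
1001,2022-06-01 08:15:00,2,3.5
1002,2022-06-01 09:40:00,1,12.0
1003,2022-06-01 17:05:00,4,1.25
"""

# parse_dates turns the text column into datetime64 values on load.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["Closing time"])

# A quick sanity check: number of missing values in each column.
missing_per_column = sample.isna().sum()
print(sample.dtypes)
print(missing_per_column)
```

If the missing-value counts were not all zero, that would be the point to add a cleaning step before any grouping.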
Two. See which categories of goods are popular
First group the data by category ID, then sum the sales volume within each group, and finally use reset_index to reset the index.
# Group by category and sum the sales volume
data_group_xl = data.groupby("Category ID")["sales"].sum().reset_index()
# To take out the ten best-selling categories, sort `data_group_xl` by sales in descending order.
data_group_xl = data_group_xl.sort_values(by="sales", ascending=False).head(10)
# Look at the grouped data
data_group_xl
Here we can see that sales volumes differ considerably between categories, so purchasing forecasts and similar planning can be based on them.
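The group, sum, sort, take-the-top pattern used here can be sketched on a toy table. The rows below are invented, and `top_n_by_sales` is a hypothetical helper name introduced only for this illustration.

```python
import pandas as pd

# Made-up rows; the column names mirror the ones used in this section.
toy = pd.DataFrame({
    "Category ID": [10, 10, 20, 30, 20, 30, 30],
    "sales": [5, 3, 2, 7, 4, 1, 6],
})


def top_n_by_sales(df, key, n):
    """Sum sales per key and return the n groups with the largest totals."""
    totals = df.groupby(key)["sales"].sum().reset_index()
    return totals.sort_values(by="sales", ascending=False).head(n)


top2 = top_n_by_sales(toy, "Category ID", 2)
print(top2)
```

Passing a different `key` reuses the same helper for any other grouping column.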
Three. Which goods are selling well
The analysis logic is the same as for the categories above.
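# Group by product and sum the sales volume
data_group = data.groupby("goods ID")["sales"].sum().reset_index()
# Sort in descending order so that head(10) returns the ten best sellers
data_group = data_group.sort_values(by="sales", ascending=False)
data_group.head(10)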
The difference is that the previous section grouped by category, while this one groups directly by individual product. The two views are similar but have a different focus.
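To make the category view and the product view easier to compare, one option is to group on both keys so that each product keeps its category label. This is only a sketch on invented rows, and it swaps in `nlargest` as a shortcut for sorting in descending order and taking the first rows.

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Category ID": [10, 10, 20, 20, 20],
    "goods ID": ["A", "B", "C", "C", "D"],
    "sales": [4, 9, 3, 5, 1],
})

# Group on both keys so each product keeps its category label.
per_product = (
    toy.groupby(["Category ID", "goods ID"])["sales"].sum().reset_index()
)

# Take the two products with the largest total sales volume.
best_products = per_product.nlargest(2, "sales")
print(best_products)
```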
Four. Proportion of sales in different stores
First calculate the sales amount (sales volume times unit price), then add it to the data as a new column
data['sales amount'] = data['sales'] * data['unit price']
# Group by store, sum the sales amount within each group and reset the index
data_group_xse = data.groupby('Store number')['sales amount'].sum().reset_index()
data_group_xse
Then use a pie chart to show each store's proportion of the sales amount
from pyecharts import options as opts
from pyecharts.charts import Pie
# The difference here is that pyecharts expects the data as plain lists.
x = list(data_group_xse['Store number'])
y = list(data_group_xse['sales amount'])
pie = (
Pie()
.add(
"",
[(i,j)for i,j in zip(x,y)],
radius=["30%", "75%"],
center=["50%", "50%"],
rosetype="radius",
label_opts=opts.LabelOpts(is_show=False),
)
.set_global_opts(title_opts=opts.TitleOpts(title="Proportion of store sales"))
.set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {d}"))
)
pie.render_notebook()
Note: it is best to render the chart to an html file yourself or to view it directly in jupyter, where more features are available. Once the chart is saved as a static picture, the interactive behavior (for example, clicking the three small rectangles in the original html to toggle the stores) is lost.
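The percentages that the pie chart displays can also be computed directly in pandas, which is handy when only a static picture is available. The sketch below uses invented stores and prices, and the `share (%)` column name is an assumption made for this example.

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Store number": ["S1", "S1", "S2", "S3"],
    "sales": [2, 1, 4, 5],
    "unit price": [10.0, 20.0, 5.0, 8.0],
})

# Sales amount per row = quantity sold * unit price.
toy["sales amount"] = toy["sales"] * toy["unit price"]

per_store = toy.groupby("Store number")["sales amount"].sum().reset_index()

# Each store's share of the total, in percent, rounded to two decimals.
per_store["share (%)"] = (
    per_store["sales amount"] / per_store["sales amount"].sum() * 100
).round(2)
print(per_store)
```

Writing the amount to a separate column also keeps the original sales volume intact, so the cell can be re-run without multiplying twice.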
Five. Peak period of supermarket passenger flow
Understanding when passenger flow peaks is useful, because it helps the supermarket choose the most suitable time for sales promotions.
# First, extract the hour from the closing time (accurate to the hour is enough).
data["Hours"] = data["Closing time"].dt.hour
# Then de-duplicate on hour and order, so that each order is counted once
traffic = data[["Hours", "Order ID"]].drop_duplicates()
# Then calculate the number of orders per hour
traffic_count = traffic.groupby("Hours")["Order ID"].count()
# Finally, use the order quantity to draw a line chart
import pyecharts.options as opts
from pyecharts.charts import Line
# Take the x-axis labels from the hours actually present so that x and y always line up
x = [str(h) for h in traffic_count.index]
y = list(traffic_count)
line=(
Line()
.add_xaxis(xaxis_data=x)
.add_yaxis(series_name="sales", y_axis=y, is_smooth=True)
.set_global_opts(
title_opts=opts.TitleOpts(title="Line chart of sales volume in different periods"),
yaxis_opts=opts.AxisOpts(
axistick_opts=opts.AxisTickOpts(is_show=True),
splitline_opts=opts.SplitLineOpts(is_show=True),
),
)
)
line.render_notebook()
- We can see from the picture above that 8:00 to 10:00 is the sales peak of the day, followed by a smaller peak from 17:00 to 19:00. This matches everyday experience: the first period is when people buy breakfast or get up early to buy groceries, and the second is when people get off work and prepare dinner.
- Based on these time patterns, the supermarket could run discounts or similar promotions to draw in customers.
- You can also analyze each time slot (one hour) separately and compare the best-selling products at different times, which should reveal how sales differ over the day. Other ideas are to find the peak sales time of each product within a day and draw a bar chart, which is more intuitive, or to split by product and draw a scatter chart of sales across the day. All of these are reasonable, and they are also good practice for plotting.
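The per-hour idea from the last bullet can be sketched as follows. The rows are invented, and `nunique` is used here as an alternative to the drop_duplicates plus count approach above; it is not the code that produced the chart.

```python
import pandas as pd

# Made-up rows: orders 1 and 3 each contain two different products.
toy = pd.DataFrame({
    "Order ID": [1, 1, 2, 3, 3, 4],
    "goods ID": ["A", "B", "A", "C", "A", "C"],
    "sales": [2, 1, 3, 5, 1, 2],
    "Closing time": pd.to_datetime([
        "2022-06-01 08:10", "2022-06-01 08:10", "2022-06-01 08:45",
        "2022-06-01 17:20", "2022-06-01 17:20", "2022-06-01 18:05",
    ]),
})
toy["Hours"] = toy["Closing time"].dt.hour

# Distinct orders per hour (an order with several items counts once).
orders_per_hour = toy.groupby("Hours")["Order ID"].nunique()

# Units sold per (hour, product), then the top product within each hour.
units = toy.groupby(["Hours", "goods ID"])["sales"].sum().reset_index()
best_per_hour = units.loc[units.groupby("Hours")["sales"].idxmax()]
print(orders_per_hour)
print(best_per_hour)
```

The `best_per_hour` table has one row per hour, so it can be passed to a bar chart in the same way as the line chart data above.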
Summary
This was a brief walk through a small supermarket operation case study, putting what has been learned and thought about into practice. If you have good ideas or criticisms, feel free to share them.