
Day 45. Data Analysis Practice (1): Supermarket Operation Data Analysis

2022-06-13 04:08:00 Little cute of Jingjing family




Preface

This article analyzes a supermarket's operating data to build an intuitive picture of its recent business, and to look for insights, such as sales tactics and customer management, that could improve or optimize the current way of operating. It turned out to be quite an interesting exercise.


1. Reading the data

We keep the data in a CSV file and read it with pandas (everything here runs in a Jupyter notebook).
The data is fairly clean, so no cleaning or similar steps follow the read; that does not mean cleaning is unimportant.
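Even when the data looks clean, a quick check costs nothing. A minimal sketch, using a tiny made-up frame in place of the real CSV (the column names here are assumptions based on this article, not the original file):

```python
import pandas as pd

# Tiny stand-in frame; the real column names are assumptions from this article
sample = pd.DataFrame({
    "Order ID": [1, 2, 2, 3],
    "sales": [2, 1, 1, 5],
})

# Missing values per column, and the count of fully duplicated rows
null_counts = sample.isnull().sum()
dup_count = sample.duplicated().sum()
print(null_counts.sum(), dup_count)
```

If either number is nonzero, it is worth deciding explicitly whether to drop, fill, or keep those rows before any grouping.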

# Import libraries
import pandas as pd

# Read the data (GBK-encoded CSV; parse the checkout time as datetime)
data = pd.read_csv("Supermarket operation data.csv", encoding="gbk", parse_dates=["Closing time"])
# Look at the first five rows
data.head()

(Figure: the first five rows of data)

2. Which product categories are popular

First group the data by category ID, sum the sales within each group, then use reset_index to flatten the result.

# Group by category and sum the sales
data_group_xl = data.groupby("Category ID")["sales"].sum().reset_index()

# To get the ten best-selling categories, sort data_group_xl by sales
data_group_xl = data_group_xl.sort_values(by="sales", ascending=False).head(10)
# Look at the grouped data
data_group_xl

(Figure: top ten categories by sales)

We can see large differences in sales between categories; these figures can feed into purchasing forecasts and similar decisions.
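One simple way to turn these totals into a purchasing forecast is to split a planned restock across categories in proportion to recent sales. A rough sketch with made-up numbers (the 1000-unit budget and the category IDs are assumptions, not figures from the data):

```python
import pandas as pd

# Hypothetical category totals (stand-in for data_group_xl)
top_categories = pd.DataFrame({
    "Category ID": [101, 102, 103],
    "sales": [600, 300, 100],
})

# Split a planned restock of 1000 units across categories,
# proportionally to their recent sales
budget = 1000
share = top_categories["sales"] / top_categories["sales"].sum()
top_categories["restock"] = (share * budget).round().astype(int)
print(top_categories)
```

A real forecast would also weigh seasonality, shelf life, and margins, but proportional allocation is a reasonable baseline.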

3. Which products sell well

The analysis logic is the same as for categories above.

data_group = data.groupby("goods ID")["sales"].sum().reset_index()
# Sort before taking the head, so these really are the ten best sellers
data_group = data_group.sort_values(by="sales", ascending=False)
data_group.head(10)

(Figure: top ten products by sales)

The difference is that the previous section aggregates by category while this one aggregates by individual product; the two views answer slightly different questions, with different priorities.

4. Share of turnover by store

First compute each row's turnover, then add it to the data as a new column.

data["turnover"] = data["sales"] * data["unit price"]

# Group by store, sum the turnover within each group, and reset the index
data_group_xse = data.groupby("Store number")["turnover"].sum().reset_index()
data_group_xse

(Figure: turnover by store)
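Before charting, the shares can also be computed directly in pandas. A sketch with made-up store numbers and turnover figures:

```python
import pandas as pd

# Hypothetical per-store turnover (stand-in for the grouped result)
store_turnover = pd.DataFrame({
    "Store number": ["S1", "S2", "S3"],
    "turnover": [5000.0, 3000.0, 2000.0],
})

# Each store's percentage share of total turnover
total = store_turnover["turnover"].sum()
store_turnover["share %"] = (store_turnover["turnover"] / total * 100).round(1)
print(store_turnover)
```

Having the exact percentages in a table complements the pie chart, which is better at conveying the overall picture than precise values.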

Then use a pie chart to visualize each store's share of turnover.

from pyecharts import options as opts
from pyecharts.charts import Pie

# Note: pyecharts expects plain Python lists, not pandas Series
x = list(data_group_xse["Store number"])
y = list(data_group_xse["turnover"])
pie = (
    Pie()
    .add(
        "",
        [(i, j) for i, j in zip(x, y)],
        radius=["30%", "75%"],
        center=["50%", "50%"],
        rosetype="radius",
        label_opts=opts.LabelOpts(is_show=False),
    )
    .set_global_opts(title_opts=opts.TitleOpts(title="Share of turnover by store"))
    .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {d}%"))
)
pie.render_notebook()

(Figure: share of turnover by store)

Note: it is better to render the chart to a standalone HTML file, or view it right here in Jupyter, where it stays interactive (in the original HTML, clicking the three small rectangles switches the view). A static screenshot loses that interactivity.

5. Peak hours for customer traffic

Understanding when customer traffic peaks helps the supermarket pick the best time slots for sales promotions.

# First extract the hour from the checkout time (hour precision is enough)
data["Hours"] = data["Closing time"].dt.hour

# Deduplicate (hour, order) pairs so each order is counted once
traffic = data[["Hours", "Order ID"]].drop_duplicates()

# Then count the number of orders in each hour
traffic_count = traffic.groupby("Hours")["Order ID"].count()
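Why deduplicate first? A single receipt usually spans several rows, one per item, so counting raw rows would overstate traffic. A tiny sketch with made-up orders shows the difference:

```python
import pandas as pd

# Order A1 appears twice at 8 o'clock (two items, one receipt)
sample = pd.DataFrame({
    "Hours":    [8, 8, 8, 9],
    "Order ID": ["A1", "A1", "A2", "B1"],
})

# Deduplicate (hour, order) pairs, then count orders per hour
per_hour_orders = (
    sample[["Hours", "Order ID"]]
    .drop_duplicates()
    .groupby("Hours")["Order ID"]
    .count()
)
print(per_hour_orders)  # hour 8 -> 2 orders (not 3 rows), hour 9 -> 1
```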

# Finally, plot the hourly order counts as a line chart
import pyecharts.options as opts
from pyecharts.charts import Line

# Take the x-axis from the grouped index rather than hardcoding opening hours
x = [str(i) for i in traffic_count.index]
y = list(traffic_count)
line = (
    Line()
    .add_xaxis(xaxis_data=x)
    .add_yaxis(series_name="orders", y_axis=y, is_smooth=True)
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Order volume by hour of day"),
        yaxis_opts=opts.AxisOpts(
            axistick_opts=opts.AxisTickOpts(is_show=True),
            splitline_opts=opts.SplitLineOpts(is_show=True),
        ),
    )
)
line.render_notebook()

(Figure: order volume by hour of day)

  1. The chart shows that 8:00 to 10:00 is the sales peak of the day, with a smaller peak from 17:00 to 19:00. This is plausible: the two windows correspond to people buying breakfast or morning groceries, and shopping for dinner after work.
  2. Discounts and other traffic-driving promotions could be scheduled around these time patterns.
  3. It is also worth analyzing each hour separately: compare the best sellers across hours to find time-specific sales patterns, find each product's peak hour within the day and draw a bar chart for a more intuitive view, or group by product and draw a scatter plot of its distribution over the day. These are all reasonable ideas, and good practice with the plotting APIs.
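The first of those ideas can be sketched quickly: sum each product's sales within each hour, then keep the best seller per hour. A minimal example with made-up transactions (the product names and the data are assumptions, not from the real CSV):

```python
import pandas as pd

# Hypothetical transactions: hour, product, units sold
sample = pd.DataFrame({
    "Hours":    [8, 8, 8, 9, 9],
    "goods ID": ["milk", "bread", "milk", "eggs", "milk"],
    "sales":    [3, 1, 2, 4, 1],
})

# Total units per (hour, product), then the top seller within each hour
hourly = sample.groupby(["Hours", "goods ID"])["sales"].sum()
best_per_hour = hourly.groupby(level="Hours").idxmax().map(lambda t: t[1])
print(best_per_hour)  # 8 -> milk, 9 -> eggs
```

Running the same logic on the real data would reveal whether, say, breakfast items dominate the morning peak while fresh produce dominates the evening one.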

Summary

This was a brief walkthrough of a small supermarket operations analysis, putting what was learned into practice. If you have good ideas, or complaints, feel free to share them.


Copyright notice
This article was written by [Little cute of Jingjing family]; please include the original link when reposting. Thanks.
https://yzsam.com/2022/02/202202280525229618.html