How to use the ARIMA model for prediction?
2022-06-25 12:06:00 【Halosec_ Wei】
1、 Purpose
ARIMA stands for autoregressive integrated moving average model. It is one of the most common statistical models for time series prediction.
2、 Input / output description
Input: one quantitative variable of time series data.
Output: forecast values for the next N periods (for example, days).
3、 Learning website
SPSSPRO: a free, professional online data analysis platform
4、 Case example
Case study: based on the sales volume of a magazine from 1985 to 2021, forecast its sales volume for the next five years.
5、 Case data

ARIMA case data
6、 Case operation

Step1: Create a new analysis;
Step2: Upload the data;
Step3: Select the corresponding data to open and preview it, then click Start Analysis after confirming;

Step4: Select 【Time series analysis (ARIMA)】;
Step5: Check the required data format. 【Time series analysis (ARIMA)】 requires one quantitative variable of time series data as input.
Step6: Select the number of periods to forecast ahead.
Step7: Click 【Start analysis】 to complete the operation.
7、 Output result analysis
Output result 1: ADF test table

*p<0.05,**p<0.01,***p<0.001
Intelligent analysis: the test results for this series, based on the field Annual Sales, show the following.
At differencing order 0, the significance P value is 0.998, which is not significant. The null hypothesis (that the series has a unit root) cannot be rejected, so the series is a non-stationary time series.
At differencing order 1, the significance P value is 0.023*, which is significant. The null hypothesis is rejected, so the series is a stationary time series.
At differencing order 2, the significance P value is 0.000***, which is significant. The null hypothesis is rejected, so the series is a stationary time series.
(Note: in theory, enough differencing operations can fully extract the non-stationary deterministic information in the original time series. However, a higher differencing order is not always better. Differencing is a process of extracting and processing information, and every differencing pass loses some information, so the order should be chosen appropriately to avoid over-differencing.)
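The information loss mentioned in the note can be seen directly in a minimal sketch of differencing. The `difference` helper and the `sales` values below are illustrative assumptions, not the case data or SPSSPRO's implementation.

```python
def difference(series, order=1):
    """Apply first differencing `order` times; each pass drops one observation."""
    values = list(series)
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values


# Hypothetical yearly values with a quadratic trend (not the magazine data).
sales = [1, 4, 9, 16, 25]
first = difference(sales, 1)    # [3, 5, 7, 9]: still trending upward
second = difference(sales, 2)   # [2, 2, 2]: constant, but two observations shorter
print(first, second)
```

Each pass shortens the series by one point, which is one concrete reason to stop at the lowest order that passes the stationarity test.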
Output result 2: Plot of the optimally differenced series

Chart description: because the P value of the unit root test on the first-order differenced series is less than 0.05, the first-order differenced series is stationary. The figure above shows the raw data after first-order differencing.
Output result 3: Autocorrelation plot (ACF) of the final differenced data

Chart description: in the autocorrelation plot, the lag-1 autocorrelation coefficient is clearly outside the 2-standard-deviation band, while all coefficients after lag 1 lie within the band. The autocorrelation function can therefore be judged to cut off after lag 1.
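As a rough sketch of what the autocorrelation plot displays, the sample autocorrelations and the approximate 2-standard-deviation band can be computed as follows. The function names and the `diffed` values are assumptions for illustration, and the band uses the common white-noise approximation 2/sqrt(n), which may differ slightly from what SPSSPRO draws.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelations r_0..r_max_lag (each divided by the lag-0 sum of squares)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


def significance_bound(n):
    """Approximate 2-standard-deviation band for a white-noise series of length n."""
    return 2.0 / math.sqrt(n)


# Hypothetical differenced data (not the case data).
diffed = [3.0, 5.0, 4.0, 6.0, 5.0, 7.0, 6.0, 8.0]
r = acf(diffed, max_lag=3)
bound = significance_bound(len(diffed))
significant_lags = [k for k in range(1, len(r)) if abs(r[k]) > bound]
print(r, bound, significant_lags)
```

Lags listed in `significant_lags` are the ones that would stick out of the dotted band in the plot.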
Output result 4: Partial autocorrelation plot (PACF) of the final differenced data

Chart description: in the partial autocorrelation plot, the lag-1 partial autocorrelation coefficient is clearly outside the 2-standard-deviation band, while all partial autocorrelation coefficients after lag 1 lie within the band. The partial autocorrelation function can therefore be judged to cut off after lag 1.
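Partial autocorrelations can be derived from the autocorrelations with the Durbin-Levinson recursion. The sketch below assumes the autocorrelations are already available as a list `r` with `r[0] = 1`; the function name is hypothetical, and this is not necessarily the estimator SPSSPRO uses.

```python
def pacf_from_acf(r):
    """Partial autocorrelations from autocorrelations r[0..m] via Durbin-Levinson.

    Entry k of the result is the lag-k partial autocorrelation (entry 0 is 1.0).
    """
    m = len(r) - 1
    pacf = [1.0]
    prev = []  # coefficients phi_{k-1,1} .. phi_{k-1,k-1} from the previous step
    for k in range(1, m + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new last one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Autocorrelations of a theoretical AR(1) process with coefficient 0.5.
ar1_acf = [1.0, 0.5, 0.25, 0.125]
print(pacf_from_acf(ar1_acf))
```

For the AR(1) input, only the lag-1 partial autocorrelation is non-zero, which is the cut-off pattern that the chart description looks for.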
Output result 5: Model parameter table

*p<0.05,**p<0.01,***p<0.001
Chart description: choosing the ARIMA parameters from the autocorrelation and partial autocorrelation plots involves subjective judgment, so SPSSPRO searches for the optimal parameters automatically using the AIC information criterion. The resulting model is ARIMA(0,1,1), and the table above is its test table for the field Annual Sales. According to the Q statistics, Q6 is not significant, so the hypothesis that the model residuals are a white noise series cannot be rejected. The goodness of fit R² of the model is 0.981, so the model performs well and basically meets the requirements. (Note: in general, checking the Q statistics at lags 6 and 12 (Q6 and Q12) is enough to decide whether the residuals are a random series. This is because stationary series usually show only short-term correlation: if there is no significant correlation at short lags, there is usually no significant correlation at longer lags either.)
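The Q statistic referred to here is commonly the Ljung-Box statistic, Q = n(n+2) Σ r_k²/(n-k) summed over lags 1 to h. The sketch below assumes that definition (the source does not say which variant SPSSPRO reports), and the residual values are made up for illustration.

```python
def ljung_box_q(residuals, lags):
    """Ljung-Box statistic: Q = n(n+2) * sum_{k=1..lags} r_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, lags + 1):
        r_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


# Hypothetical residuals that alternate in sign, i.e. clearly not white noise.
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(ljung_box_q(residuals, lags=1))
```

Q is compared with a chi-square critical value whose degrees of freedom are the number of lags minus the number of estimated ARMA coefficients. For an ARIMA(0,1,1) fit, Q6 would use 6 - 1 = 5 degrees of freedom, where the 5% critical value is about 11.07. A Q below that value means white-noise residuals cannot be rejected.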
Output result 6: Autocorrelation plot (ACF) of the model residuals

Chart description: the figure above shows the autocorrelation plot (ACF) of the model residuals. If all correlation coefficients lie within the dotted lines (2 standard deviations), the residuals can be regarded as a white noise series, which is what a time series model requires of its residuals. Here, the autocorrelation coefficients of the residuals all lie within the dotted lines.
Output result 7: Partial autocorrelation plot (PACF) of the model residuals

Chart description: the figure above shows the partial autocorrelation plot (PACF) of the model residuals. If all correlation coefficients lie within the dotted lines, the residuals can be regarded as a white noise series, which is what a time series model requires of its residuals. Here, most of the partial autocorrelation coefficients of the residuals lie within the dotted lines. The coefficients at lags 9 and 14 exceed 2 standard deviations, but this may be caused by chance.
Output result 8: Model test table

*p<0.05,**p<0.01,***p<0.001
Chart description: for the field Annual Sales, SPSSPRO searches for the optimal parameters automatically using the AIC information criterion. The resulting model is ARIMA(0,1,1), and the test table is based on the first-order differenced data. The model formula is as follows: y(t)=4.996+0.671*ε(t-1)
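To make the formula concrete, here is a minimal sketch of how multi-step forecasts could be produced from it. It assumes that y(t) in the formula is the first-differenced series, that future disturbances are replaced by their expected value of zero, and that `last_value` and `last_residual` are hypothetical inputs (the case data is not reproduced here).

```python
def forecast_arima011(last_value, last_residual, steps, const=4.996, theta=0.671):
    """Forecast levels from an MA(1) model fitted to the first-differenced series."""
    forecasts = []
    level = last_value
    for h in range(1, steps + 1):
        # Only the first step can use the last observed residual;
        # later disturbances are unknown and have expected value 0.
        predicted_diff = const + (theta * last_residual if h == 1 else 0.0)
        level += predicted_diff  # undo the differencing by accumulating
        forecasts.append(level)
    return forecasts


# Hypothetical last observation and last residual.
print(forecast_arima011(last_value=100.0, last_residual=1.0, steps=5))
```

Under these assumptions the forecast beyond the first step rises by the constant 4.996 per period, so the predicted path is a straight line after the first step.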
Output result 9: Time series plot

Chart description: the figure above shows the original data, the model's fitted values, and the model's predictions. The fitted series follows the trend of the actual series closely, which indicates a good fit.
Output result 10: Time series forecast table
Chart description: the table shows the model's forecasts for the next 5 periods.

8、 Notes
- The (partial) autocorrelation coefficients are usually regarded as cutting off at lag d when:
  - the first d coefficients are clearly outside the 2-standard-deviation band;
  - after that, about 95% of the (partial) autocorrelation coefficients fall within the 2-standard-deviation band;
  - and the drop from non-zero coefficients to small fluctuations around zero is very abrupt.
- The (partial) autocorrelation coefficients are usually regarded as tailing off when:
  - more than 5% of the sample (partial) autocorrelation coefficients fall outside the 2-standard-deviation band;
  - or the decay from significantly non-zero coefficients to small fluctuations is slow or very gradual.
- After analyzing the autocorrelation and partial autocorrelation plots, an ARMA model can be established:
  - If the partial autocorrelation (PACF) plot cuts off at lag p and the autocorrelation (ACF) plot tails off, the ARMA model can be simplified to an AR(p) model;
  - If the autocorrelation (ACF) plot cuts off at lag q and the partial autocorrelation (PACF) plot tails off, the ARMA model can be simplified to an MA(q) model;
  - If both the autocorrelation and the partial autocorrelation tail off, the lowest of the most significant lags in the PACF and ACF plots can be taken as the p and q values;
  - If both the autocorrelation and the partial autocorrelation cut off, consider a higher differencing order, or the series may not be suitable for an ARMA model.
- By default, SPSSPRO uses the AIC criterion to select the orders q and p, and uses the ADF test together with differencing analysis to select the optimal differencing order d.
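The identification rules in these notes amount to a small decision table. The sketch below encodes them under the assumption that the cut-off lag of each plot has already been judged by eye (or is `None` when the plot tails off). The function name and return strings are hypothetical.

```python
def suggest_model(acf_cutoff, pacf_cutoff):
    """Map ACF/PACF behaviour to a candidate model.

    Each argument is the lag at which that plot cuts off,
    or None if the plot tails off instead.
    """
    if pacf_cutoff is not None and acf_cutoff is None:
        return f"AR({pacf_cutoff})"
    if acf_cutoff is not None and pacf_cutoff is None:
        return f"MA({acf_cutoff})"
    if acf_cutoff is None and pacf_cutoff is None:
        return "ARMA(p, q): take p and q from the most significant low lags"
    return "both cut off: try another differencing order, or ARMA may not be suitable"


print(suggest_model(acf_cutoff=None, pacf_cutoff=2))   # AR(2)
print(suggest_model(acf_cutoff=1, pacf_cutoff=None))   # MA(1)
```

Because these judgments are subjective, an automatic criterion such as AIC is a useful cross-check on whatever the table suggests.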
9、 Model theory
The ARIMA model is widely used to analyze and model all kinds of time series data. It rests on the following idea: the time series to be predicted is generated by a random process. If the random process that generates the series does not change over time, its structure can be accurately characterized and described, and future values of the series can be extrapolated from its past observations. In an ARIMA model, the future value of the series is expressed as a linear function of its own lagged values and of the current and lagged random disturbance terms. The general form of the model is shown in the following formula:
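y(t) = c + φ1*y(t-1) + ... + φp*y(t-p) + ε(t) + θ1*ε(t-1) + ... + θq*ε(t-q)
where y(t) is the series after differencing, c is a constant, φ1 to φp are the autoregressive coefficients, θ1 to θq are the moving average coefficients, and ε(t) is the random disturbance term.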
The modeling process of the ARIMA model can be divided into the following four steps:
step 1 Test the time series for stationarity. The ADF or PP test is usually used to perform a unit root test on the original series. If the series does not satisfy the stationarity condition, it can be transformed by differencing or log-differencing to turn the non-stationary series into a stationary one, and the ARIMA model is then built on the stationary series;
step 2 Determine the order of the model. Using statistics that describe the characteristics of the series, such as the autocorrelation (AC) and partial autocorrelation (PAC) coefficients, preliminarily identify the possible forms of the model, and then select the best model from the candidates according to an order-selection criterion such as AIC;
step 3 Estimate the parameters and run diagnostic tests. This includes testing the significance of the model parameters, the validity of the model itself, and whether the residual series is a white noise series. If the model passes the tests, the model specification is basically correct. Otherwise, the form of the model must be re-specified and the diagnostic tests repeated until a correct model form is obtained;
step 4 Use the established ARIMA model to make predictions.
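Steps 2 and 3 can be illustrated with a small, self-contained sketch for the MA(1) case. It estimates the moving average coefficient by a grid search that minimizes the conditional sum of squares (CSS) and then compares models with a least-squares form of AIC. The simulated data, the grid search, and the use of the sample mean as the constant are simplifying assumptions for illustration and do not describe SPSSPRO's estimation procedure.

```python
import math
import random


def css_residuals(x, mu, theta):
    """Residuals of x(t) = mu + e(t) + theta*e(t-1), starting from e(0) = 0."""
    resid, prev = [], 0.0
    for value in x:
        e = value - mu - theta * prev
        resid.append(e)
        prev = e
    return resid


def fit_ma1_css(x, grid_step=0.01):
    """Grid-search theta in (-1, 1) to minimise the conditional sum of squares."""
    mu = sum(x) / len(x)  # simplification: the constant is set to the sample mean
    best_theta, best_sse = 0.0, float("inf")
    for i in range(1, int(round(2 / grid_step))):
        theta = -1.0 + i * grid_step
        sse = sum(e * e for e in css_residuals(x, mu, theta))
        if sse < best_sse:
            best_theta, best_sse = theta, sse
    return mu, best_theta, best_sse


def aic_from_sse(sse, n, k):
    """Gaussian AIC up to an additive constant: n*ln(sse/n) + 2k."""
    return n * math.log(sse / n) + 2 * k


# Simulated stationary data from an MA(1) process (constant 5.0, theta 0.6).
rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(2001)]
x = [5.0 + shocks[t] + 0.6 * shocks[t - 1] for t in range(1, 2001)]

mu, theta, sse = fit_ma1_css(x)
sse_white_noise = sum((v - mu) ** 2 for v in x)
aic_ma1 = aic_from_sse(sse, len(x), k=2)                   # constant + theta
aic_white_noise = aic_from_sse(sse_white_noise, len(x), k=1)  # constant only
print(theta, aic_ma1, aic_white_noise)
```

The model with the lower AIC is preferred. The residuals returned by `css_residuals` at the fitted values are what the diagnostic tests in step 3 would then examine.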