Machine learning notes - Introduction to autocorrelation and partial autocorrelation
2022-06-30 12:50:00 【Sit and watch the clouds rise】
1、 Overview
Autocorrelation and partial autocorrelation plots are widely used in time series analysis and forecasting.
These plots graphically summarize the strength of the relationship between an observation in a time series and observations at previous time steps. For beginners in time series forecasting, the difference between autocorrelation and partial autocorrelation can be confusing.
We will learn how to calculate and plot autocorrelation and partial autocorrelation plots with Python. By the end you should understand the following:
How to plot and review the autocorrelation function of a time series.
How to plot and review the partial autocorrelation function of a time series.
The difference between the autocorrelation function and the partial autocorrelation function in time series analysis.
2、 Minimum Daily Temperatures dataset
This dataset describes the minimum daily temperatures over 10 years (1981-1990) in the city of Melbourne, Australia.
The units are degrees Celsius and there are 3,650 observations. The source of the data is credited as the Australian Bureau of Meteorology.
Dataset download address
Link: https://pan.baidu.com/s/19G9YsOKtRDXNYAKdOwBq6A
Extraction code: ateb
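Now load the minimum daily temperatures and plot the time series.
from pandas import read_csv
from matplotlib import pyplot
# Parse the first column as a date index and squeeze the single data column into a Series
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
# Line plot of the daily minimum temperatures
series.plot()
pyplot.show()
Running this example loads the dataset as a Pandas Series and creates a line plot of the time series.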

3、 Correlation and autocorrelation
Correlation summarizes the strength of the relationship between two variables. If we can assume that each variable follows a Gaussian (bell curve) distribution, we can use the Pearson correlation coefficient to summarize the correlation between the variables.
The Pearson correlation coefficient is a number between -1 and 1 that describes a negative or positive correlation respectively. A value of zero indicates no correlation.
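As a minimal sketch of the coefficient described above, the function below computes Pearson's r with the standard library only, so the arithmetic stays visible. The function name `pearson` and the toy data are illustrative assumptions, not part of the original article.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances up to a common 1/n factor, which cancels out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy data (hypothetical): one pair rises together, the other moves in opposition.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(pearson(xs, [2.0, 4.0, 6.0, 8.0, 10.0]))  # 1.0
print(pearson(xs, [10.0, 8.0, 6.0, 4.0, 2.0]))  # -1.0
```

The two calls hit the extremes of the range: a perfect positive and a perfect negative linear relationship.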
We can calculate the correlation between time series observations and observations at previous time steps, called lags. Because the correlation is calculated against values of the same series at earlier times, this is called serial correlation or autocorrelation.
A plot of the autocorrelation of a time series by lag is called the AutoCorrelation Function, or ACF for short. This plot is sometimes called a correlogram or an autocorrelation plot.
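To make the idea of correlation by lag concrete, here is a small sketch of a sample ACF written with the standard library only. It uses the common estimator that divides every lag's sum of cross-products by the lag-0 sum of squares; whether a given library uses exactly this estimator by default is an assumption to check. The name `sample_acf` and the synthetic 12-step seasonal series are illustrative, not from the article's dataset.

```python
from math import pi, sin


def sample_acf(values, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag using the overall mean and variance."""
    n = len(values)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must be between 0 and len(values) - 1")
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    denom = sum(d * d for d in deviations)  # lag-0 sum of squares
    acf = []
    for k in range(max_lag + 1):
        # Pair each observation with the one k steps earlier.
        numer = sum(deviations[t] * deviations[t - k] for t in range(k, n))
        acf.append(numer / denom)
    return acf


# Synthetic series (hypothetical): a pure cycle that repeats every 12 steps.
values = [sin(2 * pi * t / 12) for t in range(120)]
acf = sample_acf(values, 12)
print(acf[0], round(acf[6], 3), round(acf[12], 3))
```

For this toy cycle the correlation is strongly negative at lag 6 (half a cycle) and strongly positive at lag 12 (a full cycle), which is the kind of pattern an ACF plot makes visible.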
The following example uses the plot_acf() function from the statsmodels library to calculate and plot the autocorrelation of the minimum daily temperatures.
from pandas import read_csv
from matplotlib import pyplot
from statsmodels.graphics.tsaplots import plot_acf
# Load the single data column as a Series indexed by date
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
# Autocorrelation plot with the default lags and 95% confidence band
plot_acf(series)
pyplot.show()
Running the example creates a 2D plot showing the lag value along the x-axis and the correlation, between -1 and 1, on the y-axis.
Confidence intervals are drawn as a cone. By default, this is set to a 95% confidence interval, suggesting that correlation values outside of this cone are very likely a real correlation and not a statistical fluke.
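A rough way to reason about the width of that band: for a series with no autocorrelation, a common rule of thumb puts the 95% bound at about 1.96 / sqrt(n). Bartlett's formula widens the bound at higher lags according to the correlations at lower lags, which would produce a cone shape. The sketch below shows both; that statsmodels draws its cone exactly this way is an assumption here, and the function names and sample ACF values are hypothetical.

```python
from math import sqrt


def white_noise_bound(n_obs, z=1.96):
    """Approximate 95% bound on a sample autocorrelation for an uncorrelated series."""
    return z / sqrt(n_obs)


def bartlett_bound(acf, lag, n_obs, z=1.96):
    """Approximate bound at `lag` (>= 1) from Bartlett's formula, given acf[0..lag-1]."""
    variance = (1.0 + 2.0 * sum(r * r for r in acf[1:lag])) / n_obs
    return z * sqrt(variance)


# With the 3,650 observations of the temperature dataset:
print(round(white_noise_bound(3650), 4))  # 0.0324

# Hypothetical ACF values, only to show the bound growing with the lag.
toy_acf = [1.0, 0.5, 0.3]
print(bartlett_bound(toy_acf, 1, 100) < bartlett_bound(toy_acf, 3, 100))  # True
```

With thousands of observations the band is narrow, so even modest correlations fall outside it.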

Depending on the statsmodels version, plot_acf() may plot every lag by default, which makes the plot noisy. We can limit the number of lags on the x-axis to 50, for example by passing lags=50 to plot_acf(), to make the plot easier to read.

4、 Partial autocorrelation function
A partial autocorrelation is a summary of the relationship between an observation in a time series and observations at prior time steps, with the relationships of intervening observations removed.
The autocorrelation between an observation and an observation at a prior time step is made up of both the direct correlation and indirect correlations. These indirect correlations are a linear function of the correlation of the observation with observations at intervening time steps.
It is these indirect correlations that the partial autocorrelation function seeks to remove.
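One standard way to strip out the indirect correlations is the Durbin-Levinson recursion, which turns a list of autocorrelations into partial autocorrelations. The sketch below is a minimal standard-library version; statsmodels may use a different estimator by default, and the name `pacf_from_acf` and the AR(1)-style input are illustrative assumptions.

```python
def pacf_from_acf(acf):
    """Partial autocorrelations for lags 0..len(acf)-1 via the Durbin-Levinson recursion."""
    max_lag = len(acf) - 1
    pacf = [1.0]
    prev = []  # prev[j - 1] holds the order-(k-1) coefficient for lag j
    for k in range(1, max_lag + 1):
        # Remove the part of r_k already explained through lags 1..k-1.
        numer = acf[k] - sum(prev[j - 1] * acf[k - j] for j in range(1, k))
        denom = 1.0 - sum(prev[j - 1] * acf[j] for j in range(1, k))
        phi_kk = numer / denom
        # Update the lower-lag coefficients for order k, then append the new one.
        current = [prev[j - 1] - phi_kk * prev[k - j - 1] for j in range(1, k)]
        current.append(phi_kk)
        pacf.append(phi_kk)
        prev = current
    return pacf


# Theoretical ACF of a process where each value depends only on the previous one
# with coefficient 0.5 (hypothetical): r_k = 0.5 ** k.
acf = [0.5 ** k for k in range(4)]
print(acf)                 # [1.0, 0.5, 0.25, 0.125]
print(pacf_from_acf(acf))  # [1.0, 0.5, 0.0, 0.0]
```

The autocorrelation decays slowly because lag 2 and lag 3 inherit correlation through lag 1, while the partial autocorrelation drops to zero after lag 1 once that indirect path is removed.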
The following example uses the plot_pacf() function from the statsmodels library to calculate and plot the partial autocorrelation function for the first 50 lags of the minimum daily temperatures dataset.
from pandas import read_csv
from matplotlib import pyplot
from statsmodels.graphics.tsaplots import plot_pacf
# Load the single data column as a Series indexed by date
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
# Partial autocorrelation plot limited to the first 50 lags
plot_pacf(series, lags=50)
pyplot.show()
