
Information leakage and computational complexity of EMD-like methods in time series prediction

2022-06-09 22:31:00 · Cyril_KI

I. Preface

Many recent time series prediction papers follow the routine of EMD decomposition + LSTM, and report good results. However, almost none of them describe their data processing in detail, or explain how the method would actually work in a real application; they stay at the theoretical level. This article therefore discusses how to process the data when using EMD-style methods.

II. Information leakage and computational cost

The data processing in most papers goes like this: first, apply EMD to all of the data, obtaining multiple components. Then split each component into a training set and a test set, train one model per component on its training set, and evaluate it on its test set. To recover the true/predicted values of the original series, simply sum the true/predicted values of all the components.
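As a minimal sketch of that (flawed) order, with a crude trend/residual split standing in for real EMD and `leaky_pipeline` a hypothetical name:

```python
import numpy as np

def toy_decompose(x):
    # Stand-in for EMD: a crude trend/residual split, NOT real EMD.
    trend = np.convolve(x, np.ones(5) / 5, mode="same")
    return [x - trend, trend]

def leaky_pipeline(series, decompose, train_frac=0.8):
    """The common (flawed) routine: decompose the WHOLE series first,
    then split each component into train/test afterwards."""
    imfs = decompose(series)              # the decomposition sees test data!
    n_train = int(len(series) * train_frac)
    return [(imf[:n_train], imf[n_train:]) for imf in imfs]
```

The components still sum back to the original series, but every training-set component was computed using test-set values.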

However, the usual practice in time series prediction is different: first split all the data into a training set and a test set, then normalize and feature-engineer the training set, and only then train the model. This keeps the training set and test set separate, so the training data is independent of the test data. Notably, when normalizing the test set, we use the maximum/minimum of the training set. Now look back at EMD: if we decompose all the data first and split afterwards, a problem arises: the training set contains future data! In computing each component of the training set, we used data from the test set, yet for model training the test data should be unknown. This is the information leakage problem.
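A minimal sketch of the correct order (the function name is hypothetical): split first, then scale both sets using statistics from the training set only:

```python
import numpy as np

def split_then_scale(series, train_frac=0.8):
    """Split FIRST, then min-max normalize both sets using only the
    training set's minimum and maximum, so nothing leaks from the test set."""
    n_train = int(len(series) * train_frac)
    train, test = series[:n_train], series[n_train:]
    lo, hi = train.min(), train.max()
    return (train - lo) / (hi - lo), (test - lo) / (hi - lo)
```

On a trending series the scaled test values may fall outside [0, 1]; that is expected, precisely because the test set must not influence the scaling.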

Besides, because the number of components produced by EMD depends heavily on the distribution and length of the series, further problems appear when the model is deployed. Suppose we use the previous 24 time steps to forecast the next 12. In the training phase the data decomposed into 10 components, so we trained 10 component models. After training, the model goes into service: to predict the next 12 hours, we only need the most recent 24 points. The key question is: how do we decompose these 24 points into 10 components? Only if the 24 points decompose into exactly 10 components can we use the 10 trained models. In reality this is almost impossible: the data used for training is hundreds to tens of thousands of points long, and such long series may well decompose into 10 components, but 24 points almost certainly will not.
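A commonly cited rule of thumb (EMD behaves roughly like a dyadic filter bank, so an n-point series yields at most about log2(n) IMFs) makes the mismatch concrete; `approx_max_imfs` is a hypothetical helper:

```python
import math

def approx_max_imfs(n):
    """Rough upper bound on the number of IMFs EMD can extract from an
    n-point series: about log2(n), per the dyadic filter bank view of EMD."""
    return int(math.log2(n))
```

A 10,000-point training series can plausibly yield 10 or more IMFs, but a 24-point window tops out around 4.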

Assume the time series used for training and testing is D, of length T+k: the first T points are for model training, the last k for testing, and we use the previous 24 points to forecast the next 12. For the two problems above, there are broadly three solutions:

2.1 Multiple trainings + multiple predictions

To obtain every prediction in the test set, we proceed as follows:

  1. Apply EMD to the first T points to obtain multiple components, then train one model per component.
  2. After training, feed D[T-23:T] into each component model to obtain 12 predicted outputs, then superpose them to get the predicted values of D[T+1:T+12].
  3. Apply EMD to D[13:T+12], again obtaining multiple components, and train one model per component. Then use the 24 points D[T-11:T+12] to obtain 12 predictions from each component model; superposing them gives the 12 predicted values of D[T+13:T+24].
  4. Repeat steps 1-3, sliding the decomposition window by 12 points each time, decomposing the data in the window, training the models, and predicting, until all predicted values of D[T+1:T+k] are obtained.
  5. Compute the evaluation metrics from the predicted and true values over D[T+1:T+k].

This method causes no information leakage, but it is extremely expensive computationally. Its biggest advantage, however, is that the number of IMFs does not need to stay the same across windows: if decomposing D[1:T] yields 10 components, then every point of D[T-23:T] also has 10 components and we can predict directly; if decomposing D[13:T+12] yields 12 components, then every point of D[T-11:T+12] also has 12 components and we can again predict directly.
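The loop in steps 1-4 can be sketched as follows; the decomposition and the per-component "model" are toy stand-ins (not real EMD or an LSTM), and all function names are hypothetical:

```python
import numpy as np

def toy_decompose(x):
    # Stand-in for EMD: a crude trend/residual split, NOT real EMD.
    trend = np.convolve(x, np.ones(5) / 5, mode="same")
    return [x - trend, trend]

def toy_fit_predict(imf, win, h):
    # Stand-in for training an LSTM on one component and forecasting h
    # steps: here just the mean of the component's last `win` values.
    return np.full(h, np.mean(imf[-win:]))

def rolling_retrain_forecast(D, T, k, win=24, h=12):
    """Method 2.1 sketch: before each h-step forecast, re-decompose all
    data up to the forecast origin and retrain every component model,
    so the decomposition never sees future data."""
    preds = []
    for start in range(0, k, h):
        history = D[:T + start]            # nothing beyond the forecast origin
        imfs = toy_decompose(history)
        step = sum(toy_fit_predict(imf, win, h) for imf in imfs)
        preds.extend(step[:min(h, k - start)])
    return np.array(preds)
```

The cost is one full decomposition plus one full round of model training per 12-step forecast, which is where the heavy computation comes from.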

2.2 Single training + multiple decomposition predictions

The main difference between method 2.2 and method 2.1 is that we train only once:

  1. First apply EMD to D[1:T], then train one model per component.
  2. At prediction time, first use the decomposed D[T-23:T] to obtain the predicted values of D[T+1:T+12]; then decompose D[13:T+12] and use D[T-11:T+12] to obtain the predicted values of D[T+13:T+24].
  3. Predict in this rolling fashion until the end of the test set.

In other words, each time we decompose the most recent window whose length matches the training set, then predict. This method causes no information leakage, but the computational cost is still large. Moreover, it has a fatal flaw: every window of T points must decompose into the same number of IMFs, otherwise the trained models cannot be used.
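A sketch of method 2.2 under the same toy stand-ins; the key line is the IMF-count check, which is exactly the fatal flaw: if any window decomposes into a different number of components, the scheme fails.

```python
import numpy as np

def toy_decompose(x):
    # Stand-in for EMD: a crude trend/residual split, NOT real EMD.
    trend = np.convolve(x, np.ones(5) / 5, mode="same")
    return [x - trend, trend]

def single_train_rolling_forecast(D, T, k, win=24, h=12):
    """Method 2.2 sketch: decompose D[:T] once and 'train' one model per
    component; at prediction time re-decompose the latest T points and
    demand the same IMF count, otherwise the trained models are unusable."""
    n_imfs = len(toy_decompose(D[:T]))         # fixed at training time
    preds = []
    for start in range(0, k, h):
        imfs = toy_decompose(D[start:T + start])   # same length T as training
        if len(imfs) != n_imfs:
            raise ValueError("IMF count changed between windows")
        # Toy per-component forecast: mean of the component's last `win` values.
        step = sum(np.full(h, np.mean(imf[-win:])) for imf in imfs)
        preds.extend(step[:min(h, k - start)])
    return np.array(preds)
```

Training happens once, but a full T-point decomposition is still needed for every 12-step forecast.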

2.3 Sliding decomposition to construct samples

Specific steps :

  1. Decompose D[1:24] to obtain component set X, then decompose D[13:36] to obtain component set Y, and construct one sample (x, y) per component, where x is the 24 points (from X) and y is the last 12 points (from Y).
  2. Decompose D[13:36] and D[25:48] separately: D[13:36] gives x, and D[37:48] (the last 12 points of the second window) gives y; again generate one sample per component.
  3. Repeat, sliding both windows by 12 points each time, until the end of the training set, to obtain all training samples.
  4. Use the samples from the first three steps to train the component models.
  5. At prediction time, decompose the most recent 24 points each time, then predict and superpose.

This method also avoids information leakage, but the computational cost is again large. Moreover, it shares the flaw of method 2.2: every 24 points must decompose into the same number of IMFs, otherwise training fails and the trained models cannot be used.
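The sample construction in steps 1-3 can be sketched as follows (the decomposition is a toy stand-in for EMD, and `build_sliding_samples` is a hypothetical name):

```python
import numpy as np

def toy_decompose(x):
    # Stand-in for EMD: a crude trend/residual split, NOT real EMD.
    trend = np.convolve(x, np.ones(5) / 5, mode="same")
    return [x - trend, trend]

def build_sliding_samples(D, T, win=24, h=12):
    """Method 2.3 sketch: each (x, y) pair comes from two overlapping
    windows decomposed separately, so neither the input nor the target
    decomposition ever sees data beyond its own window."""
    samples = []
    for s in range(0, T - win - h + 1, h):
        imfs_x = toy_decompose(D[s:s + win])             # input components
        imfs_y = toy_decompose(D[s + h:s + h + win])     # target window
        if len(imfs_x) != len(imfs_y):
            continue  # IMF counts must match, or this pair is unusable
        samples.append((imfs_x, [imf[-h:] for imf in imfs_y]))
    return samples
```

Each sample pairs the 24-point components of one window with the last 12 points of the components of a window shifted forward by 12.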

2.4 Summary

  1. Method 2.1 has no loopholes, but its computational cost is the largest of the three.
  2. Methods 2.2 and 2.3 both suffer from the uncertain number of components.
  3. For the problems of the latter two methods, some articles propose a fix. Anyone who has used PyEMD knows that you can set a maximum number of IMFs when decomposing, although the result will never exceed the natural upper limit for that series. So we can fix a (usually small) number of components in advance and decompose to that number every time. There is still an extreme case: a given window may not reach even that small number. In that case we can append zeros to the data and decompose again, continuing to append zeros until the target number of IMFs is reached. Because the appended zeros decompose into components that sum back to 0 (or a little noise), they have little effect on the final result. Note, however, that this zero-padding is not advisable during model training; it can only be used at prediction time.
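A sketch of the zero-padding trick, with a toy decomposition whose component count grows roughly like log2(n) to mimic EMD's behavior; all names are hypothetical:

```python
import math
import numpy as np

def toy_decompose(x):
    # Stand-in for EMD: component count grows roughly like log2(n).
    k = max(1, int(math.log2(len(x))))
    return [np.asarray(x, dtype=float) / k for _ in range(k)]

def decompose_to_fixed_count(window, n_target, pad=12):
    """Append zeros to the window until the decomposition yields at least
    n_target components, then keep the first n_target. Per the text, this
    is only safe at PREDICTION time, never during training, since the
    zero components sum back to (roughly) zero."""
    data = np.asarray(window, dtype=float)
    while True:
        imfs = toy_decompose(data)
        if len(imfs) >= n_target:
            return imfs[:n_target]
        data = np.concatenate([data, np.zeros(pad)])
```

With real EMD the padded components are longer than the original window, so only the positions corresponding to the original 24 points would be fed to the models.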

This improved method remains to be tested!

III. References

A review and discussion of decomposition-based hybrid models for wind energy forecasting applications


Copyright notice
This article was written by [Cyril_KI]; please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/160/202206092146402199.html