Time series analysis: has the era of representation learning arrived?
2022-07-29 05:07:00 【fareise】
WeChat official account "Round Algorithm Notes": continuously updated notes and interpretations of cutting-edge industry work on NLP, CV, and search/recommendation.
Reply "communication" in the background to join the "Round Algorithm Notes" discussion group; reply "time series", "multimodal", "transfer learning", "NLP", "graph learning", "representation learning", "meta learning", etc. to get algorithm notes in the corresponding field.
Representation learning is at the heart of deep learning, and recently it has been applied more and more in the time series field: the era of representation learning for time series analysis has arrived. This article presents 5 core works on time series representation learning published at top conferences in recent years.
1. Unsupervised Scalable Representation Learning for Multivariate Time Series (NeurIPS'19)
The representation learning method in this paper is inspired by the classic word-embedding model CBOW. CBOW assumes that the representation of a word's context should be close to the representation of the word itself, and far from randomly sampled other words. This paper applies the same idea to time series representation learning. First, a CBOW-style context and random negative samples are constructed, as shown in the figure below. A time series x_ref is selected, along with a subsequence x_pos of it; x_ref can be viewed as the context of x_pos. Meanwhile, multiple negative samples x_neg are randomly drawn from other time series, or from other segments of the current series. This yields a CBOW-like loss that pulls x_ref and x_pos together while pushing x_ref away from the negative samples x_neg.

[Figure: constructing x_ref, its subsequence x_pos, and randomly sampled negatives x_neg]
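To make the objective concrete, below is a minimal PyTorch sketch of this triplet-style loss, assuming an encoder f that maps a (batch, channels, length) series to a fixed-size vector; the function and tensor names are illustrative, not the paper's code.

```python
import torch
import torch.nn.functional as F

def triplet_loss(f, x_ref, x_pos, x_negs):
    """CBOW-style loss: pull f(x_ref) toward f(x_pos),
    push it away from each negative sample in x_negs.

    x_ref, x_pos: (batch, channels, length) tensors
    x_negs:       (batch, K, channels, length), K negatives per anchor
    """
    z_ref = f(x_ref)                      # (batch, dim)
    z_pos = f(x_pos)                      # (batch, dim)
    # Positive term: maximize sigma(<z_ref, z_pos>)
    pos = -F.logsigmoid((z_ref * z_pos).sum(dim=-1))
    # Negative terms: maximize sigma(-<z_ref, z_neg_k>) for each k
    neg = 0.0
    for k in range(x_negs.size(1)):
        z_neg = f(x_negs[:, k])           # (batch, dim)
        neg = neg - F.logsigmoid(-(z_ref * z_neg).sum(dim=-1))
    return (pos + neg).mean()
```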
For the model structure, the paper uses stacked dilated causal convolutions. This architecture was covered in detail in a previous article; interested readers can refer to: 12 Top Papers: A Summary of Classic Deep Learning Approaches to Time Series Forecasting.

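The encoder can be sketched as a stack of dilated causal 1-D convolutions whose dilation doubles at each layer; this is a simplified illustration (the paper's network additionally uses residual connections and other details omitted here):

```python
import torch.nn as nn

class CausalDilatedEncoder(nn.Module):
    """Stack of dilated causal Conv1d layers; dilation doubles per layer
    so the receptive field grows exponentially with depth."""
    def __init__(self, in_channels, hidden, out_dim, depth=4):
        super().__init__()
        layers = []
        ch = in_channels
        for i in range(depth):
            d = 2 ** i
            # Left-pad so the convolution never looks into the future.
            layers += [nn.ConstantPad1d((2 * d, 0), 0.0),
                       nn.Conv1d(ch, hidden, kernel_size=3, dilation=d),
                       nn.ReLU()]
            ch = hidden
        self.net = nn.Sequential(*layers)
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, x):                  # x: (batch, channels, length)
        h = self.net(x)                    # (batch, hidden, length)
        h = h.max(dim=-1).values           # global max pooling over time
        return self.head(h)                # (batch, out_dim)
```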
2. Unsupervised representation learning for time series with temporal neighborhood coding (ICLR'21)
The method proposed in this paper differs from the previous one in how positive and negative samples are selected and how the loss function is designed. First, sample selection: for a window centered at time t, the paper uses a Gaussian distribution to define the sampling range of positive samples. The Gaussian is centered at t, and its other parameter is set by the size of the time window. To choose the window size, the paper uses the ADF (Augmented Dickey-Fuller) test to select the optimal span: if the window is too long, sampled positives may be unrelated to the anchor; if it is too small, positives will overlap too much with the anchor. The ADF test detects the time window over which the series stays stationary, which gives the most appropriate sampling range.

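Below is a sketch of the window selection and positive sampling, using the adfuller test from statsmodels; the doubling search over window sizes and the significance threshold are illustrative choices, not necessarily the paper's exact procedure:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def stationary_window(series, t, min_w=8, max_w=256, alpha=0.05):
    """Grow the window around t while the ADF test still rejects
    non-stationarity (p-value below alpha); return the last good size."""
    w = min_w
    while w * 2 <= max_w:
        seg = series[max(0, t - w): t + w]
        if adfuller(seg)[1] >= alpha:      # segment no longer stationary
            break
        w *= 2
    return w

def sample_positive_center(t, window, length, rng=np.random):
    """Draw the center of a positive sample from a Gaussian around t,
    with std tied to the stationary window size."""
    c = int(rng.normal(loc=t, scale=window / 2))
    return int(np.clip(c, 0, length - 1))
```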
For the loss function, this paper mainly addresses the problem of false negatives. If all samples outside the window selected above are treated as negatives, false negatives are likely: samples that are actually related to the anchor but lie far from it get mislabeled as negative. For example, with yearly-seasonal data and a one-month window, the same period of last year could be treated as a negative sample. This hurts training and makes convergence difficult. To solve this, the paper treats out-of-window samples not as negatives but as unlabeled samples. In the loss, each such sample gets a weight representing the probability that it is positive. This approach is known as Positive-Unlabeled (PU) learning. With D a discriminator judging whether two representations are neighbors, N_t the neighborhood of t, and \bar{N}_t its complement, the final loss can be written as:

$$\mathcal{L} = -\mathbb{E}\Big[\mathbb{E}_{z_l \sim N_t}\big[\log D(z_t, z_l)\big] + \mathbb{E}_{z_k \sim \bar{N}_t}\big[w_t \log D(z_t, z_k) + (1 - w_t)\log\big(1 - D(z_t, z_k)\big)\big]\Big]$$
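A minimal PyTorch sketch of this PU-style objective, with a bilinear discriminator D that judges whether two representations are neighbors; treating the weight w as a fixed hyperparameter is a simplification (the paper estimates the probability that a sample is positive):

```python
import torch
import torch.nn as nn

class TNCLoss(nn.Module):
    """Neighbors get log D(z_t, z_l); out-of-window (unlabeled) samples get
    a w-weighted mix of the positive and negative log-likelihoods."""
    def __init__(self, dim, w=0.2):
        super().__init__()
        self.D = nn.Bilinear(dim, dim, 1)  # discriminator: are z1, z2 neighbors?
        self.w = w

    def forward(self, z_t, z_nbr, z_unl):
        eps = 1e-7
        p_nbr = torch.sigmoid(self.D(z_t, z_nbr)).clamp(eps, 1 - eps)
        p_unl = torch.sigmoid(self.D(z_t, z_unl)).clamp(eps, 1 - eps)
        loss_nbr = -torch.log(p_nbr)
        loss_unl = -(self.w * torch.log(p_unl)
                     + (1 - self.w) * torch.log(1 - p_unl))
        return (loss_nbr + loss_unl).mean()
```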
3. A transformer-based framework for multivariate time series representation learning (KDD'21)
This paper borrows ideas from Transformer-based pre-trained language models, hoping to learn good multivariate time series representations in an unsupervised way using a Transformer architecture. The focus of the paper is the unsupervised pre-training task designed for multivariate time series. As shown on the right side of the figure below, for an input multivariate series, a certain proportion of subsequences is masked (they cannot be too short), and each variable is masked independently rather than masking all variables at the same positions. The pre-training objective is to reconstruct the entire multivariate series. When filling in a masked segment, the model can thus use both the preceding and following parts of the same variable, as well as the unmasked variables over the same time span.

[Figure: the pre-training framework; the right side shows per-variable masking of subsequences]
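A minimal sketch of the per-variable masking and reconstruction objective: contiguous segments are masked independently for each variable, and the loss is computed only over masked positions. The mask ratio and segment length are illustrative hyperparameters:

```python
import torch

def mask_per_variable(x, ratio=0.15, seg_len=5):
    """x: (batch, length, n_vars). Mask contiguous segments independently
    per variable; returns the masked input and the boolean mask."""
    b, t, v = x.shape
    mask = torch.zeros(b, t, v, dtype=torch.bool, device=x.device)
    n_segs = max(1, int(ratio * t / seg_len))
    for i in range(b):
        for j in range(v):                         # each variable separately
            for _ in range(n_segs):
                s = torch.randint(0, t - seg_len + 1, (1,)).item()
                mask[i, s:s + seg_len, j] = True
    return x.masked_fill(mask, 0.0), mask

def reconstruction_loss(model, x):
    """Pre-training objective: MSE on masked positions only."""
    x_masked, mask = mask_per_variable(x)
    x_hat = model(x_masked)                        # (batch, length, n_vars)
    return ((x_hat - x) ** 2)[mask].mean()
```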
The figure below shows the effect of the unsupervised pre-trained model on the forecasting task. The left panel compares RMSE with and without unsupervised pre-training under different amounts of labeled data: no matter how much labeled data is available, adding unsupervised pre-training improves forecasting. The right panel shows that the more unlabeled pre-training data is used, the better the final forecasting fit.

[Figure: left, RMSE with vs. without unsupervised pre-training across labeled-data sizes; right, forecasting quality vs. amount of pre-training data]
4. Time-series representation learning via temporal and contextual contrasting (IJCAI'21)
This paper uses contrastive learning for time series representation. First, for the same time series, a strong and a weak data augmentation generate two views of the original sequence. Strong augmentation splits the original sequence into multiple segments, shuffles them, and adds random perturbations; weak augmentation scales or shifts the original sequence.

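A numpy sketch of the two augmentations for a univariate series; the segment count, jitter scale, and scaling range are illustrative values:

```python
import numpy as np

def strong_augment(x, n_segments=8, jitter_std=0.05, rng=np.random):
    """Permutation-and-jitter: split into segments, shuffle them,
    then add Gaussian noise."""
    segs = np.array_split(x, n_segments)
    rng.shuffle(segs)
    out = np.concatenate(segs)
    return out + rng.normal(0.0, jitter_std, size=out.shape)

def weak_augment(x, scale_range=(0.9, 1.1), shift_std=0.1, rng=np.random):
    """Scale-and-shift: multiply by a random factor and add a random offset."""
    scale = rng.uniform(*scale_range)
    shift = rng.normal(0.0, shift_std)
    return x * scale + shift
```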
Next, the strongly and weakly augmented sequences are both fed into a convolutional temporal network to obtain a representation at each time step. The paper then applies two contrastive objectives: Temporal Contrasting and Contextual Contrasting. Temporal Contrasting uses the context of one view to predict future time steps of the other view, pulling the prediction closer to the other view's true representation at those steps; a Transformer serves as the prediction model. In the formula below, c denotes the Transformer output (context) of the strong view, W_k is a mapping that projects c to a prediction k steps ahead, and z is the weak view's representation at the future step:

$$\mathcal{L}_{TC}^{s} = -\frac{1}{K}\sum_{k=1}^{K}\log\frac{\exp\big((W_k c_t^{s})^{\top} z_{t+k}^{w}\big)}{\sum_{n\in\mathcal{N}_{t,k}}\exp\big((W_k c_t^{s})^{\top} z_n^{w}\big)}$$
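A simplified PyTorch sketch of this cross-view predictive loss for a single prediction step k, using the other samples in the batch as negatives; c_strong and z_weak_future are illustrative names for the quantities c and z above:

```python
import torch
import torch.nn.functional as F

def temporal_contrast(c_strong, z_weak_future, W_k):
    """c_strong:       (batch, dim) context from the strong view's Transformer
    z_weak_future:  (batch, dim) weak view's true representation k steps ahead
    W_k:            nn.Linear(dim, dim), the step-k prediction head.
    Other samples in the batch serve as negatives."""
    pred = W_k(c_strong)                         # (batch, dim)
    logits = pred @ z_weak_future.t()            # (batch, batch) similarity
    targets = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)      # diagonal = true future
```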
Contextual Contrasting is contrastive learning over whole sequences: the two views generated from the same sequence are pulled together, while views generated from different sequences are pushed apart. This mirrors image contrastive learning (an NT-Xent-style loss):

$$\mathcal{L}_{CC} = -\log\frac{\exp\big(\mathrm{sim}(c_i, c_j)/\tau\big)}{\sum_{k=1}^{2N}\mathbb{1}_{[k\neq i]}\exp\big(\mathrm{sim}(c_i, c_k)/\tau\big)}$$
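And a sketch of this NT-Xent-style loss over the 2N context vectors of a batch (both views concatenated); the temperature value is illustrative:

```python
import torch
import torch.nn.functional as F

def contextual_contrast(c1, c2, temperature=0.2):
    """c1, c2: (batch, dim) context vectors of the two views of each series.
    The positive for row i is the same series' other view, batch rows apart."""
    n = c1.size(0)
    c = F.normalize(torch.cat([c1, c2], dim=0), dim=-1)   # (2n, dim)
    sim = c @ c.t() / temperature                          # (2n, 2n)
    sim.fill_diagonal_(float('-inf'))                      # drop self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets.to(sim.device))
```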
5. TS2Vec: Towards Universal Representation of Time Series (AAAI'22)
The core idea of TS2Vec is also unsupervised learning: construct positive pairs through data augmentation, and use a contrastive objective to pull positive pairs close while pushing negative pairs apart. The paper's contribution lies in two aspects: first, a positive-pair construction method and contrastive objective designed for the characteristics of time series; second, hierarchical contrastive learning that exploits the multi-scale structure of time series.
For positive-pair construction, the paper proposes a method suited to time series: Contextual Consistency. Its core idea is that two different augmented views of a time series should be close at the same time step. Two ways of constructing such pairs are proposed. The first is Timestamp Masking: after a fully connected projection, the vector representations at some randomly chosen time steps are masked, and a CNN then re-extracts the representation of each time step. The second is Random Cropping: two subsequences with a common overlap are selected as a positive pair. Both methods make the vector representations of the same time step closer.

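A sketch of the two pair-construction tricks; the mask probability and crop bounds are illustrative, and the cropping below is simplified to two crops that share the same right endpoint:

```python
import torch

def timestamp_mask(h, p=0.5):
    """Randomly zero out the projected representation at some time steps.
    h: (batch, length, dim), output of the fully connected projection."""
    keep = (torch.rand(h.shape[:2], device=h.device) > p).unsqueeze(-1)
    return h * keep                                        # (batch, length, dim)

def random_crops(x, min_overlap=16):
    """Take two overlapping crops of the same series as a positive pair.
    x: (batch, length, n_vars); assumes length > min_overlap."""
    t = x.size(1)
    a1, a2 = sorted(torch.randint(0, t - min_overlap, (2,)).tolist())
    b = torch.randint(a2 + min_overlap, t + 1, (1,)).item()
    # crop1 = [a1, b), crop2 = [a2, b): they share the span [a2, b).
    return x[:, a1:b], x[:, a2:b]
```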
The other core point of TS2Vec is hierarchical contrastive learning. An important difference between time series and images or natural language is that aggregating at different frequencies yields series of different granularities: a daily series aggregated by week gives a weekly series, and aggregated by month gives a monthly series. To bring this hierarchy into contrastive learning, TS2Vec proposes hierarchical contrastive learning, which proceeds as follows. For two series that form a positive pair, a CNN first generates a vector representation for each time step; then max pooling repeatedly aggregates along the time dimension (the paper uses an aggregation window of 2). After each aggregation step, the contrastive loss is computed between the aggregated vectors at corresponding time steps, pulling the same time step closer. The granularity keeps getting coarser until the whole series is aggregated into a single vector, gradually realizing instance-level contrastive learning.
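Finally, a sketch of the hierarchical loss: at each level a per-time-step contrastive loss is computed, then both views are max-pooled along time with window 2 and the process repeats; step_loss stands in for TS2Vec's combined temporal and instance contrastive terms:

```python
import torch.nn.functional as F

def hierarchical_loss(z1, z2, step_loss):
    """z1, z2: (batch, length, dim) per-time-step representations of the
    two views; step_loss(z1, z2) computes a contrastive loss per level."""
    total, levels = 0.0, 0
    while z1.size(1) > 1:
        total += step_loss(z1, z2)
        levels += 1
        # Aggregate the time dimension with max pooling, window 2.
        z1 = F.max_pool1d(z1.transpose(1, 2), kernel_size=2).transpose(1, 2)
        z2 = F.max_pool1d(z2.transpose(1, 2), kernel_size=2).transpose(1, 2)
    total += step_loss(z1, z2)        # instance level: one vector per series
    return total / (levels + 1)
```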
【Past algorithm notes】
12 Top Papers: A Summary of Classic Deep Learning Approaches to Time Series Forecasting
How to Build a Transformer Model for Time Series Forecasting?
A Summary of Spatial-Temporal Modeling Methods for Time Series Forecasting
A Roundup of Recent NLP Prompt Work! Analysis of ACL 2022 Papers on Prompting
Classic Works in Graph Representation Learning: The Basics
An Extensive Review of 14 Pre-trained Language Models
Vision-Language Multimodal Modeling Methods
From ViT to Swin: The Development of Transformers in CV through 10 Top Papers