Paper notes: Analysis and Research on Similarity Query of Hydrological Time Series

2022-07-01 07:38:00 【Graduate students are not late】

Contents

1 Abstract
2 Introduction
3 Problem description
4 Theoretical methods
5 Piecewise linear representation based on feature points
- 5.1 Piecewise linear representation
- 5.2 Definition of feature points
6 Similarity measure of time series
- 6.1 Dynamic pattern matching distance (DPM)
- 6.2 Algorithm steps

Foreword: published in 《Hydrology》, 2009.
Authors: Li Wei, Sun Honglin

1 Abstract

Similarity query of hydrological time series can be used for rain-flood process prediction, environmental evolution analysis, analysis of the laws of hydrological processes, and other purposes.
Its most direct application is answering a question often asked in flood control command: "Which period in history had a process equivalent to the current hydrological process?"
To this end, the paper introduces the theory and techniques of data warehousing and data mining.

2 Introduction

[Figure not preserved in this copy]

3 Problem description

Traditional time series similarity search mainly emphasizes exact matching. In data mining applications, however, the data volume is so large that the search is generally an "approximate search" based on approximate matching.

The key tasks in hydrological time series similarity mining are:

(1) Subsequence division. In the National Hydrological Database, flood processes have already been divided according to runoff-generation theory, forming excerpts of the individual elements. Daily-value data, however, must be divided according to the type of problem being solved, and the division rules need to both conform to hydrological theory and suit computer processing.
(2) Sequence feature extraction. The sequence is usually transformed, for example by a Fourier transform, a wavelet transform, or piecewise averaging, to map it into a feature space.
(3) Determination of the similarity measure. Different hydrological processes have different characteristics, so an appropriate similarity measure must be chosen according to the characteristics of the process at hand.
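As an illustration of the feature extraction step, here is a minimal sketch of piecewise averaging. The function name `piecewise_average`, the segment count, and the sample flow values are assumptions made for this example; the paper does not specify them.

```python
def piecewise_average(series, n_segments):
    """Map a series onto n_segments features by averaging near-equal windows."""
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    features = []
    for k in range(n_segments):
        # Integer arithmetic spreads any remainder across the windows.
        start = k * n // n_segments
        end = (k + 1) * n // n_segments
        window = series[start:end]
        features.append(sum(window) / len(window))
    return features


# Hypothetical daily flow values, used only to show the call.
daily_flow = [12.0, 14.0, 30.0, 50.0, 42.0, 28.0, 18.0, 14.0]
print(piecewise_average(daily_flow, 4))  # [13.0, 40.0, 35.0, 16.0]
```

Eight daily values are reduced to four features, which is the kind of dimension reduction that makes approximate search over large archives practical.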

4 Theoretical methods

Similarity query of hydrological time series operates on hydrological data, and the process can be divided into two main stages: a query preparation stage and a similarity query stage.

(1) Query preparation stage. This includes data preprocessing and feature extraction of the time series.
① Data preprocessing is an essential task in any data mining job; in this model it involves data integration, data cleaning, data selection, and sequence normalization.
② Pattern representation of the time series is a prerequisite for time series data mining and one of the key problems in hydrological time series similarity mining; its quality directly affects the mining results.
(2) Similarity query stage. The user submits a query; the system performs pattern matching on the pattern representation according to the similarity measure and displays the results to the user visually.

Pattern matching (the similarity measure) and the pattern representation of time series are also called the two cornerstones of time series similarity query.
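The notes do not say which normalization the paper applies in the preprocessing step. As one common choice, and purely as an assumption, the sketch below uses z-score normalization so that series with different magnitudes can be compared by shape. The name `z_normalize` is hypothetical.

```python
import math


def z_normalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        # A constant series has no shape to compare; map it to all zeros.
        return [0.0] * n
    return [(x - mean) / std for x in series]


# Hypothetical water-level readings, used only to show the call.
print(z_normalize([10.0, 12.0, 14.0]))
```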

5 Piecewise linear representation based on feature points

Time series pattern representation:
The paper uses a piecewise linear representation (PLR) based on feature points as the pattern representation of the time series.
For time series with obvious periodicity and frequent short-term fluctuations, it compresses the data effectively while capturing how the overall pattern of the series changes.
An example of segmentation is shown in the figure below:

5.1 Piecewise linear representation

[Figure not preserved in this copy]

5.2 Definition of feature points

[Figure not preserved in this copy]
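The paper's own definition of feature points was given in an image that is not available here. As a stand-in, the sketch below assumes that feature points are the two endpoints plus the local extrema of the series, optionally filtered by a minimum change `min_change` from the previously kept point. That rule, the function names, and the sample data are all assumptions for illustration and may differ from the paper's definition.

```python
def feature_points(series, min_change=0.0):
    """Indices of endpoints and local extrema (assumed feature-point rule)."""
    n = len(series)
    if n < 3:
        return list(range(n))
    idx = [0]
    for i in range(1, n - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        # A sign flip means the series turns at i; flat plateaus are skipped.
        is_extremum = left * right < 0
        if is_extremum and abs(series[i] - series[idx[-1]]) >= min_change:
            idx.append(i)
    idx.append(n - 1)
    return idx


def to_segments(series, idx):
    """Join consecutive feature points; each segment is (length, slope)."""
    return [(b - a, (series[b] - series[a]) / (b - a))
            for a, b in zip(idx, idx[1:])]


# Hypothetical stage readings, used only to show the calls.
stage = [1.0, 2.0, 4.0, 3.0, 1.0, 2.0, 5.0]
points = feature_points(stage)
print(points)                      # [0, 2, 4, 6]
print(to_segments(stage, points))  # [(2, 1.5), (2, -1.5), (2, 2.0)]
```

Seven points collapse into three line segments, each summarized by a length and a slope.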

6 Similarity measure of time series

A similarity measure for time series should satisfy the following conditions:
（1） It allows imprecise matching and supports multiple kinds of deformation of the time series;
（2） It can be computed efficiently;
（3） It supports fast indexing;
（4） It can be applied to other areas of data mining, such as clustering and classification of time series, frequent pattern discovery, and anomaly discovery.
Common similarity measures include the Minkowski distance, the dynamic time warping (DTW) distance, and the longest common substring.
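To make the contrast between the first two measures concrete, here is a minimal sketch of the Minkowski distance and the textbook dynamic time warping recurrence. The function names and sample series are assumptions for illustration, and the DTW version omits windowing constraints and other optimizations.

```python
def minkowski(a, b, p=2):
    """Minkowski distance between equal-length series (p=2 is Euclidean)."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def dtw(a, b):
    """Dynamic time warping distance using absolute difference as point cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats
                                 cost[i][j - 1],      # b[j-1] repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


print(minkowski([1.0, 2.0, 3.0], [2.0, 2.0, 5.0], p=1))  # 3.0
print(dtw([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0]))        # 0.0
```

The Minkowski distance requires equal lengths and point-by-point alignment, while DTW tolerates stretching along the time axis, which is why the second pair of series has distance 0 despite the different lengths.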

6.1 Dynamic pattern matching distance (DPM)

The DPM distance is computed by matching patterns rather than individual points.
Advantages: the definition of a pattern is very flexible, and the average pattern length is generally much greater than 1, which reduces the dimensionality of the time series (the number of patterns in a series is much smaller than the length of the series).

6.2 Algorithm steps

(1) Define the patterns. Extract pattern features from the time series, transforming it into feature space to obtain its pattern representation. For a piecewise linear representation, a pattern is a line segment interpolated over a stretch of the time series, and it can be characterized by the segment's length, slope, and so on.
(2) Define the distance between patterns.
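The notes stop before giving the pattern distance or the final matching step. The sketch below is therefore an assumption, not the paper's algorithm: each pattern is a `(length, slope)` pair, the pattern distance is a hypothetical weighted sum of the length and slope differences, and the pattern sequences are aligned with a DTW-style dynamic program, as the name "dynamic pattern matching" suggests.

```python
def pattern_distance(p, q, w_len=0.5, w_slope=0.5):
    """Hypothetical distance between two (length, slope) patterns."""
    return w_len * abs(p[0] - q[0]) + w_slope * abs(p[1] - q[1])


def dpm_distance(patterns_a, patterns_b, dist=pattern_distance):
    """DTW-style alignment over patterns instead of raw points (assumed)."""
    inf = float("inf")
    n, m = len(patterns_a), len(patterns_b)
    # cost[i][j] is the best cost of aligning the first i and j patterns.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(patterns_a[i - 1], patterns_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical pattern sequences: a rise followed by a fall.
query = [(2, 1.5), (2, -1.5)]
candidate = [(4, 1.5), (2, -1.5)]
print(dpm_distance(query, query))      # 0.0
print(dpm_distance(query, candidate))  # 1.0
```

Because the dynamic program runs over patterns rather than points, its table size depends on the number of segments, which is where the dimension reduction mentioned in 6.1 pays off.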

Original site

Copyright notice
This article was written by [Graduate students are not late]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/182/202207010719060090.html