A deep learning model for urban traffic flow prediction with traffic events mined from twitter
2022-06-24 21:51:00 【panbaoran913】
Original paper: see here
Authors: Aniekan Essien, Ilias Petrounias, Pedro Sampaio, Sandra Sampaio
Journal:
Keywords: traffic flow prediction; LSTM stacked autoencoder; deep learning; prediction with tweet information; intelligent transportation systems
Informal assessment: The main point of interest in this paper is that the data has three parts: traffic + weather + Twitter. The architecture is a deep, bi-directional extension of the LSTM and has no spatial structure. The English of the original paper also contains many grammatical errors, which makes it hard to read. Unless you want to study the characteristics of the data, I do not recommend reading it.
Abstract
Short-term traffic parameter prediction is an important component of modern urban traffic management and control systems. Data-driven traffic models lose prediction accuracy when they encounter non-recurring or non-routine traffic events such as accidents, road closures, and extreme weather conditions. Analyzing and mining data from social networks (Twitter in particular) can improve urban traffic parameter prediction by supplementing traffic data with information reported in social media posts about events that can disrupt normal traffic patterns. This paper presents a deep learning model for urban traffic prediction that combines information extracted from tweets with traffic and weather information. The prediction model adopts a deep bi-directional Long Short-Term Memory (LSTM) stacked autoencoder (SAE) architecture and uses tweet, traffic, and weather datasets for multi-step traffic flow prediction. The model is evaluated on an urban road network in Greater Manchester, England. Extensive empirical analysis on real data demonstrates the effectiveness of the approach in improving prediction accuracy compared with other classical/statistical and state-of-the-art machine learning (ML) models. Improved prediction accuracy can reduce frustration for road users, save costs for businesses, and reduce harm to the environment.
1 Introduction
Reducing traffic congestion is an important priority for cities around the world. Over the past few decades, considerable effort and research interest have gone into improving traffic speed prediction methods and developing Intelligent Transportation Systems (ITS) [26]. The success of ITS depends on the quality of the traffic information provided to transport stakeholders and on the ability to apply it in policy making, control systems, and traffic prediction models. Short-term traffic forecasting is a multidisciplinary research field that draws on contributions from mathematics, computer science, and engineering, among others. Accurately predicting traffic parameters is challenging because traffic is dynamic, complex, and stochastic. The complexity of traffic forecasting stems from the nature of the transport domain, including constraints imposed by the physical infrastructure (such as road network capacity), traffic regulations and management policies, the behavior of individual agents (road users), the calendar (time of day, day of week), the weather, accidents and incidents, events, and road closures.
Traffic data science has developed over many years, extending the range of data sources used to build prediction models. Early studies investigated the importance of weather data for traffic flow parameters through its influence on driving behavior, travel demand, travel patterns, road safety, and traffic flow characteristics. Years of research have also shown that rainfall reduces road capacity and operating speed, increasing congestion and the loss of road network productivity. For example, the authors of [9] report that rainfall intensity affects urban traffic flow characteristics by reducing traffic speed by 4-9%, and that there is a significant relationship between traffic congestion and temperature intensity during peak periods. Although weather is an important factor in traffic forecasting, most traffic prediction models used in ITS assume clear weather and therefore miss an important environmental data source that would allow a more accurate assessment of the state of the transport network.
Recent studies have therefore investigated the effect of incorporating non-traffic input datasets into urban traffic parameter prediction, and many (if not all) of these datasets improved prediction accuracy [11,12,22]. For example, one deep bi-directional LSTM model was trained on rainfall and temperature datasets in addition to traffic flow features. The results showed improved prediction accuracy compared with the baseline (traffic-only) dataset. Similar results have been obtained in other studies involving non-traffic input data [22,23]. This can be explained by the fact that data-driven traffic parameter prediction usually relies on predictive analytics applied to historical observations: patterns are extracted and then used to predict future observations. This works because urban traffic data are typically seasonal and periodic. Morning and evening peaks, for example, recur regularly and are therefore easy to predict. A model that can identify and learn these patterns from historical datasets can therefore forecast future traffic parameters "intelligently".
However, in unusual or non-recurring circumstances, such as incidents or events that cannot be inferred from historical observations, even the most accurate prediction model performs poorly [12]. Typical examples of non-recurring or random events/incidents include accidents, lane closures, and sports and public events. Because such events can be sudden, unexpected, and/or rare, robust prediction models are needed to predict traffic flow accurately in these cases. Existing studies have reported results on traffic parameter prediction using social media data; for example, one applied an optimization technique based on linear regression to road traffic prediction. Its authors tested the proposed model on online road traffic data obtained from the California Performance Measurement System (PeMS). Similarly, [30] presents a study in which social media data were used to develop a short-term traffic prediction model. That study incorporated Twitter data to predict traffic flow before sports games, and the approach was evaluated with four models: ARIMA, neural networks, support vector regression, and k-nearest neighbors (k-NN). Reference [1] presents a Kalman filter model that incorporates Twitter data for real-time prediction of bus arrival times.
Deep learning (DL) methods have advanced image and speech recognition, natural language processing (NLP), and intelligent game playing, and they are also used in short-term traffic forecasting [8,25,33]. DL refers to techniques that use stacked, multi-layer architectures [16] to learn complex, high-level features from data hierarchically. Training DL architectures on social media data from within the traffic prediction horizon offers additional opportunities to improve the accuracy of traffic prediction. This study aims to contribute to research in this area. More specifically, we propose an end-to-end implementation of a deep learning model for urban traffic flow prediction that integrates real-time information from traffic data, weather data, and social media tweets (which can contain information about non-recurring or unexpected events). Road users caught in traffic jams often post their traffic conditions and locations in tweets, providing important real-time information to other road users and to traffic management stakeholders.
Social media platforms for online discussion, such as Facebook, Twitter, Instagram, and Snapchat, have grown explosively over the past few years. These services are widely used for communication, news reporting, and advertising campaigns. They provide application programming interfaces (APIs) through which data can be retrieved in real time. Twitter is a popular public social media platform for short messages (up to 280 characters), and the resulting data stream can disseminate information about real-world events quickly and in a timely manner. Because of Twitter's huge user base and the volume and variety of information it gives access to, many studies have tried to use this online data repository for data mining purposes such as stock market price prediction [29], crime rate prediction [39], and traffic flow prediction [1,15,41].
Advanced Traveler Information Systems (ATIS) such as Waze and TomTom already leverage publicly reported social media data to improve their traffic navigation and route guidance systems. In general, quite a few Twitter accounts report current traffic conditions, and road users can use them to infer future traffic conditions and inform their choice of travel mode. In the north of England, for example, Highways England (@HighwaysNWEST), Transport for Greater Manchester (TfGM, @OfficialTfGM), @nwtrafficnews, and Waze (@WazeTrafficMAN) are typical examples of accounts that provide road traffic information. In addition to tweets posted by major organizations in the transport domain, road users can also tweet on their own timelines, broadcasting current road traffic conditions to their followers; future traffic conditions can be inferred by mining these tweets.
We therefore propose an urban traffic flow prediction method that uses information from Twitter, which can contain information about non-recurring traffic events, in addition to traffic and weather datasets. The study in [11] showed that the prediction performance of a model improves when weather-related (rainfall and temperature) datasets are integrated for urban traffic speed prediction. The method proposed in this paper is an enhanced version of the method proposed in [11], and its purpose is to improve prediction accuracy. Twitter is emphasized as an auxiliary representation of traffic status because, compared with users of other social media platforms, Twitter users tend to react more quickly to events.
This study makes two contributions: (1) it empirically evaluates whether tweets, in addition to weather and traffic datasets, improve urban traffic flow prediction; and (2) it implements an end-to-end deep LSTM autoencoder-based traffic flow prediction model trained on tweet, weather, and traffic datasets. The model is evaluated using traffic, geo-specific Twitter, and weather datasets for the A56 (Chester Road) in Stretford, Greater Manchester, England. Note that because deep learning models take a long time to train, adding an additional data source (tweets) may significantly affect the computational requirements of the overall model. We therefore use a bi-directional LSTM neural network in an autoencoder architecture. The autoencoder also serves as a dimensionality reduction component, allowing the model to train in less time because the input vector is reduced to a lower-dimensional space [16].
Because traffic data (time series) have spatial distribution and sequential properties, it is important to make the most of all the information they contain. The basic operation of LSTM RNNs is to process the input time series in chronological order in a chain structure [19], propagating in the positive or "forward" direction from time step t-1 to time step t. For sequential datasets it can therefore be useful to include a bi-directional architecture, which also considers backward propagation by passing the reversed sequence to the LSTM model. Intuitively, using bi-directional LSTMs on urban traffic data should lead to more accurate predictions, because learning the data "in reverse" can sometimes be useful. For example, learning backwards from, say, Friday night can help infer traffic conditions on Saturday morning or afternoon: if many people celebrate until late on Friday night (for example, on New Year's Eve), fewer people may travel on Saturday morning or afternoon. Conversely, a typical example is a weather forecast of snow tomorrow, which may affect today's traffic because people may want to go shopping today to avoid tomorrow's snow. Empirical results in the literature also show that bi-directional LSTMs give some improvement over uni-directional LSTMs for traffic prediction [3,13]. We include bi-directional LSTMs in our model to improve its prediction performance. In addition, because urban traffic datasets have highly complex pattern structures, bi-directional sequence/representation learning may prove to be a more robust solution.
One of the major challenges of including tweets in traffic forecasting is determining their authenticity and accuracy and filtering out the high level of noise in unstructured datasets [15]. To account for this, we adjusted our algorithm to prioritize tweets from the Twitter accounts of road traffic organizations, specifically Transport for Greater Manchester (@OfficialTfGM) and Waze (@WazeTrafficMAN). The results show that prediction accuracy improves significantly when information from the Twitter feed is included.
The rest of this article is organized as follows. Section 2 reviews existing research on short-term traffic forecasting and on incorporating Twitter into traffic forecasting. Section 3 outlines the proposed approach, including a brief discussion of key concepts. Section 4 describes the datasets used in this study. Section 5 outlines the experimental setup, model description, and performance evaluation metrics. Section 6 presents the results, and Section 7 summarizes the study and proposes future work.
2 Related studies
This section reviews related research. It first gives an overview of short-term traffic forecasting and the technical background of parametric and non-parametric models, and concludes with a review of research on traffic forecasting that incorporates Twitter messages.
2.1 Short-term traffic prediction
Short-term traffic forecasting techniques can be broadly divided into parametric and non-parametric methods. A parametric model summarizes the data by reducing the input data to a known function. Parametric models are sometimes called model-based forecasting methods, because the model structure is fixed and the model parameters are computed from empirical data [16]. The autoregressive integrated moving average (ARIMA) model [37] is one of the earliest parametric prediction models. The ARIMA model is defined by a difference equation, Eq. (1).

In Eq. (1), $p$ and $q$ are integers greater than or equal to zero that denote the orders of the autoregressive and moving average components, respectively. Successful application of the ARIMA($p$, $d$, $q$) model requires the input time series to be stationary. For this reason, differencing is sometimes used to induce stationarity in the dataset; it takes successive differences between observations. The third parameter, the degree of differencing $d$, is therefore also needed. For example, if $d=0$, $y_t = Y_t$; if $d=1$, $y_t = Y_t - Y_{t-1}$, where $Y_t$ is the original series and $y_t$ the differenced series.
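The differencing step described above can be sketched in a few lines. The function name `difference` and the sample values are illustrative assumptions, not taken from the paper; the code simply applies first-order differencing $d$ times.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (d=0 returns the series unchanged)."""
    result = list(series)
    for _ in range(d):
        # Each pass replaces the series with successive differences Y_t - Y_{t-1}.
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


# Hypothetical 5-minute flow counts with a steady upward trend.
flows = [100, 110, 120, 130, 140]
print(difference(flows, d=0))  # the original series
print(difference(flows, d=1))  # constant differences: the linear trend is removed
```

Each pass shortens the series by one observation, which is why the degree of differencing is usually kept small.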
However, the model's main assumption is stationarity of the mean, variance, and autocorrelation. This is a significant drawback, because it tends to miss the extreme values that are common in traffic datasets [14]. Traffic parameters tend to peak during rush hours and fluctuate rapidly with events or accidents. ARIMA predictors therefore show weaknesses when applied to traffic prediction.
In non-parametric models, the algorithm "learns" from the data to select the function that best fits the training dataset, which means such models can fit many functional forms to a given dataset [16]. k-nearest neighbors (k-NN) is generally considered the easiest non-parametric machine learning model to implement [34] and has been widely studied for traffic volume prediction [34,40,43]. The logic behind the model is that a new observation is likely to belong to the same class as its k most similar observations in the feature space [42]. The parameters of the model are the state vector, the distance metric, the number of nearest neighbors k, and the prediction algorithm. The distance metric measures how close a sample is to the test data. It is computed using the Euclidean distance, Eq. (2).
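A minimal sketch of k-NN forecasting with a Euclidean distance metric follows. The state vectors, the value of k, and the choice of averaging the neighbors' targets as the prediction algorithm are all assumptions made for illustration; the paper does not specify them here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(history, query, k=3):
    """Predict by averaging the targets of the k nearest historical states.

    history: list of (state_vector, next_value) pairs.
    """
    nearest = sorted(history, key=lambda pair: euclidean(pair[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Hypothetical states: (flow at t-1, flow at t) -> flow at t+1.
history = [
    ((100, 110), 120),
    ((105, 112), 118),
    ((300, 310), 305),
    ((98, 108), 122),
]
print(knn_predict(history, query=(101, 109), k=3))
```

The outlying state (300, 310) is ignored because it is far from the query in feature space, so only the three similar states contribute to the forecast.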
The artificial neural network (ANN) is another type of non-parametric traffic prediction model, inspired by the inner workings of the human brain [5]. Models of this kind learn well and can handle multidimensional and nonlinear data. The basic building block of an ANN is the multilayer perceptron (MLP), described by Eq. (3).
In Eq. (3), $M$ and $N$ denote the number of neurons in the input layer and the hidden layer, respectively, and $g$ and $h$ are transfer functions. $\theta$ denotes the weights of the input-layer neurons, and $\varphi$ the weights or biases of the hidden layer. Neural networks use optimization algorithms (such as backpropagation) to reduce error. When applied to time series analysis, however, traditional neural networks fall short because they ignore the temporal dimension of time series data, which led to the development of recurrent neural networks (RNNs) [7].
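To make the roles of $M$, $N$, $g$, $h$, $\theta$, and $\varphi$ concrete, here is a sketch of a one-hidden-layer perceptron forward pass. It assumes the common form $\hat{y} = g\big(\sum_{j=1}^{N} \varphi_j \, h\big(\sum_{i=1}^{M} \theta_{ji} x_i\big)\big)$ with biases omitted, which may differ in detail from the paper's Eq. (3); all weights are made-up values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, theta, phi, h=sigmoid, g=lambda z: z):
    """One-hidden-layer perceptron (biases omitted).

    x:     M input values
    theta: N rows of M input-to-hidden weights
    phi:   N hidden-to-output weights
    """
    hidden = [h(sum(w * xi for w, xi in zip(row, x))) for row in theta]
    return g(sum(p * a for p, a in zip(phi, hidden)))


x = [0.5, -1.0, 2.0]                          # M = 3 inputs
theta = [[0.0, 0.0, 0.0], [1.0, 1.0, 0.5]]    # N = 2 hidden neurons
phi = [2.0, 4.0]
print(mlp_forward(x, theta, phi))
```

Nothing in this computation depends on the order in which observations arrive, which is the limitation for time series that motivates recurrent networks.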
In the RNN hidden-state update, Eq. (4), $W_{hx}$ denotes the weights between the input and the recurrent hidden nodes, $W_{hh}$ denotes the weights between a recurrent hidden node and itself at the previous time step, and $b$ and $\sigma$ denote the bias and the nonlinear (sigmoid) activation, respectively. Although RNNs perform better in time series prediction, they still have unresolved problems. For example, Eq. (4) shows that as the time lag grows, the recurrent hidden state $h_t$ approaches zero, which leads to the vanishing gradient problem. RNNs are therefore unable to learn from time series with long time lags. The work of Hochreiter and Schmidhuber [19] addressed this problem with the long short-term memory (LSTM) RNN, whose main goal is to model long-term temporal dependencies in time series. The LSTM model replaces the recurrent hidden unit with a "memory cell".
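The vanishing gradient effect can be observed numerically in a scalar RNN. The sketch below assumes the update $h_t = \sigma(W_{hx} x_t + W_{hh} h_{t-1} + b)$ with hypothetical weights, and tracks how strongly the hidden state at step $t$ still depends on the initial state.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rnn_gradient_norms(xs, w_hx=0.5, w_hh=0.9, b=0.0):
    """Run a scalar RNN and track |d h_t / d h_0| at every step."""
    h, grad, norms = 0.0, 1.0, []
    for x in xs:
        h = sigmoid(w_hx * x + w_hh * h + b)
        # Chain rule: d h_t / d h_{t-1} = w_hh * sigmoid'(z) = w_hh * h * (1 - h).
        grad *= w_hh * h * (1.0 - h)
        norms.append(abs(grad))
    return norms


norms = rnn_gradient_norms([1.0] * 20)
print(norms[0], norms[-1])  # the last value is many orders of magnitude smaller
```

Because the sigmoid derivative never exceeds 0.25, every step multiplies the gradient by at most 0.225 with these weights, so the influence of early inputs decays geometrically.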
Figure 2 shows the architecture of an LSTM-NN with one memory block. The memory block contains input, output, and forget gates, which perform write, read, and reset operations on the cell, respectively. The multiplicative gates, with $\oplus$ and $\otimes$ denoting the matrix addition and dot product operators respectively, allow the model to store information over long periods, eliminating the vanishing gradient problem common in traditional neural network models [19].
In the LSTM model, the input time series is $x = (x_1, x_2, x_3, \ldots, x_t)$, and the output sequence $y = (y_1, y_2, y_3, \ldots, y_t)$ is computed from the historical data provided, without the model being told how many time steps to look back. This is achieved through the LSTM gate equations.
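As an illustration of how the gates interact, the following sketch implements one step of a scalar LSTM cell in the widely used formulation (input, forget, and output gates plus a tanh candidate). The parameter layout and names are assumptions and may not match the paper's notation term by term.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps each of "i", "f", "o", "g" to a tuple (w_x, w_h, b).
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("i"))       # input gate: how much new content to write
    f = sigmoid(pre("f"))       # forget gate: how much old state to keep
    o = sigmoid(pre("o"))       # output gate: how much state to expose
    g = math.tanh(pre("g"))     # candidate cell content
    c = f * c_prev + i * g      # new cell state
    h = o * math.tanh(c)        # new hidden state
    return h, c


# With all-zero parameters every gate is 0.5 and the candidate is 0.
zeros = {name: (0.0, 0.0, 0.0) for name in "ifog"}
print(lstm_step(1.0, 0.0, 2.0, zeros))
```

The cell state is updated additively (`f * c_prev + i * g`), so when the forget gate stays close to 1 the stored value can persist over many steps instead of being squashed at every step.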

LSTM deep neural networks have been widely used in traffic forecasting. In [28], for example, an LSTM-NN model was used for traffic speed prediction and the results were compared with other algorithms (support vector machine (SVM), Kalman filter, and ARIMA). The LSTM model proved superior in prediction accuracy. In addition, [23] proposed a deep learning model combining LSTM and a deep belief network (DBN) that predicts short-term traffic speed using traffic and rainfall data from Beijing, China. The experimental results showed that fusing weather and traffic data sources improves the model's prediction performance.
2.2 Traffic prediction using twitter information
Although Twitter has become a popular social media platform, there is still untapped opportunity to use data from its large user base to improve traffic forecasting. [21] presents a study of social interactions on Twitter which reveals that the driver of Twitter's widespread adoption may be a hidden network, with most messages describing meaningless interactions. This is an important ground for skepticism, which is evident in the small number of studies on traffic prediction using Twitter messages. Because the Twitter platform is open and public, data obtained from it can be subjective, context-specific, nuanced, or intended as irony or satire.
Despite these drawbacks, many studies have tried to incorporate Twitter information into the training of traffic prediction models. For example, [1] proposed a Kalman filter model trained on traffic data combined with traffic information from Twitter to predict the arrival times of public transport vehicles. The study used real-time tweets related to road traffic information and performed semantic analysis on the retrieved dataset. The results showed a significant improvement over using traffic data alone. Similarly, [15] presents a deep crowd flow prediction model trained on a semantically mined Twitter traffic dataset. That study adopted an existing crowd flow prediction model, the spatio-temporal residual network (ST-ResNet) of [44], as the baseline for comparison. The end-to-end prediction model was configured to use tweets as an additional training input in order to predict future crowd flow in an urban environment. The results showed a positive correlation between Twitter data and traffic flow, and prediction accuracy improved over the baseline model. Likewise, [41] proposed a decision tree model for predicting the severity of road traffic congestion, tested on the real-time traffic network of Bangkok, Thailand. The C4.5-based decision tree model takes tweets from various road traffic broadcast Twitter accounts and relates them to high crowd density in specific areas. The results showed that adding Twitter traffic information improves the performance of the prediction model. Although the studies above use Twitter-derived datasets for traffic prediction, they use the Twitter data as the only non-traffic input dataset for model training.
Although those studies report improved prediction accuracy, our experience shows that including both weather and Twitter data in addition to the traffic dataset significantly improves the prediction accuracy of deep learning prediction models.
3 Methodology
This section introduces the basic concepts of neural networks and their use as autoencoders. We first describe the autoencoder, including its basic logic and function, and then outline the proposed deep bi-directional LSTM prediction model. The proposed model is an enhanced version of the model proposed in [11]: in addition to using Twitter messages as an extra input in the training dataset, we also train the model as a stacked autoencoder. The main advantage of an autoencoder is that it learns a compressed representation (encoding) of a set of input vectors. In other words, autoencoders serve as a dimensionality reduction technique for time series data and as a data compression tool in image analysis, so that very large datasets can be used for training in a short time [16]. Compared with an ordinary deep LSTM neural network, an LSTM autoencoder can therefore process high-dimensional, large-scale data in less time.
3.1 Autoencoders
An autoencoder is a feedforward neural network that takes an input vector $x$ and transforms it into a hidden representation, or "latent" space, $h$. In other words, the autoencoder compresses the input vector into a low-dimensional "code" and tries to reconstruct the output from that representation. An autoencoder has three main components: the encoder, the code, and the decoder. The input vector transformation, or encoder function, is given by Eq. (13).
A stacked autoencoder is a stack of autoencoders and, like a single autoencoder, learns in an unsupervised manner. Learning proceeds by layer-wise training that minimizes the error between the input and output vectors. The input to each subsequent autoencoder layer is the hidden layer of the previous one, and each layer is trained by gradient descent on an objective function, namely the squared reconstruction error $J$ of the individual autoencoder layer, as described in Eq. (15).
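The encoder, code, and decoder can be illustrated with a tiny linear autoencoder. This is a sketch under stated assumptions: the activations are linear, the weights are hand-picked rather than trained, and the error is the plain sum of squared differences for one sample, which may differ from the exact form of the paper's Eqs. (13) to (15).

```python
def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def encode(x, w_enc, b_enc):
    """Map the input to a lower-dimensional code (linear activation for brevity)."""
    return [z + b for z, b in zip(matvec(w_enc, x), b_enc)]


def decode(h, w_dec, b_dec):
    """Reconstruct the input from the code."""
    return [z + b for z, b in zip(matvec(w_dec, h), b_dec)]


def reconstruction_error(x, x_hat):
    """Squared reconstruction error J for a single sample."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


# Hypothetical 3-feature input compressed to a 2-dimensional code.
x = [1.0, 2.0, 3.0]
w_enc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]    # keeps the first two features
w_dec = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # cannot recover the third
code = encode(x, w_enc, [0.0, 0.0])
x_hat = decode(code, w_dec, [0.0, 0.0, 0.0])
print(code, x_hat, reconstruction_error(x, x_hat))
```

Training would adjust the weights to reduce $J$; in a stacked autoencoder, the `code` of one trained layer becomes the input `x` of the next.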
3.2 Deep bi-directional LSTMs
A deep LSTM network is a recurrent network with multiple layers (hence "deep") that replaces the traditional recurrent units with LSTM memory cells. The particular advantage of deep LSTM networks is that they can capture long-term dependencies in complex data structures hierarchically [16]. Compared with single- or two-layer (shallow) LSTM networks, deep LSTMs can hierarchically extract complex temporal dependencies from sequential datasets [18].
A bi-directional LSTM consists of two uni-directional LSTMs stacked in opposite directions. During training, the bi-directional LSTM therefore has access to both the past and the future vectors of the time series. Two separate hidden layers process the data in the two directions and feed a single output layer. Figure 3 shows the bi-directional LSTM structure. The network computes a forward hidden sequence $\overrightarrow{h}$ and a backward hidden sequence $\overleftarrow{h}$. The output is then computed by iterating the backward layer in reverse chronological order (from $t = T$ to $1$) and the forward layer from $t = 1$ to $T$. A deep bi-directional LSTM combines depth with bi-directionality, and depth is a key ingredient of the success documented for deep learning architectures. As noted earlier, deep networks can represent complex datasets hierarchically. A deep bi-directional LSTM is created by stacking multiple bi-directional LSTM layers vertically, so that the output sequence of one layer serves as the input sequence of the next.
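The forward and backward passes can be sketched without any deep learning library. In the code below a toy running-sum step function stands in for an LSTM cell, and pairing the two states per time step is an assumed way of combining them; both are illustrative choices rather than the paper's implementation.

```python
def run_direction(sequence, step, h0=0.0):
    """Apply a recurrent step function along the sequence and collect hidden states."""
    states, h = [], h0
    for x in sequence:
        h = step(x, h)
        states.append(h)
    return states


def bidirectional(sequence, step):
    """Pair each time step's forward state with its backward state."""
    forward = run_direction(sequence, step)
    # The backward pass reads t = T..1; reverse again to realign with t = 1..T.
    backward = run_direction(sequence[::-1], step)[::-1]
    return list(zip(forward, backward))


# Toy step function (a running sum) standing in for an LSTM cell.
states = bidirectional([1, 2, 3, 4], step=lambda x, h: h + x)
print(states)
```

At each time step the forward state summarizes everything up to that point and the backward state summarizes everything from that point onward, which is what gives the output layer access to both past and future context.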
3.3 Data fusion
Data fusion is a multi-level process that deals with the automatic detection, integration, prediction, and combination of data from multiple sources. According to [4], data fusion techniques fall into five categories: (i) data in-data out (DAI-DAO), (ii) data in-feature out (DAI-FEO), (iii) feature in-feature out (FEI-FEO), (iv) feature in-decision out (FEI-DEO), and (v) decision in-decision out (DEI-DEO). In this study we used DAI-DAO data fusion, similar to the fusion technique proposed in [11], which combined traffic and weather information. Fusing data at this level yields more reliable output because it avoids the errors introduced by feature-level or prediction/decision-level fusion.
3.4 Deep bi-directional LSTM model
Figure 4 gives an overview of the model architecture proposed in this paper. The model is designed to predict, at time $t$, the traffic flow for the next 12 time steps (twelve 5-minute forecasts, i.e., the predicted traffic flow for the next hour). Our traffic flow forecasting method follows an unsupervised learning approach using bi-directional LSTMs. The model consists of four main elements: the encoder, repeat vector, decoder, and fully connected (FC) layers.
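Multi-step forecasting of this kind needs training pairs made of an input window and the 12 values that follow it. The sketch below shows one way to build them; the input length of 24 steps (two hours of history) is an assumption, since the text does not state how much history the model reads.

```python
def make_windows(series, n_in, n_out=12):
    """Slice a series into (input window, next n_out values) training pairs."""
    pairs = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        pairs.append((x, y))
    return pairs


# Hypothetical flow series of 100 five-minute observations.
series = list(range(100))
pairs = make_windows(series, n_in=24, n_out=12)
print(len(pairs), pairs[0][1])
```

Each target holds twelve consecutive 5-minute values, so one pair corresponds to forecasting the hour that follows the input window.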
The first component is the input layer, which accepts the input vectors of traffic, weather, and tweet data as an $m \times n$ matrix, where $m$ is the number of samples in the training dataset and $n$ is the number of features (5 in this case). The second component comprises the bi-directional LSTM layers, which together form the encoder. The bi-directional LSTM stack reads the input sequence of vectors. After the last element of the sequence has been read, a repeat vector layer is applied which, as its name suggests, repeats the vector produced by the encoder. The decoder layer then takes the sequence from the repeat vector layer and outputs the prediction as a single row vector sequence. This is passed to the FC layer, which predicts the target sequence. The concept of the LSTM autoencoder is beyond the scope of this study; see [38] for more details.
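A shape walk-through can help in following the data through the four components. This is only a bookkeeping sketch, not a trainable model: the history length (24), the number of LSTM units (64), the use of a bi-directional decoder, and the concatenation of forward and backward states are all assumptions chosen for illustration.

```python
def trace_shapes(n_in, n_features, units, n_out):
    """Return (layer, output shape) pairs for one sample, ignoring the batch axis."""
    return [
        ("input", (n_in, n_features)),
        # A bi-directional layer concatenates forward and backward states.
        ("encoder (bi-LSTM, last state)", (2 * units,)),
        # The code is copied once per forecast step.
        ("repeat vector", (n_out, 2 * units)),
        ("decoder (bi-LSTM, all states)", (n_out, 2 * units)),
        # One flow value per forecast step.
        ("fully connected", (n_out, 1)),
    ]


shapes = trace_shapes(n_in=24, n_features=5, units=64, n_out=12)
for layer, shape in shapes:
    print(f"{layer:32s} {shape}")
```

The repeat vector step is what turns the single fixed-length code into a sequence with one entry per forecast step, so that the decoder can emit 12 outputs.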
4 Data description
The datasets for the experiments reported in this paper were provided by Transport for Greater Manchester (TfGM). The weather data obtained for the study comprise hourly temperature (in degrees Celsius) and precipitation (in millimeters). The Twitter data come from two road traffic information Twitter accounts: the official Twitter handles of TfGM (@OfficialTfGM) and Waze Manchester (@WazeTrafficMAN). Details of dataset preparation are given in the following subsections.
4.1 Traffic dataset
The dataset consists of historical observations of traffic flow characteristics (speed, flow, and density) at 5-minute intervals, collected with inductive loop detectors. There are ten traffic sensors in the study area, spaced 0.3 miles apart. The study period runs from April 1, 2016 to April 16, 2017. The study area is Chester Road (A56), a main road in Stretford, Greater Manchester, England, between the latitude/longitude coordinates (53.46281, −2.28398) and (53.43822, −2.31394), as marked on the map in Figure 5. This area was chosen because it is an ideal test bed: it is one of the two main roads from the residential areas of South Manchester to Manchester city center. The road is also a main route to Manchester United's stadium, Old Trafford, and to other leisure venues such as shopping malls, clubs, and restaurants. Traffic on it is therefore consistently heavy during rush hours (when commuters travel to and from work), on football match days, and at weekends (because of shopping centers and other attractions).
4.2 Weather data
The weather data for this study came from the Centre for Atmospheric Research (CAS) at the University of Manchester. The dataset contains hourly observations of precipitation (in millimetres) and temperature (in degrees Celsius) for the same study period described in the previous section. However, because the traffic data contain 5-minute traffic parameters, merging the weather data required assuming that each hourly value holds for every 5-minute interval within that hour. This may be a limitation, since some information tends to be lost, but it is a better choice than aggregating the traffic and Twitter data into hourly observations. Moreover, assuming that the temperature stays constant for one hour does not represent a significant loss of information, because one can argue that temperature does not change markedly within an hour. Even popular weather apps, such as The Weather Channel, WeatherPro HD, and Yahoo! Weather, report weather data as hourly observations.
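The hourly-to-5-minute alignment described above can be sketched in a few lines of plain Python. This is an illustrative reconstruction, not the authors' script: the function name `expand_hourly_to_5min` and the sample readings are hypothetical.

```python
from datetime import datetime, timedelta


def expand_hourly_to_5min(hourly):
    """Repeat each hourly observation for the twelve 5-minute slots of its hour."""
    expanded = {}
    for hour_start, values in hourly.items():
        for k in range(12):
            expanded[hour_start + timedelta(minutes=5 * k)] = values
    return expanded


# Hypothetical readings: (temperature in degrees Celsius, precipitation in mm)
hourly = {
    datetime(2016, 4, 1, 8): (9.5, 0.0),
    datetime(2016, 4, 1, 9): (10.1, 0.4),
}
five_min = expand_hourly_to_5min(hourly)
print(len(five_min))                        # number of 5-minute slots produced
print(five_min[datetime(2016, 4, 1, 8, 55)])  # still the 08:00 reading
```

Each 5-minute traffic record can then be joined to the weather values by timestamp.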
4.3 Tweets data
In this study, Python scripts were written to collect tweets using the Twitter Streaming API and the tweepy package. To ensure that the data obtained from Twitter were authentic and location-specific, we chose to use only tweets from road traffic information accounts (TfGM and Waze). The first script extracted all tweets from these two accounts; to do this, the Twitter user IDs of @OfficialTfGM and @WazeTrafficMAN were passed as the follow parameter of the API. This step returned 102,675 tweets. Second, we filtered the result set to include only tweets about the road section under consideration (Chester Road, A56), using the keywords "A56" and "Chester Road" as the track parameter of the endpoint in the Python script. This produced a dataset of 9,275 tweets from the two accounts. The next step extracted the timestamp of each tweet and converted it to a date-time format (dd-mm-yyyy-hh:mm). The final step merged the tweet timestamps with the combined traffic and weather dataset: each record is encoded as 1 at timestamps where a tweet about traffic conditions was posted, and as 0 where no tweet was recorded. The combined dataset contains 109,728 observations, at 5-minute resolution, of traffic flow, speed, rainfall, temperature, and the tweet variable. The dataset was then split using a 70:30 train/test ratio. Table 1 summarises the descriptive statistics of the dataset used in this study.
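The binary tweet indicator and the 70:30 split can be sketched as follows. This is a hedged illustration: the helper names, the sample rows, and the choice to floor each tweet timestamp to its 5-minute slot are assumptions, and the split is taken in time order because the test period reported later covers the final part of the study period.

```python
from datetime import datetime, timedelta


def floor_to_5min(ts):
    """Round a timestamp down to the start of its 5-minute slot (assumed alignment)."""
    return ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)


def add_tweet_flag(rows, tweet_times):
    """Add tweet=1 to rows whose 5-minute slot contains a tweet, otherwise tweet=0."""
    flagged = {floor_to_5min(t) for t in tweet_times}
    return [dict(row, tweet=1 if row["time"] in flagged else 0) for row in rows]


def chronological_split(rows, train_fraction=0.7):
    """Split time-ordered rows into train and test parts without shuffling."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical 5-minute traffic records and one tweet in dd-mm-yyyy-hh:mm format
start = datetime(2016, 4, 1, 8, 0)
rows = [{"time": start + timedelta(minutes=5 * i), "flow": 100 + i} for i in range(10)]
tweets = [datetime.strptime("01-04-2016-08:07", "%d-%m-%Y-%H:%M")]

merged = add_tweet_flag(rows, tweets)
train, test = chronological_split(merged)
print([r["tweet"] for r in merged])
print(len(train), len(test))
```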

5 Experimental setup
This study uses an overlapping sliding-window approach to restructure the input multivariate time series into a supervised learning format, similar to the procedure described in [11]. A prediction model is then developed to learn the characteristics of the historical time series and to make multi-step traffic flow predictions one hour ahead (12 steps at a 5-minute forecast interval). The model is trained on the aggregate dataset summarised in Table 1.
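As a rough sketch of the overlapping sliding-window idea, the function below turns a multivariate series into input windows and 12-step targets. The function name `make_windows`, the 12-step input length, and the position of the flow column are assumptions for illustration; only the 12-step forecast horizon comes from the text.

```python
def make_windows(series, n_in=12, n_out=12, target_index=0):
    """Build overlapping (input window, future target) pairs from a time-ordered series.

    series: list of feature rows, one per 5-minute step.
    n_in: number of past steps fed to the model (assumed value).
    n_out: number of future steps to predict (12 steps = 1 hour).
    target_index: column holding the traffic flow (assumed position).
    """
    X, y = [], []
    for start in range(len(series) - n_in - n_out + 1):
        X.append(series[start:start + n_in])
        future = series[start + n_in:start + n_in + n_out]
        y.append([row[target_index] for row in future])
    return X, y


# Hypothetical series: 30 steps, 5 features (flow, speed, rain, temperature, tweet)
series = [[float(t), 40.0, 0.0, 10.0, 0.0] for t in range(30)]
X, y = make_windows(series)
print(len(X), len(X[0]), len(y[0]))
```

Consecutive windows start one step apart, so they overlap in all but one time step.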
5.1 Model description

The framework uses an eight-layer bidirectional LSTM stacked autoencoder architecture. For all interconnected layers (except the output layer), the activation function is the rectified linear unit (ReLU), which introduces non-linearity into the learning process. The performance of a deep learning network depends on key parameters that must be determined in advance through a hyperparameter optimisation (tuning) process. To determine the optimal set of hyperparameters in this study, we applied a grid search framework, which provides a repeatable and flexible way of obtaining the best parameter set. Algorithm 1 gives the overall prediction algorithm. We propose an improved deep bidirectional LSTM traffic prediction model, derived from the model presented in [11], to predict traffic flow 12 time steps ahead (12 steps at a 5-minute forecast interval, i.e. the traffic volume over the next hour). The model in [11] is an urban traffic speed prediction model trained with traffic and weather data. We therefore adopted the following deep learning approach in this study (see Table 2).
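A grid search simply evaluates every combination of candidate hyperparameter values and keeps the best one. The sketch below shows the idea in plain Python; the parameter names, candidate values, and the stand-in scoring function are hypothetical, since the grid actually searched is not given here.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every parameter combination and return the one with the lowest score."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)  # e.g. validation MAE of a model trained with params
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid and a toy scoring function standing in for model training
param_grid = {"units": [32, 64, 128], "learning_rate": [0.01, 0.001]}


def toy_score(params):
    return abs(params["units"] - 64) + params["learning_rate"]


best_params, best_score = grid_search(param_grid, toy_score)
print(best_params, best_score)
```

In practice `evaluate` would train the network with the given settings and return a validation error, which makes the search repeatable but expensive.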
5.2 Model performance evaluation
In this study, we used a forecast evaluation technique known as walk-forward validation, or backtesting. Traditional evaluation methods such as k-fold cross-validation are not suitable for time series data, because they ignore the temporal order of the input data. We use three statistical measures of forecast accuracy: mean absolute error (MAE), root mean square error (RMSE), and symmetric mean absolute percentage error (sMAPE). These metrics are defined by the following equations.
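For readers who want to reproduce the metrics, here is a small sketch of MAE, RMSE, and sMAPE, together with a generator of walk-forward splits. Two caveats: several sMAPE variants exist, and the one below (mean of absolute actual and predicted values in the denominator, reported in percent) is an assumption rather than a confirmed match for the paper's equation; the expanding-window split scheme and all function names are likewise illustrative.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def smape(actual, predicted):
    """Symmetric MAPE in percent (assumed variant: denominator (|a| + |p|) / 2)."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2
        total += abs(a - p) / denom if denom else 0.0
    return 100 * total / len(actual)


def walk_forward_splits(n_samples, initial_train, step):
    """Yield (train indices, test indices); the training window grows each round."""
    start = initial_train
    while start < n_samples:
        end = min(start + step, n_samples)
        yield range(0, start), range(start, end)
        start = end


actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
print(mae(actual, predicted), rmse(actual, predicted), smape(actual, predicted))

splits = list(walk_forward_splits(n_samples=10, initial_train=4, step=3))
print([(list(tr), list(te)) for tr, te in splits])
```

Unlike k-fold cross-validation, every test block here lies strictly after the data used for training.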

5.3 Baseline models
We compare the performance of the proposed model with selected state-of-the-art baseline machine learning models. Using the performance metrics described in the previous section, we compared the proposed model with the following baselines: (1) support vector regression [32], (2) extreme gradient boosting (XGBoost) [6], and (3) random forest regression [27]. The same training dataset was used for every baseline model to keep the evaluation fair and objective.
5.4 Implementation environment
The experiments in this study were run on a single GPU node with an Intel Xeon E-2146G CPU, 32 GB of memory, and an NVIDIA Tesla V100-PCIE 16 GB GPU. The GPU was used to accelerate model training, because deep learning models are computationally intensive. Development was carried out in Python 3.6.8, R 3.5.1, and TensorFlow 1.12.0.
6 Results

The test dataset used in this study covers 23 December 2016 to 16 April 2017. Table 3 shows the performance evaluation results of the proposed model and the baseline models. As the table shows, prediction accuracy improves significantly when the tweet, weather, and traffic datasets are used together (shaded rows). We can also see that the proposed model outperforms the traditional machine learning baselines, at the cost of a longer training time. Figure 6 shows a bar chart of the MAE of each model trained on the three combinations of input data. The models trained with tweet, weather, and traffic data record the lowest MAE. Table 3 also shows that the SVR model is competitive with the proposed model. The boosting algorithm returns the worst results but has the shortest training time. In summary, the results show a clear improvement (a reduction in error) over models trained using only the traffic and weather datasets. Although the models using only traffic data have the shortest training times, they lead to higher prediction errors (MAE, RMSE, and sMAPE). This leads us to conclude that including Twitter analytics, in combination with the traffic and weather data sources, can significantly improve prediction accuracy.
Although the observed gains in accuracy may appear marginal, it is worth noting that the cost-effectiveness achieved by reducing the MAE (from 8 veh/h using the traffic and weather datasets to 5.5 veh/h with the Twitter dataset added) is significant. For example, if one could quantify the time lost by road users, along with the cost of the environmental impact of congestion through emissions and noise pollution, the case for improving traffic prediction accuracy would be clear. In addition, improved accuracy can translate directly into financial savings for supply chain and logistics companies (Amazon, DHL, etc.). A seemingly insignificant or marginal reduction in forecast error could therefore easily lead to savings of millions of pounds, especially since the model discussed here concerns urban traffic. More accurate traffic forecasts can also reduce driver frustration, because this is a very busy road, particularly during rush hours. Suppose, for example, that there is an accident on this road. Traffic managers must rely on accurate traffic forecasts to decide whether to do nothing, divert traffic to nearby roads, make use of a bus lane if there is one, or change the timing of traffic lights (if there are any). If the forecast is inaccurate and leads to a wrong decision, the result is more driver frustration, more late deliveries, and more noise and environmental pollution.
Figure 7 shows scatter plots of predicted versus ground-truth (actual) observations for the three prediction models listed in stages 3, 4, and 5 of Table 2. In each scatter plot, the x-axis represents the actual (ground-truth) traffic flow and the y-axis represents the predicted value. The title of each subplot gives the R² value of the corresponding prediction model, which can be inferred from Table 3.

The lower-right panel of Figure 7 is a multi-line plot of the predicted and actual values of each model during a specific lane closure in the study area. The lane was closed for four days because of roadworks. In this panel, the y-axis represents traffic flow (veh/h) and the x-axis represents 5-minute time steps (48 time steps) within the four days of the lane closure. As the plot shows, the prediction models trained with only the traffic dataset, or with the traffic and weather datasets, are less accurate than the model trained with the Twitter dataset included. The model trained with the Twitter dataset performs best because several tweets about the closure were broadcast from the Twitter accounts, which improved its prediction accuracy.
Similarly, Figure 8 shows the prediction performance over the first 400 time steps of each model trained with the Twitter + traffic + weather dataset. In each subplot, the x-axis represents the time step, while the y-axis represents the flow value (veh/h) in the upper half and the absolute error between predicted and actual values in the lower half. Figure 8 shows that the proposed model captures the temporal pattern accurately and clearly outperforms the state-of-the-art machine learning baselines. The proposed model performs best, with the SVR model second. Comparing these two models in terms of computation time, the proposed model remains competitive (583 seconds vs. 496 seconds, a 17% increase in time) while reducing the MAE by about 35.4% (from 8.5 veh/h to 5.5 veh/h).

To test whether the differences between the result sets are statistically significant, we carried out an analysis of variance (ANOVA), which estimates the within-group variance (the variance or error in the raw data) and the between-group variance (the effect of the experiment). Here we used ANOVA to compare the predictions of models trained on the Twitter + weather + traffic, weather + traffic, and traffic-only datasets. The following hypotheses were tested using ANOVA.
The hypotheses were tested at a significance level of 0.05 (p < 0.05). Table 4 shows the results of the one-way ANOVA across the prediction groups of each trained model (that is, the models trained with the Twitter + weather + traffic, traffic + weather, and traffic-only datasets). If F is greater than Fcrit, the null hypothesis is rejected, meaning that the corresponding results differ significantly. As the table shows, the differences are significant (p < 0.05): Fcrit = 2.6072 and F = 5.3927, so F > Fcrit holds. In addition, the between-groups p-value indicates statistical significance and supports rejecting the null hypothesis, since p = 0.001058 < 0.05.
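The F statistic behind this decision rule is the ratio of between-group to within-group mean squares. The sketch below computes it in plain Python on made-up numbers; the function name and sample groups are hypothetical, and the critical value is a standard F-table entry for this toy example's degrees of freedom, not a value from the paper.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group variation: how far each group mean lies from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group variation: spread of observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up prediction groups, for illustration only
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)

f_crit = 5.14  # tabulated F critical value for (2, 6) degrees of freedom at alpha = 0.05
reject_null = f_stat > f_crit
print(f_stat, df_b, df_w, reject_null)
```

Computing the p-value as well requires the F distribution, which a statistics library can supply.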

7 Conclusions and future work
In this paper, we proposed an urban traffic flow prediction model that explores the effectiveness of integrating the rich information obtained from Twitter data into urban traffic forecasting, thereby extending an existing urban traffic prediction model based on weather and traffic datasets. The baseline model proposed in [11] uses a deep bidirectional LSTM architecture that takes traffic flow parameters together with rainfall and temperature. The enhanced model proposed in this paper uses a bidirectional LSTM autoencoder approach that accepts geo-specific tweets from the traffic information accounts @OfficialTfGM and @WazeTrafficMAN as additional non-traffic input data. The empirical results show that adding Twitter data to the traffic, rainfall, and temperature datasets reduces the MAE from 8 veh/h to 5.5 veh/h, yielding a more accurate traffic flow prediction model. The model was tested on an arterial urban road in Greater Manchester, England (Chester Road, A56) using historical traffic, weather, and Twitter datasets.
The work described in this study is limited to one arterial road in Greater Manchester, England. Although this is a limitation, the model as it stands illustrates that the proposed architecture could be reused for other urban roads in large cities with some additional data features. For example, the datasets, tweet locations, and tweet sources (TfGM in Manchester, TfL in London, etc.) can be changed to suit the location under consideration while keeping the model architecture and composition. Alternatively, correlations between roads could be considered (for example, how does an accident on road A affect road B?) using some additional data features. However, extending the model to a very wide geographical area poses considerable complexity challenges, with a corresponding increase in computing resources and training time. Larger geographical areas involve more data and longer training times, and therefore a greater likelihood of lower accuracy, particularly because traffic sensing equipment on residential streets and roads is less developed. Nevertheless, the study area is one of the two main arteries into Manchester city centre, which attracts a great deal of traffic because of the businesses located there and the large Arndale shopping centre. The proposed end-to-end model could therefore be replicated on other arterial roads and motorways, which would help manage traffic effectively in Greater Manchester (population about 2.8 million) and other major geographical areas. Future work will consider additional filtering of tweets and the inclusion of additional keywords. Trade-offs must also be considered: for example, semantic analysis of tweets may improve model performance, but it would also increase training and preprocessing time, as shown in [35].
Second, in this paper we only included Twitter data from the transport information accounts (TfGM and WazeTrafficMAN). Future work could include tweets from other road traffic information accounts, for example from Manchester United Football Club (on match days) or from event companies.