ETNA – time series forecasting framework

Last update: Jan 08, 2023

Overview

ETNA Time Series Library

Predict your time series the easiest way

Homepage | Documentation | Tutorials | Contribution Guide | Release Notes

ETNA is an easy-to-use time series forecasting framework. It includes built-in toolkits for time series preprocessing and feature generation, a variety of predictive models with a unified interface (from classic machine learning to SOTA neural networks), model combination methods, and smart backtesting. ETNA is designed to make working with time series simple, productive, and fun.

ETNA is the first open-source Python framework from the Tinkoff.ru Artificial Intelligence Center. The library started as an internal product in our company; we now use it in more than 10 projects, so we release updates often. Contributions are welcome: check our Contribution Guide.

Installation

ETNA is on PyPI, so you can use pip to install it.

pip install --upgrade pip
pip install etna

Get started

Here's some example code for a quick start.

import pandas as pd
from etna.datasets.tsdataset import TSDataset
from etna.models import ProphetModel
from etna.pipeline import Pipeline

# Read the data
df = pd.read_csv("examples/data/example_dataset.csv")

# Create a TSDataset
df = TSDataset.to_dataset(df)
ts = TSDataset(df, freq="D")

# Choose a horizon
HORIZON = 8

# Fit the pipeline
pipeline = Pipeline(model=ProphetModel(), horizon=HORIZON)
pipeline.fit(ts)

# Make the forecast
forecast_ts = pipeline.forecast()

Tutorials

We have also prepared a set of tutorials for an easy introduction:

Notebook	Interactive launch
Get started
Backtest
EDA
Outliers
Clustering
Deep learning models
Ensembles

Documentation

ETNA documentation is available here.

Acknowledgments

ETNA.Team

Andrey Alekseev, Nikita Barinov, Dmitriy Bunin, Aleksandr Chikov, Vladislav Denisov, Martin Gabdushev, Sergey Kolesnikov, Artem Makhin, Ivan Mitskovets, Albina Munirova, Nikolay Romantsov, Julia Shenshina

ETNA.Contributors

Artem Levashov, Aleksey Podkidyshev

License

Feel free to use our library in your commercial and private applications.

ETNA is covered by the Apache 2.0 license. Read more about this license here.

Comments

Notebook with forecasting strategies
Before submitting (must do checklist)

[x] Did you read the contribution guide?

[x] Did you update the docs? We use Numpy format for all the methods and classes.

[x] Did you write any new necessary tests?

[ ] Did you update the CHANGELOG?

Proposed Changes

Closing issues

#825
opened by scanhex12 46
Update Notebooks with new EDA methods
Before submitting (must do checklist)

[x] Did you read the contribution guide?

[ ] Did you update the docs? We use Numpy format for all the methods and classes.

[ ] Did you write any new necessary tests?

[ ] Did you update the CHANGELOG?

Proposed Changes

Closing issues

closes #711
opened by DBcreator 12
Fix notebooks in inference track
Before submitting (must do checklist)

[x] Did you read the contribution guide?

[x] Did you update the docs? We use Numpy format for all the methods and classes.

[x] Did you write any new necessary tests?

[x] Did you update the CHANGELOG?

Proposed Changes

See #973.

Closing issues

Closes #973.
opened by Mr-Geekman 12
Improve sample_acf and sample_pacf plots
Before submitting (must do checklist)

[x] Did you read the contribution guide?

[x] Did you update the docs? We use Numpy format for all the methods and classes.

[x] Did you write any new necessary tests?

[x] Did you update the CHANGELOG?

Proposed Changes

Closing issues

closes #682
opened by DBcreator 6
Classification notebook
Before submitting (must do checklist)

[ ] Did you read the contribution guide?

[ ] Did you update the docs? We use Numpy format for all the methods and classes.

[ ] Did you write any new necessary tests?

[ ] Did you update the CHANGELOG?

Proposed Changes

Closing issues
opened by alex-hse-repository 6
Poc: base classes for deep models and rnn and deepstate with examples
Before submitting (must do checklist)

[ ] Did you read the contribution guide?

[x] Did you update the docs? We use Numpy format for all the methods and classes.

[x] Did you write any new necessary tests?

[x] Did you update the CHANGELOG?

Proposed Changes

Closing issues
opened by martins0n 6
Enhance `TSDataset` to work with hierarchical series
Before submitting (must do checklist)

[ ] Did you read the contribution guide?

[ ] Did you update the docs? We use Numpy format for all the methods and classes.

[ ] Did you write any new necessary tests?

[ ] Did you update the CHANGELOG?

Proposed Changes

Closing issues

closes #1028
opened by alex-hse-repository 5
Speed up columns slices: `etna.datasets.utils.select_columns`
🚀 Feature Request

In many places we use df.loc[:, pd.IndexSlice[segments, column]] to select a column from all the segments. This turns out to be very slow when there are many segments.

We should find places where we use it and make sure that it can be replaced with df.loc[:, pd.IndexSlice[:, column]] without problems.

There was a problem with the second option: #188. We should investigate whether it still exists and under which conditions:

Does it apply only when selecting a single column? (SklearnTransform selects many.)

Can it be avoided by some trick when taking slices (sorting columns, for example)?

Proposal

Find all places with the slow slice df.loc[:, pd.IndexSlice[segments, column]] where column is a scalar. Replace them with a function (you can add it to etna.datasets.utils). Try to replace the slow slice inside the function with the fast slice: df.loc[:, pd.IndexSlice[:, column]]. Make sure that in this case columns are not reordered in different pandas versions.

Do the same with a list of values in column (e.g. SklearnTransform) and investigate the reordering issue during testing. We want to avoid it without putting all the segments into the slice.

Benchmark the changed transforms (or other calls) to confirm that they became faster. Add the benchmarking code and its results to the comments of the PR. E.g. you can take a dataframe with 50000 segments, 100 timestamps, 5 additional int columns, 5 additional float columns, 5 additional category columns.
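The two slices discussed in the proposal can be compared on a tiny wide dataframe. This is only a sketch: the helper signature select_columns(df, column) is an assumption (the issue only names the function), and the dataframe layout with (segment, feature) column levels is built by hand for illustration.

```python
import numpy as np
import pandas as pd


def select_columns(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Hypothetical helper: take one feature from every segment of a wide dataframe."""
    # ":" on the segment level avoids passing the full list of segments
    return df.loc[:, pd.IndexSlice[:, column]]


# Small wide dataframe: 3 segments x 2 features, sorted column MultiIndex
segments = [f"segment_{i}" for i in range(3)]
features = ["exog", "target"]
columns = pd.MultiIndex.from_product([segments, features], names=["segment", "feature"])
index = pd.date_range("2020-01-01", periods=4, freq="D")
df = pd.DataFrame(np.arange(24, dtype=float).reshape(4, 6), index=index, columns=columns)

# Slice with an explicit list of segments (the slow variant from the issue)
slow = df.loc[:, pd.IndexSlice[segments, "target"]]
# Slice over all segments via the helper (the fast variant)
fast = select_columns(df, "target")
```

With sorted columns both slices select the same data; whether that also holds for unsorted columns and across pandas versions is exactly what the issue asks to investigate.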

Test cases

Make sure that current tests pass for scalar case.

Make sure that current tests pass for list case.

Add tests on function for selection of one column.

Add tests on function for selection of multiple columns (in SklearnTransform we had some tests on reordering, it can be useful).

Additional context

No response
enhancement important
opened by Mr-Geekman 5
Create assemble_pipelines 717
Before submitting (must do checklist)

[ ] Did you read the contribution guide?

[ ] Did you update the docs? We use Numpy format for all the methods and classes.

[ ] Did you write any new necessary tests?

[ ] Did you update the CHANGELOG?

Proposed Changes

Closing issues

closes #717
opened by scanhex12 5
Fix bugs and documentation for `plot_backtest` and `plot_backtest_interactive`
IMPORTANT: Please do not create a Pull Request without creating an issue first.

Before submitting (must do checklist)

[x] Did you read the contribution guide?

[x] Did you update the docs? We use Numpy format for all the methods and classes.

[x] Did you write any new necessary tests?

[x] Did you update the CHANGELOG?

Type of Change

[ ] Examples / docs / tutorials / contributors update

[x] Bug fix (non-breaking change which fixes an issue)

[ ] Improvement (non-breaking change which improves an existing feature)

[ ] New feature (non-breaking change which adds functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

Proposed Changes

See #664.

Related Issue

#664.

Closing issues

Closes #664.
bug documentation
opened by Mr-Geekman 5
add flake8-bugbear
IMPORTANT: Please do not create a Pull Request without creating an issue first.

Before submitting (must do checklist)

[x] Did you read the contribution guide?

[ ] Did you update the docs? We use Numpy format for all the methods and classes.

[ ] Did you write any new necessary tests?

[ ] Did you update the CHANGELOG?

Type of Change

[x] Examples / docs / tutorials / contributors update

[ ] Bug fix (non-breaking change which fixes an issue)

[x] Improvement (non-breaking change which improves an existing feature)

[ ] New feature (non-breaking change which adds functionality)

[ ] Breaking change (fix or feature that would cause existing functionality to change)

Proposed Changes

Related Issue

Closing issues
opened by iKintosh 5
Create example notebook about hierarchical pipeline
🚀 Feature Request

Create an example notebook explaining how to work with hierarchical time series in etna

Proposal

Notebook should contain explanation of the following:

What hierarchical time series are

How to store hierarchical time series in etna (hierarchical long format) and how to convert them to the etna wide format with to_hierarchical_dataset

How HierarchicalStructure works and how it can be created

How to create a TSDataset with a hierarchical structure and how exog data works in the case of a hierarchical dataset

Which methods exist to forecast hierarchical time series, which of them we have in the library and how to use them

Compare the HierarchicalPipeline and Pipeline for top-down and bottom-up cases

Use Australian domestic tourism dataset for demonstration, download it by the link in the notebook, see example for m4 here in "Load dataset" section

Test cases

No response

Additional context

No response
enhancement notebook
opened by alex-hse-repository 0
Create `generate_hierarchical_df` method
🚀 Feature Request

Create method to generate random hierarchical dataset

Proposal

In etna/datasets/datasets_generation.py create method:

def generate_hierarchical_df(
    periods: int,
    n_segments: List[int],
    freq: str = "D",
    start_time: str = "2000-01-01",
    ar_coef: Optional[list] = None,
    sigma: float = 1,
    random_seed: int = 1,
) -> pd.DataFrame

Parameters:

n_segments -- number of segments on each level

Other parameters are the same as in generate_ar_df

Description:

Validate n_segments: number of segments on each level should be lower than on the next level

Generate segments on the last level using generate_ar_df

Generate random tree with configuration of nodes on levels from n_segments

In the dataframe replace column segment with columns describing the structure of the tree(one column for each level)

Node names in the levels should be generated as follows "level_<level_id>_<segment_id>" -- you can come up with better ideas for naming

On the bottom level leave the default segment names from generate_ar_df

Add example of creating dataset with hierarchical structure using generate_hierarchical_df to the docs of TSDataset here
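The tree-generation steps above can be sketched in plain Python. Everything here is illustrative rather than the library's implementation: the function name random_level_assignment is hypothetical, the bottom-level names "segment_<id>" are assumed to match the defaults of generate_ar_df, and the rule "give every parent one child first, then assign the rest at random" is one possible way to build the random tree.

```python
import random
from typing import Dict, List


def random_level_assignment(n_segments: List[int], random_seed: int = 1) -> List[Dict[str, str]]:
    """Return one row per bottom-level segment, mapping level column -> node name."""
    # Validation from the proposal: each level has fewer segments than the next one
    if any(upper >= lower for upper, lower in zip(n_segments, n_segments[1:])):
        raise ValueError("Each level should have fewer segments than the next level")

    rng = random.Random(random_seed)
    n_levels = len(n_segments)

    # Upper levels are named "level_<level_id>_<segment_id>"; bottom names are assumed
    names = [[f"level_{lvl}_{i}" for i in range(n)] for lvl, n in enumerate(n_segments[:-1])]
    names.append([f"segment_{i}" for i in range(n_segments[-1])])

    # parents[lvl - 1][child] is the index of the child's parent on level lvl - 1
    parents = []
    for lvl in range(1, n_levels):
        n_parents, n_children = n_segments[lvl - 1], n_segments[lvl]
        # Every parent gets at least one child, the remaining children are random
        assignment = list(range(n_parents))
        assignment += [rng.randrange(n_parents) for _ in range(n_children - n_parents)]
        parents.append(assignment)

    rows = []
    for leaf in range(n_segments[-1]):
        row, idx = {}, leaf
        # Walk from the bottom level up to the root, recording the node on each level
        for lvl in range(n_levels - 1, -1, -1):
            row[f"level_{lvl}"] = names[lvl][idx]
            if lvl > 0:
                idx = parents[lvl - 1][idx]
        rows.append(row)
    return rows


rows = random_level_assignment([1, 2])
```

For the corner case n_segments=[1, 2] the result does not depend on the seed, which is the kind of case the test list below suggests checking.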

Test cases

The method generates a dataframe with correct properties

number of segments

number of periods

columns (timestamp, level columns, target)

level columns contain correct values (for some corner cases where randomness has no influence, like n_segments=[1, 2])

Check that we can convert this dataframe to the wide format using to_hierarchical_dataset

Additional context

No response
enhancement
opened by alex-hse-repository 0
[BUG] AttributeError: 'NaiveModel'
🐛 Bug Report

While going through the getting-started tutorial I get the error AttributeError: 'NaiveModel' object has no attribute 'context_size'

Expected behavior

future_ts = train_ts.make_future(future_steps=HORIZON, tail_steps=model.context_size)

How can I fix this error?

How To Reproduce

HORIZON = 8
from etna.models import NaiveModel

# Fit the model
model = NaiveModel(lag=12)
model.fit(train_ts)

# Make the forecast
future_ts = train_ts.make_future(future_steps=HORIZON, tail_steps=model.context_size)
forecast_ts = model.forecast(future_ts, prediction_size=HORIZON)

Environment

python 3.9 etna: 1.13.0

Additional context

No response

Checklist

[x] Bug appears at the latest library version

bug
opened by vukeep 0
Create `HierarchicalPipeline`
🚀 Feature Request

Create pipeline to process hierarchical time series

Proposal

Create class:

class HierarchicalPipeline(Pipeline):
    def __init__(
        self,
        reconciler: BaseReconciler,
        model: ModelType,
        transforms: Sequence[Transform] = (),
        horizon: int = 1,
    ):

Implement method fit:

Fit the reconciler using reconciler.fit method

Aggregate dataset on the source_level using reconciler.aggregate

Call the fit method of super class with generated dataset

Implement method raw_forecast()

Call the forecast method of the super class

Implement method forecast()

Call the raw_forecast

Generate the target dataset using reconciler.reconcile
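The fit / raw_forecast / forecast flow described above can be illustrated with toy stand-ins in plain Python. None of these classes are ETNA classes: ToyBottomUpReconciler, ToyPipeline and ToyHierarchicalPipeline are hypothetical, series are plain dicts, the "model" simply repeats the last value, and only the bottom-up case (source level is the bottom, target level is the total) is shown.

```python
from typing import Dict, List

Series = Dict[str, List[float]]  # segment name -> values


class ToyBottomUpReconciler:
    """Hypothetical reconciler: source level is the bottom, target level is the parents."""

    def __init__(self, children: Dict[str, List[str]]):
        self.children = children  # target-level node -> its bottom-level segments

    def fit(self, ts: Series) -> "ToyBottomUpReconciler":
        return self  # nothing to estimate for plain summation

    def aggregate(self, ts: Series) -> Series:
        # Keep only the segments of the source (bottom) level
        bottom = {seg for segs in self.children.values() for seg in segs}
        return {name: values for name, values in ts.items() if name in bottom}

    def reconcile(self, forecast: Series) -> Series:
        # Sum the bottom-level forecasts step by step to get the target level
        return {
            parent: [sum(step) for step in zip(*(forecast[seg] for seg in segs))]
            for parent, segs in self.children.items()
        }


class ToyPipeline:
    """Hypothetical base pipeline whose model repeats the last observed value."""

    def __init__(self, horizon: int = 1):
        self.horizon = horizon
        self.ts: Series = {}

    def fit(self, ts: Series) -> "ToyPipeline":
        self.ts = ts
        return self

    def forecast(self) -> Series:
        return {name: [values[-1]] * self.horizon for name, values in self.ts.items()}


class ToyHierarchicalPipeline(ToyPipeline):
    def __init__(self, reconciler: ToyBottomUpReconciler, horizon: int = 1):
        super().__init__(horizon=horizon)
        self.reconciler = reconciler

    def fit(self, ts: Series) -> "ToyHierarchicalPipeline":
        self.reconciler.fit(ts)                      # fit the reconciler
        super().fit(self.reconciler.aggregate(ts))   # fit on the source level
        return self

    def raw_forecast(self) -> Series:
        return super().forecast()                    # forecast on the source level

    def forecast(self) -> Series:
        return self.reconciler.reconcile(self.raw_forecast())  # target level


ts = {"total": [11.0, 22.0], "a": [1.0, 2.0], "b": [10.0, 20.0]}
pipeline = ToyHierarchicalPipeline(ToyBottomUpReconciler({"total": ["a", "b"]}), horizon=2)
pipeline.fit(ts)
raw = pipeline.raw_forecast()
reconciled = pipeline.forecast()
```

Here raw holds forecasts for the bottom segments "a" and "b", while reconciled holds their sum for "total", mirroring the source-level and target-level checks listed in the test cases.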

Test cases

Test that after fit pipeline saves correct ts on the source_level of reconciler

Test that raw_forecast generates forecast on the source_level of reconciler

Test that forecast generates forecast on the target_level of reconciler

Test that backtest works and produce correct metrics(you can use constant dataset for example)

All the tests should cover both top-down and bottom-up reconcilers with correct source and target levels

Additional context

blocked by #1037 #1038 #1044
enhancement
opened by alex-hse-repository 0
Add `params_to_tune` method
🚀 Feature Request

For AutoML track we need knowledge of hyperparameters to tune for Transformers and Models and Pipelines. We should add method params_to_tune to all classes.

Proposal

Add a default method for Transform and the *AbstractModel classes; it should return an empty dict

Add the method to AbstractPipeline

Add method for Pipeline it should iterate over transforms and model and collect all params.

Add implementation for Ensemble and Stacking - raise NotImplementedError

The return value of params_to_tune is supposed to look like

params = {
    "model.n_iterations": optuna.distributions.CategoricalDistribution((10, 100, 200)),
    "transforms.0.mode": optuna.distributions.CategoricalDistribution(("per-segment", "macro")),
}
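How a pipeline could collect parameters from its model and transforms under the prefixed keys shown above can be sketched as follows. The Toy* classes are hypothetical stand-ins, not ETNA classes, and plain tuples of candidate values are used in place of optuna distributions to keep the sketch dependency-free.

```python
from typing import Any, Dict, Sequence


class ToyTransform:
    def params_to_tune(self) -> Dict[str, Any]:
        # Default implementation: nothing to tune
        return {}


class ToyModeTransform(ToyTransform):
    def params_to_tune(self) -> Dict[str, Any]:
        return {"mode": ("per-segment", "macro")}


class ToyModel:
    def params_to_tune(self) -> Dict[str, Any]:
        return {"n_iterations": (10, 100, 200)}


class ToyPipeline:
    def __init__(self, model: ToyModel, transforms: Sequence[ToyTransform] = ()):
        self.model = model
        self.transforms = transforms

    def params_to_tune(self) -> Dict[str, Any]:
        # Prefix model parameters with "model."
        params = {f"model.{name}": grid for name, grid in self.model.params_to_tune().items()}
        # Prefix transform parameters with "transforms.<index>."
        for i, transform in enumerate(self.transforms):
            for name, grid in transform.params_to_tune().items():
                params[f"transforms.{i}.{name}"] = grid
        return params


params = ToyPipeline(ToyModel(), [ToyModeTransform(), ToyTransform()]).params_to_tune()
```

The transform with the default implementation contributes no keys, so only the model and the first transform appear in params.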

Test cases

check default implementation for Transform and Models

check that Pipeline correctly combines params

check that Ensemble and Stacking raise an error

Additional context

No response
enhancement
opened by martins0n 0
`set_params` method
🚀 Feature Request

Add method to change parameters of etna objects

Proposal

BaseMixin.set_params: Callable[Dict] -> BaseMixin

It works like this:

assert Pipeline(model=CatBoostMultiSegmentModel(n_iterations=100)).set_params({'model.n_iterations': 1000}) == Pipeline(model=CatBoostMultiSegmentModel(n_iterations=1000))

We plan to do the conversion via the to_dict methods (dict -> patching dict -> creating a new object from the dict)
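The "patching dict" step of that conversion can be sketched on plain nested dicts. The function name patch_params and the shape of the config dict are assumptions made for illustration; the real output of to_dict may contain additional keys. Unknown parameters are silently ignored, as the test cases below suggest.

```python
import copy
from typing import Any, Dict


def patch_params(config: Dict[str, Any], params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with dotted keys such as 'model.n_iterations' overwritten."""
    patched = copy.deepcopy(config)  # never mutate the original description
    for dotted_key, value in params.items():
        *path, last = dotted_key.split(".")
        node: Any = patched
        # Walk down the nested dicts following the dotted path
        for part in path:
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        # Only overwrite parameters that already exist; unknown ones are skipped
        if isinstance(node, dict) and last in node:
            node[last] = value
    return patched


config = {"model": {"n_iterations": 100}, "horizon": 1}
updated = patch_params(config, {"model.n_iterations": 1000})
unchanged = patch_params(config, {"model.depth": 6, "foo.bar": 1})
```

A new object would then be created from the patched dict, so the original object stays untouched.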

Test cases

check that set_params changes params

check that if the input dict has unknown parameters, we do nothing

Additional context

No response
enhancement
opened by martins0n 0

Releases(1.14.0)

1.14.0(Dec 16, 2022)
Highlights:

Add python 3.10 support (#1005)

Add experimental module with TimeSeriesBinaryClassifier and PredictabilityAnalyzer (#985), see the example notebook for details (#997)

Inference track results: add predict method to pipelines, teach some models to work with context, change hierarchy of base models, update notebook examples (#979)

Full changelog:

Added

Add python 3.10 support (#1005)

Add SumTranform(#1021)

Add plot_change_points_interactive (#988)

Add experimental module with TimeSeriesBinaryClassifier and PredictabilityAnalyzer (#985)

Inference track results: add predict method to pipelines, teach some models to work with context, change hierarchy of base models, update notebook examples (#979)

Add get_ruptures_regularization into experimental module (#1001)

Add example classification notebook for experimental classification feature (#997)

Changed

Change returned model in get_model of BATSModel, TBATSModel (#987)

Add acf_plot, deprecated sample_acf_plot, sample_pacf_plot (#1004)

Change returned model in get_model of HoltWintersModel, HoltModel, SimpleExpSmoothingModel (#986)

Fixed

Fix MinMaxDifferenceTransform import (#1030)

Fix release docs and docker images cron job (#982)

Fix forecast first point with CatBoostPerSegmentModel (#1010)

Fix hanging EDA notebook (#1027)

Fix hanging EDA notebook v2 + cache clean script (#1034)

1.13.0(Oct 10, 2022)
Highlights:

etna.auto module for greedy pipeline search with a default pipelines pool, wandb sweeps and optuna examples

Full changelog:

Added

Add greater_is_better property for Metric (#921)

etna.auto for greedy search, etna.auto.pool with default pipelines, etna.auto.optuna wrapper for optuna (#895)

Add MinMaxDifferenceTransform (#955)

Add wandb sweeps and optuna examples (#338)

Changed

Make slicing faster in TSDataset._merge_exog, FilterFeaturesTransform, AddConstTransform, LambdaTransform, LagTransform, LogTransform, SklearnTransform, WindowStatisticsTransform; make CICD test different pandas versions (#900)

Mark some tests as long (#929)

Fix to_dict with nn models and add unsafe conversion for callbacks (#949)

Fixed

Fix to_dict with function as parameter (#941)

Fix native networks to work with generated future equals to horizon (#936)

Fix SARIMAXModel to work with exogenous data on pmdarima>=2.0 (#940)

Teach catboost to work with encoders (#957)

1.12.0(Sep 5, 2022)
Highlights:

ETNA native MLPModel

to_dict method in all the etna objects

DirectEnsemble implementing the direct forecasting strategy

Notebook about forecasting strategies

Full changelog:

Added

Function to transform etna objects to dict(#818)

MLPModel(#860)

DeadlineMovingAverageModel (#827)

DirectEnsemble (#824)

CICD: untagged docker image cleaner (#856)

Notebook about forecasting strategies (#864)

Add ChangePointSegmentationTransform, RupturesChangePointsModel (#821)

Changed

Teach AutoARIMAModel to work with out-sample predictions (#830)

Make TSDataset.to_flatten faster for big datasets (#848)

Fixed

Type hints for external users by PEP 561 (#868)

Type hints for Pipeline.model match models.nn(#768)

Fix behavior of SARIMAXModel if simple_differencing=True is set (#837)

Bug python3.7 and TypedDict import (#867)

Fix deprecated pytorch lightning trainer flags (#866)

ProphetModel doesn't work with cap and floor regressors (#842)

Fix problem with encoding category types in OHE (#843)

Change Docker cuda image version from 11.1 to 11.6.2 (#838)

Optimize time complexity of determine_num_steps(#864)

All warning as errors(#880)

Update .gitignore with .DS_Store and checkpoints (#883)

Delete ROADMAP.md (#904)

Fix ci invalid cache (#896)

1.11.1(Aug 3, 2022)
Full changelog:

Fixed

Fix missing constant_value in TimeSeriesImputerTransform (#819)

Make in-sample predictions of SARIMAXModel non-dynamic in all cases (#812)

Add known_future to cli docs (#823)

1.11.0(Jul 25, 2022)
Highlights:

ETNA native RNN and base classes for deep learning models

Lambda transform

Prophet 1.1 support without c++ compiler dependency

Prediction intervals for DeepAR and TFTModel

Add known_future parameter to CLI

Full changelog:

Added

LSTM based RNN and native deep models base classes (#776)

Lambda transform (#762)

assemble pipelines (#774)

Tests on in-sample, out-sample predictions with gap for all models (#785)

Changed

Add columns and mode parameters in plot_correlation_matrix (#726)

Add CatBoostPerSegmentModel and CatBoostMultiSegmentModel classes, deprecate CatBoostModelPerSegment and CatBoostModelMultiSegment (#779)

Allow Prophet update to 1.1 (#799)

Make LagTransform, LogTransform, AddConstTransform vectorized (#756)

Improve the behavior of plot_feature_relevance visualizing p-values (#795)

Update poetry.core version (#780)

Make native prediction intervals for DeepAR (#761)

Make native prediction intervals for TFTModel (#770)

Test cases for testing inference of models (#794)

Wandb.log to WandbLogger (#816)

Fixed

Fix missing prophet in docker images (#767)

Add known_future parameter to CLI (#758)

FutureWarning: The frame.append method is deprecated. Use pandas.concat instead (#764)

Correct ordering if multi-index in backtest (#771)

Raise errors in models.nn if they can't make in-sample and some cases out-sample predictions (#813)

Teach BATS/TBATS to work with in-sample, out-sample predictions correctly (#806)

Github actions cache issue with poetry update (#778)

1.10.0(Jun 15, 2022)
Highlights:

BATS, TBATS and AutoArima models

Fix of empirical prediction intervals

Full changelog:

Added

Add Sign metric (#730)

Add AutoARIMA model (#679)

Add parameters start, end to some eda methods (#665)

Add BATS and TBATS model adapters (#678)

Jupyter extension for black (#742)

Changed

Change color of lines in plot_anomalies and plot_clusters, add grid to all plots, make trend line thicker in plot_trend (#705)

Change format of holidays for holiday_plot (#708)

Make feature selection transforms return columns in inverse_transform(#688)

Add xticks parameter for plot_periodogram, clip frequencies to be >= 1 (#706)

Make TSDataset method to_dataset work with copy of the passed dataframe (#741)

Fixed

Fix bug when ts.plot does not save figure (#714)

Fix bug in plot_clusters (#675)

Fix bugs and documentation for cross_corr_plot (#691)

Fix bugs and documentation for plot_backtest and plot_backtest_interactive (#700)

Make STLTransform to work with NaNs at the beginning (#736)

Fix tiny prediction intervals (#722)

Fix deepcopy issue for fitted deepmodel (#735)

Fix making backtest if all segments start with NaNs (#728)

Fix logging issues with backtest when using empirical prediction intervals (#747)

1.9.0(May 17, 2022)
Added

Add plot_metric_per_segment (#658)

Add metric_per_segment_distribution_plot (#666)

Changed

Remove parameter normalize in linear models (#686)

Fixed

Add missed forecast_params in forecast CLI method (#671)

Add _per_segment_average method to the Metric class (#684)

Fix get_statistics_relevance_table working with NaNs and categoricals (#672)

Fix bugs and documentation for stl_plot (#685)

Fix cuda docker images (#694)

1.8.0(Apr 28, 2022)
Added

Width and Coverage metrics for prediction intervals (#638)

Masked backtest (#613)

Add seasonal_plot (#628)

Add plot_periodogram (#606)

Add support of quantiles in backtest (#652)

Add prediction_actual_scatter_plot (#610)

Add plot_holidays (#624)

Add instruction about documentation formatting to contribution guide (#648)

Seasonal strategy in TimeSeriesImputerTransform (#639)

Changed

Add logging to Metric.__call__ (#643)

Add in_column to plot_anomalies, plot_anomalies_interactive (#618)

Add logging to TSDataset.inverse_transform (#642)

Fixed

Passing non default params for default models STLTransform (#641)

Fixed bug in SARIMAX model with horizon=1 (#637)

Fixed bug in models get_model method (#623)

Fixed unsafe comparison in plots (#611)

Fixed plot_trend does not work with Linear and TheilSen transforms (#617)

Improve computation time for rolling window statistics (#625)

Don't fill first timestamps in TimeSeriesImputerTransform (#634)

Fix documentation formatting (#636)

Fix bug with exog features in AutoRegressivePipeline (#647)

Fix missed dependencies (#656)

Fix custom_transform_and_model notebook (#651)

Fix MyBinder bug with dependencies (#650)

1.7.0(Mar 16, 2022)
Highlights:

New plots (a lot!): imputation, trend, change points, residuals, qq-plot, feature relevance, stl.

New regressors logic in TSDatasets, Transforms and Models

Added jupyter notebook with regressors example

Prediction intervals visualization in plot_forecast

Detrending could be polynomial

Added installation instruction for M1

Fixed TSDataset when plot method does not plot all required segments

VotingEnsemble allows to set weights of estimator as weights of pipelines

Full changelog:

Added

Regressors logic to TSDatasets init (https://github.com/tinkoff-ai/etna/pull/357)

FutureMixin into some transforms (https://github.com/tinkoff-ai/etna/pull/361)

Regressors updating in TSDataset transform loops (https://github.com/tinkoff-ai/etna/pull/374)

Regressors handling in TSDataset make_future and train_test_split (https://github.com/tinkoff-ai/etna/pull/447)

Prediction intervals visualization in plot_forecast (https://github.com/tinkoff-ai/etna/pull/538)

Add plot_imputation (https://github.com/tinkoff-ai/etna/pull/598)

Add plot_time_series_with_change_points function (https://github.com/tinkoff-ai/etna/pull/534)

Add plot_trend (https://github.com/tinkoff-ai/etna/pull/565)

Add find_change_points function (https://github.com/tinkoff-ai/etna/pull/521)

Add option day_number_in_year to DateFlagsTransform (https://github.com/tinkoff-ai/etna/pull/552)

Add plot_residuals (https://github.com/tinkoff-ai/etna/pull/539)

Add get_residuals (https://github.com/tinkoff-ai/etna/pull/597)

Create PerSegmentBaseModel, PerSegmentPredictionIntervalModel (https://github.com/tinkoff-ai/etna/pull/537)

Create MultiSegmentModel (https://github.com/tinkoff-ai/etna/pull/551)

Add qq_plot (https://github.com/tinkoff-ai/etna/pull/604)

Add regressors example notebook (https://github.com/tinkoff-ai/etna/pull/577)

Create EnsembleMixin (https://github.com/tinkoff-ai/etna/pull/574)

Add option season_number to DateFlagsTransform (https://github.com/tinkoff-ai/etna/pull/567)

Create BasePipeline, add prediction intervals to all the pipelines, move parameter n_fold to forecast (https://github.com/tinkoff-ai/etna/pull/578)

Add stl_plot (https://github.com/tinkoff-ai/etna/pull/575)

Add plot_features_relevance (https://github.com/tinkoff-ai/etna/pull/579)

Add community section to README.md (https://github.com/tinkoff-ai/etna/pull/580)

Create AbstractPipeline (https://github.com/tinkoff-ai/etna/pull/573)

Option "auto" to weights parameter of VotingEnsemble, enables to use feature importance as weights of base estimators (https://github.com/tinkoff-ai/etna/pull/587)

Changed

Change the way ProphetModel works with regressors (https://github.com/tinkoff-ai/etna/pull/383)

Change the way SARIMAXModel works with regressors (https://github.com/tinkoff-ai/etna/pull/380)

Change the way Sklearn models work with regressors (https://github.com/tinkoff-ai/etna/pull/440)

Change the way FeatureSelectionTransform works with regressors, rename variables replacing the "regressor" to "feature" (https://github.com/tinkoff-ai/etna/pull/522)

Add table option to ConsoleLogger (https://github.com/tinkoff-ai/etna/pull/544)

Installation instruction (https://github.com/tinkoff-ai/etna/pull/526)

Update plot_forecast for multi-forecast mode (https://github.com/tinkoff-ai/etna/pull/584)

Trainer kwargs for deep models (https://github.com/tinkoff-ai/etna/pull/540)

Update CONTRIBUTING.md (https://github.com/tinkoff-ai/etna/pull/536)

Rename _CatBoostModel, _HoltWintersModel, _SklearnModel (https://github.com/tinkoff-ai/etna/pull/543)

Add logging to TSDataset.make_future, log repr of transform instead of class name (https://github.com/tinkoff-ai/etna/pull/555)

Rename _SARIMAXModel and _ProphetModel, make SARIMAXModel and ProphetModel inherit from PerSegmentPredictionIntervalModel (https://github.com/tinkoff-ai/etna/pull/549)

Update get_started section in README (https://github.com/tinkoff-ai/etna/pull/569)

Make detrending polynomial (https://github.com/tinkoff-ai/etna/pull/566)

Update documentation about transforms that generate regressors, update examples with them (https://github.com/tinkoff-ai/etna/pull/572)

Fix that segment is string (https://github.com/tinkoff-ai/etna/pull/602)

Make LabelEncoderTransform and OneHotEncoderTransform multi-segment (https://github.com/tinkoff-ai/etna/pull/554)

Fixed

Fix TSDataset._update_regressors logic removing the regressors (https://github.com/tinkoff-ai/etna/pull/489)

Fix TSDataset.info, TSDataset.describe methods (https://github.com/tinkoff-ai/etna/pull/519)

Fix regressors handling for OneHotEncoderTransform and HolidayTransform (https://github.com/tinkoff-ai/etna/pull/518)

Fix wandb summary issue with custom plots (https://github.com/tinkoff-ai/etna/pull/535)

Small notebook fixes (https://github.com/tinkoff-ai/etna/pull/595)

Fix import Literal in plotters (https://github.com/tinkoff-ai/etna/pull/558)

Fix plot method bug when plot method does not plot all required segments (https://github.com/tinkoff-ai/etna/pull/596)

Fix dependencies for ARM (https://github.com/tinkoff-ai/etna/pull/599)

[BUG] nn models make forecast without inverse_transform (https://github.com/tinkoff-ai/etna/pull/541)

1.6.3(Feb 14, 2022)
Highlights:

Fix for version incompatibility of scipy and statsmodels

Full changelog:

Fixed

Fixed adding unnecessary lag=1 in statistics (#523)

Fixed wrong MeanTransform behaviour when using alpha parameter (#523)

Fix processing add_noise=True parameter in datasets generation (#520)

Fix scipy version (#525)

1.6.2(Feb 9, 2022)
Full changelog:

Added

Holt-Winters', Holt and exponential smoothing models (#502)

Fixed

Bug with exog features in DifferencingTransform.inverse_transform (#503)

1.6.1(Feb 3, 2022)
Full changelog:

Added

Allow choosing start and end in TSDataset.plot method (#488)

Changed

Make TSDataset.to_flatten faster (#475)

Allow logger percentile metric aggregation to work with NaNs (#483)

Fixed

Can't make forecasting with pipelines, data with nans, and Imputers (#473)

1.6.0(Jan 28, 2022)
Highlights:

New transforms for feature engineering: DifferencingTransform, OneHotEncoderTransform, LabelEncoderTransform, MADTransform.

New transform for feature selection: MRMRFeatureSelectionTransform.

Warnings in docstrings about possible look-ahead bias when using some transforms.

Version update of sklearn, pytorch-forecasting and PytorchForecastingTransform api minor changes.

Fixes for SARIMAX non-default parameters.

TSDataset.describe method for high-level information about the provided time series: % of missing values, number of segments, first and last dates, etc.

Full changelog:

Added

Method TSDataset.info (#409)

DifferencingTransform (#414)

OneHotEncoderTransform and LabelEncoderTransform (#431)

MADTransform (#441)

MRMRFeatureSelectionTransform (#439)

Possibility to change metric representation in backtest using Metric.name (#454)

Warning section in documentation about look-ahead bias (#464)

Parameter figsize to all the plotters (#465)

Changed

Change method TSDataset.describe (#409)

Group Transforms according to their impact (#420)

Change the way LagTransform, DateFlagsTransform and TimeFlagsTransform generate column names (#421)

Clarify the behaviour of TimeSeriesImputerTransform in case of all NaN values (#427)

Fixed bug in title in sample_acf_plot method (#432)

Pytorch-forecasting and sklearn version update + some pytorch transform API changes (#445)

Fixed

Add relevance_params in GaleShapleyFeatureSelectionTransform (#410)

Docs for statistics transforms (#441)

Handling NaNs in trend transforms (#456)

Logger fails with StackingEnsemble (#460)

SARIMAX parameters fix (#459)

[BUG] Check pytorch-forecasting models with freq > "1D" (#463)

1.5.0(Dec 24, 2021)
Highlights:

We extended our family of loggers by adding S3FileLogger and LocalFileLogger. They partially duplicate the behaviour of WandbLogger: you can run multiple experiments (for example via Optuna, HyperOpt or a custom loop) with different hyperparameters and transformers, save the results locally or on S3, and analyze them afterwards.

HolidayTransform based on the holidays library.

Bug fixes for prediction intervals: they now change after inverse_transform in the same way as the target.

We changed the behaviour of fit_transform:

previously, an error was raised if some time series ended with NaN values

now the check is made only before the forecasting phase, so you can fill NaNs with TimeSeriesImputerTransform and make predictions without errors.
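The fit_transform change described above can be sketched in plain Python. This is only an illustration of the idea (impute first, validate series endings right before forecasting), not ETNA's code: `forward_fill` stands in for the imputation step, and `check_before_forecast` is a hypothetical name for the deferred check.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def forward_fill(values):
    """Replace each missing value with the last observed one (stand-in imputer)."""
    filled, last = [], None
    for value in values:
        if not is_missing(value):
            last = value
        filled.append(last)
    return filled


def check_before_forecast(series_by_segment):
    """Raise if any segment still ends with a missing value (hypothetical check)."""
    bad = [
        name
        for name, values in series_by_segment.items()
        if values and is_missing(values[-1])
    ]
    if bad:
        raise ValueError(f"Segments end with missing values: {bad}")


raw = {
    "segment_a": [1.0, 2.0, float("nan")],  # ends with NaN
    "segment_b": [4.0, float("nan"), 6.0],
}

# Transform phase: no error is raised here even though segment_a ends with NaN.
imputed = {name: forward_fill(values) for name, values in raw.items()}

# Forecast phase: the check runs now and passes because the NaNs were filled.
check_before_forecast(imputed)
```

Running the check on `raw` instead of `imputed` raises a ValueError, which mirrors the case where no imputation was applied before forecasting.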

N.B.

Special thanks to @Gewissta for his videos about time series analysis with the ETNA library

Part 1 (Russian)

Part 2 (Russian)

Full changelog:

Added

Holiday Transform (#359)

S3FileLogger and LocalFileLogger (#372)

Parameter changepoint_prior_scale to ProphetModel (#408)

Changed

Set strict_optional = True for mypy (#381)

Move checking the series endings to make_future step (#413)

Fixed

Sarimax bug in future prediction with quantiles (#391)

Catboost version too high (#394)

Add sorting of classes in left bar in docs (#397)

nn notebook in docs (#396)

SklearnTransform column name generation (#398)

Inverse transform doesn't affect quantiles (#395)
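The quantile fix above means that interval bounds are mapped back to the original scale together with the target. A minimal sketch of that idea, assuming a log transform and illustrative column names (`target_0.025`, `target_0.975`); it does not use ETNA and is not its implementation:

```python
import math


def log_transform(values):
    """Forward transform: log(1 + x)."""
    return [math.log1p(v) for v in values]


def inverse_log_transform(values):
    """Inverse transform: exp(x) - 1."""
    return [math.expm1(v) for v in values]


# A forecast expressed in the transformed (log) scale:
# the point forecast plus lower and upper quantile columns.
forecast = {
    "target": log_transform([100.0, 120.0]),
    "target_0.025": log_transform([80.0, 95.0]),
    "target_0.975": log_transform([125.0, 150.0]),
}

# Apply the inverse transform to every column, not only to the target,
# so the interval bounds end up in the same scale as the point forecast.
restored = {
    column: inverse_log_transform(values) for column, values in forecast.items()
}
```

If only `target` were inverted, the bounds would stay in log scale and would no longer bracket the point forecast.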

1.4.2(Dec 9, 2021)
Fix docs generation

1.4.1(Dec 9, 2021)
Made Model, PerSegmentModel, PerSegmentWrapper imports more convenient

Docs now have all neural networks models

Speed up _check_regressors and _merge_exog

1.4.0(Dec 3, 2021)
Hi! In this release we have focused on speed and bug fixes.

Added

ACF plot

Changed

Add ts.inverse_transform as final step at Pipeline.fit method

Make test_ts optional in plot_forecast

Speed up inference for multisegment regression models

Speed up Pipeline._get_backtest_forecasts

Speed up SegmentEncoderTransform

Wandb Logger does not work unless pytorch is installed

Fixed

Get rid of lambda in DensityOutliersTransform and get_anomalies_density

Fixed import in transforms

Pickle DTWClustering

Removed

Remove TimeSeriesCrossValidation

1.3.3(Nov 24, 2021)
Added:

RelevanceTable can return rank

GaleShapleyFeatureSelectionTransform based on the Gale-Shapley algorithm

FilterFeaturesTransform for selecting features from TSDataset while feature engineering

ResampleWithDistributionTransform helps to resample features according to the distribution of another feature

Spell checks in ci

Changed:

Rename confidence interval to prediction interval, start working with quantiles instead of interval_width

Changed format of forecast and test dataframes in WandbLogger
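For the switch from interval_width to quantiles noted above, the usual correspondence for a symmetric interval is lower = (1 - width) / 2 and upper = 1 - lower. The helper below is a hypothetical illustration of that arithmetic and is not an ETNA function:

```python
def interval_width_to_quantiles(interval_width):
    """Convert a symmetric interval width into (lower, upper) quantile levels.

    Hypothetical helper for illustration; not part of ETNA.
    """
    if not 0.0 < interval_width < 1.0:
        raise ValueError("interval_width must be strictly between 0 and 1")
    lower = (1.0 - interval_width) / 2.0
    upper = 1.0 - lower
    return lower, upper


# A 95% interval corresponds to quantile levels of roughly 0.025 and 0.975.
lower, upper = interval_width_to_quantiles(0.95)
print(lower, upper)
```

Working with explicit quantile levels also allows asymmetric intervals, which a single width cannot express.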

1.3.2(Nov 18, 2021)
Minor addition:

Add sum for omegaconf resolvers

1.3.1(Nov 12, 2021)

We also removed the restriction on the pandas version
1.3.0(Nov 12, 2021)
We are happy to announce 1.3.0 version of the etna library!

We focused on making etna even more user friendly as well as added new features.

We have added:

CLI for backtesting

MeanSegmentEncoderTransform

Several feature relevance algorithms

TreeFeatureSelectionTransform

We have fixed:

Bugs in loggers when aggregate_metrics=True

Bug when TSDataset did not create future if exogenous data has empty future

links in CLI documentation

1.3.0-alpha.0(Oct 28, 2021)

In progress...

In this prerelease we are testing optional dependencies. Be careful!

Docs available at https://unstable--etna-docs.netlify.app
1.2.0(Oct 27, 2021)
Boom! Huge update!

Added

Even more documentation

Even more Jupyter Notebooks with examples

Pipeline class, helps unite models and transforms

Ensemble classes, helps unite models

AutoRegressivePipeline

Add confidence intervals to pipelines, models and transforms

Add new Transforms

Add clustering methods

Changed

backtest moved to Pipeline class

Fixed

pandas bugs

TSDataset.to_dataset bug

More in our Changelog
1.2.0-alpha.1(Oct 18, 2021)

Fix bug in TSDataset
1.2.0-alpha.0(Oct 14, 2021)
Added

BinsegTrendTransform, ChangePointsTrendTransform (#87)

Interactive plot for anomalies (#95)

Examples to TSDataset methods with doctest (#92)

WandbLogger (#71)

Pipeline (#78)

Sequence anomalies (#96), Histogram anomalies (#79)

'is_weekend' feature in DateFlagsTransform (#101)

Documentation example for models and note about inplace nature of forecast (#112)

Property regressors to TSDataset (#82)

Clustering (#110)

Outliers notebook (#123)

Method inverse_transform in TimeSeriesImputerTransform (#135)

VotingEnsemble (#150)

Forecast command for cli (#133)

MyPy checks in CI/CD and lint commands (#39)

TrendTransform (#139)

Running notebooks in ci (#134)

Cluster plotter to EDA (#169)

Pipeline.backtest method (#161, #192)

STLTransform class (#158)

NN_examples notebook (#159)

Example for ProphetModel (#178)

Instruction notebook for custom model and transform creation (#180)

Add inverse_transform in *OutliersTransform (#160)

Examples for CatBoostModelMultiSegment and CatBoostModelPerSegment (#181)

Changed

Delete offset from WindowStatisticsTransform (#111)

Add Pipeline example in Get started notebook (#115)

Internal implementation of BinsegTrendTransform (#141)

Colorbar scaling in Correlation heatmap plotter (#143)

Add Correlation heatmap in EDA notebook (#144)

Add __repr__ for Pipeline (#151)

Defined random state for every test case (#155)

Add confidence intervals to Prophet (#153)

Add confidence intervals to SARIMA (#172)

Fixed

Set default value of TSDataset.head method (#170)

Categorical and fillna issues with pandas >=1.2 (#190)

1.1.3(Oct 8, 2021)
This is a hotfix release. This update is recommended for all etna users!

Limit version of pandas by 1.2

1.1.2(Oct 8, 2021)
Just some bug fixes:

Changed

SklearnTransform out column names (#99)

Update EDA notebook (#96)

Add 'regressor_' prefix to output columns of LagTransform, DateFlagsTransform, SpecialDaysTransform, SegmentEncoderTransform

Fixed

Add more obvious Exception Error for forecasting with unfitted model (#102)

Fix bug with hardcoded frequency in PytorchForecastingTransform (#107)

Bug with inverse_transform method of TimeSeriesImputerTransform (#148)

1.1.2-alpha.0(Oct 7, 2021)
In progress... Fixing bugs

Changed

SklearnTransform out column names (#99)

Update EDA notebook (#96)

Add 'regressor_' prefix to output columns of LagTransform, DateFlagsTransform, SpecialDaysTransform, SegmentEncoderTransform

Fixed

Add more obvious Exception Error for forecasting with unfitted model (#102)

Fix bug with hardcoded frequency in PytorchForecastingTransform (#107)

Bug with inverse_transform method of TimeSeriesImputerTransform (#148)

1.1.1(Sep 23, 2021)

1.1.0(Sep 22, 2021)
In this release we focused on adding even more features to our library. Please meet new models and transforms:

Added

MedianOutliersTransform, DensityOutliersTransform (#30)

Issues and Pull Request templates

TSDataset checks (#24, #20)

Pytorch-Forecasting models (#29)

SARIMAX model (#10)

Logging, including ConsoleLogger (#46)

Correlation heatmap plotter (#77)

Changed

Backtest is fully parallel

New default hyperparameters for CatBoost

Fixed

Documentation fixes (#55, #53, #52)

Solved warning in LogTransform and AddConstantTransform (#26)

Bug when regressors do not have enough history (#35)

make_future(1) and make_future(2) bug

Fix working with 'cap' and 'floor' features in Prophet model (#62)

Fix saving init params for SARIMAXModel (#81)

Imports of nn models, PytorchForecastingTransform and Transform (#80)


Owner

Tinkoff.AI

Tinkoff AI Center

GitHub Repository https://etna.tinkoff.ru

