GAM timeseries modeling with auto-changepoint detection. Inspired by Facebook Prophet and implemented in PyMC3

Overview

pm-prophet

Logo

Pymc3-based universal time series prediction and decomposition library (inspired by Facebook Prophet). However, while Faceook prophet is a well-defined model, pm-prophet allows for total flexibility in the choice of priors and thus is potentially suited for a wider class of estimation problems.

⚠️ Only supports Python 3

Table of Contents

Installing pm-prophet

PM-Prophet installation is straightforward using pip: pip install pmprophet

Note that the key dependency of pm-prophet is PyMc3 a library that depends on Theano.

Key Features

  • Nowcasting & Forecasting
  • Intercept, growth
  • Regressors
  • Holidays
  • Additive & multiplicative seasonality
  • Fitting and plotting
  • Custom choice of priors (not in Facebook's prophet original model)
  • Changepoints in growth
  • Automatic changepoint location detection (not in Facebook's prophet original model)
  • Fitting with NUTS/AVDI/Metropolis

Experimental warning ⚠️

  • Note that automatic changepoint detection is experimental

Differences with Prophet:

  • Saturating growth is not implemented
  • Uncertainty estimation is different
  • All components (including seasonality) need to be explicitly added to the model
  • By design pm-prophet places a big emphasis on posteriors and uncertainty estimates, and therefore it won't use MAP for it's estimates.
  • While Faceook prophet is a well-defined fixed model, pm-prophet allows for total flexibility in the choice of priors and thus is potentially suited for a wider class of estimation problems

Peyton Manning example

Predicting the Peyton Manning timeseries:

import pandas as pd
from pmprophet.model import PMProphet, Sampler

df = pd.read_csv("examples/example_wp_log_peyton_manning.csv")
df = df.head(180)

# Fit both growth and intercept
m = PMProphet(df, growth=True, intercept=True, n_changepoints=25, changepoints_prior_scale=.01, name='model')

# Add monthly seasonality (order: 3)
m.add_seasonality(seasonality=30, fourier_order=3)

# Add weekly seasonality (order: 3)
m.add_seasonality(seasonality=7, fourier_order=3)

# Fit the model (using NUTS)
m.fit(method=Sampler.NUTS)

ddf = m.predict(60, alpha=0.2, include_history=True, plot=True)
m.plot_components(
    intercept=False,
)

Model Seasonality-7 Seasonality-30 Growth Change Points

Custom Priors

One of the main reason why PMProphet was built is to allow custom priors for the modeling.

The default priors are:

Variable Prior Parameters
regressors Laplace loc:0, scale:2.5
holidays Laplace loc:0, scale:2.5
seasonality Laplace loc:0, scale:0.05
growth Laplace loc:0, scale:10
changepoints Laplace loc:0, scale:2.5
intercept Normal loc:y.mean, scale: 2 * y.std
sigma Half Cauchy tau:10

But you can change model priors by inspecting and modifying the distributions stored in

m.priors

which is a dictionary of {prior: pymc3-distribution}.

In the example below we will model an additive time-series by imposing a "positive coefficients" constraint by using an Exponential distribution instead of a Laplacian distribution for the regressors.

import pandas as pd
import numpy as np
import pymc3 as pm
from pmprophet.model import PMProphet, Sampler

n_timesteps = 100
n_regressors = 20

regressors = np.random.normal(size=(n_timesteps, n_regressors))
coeffs = np.random.exponential(size=n_regressors) + np.random.normal(size=n_regressors)
# Note that min(coeffs) could be negative due to the white noise

regressors_names = [str(i) for i in range(n_regressors)]

df = pd.DataFrame()
df['y'] = np.dot(regressors, coeffs)
df['ds'] = pd.date_range('2017-01-01', periods=n_timesteps)
for idx, regressor in enumerate(regressors_names):
    df[regressor] = regressors[:, idx]

m = PMProphet(df, growth=False, intercept=False, n_changepoints=0, name='model')

with m.model:
    # Remember to suffix _<model-name> to the custom priors
    m.priors['regressors'] = pm.Exponential('regressors_%s' % m.name, 1, shape=n_regressors)

for regressor in regressors_names:
    m.add_regressor(regressor)

m.fit(
    draws=10 ** 4,
    method=Sampler.NUTS,
)
m.plot_components()

Regressors

Automatic changepoint detection ( ⚠️ experimental)

Pm-prophet is equipped with a non-parametric truncated Dirichlet Process allowing it to automatically detect changepoints in the trend.

To enable it simply initialize the model with auto_changepoints=True as follows:

from pmprophet.model import PMProphet, Sampler
import pandas as pd

df = pd.read_csv("examples/example_wp_log_peyton_manning.csv")
df = df.head(180)
m = PMProphet(df, auto_changepoints=True, growth=True, intercept=True, name='model')
m.fit(method=Sampler.METROPOLIS, draws=2000)
m.predict(60, alpha=0.2, include_history=True, plot=True)
m.plot_components(
    intercept=False,
)

Where n_changepoints is interpreted as the truncation point for the Dirichlet Process.

Pm-prophet will then decide which changepoint values make sense and add a custom weight to them. A call to plot_components() will reveal the changepoint map:

Regressors

A few caveats exist:

  • It's slow to fit since it's a non-parametric model
  • For best results use NUTS as method
  • It will likely require more than the default number of draws to converge
Owner
Luca Giacomel
Luca Giacomel
Binary Classification Problem with Machine Learning

Binary Classification Problem with Machine Learning Solving Approach: 1) Ultimate Goal of the Assignment: This assignment is about solving a binary cl

Dinesh Mali 0 Jan 20, 2022
A Python-based application demonstrating various search algorithms, namely Depth-First Search (DFS), Breadth-First Search (BFS), and A* Search (Manhattan Distance Heuristic)

A Python-based application demonstrating various search algorithms, namely Depth-First Search (DFS), Breadth-First Search (BFS), and the A* Search (using the Manhattan Distance Heuristic)

17 Aug 14, 2022
Neural Machine Translation (NMT) tutorial with OpenNMT-py

Neural Machine Translation (NMT) tutorial with OpenNMT-py. Data preprocessing, model training, evaluation, and deployment.

Yasmin Moslem 29 Jan 09, 2023
This jupyter notebook project was completed by me and my friend using the dataset from Kaggle

ARM This jupyter notebook project was completed by me and my friend using the dataset from Kaggle. The world Happiness 2017, which ranks 155 countries

1 Jan 23, 2022
A framework for building (and incrementally growing) graph-based data structures used in hierarchical or DAG-structured clustering and nearest neighbor search

A framework for building (and incrementally growing) graph-based data structures used in hierarchical or DAG-structured clustering and nearest neighbor search

Nicholas Monath 31 Nov 03, 2022
Python implementation of Weng-Lin Bayesian ranking, a better, license-free alternative to TrueSkill

Python implementation of Weng-Lin Bayesian ranking, a better, license-free alternative to TrueSkill This is a port of the amazing openskill.js package

Open Debates Project 156 Dec 14, 2022
The unified machine learning framework, enabling framework-agnostic functions, layers and libraries.

The unified machine learning framework, enabling framework-agnostic functions, layers and libraries. Contents Overview In a Nutshell Where Next? Overv

Ivy 8.2k Dec 31, 2022
Banpei is a Python package of the anomaly detection.

Banpei Banpei is a Python package of the anomaly detection. Anomaly detection is a technique used to identify unusual patterns that do not conform to

Hirofumi Tsuruta 282 Jan 03, 2023
Flightfare-Prediction - It is a Flightfare Prediction Web Application Using Machine learning,Python and flask

Flight_fare-Prediction It is a Flight_fare Prediction Web Application Using Machine learning,Python and flask Using Machine leaning i have created a F

1 Dec 06, 2022
Library for machine learning stacking generalization.

stacked_generalization Implemented machine learning *stacking technic[1]* as handy library in Python. Feature weighted linear stacking is also availab

114 Jul 19, 2022
customer churn prediction prevention in telecom industry using machine learning and survival analysis

Telco Customer Churn Prediction - Plotly Dash Application Description This dash application allows you to predict telco customer churn using machine l

Benaissa Mohamed Fayçal 3 Nov 20, 2021
Crypto-trading - ML techiques are used to forecast short term returns in 14 popular cryptocurrencies

Crypto-trading - ML techiques are used to forecast short term returns in 14 popular cryptocurrencies. We have amassed a dataset of millions of rows of high-frequency market data dating back to 2018 w

Panagiotis (Panos) Mavritsakis 4 Sep 22, 2022
PennyLane is a cross-platform Python library for differentiable programming of quantum computers

PennyLane is a cross-platform Python library for differentiable programming of quantum computers. Train a quantum computer the same way as a neural ne

PennyLaneAI 1.6k Jan 01, 2023
Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.

Hivemind: decentralized deep learning in PyTorch Hivemind is a PyTorch library to train large neural networks across the Internet. Its intended usage

1.3k Jan 08, 2023
MLflow App Using React, Hooks, RabbitMQ, FastAPI Server, Celery, Microservices

Katana ML Skipper This is a simple and flexible ML workflow engine. It helps to orchestrate events across a set of microservices and create executable

Tom Xu 8 Nov 17, 2022
Decision Tree Regression algorithm implemented on Python from scratch.

Decision_Tree_Regression I implemented the decision tree regression algorithm on Python. Unlike regular linear regression, this algorithm is used when

1 Dec 22, 2021
Machine Learning Study 혼자 해보기

Machine Learning Study 혼자 해보기 기여자 (Contributors) ✨ Teddy Lee 🏠 HongJaeKwon 🏠 Seungwoo Han 🏠 Tae Heon Kim 🏠 Steve Kwon 🏠 SW Song 🏠 K1A2 🏠 Wooil

Teddy Lee 1.7k Jan 01, 2023
PyPOTS - A Python Toolbox for Data Mining on Partially-Observed Time Series

A python toolbox/library for data mining on partially-observed time series, supporting tasks of forecasting/imputation/classification/clustering on incomplete multivariate time series with missing va

Wenjie Du 179 Dec 31, 2022
A Powerful Serverless Analysis Toolkit That Takes Trial And Error Out of Machine Learning Projects

KXY: A Seemless API to 10x The Productivity of Machine Learning Engineers Documentation https://www.kxy.ai/reference/ Installation From PyPi: pip inst

KXY Technologies, Inc. 35 Jan 02, 2023
2D fluid simulation implementation of Jos Stam paper on real-time fuild dynamics, including some suggested extensions.

Fluid Simulation Usage Download this repo and store it in your computer. Open a terminal and go to the root directory of this folder. Make sure you ha

Mariana Ávalos Arce 5 Dec 02, 2022