A tax calculator for stock and dividend activities.

Overview

Revolut Stocks calculator for Bulgarian National Revenue Agency




Information

Processing and calculating the required information about stock ownership and operations is complicated and time-consuming. That brought about the idea of developing a calculator able to automate the process end to end.

The Revolut Stocks calculator parses Revolut statement documents and produces a ready-to-use tax declaration file (dec50_2020_data.xml) that you can import into the NAP online system. Each part of the declaration is also exported as a CSV file for verification.

How it works

  1. The calculator recursively scans the input directory for statement files (*.pdf).
  2. The statement files are then parsed to extract all activity information.
  3. The calculator then obtains the last published exchange rate (USD to BGN) for the day of each trade.
  4. Finally, all activities are processed to produce the required data.
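The steps above can be sketched as a minimal pipeline. This is an illustration only: the function names and dictionary fields are hypothetical, not the calculator's actual API.

```python
from pathlib import Path

def collect_statement_files(input_dir):
    """Step 1: recursively scan the input directory for *.pdf statements."""
    return sorted(Path(input_dir).rglob("*.pdf"))

def run_pipeline(input_dir, exchange_rates, parse, calculate):
    # Step 2: parse each statement file into a list of activities.
    activities = []
    for statement in collect_statement_files(input_dir):
        activities.extend(parse(statement))
    # Step 3: attach the last published USD-to-BGN rate for each trade date.
    for activity in activities:
        activity["rate"] = exchange_rates[activity["trade_date"]]
    # Step 4: process all activities to produce the required data.
    return calculate(activities)
```

Here `parse` and `calculate` stand in for the parser- and declaration-specific logic; the real implementation also generates the CSV and XML output files.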

Considerations

  1. The calculator parses exported statements in PDF format. Parsing a PDF file is a risky task that heavily depends on the structure of the file. To prevent miscalculations, please review the generated statements.csv file under the output directory and make sure all activities are correctly extracted from your statement files.
  2. Revolut doesn't provide information about which exact stock asset is being sold during a sale. As indicated at the end of each statement file, the default tax lot disposition method is First-In, First-Out, and the calculator is developed according to that rule.
  3. The trade date (instead of the settlement date) is used for every calculation. The decision is based on the fact that the Revolut stock platform makes the cash available immediately after a stock sale is initiated. Although the cash can't be withdrawn, it can be used for other deals, so the transfer is assumed to be finished from a user perspective.
  4. By default the calculator uses locally cached exchange rates located here. If you want, you can select the BNB online service as the exchange-rate provider by enabling the -b flag. When activating the BNB online service provider, make sure you do not spam the BNB service with too many requests. Each execution makes around 3-5 requests.
  5. In Application 8, Part 1, you have to list all stocks that you own at the end of the previous year (31.12.20XX). That includes stocks purchased prior to the year you're filing the declaration for. There are comments in both the CSV and XML files that identify stock symbols along with their records. You can use those identification comments to aggregate the records with data that is out of the scope of the calculator.
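The First-In, First-Out disposition rule from consideration 2 can be illustrated with a small sketch. This is a hypothetical helper, not the calculator's internal implementation: each sale is matched against the oldest purchase lots first.

```python
from collections import deque

def fifo_sell(lots, sell_quantity):
    """Match a sale against the oldest purchase lots first (FIFO).

    lots: deque of (quantity, price_per_share) tuples, oldest lot first.
    Returns the list of (quantity, purchase_price) pairs surrendered.
    """
    matched = []
    remaining = sell_quantity
    while remaining > 0:
        if not lots:
            raise ValueError("Sale exceeds previously purchased shares")
        qty, price = lots.popleft()
        take = min(qty, remaining)
        matched.append((take, price))
        remaining -= take
        if qty > take:
            # Put the unsold remainder of this lot back at the front.
            lots.appendleft((qty - take, price))
    return matched
```

For example, selling 15 shares against lots of 10 @ 100 and 10 @ 120 surrenders 10 @ 100 plus 5 @ 120, leaving 5 @ 120 in the queue for future sales.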

Requirements

  • Python version >= 3.7
  • Docker and Docker Compose (only required for the Docker Compose usage option)

Usage

Local

Note: The calculator is not natively tested on Windows. When using Windows, it's preferable to use WSL and Docker.

Install dependencies

$ pip install -r requirements.txt

Run (single parser)

$ python stocks.py -i <path_to_input_dir> -o <path_to_output_dir>

Run (multiple parsers)

To use multiple parsers, you need to sort your statement files into the corresponding parser directories under the selected input directory. For example:

/input-directory/revolut - directory contains Revolut statement files
/input-directory/trading212 - directory contains Trading 212 statement files

You can use the help command to list supported parsers with their names.

$ python stocks.py -i <path_to_input_dir> -o <path_to_output_dir> -p <parser_name_1> -p <parser_name_2> ...

Help

$ python stocks.py -h

Output:

[INFO]: Collecting statement files.
[INFO]: Collected statement files for processing: ['input/statement-3cbc62e0-2e0c-44a4-ae0c-8daa4b7c41bc.pdf', 'input/statement-19ed667d-ba66-4527-aa7a-3a88e9e4d613.pdf'].
[INFO]: Parsing statement files.
[INFO]: Generating [statements.csv] file.
[INFO]: Populating exchange rates.
[INFO]: Generating [app8-part1.csv] file.
[INFO]: Calculating sales information.
[INFO]: Generating [app5-table2.csv] file.
[INFO]: Calculating dividends information.
[INFO]: Generating [app8-part4-1.csv] file.
[INFO]: Generating [dec50_2020_data.xml] file.
[INFO]: Profit/Loss: 1615981 lev.

00s joke.

Docker

Docker Hub images are built and published by GitHub Actions Workflow. The following tags are available:

  • main - the image is built from the latest commit in the main branch.
  • <version> (e.g. 0.6.0) - the image is built from the corresponding released version.

Run

$ docker run --rm -v <path_to_input_dir>:/input:ro -v <path_to_output_dir>:/output gretch/nap-stocks-calculator:main -i /input -o /output

Docker Compose

Prepare

Replace the input and output directory placeholders in docker-compose.yml with paths to your input and output directories.

Run

$ docker-compose up --build

Results

Output file           NAP mapping                                      Description
dec50_2020_data.xml   Декларация по чл.50 от ЗДДФЛ, Приложение 5 и 8   Tax declaration, ready for import.
statements.csv        N/A                                              Verification file to ensure correct parsing. Should be verified manually.
app5-table2.csv       Приложение 5, Таблица 2                          Sales information.
app8-part1.csv        Приложение 8, Част І                             Stocks owned at the end of the year.
app8-part4-1.csv      Приложение 8, Част ІV, 1                         Dividends information.

Errors

Errors are reported with an ERROR label. For example:

[ERROR]: Unable to get exchange rate from BNB. Please, try again later.
Traceback (most recent call last):
  File "/mnt/c/Users/doino/Personal/revolut-stocks/libs/exchange_rates.py", line 57, in query_exchange_rates
    date = datetime.strptime(row[0], BNB_DATE_FORMAT)
  File "/usr/lib/python3.8/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/usr/lib/python3.8/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data ' ' does not match format '%d.%m.%Y'

Please check the latest reported error in the log for more information.

"Unable to get exchange rate from BNB"

The error indicates that there was an issue obtaining the exchange rate from the BNB online service. Please test the BNB online service manually here before reporting an issue.

"No statement files found"

There was an issue finding input statement files. Please check your input directory configuration and file permissions.

"Not activities found. Please, check your statement files"

The parser was unable to extract any activities from your statement files. Please check your statement files and ensure there are reported activities. If there are reported activities but the error still persists, please open an issue.

"Statements contain unsupported activity types"

The calculator found one or more unsupported activity types. Please open an issue and include the reported activity type.

"Unable to find previously purchased shares to surrender as part of SSP"

While trying to perform the SSP surrender-shares operation, the calculator was unable to find the previously purchased shares for the same stock symbol. Please ensure there is a statement file in the input directory containing the original purchase.

Import

NOTE: Importing dec50_2020_data.xml will clear everything already filled in your current tax declaration.

The dec50_2020_data.xml file contains applications 5 and 8. It can be imported into the NAP online system via the NAP web interface by navigating to Декларации, документи или данни, подавани от физически лица/Декларации по ЗДДФЛ/Декларация по чл.50 от ЗДДФЛ and clicking the Импорт на файл button.

During the import, a few errors will be reported. That's normal (see the exception below). The errors occur because the imported file contains data for applications 5 and 8 only, while the system expects a complete filling of the document. After the import, you can continue filling in your tax declaration as usual. Don't forget to enable applications 5 and 8 under part 3 of the main document. After you enable them, navigate to each application, verify the data, and click the Потвърди button.

If errors are reported in the fillings of applications 5 or 8 during the import, that's a sign of a bug in the calculator itself. Please report the error here.

Parsers

Revolut

File format: .pdf

That's the default parser; it handles statement files downloaded from the Revolut app.

Trading 212

File format: .csv

Parser for statement files generated by the Trading 212 platform. Thanks to @bobcho.

CSV

File format: .csv

A generic parser for statements in CSV format. So far, there are two identified usage scenarios:

  1. The parser can be used with structured data from any trading platform, as long as the data can be easily organized to fit the parser's requirements.
  2. The parser can be used to calculate tax information from multiple trading platforms. For example, you can generate a statements.csv file for your Revolut activities and another statements.csv file for your Trading 212 activities. Then you can append the two files and process the resulting file once more. In the end, you'll receive tax information covering both platforms.

In order for the file to be correctly parsed, the following requirements should be met:

  1. The following columns should be present:
    1. trade_date: The date of the trade in dd.MM.YYYY format.
    2. activity_type: The activity type of the row. The following types are supported: ["SELL", "BUY", "DIV", "DIVNRA", "SSP", "MAS"].
    3. company: The name of the stock company. For example, Apple INC.
    4. symbol: The symbol of the stock. For example, AAPL.
    5. quantity: The quantity of the activity. To correctly distinguish surrender from addition in SSP and MAS activities, the quantity should be signed (positive or negative). For all other activity types there is no such requirement (it can be an absolute value).
    6. price: The activity price per share.
    7. amount: The total amount of the activity. It should equal (quantity x price) + commissions + taxes.
  2. The first row should contain headers indicating the column names, according to the mapping above. The columns are not required to appear in any particular order.
  3. The activities listed in the file(s) should be sorted from the earliest trade date to the latest, with the earliest date at the very beginning of the file. When processing multiple statement files, you can simply append them together (no need to merge the activities).
  4. DIVNRA, which represents the tax paid upon receiving a dividend, should follow its DIV activity. Other activities may be listed between those two events. DIVNRA is not required for every DIV, but its presence triggers the calculation of dividend tax owed to NAP. The DIV activity amount should equal the dividend value plus the tax.
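A minimal statements file satisfying the requirements above might look like the sample below. The values are made up for illustration, and the quantity/price contents of the DIV rows are an assumption; the snippet reads the sample back with Python's csv module to show the expected column layout.

```python
import csv
import io

# Hypothetical sample: sorted from earliest trade date to latest,
# with the DIVNRA row (tax withheld) following its DIV row.
SAMPLE = """\
trade_date,activity_type,company,symbol,quantity,price,amount
02.01.2020,BUY,Apple INC,AAPL,10,75.00,750.00
15.06.2020,DIV,Apple INC,AAPL,0,0,8.20
15.06.2020,DIVNRA,Apple INC,AAPL,0,0,0.82
10.12.2020,SELL,Apple INC,AAPL,10,120.00,1200.00
"""

# The first row holds the headers, so DictReader maps each data row
# to the required column names.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
```

Note that the DIV amount (8.20) equals the dividend value (7.38) plus the withheld tax reported by the DIVNRA row (0.82), and the BUY/SELL amounts equal quantity x price (no commissions in this sample).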

To verify the parser's correctness, you can compare the generated statements.csv file with your input file. The data should be the same in both files.

Contribution

As this was a late-night project, there is room for improvement. I'm open to new PRs.

Please submit issues here.

Feedback

You can find me on my social media accounts:

Support

🍺 Buy Me A Beer


Comments
  • trading212 support (partial)

    This adds support for parsing Trading 212 statements in CSV format, in addition to Revolut's PDF statements. Just put all the files in the input folder.

    Only partial support for the time being - it lacks some activity types because I don't have a full statement at the moment (Trading 212 takes time to generate the statements, and I also don't think I have received any dividends at all so far on their platform).

    Includes support for market and limit buy/sell orders only (lacks anything else). Also, output/statements.csv won't have a settlement date for Trading 212 activities, as it's not included in the t212 export. As the settle date is not used anywhere, I guess this is fine.

    This should not break any current functionality, or at least that was my intention.

    opened by bobcho 5
  • Update requirements.txt

    Issue: Doing a pip install -r requirements.txt as instructed returned the error launchpadlib 1.10.13 requires testresources, which is not installed. on my Ubuntu Server 20.04 LTS. Solution: Adding the testresources package to the requirements list fixed this for me.

    opened by drkskwlkr 2
  • General code improvements.

    Hello, I contributed some improvements.

    • Cover edge case when only out-of-order activities are present https://github.com/skilldeliver/revolut-stocks/blob/main/libs/parsers/revolut.py#L165
    • Using pipenv for the dependencies workflow
    • Using black for formatting
    • Using flake8 as a linter

    opened by skilldeliver 1
  • Fix activity range end index when SWEEP ACTIVITY string is missing

    Set end_index for activities to the end of the page by default when activities are found, because the SWEEP ACTIVITY string is sometimes missing. Otherwise, activities can be missed.

    opened by micobg 0
Releases (0.6.0)

Owner: Doino Gretchenliev