An application that maps an image of a LaTeX math equation to LaTeX code.

Overview

Image to LaTeX

Code style: black pre-commit License

An application that maps an image of a LaTeX math equation to LaTeX code.

Image to Latex streamlit app

Introduction

The problem of image-to-markup generation has been attempted by Deng et al. (2016). They provide the raw and preprocessed versions of im2latex-100K, a dataset consisting of about 100K LaTeX math equation images. Using their dataset, I trained a model that uses ResNet-18 as encoder (up to layer3) and a Transformer as decoder with cross-entropy loss.

Initially, I used the preprocessed dataset to train my model, but the preprocessing turned out to be a huge limitation. Although the model can achieve a reasonable performance on the test set, it performs poorly if the image quality, padding, or font size is different from the images in the dataset. This phenomenon has also been observed by others who have attempted the same problem using the same dataset (e.g., this project, this issue and this issue). This is most likely due to the rigid preprocessing for the dataset (e.g. heavy downsampling).

To this end, I used the raw dataset and included image augmentation (e.g. random scaling, small rotation) in my data processing pipeline to increase the diversity of the samples. Moreover, unlike Deng et al. (2016), I did not group images by size. Rather, I sampled them uniformly and padded them to the size of the largest image in the batch, to increase the generalizability of the model.

Additional problems that I found in the dataset:

  • Some latex code produces visually identical outputs (e.g. \left( and \right) look the same as ( and )), so I normalized them.
  • Some latex code is used to add space (e.g. \vspace{2px} and \hspace{0.3mm}). However, the length of the space is diffcult to judge. Also, I don't want the model generates code on blank images, so I removed them.

The best run has a character error rate (CER) of 0.17 in test set. Most errors seem to come from unnecessary horizontal spacing, e.g., \;, \, and \qquad. (I only removed \vspace and \hspace during preprocessing. I did not know that LaTeX has so many horizontal spacing commands.)

Possible improvements include:

  • Do a better job cleaning the data (e.g., removing spacing commands)
  • Train the model for more epochs (for the sake of time, I only trained the model for 15 epochs, but the validation loss is still going down)
  • Use beam search (I only implemented greedy search)
  • Use a larger model (e.g., use ResNet-34 instead of ResNet-18)
  • Do some hyperparameter tuning

I didn't do any of these, because I had limited computational resources (I was using Google Colab).

How To Use

Setup

Clone the repository to your computer and position your command line inside the repository folder:

git clone https://github.com/kingyiusuen/image-to-latex.git
cd image-to-latex

Then, create a virtual environment named venv and install required packages:

make venv
make install-dev

Data Preprocessing

Run the following command to download the im2latex-100k dataset and do all the preprocessing. (The image cropping step may take over an hour.)

python scripts/prepare_data.py

Model Training and Experiment Tracking

Model Training

An example command to start a training session:

python scripts/run_experiment.py trainer.gpus=1 data.batch_size=32

Configurations can be modified in conf/config.yaml or in command line. See Hydra's documentation to learn more.

Experiment Tracking using Weights & Biases

The best model checkpoint will be uploaded to Weights & Biases (W&B) automatically (you will be asked to register or login to W&B before the training starts). Here is an example command to download a trained model checkpoint from W&B:

python scripts/download_checkpoint.py RUN_PATH

Replace RUN_PATH with the path of your run. The run path should be in the format of //. To find the run path for a particular experiment run, go to the Overview tab in the dashboard.

For example, you can use the following command to download my best run

python scripts/download_checkpoint.py kingyiusuen/image-to-latex/1w1abmg1

The checkpoint will be downloaded to a folder named artifacts under the project directory.

Testing and Continuous Integration

The following tools are used to lint the codebase:

isort: Sorts and formats import statements in Python scripts.

black: A code formatter that adheres to PEP8.

flake8: A code linter that reports stylistic problems in Python scripts.

mypy: Performs static type checking in Python scripts.

Use the following command to run all the checkers and formatters:

make lint

See pyproject.toml and setup.cfg at the root directory for their configurations.

Similar checks are done automatically by the pre-commit framework when a commit is made. Check out .pre-commit-config.yaml for the configurations.

Deployment

An API is created to make predictions using the trained model. Use the following command to get the server up and running:

make api

You can explore the API via the generated documentation at http://0.0.0.0:8000/docs.

To run the Streamlit app, create a new terminal window and use the following command:

make streamlit

The app should be opened in your browser automatically. You can also open it by visiting http://localhost:8501. For the app to work, you need to download the artifacts of an experiment run (see above) and have the API up and running.

To create a Docker image for the API:

make docker

Acknowledgement

Samila is a generative art generator written in Python

Samila is a generative art generator written in Python, Samila let's you create arts based on many thousand points. The position of every single point is calculated by a formula, which has random par

Sepand Haghighi 947 Dec 30, 2022
Simple to use image handler for python sqlite3.

SQLite Image Handler Simple to use image handler for python sqlite3. Functions Function Name Parameters Returns init databasePath : str tableName : st

Mustafa Ozan Çetin 7 Sep 16, 2022
Generate different types of random avatars.

avatar-generator Generate different types of random avatars. Requirements Python3 pytorch=1.6 cv2=3.4 tqdm 1. Github-like avatars python generate_gi

Ming 11 Apr 02, 2022
Convert bitmap images to seeds for Tiny-83 NFT project.

What is this? This tool allows you to convert any 14p high and 22p wide Bitmap (.bmp) to the seed needed for the Tiny-83 NFT project. Project Twitter:

shib_maximalist 1 Oct 31, 2021
Simple AI app that is guessing color of apple in picture

Apple Color Determinant Application that is guessing color of apple from image Install Pillow, sklearn and numpy, using command for your package manag

Gleb Nikitin 1 Oct 25, 2021
Water marker for images.

watermarker linux users: To fix this error,please add truetype font path File "watermark.py", line 58, in module font = ImageFont.truetype("Dro

13 Oct 27, 2022
Short piece of code to create a rainbow gif of gradual contours from two shapefiles

rainbow-elevation-gif Short piece of code to create a rainbow gif of gradual con

Jess Roberts 6 Jan 17, 2022
An open source image editor which can manipulate an image in many ways!

Image Editor - An open source image editor which can manipulate an image in many ways! If you need any more modes in repo or I

TroJanzHEX 44 Nov 17, 2022
Panel Competition Image Generator

Panel Competition Image Generator This project was build by a member of the NFH community and is open for everyone who wants to try it. Relevant links

Juliano Mendieta 1 Oct 22, 2021
A little Python tool to convert a TrueType (ttf/otf) font into a PNG for use in demos.

font2png A little Python tool to convert a TrueType (ttf/otf) font into a PNG for use in demos. To use from command line it expects python3 to be at /

Rich Elmes 3 Dec 22, 2021
Convert Image to ASCII Art

Convert Image to ASCII Art Persiapan aplikasi ini menggunakan bahasa python dan beberapa package python. oleh karena itu harus menginstall python dan

Huda Damar 48 Dec 20, 2022
Py3D - A 3d rendering engine written entirely in python

Py3D is a 3d rendering engine written entirely in python. It is a simple and eas

1up Community 2 Nov 14, 2022
Fast Image Retrieval (FIRe) is an open source image retrieval project

Fast Image Retrieval (FIRe) is an open source image retrieval project release by Center of Image and Signal Processing Lab (CISiP Lab), Universiti Malaya. This project implements most of the major bi

CISiP Lab 39 Nov 25, 2022
Script for the creation of metadatas and the randomization of images of MekaVerse

MekaVerse-random Script for the creation of metadata and the randomization of images of MekaVerse Step to replay the random : Create a folder : output

Miinded 8 Sep 07, 2022
QR Code Generator

In this project, we'll be using some libraries to instantly generate authentic QR Codes and export them in various formats

Hassan Shahzad 3 Jun 02, 2022
Pythonocc nodes for Ryven

Pythonocc-nodes-for-Ryven Pythonocc nodes for Ryven Here a way to work on Pythonocc with a node editor, Ryven in that case. To get it functional you w

Tanneguy 30 Dec 18, 2022
Bot by image recognition simulating (random) human clicks

bbbot22 bot por reconhecimento de imagem simulando cliques humanos (aleatórios) inb4: sim, esse é basicamente o mesmo bot de 2021 porque a Globo não t

Yuri 2 Apr 05, 2022
With this simple py script you will be able to get all the .png from a folder and generate a yml for Oraxen

Oraxen-item-to-yml With this simple py script you will be able to get all the .png from a folder and generate a yml for Oraxen How to use Install the

Akex 1 Dec 29, 2021
BeeRef — A Simple Reference Image Viewer

BeeRef — A Simple Reference Image Viewer BeeRef lets you quickly arrange your reference images and view them while you create. Its minimal interface i

Rebecca 245 Dec 25, 2022
PyGram Instagram-like image filters.

PyGram Instagram-like image filters. Usage First, import the client: from filters import * Instanciate a filter and apply it: f = Nashville("image.jp

Ajay Kumar Nagaraj 102 Feb 21, 2022