MapReader: A computer vision pipeline for the semantic exploration of maps at scale

Overview

MapReader

A computer vision pipeline for the semantic exploration of maps at scale

Continuous integration badge License

MapReader is an end-to-end computer vision (CV) pipeline designed by the Living with Machines project. It has two main components: preprocessing/annotation and training/inference:

MapReader pipeline

MapReader provides a set of tools to:

  • load images/maps stored locally or retrieve maps via web-servers (e.g., tileservers which can be used to retrieve maps from OpenStreetMap (OSM), the National Library of Scotland (NLS), or elsewhere). ⚠️ Refer to the credits and re-use terms section if you are using digitized maps or metadata provided by NLS.
  • preprocess images/maps (e.g., divide them into patches, resampling the images, removing borders outside the neatline or reprojecting the map).
  • annotate images/maps or their patches (i.e. slices of an image/map) using an interactive annotation tool.
  • train, fine-tune, and evaluate various CV models.
  • predict labels (i.e., model inference) on large sets of images/maps.
  • Other functionalities include:
    • various plotting tools using, e.g., matplotlib, cartopy, Google Earth, and kepler.gl.
    • compute mean/standard-deviation pixel intensity of image patches.

Below is an example of MapReader CV model output (see the paper on MapReader for more details):

British railspace and buildings as predicted by a MapReader computer vision model

British 'railspace' and buildings as predicted by a MapReader computer vision model. ~30.5M patches from ~16K nineteenth-century Ordnance Survey map sheets were used (courtesy of the National Library of Scotland). (a) Predicted railspace; (b) predicted buildings; (c) and (d) predicted railspace (red) and buildings (black) in and around Middlesbrough and London, respectively. MapReader extracts information from large images or a set of images at a patch level, as depicted in the insets. For both railspace and buildings, we removed those patches that had no other neighboring patches with the same label within a distance of 250 meters.

Table of contents

Installation

Set up a conda environment

We strongly recommend installation via Anaconda:

conda create -n mr_py38 python=3.8
  • Activate the environment:
conda activate mr_py38

Method 1

  • Install mapreader:
pip install git+https://github.com/Living-with-machines/MapReader.git
python -m ipykernel install --user --name mr_py38 --display-name "Python (mr_py38)"

Method 2

  • Clone mapreader source code:
git clone https://github.com/Living-with-machines/MapReader.git 
cd /path/to/MapReader
poetry install
poetry shell

How to cite MapReader

Please consider acknowledging MapReader if it helps you to obtain results and figures for publications or presentations, by citing:

Link: https://arxiv.org/abs/2111.15592

Kasra Hosseini, Daniel C. S. Wilson, Kaspar Beelen and Katherine McDonough (2021), MapReader: A Computer Vision Pipeline for the Semantic Exploration of Maps at Scale, arXiv:2111.15592.

and in BibTeX:

@misc{hosseini2021mapreader,
      title={MapReader: A Computer Vision Pipeline for the Semantic Exploration of Maps at Scale}, 
      author={Kasra Hosseini and Daniel C. S. Wilson and Kaspar Beelen and Katherine McDonough},
      year={2021},
      eprint={2111.15592},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Credits and re-use terms

Digitized maps

MapReader can retrieve maps from NLS (National Library of Scotland) via webservers. For all the digitized maps (retrieved or locally stored), please note the re-use terms:

⚠️ Use of the digitised maps for commercial purposes is currently restricted by contract. Use of these digitised maps for non-commercial purposes is permitted under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC-BY-NC-SA) licence. Please refer to https://maps.nls.uk/copyright.html#exceptions-os for details on copyright and re-use license.

Metadata

We have provided some metadata files in mapreader/persistent_data. For all these file, please note the re-use terms:

⚠️ Use of the metadata for commercial purposes is currently restricted by contract. Use of this metadata for non-commercial purposes is permitted under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC-BY-NC-SA) licence. Please refer to https://maps.nls.uk/copyright.html#exceptions-os for details on copyright and re-use license.

Acknowledgements

This work was supported by Living with Machines (AHRC grant AH/S01179X/1) and The Alan Turing Institute (EPSRC grant EP/N510129/1). Living with Machines, funded by the UK Research and Innovation (UKRI) Strategic Priority Fund, is a multidisciplinary collaboration delivered by the Arts and Humanities Research Council (AHRC), with The Alan Turing Institute, the British Library and the Universities of Cambridge, East Anglia, Exeter, and Queen Mary University of London.

Comments
  • Update README.md

    Update README.md

    • [x] TODOs: See https://github.com/Living-with-machines/MapReader/pull/38#issuecomment-1109569025
    • [x] Rename Maps / Non-maps to Geospatial / Non-geospatial.
    • [x] @kasra-hosseini Review the changes, check the links and merge.
    opened by kasra-hosseini 24
  • Testing `MapReader`

    Testing `MapReader`

    Hi All 👋🏼

    I will be testing MapReader install and the demo notebooks run to evecute the analysis. I will document my process here

    • [x] Installation
      • [x] Install git clone [email protected]:Living-with-machines/MapReader.git
      • [x] git branch -> * dev
      • [X] git pull origin dev
      • [X] poetry install
      • [X] poetry shell
        • this command was not included in the README.md (unlike conda activate ...)
    • [x] Notebooks code execution
      • [x] 001_retrieve_patchify_plot.ipynb
      • [x] 002_annotation.ipynb
      • [x] 003_train_classifier.ipynb
      • [x] 004_inference.ipynb
    opened by ChristinaLast 18
  • :bug: some errors in `binder` deployment.

    :bug: some errors in `binder` deployment.

    Tasks

    • [x] Fix 'great_circle' is not defined
    • [x] Fix simplekml needs to be installed to create KML outputs!

    Associated tracebacks

    ---------------------------------------------------------------------------
    NameError                                 Traceback (most recent call last)
    /tmp/ipykernel_60/1620428857.py in <module>
          3 
          4 xmin, xmax, ymin, ymax, myimg_shape, size_in_m = \
    ----> 5         mymaps.calc_pixel_width_height(all_maps[0])
    
    /srv/conda/envs/notebook/lib/python3.7/site-packages/mapreader/loader/images.py in calc_pixel_width_height(self, parent_id, calc_size_in_m)
        349 
        350         elif calc_size_in_m in ['gc', 'great-circle']:
    --> 351             bottom = great_circle((ymin, xmin), (ymin, xmax)).meters
        352             right = great_circle((ymin, xmax), (ymax, xmax)).meters
        353             top = great_circle((ymax, xmax), (ymax, xmin)).meters
    
    NameError: name 'great_circle' is not defined
    
    ---------------------------------------------------------------------------
    ModuleNotFoundError                       Traceback (most recent call last)
    /srv/conda/envs/notebook/lib/python3.7/site-packages/mapreader/loader/images.py in _createKML(self, path2kml, value, coords, counter)
        817         try:
    --> 818             import simplekml
        819         except:
    
    ModuleNotFoundError: No module named 'simplekml'
    
    During handling of the above exception, another exception occurred:
    
    ImportError                               Traceback (most recent call last)
    /tmp/ipykernel_60/28836796.py in <module>
          4             save_kml_dir="./kml_tutorial",
          5             figsize=(20, 20),
    ----> 6             image_width_resolution=600)
    
    /srv/conda/envs/notebook/lib/python3.7/site-packages/mapreader/loader/images.py in show(self, image_ids, value, plot_parent, border, border_color, vmin, vmax, colorbar, alpha, discrete_colorbar, tree_level, grid_plot, plot_histogram, save_kml_dir, image_width_resolution, kml_dpi_image, **kwds)
        675                                     value=one_image_id,
        676                                     coords=self.images["parent"][one_image_id]["coord"],
    --> 677                                     counter=-1)
        678                 else:
        679                     plt.title(one_image_id)
    
    /srv/conda/envs/notebook/lib/python3.7/site-packages/mapreader/loader/images.py in _createKML(self, path2kml, value, coords, counter)
        818             import simplekml
        819         except:
    --> 820             raise ImportError("[ERROR] simplekml needs to be installed to create KML outputs!")
        821 
        822         (lon_min, lon_max, lat_min, lat_max) = coords
    
    ImportError: [ERROR] simplekml needs to be installed to create KML outputs!
    
    opened by ChristinaLast 6
  • 🐛 `LoadAnnotations` not returning annotation interface

    🐛 `LoadAnnotations` not returning annotation interface

    When using a local notebook to run through the annotation section of the quick_start notebook, I am unable to see the LoadAnnonations object returned in order to generate new labels! See screen shot below:

    Screenshot 2022-05-09 at 13 58 51

    opened by ChristinaLast 3
  • d actual edits to first para

    d actual edits to first para

    I've restored the order to 'maps' -> 'images' so we get a clearer narative as in the current existing repo; and shortened / combined a sentence, as it was repeating 'non-maps' and 'maps', so I used 'any images' instead to make it more intuitive to read.

    I was also going to add a few sentences giving the nice positive spin about interdisciplinary cross-pollination of image analysis, but not sure where this should go: I don't want to break the flow to the instructions, so perhaps it can go after the bullet points?

    opened by dcsw2 2
  • Deploying `MapReader` through `binder`

    Deploying `MapReader` through `binder`

    • [x] @ChristinaLast and @andrewphilipsmith to walk through binder deployment
      • [x] adding requirements.txt with no hashed libraries for binderhub deployment
    opened by ChristinaLast 2
  • Model inference in one step

    Model inference in one step

    Summary

    Currently, we first need to patchify an image and then do the model inference (in two separate steps). In this issue, we plan to have a method that does both steps, i.e.,

    # example interface
    my_classifier.inference(path2image, **kwds for the slice method, including patch size, ...)
    my_classifier.plot()
    

    TODO

    • Refer to https://github.com/alan-turing-institute/mapreader-plant-scivision. Here, we have a function/method called "predict" that does model inference on an image. Under the hood, it slices an image into patches, does model inference on the patches and then plot the results (and return the predicted labels).
    • It would be interesting to have a similar function/method in MapReader.
    opened by kasra-hosseini 2
  • Dev

    Dev

    Creating requirements.txt from pyproject.toml to generate package list needed for binderhub build

    Commands run:

    • to generate requirements.txt
    poetry export -f requirements.txt --output requirements.txt --without-hashes
    

    After doing this, I am required to add the github repo manually to the requirements.txt file to install MapReader such as:

    git+https://github.com/Living-with-machines/[email protected]#egg=mapreader
    
    opened by ChristinaLast 1
  • Bump ipython from 8.0.0 to 8.0.1

    Bump ipython from 8.0.0 to 8.0.1

    Bumps ipython from 8.0.0 to 8.0.1.

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Add plant phenotyping example notebooks and data

    Add plant phenotyping example notebooks and data

    Add directory with cleaned and updated notebooks demonstrating classification of plant patches in images. Also includes examples of open access data that can be used in running these notebooks, annotation files to facilitate annotating plant vs. non-plant patches.

    opened by evangeline-corcoran 1
  • Bump pillow from 8.4.0 to 9.0.0

    Bump pillow from 8.4.0 to 9.0.0

    Bumps pillow from 8.4.0 to 9.0.0.

    Release notes

    Sourced from pillow's releases.

    9.0.0

    https://pillow.readthedocs.io/en/stable/releasenotes/9.0.0.html

    Changes

    ... (truncated)

    Changelog

    Sourced from pillow's changelog.

    9.0.0 (2022-01-02)

    • Restrict builtins for ImageMath.eval(). CVE-2022-22817 #5923 [radarhere]

    • Ensure JpegImagePlugin stops at the end of a truncated file #5921 [radarhere]

    • Fixed ImagePath.Path array handling. CVE-2022-22815, CVE-2022-22816 #5920 [radarhere]

    • Remove consecutive duplicate tiles that only differ by their offset #5919 [radarhere]

    • Improved I;16 operations on big endian #5901 [radarhere]

    • Limit quantized palette to number of colors #5879 [radarhere]

    • Fixed palette index for zeroed color in FASTOCTREE quantize #5869 [radarhere]

    • When saving RGBA to GIF, make use of first transparent palette entry #5859 [radarhere]

    • Pass SAMPLEFORMAT to libtiff #5848 [radarhere]

    • Added rounding when converting P and PA #5824 [radarhere]

    • Improved putdata() documentation and data handling #5910 [radarhere]

    • Exclude carriage return in PDF regex to help prevent ReDoS #5912 [hugovk]

    • Fixed freeing pointer in ImageDraw.Outline.transform #5909 [radarhere]

    • Added ImageShow support for xdg-open #5897 [m-shinder, radarhere]

    • Support 16-bit grayscale ImageQt conversion #5856 [cmbruns, radarhere]

    • Convert subsequent GIF frames to RGB or RGBA #5857 [radarhere]

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 1
  • Satellite images (some references)

    Satellite images (some references)

    • I just had a talk with one of the REG members on https://github.com/urbangrammarai and they are using this tool to download satellite images: https://github.com/urbangrammarai/gee_pipeline/.
    • The other option is : https://planetarycomputer.microsoft.com/
    opened by kasra-hosseini 0
  • Add `min_std_pixel` and `max_std_pixel` to `prepare_annotation`

    Add `min_std_pixel` and `max_std_pixel` to `prepare_annotation`

    So that we can filter out black patches easier. We have trained some MapReader models using ~6K annotated patches (the plant phenotyping project), and now we need to extend the dataset, particularly for non-black patches.

    enhancement 
    opened by kasra-hosseini 1
  • Choose a tool to simplify diffs on .ipynb files.

    Choose a tool to simplify diffs on .ipynb files.

    Consider

    • https://www.reviewnb.com/
    • https://jupyter.org/enhancement-proposals/08-notebook-diff/notebook-diff.html
    • https://blog.ouseful.info/2017/01/27/displaying-differences-in-jupyter-notebooks-nbdime-nbdiff/

    and others

    Build into workflow using pre-commit/CI as appropriate.

    opened by andrewphilipsmith 1
  • Create CODE_OF_CONDUCT.md

    Create CODE_OF_CONDUCT.md

    @DavidBeavan Could you please review this PR? I am using "Contributor Covenant" of GitHub with the following edit:

    Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at https://livingwithmachines.ac.uk/contact-us/. All complaints will be reviewed and investigated promptly and fairly.

    opened by kasra-hosseini 0
  • Adding a notebook containing start of implementation for maps

    Adding a notebook containing start of implementation for maps

    This PR aims to implement the requirements of this issue https://github.com/Living-with-machines/MapReader/issues/36

    For details: See https://hackmd.io/bL3y2cWdT-y3qPGkyVzD5Q?both

    Tasks:

    • [ ] Get or create annotations for example map data
    • [ ] Complete text in HackMD above and transfer it into an appropriate place within the repo. (readme.md or quick_start.ipynb etc)
    • [ ] Resolve all of the questions in the HackMD (whether adding more detail or explicitly deciding to exclude from a quick start guide).
    • [ ] Give the quick_start.ipynb (maps) and quick_start.ipynb (plants) distinct names.
    • [ ] Complete the quick_start.ipynb (maps) to, at least, the same level of detail as the quick_start.ipynb (plants).
    opened by andrewphilipsmith 1
Releases(v0.3.3)
Owner
Living with Machines
A radical collaboration between computational linguists, curators, data scientists, software engineers, geographers and historians
Living with Machines
Reinforcement learning algorithms in RLlib

raylab Reinforcement learning algorithms in RLlib and PyTorch. Installation pip install raylab Quickstart Raylab provides agents and environments to b

Ângelo 50 Sep 08, 2022
Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation in TensorFlow 2

Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation in TensorFlow 2 Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexan

Phan Nguyen 1 Dec 16, 2021
A criticism of a recent paper on buggy image downsampling methods in popular image processing and deep learning libraries.

A criticism of a recent paper on buggy image downsampling methods in popular image processing and deep learning libraries.

70 Jul 12, 2022
PyTorch Personal Trainer: My framework for deep learning experiments

Alex's PyTorch Personal Trainer (ptpt) (name subject to change) This repository contains my personal lightweight framework for deep learning projects

Alex McKinney 8 Jul 14, 2022
Scripts of Machine Learning Algorithms from Scratch. Implementations of machine learning models and algorithms using nothing but NumPy with a focus on accessibility. Aims to cover everything from basic to advance.

Algo-ScriptML Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The goal of this project is not t

Algo Phantoms 81 Nov 26, 2022
CUAD

Contract Understanding Atticus Dataset This repository contains code for the Contract Understanding Atticus Dataset (CUAD), a dataset for legal contra

The Atticus Project 273 Dec 17, 2022
CryptoFrog - My First Strategy for freqtrade

cryptofrog-strategies CryptoFrog - My First Strategy for freqtrade NB: (2021-04-20) You'll need the latest freqtrade develop branch otherwise you migh

Robert Davey 137 Jan 01, 2023
Simple sinc interpolation in PyTorch.

Kazane: simple sinc interpolation for 1D signal in PyTorch Kazane utilize FFT based convolution to provide fast sinc interpolation for 1D signal when

Chin-Yun Yu 10 May 03, 2022
Implementation of "Learning Multi-Granular Hypergraphs for Video-Based Person Re-Identification"

hypergraph_reid Implementation of "Learning Multi-Granular Hypergraphs for Video-Based Person Re-Identification" If you find this help your research,

62 Dec 21, 2022
Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch

Rotary Embeddings - Pytorch A standalone library for adding rotary embeddings to transformers in Pytorch, following its success as relative positional

Phil Wang 110 Dec 30, 2022
Official pytorch implementation of paper "Image-to-image Translation via Hierarchical Style Disentanglement".

HiSD: Image-to-image Translation via Hierarchical Style Disentanglement Official pytorch implementation of paper "Image-to-image Translation

364 Dec 14, 2022
PyGCL: Graph Contrastive Learning Library for PyTorch

PyGCL: Graph Contrastive Learning for PyTorch PyGCL is an open-source library for graph contrastive learning (GCL), which features modularized GCL com

GCL: Graph Contrastive Learning Library for PyTorch 594 Jan 08, 2023
code and models for "Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation"

Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation This repository contains code and models for the method described in: Golnaz

55 Jun 18, 2022
Custom studies about block sparse attention.

Block Sparse Attention 研究总结 本人近半年来对Block Sparse Attention(块稀疏注意力)的研究总结(持续更新中)。按时间顺序,主要分为如下三部分: PyTorch 自定义 CUDA 算子——以矩阵乘法为例 基于 Triton 的 Block Sparse A

Chen Kai 2 Jan 09, 2022
Library extending Jupyter notebooks to integrate with Apache TinkerPop and RDF SPARQL.

Graph Notebook: easily query and visualize graphs The graph notebook provides an easy way to interact with graph databases using Jupyter notebooks. Us

Amazon Web Services 501 Dec 28, 2022
Brax is a differentiable physics engine that simulates environments made up of rigid bodies, joints, and actuators

Brax is a differentiable physics engine that simulates environments made up of rigid bodies, joints, and actuators. It's also a suite of learning algorithms to train agents to operate in these enviro

Google 1.5k Jan 02, 2023
Repository for Traffic Accident Benchmark for Causality Recognition (ECCV 2020)

Causality In Traffic Accident (Under Construction) Repository for Traffic Accident Benchmark for Causality Recognition (ECCV 2020) Overview Data Prepa

Tackgeun 21 Nov 20, 2022
TraSw for FairMOT - A Single-Target Attack example (Attack ID: 19; Screener ID: 24):

TraSw for FairMOT A Single-Target Attack example (Attack ID: 19; Screener ID: 24): Fig.1 Original Fig.2 Attacked By perturbing only two frames in this

Derry Lin 21 Dec 21, 2022
Exploit Camera Raw Data for Video Super-Resolution via Hidden Markov Model Inference

RawVSR This repo contains the official codes for our paper: Exploit Camera Raw Data for Video Super-Resolution via Hidden Markov Model Inference Xiaoh

Xiaohong Liu 23 Oct 08, 2022
AirCode: A Robust Object Encoding Method

AirCode This repo contains source codes for the arXiv preprint "AirCode: A Robust Object Encoding Method" Demo Object matching comparison when the obj

Chen Wang 30 Dec 09, 2022