Datamol is a python library to work with molecules

Overview

Molecular Manipulation Made Easy


Binder PyPI Conda PyPI - Downloads Conda PyPI - Python Version license GitHub Repo stars GitHub Repo stars

Datamol is a python library to work with molecules. It's a layer built on top of RDKit and aims to be as light as possible.

  • 🐍 Simple pythonic API
  • ⚗ī¸ RDKit first: all you manipulate are rdkit.Chem.Mol objects.
  • ✅ Manipulating molecules often rely on many options; Datamol provides good defaults by design.
  • 🧠 Performance matters: built-in efficient parallelization when possible with optional progress bar.
  • 🕹ī¸ Modern IO: out-of-the-box support for remote paths using fsspec to read and write multiple formats (sdf, xlsx, csv, etc).

Try Online

Visit Binder and try Datamol online.

Documentation

Visit https://doc.datamol.io.

Installation

Use conda:

mamba install -c conda-forge datamol

Quick API Tour

import datamol as dm

# Common functions
mol = dm.to_mol("O=C(C)Oc1ccccc1C(=O)O", sanitize=True)
fp = dm.to_fp(mol)
selfies = dm.to_selfies(mol)
inchi = dm.to_inchi(mol)

# Standardize and sanitize
mol = dm.to_mol("O=C(C)Oc1ccccc1C(=O)O")
mol = dm.fix_mol(mol)
mol = dm.sanitize_mol(mol)
mol = dm.standardize_mol(mol)

# Dataframe manipulation
df = dm.data.freesolv()
mols = dm.from_df(df)

# 2D viz
legends = [dm.to_smiles(mol) for mol in mols[:10]]
dm.viz.to_image(mols[:10], legends=legends)

# Generate conformers
smiles = "O=C(C)Oc1ccccc1C(=O)O"
mol = dm.to_mol(smiles)
mol_with_conformers = dm.conformers.generate(mol)

# 3D viz (using nglview)
dm.viz.conformers(mol, n_confs=10)

# Compute SASA from conformers
sasa = dm.conformers.sasa(mol_with_conformers)

# Easy IO
mols = dm.read_sdf("s3://my-awesome-data-lake/smiles.sdf", as_df=False)
dm.to_sdf(mols, "gs://data-bucket/smiles.sdf")

Compatibilities

Version compatibilities are an essential topic for production-software stacks. We are cautious about documenting compatibility between datamol, python and rdkit.

datamol python rdkit
0.3 >=3.7,<=3.9 >=2020.09,<=2021.03

CI Status

master
Lib build & Testing GitHub Workflow Status
Code Sanity (linting and type analysis) GitHub Workflow Status
Documentation Build GitHub Workflow Status

Changelogs

See the latest changelogs at CHANGELOG.rst.

License

Under the Apache-2.0 license. See LICENSE.

Authors

See AUTHORS.rst.

Comments
  • change force field to MMFF94s without electorstatic component

    change force field to MMFF94s without electorstatic component

    Proposed change to conformer force field from UFF to MMFF94s_noEstat, inspired by Openeye's Omega force field. In addition to FF change, added filters (ewindow, eratio), to allow high-energy conformers to be filtered out.

    opened by matudor 16
  • [ERROR] Runtime.ImportModuleError: Unable to import module 'app': No module named 'sascorer' Traceback (most recent call last):

    [ERROR] Runtime.ImportModuleError: Unable to import module 'app': No module named 'sascorer' Traceback (most recent call last):

    [ERROR] Runtime.ImportModuleError: Unable to import module 'app': No module named 'sascorer' Traceback (most recent call last):
    

    Posting so it's logged somewhere.

    Is this a new dependency for datamol? Disclaimer, I didn't update datamol in my project (lambdomics) since a couple months (I was at 0.5 before upgrading). sascorer seems to have appear in commit 8576d4b with the descriptors module, and this module is imported by default when using import datamol as dm .

    sascorer seems to be a module coming from RDKit but I'm not sure. Is it just a matter of making sure to import RDKit before datamol if you use both in your project?

    bug question 
    opened by MichelML 14
  • Added graph matching and template re-ordering

    Added graph matching and template re-ordering

    Is this relevant to datamol? If so, I'll improve the PR and make some tests.

    Functions to re-order a molecule from a template ordering, and to match molecules with different orderings.

    Checklist:

    • [x] Add tests to cover the fixed bug(s) or the new introduced feature(s) (if appropriate).
    • [x] Update the API documentation is a new function is added or an existing one is deleted.
    • [x] Added a news entry.
      • copy news/TEMPLATE.rst to news/my-feature-or-branch.rst) and edit it.

    opened by DomInvivo 8
  • better sanitize handling in read_sdf

    better sanitize handling in read_sdf

    Checklist:

    • [x] Add tests to cover the fixed bug(s) or the new introduced feature(s) (if appropriate).
    • [x] Added a news entry.
      • copy news/TEMPLATE.rst to news/my-feature-or-branch.rst) and edit it.

    Calling dm.sanitize_mol delete the conformers which should not happens by default since you often want the 3d structure when you read an SDF file.

    (bug spotted while running unit tests on berth with a recent version of datamol)

    opened by hadim 7
  • Mc edits

    Mc edits

    Made small changes to tutorials and documentation Checklist:

    • [x] Removed repeated lines in docs/tutorials/The_Basics.ipynb
    • [x] Removed comments around a couple lines in docs/tutorials/Preprocessing_Molecules.ipynb

    Other questions

    • [x] Fragment and scaffold tutorial does not seem to load correctly. Is it still a wip?
    • [x] In the API section, datamol.actions contains this .. warning:: This is computationally expensive. Should the formatting around 'warning' be changed?
    • [x] Same idea as above in datamol.fragments, but with ..note ::

    opened by craigmichaelm 7
  • `dm.read_smi()` copies a file to an already existing location

    `dm.read_smi()` copies a file to an already existing location

    When I try to load a remote .smi file using Datamol, I run into the following exception:

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    (...)
    
    File ~/mambaforge/envs/mood/lib/python3.10/site-packages/datamol/io.py:320, in read_smi(urlpath)
        318 if not fsspec.utils.can_be_local(str(urlpath)):
        319     active_path = pathlib.Path(tempfile.mkstemp()[1])
    --> 320     dm.utils.fs.copy_file(urlpath, active_path)
        322 # Read the molecules
        323 supplier = rdmolfiles.SmilesMolSupplier(str(active_path), titleLine=0)
    
    File ~/mambaforge/envs/mood/lib/python3.10/site-packages/datamol/utils/fs.py:250, in copy_file(source, destination, chunk_size, force, progress, leave_progress)
        247     raise ValueError(f"The file being copied does not exist or is not a file: {source}")
        249 if not force and is_file(destination_file):  # type: ignore
    --> 250     raise ValueError(f"The destination file to copy already exists: {destination}")
        252 with source_file as source_stream:
        253     with destination_file as destination_stream:
    
    ValueError: The destination file to copy already exists: /tmp/tmpkzrm7u58
    

    Seems like this function is not up-to-date with recent changes in the dm.utils.fs module? Seems to me like active_path should be set differently to a non-existing file.

    bug 
    opened by cwognum 6
  • New align + news descriptors + some cleaning

    New align + news descriptors + some cleaning

    Checklist:

    • [x] Add tests to cover the fixed bug(s) or the new introduced feature(s) (if appropriate).
    • [ ] Update the API documentation is a new function is added or an existing one is deleted.
    • [x] Added a news entry.
      • copy news/TEMPLATE.rst to news/my-feature-or-branch.rst) and edit it.

    I still need to add the new align code.

    opened by hadim 6
  • expose clean_it arg

    expose clean_it arg

    Exposing this arg can prevent errors for certain problematic molecules when enumerating stereochemistry.

    E.g.

    sm = "N=1C(NC2CC2)=C3C(=NC1)N(/C=C/C=4C=C(C=CC4C)C(=O)NC=5C=C(C=C(C5)N6CCN(CC6)C)C(F)(F)F)C=N3"
    mol = datamol.to_mol(sm)
    
    datamol.enumerate_stereoisomers(mol) # Fails
    datamol.enumerate_stereoisomers(mol, clean_it=False) # Succeeds
    
    opened by jdhorwood 6
  • Use native batch_size from joblib

    Use native batch_size from joblib

    Fix #58 by replacing Sequence by Iterable.

    I had the same bug a few hours ago. @DomInvivo

    Checklist:

    • [x] Add tests to cover the fixed bug(s) or the new introduced feature(s) (if appropriate).
    • [ ] Update the API documentation is a new function is added or an existing one is deleted.
    • [x] Added a news entry.
      • copy news/TEMPLATE.rst to news/my-feature-or-branch.rst) and edit it.

    opened by maclandrol 6
  • add sanitize arg in read_sdf

    add sanitize arg in read_sdf

    Checklist:

    • [x] Add tests to cover the fixed bug(s) or the new introduced feature(s) (not needed).
    • [ ] Added a news entry.
      • copy news/TEMPLATE.rst to news/my-feature-or-branch.rst) and edit it.

    opened by Ishan-Kumar2 6
  • More options for matching with ambiguity

    More options for matching with ambiguity

    Added more options for allow_ambiguous_match, passed as strings. Notably, option "best" allows to find the match with the least amount of errors in terms of atomic number, bond type, and bond stereo.

    Checklist:

    • [x] Add tests to cover the fixed bug(s) or the new introduced feature(s) (if appropriate).
    • [x] Update the API documentation is a new function is added or an existing one is deleted.
    • [x] Added a news entry.
      • copy news/TEMPLATE.rst to news/my-feature-or-branch.rst) and edit it.

    opened by DomInvivo 5
  • Move the fs module to its dedicated section in the docs

    Move the fs module to its dedicated section in the docs

    This is a very useful module, making it under the utils section undermines its visibility imho.

    fsspec ftw

    ref: https://doc.datamol.io/stable/api/datamol.utils.html#datamol.utils.fs

    documentation 
    opened by MichelML 1
  • Add dm.read_records in io module

    Add dm.read_records in io module

    This would simplify things when working in the context of querying a rest api where the json response contains a list of molecules. Quick REST API examples:

    • Molport
    • Chemspace
    • Mcule
    • CDD Vault
    • Dotmatics
    • PubChem
    • Etc...
    enhancement low-priority 
    opened by MichelML 7
  • rdkit.Chem.GetSubstructMatch called in dm.scaffold.fuzzy_scaffolding takes rdkit mol as input, not str

    rdkit.Chem.GetSubstructMatch called in dm.scaffold.fuzzy_scaffolding takes rdkit mol as input, not str

    According to the datamol documentation and code, the enforce_subs parameter of dm.scaffold.fuzzy_scaffolding should be a list of str (presumably SMILES or SMARTS). However, passing in a list of str gives the following rdkit argument error:

    image

    This is because the dm.scaffold.fuzyy_scaffolding function passes the enfocre_subs parameter to rdkit.Chem.GetSubstructMatch, which should take as input an rdkit molecule object, not a string.

    ~~Additionally, upon loading the pattern SMILES as molecule object and passing that into the enforce_subs parameter, the function runs without error but dm.scaffold.fuzyy_scaffolding will miss the loaded pattern:~~

    Edit: Image removed. This was a bad example I falsely thought wasn't capturing an enforce_subs r-group. Will find a better example.

    bug documentation 
    opened by colinrsmall 3
  • `dm.align.template_align` modify the global rdkit behaviour

    `dm.align.template_align` modify the global rdkit behaviour

    See https://github.com/datamol-org/datamol/blob/013d93012abb7fb309a382d8b3eedaca4c2f4425/datamol/align.py#L107

    We should not modify the global behaviour. The best is probably to put back the original value if it has been modified. This is not ideal in when doing multithreading.

    bug low-priority 
    opened by hadim 2
  • Refactor `dm.scaffold.fuzzy_scaffolding`

    Refactor `dm.scaffold.fuzzy_scaffolding`

    dm.scaffold.fuzzy_scaffolding is quite a powerful function but its output is often hard to understand and also process for downstream task.

    We could keep backward compat by keeping dm.scaffold.fuzzy_scaffolding and propose an alternative function that will do the same kind of processing under the hood but return a data structure that is more intuitive and easier to use (a dataframe or a list of dataframe?).

    enhancement low-priority 
    opened by hadim 0
Releases(0.8.8)
  • 0.8.8(Dec 15, 2022)

  • 0.8.7(Nov 29, 2022)

    Added:

    • Add multiple utilities to work with mapped SMILES with hydrogens.
    • Add dm.clear_atom_props() to remove atom's properties.
    • Add dm.clear_atom_map_number() to remove the atom map number property.
    • Add dm.get_atom_positions() to retrieve the atomic positions of a conformer of a molecule.
    • Add dm.set_atom_positions() to add a new confomer to a molecule given a list of atomic positions.

    Changed:

    • Add new arguments to dm.to_mol: allow_cxsmiles, parse_name, remove_hs and strict_cxsmiles. Refers to the docstring for the details.
    • Set copy to True by default to dm.atom_indices_to_mol().
    • Allow to specify the property keys to clear in dm.clear_mol_props(). If not set, the original default beahviour is to clear everything.

    Authors:

    • Hadrien Mary
    Source code(tar.gz)
    Source code(zip)
    datamol-0.8.7.tar.gz(3.42 MB)
  • 0.8.6(Nov 28, 2022)

  • 0.8.5(Nov 28, 2022)

    Added:

    • Support for max_num_mols in dm.read_sdf(). Useful when files are large and debugging code.
    • Support for returning the invalid molecules in dm.read_sdf. Useful when we need to know which one failed.
    • Support for more compression formats when reading SDF files using fssep.open(..., compression="infer").
    • Add CODEOWNERS file.
    • Add dm.descriptors.n_spiro_atoms and dm.descriptors.n_stereo_centers_unspecified.

    Changed:

    • Overload output types for dm.read_sdf and dm.data.*.
    • Reduce tests duration (especially in CI).

    Authors:

    • DomInvivo
    • Hadrien Mary
    Source code(tar.gz)
    Source code(zip)
    datamol-0.8.5.tar.gz(3.42 MB)
  • 0.8.4(Nov 11, 2022)

  • 0.8.3(Nov 5, 2022)

  • 0.8.2(Oct 31, 2022)

  • 0.8.1(Oct 28, 2022)

  • 0.8.0(Oct 28, 2022)

    Added:

    • dm.Atom and dm.Bond types.
    • Add RDKit as a pypi dep.
    • Add datamol.hash_mol() based on rdkit.Chem.RegistrationHash.

    Changed:

    • RDKit 2022.09: use Draw.shouldKekulize instead of Draw._okToKekulizeMol.
    • RDKit 2022.09: don't use dm.convert._ChangeMoleculeRendering for RDKit >=2022.09.

    Authors:

    • Hadrien Mary
    Source code(tar.gz)
    Source code(zip)
    datamol-0.8.0.tar.gz(3.42 MB)
  • 0.7.18(Oct 18, 2022)

  • 0.7.17(Oct 14, 2022)

  • 0.7.16(Oct 12, 2022)

    Changed:

    • Bump upstream GH actions versions.
    • dm.fs.copy_dir now uses the internal fsspec copy when the two source and destination fs are the same. It makes the copy much faster.

    Fixed:

    • Use os.PathLike to recognize a broader range of string-based path inputs in the dm.fs module. It prevents file objects such as py._path.local.LocalPath not being recognized as path.

    Authors:

    • Hadrien Mary
    Source code(tar.gz)
    Source code(zip)
    datamol-0.7.16.tar.gz(3.36 MB)
  • 0.7.15(Oct 11, 2022)

  • 0.7.14(Oct 3, 2022)

    Added:

    • Add with_atom_indices to dm.to_smiles. If enable, atom indices will be added to the SMILES.

    Changed:

    • Changed the default for dm.fs.is_file() from True`` toFalse`.
    • Refactor the API doc to breakdown all the submodules in individual doc. Thanks to @MichelML for the suggestion.
    • Re-enable pipy activity in rever.

    Fixed:

    • Minor typo in the documentation of dm.conformers.generate()

    Authors:

    • Cas
    • Hadrien Mary
    • Valence-JonnyHsu
    Source code(tar.gz)
    Source code(zip)
    datamol-0.7.14.tar.gz(3.36 MB)
  • 0.7.13(Sep 9, 2022)

  • 0.7.12(Sep 6, 2022)

  • 0.7.11(Sep 4, 2022)

    Added:

    • Add configurations for dev containers based on the micromamba Docker image. More informations about dev container at https://docs.github.com/en/codespaces/setting-up-your-project-for-codespaces/introduction-to-dev-containers.
    • support for two additional forcefields: MMFF94s with and without electrostatic component
    • energies output along with delta-energy to lowest energy conformer

    Changed:

    • API of dm.conformers.generate() to support choice of forcefield. In addition ewindow and eratio flags added to reject high energy conformers, either on absoute scale, or as ratio to rotatable bonds
    • Revamped all the datamol tutorials and add new tutorials. Huge thanks to @Valence-jonnyhsu for leading the refactoring of the datamol tutorials.
    • Improve documentation for dm.standardize_mol()
    • Multiple various docstring and typing improvments.
    • Embed the cdk2.sdf and solubility_*.sdf files within the datamol package to prevent issue with the RDKit config dir.
    • Enable strict mode on the documentation to prevent any issues and inconsistency with the types and docstrings of datamol.
    • Refactor micromamba CI to use latest and simplify it.

    Removed:

    • Remove unused and unmaintained dm.actions and dm.reactions module.
    • Remove copy args from add_hs and remove_hs (RDKit already returns copies).

    Fixed:

    • Errors in ECFP fingerprints that computes FCFP instead of ECFP.

    Authors:

    • Emmanuel Noutahi
    • Hadrien Mary
    • Matt
    Source code(tar.gz)
    Source code(zip)
    datamol-0.7.11.tar.gz(2.95 MB)
  • 0.7.10(Jul 18, 2022)

    Added:

    • New possibilities for ambiguous matching of molecules in the function reorder_mol_from_template

    Changed:

    • Replaced allow_ambiguous_hs_only by the option "hs_only" for the ambiguous_match_mode parameter
    • ambiguous_match_mode is now a String, no longer a bool.

    Deprecated:

    • allow_ambiguous_hs_only is no longer deprecated, but without warning since the feature is brand new.
    • Same for ambiguous_match_mode being a bool.

    Authors:

    • DomInvivo
    • Hadrien Mary
    Source code(tar.gz)
    Source code(zip)
    datamol-0.7.10.tar.gz(1.03 MB)
  • 0.7.9(Jun 17, 2022)

    Added:

    • datamol.graph.match_molecular_graphs, with unit-tests
    • datamol.graph.reorder_mol_from_template, with unit-tests

    Changed:

    • Typing in datamol.graph.py, changed rdkit.Chem.rdchem.Mol to dm.Mol

    Deprecated:

    • NOTHING

    Removed:

    • NOTHING

    Fixed:

    • NOTHING

    Security:

    • NOTHING

    Authors:

    • DomInvivo
    • Emmanuel Noutahi
    Source code(tar.gz)
    Source code(zip)
    datamol-0.7.9.tar.gz(1.03 MB)
  • 0.7.8(Jun 3, 2022)

  • 0.7.7(May 30, 2022)

  • 0.7.6(May 20, 2022)

  • 0.7.5(May 19, 2022)

  • 0.7.4(May 13, 2022)

  • 0.7.3(Apr 12, 2022)

  • 0.7.2(Mar 22, 2022)

  • 0.7.1(Mar 18, 2022)

    Added:

    • A new dm.align module with various functions to align a list of molecules. Use dm.align.template_align to align a molecule to a template and dm.align.auto_align_many to automatically partition and align a list of molecules.
    • New descriptors: formal_charge
    • New descriptors: refractivity
    • New descriptors: n_rigid_bonds
    • New descriptors: n_stereo_centers
    • New descriptors: n_charged_atoms
    • Add dm.clear_props to clear all the properties of a mol.
    • Add a new dataset in addition to freesolv based on RDKit CDK2 at dm.cdk2().
    • Add dm.strip_mol_to_core to remove all R groups from a molecule.
    • Add dm.UNSPECIFIED_BOND
    • dm.compute_ring_system to extract the ring systems from a molecule.

    Changed:

    • Improve typing.
    • Improve relative imports coverage.
    • Adapt dm.to_image to use the align module.

    Removed:

    • Remove a lot of # type: ignore as those can be error prone (hopefully the tests are here!)

    Authors:

    • Hadrien Mary
    Source code(tar.gz)
    Source code(zip)
    datamol-0.7.1.tar.gz(1.03 MB)
  • 0.7.0(Mar 11, 2022)

    Added:

    • Add dm.conformers.keep_conformers in order to only keep one or multiple conformers from a molecules.

    Changed:

    • Change the conformer generation arguments to use useRandomCoords=True by default.
    • Start using explicit Optional instead of implicit Optional for typing.
    • Start using relative imports instead of absolute ones.
    • When conformers are not minimized, sort them by energy (can be turned to False).

    Removed:

    • Remove fallback_to_random_coords argument from generate_conformers.

    Authors:

    • Hadrien Mary
    Source code(tar.gz)
    Source code(zip)
    datamol-0.7.0.tar.gz(1.02 MB)
  • 0.6.9(Mar 2, 2022)

    Added:

    • Support for selfies<2.0.0 in tests

    Changed:

    • Behaviour of all inchi functions to return None with a warning instead of silently returning an empty string
    • Order of str evaluation on convertion function. isinstance(str) is now evaluated before is None

    Fixed:

    • Bug in unique_id making this evaluation falling back on 'd41d8cd98f00b204e9800998ecf8427e' on unsupported inputs. Instead None is returned now

    Authors:

    • Emmanuel Noutahi
    Source code(tar.gz)
    Source code(zip)
    datamol-0.6.9.tar.gz(1.02 MB)
  • 0.6.8(Feb 27, 2022)

Owner
datamol
A python library to work with molecules.
datamol
collection of interesting Computer Science resources

collection of interesting Computer Science resources

Kirill Bobyrev 137 Dec 22, 2022
Read-only mirror of https://gitlab.gnome.org/GNOME/pybliographer

Pybliographer Pybliographer provides a framework for working with bibliographic databases. This software is licensed under the GPLv2. For more informa

GNOME Github Mirror 15 May 07, 2022
Doing bayesian data analysis - Python/PyMC3 versions of the programs described in Doing bayesian data analysis by John K. Kruschke

Doing_bayesian_data_analysis This repository contains the Python version of the R programs described in the great book Doing bayesian data analysis (f

Osvaldo Martin 851 Dec 27, 2022
A modular single-molecule analysis interface

MOSAIC: A modular single-molecule analysis interface MOSAIC is a single molecule analysis toolbox that automatically decodes multi-state nanopore data

National Institute of Standards and Technology 35 Dec 13, 2022
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

Cookiecutter Data Science A logical, reasonably standardized, but flexible project structure for doing and sharing data science work. Project homepage

6.4k Jan 02, 2023
A flexible package manager that supports multiple versions, configurations, platforms, and compilers.

Spack Spack is a multi-platform package manager that builds and installs multiple versions and configurations of software. It works on Linux, macOS, a

Spack 3.1k Dec 31, 2022
PsychoPy is an open-source package for creating experiments in behavioral science.

PsychoPy is an open-source package for creating experiments in behavioral science. It aims to provide a single package that is: precise enoug

PsychoPy 1.3k Dec 31, 2022
🍊 :bar_chart: :bulb: Orange: Interactive data analysis

Orange Data Mining Orange is a data mining and visualization toolbox for novice and expert alike. To explore data with Orange, one requires no program

Bioinformatics Laboratory 3.9k Jan 05, 2023
A mathematica expression evaluator with PokemonTypes

A simple mathematical expression evaluator that uses Pokemon types to replace symbols.

Arnav Jindal 2 Nov 14, 2021
AnuGA for the simulation of the shallow water equation

ANUGA Contents ANUGA What is ANUGA? Installation Documentation and Help Mailing Lists Web sites Latest source code Bug reports Developer information L

Geoscience Australia 147 Dec 14, 2022
Float2Binary - A simple python class which finds the binary representation of a floating-point number.

Float2Binary A simple python class which finds the binary representation of a floating-point number. You can find a class in IEEE754.py file with the

Bora Canbula 3 Dec 14, 2021
Discontinuous Galerkin finite element method (DGFEM) for Maxwell Equations

DGFEM Maxwell Equations Discontinuous Galerkin finite element method (DGFEM) for Maxwell Equations. Work in progress. Currently, the 1D Maxwell equati

Rafael de la Fuente 9 Aug 16, 2022
Statsmodels: statistical modeling and econometrics in Python

About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an

statsmodels 8.1k Dec 30, 2022
Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Aesara

PyMC3 is a Python package for Bayesian statistical modeling and Probabilistic Machine Learning focusing on advanced Markov chain Monte Carlo (MCMC) an

PyMC 7.2k Dec 30, 2022
CS 506 - Computational Tools for Data Science

CS 506 - Computational Tools for Data Science Code, slides, and notes for Boston University CS506 Fall 2021 The Final Project Repository can be found

Lance Galletti 14 Mar 23, 2022
3D medical imaging reconstruction software

InVesalius InVesalius generates 3D medical imaging reconstructions based on a sequence of 2D DICOM files acquired with CT or MRI equipments. InVesaliu

443 Jan 01, 2023
Mathics is a general-purpose computer algebra system (CAS). It is an open-source alternative to Mathematica

Mathics is a general-purpose computer algebra system (CAS). It is an open-source alternative to Mathematica. It is free both as in "free beer" and as in "freedom".

Mathics 535 Jan 04, 2023
CONCEPT (COsmological N-body CodE in PyThon) is a free and open-source simulation code for cosmological structure formation

CONCEPT (COsmological N-body CodE in PyThon) is a free and open-source simulation code for cosmological structure formation. The code should run on any Linux system, from massively parallel computer

Jeppe Dakin 62 Dec 08, 2022
Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis. You write a high level configuration file specifying your in

Blue Collar Bioinformatics 915 Dec 29, 2022
An open-source application for biological image analysis

CellProfiler is a free open-source software designed to enable biologists without training in computer vision or programming to quantitatively measure

CellProfiler 734 Jan 08, 2023