MICOM is a Python package for metabolic modeling of microbial communities

Overview

Welcome

MICOM is a Python package for metabolic modeling of microbial communities currently developed in the Gibbons Lab at the Institute for Systems Biology and the Human Systems Biology Group of Prof. Osbaldo Resendis Antonio at the National Institute of Genomic Medicine Mexico.

MICOM allows you to construct a community model from a list of input COBRA models and manages exchange fluxes between individuals as well as between individuals and the environment. It explicitly accounts for the different abundances of individuals in the community and can thus incorporate data from biomass quantification, cytometry, amplicon sequencing, or metagenomic shotgun sequencing.

It identifies a relevant flux space by incorporating an ecological model for the trade-off between the growth of individual taxa and community-wide growth, an approach that shows good agreement with experimental data.

Attribution

MICOM is published in

MICOM: Metagenome-Scale Modeling To Infer Metabolic Interactions in the Gut Microbiota
Christian Diener, Sean M. Gibbons, Osbaldo Resendis-Antonio
mSystems 5:e00606-19
https://doi.org/10.1128/mSystems.00606-19

Please cite this publication when referencing MICOM. Thanks 😄

Installation

MICOM is available on PyPI and can be installed via

pip install micom
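
After installing, one quick way to confirm that the package is visible to the current interpreter is to look it up without importing it. The sketch below uses only the standard library; the helper name `is_installed` is made up for this illustration and is not part of MICOM.

```python
from importlib import util


def is_installed(package: str) -> bool:
    """Return True if `package` can be found on the current import path."""
    return util.find_spec(package) is not None


if __name__ == "__main__":
    # Expected to report True once `pip install micom` has succeeded
    # in the same environment as this interpreter.
    print(is_installed("micom"))
```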

Getting started

Documentation can be found at https://micom-dev.github.io/micom .

Getting help

General questions on usage can be asked in GitHub Discussions
https://github.com/micom-dev/micom/discussions
We are also available on the cobrapy Gitter channel
https://gitter.im/opencobra/cobrapy
Questions specific to the MICOM QIIME 2 plugin (q2-micom) can also be asked on the QIIME 2 forum
https://forum.qiime2.org/c/community-plugin-support/
Comments
  • exchanges not identified correctly in CarveME models

    exchanges not identified correctly in CarveME models

    Good morning. I have recently used your program to analyze metabolic exchanges in a community. I have just one question: in the output, I have two columns for each metabolite, for example one called EX_but_e and the other EX_but_m. EX_but_m has a value in only one row, the medium one. I assumed that summing EX_but_e over all the microbes in my community would give the value in the medium row of the EX_but_m column. But that is not the case. Why?

    Sincerely, Arianna Basile
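
    One possible explanation, stated here as an assumption rather than a confirmed answer: because MICOM accounts for taxon abundances, per-taxon fluxes may need to be scaled by relative abundance before they add up to the medium flux. The sketch below uses made-up numbers (not MICOM output) to show how a plain sum and an abundance-weighted sum can differ.

```python
# Hypothetical numbers for illustration only; these are not MICOM results.
abundances = {"taxon_a": 0.7, "taxon_b": 0.3}   # relative abundances, summing to 1
ex_but_e = {"taxon_a": 2.0, "taxon_b": -1.0}    # per-taxon exchange fluxes

# A plain sum treats every taxon as if it made up the whole community.
plain_sum = sum(ex_but_e.values())

# A weighted sum scales each taxon's flux by its share of the community.
weighted_sum = sum(abundances[taxon] * ex_but_e[taxon] for taxon in ex_but_e)

print(f"plain sum:    {plain_sum:.2f}")
print(f"weighted sum: {weighted_sum:.2f}")
```

    If the weighted sum matches the medium row in a real result, abundance scaling is the likely cause of the mismatch.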

    opened by arianccbasile 14
  • interpretation of plot_exchanges_per_sample default behavior

    interpretation of plot_exchanges_per_sample default behavior

    The "Plotting consumed metabolites" section of the documentation describes the function plot_exchanges_per_sample with default parameter direction='import' as plotting the metabolites consumed by the microbial community.

    In the source code for plot_exchanges_per_sample, the subset with taxon='medium' is selected. But isn't the "import" for the 'medium' actually the export for the microbial community? This seems at odds with the description of plot_exchanges_per_sample(direction='import') as plotting the consumption patterns of the microbes.

    Perhaps I am misinterpreting the direction of the fluxes for the 'medium' in the output of the grow function?

    opened by mmp3 6
  • optimize_all(fluxes=True) and optimize_single(fluxes=True) do not return fluxes

    optimize_all(fluxes=True) and optimize_single(fluxes=True) do not return fluxes

    Even with fluxes=True passed into community.optimize_all(), only maximal growth rates are returned. In fact, community.optimize_single() does not accept the fluxes argument.

    Using version '0.22.6'

    opened by michaelsilverstein 6
  • Fix #37 by allowing the user to define the compression...

    Fix #37 by allowing the user to define the compression...

    …type and level when building a zipped database. Default behavior stays the same (ZIP_STORED with no compression).

    This commit reworks the behavior of the compress parameter and adds the optional parameter compresslevel to micom.workflows.build_database().
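
    To illustrate what a compression type and level mean for a ZIP archive in general, here is a minimal sketch using only Python's standard `zipfile` module. It does not call `build_database`; the helper `zipped_size` and the payload are invented for this example.

```python
import io
import zipfile


def zipped_size(payload: bytes, compression: int, compresslevel=None) -> int:
    """Write `payload` into an in-memory ZIP archive and return the archive size."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(
        buffer, "w", compression=compression, compresslevel=compresslevel
    ) as archive:
        archive.writestr("model.json", payload)
    return len(buffer.getvalue())


# Repetitive text compresses well, which makes the difference easy to see.
payload = b'{"reaction": "EX_ac_m", "flux": 0.0}\n' * 1000

stored = zipped_size(payload, zipfile.ZIP_STORED)
deflated = zipped_size(payload, zipfile.ZIP_DEFLATED, compresslevel=9)

print(f"ZIP_STORED:   {stored} bytes")
print(f"ZIP_DEFLATED: {deflated} bytes")
```

    ZIP_STORED writes the data unchanged, so it is fast but large; ZIP_DEFLATED trades CPU time for a smaller file.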

    opened by nigiord 5
  • Fraction variation influences

    Fraction variation influences "status"

    Hi! I have a question about the "status" flag that is returned when "cooperative_tradeoff" is run. Sometimes the result is "optimal", but most of the time it is "numeric". What does that mean?

    Sincerely, Arianna Basile

    opened by arianccbasile 5
  • Optimize community growth to match known relative abundance

    Optimize community growth to match known relative abundance

    I'm new to MICOM and COBRA, and I'd like to use MICOM to predict the metabolites similar to MAMBO and the Garza et al. 2018 paper's approach which, as I understand it, is to maximize the correlation of microbial growth with the known relative abundances. Is this possible with MICOM?

    I'd guess this would involve changing the community objective function somehow, or are alternate objective functions already supported?

    Thanks!

    opened by krcurtis 5
  • Add basic installation instructions for QP solvers to the docs

    Add basic installation instructions for QP solvers to the docs

    I'm trying to run com.cooperative_tradeoff on a community using some sbml files but cannot because I'm using the default GLPK and don't know how to use IBM Cplex instead. I downloaded it as an academic edition but do not know how to set it as the solver for micom. Thanks in advance!

    opened by jfoldi81 4
  • Error occurs when joining two models after changing their metabolite IDs

    Error occurs when joining two models after changing their metabolite IDs

    Hi, I am trying to build a community from two models and have already changed the metabolite IDs in each model. I renamed the metabolite IDs as KEGG ID + compartment name, such as C12145cytoplasm ('C12145' is the KEGG ID and 'cytoplasm' is the name of the compartment). First, one thing I'd like to double-check: do I need to change the compartment IDs as well, even though I didn't use them for matching models?

    Secondly, when I tried to join the two renamed models an error occurred:

    File "/Users/wintermute/opt/anaconda3/lib/python3.7/urllib/parse.py", line 107, in <genexpr>
        return tuple(x.decode(encoding, errors) if x else '' for x in args)
    AttributeError: 'Model' object has no attribute 'decode'
    

    Here is my code:

    sc = micom.util.load_model('/Users/wintermute/OneDrive - University of Cambridge/cambridge/during_cam/Department/work/code/coculture/yeast7.6_changeID_final.xml')
    
    kp = micom.util.load_model('/Users/wintermute/OneDrive - University of Cambridge/cambridge/during_cam/Department/work/code/coculture/KP_changeID_final3.xml')
    
    community = [sc,kp]
    
    micom.util.join_models(community)
    

    Does anyone have any idea about that? Thanks.

    opened by wintermute221 4
  • Why does problems.solve() return None when status is not optimal?

    Why does problems.solve() return None when status is not optimal?

    Just curious since there is additional information about what did not work in the cobrapy Solution object that could be returned. Or does it just not make sense to return a CommunitySolution object when the optimization did not work? I'm most interested in knowing what the solver status is.

    opened by mmundy42 4
  • Adjust active demands when loading models

    Adjust active demands when loading models

    Problem description

    Custom models and some models from AGORA have active demand reactions for instance for biotin. This can make MICOM fail due to infeasible models in low nutrient settings.

    Code Sample

    See https://github.com/micom-dev/media/issues/2 .

    Suggested fix

    Adjust those demands so that the zero flux solution is included. Raise a warning or info in the logger if that is the case.
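
    The suggested fix could look roughly like the sketch below. It is not MICOM's implementation: the `Reaction` class is a hypothetical stand-in for a model reaction, and the reaction ID `DM_btn` is invented for the example.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("demand_check")


@dataclass
class Reaction:
    """Hypothetical stand-in for a model reaction with flux bounds."""

    id: str
    lower_bound: float
    upper_bound: float


def include_zero_flux(reaction: Reaction) -> bool:
    """Widen the bounds so that a flux of 0 is feasible. Return True if changed."""
    changed = False
    if reaction.lower_bound > 0:
        reaction.lower_bound = 0.0
        changed = True
    if reaction.upper_bound < 0:
        reaction.upper_bound = 0.0
        changed = True
    if changed:
        logger.warning("Relaxed bounds of %s to include zero flux.", reaction.id)
    return changed


# A demand that forces a minimum flux of 0.1 (made-up values).
demand = Reaction("DM_btn", lower_bound=0.1, upper_bound=1000.0)
include_zero_flux(demand)
print(demand)
```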

    opened by cdiener 3
  • build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.

    build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.

    Following your recommendation from #31 , I execute build_database as follows:

    # df has columns: 
    #     id   kingdom   phylum  class  order  family   genus     species    file
    > m = build_database( manifest = df , out_path = "/data/out" , threads = 8 )
    

    which gives a warning message (but still completes successfully):

    /usr/local/lib/python3.8/dist-packages/micom/workflows/build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.
      meta.index = meta[rank].str.replace("[^\\w\\_]", "_")
    Running ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━  79% 0:02:12
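
    The warning comes from calling pandas' `Series.str.replace` without an explicit `regex` argument; in pandas versions that emit it, passing `regex=True` should keep the current behaviour. As a sketch of what the pattern itself does, the standard-library equivalent below replaces every non-word character with an underscore. The function name `sanitize` is invented here and the taxon names are examples only.

```python
import re

# Same character class as in the warning: anything that is not a word character.
NON_WORD = re.compile(r"[^\w_]")


def sanitize(name: str) -> str:
    """Replace every non-word character in `name` with an underscore."""
    return NON_WORD.sub("_", name)


for name in ["Escherichia coli", "Bacteroides sp. 1"]:
    print(name, "->", sanitize(name))
```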
    

    System information:

    $ python3  -c "import micom; micom.show_versions()"
    
    System Information
    ==================
    OS                  Linux
    OS-release 5.4.0-1030-aws
    Python              3.8.5
    
    Package Versions
    ================
    cobra        0.21.0
    jinja2       2.10.1
    micom        0.22.5
    pip          20.0.2
    scikit-learn 0.24.1
    scipy         1.6.1
    setuptools   45.2.0
    symengine     0.7.2
    wheel        0.34.2
    
    opened by mmp3 3
  • Error while copying model using model.copy()

    Error while copying model using model.copy()

    Discussed in https://github.com/micom-dev/micom/discussions/69

    Originally posted by anubhavdas0907 March 8, 2022

    Hello Christian,

    I was trying to copy a community model to a different variable, but I get an error. I want to manipulate a model, without changing the original one.

    Following are the details.

    from micom import load_pickle
    model = load_pickle("ERR1883210.pickle")
    model_1 = model.copy()
    
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/data/Conda_base/envs/MyConda/lib/python3.7/site-packages/cobra/core/model.py", line 321, in copy
        new = self.__class__()
    TypeError: __init__() missing 1 required positional argument: 'taxonomy'
    

    Can you please suggest, what's wrong in this code, and what can be the solution?

    Regards Anubhav
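
    The traceback suggests that the parent class's `copy()` creates the new object with `self.__class__()`, which fails when a subclass requires constructor arguments. The sketch below reproduces that pattern with hypothetical toy classes (`ToyModel` and `ToyCommunity` are not the real cobrapy or MICOM classes) and shows `copy.deepcopy` as one generic way to clone such an object. Whether that is the recommended approach for a MICOM community model is not confirmed here.

```python
import copy


class ToyModel:
    """Hypothetical parent whose copy() assumes a no-argument constructor."""

    def __init__(self):
        self.reactions = []

    def copy(self):
        new = self.__class__()  # fails if a subclass requires constructor arguments
        new.reactions = list(self.reactions)
        return new


class ToyCommunity(ToyModel):
    """Hypothetical subclass with a required constructor argument."""

    def __init__(self, taxonomy):
        super().__init__()
        self.taxonomy = taxonomy


com = ToyCommunity(taxonomy=["Escherichia_coli_1"])

try:
    com.copy()
except TypeError as err:
    print(f"copy() failed: {err}")

# deepcopy rebuilds the object from its state without calling __init__.
clone = copy.deepcopy(com)
clone.taxonomy.append("Escherichia_coli_2")
print(com.taxonomy, clone.taxonomy)
```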

    opened by cdiener 2
  • Improve the warning for media metabolites with a missing transport reaction

    Improve the warning for media metabolites with a missing transport reaction

    Checklist

    Is your feature related to a problem? Please describe it.

    The warning regarding media components with missing imports is a bit too fatalistic given that it is not a real problem in most cases. See #63 for instance.

    Describe the solution you would like.

    Maybe change it to an info message and only trigger a warning if a large number of metabolites cannot be consumed or if there are no consumable carbon and nitrogen sources.

    Describe alternatives you considered

    Just changing the text of the warning, but it may still be too verbose.

    Additional context

    opened by cdiener 0
  • Update and fix the docs

    Update and fix the docs

    Checklist

    Bundled action items

    This issue is a collection of small things that should be fixed in the docs.

    • [ ] fix links in the docs
    • [ ] provide a section summarizing the available DBs and media
    • [ ] Improve the workflows section and maybe split it
    • [ ] give more info and the tradeoff parameter and how to choose it
    • [ ] update the theoretical/methods intro
    • [ ] document the QIIME2 artifact readers
    • [ ] update the docs for elasticities
    • [x] add solver installation to docs
    opened by cdiener 0
  • Support for GTDB taxonomy?

    Support for GTDB taxonomy?

    Checklist

    Is your feature related to a problem? Please describe it.

    The Genome Taxonomy Database (GTDB) is comprehensive (especially the new v202 release) and more robust than the NCBI microbial taxonomy, especially given that the GTDB taxonomy is based entirely on genome phylogenetic relatedness.

    Although the MICOM docs are vague about the taxonomy that one must use, it appears that the NCBI taxonomy is required.

    Describe the solution you would like.

    Provide direct support for the GTDB taxonomy.

    opened by nick-youngblut 3
  • [MICOM 1.0 API] Proposed new format for fluxes

    [MICOM 1.0 API] Proposed new format for fluxes

    This is a proposal for a new format for fluxes slated for MICOM 1.0. Feel free to comment :smile:

    Checklist

    Current state

    The current format for fluxes returned by MICOM is a table in wide format:

    In [1]: from micom import Community
    
    In [2]: from micom.data import test_taxonomy
    
    In [3]: com = Community(test_taxonomy())
    Building ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
    
    In [7]: sol = com.cooperative_tradeoff(fluxes=True)
    
    In [8]: sol.fluxes
    Out[8]: 
    reaction               ACALD    ACALDt      ACKr    ACONTa    ACONTb     ACt2r          ADK1  ...     SUCDi    SUCOAS      TALA          THD2      TKT1      TKT2       TPI
    compartment                                                                                   ...                                                                          
    Escherichia_coli_1  0.049190 -0.008897 -0.004224  5.999485  5.999485 -0.004224  3.388665e-11  ...  5.017641 -5.017641  1.489184  1.924736e-10  1.489184  1.173698  7.513137
    Escherichia_coli_2 -0.079989 -0.115231  0.072559  6.001066  6.001066  0.072559  4.264225e-11  ...  5.033051 -5.033051  1.491048  1.924125e-10  1.491048  1.175562  7.495742
    Escherichia_coli_3  0.102350  0.197394 -0.100513  6.004985  6.004985 -0.100513  3.662292e-11  ...  5.083935 -5.083935  1.506075  1.926208e-10  1.506075  1.190589  7.460396
    Escherichia_coli_4 -0.071551 -0.073266  0.032177  6.023463  6.023463  0.032177  4.133342e-11  ...  5.122875 -5.122875  1.501628  1.926284e-10  1.501628  1.186143  7.440253
    medium                   NaN       NaN       NaN       NaN       NaN       NaN           NaN  ...       NaN       NaN       NaN           NaN       NaN       NaN       NaN
    
    [5 rows x 115 columns]
    
    

    This has resulted in some issues:

    1. It is incompatible with cobra.Solution.fluxes which breaks a lot of the cobra functionality like for instance summary methods.
    2. It can be pretty sparse for very divergent models (many NA entries)
    3. It mixes medium and taxa fluxes
    4. It does not specify whether exchange fluxes denote import or export, which is one of the most common help requests we receive
    5. Basically all methods using flux results in MICOM will convert them to a long format
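
    As a rough illustration of the difference between the two layouts, the sketch below flattens a small wide table into long records using plain Python. The data values and the helper `to_long` are invented for the example; this is not the conversion code used inside MICOM.

```python
import math

# Hypothetical wide-format fluxes: one entry per taxon, one key per reaction.
wide = {
    "Escherichia_coli_1": {"ACALD": 0.049, "ACKr": -0.004},
    "Escherichia_coli_2": {"ACALD": -0.080, "ACKr": float("nan")},
}


def to_long(wide_fluxes):
    """Flatten {taxon: {reaction: flux}} into one record per observed flux."""
    records = []
    for taxon, fluxes in wide_fluxes.items():
        for reaction, flux in fluxes.items():
            if math.isnan(flux):
                continue  # a missing reaction simply produces no row
            records.append({"reaction": reaction, "taxon": taxon, "flux": flux})
    return records


for row in to_long(wide):
    print(row)
```

    In the long layout, reactions that a taxon lacks do not appear at all, so the sparsity described in point 2 disappears.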

    Proposed new API for fluxes

    CommunitySolution.fluxes will retain the cobrapy format and will be superseded by new accessors that all return fluxes in long format:

    CommunitySolution.exchange_fluxes

    Similar to the previous one but with the taxa annotated.

          reaction                     name               taxon          flux direction                       micom_id
    0      EX_ac_m     ac_m medium exchange              medium  1.814984e-11    export                        EX_ac_m
    1   EX_acald_m  acald_m medium exchange              medium  1.328645e-11    export                     EX_acald_m
    2     EX_akg_m    akg_m medium exchange              medium  3.225128e-12    export                       EX_akg_m
    3     EX_co2_m    co2_m medium exchange              medium  2.280983e+01    export                       EX_co2_m
    4    EX_etoh_m   etoh_m medium exchange              medium  1.515389e-11    export                      EX_etoh_m
    ..         ...                      ...                 ...           ...       ...                           
    

    CommunitySolution.internal_fluxes

        reaction                                               name               taxon          flux                    micom_id
    0      ACALD           Acetaldehyde dehydrogenase (acetylating)  Escherichia_coli_1  1.312146e+00   ACALD__Escherichia_coli_1
    1     ACALDt                  Acetaldehyde reversible transport  Escherichia_coli_1  3.236132e+00  ACALDt__Escherichia_coli_1
    2       ACKr                                     Acetate kinase  Escherichia_coli_1 -1.304078e+00    ACKr__Escherichia_coli_1
    3     ACONTa   Aconitase (half-reaction A, Citrate hydro-lyase)  Escherichia_coli_1  5.987675e+00  ACONTa__Escherichia_coli_1
    4     ACONTb  Aconitase (half-reaction B, Isocitrate hydro-l...  Escherichia_coli_1  5.987675e+00  ACONTb__Escherichia_coli_1
    

    This will consolidate GrowthResults and CommunitySolution and gives a more readable format. All those properties are generated on the fly when accessing the property.

    Additionally, we may also want to save the annotations in the solution, but they may be large, so it might be better to have a property on the model class like Community.annotations.

    Additional context

    A similar format change is planned for Community.knockout_taxa. elasticities already uses a long format.

    feature 
    opened by cdiener 0
  • Implement more checks and help for model tables

    Implement more checks and help for model tables

    The formats of the taxonomy table and of community model manifests can be unclear, and the two are often confused with one another. Provide a validation helper and better error messages.
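
    A validation helper of this kind might look like the sketch below. The column set is copied from the manifest shown in an earlier issue on this page and is an assumption, not the package's actual requirement; the function names are invented for the example.

```python
# Assumed column set, taken from the manifest example in an earlier issue.
MANIFEST_COLUMNS = {
    "id", "kingdom", "phylum", "class", "order",
    "family", "genus", "species", "file",
}


def missing_columns(columns, required=MANIFEST_COLUMNS):
    """Return the required column names absent from `columns`, sorted."""
    return sorted(set(required) - set(columns))


def validate(columns, required=MANIFEST_COLUMNS):
    """Raise a ValueError that names every missing column."""
    missing = missing_columns(columns, required)
    if missing:
        raise ValueError(f"table is missing required columns: {', '.join(missing)}")


print(missing_columns(["id", "genus", "species", "file"]))
```

    Naming every missing column in one message lets the user fix the table in a single pass.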

    opened by cdiener 0
Releases(v0.32.3)
Owner
Developers of the microbial community modeling package micom.