refineGEMs package
Here is an overview of all functions. All imports are mocked in autodoc_mock_imports in the conf.py file to enable automatic building.
refineGEMs.biomass module
Most functions within this module were copied from the MEMOTE GitHub page and modified by Gwendolyn O. Gusak.
This module provides functions to be used to assess the biomass weight as well as normalise it.
- refinegems.biomass.check_normalise_biomass(model: cobra.Model) cobra.Model | None
Checks if at least one biomass reaction is present
For each found biomass reaction checks if it sums up to 1g[CDW]
Normalises the coefficients of each biomass reaction where the sum is not 1g[CDW] until the sum is 1g[CDW]
Returns model with adjusted biomass function(s)
- Args:
model (cobraModel): Model loaded with COBRApy
- Returns:
cobraModel: COBRApy model with adjusted biomass functions
- refinegems.biomass.normalise_biomass(biomass: cobra.Reaction, current_sum: float) cobra.Reaction
Normalises the coefficients according to current biomass weight to one g[CDW]
- Args:
biomass (Reaction): Biomass function/reaction
current_sum (float): Biomass weight calculated with sum_biomass_weight in g/mmol
- Returns:
Reaction: Biomass function/reaction with updated coefficients
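A minimal sketch of the normalisation idea, assuming it amounts to dividing every coefficient by the current biomass weight. The real function works on a cobra.Reaction; the plain dictionary, the helper name normalise_coefficients and the example values are hypothetical.

```python
def normalise_coefficients(coefficients: dict, current_sum: float) -> dict:
    """Scale all coefficients so that the biomass weight becomes 1 g[CDW]."""
    if current_sum <= 0:
        raise ValueError("current_sum must be positive")
    return {met: coeff / current_sum for met, coeff in coefficients.items()}


# Hypothetical biomass reaction whose weight was found to be 1.25 g/mmol
biomass = {"ala__L_c": -0.5, "atp_c": -40.0, "adp_c": 40.0}
scaled = normalise_coefficients(biomass, current_sum=1.25)
print(scaled)
```

Because every coefficient is scaled by the same factor, the ratios between the biomass components stay unchanged.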
- refinegems.biomass.sum_biomass_weight(reaction: cobra.Reaction) float
From MEMOTE: https://github.com/opencobra/memote/blob/81a55a163262a0e06bfcb036d98e8e551edc3873/src/memote/support/biomass.py#L95
Compute the sum of all reaction compounds.
This function expects all metabolites of the biomass reaction to have formula information assigned.
Parameters
- reaction: cobra.core.reaction.Reaction
The biomass reaction of the model under investigation.
Returns
- float
The molecular weight of the biomass reaction in units of g/mmol.
- refinegems.biomass.test_biomass_consistency(model: cobra.Model, reaction_id: str) float | str
Modified from MEMOTE: https://github.com/opencobra/memote/blob/81a55a163262a0e06bfcb036d98e8e551edc3873/src/memote/suite/tests/test_biomass.py#L89
Expect biomass components to sum up to 1 g[CDW].
This test only yields sensible results if all biomass precursor metabolites have chemical formulas assigned to them. The molecular weight of the biomass reaction in metabolic models is defined to be equal to 1 g/mmol. Conforming to this is essential in order to be able to reliably calculate growth yields, to cross-compare models, and to obtain valid predictions when simulating microbial consortia. A deviation from 1 - 1E-03 to 1 + 1E-06 is accepted.
Implementation: Multiplies the coefficient of each metabolite of the biomass reaction with its molecular weight calculated from the formula, then divides the overall sum of all the products by 1000.
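The calculation described above can be sketched without COBRApy. The sketch assumes that consumed precursors carry negative coefficients, so the sign is flipped before summing; the tuple layout and the example values are hypothetical.

```python
def biomass_weight(components: list) -> float:
    """Sum coefficient * molecular weight over all components, in g/mmol.

    Each entry is (coefficient, molecular weight in g/mol). Consumed
    precursors are assumed to have negative coefficients, hence the sign flip.
    """
    return sum(-coeff * weight for coeff, weight in components) / 1000.0


def is_consistent(weight: float) -> bool:
    """Accept the deviation window stated above: 1 - 1E-03 to 1 + 1E-06."""
    return (1 - 1e-03) < weight < (1 + 1e-06)


# Hypothetical precursors: (coefficient, molecular weight in g/mol)
components = [(-5.0, 100.0), (-2.0, 250.0)]
weight = biomass_weight(components)
print(weight, is_consistent(weight))
```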
- refinegems.biomass.test_biomass_presence(model: cobra.Model) list[str] | None
Modified from MEMOTE: https://github.com/opencobra/memote/blob/81a55a163262a0e06bfcb036d98e8e551edc3873/src/memote/suite/tests/test_biomass.py#LL42C3-L42C3
Expect the model to contain at least one biomass reaction.
The biomass composition aka biomass formulation aka biomass reaction is a common pseudo-reaction accounting for biomass synthesis in constraints-based modelling. It describes the stoichiometry of intracellular compounds that are required for cell growth. While this reaction may not be relevant to modeling the metabolism of higher organisms, it is essential for single-cell modeling.
Implementation: Identifies possible biomass reactions using two principal steps:
1. Return reactions that include the SBO annotation “SBO:0000629” for biomass.
If no reactions can be identified this way:
1. Look for the buzzwords “biomass”, “growth” and “bof” in reaction IDs.
2. Look for metabolite IDs or names that contain the buzzword “biomass” and obtain the set of reactions they are involved in.
3. Remove boundary reactions from this set.
4. Return the union of reactions that match the buzzwords and of the reactions that metabolites are involved in that match the buzzword.
This test checks if at least one biomass reaction is present.
If no reaction can be identified return None.
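The two-step search described above can be sketched on plain dictionaries. This is an illustration of the logic only, not the MEMOTE or refineGEMs code; the dictionary keys 'id', 'sbo', 'metabolites' and 'boundary' and the function name are hypothetical.

```python
from typing import Optional

BUZZWORDS = ("biomass", "growth", "bof")


def find_biomass_reactions(reactions: list) -> Optional[list]:
    """Return IDs of likely biomass reactions, or None if nothing matches."""
    # Step 1: trust the SBO annotation if any reaction carries it
    by_sbo = [r["id"] for r in reactions if r.get("sbo") == "SBO:0000629"]
    if by_sbo:
        return by_sbo

    # Fallback: buzzwords in reaction IDs
    by_id = {r["id"] for r in reactions
             if any(word in r["id"].lower() for word in BUZZWORDS)}
    # Fallback: non-boundary reactions involving a "biomass" metabolite
    by_metabolite = {r["id"] for r in reactions
                     if not r.get("boundary", False)
                     and any("biomass" in m.lower() for m in r.get("metabolites", []))}
    found = sorted(by_id | by_metabolite)
    return found or None


model = [
    {"id": "PFK", "sbo": "SBO:0000176", "metabolites": ["f6p_c", "fdp_c"], "boundary": False},
    {"id": "Growth", "sbo": None, "metabolites": ["biomass_c"], "boundary": False},
    {"id": "EX_bm_e", "sbo": None, "metabolites": ["biomass_e"], "boundary": True},
]
hits = find_biomass_reactions(model)
print(hits)
```

In this toy model no reaction carries the biomass SBO term, so the fallback applies: "Growth" matches a buzzword, while the boundary reaction is dropped from the metabolite-based set.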
refineGEMs.charges module
Provides functions for adding charges to metabolites
When iterating through all metabolites present in a model, you will find several which have no defined charge (metab.getPlugin(‘fbc’).isSetCharge() = false). This can lead to charge imbalanced reactions. This script takes information on metabolite charges from the ModelSEED database. A charge is automatically added to a metabolite if it has no defined charge and if there is only one charge denoted in ModelSEED. When multiple charges are present, the metabolite and the possible charges are noted and later returned in a dictionary.
It is possible to use the correct_charges_from_db function with other databases. The user just needs to make sure that the compounds dataframe has a ‘BiGG’ and a ‘charge’ column.
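The decision rule described above (add a charge only when it is missing and the database lists exactly one value, otherwise report the candidates) can be sketched as follows. The real functions operate on a libSBML model and a pandas DataFrame; the plain dictionaries and the name assign_charges are hypothetical.

```python
from typing import Optional


def assign_charges(metabolites: dict, db_charges: dict) -> tuple:
    """Fill in missing charges where the database offers exactly one value.

    metabolites maps a BiGG ID to its charge (None = not set);
    db_charges maps a BiGG ID to all charges listed in the database.
    """
    updated = dict(metabolites)
    ambiguous = {}
    for bigg_id, charge in metabolites.items():
        if charge is not None:
            continue  # a defined charge is never overwritten
        candidates = sorted(set(db_charges.get(bigg_id, [])))
        if len(candidates) == 1:
            updated[bigg_id] = candidates[0]
        elif len(candidates) > 1:
            ambiguous[bigg_id] = candidates  # left for manual inspection
    return updated, ambiguous


model_charges = {"atp": -4, "h": None, "fe": None}  # type: dict[str, Optional[int]]
database = {"atp": [-4], "h": [1], "fe": [2, 3]}
updated, ambiguous = assign_charges(model_charges, database)
print(updated, ambiguous)
```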
- refinegems.charges.correct_charges_from_db(model: libsbml.Model, compounds: pandas.DataFrame) tuple[libsbml.Model, dict]
Adds charges taken from given database to metabolites which have no defined charge
- Args:
model (libModel): Model loaded with libsbml
compounds (pd.DataFrame): Containing database data with ‘BiGG’ (BiGG-Ids) and ‘charge’ (float or int) as columns
- Returns:
- tuple: libSBML model (1) & dictionary ‘metabolite_id’: list(charges) (2)
libModel: Model with added charges
dict: Metabolites with respective multiple charges
- refinegems.charges.correct_charges_modelseed(model: libsbml.Model) tuple[libsbml.Model, dict]
Wrapper function which completes the steps to charge correction with the ModelSEED database
- Args:
model (libModel): Model loaded with libsbml
- Returns:
- tuple: libSBML model (1) & dictionary ‘metabolite_id’: list(charges) (2)
libModel: Model with added charges
dict: Metabolites with respective multiple charges
refineGEMs.comparison module
Provides functions to compare and visualize multiple models
Can mainly be used to compare growth behaviour of multiple models. All other stats are shown in the memote report.
- refinegems.comparison.get_sbo_mapping_multiple(models: list[libsbml.Model]) pandas.DataFrame
Determines number of reactions per SBO Term and adds label of SBO Terms
- Args:
models (list[libModel]): Models loaded with libSBML
- Returns:
pd.DataFrame: SBO Terms, number of reactions per Model and SBO Label
- refinegems.comparison.plot_heatmap_dt(growth: pandas.DataFrame)
Creates heatmap of simulated doubling times with additives
- Args:
growth (pd.DataFrame): Containing growth data from simulate_all
- Returns:
plot: Seaborn Heatmap
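For orientation, a doubling time follows from a growth rate via the standard relation ln(2) / growth rate. The sketch below assumes the growth rate is given per hour and reports minutes; whether the heatmap uses minutes or hours is not stated here, and the function name is hypothetical.

```python
import math


def doubling_time_minutes(growth_rate_per_hour: float) -> float:
    """Convert a growth rate (1/h) into a doubling time in minutes."""
    if growth_rate_per_hour <= 0:
        return math.inf  # no growth means the population never doubles
    return math.log(2) / growth_rate_per_hour * 60


print(doubling_time_minutes(0.5))
```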
- refinegems.comparison.plot_heatmap_native(growth: pandas.DataFrame)
Creates a plot in which growth without additives, where possible, is marked from yellow to green, otherwise black
- Args:
growth (pd.DataFrame): Containing growth data from simulate_all
- Returns:
plot: Seaborn Heatmap
- refinegems.comparison.plot_initial_analysis(models: list[libsbml.Model])
Creates bar plot of number of entities per Model
- Args:
models (list[libModel]): Models loaded with libSBML
- Returns:
plot: Pandas Barchart
- refinegems.comparison.plot_rea_sbo_multiple(models: list[libsbml.Model], rename=None)
Plots reactions per SBO Term in horizontal bar chart with stacked bars for the models
- Args:
models (list[libModel]): Models loaded with libSBML
rename (dict, optional): Rename model ids to custom names. Defaults to None.
- Returns:
plot: Pandas stacked barchart
- refinegems.comparison.plot_venn(models: list[cobra.Model], entity: str, perc: bool = False, rename=None)
Creates Venn diagram to show the overlap of model entities
- Args:
models (list[cobraModel]): Models loaded with cobrapy
entity (str): Compare on metabolite|reaction
perc (bool, optional): True if percentages should be used. Defaults to False.
rename (dict, optional): Rename model ids to custom names. Defaults to None.
- Returns:
plot: Venn diagram
- refinegems.comparison.simulate_all(models: list[cobra.Model], media: list[str], basis: str, anaerobic: bool) pandas.DataFrame
Does a run of growth simulation for multiple models on different media
- Args:
models (list[cobraModel]): Models loaded with cobrapy
media (list[str]): Media of interest (e.g. LB, M9, …)
basis (str): Either default_uptake (adding metabs from default) or minimal_uptake (adding metabs from minimal medium)
anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions
- Returns:
pd.DataFrame: table containing the results of the growth simulation
refineGEMs.curate module
Functions to enable annotation of entities using a manually curated table
While working on GEMs, the user might come across ill-annotated or missing metabolites, reactions and genes. This module aims to speed up manual curation by letting the user edit an Excel table directly, which is then used to update the given model. This module also makes use of the cvterms module.
- refinegems.curate.add_reactions_from_table(model: libsbml.Model, table: pandas.DataFrame, email: str) libsbml.Model
Wrapper function to use with table format given in data/manual_curation.xlsx, sheet gapfill: Adds all reactions with their info given in the table to the given model
- Args:
model (libModel): Model loaded with libSBML
table (pd.DataFrame): Table in format of sheet gapfill from manual_curation.xlsx located in the data folder
email (str): User Email to access the NCBI Entrez database
- Returns:
libModel: Modified model with new reactions
- refinegems.curate.update_annotations_from_others(model: libsbml.Model) libsbml.Model
Synchronizes metabolite annotations for core, periplasm and extracellular
- Args:
model (libModel): Model loaded with libSBML
- Returns:
libModel: Modified model with synchronized annotations
- refinegems.curate.update_annotations_from_table(model: libsbml.Model, table: pandas.DataFrame) libsbml.Model
Wrapper function to use with table format given in data/manual_curation.xlsx, sheet metabs: Updates annotation of metabolites given in the table
- Args:
model (libModel): Model loaded with libSBML
table (pd.DataFrame): Table in format of sheet metabs from manual_curation.xlsx located in the data folder
- Returns:
libModel: Modified model with new annotations
refineGEMs.cvterms module
Helper module to work with annotations (CVTerms)
Stores dictionaries which hold information on the identifiers.org syntax, and provides functions to add CVTerms to different entities and to parse CVTerms.
- refinegems.cvterms.add_cv_term_genes(entry: str, db_id: str, gene: libsbml.GeneProduct, lab_strain: bool = False)
Adds CVTerm to a gene
- Args:
entry (str): Id to add as annotation
db_id (str): Database to which entry belongs. Must be in gene_db_dict.keys().
gene (GeneProduct): Gene to add CVTerm to
lab_strain (bool, optional): For locally sequenced strains the qualifiers are always HOMOLOG_TO. Defaults to False.
- refinegems.cvterms.add_cv_term_metabolites(entry: str, db_id: str, metab: libsbml.Species)
Adds CVTerm to a metabolite
- Args:
entry (str): Id to add as annotation
db_id (str): Database to which entry belongs. Must be in metabol_db_dict.keys().
metab (Species): Metabolite to add CVTerm to
- refinegems.cvterms.add_cv_term_pathways(entry: str, db_id: str, path: libsbml.Group)
Add CVTerm to a groups pathway
- Args:
entry (str): Id to add as annotation
db_id (str): Database to which entry belongs. Must be in pathway_db_dict.keys().
path (Group): Pathway to add CVTerm to
- refinegems.cvterms.add_cv_term_pathways_to_entity(entry: str, db_id: str, reac: libsbml.Reaction)
Add CVTerm to a reaction as OCCURS IN pathway
- Args:
entry (str): Id to add as annotation
db_id (str): Database to which entry belongs
reac (Reaction): Reaction to add CVTerm to
- refinegems.cvterms.add_cv_term_reactions(entry: str, db_id: str, reac: libsbml.Reaction)
Adds CVTerm to a reaction
- Args:
entry (str): Id to add as annotation
db_id (str): Database to which entry belongs. Must be in reaction_db_dict.keys().
reac (Reaction): Reaction to add CVTerm to
- refinegems.cvterms.add_cv_term_units(unit_id: str, unit: libsbml.Unit, relation: int)
Adds CVTerm to a unit
- Args:
unit_id (str): ID to add as URI to annotation
unit (Unit): Unit to add CVTerm to
relation (int): Provides model qualifier to be added
- refinegems.cvterms.generate_cvterm(qt, b_m_qt) libsbml.CVTerm
Generates a CVTerm with the provided qualifier & biological or model qualifier types
- Args:
qt (libSBML qualifier type): BIOLOGICAL_QUALIFIER or MODEL_QUALIFIER
b_m_qt (libSBML qualifier): BQM_IS, BQM_IS_HOMOLOG_TO, etc.
- Returns:
CVTerm: With provided qualifier & biological or model qualifier types
- refinegems.cvterms.get_id_from_cv_term(entity: libsbml.SBase, db_id: str) list[str]
Extract Id for a specific database from CVTerm
- Args:
entity (SBase): Species, Reaction, Gene, Pathway
db_id (str): Database of interest
- Returns:
list[str]: Ids of entity belonging to db_id
- refinegems.cvterms.print_cvterm(cvterm: libsbml.CVTerm)
Debug function: Prints the URIs contained in the provided CVTerm along with the provided qualifier & biological/model qualifier types
- Args:
cvterm (CVTerm): A libSBML CVTerm
refineGEMs.gapfill module
The gapfill module can be used with KEGG, where only the KEGG organism ID is needed, with BioCyc, or with both (Options: ‘KEGG’, ‘BioCyc’, ‘KEGG+BioCyc’). For how to obtain the BioCyc tables, see ‘Filling gaps with refineGEMs’ > ‘Automated gap filling’ in the documentation.
Run times:
‘KEGG’: ~ 2h
‘BioCyc’: ~ 45mins - 1h
‘KEGG+BioCyc’: ~ 3 - 4h
- refinegems.gapfill.gap_analysis(model_libsbml: libsbml.Model, gapfill_params: dict[str, str], filename: str) pandas.DataFrame | tuple
Main function to infer gaps in a model by comparing the locus tags of the GeneProducts to KEGG/BioCyc/both
- Args:
model_libsbml (libModel): Model loaded with libSBML
gapfill_params (dict): Dictionary obtained from YAML file containing the parameter mappings
filename (str): Path to output file for gapfill analysis result
- Returns:
- Case ‘KEGG’
pd.DataFrame: Table containing the columns ‘bigg_id’ ‘locus_tag’ ‘EC’ ‘KEGG’ ‘name’ ‘GPR’
- Case ‘BioCyc’
- tuple: Four tables (1) - (4)
- pd.DataFrame: Gap fill statistics with the columns
‘Missing entity’ ‘Total’ ‘Have BiGG ID’ ‘Can be added’ ‘Notes’
- pd.DataFrame: Genes with the columns
‘locus_tag’ ‘protein_id’ ‘model_id’ ‘name’
- pd.DataFrame: Metabolites with the columns
‘bigg_id’ ‘name’ ‘BioCyc’ ‘compartment’ ‘Chemical Formula’ ‘InChI-Key’ ‘ChEBI’ ‘charge’
- pd.DataFrame: Reactions with the columns
‘bigg_id’ ‘name’ ‘BioCyc’ ‘locus_tag’ ‘Reactants’ ‘Products’ ‘EC’ ‘Fluxes’ ‘Spontaneous?’ ‘bigg_reaction’
- Case ‘KEGG+BioCyc’:
- tuple: Five tables (1)-(4) from output of ‘BioCyc’ & (5) from output of ‘KEGG’
-> Table reactions contains additionally column ‘KEGG’
- refinegems.gapfill.gapfill(model_libsbml: libsbml.Model, gapfill_params: dict[str, str], filename: str) tuple[pandas.DataFrame, libsbml.Model] | tuple[tuple, libsbml.Model]
Main function to fill gaps in a model by comparing the locus tags of the GeneProducts to KEGG/BioCyc/(Genbank) GFF file
- Args:
model_libsbml (libModel): Model loaded with libSBML
gapfill_params (dict): Dictionary obtained from YAML file containing the parameter mappings
filename (str): Path to output file for gapfill analysis result
gapfill_model_out (str): Path where gapfilled model should be written to
- Returns:
- tuple: gap_analysis() table(s) (1) & libSBML model (2)
pd.DataFrame|tuple(pd.DataFrame): Result from function gap_analysis()
libModel: Gap filled model
- refinegems.gapfill.gapfill_model(model_libsbml: libsbml.Model, gap_analysis_result: str | tuple) libsbml.Model
Main function to fill gaps in a model from a table
- Args:
model_libsbml (libModel): Model loaded with libSBML
gap_analysis_result (str|tuple): Path to Excel file from gap_analysis|Tuple of pd.DataFrames obtained from gap_analysis
- Returns:
libModel: Gap filled model
refineGEMs.growth module
Provides functions to simulate growth on any medium
Tailored to work with the media denoted in the local db, but should work with any medium as long as it is defined in a csv with ; as delimiter and BiGG Ids for the compounds. Use refinegems.io.load_medium_custom and hand the result to the growth_one_medium_from_default or growth_one_medium_from_minimal function.
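As an illustration of the expected file shape, the following sketch reads a semicolon-delimited medium definition with the standard library. The column names BiGG and flux and the values are placeholders; the exact columns expected by refinegems.io.load_medium_custom are not listed here, so check the media files in the data folder before writing your own.

```python
import csv
import io

# Hypothetical medium definition; column names and values are placeholders
MEDIUM_CSV = "BiGG;flux\nglc__D;10.0\no2;20.0\nnh4;10.0\n"


def read_medium(handle) -> list:
    """Read a ;-delimited medium table into a list of row dictionaries."""
    return list(csv.DictReader(handle, delimiter=";"))


rows = read_medium(io.StringIO(MEDIUM_CSV))
bigg_ids = [row["BiGG"] for row in rows]
print(bigg_ids)
```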
- refinegems.growth.find_additives(model: cobra.Model, base_medium: dict) pandas.DataFrame
Iterates through all exchanges to find metabolites that lead to a higher growth rate compared to the growth rate yielded on the base_medium
- Args:
model (cobraModel): Model loaded with COBRApy
base_medium (dict): Exchanges as keys and their flux bound as value (e.g. {‘EX_glc__D_e’ : 10.0})
- Returns:
pd.DataFrame: Exchanges sorted from highest to lowest growth rate improvement
- refinegems.growth.find_minimum_essential(medium: pandas.DataFrame, essential: list[str]) list[str]
Report metabolites necessary for growth and not in custom medium
- Args:
medium (pd.DataFrame): Dataframe with medium definition
essential (list[str]): Ids of all metabolites which lead to zero growth if blocked. Output of find_missing_essential.
- Returns:
list[str]: Ids of exchanges of metabolites not present in the medium but necessary for growth
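Conceptually this is a set difference between the essential exchanges and the exchanges supplied by the medium. A minimal sketch on plain lists (the real function takes the medium as a DataFrame; the helper name and example IDs are hypothetical):

```python
def missing_essentials(medium_exchanges: list, essential: list) -> list:
    """Return essential exchanges that the medium does not supply, keeping order."""
    supplied = set(medium_exchanges)
    return [ex for ex in essential if ex not in supplied]


medium = ["EX_glc__D_e", "EX_nh4_e", "EX_pi_e"]
essential = ["EX_glc__D_e", "EX_fe2_e", "EX_pi_e", "EX_so4_e"]
minimum = missing_essentials(medium, essential)
print(minimum)
```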
- refinegems.growth.find_missing_essential(model: cobra.Model, growth_medium: dict, default_uptake: list[str], anaerobic: bool) list[str]
Report which exchange reactions are needed for growth, combines default uptake and valid new medium
- Args:
model (cobraModel): Model loaded with COBRApy
growth_medium (dict): Growth medium definition that can be used with the model. Output of modify_medium.
default_uptake (list[str]): Metabolites consumed in standard medium
anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions
- Returns:
list[str]: Ids of exchanges of all metabolites which lead to zero growth if blocked
- refinegems.growth.get_all_minimum_essential(model: cobra.Model, media: list[str]) pandas.DataFrame
Returns metabolites necessary for growth and not in media
- Args:
model (cobraModel): Model loaded with COBRApy
media (list[str]): Containing the names of all media for which the growth essential metabolites not contained in the media should be returned
- Returns:
pd.DataFrame: Information on which metabolites are missing in each medium
- refinegems.growth.get_default_secretion(model: cobra.Model) list[str]
Checks fluxes after FBA, if positive the metabolite is produced
- Args:
model (cobraModel): Model loaded with COBRApy
- Returns:
list[str]: BiGG Ids of produced metabolites
- refinegems.growth.get_default_uptake(model: cobra.Model) list[str]
Determines which metabolites are used in the standard medium
- Args:
model (cobraModel): Model loaded with COBRApy
- Returns:
list[str]: Metabolites consumed in standard medium
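Both helpers above rely on the sign of exchange fluxes after FBA: in COBRApy's convention a positive exchange flux means secretion and a negative one means uptake. A small sketch on a made-up flux dictionary (the flux values and function names are hypothetical):

```python
def split_by_sign(exchange_fluxes: dict) -> tuple:
    """Separate exchange reactions into uptake (flux < 0) and secretion (flux > 0)."""
    uptake = [ex for ex, flux in exchange_fluxes.items() if flux < 0]
    secretion = [ex for ex, flux in exchange_fluxes.items() if flux > 0]
    return uptake, secretion


# Made-up FBA result: exchange reaction ID -> flux
fluxes = {
    "EX_glc__D_e": -10.0,
    "EX_co2_e": 22.8,
    "EX_o2_e": -21.8,
    "EX_h2o_e": 29.2,
    "EX_fe2_e": 0.0,
}
uptake, secretion = split_by_sign(fluxes)
print(uptake, secretion)
```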
- refinegems.growth.get_essential_reactions(model: cobra.Model) list[str]
Knocks out each reaction, if no growth is detected the reaction is seen as essential
- Args:
model (cobraModel): Model loaded with COBRApy
- Returns:
list[str]: BiGG Ids of essential reactions
- refinegems.growth.get_essential_reactions_via_bounds(model: cobra.Model) list[str]
Knocks out reactions by setting their bounds to 0, if no growth is detected the reaction is seen as essential
- Args:
model (cobraModel): Model loaded with COBRApy
- Returns:
list[str]: BiGG Ids of essential reactions
- refinegems.growth.get_growth_selected_media(model: cobra.Model, media: list[str], basis: str, anaerobic: bool) pandas.DataFrame
Simulates growth on all given media
- Args:
model (cobraModel): Model loaded with COBRApy
media (list[str]): Ids of media to simulate on
basis (str): Either default_uptake (adding metabs from default) or minimal_uptake (adding metabs from minimal medium)
anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions
- Returns:
pd.DataFrame: Information on growth behaviour on given media
- refinegems.growth.get_minimal_uptake(model: cobra.Model) list[str]
Determines which metabolites are used in a minimal medium
- Args:
model (cobraModel): Model loaded with COBRApy
- Returns:
list[str]: Metabolites consumed in minimal medium
- refinegems.growth.get_missing_exchanges(model: cobra.Model, medium: pandas.DataFrame) list[str]
Look for exchange reactions needed by the medium but not in the model
- Args:
model (cobraModel): Model loaded with COBRApy
medium (pd.DataFrame): Dataframe with medium definition
- Returns:
list[str]: Ids of all exchanges missing in the model but given in medium
- refinegems.growth.growth_one_medium_from_default(model: cobra.Model, medium: pandas.DataFrame, anaerobic: bool) pandas.DataFrame
Simulates growth on given medium, adding missing metabolites from the default uptake
- Args:
model (cobraModel): Model loaded with COBRApy
medium (pd.DataFrame): Dataframe with medium definition
anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions
- Returns:
pd.DataFrame: Information on growth behaviour on given medium
- refinegems.growth.growth_one_medium_from_minimal(model: cobra.Model, medium: pandas.DataFrame, anaerobic: bool) pandas.DataFrame
Simulates growth on given medium, adding missing metabolites from a minimal uptake
- Args:
model (cobraModel): Model loaded with COBRApy
medium (pd.DataFrame): Dataframe with medium definition
anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions
- Returns:
pd.DataFrame: Information on growth behaviour on given medium
- refinegems.growth.modify_medium(medium: pandas.DataFrame, missing_exchanges: list[str]) dict
Helper function: Remove exchanges from medium that are not in the model to avoid KeyError
- Args:
medium (pd.DataFrame): Dataframe with medium definition
missing_exchanges (list): Ids of exchanges not in the model
- Returns:
dict: Growth medium definition that can be used with the model (e.g. {‘EX_glc__D_e’ : 10.0})
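The interplay of get_missing_exchanges and modify_medium can be sketched on plain containers: first collect the exchanges that the medium names but the model lacks, then drop them from the medium. The real functions take a COBRApy model and a DataFrame; the helper names and values here are hypothetical.

```python
def missing_exchanges(model_exchanges: set, medium_exchanges: list) -> list:
    """Exchanges named in the medium that the model does not contain."""
    return [ex for ex in medium_exchanges if ex not in model_exchanges]


def usable_medium(medium: dict, missing: list) -> dict:
    """Drop exchanges the model lacks so that setting the medium cannot raise a KeyError."""
    skip = set(missing)
    return {ex: bound for ex, bound in medium.items() if ex not in skip}


model_exchanges = {"EX_glc__D_e", "EX_o2_e"}
medium = {"EX_glc__D_e": 10.0, "EX_o2_e": 10.0, "EX_cobalt2_e": 10.0}
missing = missing_exchanges(model_exchanges, list(medium))
growth_medium = usable_medium(medium, missing)
print(missing, growth_medium)
```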
- refinegems.growth.set_fluxes_to_simulate(reaction: cobra.Reaction) cobra.Reaction
Helper function: Set flux bounds to -1000.0 and 1000.0 to enable model simulation with growth_one_medium_from_minimal/default
- Args:
reaction (Reaction): Reaction with unusable flux bounds
- Returns:
Reaction: Reaction with usable flux bounds
- refinegems.growth.simulate_minimum_essential(model: cobra.Model, growth_medium: dict, minimum: list[str], anaerobic: bool) float
Simulate growth with custom medium plus necessary uptakes
- Args:
model (cobraModel): Model loaded with COBRApy
growth_medium (dict): Growth medium definition that can be used with the model. Output of modify_medium.
minimum (list[str]): Ids of exchanges of metabolites not present in the medium but necessary for growth. Output of find_minimum_essential.
anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions
- Returns:
float: Growth value in mmol per (gram dry weight) per hour
refineGEMs.investigate module
Provides functions to investigate the model and test with MEMOTE
These functions enable simple testing of any model using MEMOTE and access to its number of reactions, metabolites and genes.
- refinegems.investigate.get_egc(model: cobra.Model) pandas.DataFrame
Energy-generating cycles represent thermodynamically infeasible states. Charging of energy metabolites without any energy source causes such cycles. Detection method is based on (Fritzemeier et al., 2017)
- Args:
model (cobraModel): Model loaded with COBRApy
- Returns:
pd.DataFrame: Table with possible EGCs
- refinegems.investigate.get_mass_charge_unbalanced(model: cobra.Model) tuple[list[str], list[str]]
Creates lists of mass and charge unbalanced reactions, without exchange reactions since they are unbalanced by definition
- Args:
model (cobraModel): Model loaded with COBRApy
- Returns:
- tuple: Lists of reactions that might cause errors (1) & (2)
list: List of mass unbalanced reactions
list: List of charge unbalanced reactions
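What "mass unbalanced" and "charge unbalanced" mean can be shown with a small stand-alone check. This is not the refineGEMs implementation; the tuple layout (coefficient, element counts, charge) and the function name are hypothetical.

```python
from collections import defaultdict


def imbalance(reaction: list) -> tuple:
    """Return leftover elements and leftover charge of a reaction.

    Each entry is (coefficient, element counts, charge);
    educts carry negative coefficients, products positive ones.
    """
    elements = defaultdict(float)
    charge = 0.0
    for coeff, formula, met_charge in reaction:
        for element, count in formula.items():
            elements[element] += coeff * count
        charge += coeff * met_charge
    leftover = {el: n for el, n in elements.items() if abs(n) > 1e-9}
    return leftover, charge


# H2O <=> H+ + OH- is balanced in both mass and charge
balanced = [(-1.0, {"H": 2, "O": 1}, 0), (1.0, {"H": 1}, 1), (1.0, {"H": 1, "O": 1}, -1)]
# Dropping OH- leaves one H, one O and one charge unaccounted for
unbalanced = balanced[:2]
print(imbalance(balanced), imbalance(unbalanced))
```

A reaction is balanced when both returned values are empty or zero. Exchange reactions fail this check by construction, which is why they are excluded.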
- refinegems.investigate.get_memote_score(memote_report: dict) float
Extracts MEMOTE score from report
- Args:
memote_report (dict): Output from run_memote.
- Returns:
float: MEMOTE score
- refinegems.investigate.get_metabs_with_one_cvterm(model: libsbml.Model) list[str]
Reports metabolites which have only one annotation, can be used as basis for further annotation research
- Args:
model (libModel): Model loaded with libSBML
- Returns:
list: Metabolite Ids with only one annotation
- refinegems.investigate.get_model_info(modelpath: str) pandas.DataFrame
Reports core information of given model
- Args:
modelpath (str): Path to model file
- Returns:
pd.DataFrame: Overview on model parameters
- refinegems.investigate.get_orphans_deadends_disconnected(model: cobra.Model) tuple[list[str], list[str], list[str]]
Uses MEMOTE functions to extract orphans, deadends and disconnected metabolites
- Args:
model (cobraModel): Model loaded with COBRApy
- Returns:
- tuple: Lists of metabolites that might cause errors (1) - (3)
list: List of orphans
list: List of deadends
list: List of disconnected metabolites
- refinegems.investigate.get_reactions_per_sbo(model: libsbml.Model) dict
Counts number of reactions of all SBO Terms present
- Args:
model (libModel): Model loaded with libSBML
- Returns:
dict: SBO Term as keys and number of reactions as values
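The counting step can be sketched with collections.Counter. The sketch assumes the SBO terms have already been read from the reactions; the string format of the keys is an assumption and may differ from what the function returns.

```python
from collections import Counter


def reactions_per_sbo(sbo_terms: list) -> dict:
    """Count how many reactions carry each SBO term."""
    return dict(Counter(sbo_terms))


# One (hypothetical) SBO term per reaction in the model
terms = ["SBO:0000176", "SBO:0000627", "SBO:0000176", "SBO:0000629"]
counts = reactions_per_sbo(terms)
print(counts)
```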
- refinegems.investigate.initial_analysis(model: libsbml.Model) tuple[str, int, int, int]
Extracts most important numbers of GEM
- Args:
model (libModel): Model loaded with libSBML
- Returns:
- tuple: Model name (1) & corresponding amounts of entities (2) - (4)
str: Name of model
int: Number of reactions
int: Number of metabolites
int: Number of genes
- refinegems.investigate.parse_reaction(eq: str, model: cobra.Model) dict
Parses a reaction equation string to dictionary
- Args:
eq (str): Equation of a reaction
model (cobraModel): Model loaded with COBRApy
- Returns:
dict: Metabolite Ids as keys and their coefficients as values (negative = educts, positive = products)
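A simplified sketch of such a parser, assuming equations of the form "A + 2 B --> C" with single spaces around "+" and "-->" as the arrow. The format accepted by parse_reaction may differ, and the real function additionally looks the metabolites up in the model; parse_equation is a hypothetical name.

```python
def parse_equation(eq: str) -> dict:
    """Parse 'A + 2 B --> C' into {'A': -1.0, 'B': -2.0, 'C': 1.0}."""
    left, right = eq.split("-->")
    result = {}
    for side, sign in ((left, -1.0), (right, 1.0)):
        for term in side.split(" + "):
            parts = term.split()
            if not parts:
                continue  # empty side, e.g. for a sink reaction
            if len(parts) == 2:
                coeff, met_id = float(parts[0]), parts[1]
            else:
                coeff, met_id = 1.0, parts[0]
            # educts are stored as negative, products as positive coefficients
            result[met_id] = result.get(met_id, 0.0) + sign * coeff
    return result


parsed = parse_equation("2 h2_c + o2_c --> 2 h2o_c")
print(parsed)
```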
- refinegems.investigate.plot_rea_sbo_single(model: libsbml.Model)
Plots reactions per SBO Term in horizontal bar chart
- Args:
model (libModel): Model loaded with libSBML
- Returns:
plot: Pandas Barchart
- refinegems.investigate.run_memote(model: cobra.Model) dict
Runs MEMOTE to obtain report as dict
- Args:
model (cobraModel): Model loaded with COBRApy
- Returns:
dict: MEMOTE report as json in dict format
- refinegems.investigate.run_memote_sys(model: cobra.Model)
Runs MEMOTE on the local Linux machine
- Args:
model (cobraModel): Model loaded with COBRApy
refineGEMs.io module
Provides functions to load and write models, media definitions and the manual annotation table
Depending on the application the model needs to be loaded with cobra (memote) or with libSBML (activation of groups). The media definitions are denoted in a csv within the data folder of this repository, thus the functions will only work if the user clones the repository. The manual_annotations table has to follow the specific layout given in the data folder in order to work with this module.
- refinegems.io.load_a_table_from_database(table_name_or_query: str) pandas.DataFrame
Loads the table for which the name is provided or a table containing all rows for which the query evaluates to true from the refineGEMs database (‘data/database/data.db’)
- Args:
table_name_or_query (str): Name of a table contained in the database ‘data.db’/ a SQL query
- Returns:
pd.DataFrame: Containing the table for which the name was provided from the database ‘data.db’
- refinegems.io.load_all_media_from_db(mediumpath: str) pandas.DataFrame
Helper function to extract media definitions from media_db.csv
- Args:
mediumpath (str): Path to csv file with medium database
- Returns:
pd.DataFrame: Table from csv with metabs added as BiGG_EX exchange reactions
- refinegems.io.load_document_libsbml(modelpath: str) libsbml.SBMLDocument
Loads model document using libSBML
- Args:
modelpath (str): Path to GEM
- Returns:
SBMLDocument: Loaded document by libSBML
- refinegems.io.load_manual_annotations(tablepath: str = 'data/manual_curation.xlsx', sheet_name: str = 'metab') pandas.DataFrame
Loads metabolite sheet from manual curation table
- Args:
tablepath (str): Path to manual curation table. Defaults to ‘data/manual_curation.xlsx’.
sheet_name (str): Sheet name for metabolite annotations. Defaults to ‘metab’.
- Returns:
pd.DataFrame: Table containing specified sheet from Excel file
- refinegems.io.load_manual_gapfill(tablepath: str = 'data/manual_curation.xlsx', sheet_name: str = 'gapfill') pandas.DataFrame
Loads gapfill sheet from manual curation table
- Args:
tablepath (str): Path to manual curation table. Defaults to ‘data/manual_curation.xlsx’.
sheet_name (str): Sheet name for reaction gapfilling. Defaults to ‘gapfill’.
- Returns:
pd.DataFrame: Table containing sheet with name ‘gapfill’|specified sheet_name from Excel file
- refinegems.io.load_medium_custom(mediumpath: str) pandas.DataFrame
Helper function to read medium csv
- Args:
mediumpath (str): path to csv file with medium
- Returns:
pd.DataFrame: Table of csv
- refinegems.io.load_medium_from_db(mediumname: str) pandas.DataFrame
Wrapper function to extract subtable for the requested medium from the database ‘data.db’
- Args:
mediumname (str): Name of medium to test growth on
- Returns:
pd.DataFrame: Table containing composition for one medium with metabs added as BiGG_EX exchange reactions
- refinegems.io.load_model_cobra(modelpath: str) cobra.Model
Loads model using COBRApy
- Args:
modelpath (str): Path to GEM
- Returns:
cobraModel: Loaded model by COBRApy
- refinegems.io.load_model_libsbml(modelpath: str) libsbml.Model
Loads model using libSBML
- Args:
modelpath (str): Path to GEM
- Returns:
libModel: loaded model by libSBML
- refinegems.io.load_multiple_models(models: list[str], package: str) list
Loads multiple models into a list
- Args:
models (list): List of paths to models
package (str): COBRApy|libSBML
- Returns:
list: List of model objects loaded with COBRApy|libSBML
- refinegems.io.parse_dict_to_dataframe(str2list: dict) pandas.DataFrame
Parses dictionary of form {str: list} & transforms it into a table with a column containing the strings and a column containing the lists
- Args:
str2list (dict): Dictionary mapping strings to lists
- Returns:
pd.DataFrame: Table with column containing the strings and column containing the lists
- refinegems.io.parse_fasta_headers(filepath: str, id_for_model: bool = False) pandas.DataFrame
Parses FASTA file headers to obtain:
the protein_id
and the model_id (like it is obtained from CarveMe)
corresponding to the locus_tag
- Args:
filepath (str): Path to FASTA file
id_for_model (bool): True if model_id similar to autogenerated GeneProduct ID should be contained in resulting table
- Returns:
pd.DataFrame: Table containing the columns locus_tag, Protein_id & Model_id
- refinegems.io.parse_gff_for_gp_info(gff_file: str) pandas.DataFrame
Parses gff file of organism to find gene protein reactions based on locus tags
- Args:
gff_file (str): Path to gff file of organism of interest
- Returns:
pd.DataFrame: Table containing mapping from locus tag to GPR
- refinegems.io.save_user_input(configpath: str) dict[str, str]
Collects user input from the command line to create a config file. If no config was given, the user input is also saved to a new config file.
- Args:
configpath (str): Path to config file if present
- Returns:
dict: Either loaded config file or created from user input
- refinegems.io.search_ncbi_for_gpr(locus: str) str
Fetches protein name from NCBI
- Args:
locus (str): NCBI compatible locus_tag
- Returns:
str: Protein name|description
- refinegems.io.search_sbo_label(sbo_number: str) str
Looks up the SBO label corresponding to a given SBO Term number
- Args:
sbo_number (str): Last three digits of SBO-Term as str
- Returns:
str: Denoted label for given SBO Term
- refinegems.io.validate_libsbml_model(model: libsbml.Model) int
Debug method: Validates a libSBML model with the libSBML validator
- Args:
model (libModel): A libSBML model
- Returns:
int: Integer specifying whether validation was successful or not
- refinegems.io.write_report(dataframe: pandas.DataFrame, filepath: str)
Writes reports stored in dataframes to xlsx file
- Args:
dataframe (pd.DataFrame): Table containing output
filepath (str): Path to file with filename
- refinegems.io.write_to_file(model: libsbml.Model, new_filename: str)
Writes modified model to new file
- Args:
model (libModel): Model loaded with libSBML
new_filename (str): Filename|Path for modified model
refineGEMs.modelseed module
Reports mismatches in charges and formulae based on ModelSEED
Extracts ModelSEED data from a given tsv file, extracts all metabolites from a given model. Both lists of metabolites are compared by charge and formula.
- refinegems.modelseed.compare_model_modelseed(model_charges: pandas.DataFrame, modelseed_charges: pandas.DataFrame) pandas.DataFrame
Compares tables with charges / formulae from model & modelseed
- Args:
model_charges (pd.DataFrame): Charges and formulae of model metabolites. Output of get_model_charges.
modelseed_charges (pd.DataFrame): Charges and formulae of ModelSEED metabolites. Output of get_modelseed_charges.
- Returns:
pd.DataFrame: Table containing info whether charges / formulae match
- refinegems.modelseed.compare_to_modelseed(model: cobra.Model) tuple[pandas.DataFrame, pandas.DataFrame]
Executes all steps to compare model metabolites to ModelSEED metabolites
- Args:
model (cobraModel): Model loaded with COBRApy
- Returns:
- tuple: Tables with charge (1) & formula (2) mismatches
pd.DataFrame: Table with charge mismatches
pd.DataFrame: Table with formula mismatches
- refinegems.modelseed.get_charge_mismatch(df_comp: pandas.DataFrame) pandas.DataFrame
Extracts metabolites with charge mismatch of model & modelseed
- Args:
df_comp (pd.DataFrame): Charge and formula mismatches. Output from compare_model_modelseed.
- Returns:
pd.DataFrame: Table containing metabolites with charge mismatch
- refinegems.modelseed.get_compared_formulae(formula_mismatch: pandas.DataFrame) pandas.DataFrame
Compare formula by atom pattern
- Args:
formula_mismatch (pd.DataFrame): Table with column containing atom comparison. Output from get_formula_mismatch.
- Returns:
pd.DataFrame: table containing metabolites with formula mismatch
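One way to compare formulae "by atom pattern" is to count the atoms per element and report the elements whose counts differ. The sketch below assumes flat formulae without parentheses, charges or wildcards; `atom_counts` and `differing_atoms` are hypothetical helper names, not refineGEMs functions.

```python
import re

# One element symbol (capital letter plus optional lowercase letter) and an optional count
ATOM = re.compile(r"([A-Z][a-z]?)(\d*)")


def atom_counts(formula: str) -> dict:
    """Count atoms per element, e.g. 'C6H12O6' -> {'C': 6, 'H': 12, 'O': 6}."""
    counts = {}
    for element, number in ATOM.findall(formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def differing_atoms(formula_a: str, formula_b: str) -> dict:
    """Return {element: (count_a, count_b)} for every element whose count differs."""
    a, b = atom_counts(formula_a), atom_counts(formula_b)
    return {
        element: (a.get(element, 0), b.get(element, 0))
        for element in sorted(set(a) | set(b))
        if a.get(element, 0) != b.get(element, 0)
    }


# Two formulae that differ by a single hydrogen
diff = differing_atoms("C6H12O6", "C6H11O6")
```

A difference restricted to `H` usually points to a protonation state rather than a different compound, which is why an atom-wise comparison is more informative than a plain string comparison.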
- refinegems.modelseed.get_formula_mismatch(df_comp: pandas.DataFrame) pandas.DataFrame
Extracts metabolites with formula mismatch of model & modelseed
- Args:
df_comp (pd.DataFrame): Charge and formula mismatches. Output from compare_model_modelseed.
- Returns:
pd.DataFrame: Table containing metabolites with formula mismatch
- refinegems.modelseed.get_model_charges(model: cobra.Model) pandas.DataFrame
Extracts all metabolites from model
- Args:
model (cobraModel): Model loaded with COBRApy
- Returns:
pd.DataFrame: Table containing charges and formulae of model metabolites
- refinegems.modelseed.get_modelseed_charges(modelseed_compounds: pandas.DataFrame) pandas.DataFrame
Extract table with BiGG, charges and formulae
- Args:
modelseed_compounds (pd.DataFrame): ModelSEED data. Output from get_modelseed_compounds.
- Returns:
pd.DataFrame: Table containing charges and formulae of ModelSEED metabolites
- refinegems.modelseed.get_modelseed_compounds() pandas.DataFrame
Extracts compounds from ModelSEED which have BiGG Ids
- Returns:
pd.DataFrame: Table containing ModelSEED data
refineGEMs.pathways module
Provides functions for adding KEGG reactions as Group Pathways
If your organism occurs in the KEGG database, extract the KEGG reaction ID from the annotations of your reactions and identify in which KEGG pathways each reaction occurs. Then add all KEGG pathways of a reaction as annotations with the biological qualifier ‘OCCURS_IN’ to the respective reaction.
- refinegems.pathways.add_kegg_pathways(model, kegg_pathways)
Add KEGG reactions as BQB_OCCURS_IN
- Args:
model (libModel): Model loaded with libSBML. Output of load_model_enable_groups.
kegg_pathways (dict): Reaction Id as key and Kegg Pathway Id as value. Output of extract_kegg_pathways.
- Returns:
libsbml-model: modified model with Kegg pathways
- refinegems.pathways.create_pathway_groups(model: libsbml.Model, pathway_groups)
Use group module to add reactions to Kegg pathway
- Args:
model (libModel): Model loaded with libSBML. Output of load_model_enable_groups.
pathway_groups (dict): Kegg Pathway Id as key and reactions Ids as values. Output of get_pathway_groups.
- Returns:
libModel: modified model with groups for pathways
- refinegems.pathways.extract_kegg_pathways(kegg_reactions: dict) dict
Finds pathways for reactions in the model with KEGG Ids by accessing the KEGG API; uses tqdm to report progress to the user
- Args:
kegg_reactions (dict): Reaction Id as key and Kegg Id as value. Output[0] from extract_kegg_reactions.
- Returns:
dict: Reaction Id as key and Kegg Pathway Id as value
- refinegems.pathways.extract_kegg_reactions(model: libsbml.Model) tuple[dict, list]
Extract KEGG Ids from reactions
- Args:
model (libModel): Model loaded with libSBML. Output of load_model_enable_groups.
- Returns:
- tuple: Dictionary ‘reaction_id’: ‘KEGG_id’ (1) & List of reactions without KEGG Id (2)
dict: Reaction Id as key and Kegg Id as value
list: Ids of reactions without KEGG annotation
- refinegems.pathways.get_pathway_groups(kegg_pathways)
Group reaction into pathways
- Args:
kegg_pathways (dict): Reaction Id as key and Kegg Pathway Id as value. Output of extract_kegg_pathways.
- Returns:
dict: Kegg Pathway Id as key and reactions Ids as values
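Grouping reactions into pathways amounts to inverting the reaction-to-pathway mapping. A minimal sketch, assuming each reaction maps to a list of pathway Ids; the function name `group_by_pathway` and the sample Ids are illustrative only.

```python
from collections import defaultdict


def group_by_pathway(reaction_to_pathways: dict) -> dict:
    """Invert {reaction_id: [pathway_id, ...]} into {pathway_id: [reaction_id, ...]}."""
    groups = defaultdict(list)
    for reaction_id, pathway_ids in reaction_to_pathways.items():
        for pathway_id in pathway_ids:
            groups[pathway_id].append(reaction_id)
    return dict(groups)


# Illustrative input in the shape described for extract_kegg_pathways
kegg_pathways = {"R_PGI": ["rn00010", "rn00030"], "R_PFK": ["rn00010"]}
groups = group_by_pathway(kegg_pathways)
```

A reaction that occurs in several pathways ends up in several groups, which matches the many-to-many relation between reactions and pathways.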
- refinegems.pathways.kegg_pathways(modelpath: str) tuple[libsbml.Model, list[str]]
Executes all steps to add KEGG pathways as groups
- Args:
modelpath (str): Path to GEM
- Returns:
- tuple: libSBML model (1) & List of reactions without KEGG Id (2)
libModel: Modified model with Pathways as groups
list: Ids of reactions without KEGG annotation
- refinegems.pathways.load_model_enable_groups(modelpath: str) libsbml.Model
Loads model as document using libSBML and enables groups extension
- Args:
modelpath (str): Path to GEM
- Returns:
libModel: Model loaded with libSBML
refineGEMs.polish module
Can be used to polish a model (created with CarveMe v.1.5.1)
The newer version of CarveMe introduces some irregularities into the model. These scripts enable, for example, the addition of BiGG Ids to the annotations as well as correct formatting of the annotations.
- refinegems.polish.add_compartment_structure_specs(model: libsbml.Model)
Adds the required specifications for the compartment structure if not set (size & spatial dimension)
- Args:
model (libModel): Model loaded with libSBML
- refinegems.polish.add_fba_units(model: libsbml.Model)
Adds mmol per gDW per h, mmol per gDW, hour (h) and femto litre (fL) to the list of unit definitions (needed for FBA)
- Args:
model (libModel): Model loaded with libSBML
- refinegems.polish.add_metab(entity_list: list[libsbml.Species], id_db: str)
Adds the ID of metabolites as URI to the annotation field. For a VMH model, the corresponding BiGG IDs are added as well. (Currently, only BiGG & VMH IDs are supported.)
- Args:
entity_list (list): libSBML ListOfSpecies
id_db (str): Name of the database of the IDs contained in a model
- refinegems.polish.add_reac(entity_list: list[libsbml.Reaction], id_db: str)
Adds the ID of reactions as URI to the annotation field. (Currently, only BiGG & VMH IDs are supported.)
- Args:
entity_list (list): libSBML ListOfReactions
id_db (str): Name of the database of the IDs contained in a model
- refinegems.polish.add_uri_set(entity: libsbml.SBase, qt, b_m_qt, uri_set: sortedcontainers.SortedSet[str]) list[str]
Add a complete URI set to the provided CVTerm
- Args:
entity (SBase): A libSBML SBase object like model, GeneProduct, etc.
qt: A libSBML qualifier type: BIOLOGICAL_QUALIFIER|MODEL_QUALIFIER
b_m_qt: A libSBML biological or model qualifier type like BQB_IS|BQM_IS
uri_set (SortedSet[str]): SortedSet containing URIs
- refinegems.polish.change_all_qualifiers(model: libsbml.Model, lab_strain: bool) libsbml.Model
Wrapper function to change qualifiers of all entities at once
- Args:
model (libModel): Model loaded with libSBML
lab_strain (bool): True if the strain was sequenced in a local lab
- Returns:
libModel: Model with all qualifiers updated to be MIRIAM compliant
- refinegems.polish.change_qualifier_per_entity(entity: libsbml.SBase, new_qt, new_b_m_qt, specific_db_prefix: str | None = None) list
Updates Qualifiers to be MIRIAM compliant for an entity
- Args:
entity (SBase): A libSBML SBase object like model, GeneProduct, etc.
new_qt (Qualifier): A libSBML qualifier type: BIOLOGICAL_QUALIFIER|MODEL_QUALIFIER
new_b_m_qt (QualifierType): A libSBML biological or model qualifier type like BQB_IS|BQM_IS
specific_db_prefix (str): Has to be set if only for a specific database the qualifier type should be changed. Can be ‘kegg.genes’, ‘biocyc’, etc.
- Returns:
list: CURIEs that are not MIRIAM compliant
- refinegems.polish.change_qualifiers(model: libsbml.Model, entity_type: str, new_qt, new_b_m_qt, specific_db_prefix: str | None = None) libsbml.Model
Updates Qualifiers to be MIRIAM compliant for an entity type of a given model
- Args:
model (libModel): Model loaded with libSBML
entity_type (str): Any string of the following: model|compartment|metabolite|parameter|reaction|unit definition|unit|gene product|group
new_qt (Qualifier): A libSBML qualifier type: BIOLOGICAL_QUALIFIER|MODEL_QUALIFIER
new_b_m_qt (QualifierType): A libSBML biological or model qualifier type like BQB_IS|BQM_IS
specific_db_prefix (str): Has to be set if only for a specific database the qualifier type should be changed. Can be ‘kegg.genes’, ‘biocyc’, etc.
- Returns:
libModel: Model with changed qualifier for given entity type
- refinegems.polish.create_fba_units(model: libsbml.Model) list[libsbml.UnitDefinition]
Creates all fba units required for a constraint-based model
- Args:
model (libModel): Model loaded with libSBML
- Returns:
list: List of libSBML UnitDefinitions
- refinegems.polish.create_unit(model_specs: tuple[int], meta_id: str, kind: str, e: int, m: int, s: int, uri_is: str = '', uri_idf: str = '') libsbml.Unit
Creates unit for SBML model according to arguments
- Args:
model_specs (tuple): Level & Version of SBML model
meta_id (str): Meta ID for unit (Necessary for URI)
kind (str): Unit kind constant (see libSBML for available constants)
e (int): Exponent of unit
m (int): Multiplier of unit
s (int): Scale of unit
uri_is (str): URI supporting the specified unit
uri_idf (str): URI supporting the derived from unit
- Returns:
Unit: libSBML unit object
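To illustrate how exponent, multiplier and scale combine: in SBML, a unit contributes the factor (multiplier × 10^scale)^exponent relative to its base kind. The sketch below uses a plain dataclass instead of libSBML objects; `UnitSpec`, `combined_factor` and the chosen decomposition of mmol per gDW per h are illustrative assumptions and not necessarily the values refineGEMs sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitSpec:
    """Stand-in for one SBML unit: kind, exponent (e), multiplier (m), scale (s)."""
    kind: str
    e: int = 1
    m: float = 1
    s: int = 0

    def factor(self) -> float:
        # SBML semantics: (multiplier * 10**scale * base_unit) ** exponent
        return (self.m * 10 ** self.s) ** self.e


def combined_factor(units) -> float:
    """Multiply the factors of all units that make up one unit definition."""
    total = 1.0
    for unit in units:
        total *= unit.factor()
    return total


# Assumed decomposition of mmol per gDW per h
mmol_per_gdw_per_h = [
    UnitSpec("mole", e=1, m=1, s=-3),       # millimole
    UnitSpec("gram", e=-1, m=1, s=0),       # per gram dry weight
    UnitSpec("second", e=-1, m=3600, s=0),  # per hour (3600 seconds)
]
factor = combined_factor(mmol_per_gdw_per_h)
```

Here `factor` is the conversion from mmol per gDW per h to mol per gram per second, i.e. 1e-3 / 3600.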
- refinegems.polish.create_unit_definition(model_specs: tuple[int], identifier: str, name: str, units: list[libsbml.Unit]) libsbml.UnitDefinition
Creates unit definition for SBML model according to arguments
- Args:
model_specs (tuple): Level & Version of SBML model
identifier (str): Identifier for the defined unit
name (str): Full name of the defined unit
units (list): All units the defined unit consists of
- Returns:
UnitDefinition: libSBML unit definition object
- refinegems.polish.cv_ncbiprotein(gene_list, email, protein_fasta: str, lab_strain: bool = False)
Adds NCBI Id to genes as annotation
- Args:
gene_list (list): libSBML ListOfGenes
email (str): User Email to access the Entrez database
protein_fasta (str): The path to the CarveMe protein.fasta input file
lab_strain (bool): Needs to be set to True if strain was self-annotated and/or the locus tags in the CarveMe input file should be kept
- refinegems.polish.cv_notes_metab(species_list: list[libsbml.Species])
Checks the notes field for information which should be in the annotation field, removes the entry from notes and adds it as URL to the CVTerms of a metabolite
- Args:
species_list (list): libSBML ListOfSpecies
- refinegems.polish.cv_notes_reac(reaction_list: list[libsbml.Reaction])
Checks the notes field for information which should be in the annotation field, removes the entry from notes and adds it as URL to the CVTerms of a reaction
- Args:
reaction_list (list): libSBML ListOfReactions
- refinegems.polish.generate_miriam_compliant_uri_set(prefix2id: sortedcontainers.SortedDict[str, sortedcontainers.SortedSet[str]]) sortedcontainers.SortedSet[str]
Generate a set of complete MIRIAM compliant URIs from the provided prefix to identifier mapping
- Args:
prefix2id (SortedDict[str: SortedSet[str]]): Dictionary containing a mapping from database prefixes to their respective identifier sets
- Returns:
SortedSet: Sorted set containing complete URIs
- refinegems.polish.generate_uri_set_with_specific_pattern(prefix2id: sortedcontainers.SortedDict[str, sortedcontainers.SortedSet[str]], new_pattern: bool) sortedcontainers.SortedSet[str]
Generate a set of complete URIs from the provided prefix to identifier mapping
- Args:
prefix2id (SortedDict[str: SortedSet[str]]): Dictionary containing a mapping from database prefixes to their respective identifier sets
new_pattern (bool): True if new pattern is wanted, otherwise False
- Returns:
SortedSet: Sorted set containing complete URIs
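The prefix-to-identifier mapping can be turned into URIs roughly as follows. This sketch makes two assumptions that the text above does not confirm: that the "new" pattern joins prefix and identifier with `:` while the "old" pattern uses `/`, and that both live under `https://identifiers.org/`. It also uses built-in `dict` and `sorted` in place of `SortedDict` and `SortedSet`; `BASE` and `build_uris` are hypothetical names.

```python
BASE = "https://identifiers.org/"  # assumed resolver base


def build_uris(prefix2id: dict, new_pattern: bool) -> list:
    """Build a sorted, duplicate-free list of URIs from {prefix: {identifier, ...}}."""
    separator = ":" if new_pattern else "/"  # assumed meaning of new_pattern
    uris = {
        f"{BASE}{prefix}{separator}{identifier}"
        for prefix, identifiers in prefix2id.items()
        for identifier in identifiers
    }
    return sorted(uris)


# Illustrative mapping from database prefixes to identifier sets
prefix2id = {"kegg.compound": {"C00031"}, "bigg.metabolite": {"glc__D"}}
new_style = build_uris(prefix2id, new_pattern=True)
old_style = build_uris(prefix2id, new_pattern=False)
```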
- refinegems.polish.get_set_of_curies(uri_list: list[str]) tuple[sortedcontainers.SortedDict[str, sortedcontainers.SortedSet[str]], list[str]]
Gets a list of URIs & maps the database prefixes to their respective identifier sets
- Args:
uri_list (list[str]): List containing CURIEs
- Returns:
- tuple: Sorted dictionary (1) & list (2)
SortedDict: Sorted dictionary mapping database prefixes from the provided CURIEs to their respective identifier sets also provided by the CURIEs
list: List of CURIEs that are invalid according to bioregistry
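The reverse direction, splitting URIs into a prefix-to-identifier mapping and collecting the entries that cannot be resolved, might look like the sketch below. The set `KNOWN_PREFIXES` is a hypothetical stand-in for the bioregistry lookup, and the regular expression assumes identifiers.org URIs in either `prefix/id` or `prefix:id` form; none of these names come from refineGEMs.

```python
import re
from collections import defaultdict

# Accepts http(s)://identifiers.org/<prefix>/<id> and .../<prefix>:<id>
URI = re.compile(r"^https?://identifiers\.org/([^/:]+)[/:](.+)$")

# Hypothetical stand-in for a registry lookup
KNOWN_PREFIXES = {"bigg.metabolite", "kegg.compound", "chebi"}


def split_uris(uri_list):
    """Return ({prefix: sorted identifiers}, [URIs that are malformed or use an unknown prefix])."""
    prefix2id = defaultdict(set)
    invalid = []
    for uri in uri_list:
        match = URI.match(uri)
        if match and match.group(1).lower() in KNOWN_PREFIXES:
            prefix2id[match.group(1).lower()].add(match.group(2))
        else:
            invalid.append(uri)
    return {prefix: sorted(ids) for prefix, ids in sorted(prefix2id.items())}, invalid


uris = [
    "https://identifiers.org/kegg.compound/C00031",
    "http://identifiers.org/kegg.compound:C00031",  # duplicate in the other pattern
    "https://identifiers.org/chebi/CHEBI:17634",
    "https://identifiers.org/notadb/42",            # prefix not in KNOWN_PREFIXES
]
prefix2id, invalid = split_uris(uris)
```

Because identifiers are collected in sets, the same identifier written in both patterns is stored only once.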
- refinegems.polish.improve_uri_per_entity(entity: libsbml.SBase, bioregistry: bool, new_pattern: bool) tuple[list[str], list[str]]
Helper function: Removes duplicates & changes pattern according to new_pattern
- Args:
entity (SBase): A libSBML SBase object, either a model or an entity
bioregistry (bool): Specifies whether the URIs should be changed with the help of bioregistry to be MIRIAM compliant or changed according to new or old pattern
new_pattern (bool): True if new pattern is wanted, otherwise False
- Returns:
- tuple: Two lists (1) & (2)
list: List of all collected invalid annotations of one entity
list: List of all collected invalid CURIEs of one entity
- refinegems.polish.improve_uris(entities: libsbml.SBase, bioregistry: bool, new_pattern: bool) tuple[dict[str, list[str]], dict[str, list[str]]]
Removes duplicates & changes pattern according to bioregistry or new_pattern
- Args:
entities (SBase): A libSBML SBase object, either a model or a list of entities
bioregistry (bool): Specifies whether the URIs should be changed with the help of bioregistry to be MIRIAM compliant or changed according to new or old pattern
new_pattern (bool): True if new pattern is wanted, otherwise False
- Returns:
- tuple: Two dictionaries (1) & (2)
dictionary: Mapping of entity identifier to list of corresponding not MIRIAM compliant annotations
dictionary: Mapping of entity identifier to list of corresponding invalid CURIEs
- refinegems.polish.polish(model: libsbml.Model, email: str, id_db: str, protein_fasta: str, lab_strain: bool, path: str) libsbml.Model
Completes all steps to polish a model (tested for models having either BiGG or VMH identifiers)
- Args:
model (libModel): model loaded with libSBML
email (str): E-mail for Entrez
id_db (str): Main database where identifiers in model come from
protein_fasta (str): File used as input for CarveMe
lab_strain (bool): True if the strain was sequenced in a local lab
path (str): Output path for incorrect annotations file(s)
- Returns:
libModel: Polished libSBML model
- refinegems.polish.polish_annotations(model: libsbml.Model, bioregistry: bool, new_pattern: bool, filename: str) libsbml.Model
Polishes all annotations in a model such that no duplicates are present & the same pattern is used for all CURIEs
- Args:
model (libModel): Model loaded with libSBML
bioregistry (bool): Specifies whether the URIs should be changed with the help of bioregistry to be MIRIAM compliant or changed according to new or old pattern
new_pattern (bool): True if new pattern is wanted, otherwise False
filename (str): Path to output file for invalid CURIEs detected by improve_uris
- Returns:
libModel: libSBML model with polished annotations
- refinegems.polish.polish_entities(entity_list: list, metabolite: bool)
Sets boundary condition and constant if not set for a metabolite
- Args:
entity_list (list): libSBML ListOfSpecies or ListOfReactions
metabolite (boolean): flag to determine whether entity = metabolite
- refinegems.polish.print_UnitDefinitions(contained_unit_defs: list[libsbml.UnitDefinition])
Prints a list of libSBML UnitDefinitions as XMLNodes
- Args:
contained_unit_defs (list): List of libSBML UnitDefinition objects
- refinegems.polish.print_remaining_UnitDefinitions(model: libsbml.Model, list_of_fba_units: list[libsbml.UnitDefinition])
Prints UnitDefinitions from the model that were removed as these were not contained in the list_of_fba_units
- Args:
model (libModel): Model loaded with libSBML
list_of_fba_units (list): List of libSBML UnitDefinitions
- refinegems.polish.set_default_units(model: libsbml.Model)
Sets default units of model
- Args:
model (libModel): Model loaded with libSBML
- refinegems.polish.set_initial_amount(model: libsbml.Model)
Sets initial amount to all metabolites if not already set or if initial concentration is not set
- Args:
model (libModel): Model loaded with libSBML
- refinegems.polish.set_units(model: libsbml.Model)
Sets units of parameters in model
- Args:
model (libModel): Model loaded with libSBML
refineGEMs.sboann module
Provides functions to automate the addition of SBO terms to the model
Script written by Elisabeth Fritze in her bachelor thesis. Modified by Gwendolyn O. Gusak during her master thesis. Commented by Famke Bäuerle and extended by Nantia Leonidou.
It is split into many small functions, all of which are annotated. However, when using it for SBO-Term annotation it only makes sense to run the “main” function sbo_annotation(model_libsbml) if you want to continue with the model. The smaller functions might be useful if special information is needed for a reaction without the context of a bigger model or when the automated annotation fails for some reason.
- refinegems.sboann.addSBOforCompartments(model)
- refinegems.sboann.addSBOforGenes(model)
- refinegems.sboann.addSBOforGroups(model)
- refinegems.sboann.addSBOforMetabolites(model)
- refinegems.sboann.addSBOforModel(model)
- refinegems.sboann.addSBOforParameters(model)
- refinegems.sboann.addSBOfromDB(reac: libsbml.Reaction, cur) bool
Adds SBO term based on bigg id of a reaction
- Args:
reac (Reaction): Reaction from sbml model
cur (sqlite3.connect.cursor): Used to access the sqlite3 database
- Returns:
bool: True if SBO Term was changed
- refinegems.sboann.addSBOviaEC(reac: libsbml.Reaction, cur)
Adds SBO terms based on EC numbers given in the annotations of a reaction
- Args:
reac (Reaction): Reaction from sbml model
cur (sqlite3.connect.cursor): Used to access the sqlite3 database
- refinegems.sboann.checkAcetylationViaEC(reac: libsbml.Reaction)
Tests if reac is acetylation by its EC-Code and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkActiveTransport(reac: libsbml.Reaction)
Tests if reac is active transport (uses atp/pep) and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkBiomass(reac: libsbml.Reaction)
Tests if reac is biomass / growth and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkCoTransport(reac: libsbml.Reaction)
Tests if reac is co-transport and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkDeamination(reac: libsbml.Reaction)
Tests if reac is deamination and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkDeaminationViaEC(reac: libsbml.Reaction)
Tests if reac is deamination by its EC-Code and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkDecarbonylation(reac: libsbml.Reaction)
Tests if reac is decarbonylation and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkDecarboxylation(reac: libsbml.Reaction)
Tests if reac is decarboxylation and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkDecarboxylationViaEC(reac: libsbml.Reaction)
Tests if reac is decarboxylation by its EC-Code and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkDemand(reac: libsbml.Reaction)
Tests if reac is demand and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkExchange(reac: libsbml.Reaction)
Tests if reac is exchange and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkGlycosylation(reac: libsbml.Reaction)
Tests if reac is glycosylation and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkGlycosylationViaEC(reac: libsbml.Reaction)
Tests if reac is glycosylation by its EC-Code and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkHydrolysisViaEC(reac: libsbml.Reaction)
Tests if reac is hydrolysis by its EC-Code and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkIsomerisationViaEC(reac: libsbml.Reaction)
Tests if reac is isomerisation by its EC-Code and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkMethylationViaEC(reac: libsbml.Reaction)
Tests if reac is methylation by its EC-Code and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkPassiveTransport(reac: libsbml.Reaction)
Tests if reac is passive transport and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkPhosphorylation(reac: libsbml.Reaction)
Tests if reac is phosphorylase / kinase and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkRedox(reac: libsbml.Reaction)
Tests if reac is redox and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkRedoxViaEC(reac: libsbml.Reaction)
Tests if reac is redox by its EC-Code and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkSink(reac: libsbml.Reaction)
Tests if reac is sink and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.checkTransaminationViaEC(reac: libsbml.Reaction)
Tests if reac is transamination by its EC-Code and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.getCompartmentDict(reac: libsbml.Reaction)
Sorts metabolites by compartment
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
dict: compartment as key and metabolites as values
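Sorting metabolites by compartment can be pictured with plain identifier strings. The sketch assumes BiGG-style species Ids whose last underscore-separated token is the compartment (for example `M_glc__D_e`); the library functions operate on libSBML objects instead, and `compartment_of` and `group_by_compartment` are hypothetical names.

```python
from collections import defaultdict


def compartment_of(species_id: str) -> str:
    """Return the compartment suffix of a BiGG-style Id, e.g. 'M_glc__D_e' -> 'e'."""
    return species_id.rsplit("_", 1)[-1]


def group_by_compartment(species_ids) -> dict:
    """Group species Ids by their compartment suffix."""
    groups = defaultdict(list)
    for species_id in species_ids:
        groups[compartment_of(species_id)].append(species_id)
    return dict(groups)


# Illustrative species of a glucose/proton symport reaction
groups = group_by_compartment(["M_glc__D_e", "M_h_e", "M_glc__D_c", "M_h_c"])
spans_several_compartments = len(groups) > 1
```

A reaction whose metabolites fall into more than one group spans several compartments, which is the property the transport checks in this module rely on.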
- refinegems.sboann.getCompartmentFromSpeciesRef(speciesReference: libsbml.SpeciesReference) libsbml.Compartment
Extracts compartment from a species by its reference
- Args:
speciesReference (SpeciesReference): Reference to species
- Returns:
Compartment: Compartment which the species lives in
- refinegems.sboann.getCompartmentList(reac: libsbml.Reaction)
Extracts compartments of metabolites
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
set: compartment information of all metabolites
- refinegems.sboann.getCompartmentlessMetaboliteIds(reac: libsbml.Reaction)
Extracts metabolites which have no compartment information
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
list: all metabolites which have no compartment
- refinegems.sboann.getCompartmentlessProductIds(reac: libsbml.Reaction)
Extracts products which have no compartment information
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
list: products (metabolites) without compartments
- refinegems.sboann.getCompartmentlessReactantIds(reac: libsbml.Reaction)
Extracts reactants which have no compartment information
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
list: reactants (metabolites) without compartments
- refinegems.sboann.getCompartmentlessSpeciesId(speciesReference: libsbml.SpeciesReference) str
Determines whether a species has a compartment by its reference
- Args:
speciesReference (SpeciesReference): Reference to species
- Returns:
libsbml-species-id: id of species without compartment
- refinegems.sboann.getECNums(reac: libsbml.Reaction)
Extracts EC-Code from the reaction annotations
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
list: all EC-Numbers of the reaction
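Extracting EC numbers from annotation URIs can be sketched with a regular expression. The example assumes identifiers.org URIs with the `ec-code` prefix in either `/` or `:` form; `EC_URI`, `ec_numbers` and `ec_class` are hypothetical names, and the first digit of an EC number is used here only to show how an EC-based check could branch on the enzyme class.

```python
import re

# Up to four dot-separated fields; '-' marks an unspecified field
EC_URI = re.compile(r"identifiers\.org/ec-code[/:]([\d\-]+(?:\.[\d\-]+){0,3})")


def ec_numbers(uris) -> list:
    """Return all EC numbers found in a list of annotation URIs."""
    found = []
    for uri in uris:
        match = EC_URI.search(uri)
        if match:
            found.append(match.group(1))
    return found


def ec_class(ec_number: str) -> int:
    """Return the top-level enzyme class, e.g. 1 for oxidoreductases."""
    return int(ec_number.split(".")[0])


# Illustrative annotation URIs of one reaction
uris = [
    "https://identifiers.org/ec-code/2.7.1.11",
    "https://identifiers.org/kegg.reaction/R00756",
    "https://identifiers.org/ec-code:1.1.1.-",
]
numbers = ec_numbers(uris)
classes = [ec_class(number) for number in numbers]
```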
- refinegems.sboann.getListOfMetabolites(reac: libsbml.Reaction)
Extracts list of metabolites of the reaction
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
list: metabolites that are part of the reaction
- refinegems.sboann.getMetaboliteIds(reac: libsbml.Reaction)
Extracts list of metabolite ids of reaction
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
list: metabolite ids
- refinegems.sboann.getProductCompartmentList(reac: libsbml.Reaction)
Extracts compartments of products
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
set: compartment information of all products (metabolites)
- refinegems.sboann.getProductIds(reac: libsbml.Reaction)
Extracts products (metabolites) of reaction
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
list: products (metabolites) ids
- refinegems.sboann.getReactantCompartmentList(reac: libsbml.Reaction)
Extracts compartments of reactants
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
set: compartment information of all reactants (metabolites)
- refinegems.sboann.getReactantIds(reac: libsbml.Reaction) list[str]
Extracts reactants (metabolites) of reaction
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
list[str]: Reactants (metabolites) ids
- refinegems.sboann.hasReactantPair(reac: libsbml.Reaction, met1: libsbml.Species, met2: libsbml.Species) bool
Checks if a pair of metabolites is present in reaction; needed for special reactions like redox or deamination
- Args:
reac (Reaction): Reaction from sbml model
met1 (Species): metabolite 1 of metabolite pair
met2 (Species): metabolite 2 of metabolite pair
- Returns:
bool: True if one of the metabolites is in reactants and the other in products
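The pair check reduces to a symmetric membership test. A minimal sketch on plain Id lists rather than libSBML objects; `has_reactant_pair` and the sample Ids are illustrative.

```python
def has_reactant_pair(reactants, products, met1: str, met2: str) -> bool:
    """True if one metabolite is among the reactants and the other among the products."""
    forward = met1 in reactants and met2 in products
    backward = met2 in reactants and met1 in products
    return forward or backward


# Illustrative redox reaction: NAD is reduced to NADH
reactants = ["M_mal__L_c", "M_nad_c"]
products = ["M_oaa_c", "M_nadh_c", "M_h_c"]
is_redox_like = has_reactant_pair(reactants, products, "M_nad_c", "M_nadh_c")
```

Checking both directions makes the test independent of how the reaction is written.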
- refinegems.sboann.isProtonTransport(reac: libsbml.Reaction)
Checks if reaction is proton transport
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
bool: True if reaction is proton transport
- refinegems.sboann.moreThanTwoCompartmentTransport(reac: libsbml.Reaction)
Checks if reaction traverses more than 2 compartments
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
bool: True if reaction traverses more than 2 compartments
- refinegems.sboann.returnCompartment(id)
Helper to split compartment id
- refinegems.sboann.sbo_annotation(model_libsbml: libsbml.Model) libsbml.Model
Executes all steps to annotate SBO terms to a given model (former main function of original script by Elisabeth Fritze)
- Args:
model_libsbml (libModel): Model loaded with libsbml
- Returns:
libModel: Modified model with SBO terms
- refinegems.sboann.soleProtonTransported(reac: libsbml.Reaction)
Checks if reaction is transport powered by one H
- Args:
reac (Reaction): Reaction from sbml model
- Returns:
bool: True if reaction is transport powered by one H
- refinegems.sboann.splitSymAntiPorter(reac: libsbml.Reaction)
Tests if reac is sym- or antiporter and sets SBO Term if true
- Args:
reac (Reaction): Reaction from sbml model
- refinegems.sboann.splitTransportBiochem(reac: libsbml.Reaction)
Tests if reaction traverses more than 1 compartment and sets SBO Term
- Args:
reac (Reaction): Reaction from sbml model