refineGEMs package

Here is an overview of all functions. All imports are mocked via autodoc_mock_imports in the conf.py file to enable automatic building.

refineGEMs.biomass module

refineGEMs.charges module

Provides functions for adding charges to metabolites

When iterating through all metabolites present in a model, you will find several that have no defined charge (metab.getPlugin(‘fbc’).isSetCharge() returns False). This can lead to charge-imbalanced reactions. This module takes information on metabolite charges from the ModelSEED database. A charge is automatically added to a metabolite if it has no defined charge and ModelSEED lists exactly one charge for it. When multiple charges are listed, the metabolite and its possible charges are recorded and later returned in a dictionary.

It is possible to use the correct_charges_from_db function with other databases. The user just needs to make sure that the compounds dataframe has a ‘BiGG’ and a ‘charge’ column.
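
The assignment rule described above can be sketched in plain Python. This is a minimal illustration and not the refineGEMs implementation: the function name assign_charges and the dictionary-based inputs are hypothetical stand-ins for the libSBML model and the compounds dataframe.

```python
def assign_charges(model_charges, db_charges):
    """Fill in missing charges when the database lists exactly one candidate.

    model_charges: dict of metabolite id -> charge, or None if unset
                   (hypothetical stand-in for a libSBML model).
    db_charges:    dict of metabolite id -> list of charges from the database
                   (hypothetical stand-in for the compounds dataframe).
    """
    updated = dict(model_charges)
    ambiguous = {}
    for met_id, charge in model_charges.items():
        if charge is not None:
            continue  # a charge is already defined, leave it untouched
        candidates = sorted(set(db_charges.get(met_id, [])))
        if len(candidates) == 1:
            updated[met_id] = candidates[0]  # unambiguous: assign directly
        elif len(candidates) > 1:
            ambiguous[met_id] = candidates  # report for manual inspection
    return updated, ambiguous


model = {"glc__D": 0, "atp": None, "h": None, "xyz": None}
database = {"atp": [-4], "h": [1, 0], "glc__D": [0]}
updated, ambiguous = assign_charges(model, database)
```

Metabolites that are missing from the database stay without a charge, and only the ambiguous ones end up in the returned dictionary.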

refinegems.charges.correct_charges_from_db(model: libsbml.Model, compounds: pandas.DataFrame) → tuple[libsbml.Model, dict]

Adds charges taken from given database to metabolites which have no defined charge

Args:

model (libModel): Model loaded with libsbml
compounds (pd.DataFrame): Containing database data with ‘BiGG’ (BiGG-Ids) and ‘charge’ (float or int) as columns

Returns:

tuple: libSBML model (1) & dictionary ‘metabolite_id’: list(charges) (2)

libModel: Model with added charges
dict: Metabolites with respective multiple charges

refinegems.charges.correct_charges_modelseed(model: libsbml.Model) → tuple[libsbml.Model, dict]

Wrapper function which completes the steps to charge correction with the ModelSEED database

Args:

model (libModel): Model loaded with libsbml

Returns:

tuple: libSBML model (1) & dictionary ‘metabolite_id’: list(charges) (2)

libModel: Model with added charges
dict: Metabolites with respective multiple charges

refineGEMs.comparison module

refineGEMs.curate module

Functions to enable annotation of entities using a manually curated table

While working on GEMs the user might come across ill-annotated or missing metabolites, reactions and genes. This module aims to speed up manual curation by letting the user edit an Excel table directly, which is then used to update the given model. This module also makes use of the cvterms module.

refinegems.curate.add_reactions_from_table(model: libsbml.Model, table: pandas.DataFrame, email: str) → libsbml.Model

Wrapper function to use with table format given in data/manual_curation.xlsx, sheet gapfill: Adds all reactions with their info given in the table to the given model

Args:

model (libModel): Model loaded with libSBML
table (pd.DataFrame): Table in format of sheet gapfill from manual_curation.xlsx located in the data folder
email (str): User Email to access the NCBI Entrez database

Returns:

libModel: Modified model with new reactions

refinegems.curate.update_annotations_from_others(model: libsbml.Model) → libsbml.Model

Synchronizes metabolite annotations for core, periplasm and extracellular

Args:

model (libModel): Model loaded with libSBML

Returns:

libModel: Modified model with synchronized annotations

refinegems.curate.update_annotations_from_table(model: libsbml.Model, table: pandas.DataFrame) → libsbml.Model

Wrapper function to use with table format given in data/manual_curation.xlsx, sheet metabs: Updates annotation of metabolites given in the table

Args:

model (libModel): Model loaded with libSBML
table (pd.DataFrame): Table in format of sheet metabs from manual_curation.xlsx located in the data folder

Returns:

libModel: Modified model with new annotations

refineGEMs.cvterms module

Helper module to work with annotations (CVTerms)

Stores dictionaries which hold information on the identifiers.org syntax, and provides functions to add CVTerms to different entities and to parse CVTerms.
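
As a rough illustration of how such a dictionary can be used to build an identifiers.org URI, consider the sketch below. The dictionary EXAMPLE_METABOLITE_DBS, its keys and the helper build_identifiers_uri are hypothetical; the actual dictionaries in refinegems.cvterms may use different keys and URL forms.

```python
# Hypothetical excerpt: maps a database key to an identifiers.org namespace.
EXAMPLE_METABOLITE_DBS = {
    "BIGG": "bigg.metabolite",
    "KEGG": "kegg.compound",
}


def build_identifiers_uri(entry, db_id, db_dict=EXAMPLE_METABOLITE_DBS):
    """Return an identifiers.org URI for entry in the database db_id."""
    if db_id not in db_dict:
        # Mirrors the documented requirement that db_id must be a known key.
        raise KeyError(f"Unknown database id: {db_id}")
    return f"https://identifiers.org/{db_dict[db_id]}/{entry}"


uri = build_identifiers_uri("glc__D", "BIGG")
```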

refinegems.cvterms.add_cv_term_genes(entry: str, db_id: str, gene: libsbml.GeneProduct, lab_strain: bool = False)

Adds CVTerm to a gene

Args:

entry (str): Id to add as annotation
db_id (str): Database to which entry belongs. Must be in gene_db_dict.keys().
gene (GeneProduct): Gene to add CVTerm to
lab_strain (bool, optional): For locally sequenced strains the qualifiers are always HOMOLOG_TO. Defaults to False.

refinegems.cvterms.add_cv_term_metabolites(entry: str, db_id: str, metab: libsbml.Species)

Adds CVTerm to a metabolite

Args:

entry (str): Id to add as annotation
db_id (str): Database to which entry belongs. Must be in metabol_db_dict.keys().
metab (Species): Metabolite to add CVTerm to

refinegems.cvterms.add_cv_term_pathways(entry: str, db_id: str, path: libsbml.Group)

Add CVTerm to a groups pathway

Args:

entry (str): Id to add as annotation
db_id (str): Database to which entry belongs. Must be in pathway_db_dict.keys().
path (Group): Pathway to add CVTerm to

refinegems.cvterms.add_cv_term_pathways_to_entity(entry: str, db_id: str, reac: libsbml.Reaction)

Add CVTerm to a reaction as OCCURS IN pathway

Args:

entry (str): Id to add as annotation
db_id (str): Database to which entry belongs
reac (Reaction): Reaction to add CVTerm to

refinegems.cvterms.add_cv_term_reactions(entry: str, db_id: str, reac: libsbml.Reaction)

Adds CVTerm to a reaction

Args:

entry (str): Id to add as annotation
db_id (str): Database to which entry belongs. Must be in reaction_db_dict.keys().
reac (Reaction): Reaction to add CVTerm to

refinegems.cvterms.add_cv_term_units(unit_id: str, unit: libsbml.Unit, relation: int)

Adds CVTerm to a unit

Args:

unit_id (str): ID to add as URI to annotation
unit (Unit): Unit to add CVTerm to
relation (int): Provides model qualifier to be added

refinegems.cvterms.generate_cvterm(qt, b_m_qt) → libsbml.CVTerm

Generates a CVTerm with the provided qualifier & biological or model qualifier types

Args:

qt (libSBML qualifier type): BIOLOGICAL_QUALIFIER or MODEL_QUALIFIER
b_m_qt (libSBML qualifier): BQM_IS, BQB_IS_HOMOLOG_TO, etc.

Returns:

CVTerm: With provided qualifier & biological or model qualifier types

refinegems.cvterms.get_id_from_cv_term(entity: libsbml.SBase, db_id: str) → list[str]

Extract Id for a specific database from CVTerm

Args:

entity (SBase): Species, Reaction, Gene, Pathway
db_id (str): Database of interest

Returns:

list[str]: Ids of entity belonging to db_id

refinegems.cvterms.print_cvterm(cvterm: libsbml.CVTerm)

Debug function: Prints the URIs contained in the provided CVTerm along with the provided qualifier & biological/model qualifier types

Args:

cvterm (CVTerm): A libSBML CVTerm

refineGEMs.gapfill module

refineGEMs.growth module

Provides functions to simulate growth on any medium

Tailored to work with the media denoted in the local db, but should work with any medium as long as it is defined in a csv with ; as delimiter and BiGG Ids for the compounds. Use refinegems.io.load_medium_custom and hand the result to the growth_one_medium_from_default or growth_one_medium_from_minimal function.
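
A custom medium file of this kind could look roughly like the content parsed below. This sketch uses only the standard library; the column names BiGG and flux are assumptions, so check the media files in the data folder for the layout that load_medium_custom actually expects.

```python
import csv
import io

# Hypothetical file content: semicolon-delimited, one compound per row.
example_csv = "BiGG;flux\nglc__D;10.0\no2;20.0\n"


def read_medium(handle):
    """Read a semicolon-delimited medium definition into a list of rows."""
    reader = csv.DictReader(handle, delimiter=";")
    return [{"BiGG": row["BiGG"], "flux": float(row["flux"])} for row in reader]


medium_rows = read_medium(io.StringIO(example_csv))
```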

refinegems.growth.find_additives(model: cobra.Model, base_medium: dict) → pandas.DataFrame

Iterates through all exchanges to find metabolites that lead to a higher growth rate compared to the growth rate yielded on the base_medium

Args:

model (cobraModel): Model loaded with COBRApy
base_medium (dict): Exchanges as keys and their flux bound as value (e.g. {‘EX_glc__D_e’ : 10.0})

Returns:

pd.DataFrame: Exchanges sorted from highest to lowest growth rate improvement

refinegems.growth.find_minimum_essential(medium: pandas.DataFrame, essential: list[str]) → list[str]

Report metabolites necessary for growth and not in custom medium

Args:

medium (pd.DataFrame): Dataframe with medium definition
essential (list[str]): Ids of all metabolites which lead to zero growth if blocked. Output of find_missing_essential.

Returns:

list[str]: Ids of exchanges of metabolites not present in the medium but necessary for growth
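
Conceptually, this is a set difference between the essential exchanges and those already in the medium. The sketch below shows the idea on plain lists; the helper name minimum_essential and the list-based medium are hypothetical simplifications of the dataframe input.

```python
def minimum_essential(medium_exchanges, essential):
    """Return essential exchange ids that the medium does not provide."""
    in_medium = set(medium_exchanges)
    # Keep the order of the essential list so the report is reproducible.
    return [ex for ex in essential if ex not in in_medium]


missing = minimum_essential(
    ["EX_glc__D_e", "EX_o2_e"],
    ["EX_glc__D_e", "EX_fe2_e", "EX_pi_e"],
)
```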

refinegems.growth.find_missing_essential(model: cobra.Model, growth_medium: dict, default_uptake: list[str], anaerobic: bool) → list[str]

Report which exchange reactions are needed for growth, combines default uptake and valid new medium

Args:

model (cobraModel): Model loaded with COBRApy
growth_medium (dict): Growth medium definition that can be used with the model. Output of modify_medium.
default_uptake (list[str]): Metabolites consumed in standard medium
anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions

Returns:

list[str]: Ids of exchanges of all metabolites which lead to zero growth if blocked

refinegems.growth.get_all_minimum_essential(model: cobra.Model, media: list[str]) → pandas.DataFrame

Returns metabolites necessary for growth and not in media

Args:

model (cobraModel): Model loaded with COBRApy
media (list[str]): Containing the names of all media for which the growth essential metabolites not contained in the media should be returned

Returns:

pd.DataFrame: Information on which metabolites are missing in each medium

refinegems.growth.get_default_secretion(model: cobra.Model) → list[str]

Checks fluxes after FBA, if positive the metabolite is produced

Args:

model (cobraModel): Model loaded with COBRApy

Returns:

list[str]: BiGG Ids of produced metabolites

refinegems.growth.get_default_uptake(model: cobra.Model) → list[str]

Determines which metabolites are used in the standard medium

Args:

model (cobraModel): Model loaded with COBRApy

Returns:

list[str]: Metabolites consumed in standard medium

refinegems.growth.get_essential_reactions(model: cobra.Model) → list[str]

Knocks out each reaction, if no growth is detected the reaction is seen as essential

Args:

model (cobraModel): Model loaded with COBRApy

Returns:

list[str]: BiGG Ids of essential reactions

refinegems.growth.get_essential_reactions_via_bounds(model: cobra.Model) → list[str]

Knocks out reactions by setting their bounds to 0, if no growth is detected the reaction is seen as essential

Args:

model (cobraModel): Model loaded with COBRApy

Returns:

list[str]: BiGG Ids of essential reactions

refinegems.growth.get_growth_selected_media(model: cobra.Model, media: list[str], basis: str, anaerobic: bool) → pandas.DataFrame

Simulates growth on all given media

Args:

model (cobraModel): Model loaded with COBRApy
media (list[str]): Ids of media to simulate on
basis (str): Either default_uptake (adding metabs from default) or minimal_uptake (adding metabs from minimal medium)
anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions

Returns:

pd.DataFrame: Information on growth behaviour on given media

refinegems.growth.get_minimal_uptake(model: cobra.Model) → list[str]

Determines which metabolites are used in a minimal medium

Args:

model (cobraModel): Model loaded with COBRApy

Returns:

list[str]: Metabolites consumed in minimal medium

refinegems.growth.get_missing_exchanges(model: cobra.Model, medium: pandas.DataFrame) → list[str]

Look for exchange reactions needed by the medium but not in the model

Args:

model (cobraModel): Model loaded with COBRApy
medium (pd.DataFrame): Dataframe with medium definition

Returns:

list[str]: Ids of all exchanges missing in the model but given in medium

refinegems.growth.growth_one_medium_from_default(model: cobra.Model, medium: pandas.DataFrame, anaerobic: bool) → pandas.DataFrame

Simulates growth on given medium, adding missing metabolites from the default uptake

Args:

model (cobraModel): Model loaded with COBRApy
medium (pd.DataFrame): Dataframe with medium definition
anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions

Returns:

pd.DataFrame: Information on growth behaviour on given medium

refinegems.growth.growth_one_medium_from_minimal(model: cobra.Model, medium: pandas.DataFrame, anaerobic: bool) → pandas.DataFrame

Simulates growth on given medium, adding missing metabolites from a minimal uptake

Args:

model (cobraModel): Model loaded with COBRApy
medium (pd.DataFrame): Dataframe with medium definition
anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions

Returns:

pd.DataFrame: Information on growth behaviour on given medium

refinegems.growth.modify_medium(medium: pandas.DataFrame, missing_exchanges: list[str]) → dict

Helper function: Remove exchanges from medium that are not in the model to avoid KeyError

Args:

medium (pd.DataFrame): Dataframe with medium definition
missing_exchanges (list): Ids of exchanges not in the model

Returns:

dict: Growth medium definition that can be used with the model (e.g. {‘EX_glc__D_e’ : 10.0})
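
The filtering step can be pictured as follows. This is a sketch on a plain dictionary rather than the dataframe the real function receives, and drop_missing_exchanges is a hypothetical name.

```python
def drop_missing_exchanges(medium, missing_exchanges):
    """Remove exchanges that the model does not contain from a medium dict."""
    missing = set(missing_exchanges)
    return {ex: flux for ex, flux in medium.items() if ex not in missing}


growth_medium = drop_missing_exchanges(
    {"EX_glc__D_e": 10.0, "EX_cobalt2_e": 10.0},
    ["EX_cobalt2_e"],
)
```

Assigning the unfiltered dictionary to a model that lacks one of the exchanges would fail on lookup, which is the KeyError the helper is meant to avoid.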

refinegems.growth.set_fluxes_to_simulate(reaction: cobra.Reaction) → cobra.Reaction

Helper function: Set flux bounds to -1000.0 and 1000.0 to enable model simulation with growth_one_medium_from_minimal/default

Args:

reaction (Reaction): Reaction with unusable flux bounds

Returns:

Reaction: Reaction with usable flux bounds

refinegems.growth.simulate_minimum_essential(model: cobra.Model, growth_medium: dict, minimum: list[str], anaerobic: bool) → float

Simulate growth with custom medium plus necessary uptakes

Args:

model (cobraModel): Model loaded with COBRApy
growth_medium (dict): Growth medium definition that can be used with the model. Output of modify_medium.
minimum (list[str]): Ids of exchanges of metabolites not present in the medium but necessary for growth. Output of find_minimum_essential.
anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions

Returns:

float: Growth value in mmol per (gram dry weight) per hour

refineGEMs.investigate module

refineGEMs.io module

Provides functions to load and write models, media definitions and the manual annotation table

Depending on the application the model needs to be loaded with cobra (memote) or with libSBML (activation of groups). The media definitions are denoted in a csv within the data folder of this repository, thus the functions will only work if the user clones the repository. The manual_annotations table has to follow the specific layout given in the data folder in order to work with this module.

refinegems.io.load_a_table_from_database(table_name_or_query: str) → pandas.DataFrame

Loads the table for which the name is provided or a table containing all rows for which the query evaluates to true from the refineGEMs database (‘data/database/data.db’)

Args:

table_name_or_query (str): Name of a table contained in the database ‘data.db’/ a SQL query

Returns:

pd.DataFrame: Containing the table for which the name was provided from the database ‘data.db’
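
The "table name or query" behaviour can be sketched with the standard library, assuming the database is an SQLite file (as the sqlite3 cursor used in the sboann module suggests). The helper load_rows, the table media and its columns are hypothetical, and the real function returns a dataframe rather than a list of tuples.

```python
import sqlite3


def load_rows(connection, table_name_or_query):
    """Return all rows of a table, or the rows selected by a SQL query."""
    text = table_name_or_query.strip()
    if text.lower().startswith("select"):
        query = text
    else:
        # Table names cannot be bound as parameters, so validate before use.
        if not text.isidentifier():
            raise ValueError(f"Not a valid table name: {text!r}")
        query = f"SELECT * FROM {text}"
    return connection.execute(query).fetchall()


# In-memory database with a made-up table, for demonstration only.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE media (medium TEXT, BiGG TEXT)")
con.executemany(
    "INSERT INTO media VALUES (?, ?)", [("LB", "glc__D"), ("M9", "nh4")]
)
all_rows = load_rows(con, "media")
m9_rows = load_rows(con, "SELECT BiGG FROM media WHERE medium = 'M9'")
```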

refinegems.io.load_all_media_from_db(mediumpath: str) → pandas.DataFrame

Helper function to extract media definitions from media_db.csv

Args:

mediumpath (str): Path to csv file with medium database

Returns:

pd.DataFrame: Table from csv with metabs added as BiGG_EX exchange reactions

refinegems.io.load_document_libsbml(modelpath: str) → libsbml.SBMLDocument

Loads model document using libSBML

Args:

modelpath (str): Path to GEM

Returns:

SBMLDocument: Loaded document by libSBML

refinegems.io.load_manual_annotations(tablepath: str = 'data/manual_curation.xlsx', sheet_name: str = 'metab') → pandas.DataFrame

Loads metabolite sheet from manual curation table

Args:

tablepath (str): Path to manual curation table. Defaults to ‘data/manual_curation.xlsx’.
sheet_name (str): Sheet name for metabolite annotations. Defaults to ‘metab’.

Returns:

pd.DataFrame: Table containing specified sheet from Excel file

refinegems.io.load_manual_gapfill(tablepath: str = 'data/manual_curation.xlsx', sheet_name: str = 'gapfill') → pandas.DataFrame

Loads gapfill sheet from manual curation table

Args:

tablepath (str): Path to manual curation table. Defaults to ‘data/manual_curation.xlsx’.
sheet_name (str): Sheet name for reaction gapfilling. Defaults to ‘gapfill’.

Returns:

pd.DataFrame: Table containing sheet with name ‘gapfill’|specified sheet_name from Excel file

refinegems.io.load_medium_custom(mediumpath: str) → pandas.DataFrame

Helper function to read medium csv

Args:

mediumpath (str): path to csv file with medium

Returns:

pd.DataFrame: Table of csv

refinegems.io.load_medium_from_db(mediumname: str) → pandas.DataFrame

Wrapper function to extract subtable for the requested medium from the database ‘data.db’

Args:

mediumname (str): Name of medium to test growth on

Returns:

pd.DataFrame: Table containing composition for one medium with metabs added as BiGG_EX exchange reactions

refinegems.io.load_model_cobra(modelpath: str) → cobra.Model

Loads model using COBRApy

Args:

modelpath (str): Path to GEM

Returns:

cobraModel: Loaded model by COBRApy

refinegems.io.load_model_libsbml(modelpath: str) → libsbml.Model

Loads model using libSBML

Args:

modelpath (str): Path to GEM

Returns:

libModel: loaded model by libSBML

refinegems.io.load_multiple_models(models: list[str], package: str) → list

Loads multiple models into a list

Args:

models (list): List of paths to models
package (str): COBRApy|libSBML

Returns:

list: List of model objects loaded with COBRApy|libSBML

refinegems.io.parse_dict_to_dataframe(str2list: dict) → pandas.DataFrame

Parses dictionary of form {str: list} and transforms it into a table with a column containing the strings and a column containing the lists

Args:

str2list (dict): Dictionary mapping strings to lists

Returns:

pd.DataFrame: Table with column containing the strings and column containing the lists

refinegems.io.parse_fasta_headers(filepath: str, id_for_model: bool = False) → pandas.DataFrame

Parses FASTA file headers to obtain the protein_id and the model_id (as generated by CarveMe) corresponding to each locus_tag

Args:

filepath (str): Path to FASTA file
id_for_model (bool): True if model_id similar to autogenerated GeneProduct ID should be contained in resulting table

Returns:

pd.DataFrame: Table containing the columns locus_tag, Protein_id & Model_id
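
To give an idea of the header parsing, the sketch below extracts locus_tag and protein_id from a header with bracketed key=value fields. The header format, the example identifiers and the helper parse_header are assumptions for illustration; the actual function may expect a different layout.

```python
import re

# Matches bracketed fields such as [locus_tag=ABC_0001].
FIELD_PATTERN = re.compile(r"\[(\w+)=([^\]]+)\]")


def parse_header(header):
    """Extract locus_tag and protein_id from one FASTA header line."""
    fields = dict(FIELD_PATTERN.findall(header))
    return {
        "locus_tag": fields.get("locus_tag"),
        "protein_id": fields.get("protein_id"),
    }


# Made-up header, not taken from a real genome.
example = (
    ">lcl|EXAMPLE_prot_1 [locus_tag=ABC_0001] "
    "[protein=hypothetical protein] [protein_id=XX_000000001.1]"
)
parsed = parse_header(example)
```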

refinegems.io.parse_gff_for_gp_info(gff_file: str) → pandas.DataFrame

Parses gff file of organism to find gene protein reactions based on locus tags

Args:

gff_file (str): Path to gff file of organism of interest

Returns:

pd.DataFrame: Table containing mapping from locus tag to GPR

refinegems.io.save_user_input(configpath: str) → dict[str, str]

Collects user input from the command line to create a config file. If no config file was given, the user input is saved to a new config file.

Args:

configpath (str): Path to config file if present

Returns:

dict: Either loaded config file or created from user input

refinegems.io.search_ncbi_for_gpr(locus: str) → str

Fetches protein name from NCBI

Args:

locus (str): NCBI compatible locus_tag

Returns:

str: Protein name|description

refinegems.io.search_sbo_label(sbo_number: str) → str

Looks up the SBO label corresponding to a given SBO Term number

Args:

sbo_number (str): Last three digits of SBO-Term as str

Returns:

str: Denoted label for given SBO Term

refinegems.io.validate_libsbml_model(model: libsbml.Model) → int

Debug method: Validates a libSBML model with the libSBML validator

Args:

model (libModel): A libSBML model

Returns:

int: Integer specifying whether the validation was successful or not

refinegems.io.write_report(dataframe: pandas.DataFrame, filepath: str)

Writes reports stored in dataframes to xlsx file

Args:

dataframe (pd.DataFrame): Table containing output
filepath (str): Path to file with filename

refinegems.io.write_to_file(model: libsbml.Model, new_filename: str)

Writes modified model to new file

Args:

model (libModel): Model loaded with libSBML
new_filename (str): Filename|Path for modified model

refineGEMs.modelseed module

Reports mismatches in charges and formulae based on ModelSEED

Extracts ModelSEED data from a given tsv file, extracts all metabolites from a given model. Both lists of metabolites are compared by charge and formula.

refinegems.modelseed.compare_model_modelseed(model_charges: pandas.DataFrame, modelseed_charges: pandas.DataFrame) → pandas.DataFrame

Compares tables with charges / formulae from model & modelseed

Args:

model_charges (pd.DataFrame): Charges and formulae of model metabolites. Output of get_model_charges.
modelseed_charges (pd.DataFrame): Charges and formulae of ModelSEED metabolites. Output of get_modelseed_charges.

Returns:

pd.DataFrame: Table containing info whether charges / formulae match
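
The comparison amounts to matching entries by BiGG Id and checking charge and formula separately. The following sketch uses plain dictionaries instead of dataframes; compare_entries, the output keys and the example values are hypothetical and purely illustrative.

```python
def compare_entries(model_entries, modelseed_entries):
    """Compare (charge, formula) pairs of metabolites shared by both sources."""
    rows = []
    for bigg_id, (charge, formula) in model_entries.items():
        if bigg_id not in modelseed_entries:
            continue  # nothing to compare against
        ms_charge, ms_formula = modelseed_entries[bigg_id]
        rows.append(
            {
                "BiGG": bigg_id,
                "charge_match": charge == ms_charge,
                "formula_match": formula == ms_formula,
            }
        )
    return rows


# Illustrative values only.
model_entries = {"atp": (-4, "C10H12N5O13P3"), "h2o": (0, "H2O"), "foo": (1, "X")}
modelseed_entries = {"atp": (-3, "C10H13N5O13P3"), "h2o": (0, "H2O")}
comparison = compare_entries(model_entries, modelseed_entries)
```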

refinegems.modelseed.compare_to_modelseed(model: cobra.Model) → tuple[pandas.DataFrame, pandas.DataFrame]

Executes all steps to compare model metabolites to ModelSEED metabolites

Args:

model (cobraModel): Model loaded with COBRApy

Returns:

tuple: Tables with charge (1) & formula (2) mismatches

pd.DataFrame: Table with charge mismatches
pd.DataFrame: Table with formula mismatches

refinegems.modelseed.get_charge_mismatch(df_comp: pandas.DataFrame) → pandas.DataFrame

Extracts metabolites with charge mismatch of model & modelseed

Args:

df_comp (pd.DataFrame): Charge and formula mismatches. Output from compare_model_modelseed.

Returns:

pd.DataFrame: Table containing metabolites with charge mismatch

refinegems.modelseed.get_compared_formulae(formula_mismatch: pandas.DataFrame) → pandas.DataFrame

Compare formula by atom pattern

Args:

formula_mismatch (pd.DataFrame): Table with column containing atom comparison. Output from get_formula_mismatch.

Returns:

pd.DataFrame: Table containing metabolites with formula mismatch

refinegems.modelseed.get_formula_mismatch(df_comp: pandas.DataFrame) → pandas.DataFrame

Extracts metabolites with formula mismatch of model & modelseed

Args:

df_comp (pd.DataFrame): Charge and formula mismatches. Output from compare_model_modelseed.

Returns:

pd.DataFrame: Table containing metabolites with formula mismatch

refinegems.modelseed.get_model_charges(model: cobra.Model) → pandas.DataFrame

Extracts all metabolites from model

Args:

model (cobraModel): Model loaded with COBRApy

Returns:

pd.DataFrame: Table containing charges and formulae of model metabolites

refinegems.modelseed.get_modelseed_charges(modelseed_compounds: pandas.DataFrame) → pandas.DataFrame

Extract table with BiGG, charges and formulae

Args:

modelseed_compounds (pd.DataFrame): ModelSEED data. Output from get_modelseed_compounds.

Returns:

pd.DataFrame: Table containing charges and formulae of ModelSEED metabolites

refinegems.modelseed.get_modelseed_compounds() → pandas.DataFrame

Extracts compounds from ModelSEED which have BiGG Ids

Returns:

pd.DataFrame: Table containing ModelSEED data

refineGEMs.pathways module

Provides functions for adding KEGG reactions as Group Pathways

If your organism occurs in the KEGG database, extract the KEGG reaction ID from the annotations of your reactions and identify the KEGG pathways in which each reaction occurs. Then add all KEGG pathways for a reaction as annotations with the biological qualifier ‘OCCURS_IN’ to the respective reaction.

refinegems.pathways.add_kegg_pathways(model, kegg_pathways)

Add KEGG reactions as BQB_OCCURS_IN

Args:

model (libModel): Model loaded with libSBML. Output of load_model_enable_groups.
kegg_pathways (dict): Reaction Id as key and Kegg Pathway Id as value. Output of extract_kegg_pathways.

Returns:

libModel: Modified model with KEGG pathways

refinegems.pathways.create_pathway_groups(model: libsbml.Model, pathway_groups)

Use group module to add reactions to Kegg pathway

Args:

model (libModel): Model loaded with libSBML. Output of load_model_enable_groups.
pathway_groups (dict): Kegg Pathway Id as key and reactions Ids as values. Output of get_pathway_groups.

Returns:

libModel: modified model with groups for pathways

refinegems.pathways.extract_kegg_pathways(kegg_reactions: dict) → dict

Finds pathways for reactions in the model with KEGG Ids by accessing the KEGG API; uses tqdm to report progress to the user

Args:

kegg_reactions (dict): Reaction Id as key and Kegg Id as value. Output[0] from extract_kegg_reactions.

Returns:

dict: Reaction Id as key and Kegg Pathway Id as value

refinegems.pathways.extract_kegg_reactions(model: libsbml.Model) → tuple[dict, list]

Extract KEGG Ids from reactions

Args:

model (libModel): Model loaded with libSBML. Output of load_model_enable_groups.

Returns:

tuple: Dictionary ‘reaction_id’: ‘KEGG_id’ (1) & List of reactions without KEGG Id (2)

dict: Reaction Id as key and Kegg Id as value
list: Ids of reactions without KEGG annotation

refinegems.pathways.get_pathway_groups(kegg_pathways)

Group reaction into pathways

Args:

kegg_pathways (dict): Reaction Id as key and Kegg Pathway Id as value. Output of extract_kegg_pathways.

Returns:

dict: Kegg Pathway Id as key and reactions Ids as values
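
Grouping reactions into pathways is essentially an inversion of the reaction-to-pathway mapping. The sketch below assumes each reaction maps to a list of pathway Ids; group_by_pathway is a hypothetical name and the Ids are illustrative only.

```python
from collections import defaultdict


def group_by_pathway(kegg_pathways):
    """Invert a reaction -> pathways mapping into pathway -> reactions."""
    groups = defaultdict(list)
    for reaction_id, pathway_ids in kegg_pathways.items():
        for pathway_id in pathway_ids:
            groups[pathway_id].append(reaction_id)
    return dict(groups)


pathway_groups = group_by_pathway(
    {"R_PGI": ["rn00010", "rn00030"], "R_PFK": ["rn00010"]}
)
```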

refinegems.pathways.kegg_pathways(modelpath: str) → tuple[libsbml.Model, list[str]]

Executes all steps to add KEGG pathways as groups

Args:

modelpath (str): Path to GEM

Returns:

tuple: libSBML model (1) & List of reactions without KEGG Id (2)

libModel: Modified model with Pathways as groups
list: Ids of reactions without KEGG annotation

refinegems.pathways.load_model_enable_groups(modelpath: str) → libsbml.Model

Loads model as document using libSBML and enables groups extension

Args:

modelpath (str): Path to GEM

Returns:

libModel: Model loaded with libSBML

refineGEMs.polish module

refineGEMs.sboann module

Provides functions to automate the addition of SBO terms to the model

Script written by Elisabeth Fritze in her bachelor thesis. Modified by Gwendolyn O. Gusak during her master thesis. Commented by Famke Bäuerle and extended by Nantia Leonidou.

It is split into many small functions which are all annotated. However, when using it for SBO-Term annotation it only makes sense to run the “main” function, sbo_annotation(model_libsbml), if you want to continue with the model. The smaller functions might be useful if special information is needed for a reaction without the context of a bigger model or when the automated annotation fails for some reason.

refinegems.sboann.addSBOforCompartments(model)

refinegems.sboann.addSBOforGenes(model)

refinegems.sboann.addSBOforGroups(model)

refinegems.sboann.addSBOforMetabolites(model)

refinegems.sboann.addSBOforModel(model)

refinegems.sboann.addSBOforParameters(model)

refinegems.sboann.addSBOfromDB(reac: libsbml.Reaction, cur) → bool

Adds SBO term based on bigg id of a reaction

Args:

reac (Reaction): Reaction from sbml model
cur (sqlite3.connect.cursor): Used to access the sqlite3 database

Returns:

bool: True if SBO Term was changed

refinegems.sboann.addSBOviaEC(reac: libsbml.Reaction, cur)

Adds SBO terms based on EC numbers given in the annotations of a reaction

Args:

reac (Reaction): Reaction from sbml model
cur (sqlite3.connect.cursor): Used to access the sqlite3 database

refinegems.sboann.checkAcetylationViaEC(reac: libsbml.Reaction)

Tests if reac is acetylation by its EC-Code and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkActiveTransport(reac: libsbml.Reaction)

Tests if reac is active transport (uses atp/pep) and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkBiomass(reac: libsbml.Reaction)

Tests if reac is biomass / growth and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkCoTransport(reac: libsbml.Reaction)

Tests if reac is co-transport and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkDeamination(reac: libsbml.Reaction)

Tests if reac is deamination and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkDeaminationViaEC(reac: libsbml.Reaction)

Tests if reac is deamination by its EC-Code and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkDecarbonylation(reac: libsbml.Reaction)

Tests if reac is decarbonylation and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkDecarboxylation(reac: libsbml.Reaction)

Tests if reac is decarboxylation and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkDecarboxylationViaEC(reac: libsbml.Reaction)

Tests if reac is decarboxylation by its EC-Code and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkDemand(reac: libsbml.Reaction)

Tests if reac is demand and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkExchange(reac: libsbml.Reaction)

Tests if reac is exchange and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkGlycosylation(reac: libsbml.Reaction)

Tests if reac is glycosylation and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkGlycosylationViaEC(reac: libsbml.Reaction)

Tests if reac is glycosylation by its EC-Code and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkHydrolysisViaEC(reac: libsbml.Reaction)

Tests if reac is hydrolysis by its EC-Code and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkIsomerisationViaEC(reac: libsbml.Reaction)

Tests if reac is isomerisation by its EC-Code and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkMethylationViaEC(reac: libsbml.Reaction)

Tests if reac is methylation by its EC-Code and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkPassiveTransport(reac: libsbml.Reaction)

Tests if reac is passive transport and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkPhosphorylation(reac: libsbml.Reaction)

Tests if reac is phosphorylase / kinase and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkRedox(reac: libsbml.Reaction)

Tests if reac is redox and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkRedoxViaEC(reac: libsbml.Reaction)

Tests if reac is redox by its EC-Code and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkSink(reac: libsbml.Reaction)

Tests if reac is sink and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.checkTransaminationViaEC(reac: libsbml.Reaction)

Tests if reac is transamination by its EC-Code and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.getCompartmentDict(reac: libsbml.Reaction)

Sorts metabolites by compartment

Args:

reac (Reaction): Reaction from sbml model

Returns:

dict: compartment as key and metabolites as values

refinegems.sboann.getCompartmentFromSpeciesRef(speciesReference: libsbml.SpeciesReference) → libsbml.Compartment

Extracts compartment from a species by its reference

Args:

speciesReference (SpeciesReference): Reference to species

Returns:

Compartment: Compartment which the species lives in

refinegems.sboann.getCompartmentList(reac: libsbml.Reaction)

Extracts compartments of metabolites

Args:

reac (Reaction): Reaction from sbml model

Returns:

set: compartment information of all metabolites

refinegems.sboann.getCompartmentlessMetaboliteIds(reac: libsbml.Reaction)

Extracts metabolites which have no compartment information

Args:

reac (Reaction): Reaction from sbml model

Returns:

list: all metabolites which have no compartment

refinegems.sboann.getCompartmentlessProductIds(reac: libsbml.Reaction)

Extracts products which have no compartment information

Args:

reac (Reaction): Reaction from sbml model

Returns:

list: products (metabolites) without compartments

refinegems.sboann.getCompartmentlessReactantIds(reac: libsbml.Reaction)

Extracts reactants which have no compartment information

Args:

reac (Reaction): Reaction from sbml model

Returns:

list: reactants (metabolites) without compartments

refinegems.sboann.getCompartmentlessSpeciesId(speciesReference: libsbml.SpeciesReference) → str

Determines whether a species has a compartment by its reference

Args:

speciesReference (SpeciesReference): Reference to species

Returns:

str: Id of species without compartment

refinegems.sboann.getECNums(reac: libsbml.Reaction)

Extracts EC-Code from the reaction annotations

Args:

reac (Reaction): Reaction from sbml model

Returns:

list: all EC-Numbers of the reaction
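
How EC numbers might be pulled from annotation URIs, and how the first digit relates to the "ViaEC" checks in this module, is sketched below. The URI form and the helper ec_numbers_from_uris are assumptions; the real function reads the CVTerms of a libSBML reaction. The top-level EC classes are standard enzyme nomenclature, and no SBO terms are assigned here.

```python
import re

# Accepts ".../ec-code/1.2.3.4" as well as ".../ec-code:1.2.3.4".
EC_PATTERN = re.compile(r"ec-code[/:]([0-9]+(?:\.(?:[0-9]+|-)){3})")

# Top-level EC classes, keyed by the first digit of the EC number.
EC_CLASSES = {
    1: "oxidoreductase",
    2: "transferase",
    3: "hydrolase",
    4: "lyase",
    5: "isomerase",
    6: "ligase",
    7: "translocase",
}


def ec_numbers_from_uris(uris):
    """Collect EC numbers from a list of annotation URIs."""
    found = []
    for uri in uris:
        match = EC_PATTERN.search(uri)
        if match:
            found.append(match.group(1))
    return found


ec_numbers = ec_numbers_from_uris(
    [
        "https://identifiers.org/ec-code/5.3.1.9",
        "https://identifiers.org/bigg.reaction/PGI",
    ]
)
ec_classes = [EC_CLASSES[int(ec.split(".")[0])] for ec in ec_numbers]
```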

refinegems.sboann.getListOfMetabolites(reac: libsbml.Reaction)

Extracts list of metabolites of the reaction

Args:

reac (Reaction): Reaction from sbml model

Returns:

list: metabolites that are part of the reaction

refinegems.sboann.getMetaboliteIds(reac: libsbml.Reaction)

Extracts list of metabolite ids of reaction

Args:

reac (Reaction): Reaction from sbml model

Returns:

list: metabolite ids

refinegems.sboann.getProductCompartmentList(reac: libsbml.Reaction)

Extracts compartments of products

Args:

reac (Reaction): Reaction from sbml model

Returns:

set: compartment information of all products (metabolites)

refinegems.sboann.getProductIds(reac: libsbml.Reaction)

Extracts products (metabolites) of reaction

Args:

reac (Reaction): Reaction from sbml model

Returns:

list: products (metabolites) ids

refinegems.sboann.getReactantCompartmentList(reac: libsbml.Reaction)

Extracts compartments of reactants

Args:

reac (Reaction): Reaction from sbml model

Returns:

set: compartment information of all reactants (metabolites)

refinegems.sboann.getReactantIds(reac: libsbml.Reaction) → list[str]

Extracts reactants (metabolites) of reaction

Args:

reac (Reaction): Reaction from sbml model

Returns:

list[str]: Reactants (metabolites) ids

refinegems.sboann.hasReactantPair(reac: libsbml.Reaction, met1: libsbml.Species, met2: libsbml.Species) → bool

Checks if a pair of metabolites is present in reaction | needed for special reactions like redox or deamination

Args:

reac (Reaction): Reaction from sbml model
met1 (Species): metabolite 1 of metabolite pair
met2 (Species): metabolite 2 of metabolite pair

Returns:

bool: True if one of the metabolites is in reactants and the other in products
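
The pair check described here can be expressed on plain id lists as follows. This is a sketch of the documented return condition only; has_reactant_pair and its list-based arguments are hypothetical, whereas the real function works on a libSBML reaction.

```python
def has_reactant_pair(reactant_ids, product_ids, met1, met2):
    """True if one metabolite is a reactant and the other one a product."""
    forward = met1 in reactant_ids and met2 in product_ids
    backward = met2 in reactant_ids and met1 in product_ids
    return forward or backward


reactants = ["M_nad_c", "M_lac__L_c"]
products = ["M_nadh_c", "M_pyr_c", "M_h_c"]
is_redox_pair = has_reactant_pair(reactants, products, "M_nad_c", "M_nadh_c")
```

Checking both directions means the result does not depend on how the reaction is written.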

refinegems.sboann.isProtonTransport(reac: libsbml.Reaction)

Checks if reaction is proton transport

Args:

reac (Reaction): Reaction from sbml model

Returns:

bool: True if reaction is proton transport

refinegems.sboann.moreThanTwoCompartmentTransport(reac: libsbml.Reaction)

Checks if reaction traverses more than 2 compartments

Args:

reac (Reaction): Reaction from sbml model

Returns:

bool: True if reaction traverses more than 2 compartments

refinegems.sboann.returnCompartment(id): Helper to split compartment id

refinegems.sboann.sbo_annotation(model_libsbml: libsbml.Model) → libsbml.Model

Executes all steps to annotate SBO terms to a given model (former main function of original script by Elisabeth Fritze)

Args:

model_libsbml (libModel): Model loaded with libsbml

Returns:

libModel: Modified model with SBO terms

refinegems.sboann.soleProtonTransported(reac: libsbml.Reaction)

Checks if reaction is transport powered by one H

Args:

reac (Reaction): Reaction from sbml model

Returns:

bool: True if reaction is transport powered by one H

refinegems.sboann.splitSymAntiPorter(reac: libsbml.Reaction)

Tests if reac is sym- or antiporter and sets SBO Term if true

Args:

reac (Reaction): Reaction from sbml model

refinegems.sboann.splitTransportBiochem(reac: libsbml.Reaction)

Tests if reaction traverses more than 1 compartment and set SBO Term

Args:

reac (Reaction): Reaction from sbml model