refineGEMs package

Here is an overview of all functions. All imports are mocked via autodoc_mock_imports in the conf.py file to enable automatic building.

refineGEMs.biomass module

refineGEMs.charges module

Provides functions for adding charges to metabolites

When iterating through all metabolites present in a model, you will find several that have no defined charge (metab.getPlugin(‘fbc’).isSetCharge() returns False). This can lead to charge-imbalanced reactions. This script takes information on metabolite charges from the ModelSEED database. A charge is automatically added to a metabolite if it has no defined charge and ModelSEED lists exactly one charge for it. When multiple charges are listed, the metabolite and its possible charges are noted and later returned in a dictionary.

It is possible to use the correct_charges_from_db function with other databases. The user just needs to make sure that the compounds dataframe has a ‘BiGG’ and a ‘charge’ column.
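The rule described above can be sketched in plain Python. This is only an illustration of the decision logic, not the refineGEMs implementation: the helper name assign_charges, the dictionary-based metabolite representation and the sample rows are assumptions, whereas the real function works on a libSBML model and a pandas DataFrame.

```python
from collections import defaultdict


def assign_charges(metabolites, compounds):
    """Illustrative sketch of the charge-assignment rule (hypothetical helper).

    metabolites: dict mapping BiGG id -> charge, or None if no charge is set
    compounds:   list of (BiGG id, charge) rows, standing in for the
                 'BiGG' and 'charge' columns of the compounds table
    """
    # Collect every distinct charge the database lists per BiGG id.
    db_charges = defaultdict(set)
    for bigg_id, charge in compounds:
        db_charges[bigg_id].add(int(charge))

    multiple = {}
    for bigg_id, charge in metabolites.items():
        if charge is not None:
            continue  # already has a defined charge, leave it untouched
        candidates = sorted(db_charges.get(bigg_id, ()))
        if len(candidates) == 1:
            metabolites[bigg_id] = candidates[0]  # unambiguous: add it
        elif len(candidates) > 1:
            multiple[bigg_id] = candidates  # ambiguous: report to the user
    return metabolites, multiple


# Made-up example data, not taken from ModelSEED.
metabolites = {"atp": -4, "h2o": None, "pi": None, "xyz": None}
compounds = [("h2o", 0), ("pi", -2), ("pi", -3), ("atp", -4)]
updated, ambiguous = assign_charges(metabolites, compounds)
```

In this sketch a metabolite that is absent from the table (xyz) simply stays without a charge, and only the ambiguous case ends up in the returned dictionary.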

refinegems.charges.correct_charges_from_db(model: libsbml.Model, compounds: pandas.DataFrame) tuple[libsbml.Model, dict]

Adds charges taken from given database to metabolites which have no defined charge

Args:
  • model (libModel): Model loaded with libsbml

  • compounds (pd.DataFrame): Containing database data with ‘BiGG’ (BiGG-Ids) and ‘charge’ (float or int) as columns

Returns:
tuple: libSBML model (1) & dictionary ‘metabolite_id’: list(charges) (2)
  1. libModel: Model with added charges

  2. dict: Metabolites with respective multiple charges

refinegems.charges.correct_charges_modelseed(model: libsbml.Model) tuple[libsbml.Model, dict]

Wrapper function which completes the steps to charge correction with the ModelSEED database

Args:
  • model (libModel): Model loaded with libsbml

Returns:
tuple: libSBML model (1) & dictionary ‘metabolite_id’: list(charges) (2)
  1. libModel: Model with added charges

  2. dict: Metabolites with respective multiple charges

refineGEMs.comparison module

refineGEMs.curate module

Functions to enable annotation of entities using a manually curated table

While working on GEMs the user might come across ill-annotated or missing metabolites, reactions and genes. This module aims to speed up manual curation by letting the user edit an Excel table directly, which is then used to update the given model. This module also makes use of the cvterms module.

refinegems.curate.add_reactions_from_table(model: libsbml.Model, table: pandas.DataFrame, email: str) libsbml.Model

Wrapper function to use with table format given in data/manual_curation.xlsx, sheet gapfill: Adds all reactions with their info given in the table to the given model

Args:
  • model (libModel): Model loaded with libSBML

  • table (pd.DataFrame): Table in format of sheet gapfill from manual_curation.xlsx located in the data folder

  • email (str): User Email to access the NCBI Entrez database

Returns:

libModel: Modified model with new reactions

refinegems.curate.update_annotations_from_others(model: libsbml.Model) libsbml.Model

Synchronizes metabolite annotations for core, periplasm and extracellular

Args:
  • model (libModel): Model loaded with libSBML

Returns:

libModel: Modified model with synchronized annotations

refinegems.curate.update_annotations_from_table(model: libsbml.Model, table: pandas.DataFrame) libsbml.Model

Wrapper function to use with table format given in data/manual_curation.xlsx, sheet metabs: Updates annotation of metabolites given in the table

Args:
  • model (libModel): Model loaded with libSBML

  • table (pd.DataFrame): Table in format of sheet metabs from manual_curation.xlsx located in the data folder

Returns:

libModel: Modified model with new annotations

refineGEMs.cvterms module

Helper module to work with annotations (CVTerms)

Stores dictionaries which hold information on the identifiers.org syntax, and provides functions to add CVTerms to different entities and to parse CVTerms.

refinegems.cvterms.add_cv_term_genes(entry: str, db_id: str, gene: libsbml.GeneProduct, lab_strain: bool = False)

Adds CVTerm to a gene

Args:
  • entry (str): Id to add as annotation

  • db_id (str): Database to which entry belongs. Must be in gene_db_dict.keys().

  • gene (GeneProduct): Gene to add CVTerm to

  • lab_strain (bool, optional): For locally sequenced strains the qualifiers are always HOMOLOG_TO. Defaults to False.

refinegems.cvterms.add_cv_term_metabolites(entry: str, db_id: str, metab: libsbml.Species)

Adds CVTerm to a metabolite

Args:
  • entry (str): Id to add as annotation

  • db_id (str): Database to which entry belongs. Must be in metabol_db_dict.keys().

  • metab (Species): Metabolite to add CVTerm to

refinegems.cvterms.add_cv_term_pathways(entry: str, db_id: str, path: libsbml.Group)

Adds CVTerm to a pathway (Group)

Args:
  • entry (str): Id to add as annotation

  • db_id (str): Database to which entry belongs. Must be in pathway_db_dict.keys().

  • path (Group): Pathway to add CVTerm to

refinegems.cvterms.add_cv_term_pathways_to_entity(entry: str, db_id: str, reac: libsbml.Reaction)

Add CVTerm to a reaction as OCCURS IN pathway

Args:
  • entry (str): Id to add as annotation

  • db_id (str): Database to which entry belongs

  • reac (Reaction): Reaction to add CVTerm to

refinegems.cvterms.add_cv_term_reactions(entry: str, db_id: str, reac: libsbml.Reaction)

Adds CVTerm to a reaction

Args:
  • entry (str): Id to add as annotation

  • db_id (str): Database to which entry belongs. Must be in reaction_db_dict.keys().

  • reac (Reaction): Reaction to add CVTerm to

refinegems.cvterms.add_cv_term_units(unit_id: str, unit: libsbml.Unit, relation: int)

Adds CVTerm to a unit

Args:
  • unit_id (str): ID to add as URI to annotation

  • unit (Unit): Unit to add CVTerm to

  • relation (int): Provides model qualifier to be added

refinegems.cvterms.generate_cvterm(qt, b_m_qt) libsbml.CVTerm

Generates a CVTerm with the provided qualifier & biological or model qualifier types

Args:
  • qt (libSBML qualifier type): BIOLOGICAL_QUALIFIER or MODEL_QUALIFIER

  • b_m_qt (libSBML qualifier): BQM_IS, BQM_IS_HOMOLOG_TO, etc.

Returns:

CVTerm: With provided qualifier & biological or model qualifier types

refinegems.cvterms.get_id_from_cv_term(entity: libsbml.SBase, db_id: str) list[str]

Extract Id for a specific database from CVTerm

Args:
  • entity (SBase): Species, Reaction, Gene, Pathway

  • db_id (str): Database of interest

Returns:

list[str]: Ids of entity belonging to db_id
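As a rough illustration of what extracting Ids for one database means, the sketch below filters a list of identifiers.org URIs by database prefix. It is not the refineGEMs code: the helper ids_for_database, the plain list of URI strings and the example URIs are assumptions, and the real function reads the URIs from the CVTerms of a libSBML entity.

```python
def ids_for_database(uris, db_prefix):
    """Return the ids of all URIs pointing to the given identifiers.org prefix.

    Hypothetical helper; assumes URIs of the form
    https://identifiers.org/<prefix>/<id>.
    """
    marker = f"identifiers.org/{db_prefix}/"
    ids = []
    for uri in uris:
        # partition() yields an empty separator if the marker is absent.
        _, found, identifier = uri.partition(marker)
        if found and identifier:
            ids.append(identifier)
    return ids


uris = [
    "https://identifiers.org/bigg.metabolite/glc__D",
    "https://identifiers.org/kegg.compound/C00031",
    "https://identifiers.org/kegg.compound/C00221",
]
kegg_ids = ids_for_database(uris, "kegg.compound")
```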

refinegems.cvterms.print_cvterm(cvterm: libsbml.CVTerm)

Debug function: Prints the URIs contained in the provided CVTerm along with the provided qualifier & biological/model qualifier types

Args:

cvterm (CVTerm): A libSBML CVTerm

refineGEMs.gapfill module

refineGEMs.growth module

Provides functions to simulate growth on any medium

Tailored to work with the media denoted in the local db, but should work with any medium as long as it is defined in a csv with ; as delimiter and BiGG Ids for the compounds. Use refinegems.io.load_medium_custom and hand the result to the growth_one_medium_from_default or growth_one_medium_from_minimal function.
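A custom medium file of the kind described above might be read as in the following sketch, which uses only the standard library. The column names name and BiGG, the sample rows and the EX_<id>_e naming of the exchange reactions are assumptions made for illustration; check the media definitions in the data folder for the layout actually expected.

```python
import csv
import io

# Hypothetical medium definition with ';' as delimiter.
medium_csv = "name;BiGG\nD-Glucose;glc__D\nWater;h2o\nPhosphate;pi\n"

# Read the semicolon-delimited table into a list of dictionaries.
reader = csv.DictReader(io.StringIO(medium_csv), delimiter=";")
rows = list(reader)

# Derive exchange reaction ids from the BiGG ids (assumed naming scheme).
exchanges = ["EX_" + row["BiGG"] + "_e" for row in rows]
```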

refinegems.growth.find_additives(model: cobra.Model, base_medium: dict) pandas.DataFrame

Iterates through all exchanges to find metabolites that lead to a higher growth rate compared to the growth rate yielded on the base_medium

Args:
  • model (cobraModel): Model loaded with COBRApy

  • base_medium (dict): Exchanges as keys and their flux bound as value (e.g. {‘EX_glc__D_e’: 10.0})

Returns:

pd.DataFrame: Exchanges sorted from highest to lowest growth rate improvement

refinegems.growth.find_minimum_essential(medium: pandas.DataFrame, essential: list[str]) list[str]

Report metabolites necessary for growth and not in custom medium

Args:
  • medium (pd.DataFrame): Dataframe with medium definition

  • essential (list[str]): Ids of all metabolites which lead to zero growth if blocked. Output of find_missing_essential.

Returns:

list[str]: Ids of exchanges of metabolites not present in the medium but necessary for growth
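Conceptually this is a set difference between the essential exchanges and those the medium supplies. A minimal sketch, assuming the medium has already been reduced to a list of exchange ids (the helper name and the example ids are hypothetical):

```python
def missing_from_medium(medium_exchanges, essential):
    """Essential exchanges that the medium does not supply (hypothetical helper)."""
    supplied = set(medium_exchanges)
    # Keep the order of the essential list for reproducible reports.
    return [ex for ex in essential if ex not in supplied]


medium_exchanges = ["EX_glc__D_e", "EX_h2o_e", "EX_pi_e"]
essential = ["EX_glc__D_e", "EX_fe2_e", "EX_pi_e", "EX_nh4_e"]
minimum = missing_from_medium(medium_exchanges, essential)
```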

refinegems.growth.find_missing_essential(model: cobra.Model, growth_medium: dict, default_uptake: list[str], anaerobic: bool) list[str]

Report which exchange reactions are needed for growth, combines default uptake and valid new medium

Args:
  • model (cobraModel): Model loaded with COBRApy

  • growth_medium (dict): Growth medium definition that can be used with the model. Output of modify_medium.

  • default_uptake (list[str]): Metabolites consumed in standard medium

  • anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions

Returns:

list[str]: Ids of exchanges of all metabolites which lead to zero growth if blocked

refinegems.growth.get_all_minimum_essential(model: cobra.Model, media: list[str]) pandas.DataFrame

Returns metabolites necessary for growth and not in media

Args:
  • model (cobraModel): Model loaded with COBRApy

  • media (list[str]): Containing the names of all media for which the growth essential metabolites not contained in the media should be returned

Returns:

pd.DataFrame: Information on which metabolites are missing in each medium

refinegems.growth.get_default_secretion(model: cobra.Model) list[str]

Checks fluxes after FBA, if positive the metabolite is produced

Args:
  • model (cobraModel): Model loaded with COBRApy

Returns:

list[str]: BiGG Ids of produced metabolites

refinegems.growth.get_default_uptake(model: cobra.Model) list[str]

Determines which metabolites are used in the standard medium

Args:
  • model (cobraModel): Model loaded with COBRApy

Returns:

list[str]: Metabolites consumed in standard medium

refinegems.growth.get_essential_reactions(model: cobra.Model) list[str]

Knocks out each reaction, if no growth is detected the reaction is seen as essential

Args:
  • model (cobraModel): Model loaded with COBRApy

Returns:

list[str]: BiGG Ids of essential reactions

refinegems.growth.get_essential_reactions_via_bounds(model: cobra.Model) list[str]

Knocks out reactions by setting their bounds to 0, if no growth is detected the reaction is seen as essential

Args:
  • model (cobraModel): Model loaded with COBRApy

Returns:

list[str]: BiGG Ids of essential reactions

refinegems.growth.get_growth_selected_media(model: cobra.Model, media: list[str], basis: str, anaerobic: bool) pandas.DataFrame

Simulates growth on all given media

Args:
  • model (cobraModel): Model loaded with COBRApy

  • media (list[str]): Ids of media to simulate on

  • basis (str): Either default_uptake (adding metabs from default) or minimal_uptake (adding metabs from minimal medium)

  • anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions

Returns:

pd.DataFrame: Information on growth behaviour on given media

refinegems.growth.get_minimal_uptake(model: cobra.Model) list[str]

Determines which metabolites are used in a minimal medium

Args:
  • model (cobraModel): Model loaded with COBRApy

Returns:

list[str]: Metabolites consumed in minimal medium

refinegems.growth.get_missing_exchanges(model: cobra.Model, medium: pandas.DataFrame) list[str]

Look for exchange reactions needed by the medium but not in the model

Args:
  • model (cobraModel): Model loaded with COBRApy

  • medium (pd.DataFrame): Dataframe with medium definition

Returns:

list[str]: Ids of all exchanges missing in the model but given in medium

refinegems.growth.growth_one_medium_from_default(model: cobra.Model, medium: pandas.DataFrame, anaerobic: bool) pandas.DataFrame

Simulates growth on given medium, adding missing metabolites from the default uptake

Args:
  • model (cobraModel): Model loaded with COBRApy

  • medium (pd.DataFrame): Dataframe with medium definition

  • anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions

Returns:

pd.DataFrame: Information on growth behaviour on given medium

refinegems.growth.growth_one_medium_from_minimal(model: cobra.Model, medium: pandas.DataFrame, anaerobic: bool) pandas.DataFrame

Simulates growth on given medium, adding missing metabolites from a minimal uptake

Args:
  • model (cobraModel): Model loaded with COBRApy

  • medium (pd.DataFrame): Dataframe with medium definition

  • anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions

Returns:

pd.DataFrame: Information on growth behaviour on given medium

refinegems.growth.modify_medium(medium: pandas.DataFrame, missing_exchanges: list[str]) dict

Helper function: Remove exchanges from medium that are not in the model to avoid KeyError

Args:
  • medium (pd.DataFrame): Dataframe with medium definition

  • missing_exchanges (list): Ids of exchanges not in the model

Returns:

dict: Growth medium definition that can be used with the model (e.g. {‘EX_glc__D_e’: 10.0})
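The filtering step can be pictured as follows. This is a sketch under assumptions, not the library code: the helper name, the list-based medium and the uniform bound of 10.0 are chosen for illustration only.

```python
def drop_missing_exchanges(medium_exchanges, missing_exchanges, bound=10.0):
    """Build a medium dict that only contains exchanges present in the model.

    Hypothetical helper; the uniform flux bound is an assumption.
    """
    missing = set(missing_exchanges)
    return {ex: bound for ex in medium_exchanges if ex not in missing}


medium_exchanges = ["EX_glc__D_e", "EX_h2o_e", "EX_cobalt2_e"]
growth_medium = drop_missing_exchanges(medium_exchanges, ["EX_cobalt2_e"])
```

Dropping the exchanges up front means that assigning the dictionary to a model never refers to a reaction the model does not contain, which is the KeyError the helper is meant to avoid.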

refinegems.growth.set_fluxes_to_simulate(reaction: cobra.Reaction) cobra.Reaction

Helper function: Set flux bounds to -1000.0 and 1000.0 to enable model simulation with growth_one_medium_from_minimal/default

Args:
  • reaction (Reaction): Reaction with unusable flux bounds

Returns:

Reaction: Reaction with usable flux bounds

refinegems.growth.simulate_minimum_essential(model: cobra.Model, growth_medium: dict, minimum: list[str], anaerobic: bool) float

Simulate growth with custom medium plus necessary uptakes

Args:
  • model (cobraModel): Model loaded with COBRApy

  • growth_medium (dict): Growth medium definition that can be used with the model. Output of modify_medium.

  • minimum (list[str]): Ids of exchanges of metabolites not present in the medium but necessary for growth. Output of find_minimum_essential.

  • anaerobic (bool): If True ‘EX_o2_e’ is set to 0.0 to simulate anaerobic conditions

Returns:

float: Growth value in mmol per (gram dry weight) per hour

refineGEMs.investigate module

refineGEMs.io module

Provides functions to load and write models, media definitions and the manual annotation table

Depending on the application the model needs to be loaded with cobra (memote) or with libSBML (activation of groups). The media definitions are denoted in a csv within the data folder of this repository, thus the functions will only work if the user clones the repository. The manual_annotations table has to follow the specific layout given in the data folder in order to work with this module.

refinegems.io.load_a_table_from_database(table_name_or_query: str) pandas.DataFrame

Loads the table for which the name is provided, or a table containing all rows for which the query evaluates to true, from the refineGEMs database (‘data/database/data.db’)

Args:
  • table_name_or_query (str): Name of a table contained in the database ‘data.db’ or an SQL query

Returns:

pd.DataFrame: Containing the table for which the name was provided from the database ‘data.db’

refinegems.io.load_all_media_from_db(mediumpath: str) pandas.DataFrame

Helper function to extract media definitions from media_db.csv

Args:
  • mediumpath (str): Path to csv file with medium database

Returns:

pd.DataFrame: Table from csv with metabs added as BiGG_EX exchange reactions

refinegems.io.load_document_libsbml(modelpath: str) libsbml.SBMLDocument

Loads model document using libSBML

Args:
  • modelpath (str): Path to GEM

Returns:

SBMLDocument: Loaded document by libSBML

refinegems.io.load_manual_annotations(tablepath: str = 'data/manual_curation.xlsx', sheet_name: str = 'metab') pandas.DataFrame

Loads metabolite sheet from manual curation table

Args:
  • tablepath (str): Path to manual curation table. Defaults to ‘data/manual_curation.xlsx’.

  • sheet_name (str): Sheet name for metabolite annotations. Defaults to ‘metab’.

Returns:

pd.DataFrame: Table containing specified sheet from Excel file

refinegems.io.load_manual_gapfill(tablepath: str = 'data/manual_curation.xlsx', sheet_name: str = 'gapfill') pandas.DataFrame

Loads gapfill sheet from manual curation table

Args:
  • tablepath (str): Path to manual curation table. Defaults to ‘data/manual_curation.xlsx’.

  • sheet_name (str): Sheet name for reaction gapfilling. Defaults to ‘gapfill’.

Returns:

pd.DataFrame: Table containing the sheet named ‘gapfill’ (or the specified sheet_name) from the Excel file

refinegems.io.load_medium_custom(mediumpath: str) pandas.DataFrame

Helper function to read medium csv

Args:
  • mediumpath (str): path to csv file with medium

Returns:

pd.DataFrame: Table of csv

refinegems.io.load_medium_from_db(mediumname: str) pandas.DataFrame

Wrapper function to extract subtable for the requested medium from the database ‘data.db’

Args:
  • mediumname (str): Name of medium to test growth on

Returns:

pd.DataFrame: Table containing composition for one medium with metabs added as BiGG_EX exchange reactions

refinegems.io.load_model_cobra(modelpath: str) cobra.Model

Loads model using COBRApy

Args:
  • modelpath (str): Path to GEM

Returns:

cobraModel: Loaded model by COBRApy

refinegems.io.load_model_libsbml(modelpath: str) libsbml.Model

Loads model using libSBML

Args:
  • modelpath (str): Path to GEM

Returns:

libModel: loaded model by libSBML

refinegems.io.load_multiple_models(models: list[str], package: str) list

Loads multiple models into a list

Args:
  • models (list): List of paths to models

  • package (str): COBRApy|libSBML

Returns:

list: List of model objects loaded with COBRApy|libSBML

refinegems.io.parse_dict_to_dataframe(str2list: dict) pandas.DataFrame

Parses a dictionary of the form {str: list} and transforms it into a table with a column containing the strings and a column containing the lists

Args:

str2list (dict): Dictionary mapping strings to lists

Returns:

pd.DataFrame: Table with column containing the strings and column containing the lists

refinegems.io.parse_fasta_headers(filepath: str, id_for_model: bool = False) pandas.DataFrame

Parses FASTA file headers to obtain:

  • the protein_id

  • and the model_id (like it is obtained from CarveMe)

corresponding to the locus_tag

Args:
  • filepath (str): Path to FASTA file

  • id_for_model (bool): True if model_id similar to autogenerated GeneProduct ID should be contained in resulting table

Returns:

pd.DataFrame: Table containing the columns locus_tag, Protein_id & Model_id
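To make the parsing step concrete, the sketch below pulls locus_tag and protein_id out of FASTA headers with a regular expression. It assumes headers that carry bracketed key=value fields such as [locus_tag=...] and [protein_id=...]; the helper name and the two example records are made up for illustration, and the real function additionally derives the model_id and returns a DataFrame.

```python
import re


def parse_header(header):
    """Extract (locus_tag, protein_id) from one FASTA header line.

    Hypothetical helper; missing fields are returned as None.
    """
    # Collect all "[key=value]" fields of the header into a dictionary.
    fields = dict(re.findall(r"\[(\w+)=([^\]]+)\]", header))
    return fields.get("locus_tag"), fields.get("protein_id")


fasta = """>lcl|NC_000913.3_prot_NP_414542.1_1 [gene=thrL] [locus_tag=b0001] [protein_id=NP_414542.1]
MKRISTTITTTITITTGNGAG
>lcl|NC_000913.3_prot_NP_414543.1_2 [gene=thrA] [locus_tag=b0002] [protein_id=NP_414543.1]
MRVLKFGGTSVANAERFLRV
"""

# Only header lines start with '>'; sequence lines are skipped.
records = [parse_header(line) for line in fasta.splitlines() if line.startswith(">")]
```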

refinegems.io.parse_gff_for_gp_info(gff_file: str) pandas.DataFrame

Parses gff file of organism to find gene protein reactions based on locus tags

Args:
  • gff_file (str): Path to gff file of organism of interest

Returns:

pd.DataFrame: Table containing mapping from locus tag to GPR

refinegems.io.save_user_input(configpath: str) dict[str, str]

This aims to collect user input from the command line to create a config file, will also save the user input to a config if no config was given

Args:
  • configpath (str): Path to config file if present

Returns:

dict: Either loaded config file or created from user input

refinegems.io.search_ncbi_for_gpr(locus: str) str

Fetches protein name from NCBI

Args:
  • locus (str): NCBI compatible locus_tag

Returns:

str: Protein name|description

refinegems.io.search_sbo_label(sbo_number: str) str

Looks up the SBO label corresponding to a given SBO Term number

Args:
  • sbo_number (str): Last three digits of SBO-Term as str

Returns:

str: Denoted label for given SBO Term

refinegems.io.validate_libsbml_model(model: libsbml.Model) int

Debug method: Validates a libSBML model with the libSBML validator

Args:
  • model (libModel): A libSBML model

Returns:

int: Integer specifying whether validation was successful or not

refinegems.io.write_report(dataframe: pandas.DataFrame, filepath: str)

Writes reports stored in dataframes to xlsx file

Args:
  • dataframe (pd.DataFrame): Table containing output

  • filepath (str): Path to file with filename

refinegems.io.write_to_file(model: libsbml.Model, new_filename: str)

Writes modified model to new file

Args:
  • model (libModel): Model loaded with libSBML

  • new_filename (str): Filename|Path for modified model

refineGEMs.modelseed module

Reports mismatches in charges and formulae based on ModelSEED

Extracts ModelSEED data from a given tsv file, extracts all metabolites from a given model. Both lists of metabolites are compared by charge and formula.

refinegems.modelseed.compare_model_modelseed(model_charges: pandas.DataFrame, modelseed_charges: pandas.DataFrame) pandas.DataFrame

Compares tables with charges / formulae from model & modelseed

Args:
  • model_charges (pd.DataFrame): Charges and formulae of model metabolites. Output of get_model_charges.

  • modelseed_charges (pd.DataFrame): Charges and formulae of ModelSEED metabolites. Output of get_modelseed_charges.

Returns:

pd.DataFrame: Table containing info whether charges / formulae match

refinegems.modelseed.compare_to_modelseed(model: cobra.Model) tuple[pandas.DataFrame, pandas.DataFrame]

Executes all steps to compare model metabolites to ModelSEED metabolites

Args:
  • model (cobraModel): Model loaded with COBRApy

Returns:
tuple: Tables with charge (1) & formula (2) mismatches
  1. pd.DataFrame: Table with charge mismatches

  2. pd.DataFrame: Table with formula mismatches

refinegems.modelseed.get_charge_mismatch(df_comp: pandas.DataFrame) pandas.DataFrame

Extracts metabolites with charge mismatch of model & modelseed

Args:

df_comp (pd.DataFrame): Charge and formula mismatches. Output from compare_model_modelseed.

Returns:

pd.DataFrame: Table containing metabolites with charge mismatch

refinegems.modelseed.get_compared_formulae(formula_mismatch: pandas.DataFrame) pandas.DataFrame

Compare formula by atom pattern

Args:

formula_mismatch (pd.DataFrame): Table with column containing atom comparison. Output from get_formula_mismatch.

Returns:

pd.DataFrame: table containing metabolites with formula mismatch
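One way to compare two formulae atom by atom is to turn each into element counts and report the elements whose counts differ. The sketch below shows that idea only; the helper names are hypothetical, it handles simple formulae such as C6H12O6 (no brackets or R groups), and it is not necessarily how refineGEMs performs the comparison.

```python
import re
from collections import Counter


def atom_counts(formula):
    """Count atoms per element in a simple chemical formula (hypothetical helper)."""
    counts = Counter()
    # An element is one capital letter plus an optional lowercase letter,
    # followed by an optional count (a missing count means 1).
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def differing_atoms(formula_a, formula_b):
    """Map each element with differing counts to (count in a, count in b)."""
    a, b = atom_counts(formula_a), atom_counts(formula_b)
    return {el: (a[el], b[el]) for el in sorted(set(a) | set(b)) if a[el] != b[el]}


proton_diff = differing_atoms("C6H12O6", "C6H11O6")
```

A difference that consists only of hydrogen atoms, as in this example, typically points to a different protonation state rather than to a different compound.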

refinegems.modelseed.get_formula_mismatch(df_comp: pandas.DataFrame) pandas.DataFrame

Extracts metabolites with formula mismatch of model & modelseed

Args:

df_comp (pd.DataFrame): Charge and formula mismatches. Output from compare_model_modelseed.

Returns:

pd.DataFrame: Table containing metabolites with formula mismatch

refinegems.modelseed.get_model_charges(model: cobra.Model) pandas.DataFrame

Extracts all metabolites from model

Args:
  • model (cobraModel): Model loaded with COBRApy

Returns:

pd.DataFrame: Table containing charges and formulae of model metabolites

refinegems.modelseed.get_modelseed_charges(modelseed_compounds: pandas.DataFrame) pandas.DataFrame

Extract table with BiGG, charges and formulae

Args:
  • modelseed_compounds (pd.DataFrame): ModelSEED data. Output from get_modelseed_compounds.

Returns:

pd.DataFrame: Table containing charges and formulae of ModelSEED metabolites

refinegems.modelseed.get_modelseed_compounds() pandas.DataFrame

Extracts compounds from ModelSEED which have BiGG Ids

Returns:

pd.DataFrame: Table containing ModelSEED data

refineGEMs.pathways module

Provides functions for adding KEGG reactions as Group Pathways

If your organism occurs in the KEGG database, extract the KEGG reaction Ids from the annotations of your reactions and identify the KEGG pathways in which each reaction occurs. Then add all KEGG pathways of a reaction as annotations with the biological qualifier ‘OCCURS_IN’ to the respective reaction.

refinegems.pathways.add_kegg_pathways(model, kegg_pathways)

Add KEGG reactions as BQB_OCCURS_IN

Args:
  • model (libModel): Model loaded with libSBML. Output of load_model_enable_groups.

  • kegg_pathways (dict): Reaction Id as key and Kegg Pathway Id as value. Output of extract_kegg_pathways.

Returns:

libModel: Modified model with KEGG pathways

refinegems.pathways.create_pathway_groups(model: libsbml.Model, pathway_groups)

Use group module to add reactions to Kegg pathway

Args:
  • model (libModel): Model loaded with libSBML. Output of load_model_enable_groups.

  • pathway_groups (dict): Kegg Pathway Id as key and reactions Ids as values. Output of get_pathway_groups.

Returns:

libModel: modified model with groups for pathways

refinegems.pathways.extract_kegg_pathways(kegg_reactions: dict) dict

Finds pathways for reactions in the model with KEGG Ids, accesses the KEGG API, uses tqdm to report progress to the user

Args:
  • kegg_reactions (dict): Reaction Id as key and Kegg Id as value. Output[0] from extract_kegg_reactions.

Returns:

dict: Reaction Id as key and Kegg Pathway Id as value

refinegems.pathways.extract_kegg_reactions(model: libsbml.Model) tuple[dict, list]

Extract KEGG Ids from reactions

Args:
  • model (libModel): Model loaded with libSBML. Output of load_model_enable_groups.

Returns:
tuple: Dictionary ‘reaction_id’: ‘KEGG_id’ (1) & List of reactions without KEGG Id (2)
  1. dict: Reaction Id as key and Kegg Id as value

  2. list: Ids of reactions without KEGG annotation

refinegems.pathways.get_pathway_groups(kegg_pathways)

Group reaction into pathways

Args:
  • kegg_pathways (dict): Reaction Id as key and Kegg Pathway Id as value. Output of extract_kegg_pathways.

Returns:

dict: Kegg Pathway Id as key and reactions Ids as values
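Grouping amounts to inverting the reaction-to-pathway mapping. A minimal sketch, assuming that each reaction Id maps to a list of pathway Ids (the helper name and the example Ids are hypothetical):

```python
from collections import defaultdict


def group_by_pathway(kegg_pathways):
    """Invert {reaction id: [pathway ids]} into {pathway id: [reaction ids]}."""
    groups = defaultdict(list)
    for reaction_id, pathway_ids in kegg_pathways.items():
        for pathway_id in pathway_ids:
            groups[pathway_id].append(reaction_id)
    return dict(groups)


kegg_pathways = {"R_PGI": ["rn00010", "rn00030"], "R_PFK": ["rn00010"]}
pathway_groups = group_by_pathway(kegg_pathways)
```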

refinegems.pathways.kegg_pathways(modelpath: str) tuple[libsbml.Model, list[str]]

Executes all steps to add KEGG pathways as groups

Args:
  • modelpath (str): Path to GEM

Returns:
tuple: libSBML model (1) & List of reactions without KEGG Id (2)
  1. libModel: Modified model with Pathways as groups

  2. list: Ids of reactions without KEGG annotation

refinegems.pathways.load_model_enable_groups(modelpath: str) libsbml.Model

Loads model as document using libSBML and enables groups extension

Args:
  • modelpath (str): Path to GEM

Returns:

libModel: Model loaded with libSBML

refineGEMs.polish module

refineGEMs.sboann module

Provides functions to automate the addition of SBO terms to the model

Script written by Elisabeth Fritze in her bachelor thesis. Modified by Gwendolyn O. Gusak during her master thesis. Commented by Famke Bäuerle and extended by Nantia Leonidou.

It is split into many small functions which are all annotated. However, when using it for SBO-Term annotation it only makes sense to run the “main” function sbo_annotation(model_libsbml, database_user, database_name) if you want to continue with the model. The smaller functions might be useful if special information is needed for a reaction without the context of a bigger model or when the automated annotation fails for some reason.

refinegems.sboann.addSBOforCompartments(model)
refinegems.sboann.addSBOforGenes(model)
refinegems.sboann.addSBOforGroups(model)
refinegems.sboann.addSBOforMetabolites(model)
refinegems.sboann.addSBOforModel(model)
refinegems.sboann.addSBOforParameters(model)
refinegems.sboann.addSBOfromDB(reac: libsbml.Reaction, cur) bool

Adds SBO term based on bigg id of a reaction

Args:
  • reac (Reaction): Reaction from sbml model

  • cur (sqlite3.connect.cursor): Used to access the sqlite3 database

Returns:

bool: True if SBO Term was changed

refinegems.sboann.addSBOviaEC(reac: libsbml.Reaction, cur)

Adds SBO terms based on EC numbers given in the annotations of a reactions

Args:
  • reac (Reaction): Reaction from sbml model

  • cur (sqlite3.connect.cursor): Used to access the sqlite3 database

refinegems.sboann.checkAcetylationViaEC(reac: libsbml.Reaction)

Tests if reac is acetylation by its EC-Code and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkActiveTransport(reac: libsbml.Reaction)

Tests if reac is active transport (uses atp/pep) and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkBiomass(reac: libsbml.Reaction)

Tests if reac is biomass / growth and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkCoTransport(reac: libsbml.Reaction)

Tests if reac is co-transport and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkDeamination(reac: libsbml.Reaction)

Tests if reac is deamination and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkDeaminationViaEC(reac: libsbml.Reaction)

Tests if reac is deamination by its EC-Code and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkDecarbonylation(reac: libsbml.Reaction)

Tests if reac is decarbonylation and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkDecarboxylation(reac: libsbml.Reaction)

Tests if reac is decarboxylation and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkDecarboxylationViaEC(reac: libsbml.Reaction)

Tests if reac is decarboxylation by its EC-Code and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkDemand(reac: libsbml.Reaction)

Tests if reac is demand and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkExchange(reac: libsbml.Reaction)

Tests if reac is exchange and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkGlycosylation(reac: libsbml.Reaction)

Tests if reac is glycosylation and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkGlycosylationViaEC(reac: libsbml.Reaction)

Tests if reac is glycosylation by its EC-Code and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkHydrolysisViaEC(reac: libsbml.Reaction)

Tests if reac is hydrolysis by its EC-Code and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkIsomerisationViaEC(reac: libsbml.Reaction)

Tests if reac is isomerisation by its EC-Code and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkMethylationViaEC(reac: libsbml.Reaction)

Tests if reac is methylation by its EC-Code and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkPassiveTransport(reac: libsbml.Reaction)

Tests if reac is passive transport and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkPhosphorylation(reac: libsbml.Reaction)

Tests if reac is phosphorylase / kinase and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkRedox(reac: libsbml.Reaction)

Tests if reac is redox and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkRedoxViaEC(reac: libsbml.Reaction)

Tests if reac is redox by its EC-Code and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkSink(reac: libsbml.Reaction)

Tests if reac is sink and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.checkTransaminationViaEC(reac: libsbml.Reaction)

Tests if reac is transamination by its EC-Code and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.getCompartmentDict(reac: libsbml.Reaction)

Sorts metabolites by compartment

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

dict: compartment as key and metabolites as values

refinegems.sboann.getCompartmentFromSpeciesRef(speciesReference: libsbml.SpeciesReference) libsbml.Compartment

Extracts compartment from a species by its reference

Args:
  • speciesReference (SpeciesReference): Reference to species

Returns:

Compartment: Compartment which the species lives in

refinegems.sboann.getCompartmentList(reac: libsbml.Reaction)

Extracts compartments of metabolites

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

set: compartment information of all metabolites

refinegems.sboann.getCompartmentlessMetaboliteIds(reac: libsbml.Reaction)

Extracts metabolites which have no compartment information

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

list: all metabolites which have no compartment

refinegems.sboann.getCompartmentlessProductIds(reac: libsbml.Reaction)

Extracts products which have no compartment information

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

list: products (metabolites) without compartments

refinegems.sboann.getCompartmentlessReactantIds(reac: libsbml.Reaction)

Extracts reactants which have no compartment information

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

list: reactants (metabolites) without compartments

refinegems.sboann.getCompartmentlessSpeciesId(speciesReference: libsbml.SpeciesReference) str

Determines whether a species has a compartment by its reference

Args:
  • speciesReference (SpeciesReference): Reference to species

Returns:

libsbml-species-id: id of species without compartment

refinegems.sboann.getECNums(reac: libsbml.Reaction)

Extracts EC-Code from the reaction annotations

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

list: all EC-Numbers of the reaction
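
As a rough illustration of how EC numbers might be pulled out of an annotation, the sketch below scans an annotation string for identifiers.org style `ec-code` URIs. It assumes the annotation uses that URI form; the helper name `extract_ec_numbers` and the sample annotation are hypothetical, and the actual function may work differently.

```python
import re

# Matches EC numbers such as 2.7.1.1 or 1.1.1.- that follow "ec-code/" or "ec-code:"
EC_PATTERN = re.compile(r"ec-code[/:](\d+(?:\.(?:\d+|-)){3})")


def extract_ec_numbers(annotation: str) -> list:
    """Return all EC numbers found in an annotation string, in order of appearance."""
    return EC_PATTERN.findall(annotation)


# Hypothetical annotation fragment for illustration only
sample_annotation = (
    '<rdf:li rdf:resource="https://identifiers.org/ec-code/2.7.1.1"/>'
    '<rdf:li rdf:resource="https://identifiers.org/ec-code/2.7.1.2"/>'
)
ec_numbers = extract_ec_numbers(sample_annotation)
```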

refinegems.sboann.getListOfMetabolites(reac: libsbml.Reaction)

Extracts list of metabolites of the reaction

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

list: metabolites that are part of the reaction

refinegems.sboann.getMetaboliteIds(reac: libsbml.Reaction)

Extracts list of metabolite ids of reaction

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

list: metabolite ids

refinegems.sboann.getProductCompartmentList(reac: libsbml.Reaction)

Extracts compartments of products

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

set: compartment information of all products (metabolites)

refinegems.sboann.getProductIds(reac: libsbml.Reaction)

Extracts products (metabolites) of reaction

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

list: products (metabolites) ids

refinegems.sboann.getReactantCompartmentList(reac: libsbml.Reaction)

Extracts compartments of reactants

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

set: compartment information of all reactants (metabolites)

refinegems.sboann.getReactantIds(reac: libsbml.Reaction) list[str]

Extracts reactants (metabolites) of reaction

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

list[str]: Reactants (metabolites) ids

refinegems.sboann.hasReactantPair(reac: libsbml.Reaction, met1: libsbml.Species, met2: libsbml.Species) bool

Checks if a pair of metabolites is present in the reaction (needed for special reactions like redox or deamination)

Args:
  • reac (Reaction): Reaction from sbml model

  • met1 (Species): metabolite 1 of metabolite pair

  • met2 (Species): metabolite 2 of metabolite pair

Returns:

bool: True if one of the metabolites is in reactants and the other in products
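
The return condition can be written out as a small check on id collections. This is a sketch under stated assumptions, not the library code: `has_reactant_pair` is a hypothetical helper that takes plain lists of metabolite ids instead of a libsbml Reaction and Species objects.

```python
def has_reactant_pair(reactant_ids, product_ids, met1, met2):
    """Return True if one metabolite is a reactant and the other a product."""
    reactants, products = set(reactant_ids), set(product_ids)
    forward = met1 in reactants and met2 in products
    backward = met2 in reactants and met1 in products
    return forward or backward


# Example: NAD+/NADH on opposite sides suggests a redox reaction
reactant_ids = ["M_etoh_c", "M_nad_c"]
product_ids = ["M_acald_c", "M_nadh_c", "M_h_c"]
is_redox_pair = has_reactant_pair(reactant_ids, product_ids, "M_nad_c", "M_nadh_c")
same_side = has_reactant_pair(reactant_ids, product_ids, "M_etoh_c", "M_nad_c")
```

Checking both directions means the order in which the pair is passed does not matter.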

refinegems.sboann.isProtonTransport(reac: libsbml.Reaction)

Checks if reaction is proton transport

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

bool: True if reaction is proton transport

refinegems.sboann.moreThanTwoCompartmentTransport(reac: libsbml.Reaction)

Checks if reaction traverses more than 2 compartments

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

bool: True if reaction traverses more than 2 compartments
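
The idea behind this check is to count the distinct compartments touched by a reaction. The sketch below assumes the compartment ids of all metabolites are already available as a list; `traverses_more_than_two` is a hypothetical name, and the actual function takes a libsbml Reaction.

```python
def traverses_more_than_two(compartment_ids):
    """Return True if the metabolites span more than two distinct compartments."""
    return len(set(compartment_ids)) > 2


# Example compartment ids: cytosol (c), periplasm (p), extracellular (e)
two_compartments = traverses_more_than_two(["c", "c", "e"])
three_compartments = traverses_more_than_two(["c", "p", "e"])
```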

refinegems.sboann.returnCompartment(id)

Helper to split compartment id

refinegems.sboann.sbo_annotation(model_libsbml: libsbml.Model) libsbml.Model

Executes all steps to annotate SBO terms to a given model (former main function of original script by Elisabeth Fritze)

Args:
  • model_libsbml (libModel): Model loaded with libsbml

Returns:

libModel: Modified model with SBO terms

refinegems.sboann.soleProtonTransported(reac: libsbml.Reaction)

Checks if reaction is transport powered by one proton (H)

Args:
  • reac (Reaction): Reaction from sbml model

Returns:

bool: True if reaction is transport powered by one H

refinegems.sboann.splitSymAntiPorter(reac: libsbml.Reaction)

Tests if reac is sym- or antiporter and sets SBO Term if true

Args:
  • reac (Reaction): Reaction from sbml model

refinegems.sboann.splitTransportBiochem(reac: libsbml.Reaction)

Tests if reaction traverses more than 1 compartment and sets SBO Term

Args:
  • reac (Reaction): Reaction from sbml model