
COBRA-k API reference

COBRAk is a general COBRA/COBRA-k suite written as a Python package. For more about it, visit its repository: https://github.com/klamt-lab/COBRAk

The init file of COBRAk initializes the rich text output & tracebacks as well as its logger. Furthermore, it enables a graceful shutdown on user-induced exit signals.

exit_signal_handler(sig, frame)

Handles the exit signal by printing a shutdown message and exiting the program.

Parameters:

Name Type Description Default
sig int

The signal number.

required
frame Optional[FrameType]

The current frame.

required
Source code in cobrak/__init__.py
def exit_signal_handler(
    sig: int,  # noqa: ARG001
    frame: FrameType | None,  # noqa: ARG001
) -> None:  # pragma: no cover
    """Handles the exit signal by printing a shutdown message and exiting the program.

    Args:
        sig (int): The signal number.
        frame (Optional[FrameType]): The current frame.
    """
    print(
        "COBRAk received user signal to terminate (this message may appear multiple times in parallelized contexts). Shutting down..."
    )
    sys.exit(0)

set_logging_handler(show_path=False, show_time=False, show_level=True, keywords=['info', 'warning', 'error', 'critical'], **args)

Sets up the logging handler with the given options.

Parameters:

Name Type Description Default
show_path bool

Whether to show the path. Defaults to False.

False
show_time bool

Whether to show the time. Defaults to False.

False
show_level bool

Whether to show the level. Defaults to True.

True
keywords list[str]

The keywords to highlight. Defaults to ["info", "warning", "error", "critical"]

['info', 'warning', 'error', 'critical']
**args Any

Additional Rich handler arguments.

{}
Source code in cobrak/__init__.py
def set_logging_handler(
    show_path: bool = False,
    show_time: bool = False,
    show_level: bool = True,
    keywords: list[str] = [
        "info",
        "warning",
        "error",
        "critical",
    ],
    **args: Any,  # noqa: ANN401
) -> RichHandler:
    """
    Sets up the logging handler with the given options.

    Args:
        show_path (bool, optional): Whether to show the path. Defaults to False.
        show_time (bool, optional): Whether to show the time. Defaults to False.
        show_level (bool, optional): Whether to show the level. Defaults to True.
        keywords (list[str], optional): The keywords to highlight. Defaults to ["info", "warning", "error", "critical"]
        **args (Any, optional): Additional Rich handler arguments.
    """
    return RichHandler(
        show_path=show_path,
        show_time=show_time,
        show_level=show_level,
        keywords=keywords,
        **args,
    )
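The returned handler is presumably meant to be attached to a logger. A sketch using a stdlib StreamHandler as a stand-in for RichHandler, so the snippet runs without rich installed:

```python
import logging

def make_handler(level: int = logging.INFO) -> logging.Handler:
    # Stand-in for set_logging_handler(); a RichHandler would take the same role.
    handler = logging.StreamHandler()
    handler.setLevel(level)
    return handler

log = logging.getLogger("cobrak_handler_demo")
log.addHandler(make_handler())
```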

set_logging_level(level)

Sets the logging level.

E.g. INFO, ERROR, WARNING and CRITICAL from Python's logging module.

Parameters:

Name Type Description Default
level int

The logging level.

required
Source code in cobrak/__init__.py
def set_logging_level(level: int) -> None:
    """Sets the logging level.

    E.g. INFO, ERROR, WARNING and CRITICAL from Python's logging module.

    Args:
        level (int): The logging level.
    """
    logger.setLevel(level)
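The effect of a logging level can be shown with Python's standard logging constants (a sketch using a plain stdlib logger, since cobrak's own logger object is not shown here):

```python
import logging

logger = logging.getLogger("cobrak_demo")  # stand-in for cobrak's logger
logger.setLevel(logging.WARNING)

# With level WARNING, INFO messages are suppressed while ERROR passes.
assert not logger.isEnabledFor(logging.INFO)
assert logger.isEnabledFor(logging.ERROR)
```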

setup_rich_tracebacks(show_locals)

Sets up rich tracebacks with the given options.

Parameters:

Name Type Description Default
show_locals bool

Whether to show local variables in the traceback.

required
Source code in cobrak/__init__.py
def setup_rich_tracebacks(show_locals: bool) -> None:
    """Sets up rich tracebacks with the given options.

    Args:
        show_locals (bool): Whether to show local variables in the traceback.
    """
    install(show_locals=show_locals)

alphafold_db_functionality

Functionality to retrieve PDB files from the AlphaFold Protein Structure Database

Link to database (as of December 2, 2025): https://alphafold.ebi.ac.uk/

download_alphafold_pdb(uniprot_id, output_dir='.', as_gzip=False)

Downloads the predicted Protein Data Bank (PDB) file for a given UniProt ID from the AlphaFold Protein Structure Database (AlphaFold DB).

This function queries the AlphaFold API, identifies the most complete/latest PDB URL (specifically by finding the highest numbered entry in fragmented predictions), and downloads the file to the specified directory.

Parameters:

Name Type Description Default
uniprot_id str

The UniProt accession ID (e.g., "P00520") for the target protein.

required
output_dir str

The directory path where the PDB file should be saved. Defaults to the current directory (".").

'.'
as_gzip bool

Whether to compress the file with gzip (.gz is added to the file name). Defaults to False.

False

Returns:

Name Type Description
None None

The function prints status messages and saves the file to disk.

Raises:

Type Description
HTTPError

If the initial API request fails due to an HTTP error (e.g., 404, 500).

Notes
  • The downloaded file is named using the format: <uniprot_id>__<original_filename> (e.g., P00520__AF-P00520-F1-model_v4.pdb).
  • The AlphaFold DB API is used to handle multi-fragment predictions (for proteins over ~2700 residues) by attempting to select the final fragment or the entry with the highest numerical identifier in the URL path.
Source code in cobrak/alphafold_db_functionality.py
def download_alphafold_pdb(
    uniprot_id: str, output_dir: str = ".", as_gzip: bool = False
) -> None:
    """Downloads the predicted Protein Data Bank (PDB) file for a given UniProt ID
    from the AlphaFold Protein Structure Database (AlphaFold DB).

    This function queries the AlphaFold API, identifies the most complete/latest PDB
    URL (specifically by finding the highest numbered entry in fragmented predictions),
    and downloads the file to the specified directory.

    Args:
        uniprot_id: The UniProt accession ID (e.g., "P00520") for the target protein.
        output_dir: The directory path where the PDB file should be saved.
                    Defaults to the current directory (``"."``).
        as_gzip: Whether to compress the file with gzip (.gz is added to the file name).
            Defaults to False.

    Returns:
        None: The function prints status messages and saves the file to disk.

    Raises:
        requests.exceptions.HTTPError: If the initial API request fails due to an
                                       HTTP error (e.g., 404, 500).

    Notes:
        * The downloaded file is named using the format:
          ``<uniprot_id>__<original_filename>`` (e.g., ``P00520__AF-P00520-F1-model_v4.pdb``).
        * The AlphaFold DB API is used to handle multi-fragment predictions (for
          proteins over ~2700 residues) by attempting to select the final fragment
          or the entry with the highest numerical identifier in the URL path.
    """
    output_dir = standardize_folder(output_dir)
    ensure_folder_existence(output_dir)

    output_dir_files = get_files(output_dir)
    for file in output_dir_files:
        if file.startswith(uniprot_id + "__"):
            print(f"File for {uniprot_id} already exists as {file} in {output_dir}")
            return

    # 1. API Endpoint for the specific UniProt ID
    api_url = f"https://alphafold.ebi.ac.uk/api/prediction/{uniprot_id}"

    try:
        response = requests.get(api_url)
        response.raise_for_status()
        data = response.json()

        # 2. Iterate through predictions (essential for proteins > 2700 residues)
        if not data:
            print(f"No AlphaFold prediction found for {uniprot_id}")
            return

        last_pdb_url = ""
        last_pdb_number = -1
        for entry in data:
            # Extract the PDB URL from the JSON response
            pdb_url = entry.get("pdbUrl")
            url_split = pdb_url.split("-")[2]
            try:
                int(url_split)
                int_url_split_possible = True
            except ValueError:
                int_url_split_possible = False
            if pdb_url and (
                (not int_url_split_possible)
                or (int_url_split_possible and int(url_split) > last_pdb_number)
            ):
                last_pdb_url = pdb_url
                if int_url_split_possible:
                    last_pdb_number = int(url_split)

        if last_pdb_url:
            # 3. Download the file
            file_name = uniprot_id + "__" + os.path.basename(last_pdb_url)
            save_path = os.path.join(output_dir, file_name)

            print(f"Downloading {file_name}...")
            pdb_response = requests.get(last_pdb_url)
            if as_gzip:
                gzip_write_file(f"{save_path}.gz", [pdb_response.text])
            else:
                with open(save_path, "wb") as f:
                    f.write(pdb_response.content)

        if not last_pdb_url:
            print(f"PDB file not found in metadata for {uniprot_id}")

    except requests.exceptions.HTTPError as err:
        print(f"HTTP Error for {uniprot_id}: {err}")

download_alphafold_pdb_for_all_enzymes(cobrak_model, uniprot_annotation_id='uniprot', output_dir='.', sleep_time=1.0, as_gzip=False)

Downloads AlphaFold PDB files for all enzymes in a COBRA-k Model that have a given UniProt ID annotation.

It iterates through the reactions in the COBRA-k Model, extracts the UniProt ID from the reaction's annotation, and calls download_alphafold_pdb for each. A delay is introduced between downloads to respect API rate limits.

Parameters:

Name Type Description Default
cobrak_model Model

A COBRA-k Model instance.

required
uniprot_annotation_id str

The key used in the reaction's annotation dictionary to store the UniProt ID (defaults to "uniprot").

'uniprot'
output_dir str

The directory path where the PDB files should be saved. Defaults to the current directory (".").

'.'
sleep_time float

The time in seconds to pause between consecutive downloads to cool down the AlphaFold API server. Defaults to 1.0 second.

1.0
as_gzip bool

Whether to compress the file with gzip (.gz is added to the file name). Defaults to False.

False

Returns:

Name Type Description
None None

The function manages file downloads and prints status messages.

Source code in cobrak/alphafold_db_functionality.py
def download_alphafold_pdb_for_all_enzymes(
    cobrak_model: Model,
    uniprot_annotation_id: str = "uniprot",
    output_dir: str = ".",
    sleep_time: float = 1.0,
    as_gzip: bool = False,
) -> None:
    """Downloads AlphaFold PDB files for all enzymes in a COBRA-k Model
    that have a given UniProt ID annotation.

    It iterates through the reactions in the COBRA-k Model, extracts the UniProt ID
    from the reaction's annotation, and calls `download_alphafold_pdb` for each.
    A delay is introduced between downloads to respect API rate limits.

    Args:
        cobrak_model: A COBRA-k Model instance.
        uniprot_annotation_id: The key used in the reaction's `annotation`
                               dictionary to store the UniProt ID (defaults to "uniprot").
        output_dir: The directory path where the PDB files should be saved.
                    Defaults to the current directory (``"."``).
        sleep_time: The time in seconds to pause between consecutive downloads
                    to cool down the AlphaFold API server. Defaults to 1.0 second.
        as_gzip: Whether to compress the file with gzip (.gz is added to the file name).
            Defaults to False.

    Returns:
        None: The function manages file downloads and prints status messages.
    """
    for enzyme_data in cobrak_model.enzymes.values():
        if uniprot_annotation_id not in enzyme_data.annotation:
            continue
        uniprot_id = enzyme_data.annotation[uniprot_annotation_id]
        download_alphafold_pdb(
            uniprot_id=uniprot_id, output_dir=output_dir, as_gzip=as_gzip
        )
        sleep(sleep_time)

bigg_metabolites_functionality

bigg_parse_metabolites_file.py

This module contains a function which transforms a BIGG metabolites .txt list into a machine-readable JSON.

bigg_parse_metabolites_file(bigg_metabolites_txt_path, bigg_metabolites_json_path)

Parses a BIGG metabolites text file and writes a JSON mapping for this file.

As of Sep 14, 2024, a list of all BIGG-included metabolites is retrievable at http://bigg.ucsd.edu/data_access

Arguments
  • bigg_metabolites_txt_path: str ~ The file path to the BIGG metabolites file. The usual file name (which has to be included in this argument too) is bigg_models_metabolites.txt
  • bigg_metabolites_json_path: str ~ The file path under which the JSON with the parsed BIGG metabolites file data is stored
Output
  • A JSON file at the given path, with the following structure:
 {
     "$BIGG_ID": "$CHEMICAL_OR_USUAL_NAME",
     (...),
     "$BIGG_ID": "$BIGG_ID",
     (...),
 }

The BIGG ID <-> BIGG ID mapping is done for models which already use the BIGG IDs.

Source code in cobrak/bigg_metabolites_functionality.py
@validate_call
def bigg_parse_metabolites_file(
    bigg_metabolites_txt_path: str,
    bigg_metabolites_json_path: str,
) -> None:
    """Parses a BIGG metabolites text file and writes a JSON mapping for this file.

    As of Sep 14, 2024, a list of all BIGG-included metabolites
    is retrievable at http://bigg.ucsd.edu/data_access

    Arguments
    ----------
    * bigg_metabolites_txt_path: str ~ The file path to the BIGG metabolites file.
      The usual file name (which has to be included in this argument too) is
      bigg_models_metabolites.txt
    * bigg_metabolites_json_path: str ~ The file path under which the JSON with
      the parsed BIGG metabolites file data is stored

    Output
    ----------
    * A JSON file at the given path,
      with the following structure:
    <pre>
     {
         "$BIGG_ID": "$CHEMICAL_OR_USUAL_NAME",
         (...),
         "$BIGG_ID": "$BIGG_ID",
         (...),
     }
    </pre>
    The BIGG ID <-> BIGG ID mapping is done for models which already use the BIGG IDs.
    """
    # Open the BIGG metabolites file as string list, and remove all newlines
    with open(bigg_metabolites_txt_path, encoding="utf-8") as f:
        lines = f.readlines()
    lines = [x.replace("\n", "") for x in lines if len(x) > 0]

    # Mapping variable which will store the BIGG ID<->name mapping
    bigg_id_name_mapping = {}
    # Go through each BIGG metabolites file line (which is a tab-separated file)
    # and retrieve the BIGG ID and the name (if there is a name for the given BIGG
    # ID)
    for line in lines:
        bigg_id = line.split("\t")[1]
        bigg_id_name_mapping[bigg_id] = bigg_id

        # Exception to check if there is no name :O
        try:
            name = line.split("\t")[2].lower()
        except Exception:
            continue
        bigg_id_name_mapping[name] = bigg_id

        try:
            database_links = line.split("\t")[4]
        except Exception:
            continue
        for database_link_part in database_links.split(": "):
            if "CHEBI:" not in database_link_part:
                continue
            subpart = database_link_part.split("CHEBI:")[1].strip()
            chebi_id = subpart.split("; ")[0] if "; " in subpart else subpart
            bigg_id_name_mapping[chebi_id] = bigg_id

    # Write the JSON in the given folder :D
    json_write(bigg_metabolites_json_path, bigg_id_name_mapping)
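The per-line parsing can be demonstrated on a made-up tab-separated row (a simplified mirror of the loop body; the column layout — BiGG ID in column 1, name in column 2, database links in column 4 — is taken from the code above, and the row contents are hypothetical):

```python
def parse_bigg_line(line: str, mapping: dict[str, str]) -> None:
    """Map the BiGG ID, the lowercased name, and any CHEBI IDs found
    in the database-links column to the BiGG ID."""
    columns = line.split("\t")
    bigg_id = columns[1]
    mapping[bigg_id] = bigg_id
    if len(columns) > 2:
        mapping[columns[2].lower()] = bigg_id
    if len(columns) > 4:
        for part in columns[4].split(": "):
            if "CHEBI:" not in part:
                continue
            subpart = part.split("CHEBI:")[1].strip()
            chebi_id = subpart.split("; ")[0] if "; " in subpart else subpart
            mapping[chebi_id] = bigg_id

# Hypothetical row in the assumed column layout:
mapping: dict[str, str] = {}
parse_bigg_line("uni\tatp_c\tATP\tmodels\tCHEBI: CHEBI:30616", mapping)
print(mapping)  # {'atp_c': 'atp_c', 'atp': 'atp_c', '30616': 'atp_c'}
```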

brenda_functionality

Contains functions which allow one to create a model-specific and BRENDA-dependent kinetic data database.

brenda_select_enzyme_kinetic_data_for_sbml(sbml_path, brenda_json_targz_file_path, bigg_metabolites_json_path, brenda_version, base_species, ncbi_parsed_json_path, kinetic_ignored_metabolites=[], kinetic_ignored_enzyme_ids=[], custom_enzyme_kinetic_data={}, min_ph=-float('inf'), max_ph=float('inf'), accept_nan_ph=True, min_temperature=-float('inf'), max_temperature=float('inf'), accept_nan_temperature=True, kcat_overwrite={}, transfered_ec_number_json='', max_taxonomy_level=1000000000.0, kis_and_kas_only_for_same_compartments=True)

Select and assign enzyme kinetic data for each reaction in an SBML model based on BRENDA database entries and taxonomic similarity.

This function retrieves enzyme kinetic data from a compressed BRENDA JSON file, merges it with BiGG metabolite translation data and taxonomy information from NCBI. It then iterates over the reactions in the provided SBML model to:

  • Filter reactions that have EC code annotations.
  • Identify eligible EC codes (ignoring those with hyphens).
  • Collect kinetic entries (e.g., turnover numbers, KM values, KI values) for each metabolite involved in the reaction.
  • Choose the best kinetic parameters (k_cat, k_ms, k_is) based on taxonomic similarity to a base species.
  • Apply conversion factors (e.g., s⁻¹ to h⁻¹ for k_cat, mM to M for KM and KI).
  • Respect ignore lists for metabolites and enzymes.
  • Override computed k_cat values if provided in the kcat_overwrite dictionary.
  • Merge with any custom enzyme kinetic data provided.

Parameters:

Name Type Description Default
sbml_path str

Path to SBML model.

required
brenda_json_targz_file_path str

Path to the compressed JSON file containing BRENDA enzyme kinetic data.

required
bigg_metabolites_json_path str

Path to the JSON file mapping metabolite IDs to BiGG identifiers.

required
brenda_version str

String identifier for the BRENDA database version.

required
base_species str

Species identifier used as the reference for taxonomic similarity.

required
ncbi_parsed_json_path str

Path to the parsed JSON file containing NCBI taxonomy data.

required
kinetic_ignored_metabolites list[str]

List of metabolite IDs to exclude from kinetic parameter selection. Defaults to an empty list.

[]
kinetic_ignored_enzyme_ids list[str]

List of enzyme identifiers to ignore when considering a reaction. Defaults to an empty list.

[]
custom_enzyme_kinetic_data dict[str, EnzymeReactionData | None]

Dictionary of custom enzyme kinetic data to override or supplement computed data. The keys are reaction IDs and the values are EnzymeReactionData instances or None. Defaults to an empty dictionary.

{}
min_ph float

The minimum pH value for kinetic data inclusion. Defaults to negative infinity.

-float('inf')
max_ph float

The maximum pH value for kinetic data inclusion. Defaults to positive infinity.

float('inf')
accept_nan_ph bool

If True, kinetic entries with NaN pH values are accepted. Defaults to True.

True
min_temperature float

The minimum temperature value (e.g., in Kelvin) for kinetic data inclusion. Defaults to negative infinity.

-float('inf')
max_temperature float

The maximum temperature value for kinetic data inclusion. Defaults to positive infinity.

float('inf')
accept_nan_temperature bool

If True, kinetic entries with NaN temperature values are accepted. Defaults to True.

True
kcat_overwrite dict[str, float]

Dictionary mapping reaction IDs to k_cat values that should override computed values. Defaults to an empty dictionary.

{}
kis_and_kas_only_for_same_compartments bool

If True, KIs and KAs can only be attributed to a reaction if the affected metabolite shares one of the reaction metabolites' compartments. Defaults to True.

True

Returns:

Type Description
dict[str, EnzymeReactionData | None]

dict[str, EnzymeReactionData | None]: A dictionary mapping reaction IDs (str) from the COBRApy model to their corresponding EnzymeReactionData instances. If no suitable kinetic data are found (or if the enzyme is in the ignore list), the value will be None for that reaction.

Notes
  • Kinetic values are converted to standardized units:
    • k_cat values use the unit h⁻¹.
    • KM, KA and KI values use the unit M=mol⋅l⁻¹.
  • The function leverages taxonomic similarity (using NCBI TAXONOMY data) to select the most relevant kinetic values.
  • Custom enzyme kinetic data and k_cat overrides will replace any computed values.
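The unit conversions mentioned above are simple scalings (a sketch; the function names are illustrative, not part of the COBRAk API):

```python
# k_cat: BRENDA reports s^-1, COBRAk uses h^-1.
def kcat_per_second_to_per_hour(kcat_s: float) -> float:
    return kcat_s * 3600.0

# KM/KA/KI: BRENDA reports mM, COBRAk uses M (= mol/l).
def mm_to_m(value_mm: float) -> float:
    return value_mm * 1e-3

print(kcat_per_second_to_per_hour(100.0))  # 360000.0
print(mm_to_m(0.5))  # 0.0005
```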
Source code in cobrak/brenda_functionality.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def brenda_select_enzyme_kinetic_data_for_sbml(
    sbml_path: str,
    brenda_json_targz_file_path: str,
    bigg_metabolites_json_path: str,
    brenda_version: str,
    base_species: str,
    ncbi_parsed_json_path: str,
    kinetic_ignored_metabolites: list[str] = [],
    kinetic_ignored_enzyme_ids: list[str] = [],
    custom_enzyme_kinetic_data: dict[str, EnzymeReactionData | None] = {},
    min_ph: float = -float("inf"),
    max_ph: float = float("inf"),
    accept_nan_ph: bool = True,
    min_temperature: float = -float("inf"),
    max_temperature: float = float("inf"),
    accept_nan_temperature: bool = True,
    kcat_overwrite: dict[str, float] = {},
    transfered_ec_number_json: str = "",
    max_taxonomy_level: NonNegativeInt = 1e9,
    kis_and_kas_only_for_same_compartments: bool = True,
) -> dict[str, EnzymeReactionData | None]:
    """Select and assign enzyme kinetic data for each reaction in an SBML model based on BRENDA
    database entries and taxonomic similarity.

    This function retrieves enzyme kinetic data from a compressed BRENDA JSON file, merges it
    with BiGG metabolite translation data and taxonomy information from NCBI. It then iterates
    over the reactions in the provided SBML model to:

      - Filter reactions that have EC code annotations.
      - Identify eligible EC codes (ignoring those with hyphens).
      - Collect kinetic entries (e.g., turnover numbers, KM values, KI values) for each metabolite
        involved in the reaction.
      - Choose the best kinetic parameters (k_cat, k_ms, k_is) based on taxonomic similarity to
        a base species.
      - Apply conversion factors (e.g., s⁻¹ to h⁻¹ for k_cat, mM to M for KM and KI).
      - Respect ignore lists for metabolites and enzymes.
      - Override computed k_cat values if provided in the kcat_overwrite dictionary.
      - Merge with any custom enzyme kinetic data provided.

    Parameters:
        sbml_path (str): Path to SBML model.
        brenda_json_targz_file_path (str): Path to the compressed JSON file containing
            BRENDA enzyme kinetic data.
        bigg_metabolites_json_path (str): Path to the JSON file mapping metabolite IDs to
            BiGG identifiers.
        brenda_version (str): String identifier for the BRENDA database version.
        base_species (str): Species identifier used as the reference for taxonomic similarity.
        ncbi_parsed_json_path (str): Path to the parsed JSON file containing NCBI taxonomy data.
        kinetic_ignored_metabolites (list[str], optional): List of metabolite IDs to exclude
            from kinetic parameter selection. Defaults to an empty list.
        kinetic_ignored_enzyme_ids (list[str], optional): List of enzyme identifiers to ignore
            when considering a reaction. Defaults to an empty list.
        custom_enzyme_kinetic_data (dict[str, EnzymeReactionData | None], optional):
            Dictionary of custom enzyme kinetic data to override or supplement computed data.
            The keys are reaction IDs and the values are EnzymeReactionData instances or None.
            Defaults to an empty dictionary.
        min_ph (float, optional): The minimum pH value for kinetic data inclusion. Defaults
            to negative infinity.
        max_ph (float, optional): The maximum pH value for kinetic data inclusion. Defaults
            to positive infinity.
        accept_nan_ph (bool, optional): If True, kinetic entries with NaN pH values are accepted.
            Defaults to True.
        min_temperature (float, optional): The minimum temperature value (e.g., in Kelvin) for
            kinetic data inclusion. Defaults to negative infinity.
        max_temperature (float, optional): The maximum temperature value for kinetic data inclusion.
            Defaults to positive infinity.
        accept_nan_temperature (bool, optional): If True, kinetic entries with NaN temperature values
            are accepted. Defaults to True.
        kcat_overwrite (dict[str, float], optional): Dictionary mapping reaction IDs to k_cat values
            that should override computed values. Defaults to an empty dictionary.
        kis_and_kas_only_for_same_compartments (bool, optional): If True, KIs and KAs can only
            be attributed to a reaction if the affected metabolite shares one of the reaction
            metabolites' compartments. Defaults to True.

    Returns:
        dict[str, EnzymeReactionData | None]:
            A dictionary mapping reaction IDs (str) from the COBRApy model to their corresponding
            EnzymeReactionData instances. If no suitable kinetic data are found (or if the enzyme
            is in the ignore list), the value will be None for that reaction.

    Notes:
        - Kinetic values are converted to standardized units:
            - k_cat values use the unit h⁻¹.
            - KM, KA and KI values use the unit M=mol⋅l⁻¹.
        - The function leverages taxonomic similarity (using NCBI TAXONOMY data)
          to select the most relevant kinetic values.
        - Custom enzyme kinetic data and k_cat overrides will replace any computed values.
    """
    cobra_model = cobra.io.read_sbml_model(sbml_path)
    transfered_ec_codes: dict[str, str] = (
        json_load(transfered_ec_number_json, dict[str, str])
        if transfered_ec_number_json
        else {}
    )
    brenda_database_for_model = _brenda_get_all_enzyme_kinetic_data_for_model(
        cobra_model,
        brenda_json_targz_file_path,
        bigg_metabolites_json_path,
        brenda_version,
        min_ph,
        max_ph,
        accept_nan_ph,
        min_temperature,
        max_temperature,
        accept_nan_temperature,
        transfered_ec_codes=transfered_ec_codes,
    )
    ncbi_parsed_json_data = json_zip_load(ncbi_parsed_json_path)

    bigg_metabolites_data: dict[str, str] = json_load(
        bigg_metabolites_json_path,
        dict[str, str],
    )

    # Get reaction<->enzyme reaction data mapping
    enzyme_reaction_data: dict[str, EnzymeReactionData | None] = {}
    for reaction in cobra_model.reactions:
        if reaction.id.startswith("EX_"):
            continue
        if "ec-code" not in reaction.annotation:
            continue
        substrate_names_and_ids = []
        for metabolite, stoichiometry in reaction.metabolites.items():
            if stoichiometry < 0:
                substrate_names_and_ids.extend((metabolite.id, metabolite.name.lower()))
                for suffix in [f"_{compartment}" for compartment in BIGG_COMPARTMENTS]:
                    if metabolite.id.endswith(suffix):
                        substrate_names_and_ids.append(
                            (metabolite.id + "\b").replace(suffix + "\b", "")
                        )
                for checked_string in (metabolite.id, metabolite.name.lower()):
                    bigg_id = _search_metname_in_bigg_ids(
                        checked_string,
                        bigg_id="",
                        entry=None,
                        name_to_bigg_id_dict=bigg_metabolites_data,
                    )
                    if bigg_id:
                        substrate_names_and_ids.append(bigg_id)
        substrate_names_and_ids_set = set(substrate_names_and_ids)

        reaction_ec_codes = reaction.annotation["ec-code"]
        if isinstance(reaction_ec_codes, str):
            reaction_ec_codes = [reaction_ec_codes]
        eligible_reaction_ec_codes = [
            ec_code
            for ec_code in reaction_ec_codes
            if (ec_code in brenda_database_for_model) and ("-" not in ec_code)
        ]

        reaction_transfered_ec_codes = []
        for ec_code in eligible_reaction_ec_codes:
            if ec_code in transfered_ec_codes:
                single_transfered_ec_code = transfered_ec_codes[ec_code]
                if single_transfered_ec_code in brenda_database_for_model:
                    reaction_transfered_ec_codes.append(single_transfered_ec_code)
        eligible_reaction_ec_codes += reaction_transfered_ec_codes

        metabolite_entries: dict[str, dict[str, Any]] = {}
        for ec_code in eligible_reaction_ec_codes:
            ec_code_entry = brenda_database_for_model[ec_code]
            for met_id in ec_code_entry:
                if met_id == "WILDCARD":
                    continue
                if met_id not in metabolite_entries:
                    metabolite_entries[met_id] = {}
                for organism in ec_code_entry[met_id]:
                    if organism not in metabolite_entries[met_id]:
                        metabolite_entries[met_id][organism] = []
                    metabolite_entries[met_id][organism] += ec_code_entry[met_id][
                        organism
                    ]

        # Choose kcats and kms taxonomically
        best_kcat_taxonomy_level = float("inf")
        best_km_taxonomy_levels = {
            metabolite.id: float("inf") for metabolite in cobra_model.metabolites
        }
        best_ki_taxonomy_levels = {
            metabolite.id: float("inf") for metabolite in cobra_model.metabolites
        }
        taxonomically_best_kcats: list[float] = []
        taxonomically_best_kms: dict[str, list[float]] = {}
        taxonomically_best_kis: dict[str, list[float]] = {}
        k_cat_references: list[ParameterReference] = []
        k_m_references: dict[str, list[ParameterReference]] = {}
        k_i_references: dict[str, list[ParameterReference]] = {}
        reaction_compartments = [met.compartment for met in reaction.metabolites]
        for metabolite in cobra_model.metabolites:
            idx_last_underscore = metabolite.id.rfind("_")
            met_id = metabolite.id[:idx_last_underscore]
            if metabolite.id in kinetic_ignored_metabolites:
                continue
            if met_id not in metabolite_entries:
                continue
            organisms = list(metabolite_entries[met_id].keys())
            if base_species not in organisms:
                organisms.append(base_species)
            taxonomy_dict = get_taxonomy_dict_from_nbci_taxonomy(
                organisms, ncbi_parsed_json_data
            )
            taxonomy_similarities = most_taxonomic_similar(base_species, taxonomy_dict)
            highest_taxonomy_level = max(taxonomy_similarities.values())
            for taxonomy_level in range(highest_taxonomy_level + 1):
                if taxonomy_level > max_taxonomy_level:
                    continue

                level_organisms = [
                    organism
                    for organism in organisms
                    if taxonomy_similarities[organism] == taxonomy_level
                ]
                for level_organism in level_organisms:
                    if (level_organism not in metabolite_entries[met_id]) and (
                        level_organism == base_species
                    ):  # I.e., the base species that was artificially added above
                        continue
                    kinetic_entries = metabolite_entries[met_id][level_organism]
                    if taxonomy_level <= best_kcat_taxonomy_level:
                        kcat_entries = [
                            km_kcat_entry
                            for km_kcat_entry in kinetic_entries
                            if (km_kcat_entry[0] == "turnover_number")
                            and not (
                                substrate_names_and_ids_set.isdisjoint(km_kcat_entry[5])
                            )
                        ]

                        if len(kcat_entries) > 0:
                            if (
                                best_kcat_taxonomy_level > taxonomy_level
                            ):  # "Erase" if we find a better level
                                taxonomically_best_kcats = []
                            best_kcat_taxonomy_level = min(
                                taxonomy_level, best_kcat_taxonomy_level
                            )
                            if taxonomy_level <= best_kcat_taxonomy_level:
                                for kcat_entry in kcat_entries:
                                    taxonomically_best_kcats.append(
                                        kcat_entry[1] * 3_600
                                    )  # convert from s⁻¹ to h⁻¹
                                    k_cat_references.append(
                                        ParameterReference(
                                            database="BRENDA",
                                            comment=kcat_entry[3],
                                            species=level_organism,
                                            pubs=kcat_entry[2],
                                            substrate=kcat_entry[4],
                                            tax_distance=taxonomy_level,
                                            value=kcat_entry[1] * 3_600,
                                        )
                                    )

                    if taxonomy_level <= best_km_taxonomy_levels[metabolite.id]:
                        km_entries = [
                            km_kcat_entry
                            for km_kcat_entry in kinetic_entries
                            if km_kcat_entry[0] == "km_value"
                            and not (
                                substrate_names_and_ids_set.isdisjoint(km_kcat_entry[5])
                            )
                        ]
                        if len(km_entries) > 0:
                            if metabolite.id not in taxonomically_best_kms:
                                taxonomically_best_kms[metabolite.id] = []
                                k_m_references[metabolite.id] = []
                            if (
                                best_km_taxonomy_levels[metabolite.id] > taxonomy_level
                            ):  # "Erase" if we find a better level
                                taxonomically_best_kms[metabolite.id] = []
                            best_km_taxonomy_levels[metabolite.id] = min(
                                taxonomy_level, best_km_taxonomy_levels[metabolite.id]
                            )
                            if taxonomy_level <= best_km_taxonomy_levels[metabolite.id]:
                                for km_entry in km_entries:
                                    taxonomically_best_kms[metabolite.id].append(
                                        km_entry[1] / 1_000
                                    )  # convert from mM to M
                                    k_m_references[metabolite.id].append(
                                        ParameterReference(
                                            database="BRENDA",
                                            comment=km_entry[3],
                                            species=level_organism,
                                            pubs=km_entry[2],
                                            substrate=km_entry[4],
                                            tax_distance=taxonomy_level,
                                            value=km_entry[1] / 1_000,
                                        )
                                    )

                    if taxonomy_level <= best_ki_taxonomy_levels[metabolite.id]:
                        ki_entries = [
                            kinetic_entry
                            for kinetic_entry in kinetic_entries
                            if kinetic_entry[0] == "ki_value"
                            and not (
                                substrate_names_and_ids_set.isdisjoint(kinetic_entry[5])
                            )
                        ]
                        if len(ki_entries) > 0:
                            if metabolite.id not in taxonomically_best_kis:
                                taxonomically_best_kis[metabolite.id] = []
                                k_i_references[metabolite.id] = []
                            if (
                                best_ki_taxonomy_levels[metabolite.id] > taxonomy_level
                            ):  # "Erase" if we find a better level
                                taxonomically_best_kis[metabolite.id] = []
                            best_ki_taxonomy_levels[metabolite.id] = min(
                                taxonomy_level, best_ki_taxonomy_levels[metabolite.id]
                            )
                            if taxonomy_level <= best_ki_taxonomy_levels[metabolite.id]:
                                for ki_entry in ki_entries:
                                    if (
                                        metabolite.compartment
                                        not in reaction_compartments
                                        and kis_and_kas_only_for_same_compartments
                                    ):
                                        continue
                                    taxonomically_best_kis[metabolite.id].append(
                                        ki_entry[1] / 1_000
                                    )  # convert from mM to M
                                    k_i_references[metabolite.id].append(
                                        ParameterReference(
                                            database="BRENDA",
                                            comment=ki_entry[3],
                                            species=level_organism,
                                            pubs=ki_entry[2],
                                            substrate=ki_entry[4],
                                            value=ki_entry[1] / 1_000,
                                            tax_distance=taxonomy_level,
                                        )
                                    )

        if reaction.id in kcat_overwrite:
            taxonomically_best_kcats = [kcat_overwrite[reaction.id]]
            k_cat_references = [
                ParameterReference(database="OVERWRITE", tax_distance=-1)
            ]
        elif len(list(kcat_overwrite.keys())) > 0:
            taxonomically_best_kcats = []
            k_cat_references = []

        reaction_kms = {}
        for met_id, values in taxonomically_best_kms.items():
            if met_id not in [x.id for x in reaction.metabolites]:
                continue
            reaction_kms[met_id] = median(values)

        reaction_kis = {}
        for met_id, values in taxonomically_best_kis.items():
            if len(values) == 0:
                continue
            reaction_kis[met_id] = median(values)

        enzyme_identifiers = reaction.gene_reaction_rule.split(" and ")
        has_found_ignored_enzyme = False
        for enzyme_identifier in enzyme_identifiers:
            if enzyme_identifier in kinetic_ignored_enzyme_ids:
                has_found_ignored_enzyme = True
                break

        if (len(taxonomically_best_kcats) > 0) and (not has_found_ignored_enzyme):
            reaction_kcat = median(taxonomically_best_kcats)  # or max(), min(), ...
            enzyme_reaction_data[reaction.id] = EnzymeReactionData(
                identifiers=enzyme_identifiers,
                k_cat=reaction_kcat,
                k_cat_references=k_cat_references,
                k_ms=reaction_kms,
                k_m_references=k_m_references,
                k_is=reaction_kis,
                k_i_references=k_i_references,
            )

    enzyme_reaction_data = {**enzyme_reaction_data, **custom_enzyme_kinetic_data}

    for reac_id in kcat_overwrite:  # noqa: PLC0206
        if reac_id not in enzyme_reaction_data:
            reaction = cobra_model.reactions.get_by_id(reac_id)
            enzyme_identifiers = reaction.gene_reaction_rule.split(" and ")
            if enzyme_identifiers != [""]:
                enzyme_reaction_data[reac_id] = EnzymeReactionData(
                    identifiers=enzyme_identifiers,
                    k_cat=kcat_overwrite[reac_id],
                    k_cat_references=[
                        ParameterReference(database="OVERWRITE", tax_distance=-1)
                    ],
                    k_ms={},
                    k_is={},
                )
    return enzyme_reaction_data
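
The taxonomic selection above boils down to: keep only entries at the smallest available taxonomic distance to the base species, then aggregate them with the median. A minimal sketch in plain Python (the entry values are hypothetical; as in the source above, k_cat is converted from s⁻¹ to h⁻¹ via the factor 3 600):

```python
from statistics import median

# Hypothetical BRENDA-style entries: (value in s^-1, taxonomic distance).
# Distance 0 = same species as the model's base species.
entries = [(150.0, 2), (95.0, 0), (110.0, 0), (300.0, 4)]

# Keep only entries at the smallest available taxonomic distance,
# mirroring the "erase if we find a better level" logic above...
best_level = min(level for _, level in entries)
best_values = [value for value, level in entries if level == best_level]

# ...then aggregate with the median and convert from s^-1 to h^-1.
k_cat = median(best_values) * 3_600
print(k_cat)  # 369000.0
```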

cobrapy_model_functionality

Contains methods that directly apply on COBRApy models.

get_fullsplit_cobra_model(cobra_model, fwd_suffix=REAC_FWD_SUFFIX, rev_suffix=REAC_REV_SUFFIX, add_cobrak_sbml_annotation=False, cobrak_default_min_conc=1e-06, cobrak_default_max_conc=0.2, cobrak_extra_linear_constraints=[], cobrak_kinetic_ignored_metabolites=[], cobrak_no_extra_versions=False, reac_lb_ub_cap=float('inf'), delete_old_cobrak_id_annotations=False)

Return a COBRApy model where reactions are split according to reversibility and enzymes.

"Reversibility" means that, if a reaction i can run in both directions (α_i<0), then it is split as follows: Ri: A<->B [-50;100]=> Ri_FWD: A->B [0;100]; Ri_REV: B->A [0;50] where the ending "FWD" and "REV" are set in COBRAk's constants REAC_FWD_SUFFIX and REAC_REV_SUFFIX.

"enzymes" means that, if a reaction i can be catalyzed by multiple enzymes (i.e., at least one OR block in the reaction's gene-protein rule), then it is split for each reaction. Say, for example, Rj: A->B [0;100] has the following gene-protein rule: (E1 OR E2) ...then, Rj is split into: Rj_ENZ_E1: A->B [0;100] Rj_ENZ_E2: A->B [0;100] where the infix "ENZ" is set in COBRAk's constants REAC_ENZ_SEPARATOR.

Parameters:

Name Type Description Default
cobra_model Model

The COBRApy model that shall be 'fullsplit'.

required

Returns:

Type Description
Model

cobra.Model: The 'fullsplit' COBRApy model.

Source code in cobrak/cobrapy_model_functionality.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def get_fullsplit_cobra_model(
    cobra_model: cobra.Model,
    fwd_suffix: str = REAC_FWD_SUFFIX,
    rev_suffix: str = REAC_REV_SUFFIX,
    add_cobrak_sbml_annotation: bool = False,
    cobrak_default_min_conc: float = 1e-6,
    cobrak_default_max_conc: float = 0.2,
    cobrak_extra_linear_constraints: list[ExtraLinearConstraint] = [],
    cobrak_kinetic_ignored_metabolites: list[str] = [],
    cobrak_no_extra_versions: bool = False,
    reac_lb_ub_cap: float = float("inf"),
    delete_old_cobrak_id_annotations: bool = False,
) -> cobra.Model:
    """Return a COBRApy model where reactions are split according to reversibility and enzymes.

    "Reversibility" means that, if a reaction i can run in both directions (α_i<0), then it is split as follows:
    Ri: A<->B [-50;100]=> Ri_FWD: A->B [0;100]; Ri_REV: B->A [0;50]
    where the ending "FWD" and "REV" are set in COBRAk's constants REAC_FWD_SUFFIX and REAC_REV_SUFFIX.

    "enzymes" means that, if a reaction i can be catalyzed by multiple enzymes (i.e., at least one OR block in the
    reaction's gene-protein rule), then it is split for each reaction. Say, for example,
    Rj: A->B [0;100]
    has the following gene-protein rule:
    (E1 OR E2)
    ...then, Rj is split into:
    Rj_ENZ_E1: A->B [0;100]
    Rj_ENZ_E2: A->B [0;100]
    where the infix "_ENZ_" is set in COBRAk's constants REAC_ENZ_SEPARATOR.

    Args:
        cobra_model (cobra.Model): The COBRApy model that shall be 'fullsplit'.

    Returns:
        cobra.Model: The 'fullsplit' COBRApy model.
    """
    fullsplit_cobra_model = cobra.Model(cobra_model.id)

    if add_cobrak_sbml_annotation:
        settings_reac = cobra.Reaction(
            id="cobrak_global_settings",
            name="Global COBRA-k settings",
            lower_bound=0.0,
            upper_bound=0.0,
        )
        settings_reac.annotation["cobrak_max_prot_pool"] = 1000.0
        settings_reac.annotation["cobrak_R"] = STANDARD_R
        settings_reac.annotation["cobrak_T"] = STANDARD_T
        settings_reac.annotation["cobrak_kinetic_ignored_metabolites"] = {}
        settings_reac.annotation["cobrak_reac_rev_suffix"] = (
            rev_suffix  # A "special" suffix to show that this is added
        )
        settings_reac.annotation["cobrak_reac_fwd_suffix"] = fwd_suffix
        settings_reac.annotation["cobrak_reac_enz_separator"] = REAC_ENZ_SEPARATOR
        settings_reac.annotation["cobrak_extra_linear_constraints"] = str(
            [asdict(x) for x in cobrak_extra_linear_constraints]
        )
        settings_reac.annotation["cobrak_kinetic_ignored_metabolites"] = str(
            cobrak_kinetic_ignored_metabolites
        )

        fullsplit_cobra_model.add_reactions([settings_reac])

    fullsplit_cobra_model.add_metabolites(cobra_model.metabolites)

    for gene in cobra_model.genes:
        fullsplit_cobra_model.genes.add(deepcopy(gene))

    if delete_old_cobrak_id_annotations:
        for reaction in cobra_model.reactions:
            annotkeys = list(reaction.annotation.keys())
            for annotkey in annotkeys:
                del reaction.annotation[annotkey]
    for reaction_x in cobra_model.reactions:
        reaction: cobra.Reaction = reaction_x

        if add_cobrak_sbml_annotation:
            for old_name, new_name in (
                ("dG0", "cobrak_dG0"),
                ("dG0_uncertainty", "cobrak_dG0_uncertainty"),
            ):
                if old_name in reaction.annotation:
                    reaction.annotation[new_name] = reaction.annotation[old_name]

            fwd_dG0 = (
                float(reaction.annotation["cobrak_dG0"])
                if "cobrak_dG0" in reaction.annotation
                else None
            )
            dG0_uncertainty = (
                abs(float(reaction.annotation["cobrak_dG0_uncertainty"]))
                if "cobrak_dG0_uncertainty" in reaction.annotation
                else None
            )

        is_reversible = False
        if reaction.lower_bound < 0.0:
            is_reversible = True

        single_enzyme_blocks = (
            reaction.gene_reaction_rule.replace("(", "").replace(")", "").split(" or ")
        )
        current_reac_version = 0
        for single_enzyme_block in single_enzyme_blocks:
            if single_enzyme_block:
                new_reac_base_id = (
                    reaction.id
                    + REAC_ENZ_SEPARATOR
                    + single_enzyme_block.replace(" ", "_")
                )
            else:
                new_reac_base_id = reaction.id
            new_reaction_1 = cobra.Reaction(
                id=new_reac_base_id,
                lower_bound=reaction.lower_bound,
                upper_bound=min(reac_lb_ub_cap, reaction.upper_bound),
            )
            new_reaction_1.annotation = deepcopy(reaction.annotation)
            if add_cobrak_sbml_annotation:
                if fwd_dG0 is not None:
                    new_reaction_1.annotation[f"cobrak_dG0_V{current_reac_version}"] = (
                        fwd_dG0
                    )
                if dG0_uncertainty is not None:
                    new_reaction_1.annotation[
                        f"cobrak_dG0_uncertainty_V{current_reac_version}"
                    ] = dG0_uncertainty
                new_reaction_1.annotation[f"cobrak_id_V{current_reac_version}"] = (
                    new_reaction_1.id + (fwd_suffix if is_reversible else "")
                )
            if single_enzyme_block:
                new_reaction_1.gene_reaction_rule = single_enzyme_block
            new_reaction_1_met_addition = {}
            for met, stoichiometry in reaction.metabolites.items():
                new_reaction_1_met_addition[met] = stoichiometry
            new_reaction_1.add_metabolites(new_reaction_1_met_addition)

            if is_reversible:
                current_reac_version += 1

                original_lb = new_reaction_1.lower_bound
                new_reaction_2 = cobra.Reaction(
                    id=new_reac_base_id,
                )
                new_reaction_2.annotation = deepcopy(reaction.annotation)
                if add_cobrak_sbml_annotation:
                    if fwd_dG0 is not None:
                        new_reaction_2.annotation[
                            f"cobrak_dG0_V{current_reac_version}"
                        ] = -fwd_dG0
                    if dG0_uncertainty is not None:
                        new_reaction_2.annotation[
                            f"cobrak_dG0_uncertainty_V{current_reac_version}"
                        ] = dG0_uncertainty
                    new_reaction_2.annotation[f"cobrak_id_V{current_reac_version}"] = (
                        new_reaction_2.id + rev_suffix
                    )
                if single_enzyme_block:
                    new_reaction_2.gene_reaction_rule = single_enzyme_block
                new_reaction_1.id += fwd_suffix
                new_reaction_1.lower_bound = 0
                new_reaction_2.id += rev_suffix
                new_reaction_2.lower_bound = 0
                new_reaction_2.upper_bound = min(reac_lb_ub_cap, abs(original_lb))

                new_reaction_2_met_addition = {}
                for met, stoichiometry in new_reaction_1.metabolites.items():
                    new_reaction_2_met_addition[met] = -stoichiometry
                new_reaction_2.add_metabolites(new_reaction_2_met_addition)
                new_reaction_2.name = reaction.name

                fullsplit_cobra_model.add_reactions([new_reaction_2])
            new_reaction_1.name = reaction.name
            fullsplit_cobra_model.add_reactions([new_reaction_1])
            current_reac_version += 1
            if cobrak_no_extra_versions and (
                ("cobrak_k_cat_V0" not in reaction.annotation)
                or ("cobrak_k_cat" not in reaction.annotation)
            ):
                break

    for metabolite in fullsplit_cobra_model.metabolites:
        for old_name, new_name in (("Cmin", "cobrak_Cmin"), ("Cmax", "cobrak_Cmax")):
            if old_name in metabolite.annotation:
                metabolite.annotation[new_name] = metabolite.annotation[old_name]
        if "cobrak_Cmin" not in metabolite.annotation:
            metabolite.annotation["cobrak_Cmin"] = cobrak_default_min_conc
        if "cobrak_Cmax" not in metabolite.annotation:
            metabolite.annotation["cobrak_Cmax"] = cobrak_default_max_conc

    return fullsplit_cobra_model

get_fullsplit_cobra_model_from_sbml(sbml_path, fwd_suffix=REAC_FWD_SUFFIX, rev_suffix=REAC_REV_SUFFIX, add_cobrak_sbml_annotation=False, cobrak_default_min_conc=1e-06, cobrak_default_max_conc=0.2, cobrak_extra_linear_constraints=[], cobrak_kinetic_ignored_metabolites=[], cobrak_no_extra_versions=False, reac_lb_ub_cap=float('inf'))

Return a COBRApy model (loaded from the SBML) where reactions are split according to reversibility and enzymes.

"Reversibility" means that, if a reaction i can run in both directions (α_i<0), then it is split as follows: Ri: A<->B [-50;100]=> Ri_FWD: A->B [0;100]; Ri_REV: B->A [0;50] where the ending "FWD" and "REV" are set in COBRAk's constants REAC_FWD_SUFFIX and REAC_REV_SUFFIX.

"enzymes" means that, if a reaction i can be catalyzed by multiple enzymes (i.e., at least one OR block in the reaction's gene-protein rule), then it is split for each reaction. Say, for example, Rj: A->B [0;100] has the following gene-protein rule: (E1 OR E2) ...then, Rj is split into: Rj_ENZ_E1: A->B [0;100] Rj_ENZ_E2: A->B [0;100] where the infix "ENZ" is set in COBRAk's constants REAC_ENZ_SEPARATOR.

Parameters:

Name Type Description Default
sbml_path str

Path to the SBML file of the COBRApy model that shall be 'fullsplit'.

required

Returns:

Type Description
Model

cobra.Model: The 'fullsplit' COBRApy model.

Source code in cobrak/cobrapy_model_functionality.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def get_fullsplit_cobra_model_from_sbml(
    sbml_path: str,
    fwd_suffix: str = REAC_FWD_SUFFIX,
    rev_suffix: str = REAC_REV_SUFFIX,
    add_cobrak_sbml_annotation: bool = False,
    cobrak_default_min_conc: float = 1e-6,
    cobrak_default_max_conc: float = 0.2,
    cobrak_extra_linear_constraints: list[ExtraLinearConstraint] = [],
    cobrak_kinetic_ignored_metabolites: list[str] = [],
    cobrak_no_extra_versions: bool = False,
    reac_lb_ub_cap: float = float("inf"),
) -> cobra.Model:
    """Return a COBRApy model (loaded from the SBML) where reactions are split according to reversibility and enzymes.

    "Reversibility" means that, if a reaction i can run in both directions (α_i<0), then it is split as follows:
    Ri: A<->B [-50;100]=> Ri_FWD: A->B [0;100]; Ri_REV: B->A [0;50]
    where the ending "FWD" and "REV" are set in COBRAk's constants REAC_FWD_SUFFIX and REAC_REV_SUFFIX.

    "enzymes" means that, if a reaction i can be catalyzed by multiple enzymes (i.e., at least one OR block in the
    reaction's gene-protein rule), then it is split for each reaction. Say, for example,
    Rj: A->B [0;100]
    has the following gene-protein rule:
    (E1 OR E2)
    ...then, Rj is split into:
    Rj_ENZ_E1: A->B [0;100]
    Rj_ENZ_E2: A->B [0;100]
    where the infix "_ENZ_" is set in COBRAk's constants REAC_ENZ_SEPARATOR.

    Args:
        sbml_path (str): Path to the SBML file of the COBRApy model that shall be 'fullsplit'.

    Returns:
        cobra.Model: The 'fullsplit' COBRApy model.
    """
    return get_fullsplit_cobra_model(
        cobra.io.read_sbml_model(sbml_path),
        fwd_suffix=fwd_suffix,
        rev_suffix=rev_suffix,
        add_cobrak_sbml_annotation=add_cobrak_sbml_annotation,
        cobrak_default_min_conc=cobrak_default_min_conc,
        cobrak_default_max_conc=cobrak_default_max_conc,
        cobrak_extra_linear_constraints=cobrak_extra_linear_constraints,
        cobrak_kinetic_ignored_metabolites=cobrak_kinetic_ignored_metabolites,
        cobrak_no_extra_versions=cobrak_no_extra_versions,
        reac_lb_ub_cap=reac_lb_ub_cap,
    )

constants

This module contains all COBRAk constants that are used throughout its packages.

These constants are especially used for problem constructions (to determine prefixes, suffixes, names, ... for pyomo variables) as well as thermodynamic standard values.
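For illustration, variable names in the generated pyomo problems are assembled by simple string concatenation of these constants. A minimal sketch with two of the prefixes documented below (the metabolite and reaction IDs are hypothetical):

```python
# Values copied from the constants documented in this module
LNCONC_VAR_PREFIX = "x_"      # logarithmized concentration variables
DF_VAR_PREFIX = "f_var_"      # driving force variables
REAC_FWD_SUFFIX = "_FWD"      # forward-direction reaction suffix

met_id = "atp_c"
reac_id = "PGK" + REAC_FWD_SUFFIX

conc_var = LNCONC_VAR_PREFIX + met_id  # variable for ln(concentration)
df_var = DF_VAR_PREFIX + reac_id       # variable for the driving force
print(conc_var, df_var)  # x_atp_c f_var_PGK_FWD
```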

ALL_OK_KEY = 'ALL_OK' module-attribute

Shows that the result is optimal and the termination condition is ok

ALPHA_VAR_PREFIX = 'alpha_var_' module-attribute

Prefix for variables representing the activation of a reaction (used in non-linear programs)

BIGG_COMPARTMENTS = ['c', 'e', 'p', 'm', 'x', 'r', 'v', 'n', 'g', 'u', 'l', 'h', 'f', 's', 'im', 'cx', 'um', 'cm', 'i', 'mm', 'w', 'y'] module-attribute

List of BiGG compartment suffixes (without _) as defined in http://bigg.ucsd.edu/compartments/

BIG_M = 10000 module-attribute

Big M value for MILPs

DF_VAR_PREFIX = 'f_var_' module-attribute

Prefix for driving force problem variables

DG0_VAR_PREFIX = 'dG0_' module-attribute

Prefix for Gibbs free energy problem variables

EC_INNER_TO_OUTER_COMPARTMENTS = ['c', 'p', 'e'] module-attribute

Inner to outer compartments in this order for E. coli models used here

EC_IONIC_STRENGTHS = {'c': 250, 'p': 250, 'e': 250} module-attribute

Ionic strengths (in mM) for E. coli model compartments used here

EC_PHS = {'c': 7.5, 'p': 7.5, 'e': 7.5} module-attribute

pH values (unitless) for E. coli model compartments used here

EC_PMGS = {'c': 2.5, 'p': 2.5, 'e': 2.5} module-attribute

pMg values (unitless) for E. coli model compartments used here

EC_POTENTIAL_DIFFERENCES = {('c', 'p'): -0.15, ('p', 'e'): -0.15} module-attribute

Potential differences (in V) for E. coli model compartments used here

ENZYME_VAR_INFIX = '_of_' module-attribute

Infix for separation of enzyme name and reaction name

ENZYME_VAR_PREFIX = 'enzyme_' module-attribute

Prefix of problem variables which stand for enzyme concentrations

ERROR_BOUND_LOWER_CHANGE_PREFIX = 'bound_error_change_lower_' module-attribute

Prefix for fixed variables that show how much affected lower variable bounds have to be changed

ERROR_BOUND_UPPER_CHANGE_PREFIX = 'bound_error_change_upper_' module-attribute

Prefix for fixed variables that show how much affected upper variable bounds have to be changed

ERROR_CONSTRAINT_PREFIX = 'flux_error_' module-attribute

Prefix for the constraint that defines a scenario flux constraint

ERROR_SUM_VAR_ID = 'error_sum' module-attribute

Name for the variable that holds the sum of all error term variables

ERROR_VAR_PREFIX = 'error_' module-attribute

Prefix for error term variables for feasibility-making optimizations

FLUX_SUM_VAR_ID = 'FLUX_SUM_VAR' module-attribute

Name of optional variable that holds the sum of all reaction fluxes

GAMMA_VAR_PREFIX = 'gamma_var_' module-attribute

Prefix for variables representing the thermodynamic restriction of a reaction (used in non-linear programs)

GENERALIZED_SUM_CONSTRAINT_NAME = 'generalized_sum_constraint' module-attribute

Name of constraint for generalized sum (protein pool + metabolite concentrations)

IOTA_VAR_PREFIX = 'iota_var_' module-attribute

Prefix for variables representing the inhibition of a reaction (used in non-linear programs)

KAPPA_PRODUCTS_VAR_PREFIX = 'kappa_products_' module-attribute

Prefix for variables representing the sum of logarithmized product concentrations minus the logarithmized sum of km values

KAPPA_SUBSTRATES_VAR_PREFIX = 'kappa_substrates_' module-attribute

Prefix for variables representing the sum of logarithmized substrate concentrations minus the logarithmized sum of km values

KAPPA_VAR_PREFIX = 'kappa_var_' module-attribute

Prefix for variables representing the saturation (kinetic) restriction of a reaction (used in non-linear programs)

LNCONC_VAR_PREFIX = 'x_' module-attribute

Prefix for logarithmized concentration problem variables

MDF_VAR_ID = 'var_B' module-attribute

Name for minimally occurring driving force variable

OBJECTIVE_CONSTRAINT_NAME = 'objective_constraint' module-attribute

Name for constraint that defines the objective function's term

OBJECTIVE_VAR_NAME = 'OBJECTIVE_VAR' module-attribute

Name for variable that holds the objective value

PROT_POOL_MET_NAME = 'prot_pool' module-attribute

Identifier of the protein pool representing pseudo-metabolite

PROT_POOL_REAC_NAME = PROT_POOL_MET_NAME + '_delivery' module-attribute

Identifier of the pseudo-reaction which created the protein pool pseudo-metabolite

QUASI_INF = 100000 module-attribute

Big number (larger than big M) for values that would reach inf (thereby potentially causing solver problems)

REAC_ENZ_SEPARATOR = '_ENZ_' module-attribute

Separator between enzyme-constrained reaction ID and attached enzyme name

REAC_FWD_SUFFIX = '_FWD' module-attribute

Standard suffix for reaction IDs that represent forward directions of originally irreversible reactions

REAC_REV_SUFFIX = '_REV' module-attribute

Standard suffix for reaction IDs that represent reverse directions of originally irreversible reactions

SOLVER_STATUS_KEY = 'SOLVER_STATUS' module-attribute

Solver status optimization dict key

STANDARD_CONC_RANGES = {'DEFAULT': (1e-06, 0.2), 'h_c': (1.0, 1.0), 'h_p': (1.0, 1.0), 'h_e': (1.0, 1.0), 'h20_c': (1.0, 1.0), 'h20_p': (1.0, 1.0), 'h20_e': (1.0, 1.0)} module-attribute

Standard concentration ranges applicable to models with BiGG IDs; water and protons are set to one as their effect is directly included in the ΔG'° calculation (see the eQuilibrator FAQ), while the rest is set to wide ranges.
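A dictionary of this shape is typically resolved per metabolite with a DEFAULT fallback; a minimal sketch (the `conc_range` helper is hypothetical, the shown values are copied from the constant above):

```python
# Subset of the STANDARD_CONC_RANGES constant documented above
STANDARD_CONC_RANGES = {
    "DEFAULT": (1e-06, 0.2),
    "h_c": (1.0, 1.0),
}

def conc_range(met_id, ranges):
    # Metabolites without a dedicated entry fall back to the wide
    # DEFAULT range; water and protons are pinned to 1.0 as their
    # effect is already included in the ΔG'° calculation.
    return ranges.get(met_id, ranges["DEFAULT"])

print(conc_range("h_c", STANDARD_CONC_RANGES))       # (1.0, 1.0)
print(conc_range("glc__D_e", STANDARD_CONC_RANGES))  # (1e-06, 0.2)
```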

STANDARD_MAX_PROT_POOL = 0.25 module-attribute

A (for E. coli) comparatively high maximal pool of metabolic enzymes as a fraction of the total dry weight mass (in g⋅gDW⁻¹).

STANDARD_MIN_MDF = 0.001 module-attribute

Standard minimally occurring driving force for active reactions in kJ⋅mol⁻¹

STANDARD_R = 0.008314 module-attribute

Standard gas constant in kJ⋅K⁻¹⋅mol⁻¹ (attention: the standard value is often given in J⋅K⁻¹⋅mol⁻¹, but here it is needed in kJ⋅K⁻¹⋅mol⁻¹)

STANDARD_T = 298.15 module-attribute

Standard temperature in Kelvin

TERMINATION_CONDITION_KEY = 'TERMINATION_CONDITION' module-attribute

Solver termination condition key in optimization dict

USED_IDENTIFIERS_FOR_EQUILIBRATOR = ['inchi', 'inchi_key', 'metanetx.chemical', 'bigg.metabolite', 'kegg.compound', 'chebi', 'sabiork.compound', 'metacyc.compound', 'hmdb', 'swisslipid', 'reactome', 'lipidmaps', 'seed.compound'] module-attribute

Standard set of metabolite identifier annotation names used for eQuilibrator lookups in the E. coli models used here

ZB_VAR_PREFIX = 'zb_var_' module-attribute

Extra zb variable prefix for thermodynamic bottleneck analyses

Z_VAR_PREFIX = 'z_var_' module-attribute

Prefix of z variables (used with thermodynamic constraints in MI(N)LPs)
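As an illustration of how STANDARD_R and STANDARD_T enter the thermodynamic constraints, here is a minimal, self-contained sketch (not COBRAk API code; the reaction and concentration values are made up) computing a reaction's driving force f = -(ΔG'° + R⋅T⋅Σ sᵢ⋅ln(cᵢ)), which thermodynamic constraints require to be at least STANDARD_MIN_MDF for active reactions:

```python
from math import log

STANDARD_R = 0.008314  # kJ⋅K⁻¹⋅mol⁻¹
STANDARD_T = 298.15    # K

def driving_force(dG0_prime: float, stoichiometries: dict[str, float],
                  concs: dict[str, float]) -> float:
    """f = -(ΔG'° + R⋅T⋅Σ sᵢ⋅ln(cᵢ)); a positive f lets the reaction run forward."""
    ln_q = sum(s * log(concs[met]) for met, s in stoichiometries.items())
    return -(dG0_prime + STANDARD_R * STANDARD_T * ln_q)

# Hypothetical reaction A → B with ΔG'° = -5 kJ⋅mol⁻¹ and made-up concentrations (M)
f = driving_force(-5.0, {"A": -1.0, "B": 1.0}, {"A": 1e-3, "B": 1e-6})
```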

dataclasses

Contains all dataclasses (and enums) used by COBRAk to define a metabolic model and its extra constraints and optimization objective.

Dataclasses are similar to structs in C: They are not intended to have member functions, only other types of member variables. The main dataclass used by COBRAk is Model, which contains the full information about the metabolic model. As member variables, a Model contains further dataclasses (such as Reaction, Metabolite, ...). As dataclass_json is also invoked, it is possible to store and load the COBRAk dataclasses as JSON.

ErrorScenario = dict[str, tuple[float, float]] module-attribute

A COBRAk error scenario type alias for a CorrectionConfig; Is dict[str, tuple[float, float]]

OptResult = dict[str, float] module-attribute

A COBRAk optimization result type alias; Is dict[str, float]

VarResult = dict[str, tuple[float | None, float | None]] module-attribute

A COBRAk variability result type alias; Is dict[str, tuple[float | None, float | None]]

CorrectionConfig

Stores the configuration for corrections in a model (see parameter corrections chapter in documentation).

Source code in cobrak/dataclasses.py
@dataclass
class CorrectionConfig:
    """Stores the configuration for corrections in a model (see parameter corrections chapter in documentation)."""

    error_scenario: dict[str, tuple[float, float]] = Field(default_factory=dict)
    """A dictionary where keys are error scenarios and values are tuples representing the lower and upper bounds of the error. Defaults to {}."""
    add_flux_error_term: bool = False
    """Indicates whether to add flux error terms. Defaults to False."""
    add_met_logconc_error_term: bool = False
    """Indicates whether to add metabolite log concentration error terms. Defaults to False."""
    add_enzyme_conc_error_term: bool = False
    """Indicates whether to add enzyme concentration error terms. Defaults to False."""
    add_kcat_times_e_error_term: bool = False
    """Indicates whether to add k_cat ⋅ [E] error terms. Defaults to False."""
    kcat_times_e_error_cutoff: PositiveFloat = 1.0
    """The cutoff value for the k_cat ⋅ [E] error term. Defaults to 1.0."""
    max_rel_kcat_times_e_correction: PositiveFloat = QUASI_INF
    """Maximal relative correction for the k_cat ⋅ [E] error term. Defaults to QUASI_INF."""
    add_dG0_error_term: bool = False
    """Indicates whether to add ΔG'° error terms. Defaults to False."""
    dG0_error_cutoff: PositiveFloat = 1.0
    """The cutoff value for the ΔG'° error terms. Defaults to 1.0."""
    max_abs_dG0_correction: PositiveFloat = QUASI_INF
    """Maximal absolute correction for the dG0 error term. Defaults to QUASI_INF."""
    add_km_error_term: bool = False
    """Indicates whether to add a κ error term. Defaults to False."""
    km_error_cutoff: PositiveFloat = 1.0
    """Cutoff value for the κ error term. Defaults to 1.0."""
    max_rel_km_correction: PositiveFloat = 0.999
    """Maximal relative correction for the κ error term. Defaults to 0.999."""
    add_ki_error_term: bool = False
    """Indicates whether to add a ι error term. Defaults to False."""
    ki_error_cutoff: PositiveFloat = 1.0
    """Cutoff value for the ι error term. Defaults to 1.0."""
    max_rel_ki_correction: PositiveFloat = 0.999
    """Maximal relative correction for the ι error term. Defaults to 0.999."""
    add_ka_error_term: bool = False
    """Indicates whether to add an α error term. Defaults to False."""
    ka_error_cutoff: PositiveFloat = 1.0
    """Cutoff value for the α error term. Defaults to 1.0."""
    max_rel_ka_correction: PositiveFloat = 0.999
    """Maximal relative correction for the α error term. Defaults to 0.999."""
    error_sum_as_qp: bool = False
    """Indicates whether to use a quadratic programming approach for the error sum. Defaults to False."""
    add_error_sum_term: bool = True
    """Whether to add an error sum term. Defaults to True."""
    use_weights: bool = False
    """Indicates whether to use weights for the corrections (otherwise, the weight is 1.0). Defaults to False."""
    weight_percentile: NonNegativeInt = 90
    """Percentile to use for weight calculation. Defaults to 90."""
    extra_weights: dict[str, float] = Field(default_factory=dict)
    """Dictionary to store extra weights for specific corrections. Defaults to {}."""
    var_lb_ub_application: Literal["", "exp", "log"] = ""
    """The application method for variable lower and upper bounds. Either '' (x=x), 'exp' or 'log'. Defaults to ''."""

add_dG0_error_term = False class-attribute instance-attribute

Indicates whether to add ΔG'° error terms. Defaults to False.

add_enzyme_conc_error_term = False class-attribute instance-attribute

Indicates whether to add enzyme concentration error terms. Defaults to False.

add_error_sum_term = True class-attribute instance-attribute

Whether to add an error sum term. Defaults to True.

add_flux_error_term = False class-attribute instance-attribute

Indicates whether to add flux error terms. Defaults to False.

add_ka_error_term = False class-attribute instance-attribute

Indicates whether to add an α error term. Defaults to False.

add_kcat_times_e_error_term = False class-attribute instance-attribute

Indicates whether to add k_cat ⋅ [E] error terms. Defaults to False.

add_ki_error_term = False class-attribute instance-attribute

Indicates whether to add a ι error term. Defaults to False.

add_km_error_term = False class-attribute instance-attribute

Indicates whether to add a κ error term. Defaults to False.

add_met_logconc_error_term = False class-attribute instance-attribute

Indicates whether to add metabolite log concentration error terms. Defaults to False.

dG0_error_cutoff = 1.0 class-attribute instance-attribute

The cutoff value for the ΔG'° error terms. Defaults to 1.0.

error_scenario = Field(default_factory=dict) class-attribute instance-attribute

A dictionary where keys are error scenarios and values are tuples representing the lower and upper bounds of the error. Defaults to {}.

error_sum_as_qp = False class-attribute instance-attribute

Indicates whether to use a quadratic programming approach for the error sum. Defaults to False.

extra_weights = Field(default_factory=dict) class-attribute instance-attribute

Dictionary to store extra weights for specific corrections. Defaults to {}.

ka_error_cutoff = 1.0 class-attribute instance-attribute

Cutoff value for the α error term. Defaults to 1.0.

kcat_times_e_error_cutoff = 1.0 class-attribute instance-attribute

The cutoff value for the k_cat ⋅ [E] error term. Defaults to 1.0.

ki_error_cutoff = 1.0 class-attribute instance-attribute

Cutoff value for the ι error term. Defaults to 1.0.

km_error_cutoff = 1.0 class-attribute instance-attribute

Cutoff value for the κ error term. Defaults to 1.0.

max_abs_dG0_correction = QUASI_INF class-attribute instance-attribute

Maximal absolute correction for the dG0 error term. Defaults to QUASI_INF.

max_rel_ka_correction = 0.999 class-attribute instance-attribute

Maximal relative correction for the α error term. Defaults to 0.999.

max_rel_kcat_times_e_correction = QUASI_INF class-attribute instance-attribute

Maximal relative correction for the k_cat ⋅ [E] error term. Defaults to QUASI_INF.

max_rel_ki_correction = 0.999 class-attribute instance-attribute

Maximal relative correction for the ι error term. Defaults to 0.999.

max_rel_km_correction = 0.999 class-attribute instance-attribute

Maximal relative correction for the κ error term. Defaults to 0.999.

use_weights = False class-attribute instance-attribute

Indicates whether to use weights for the corrections (otherwise, the weight is 1.0). Defaults to False.

var_lb_ub_application = '' class-attribute instance-attribute

The application method for variable lower and upper bounds. Either '' (x=x), 'exp' or 'log'. Defaults to ''.

weight_percentile = 90 class-attribute instance-attribute

Percentile to use for weight calculation. Defaults to 90.

Enzyme

Represents an enzyme in a metabolic model (note: 'enzyme' stands for a single polypeptide).

Members

molecular_weight (float): The enzyme's molecular weight in kDa.
min_conc (float | None): [Optional] If wanted, one can set a special minimal concentration for the enzyme. Defaults to None, i.e., no given concentration value (only the total enzyme pool is the limit).
max_conc (float | None): [Optional] If wanted, one can set a special maximal concentration for the enzyme. Defaults to None, i.e., no given concentration value (only the total enzyme pool is the limit).
annotation (dict[str, str | list[str]]): [Optional] Dictionary containing additional enzyme annotation, e.g., {"UNIPROT_ID": "b12345"}. Defaults to {}.
name (str): [Optional] Colloquial name of the enzyme.
sequence (str): [Optional] Protein sequence of the enzyme (note: 'enzyme' stands for a single polypeptide).

Source code in cobrak/dataclasses.py
@dataclass
class Enzyme:
    """Represents an enzyme in a metabolic model (note: 'enzyme' stands for a single polypeptide).

    Members:
        molecular_weight (float):
            The enzyme's molecular weight in kDa.
        min_conc (float | None):
            [Optional] If wanted, one can set a special minimal concentration
            for the enzyme.
            Defaults to None, i.e., no given concentration value (i.e., only the total
            enzyme pool is the limit).
        max_conc (float | None):
            [Optional] If wanted, one can set a special maximal concentration
            for the enzyme.
            Defaults to None, i.e., no given concentration value (i.e., only the total
            enzyme pool is the limit).
        annotation (dict[str, str | list[str]]):
            [Optional] Dictionary containing additional enzyme annotation,
            e.g., {"UNIPROT_ID": "b12345"}.
            Defaults to '{}'.
        name: str:
            [Optional] Colloquial name of enzyme
        sequence: str:
            [Optional] Protein sequence of enzyme (note: 'enzyme' stands for a single polypeptide)
    """

    molecular_weight: float = 1e20
    """The enzyme's molecular weight in kDa. Defaults to 1e20 (a very high value that shall be replaced with a real molecular weight)."""
    min_conc: PositiveFloat | None = None
    """The enzyme's minimal concentration in mmol⋅gDW⁻¹"""
    max_conc: PositiveFloat | None = None
    """The enzyme's maximal concentration in mmol⋅gDW⁻¹"""
    annotation: dict[str, str | list[str]] = Field(default_factory=dict)
    """Any annotation data for the enzyme (e.g., references). Has no effect on calculations"""
    name: str = ""
    """Colloquial name of enzyme"""
    sequence: str = ""
    """Protein sequence of enzyme (note: 'enzyme' stands for a single polypeptide)"""

annotation = Field(default_factory=dict) class-attribute instance-attribute

Any annotation data for the enzyme (e.g., references). Has no effect on calculations

max_conc = None class-attribute instance-attribute

The enzyme's maximal concentration in mmol⋅gDW⁻¹

min_conc = None class-attribute instance-attribute

The enzyme's minimal concentration in mmol⋅gDW⁻¹

molecular_weight = 1e+20 class-attribute instance-attribute

The enzyme's molecular weight in kDa. Defaults to 1e20 (a very high value that shall be replaced with a real molecular weight).

name = '' class-attribute instance-attribute

Colloquial name of enzyme

sequence = '' class-attribute instance-attribute

Protein sequence of enzyme (note: 'enzyme' stands for a single polypeptide)
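For intuition on how these members interact with the total enzyme pool: assuming enzyme concentrations in mmol⋅gDW⁻¹ and molecular weights in kDa (1 kDa = 1 g⋅mmol⁻¹), their products sum to the protein mass that max_prot_pool bounds. A minimal sketch (not COBRAk API code; enzyme IDs and numbers are made up):

```python
def pool_usage(concs_mmol_per_gdw: dict[str, float],
               molecular_weights_kda: dict[str, float]) -> float:
    """Return total enzyme mass in g⋅gDW⁻¹: Σ [E]ᵢ ⋅ MWᵢ (1 kDa = 1 g⋅mmol⁻¹)."""
    return sum(conc * molecular_weights_kda[enzyme_id]
               for enzyme_id, conc in concs_mmol_per_gdw.items())

# Two hypothetical enzymes; usage stays below STANDARD_MAX_PROT_POOL (0.25 g⋅gDW⁻¹)
usage = pool_usage({"E1": 1e-3, "E2": 5e-4}, {"E1": 50.0, "E2": 100.0})
```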

EnzymeReactionData

Represents the enzymes used by a reaction.

Source code in cobrak/dataclasses.py
@dataclass
class EnzymeReactionData:
    """Represents the enzymes used by a reaction."""

    identifiers: list[str]
    """The identifiers (must be given in the associated Model enzymes instance) of the reaction's enzyme(s)"""
    k_cat: PositiveFloat = 1e20
    """The reaction's k_cat (turnover number) in h⁻¹"""
    k_cat_references: list[ParameterReference] = Field(default_factory=list)
    """[Optional] List of references showing the source(s) of the k_cat value"""
    k_ms: dict[str, PositiveFloat] = Field(default_factory=dict)
    """[Optional] The reaction's k_ms (Michaelis-Menten constants) in M=mol⋅l⁻¹. Metabolite IDs are keys, k_ms the values. Default is {}"""
    k_m_references: dict[str, list[ParameterReference]] = Field(default_factory=dict)
    """[Optional] References showing the source(s) of the k_m values. Metabolite IDs are keys, the source lists values. Default is {}"""
    k_is: dict[str, PositiveFloat] = Field(default_factory=dict)
    """[Optional] The reaction's k_is (Inhibition constants) in M=mol⋅l⁻¹. Metabolite IDs are keys, k_is the values. Default is {}"""
    k_i_references: dict[str, list[ParameterReference]] = Field(default_factory=dict)
    """[Optional] References showing the source(s) of the k_i values. Metabolite IDs are keys, the source lists values. Default is {}"""
    k_as: dict[str, PositiveFloat] = Field(default_factory=dict)
    """[Optional] The reaction's k_as (Activation constants) in M=mol⋅l⁻¹. Metabolite IDs are keys, k_as the values. Default is {}"""
    k_a_references: dict[str, list[ParameterReference]] = Field(default_factory=dict)
    """[Optional] References showing the source(s) of the k_a values. Metabolite IDs are keys, the source lists values. Default is {}"""
    hill_coefficients: HillCoefficients = Field(default_factory=HillCoefficients)
    """[Optional] If given, the reaction's Hill coefficients in the form of a HillCoefficients instance (whose term dicts use metabolite IDs as keys). Default is an empty HillCoefficients()."""
    hill_coefficient_references: HillParameterReferences = Field(
        default_factory=HillParameterReferences
    )
    """[Optional] References showing the source(s) of the Hill coefficients. Metabolite IDs are keys, the source lists values. Default is {}"""
    special_stoichiometries: dict[str, PositiveFloat] = Field(default_factory=dict)
    """[Optional] Special (non-1) stoichiometries of polypeptides/enzymes in the reaction's enzyme. Default is {}"""

hill_coefficient_references = Field(default_factory=HillParameterReferences) class-attribute instance-attribute

[Optional] References showing the source(s) of the Hill coefficients. Metabolite IDs are keys, the source lists values. Default is {}

hill_coefficients = Field(default_factory=HillCoefficients) class-attribute instance-attribute

[Optional] If given, the reaction's Hill coefficients in the form of a HillCoefficients instance (whose term dicts use metabolite IDs as keys). Default is an empty HillCoefficients().

identifiers instance-attribute

The identifiers (must be given in the associated Model enzymes instance) of the reaction's enzyme(s)

k_a_references = Field(default_factory=dict) class-attribute instance-attribute

[Optional] References showing the source(s) of the k_a values. Metabolite IDs are keys, the source lists values. Default is {}

k_as = Field(default_factory=dict) class-attribute instance-attribute

[Optional] The reaction's k_as (Activation constants) in M=mol⋅l⁻¹. Metabolite IDs are keys, k_as the values. Default is {}

k_cat = 1e+20 class-attribute instance-attribute

The reaction's k_cat (turnover number) in h⁻¹

k_cat_references = Field(default_factory=list) class-attribute instance-attribute

[Optional] List of references showing the source(s) of the k_cat value

k_i_references = Field(default_factory=dict) class-attribute instance-attribute

[Optional] References showing the source(s) of the k_i values. Metabolite IDs are keys, the source lists values. Default is {}

k_is = Field(default_factory=dict) class-attribute instance-attribute

[Optional] The reaction's k_is (Inhibition constants) in M=mol⋅l⁻¹. Metabolite IDs are keys, k_is the values. Default is {}

k_m_references = Field(default_factory=dict) class-attribute instance-attribute

[Optional] References showing the source(s) of the k_m values. Metabolite IDs are keys, the source lists values. Default is {}

k_ms = Field(default_factory=dict) class-attribute instance-attribute

[Optional] The reaction's k_ms (Michaelis-Menten constants) in M=mol⋅l⁻¹. Metabolite IDs are keys, k_ms the values. Default is {}

special_stoichiometries = Field(default_factory=dict) class-attribute instance-attribute

[Optional] Special (non-1) stoichiometries of polypeptides/enzymes in the reaction's enzyme. Default is {}

ExtraLinearConstraint

Represents a general linear Model constraint.

This can affect not only reactions, but also all other variables (including watches) set in a COBRAk model. E.g., if one wants (for whatever reason) the following constraint:

0.5 <= [A] - 2 * r_R1 <= 2.1

the corresponding ExtraLinearConstraint instance would be:

ExtraLinearConstraint(
    stoichiometries={
        "x_A": 1.0,
        "R1": -2,
    },
    lower_value=0.5,
    upper_value=2.1,
)

lower_value or upper_value can be None if no such limit is desired.

Source code in cobrak/dataclasses.py
@dataclass
class ExtraLinearConstraint:
    """Represents a general linear Model constraint.

    This can affect not only reactions, but also all other
    variables (including watches) set in a COBRAk model. E.g., if one wants (for whatever
    reason) the following constraint:
    0.5 <= [A] - 2 * r_R1 <= 2.1
    the corresponding ExtraLinearConstraint instance would be:
    ExtraLinearConstraint(
        stoichiometries = {
            "x_A": 1.0,
            "R1": -2,
        },
        lower_value = 0.5,
        upper_value = 2.1,
    )
    lower_value or upper_value can be None if no such limit is desired.
    """

    stoichiometries: dict[str, float]
    """Keys: Model variable names; Children: Multipliers of constraint"""
    lower_value: float | None = None
    """Minimal numeric constraint value. Either this and/or upper_value must be not None. Defaults to None."""
    upper_value: float | None = None
    """Maximal numeric constraint value. Either this and/or lower_value must be not None. Defaults to None."""

lower_value = None class-attribute instance-attribute

Minimal numeric constraint value. Either this and/or upper_value must be not None. Defaults to None.

stoichiometries instance-attribute

Keys: Model variable names; Children: Multipliers of constraint

upper_value = None class-attribute instance-attribute

Maximal numeric constraint value. Either this and/or lower_value must be not None. Defaults to None.

ExtraLinearWatch

Represents a linear 'watch', i.e. a variable that shows the linear sum of other variables.

A watch can be not only about reactions, but also all other variables (except watches that are defined after this one in the Model's extra_linear_watches member variable) set in a COBRAk model. E.g., if one wants (for whatever reason) a variable for the following expression:

[A] - 2 * r_R1

we set:

ExtraLinearWatch(
    stoichiometries={
        "x_A": 1.0,
        "R1": -2,
    },
)

The name of the watch is set as the dictionary key in the model's extra_linear_watches member variable.

Source code in cobrak/dataclasses.py
@dataclass
class ExtraLinearWatch:
    """Represents a linear 'watch', i.e. a variable that shows the linear sum of other variables.

    A watch can be not only about reactions, but also all other
    variables (except watches that are defined *after* this one in the Model's extra_linear_watches
    member variable) set in a COBRAk model. E.g., if one wants (for whatever
    reason) a variable for the following expression:
    [A] - 2 * r_R1, we set
    ExtraLinearWatch(
        stoichiometries = {
            "x_A": 1.0,
            "R1": -2,
        },
    )

    The name of the watch is set as the dictionary key in the model's extra_linear_watches
    member variable.
    """

    stoichiometries: dict[str, float]

ExtraNonlinearConstraint

Represents a general non-linear Model constraint.

Important note: Setting such a non-linear constraint makes any optimization non-linear and thus incompatible with linear solvers and computationally much more expensive!

This can affect not only reactions, but also all other variables (including watches) set in a COBRA-k model. E.g., if one wants (for whatever reason) the following constraint:

0.5 <= log([A]^2 - 2 * exp(r_R1)) <= 2.1

the corresponding ExtraNonlinearConstraint instance would be:

ExtraNonlinearConstraint(
    stoichiometries={
        "x_A": (1.0, "power2"),
        "R1": (-2, "exp"),
    },
    full_application="log",
    lower_value=0.5,
    upper_value=2.1,
)

Allowed non-linear functions are currently 'powerX' (with X as a float-readable exponent), 'exp' and 'log'. If you just want the normal value, 'same' can be used (i.e. multiply with 1). lower_value or upper_value can be None if no such limit is desired. Also, full_application defaults to 'same', which is to be set if no function on the full term is wished.

Source code in cobrak/dataclasses.py
@dataclass
class ExtraNonlinearConstraint:
    """Represents a general non-linear Model constraint.

    Important note: Setting such a non-linear constraint makes any optimization non-linear and thus incompatible
    with linear solvers and computationally much more expensive!

    This can affect not only reactions, but also all other
    variables (including watches) set in a COBRA-k model. E.g., if one wants (for whatever
    reason) the following constraint:
    0.5 <= log([A]^2 - 2 * exp(r_R1)) <= 2.1
    the corresponding ExtraNonlinearConstraint instance would be:
    ExtraNonlinearConstraint(
        stoichiometries = {
            "x_A": (1.0, "power2"),
            "R1": (-2, "exp"),
        },
        full_application = "log",
        lower_value = 0.5,
        upper_value = 2.1,
    )
    Allowed non-linear functions are currently 'powerX' (with X as float-readable exponent), 'exp' and 'log'. If you just want
    the normal value, 'same' can be used (i.e. multiply with 1).
    lower_value or upper_value can be None if no such limit is desired.
    Also, full_application is by default 'same', which is to be set if no function on the full term is wished.
    """

    stoichiometries: dict[str, tuple[float, str]]
    """Keys: Model variable names; Children: (Multipliers of constraint, function name 'same' (multiply with 1), 'powerX' (with X as float-readable exponent), 'exp' or 'log')"""
    full_application: str = "same"
    """Either function name 'same' (multiply with 1), 'powerX' (with X as a float-readable exponent), 'exp' or 'log'. Defaults to 'same'."""
    lower_value: float | None = None
    """Minimal numeric constraint value. Either this and/or upper_value must be not None. Defaults to None."""
    upper_value: float | None = None
    """Maximal numeric constraint value. Either this and/or lower_value must be not None. Defaults to None."""

full_application = 'same' class-attribute instance-attribute

Either function name 'same' (multiply with 1), 'powerX' (with X as a float-readable exponent), 'exp' or 'log'. Defaults to 'same'.

lower_value = None class-attribute instance-attribute

Minimal numeric constraint value. Either this and/or upper_value must be not None. Defaults to None.

stoichiometries instance-attribute

Keys: Model variable names; Children: (Multipliers of constraint, function name 'same' (multiply with 1), 'powerX' (with X as float-readable exponent), 'exp' or 'log')

upper_value = None class-attribute instance-attribute

Maximal numeric constraint value. Either this and/or lower_value must be not None. Defaults to None.

ExtraNonlinearWatch

Represents a non-linear 'watch', i.e. a variable that shows a sum of non-linearly transformed variables.

Important note: Setting such a non-linear watch makes any optimization non-linear and thus incompatible with linear solvers and computationally much more expensive!

A watch can be not only about reactions, but also all other variables (except watches that are defined after this one in the Model's extra_nonlinear_watches member variable) set in a COBRAk model. E.g., if one wants (for whatever reason) a variable for the following expression:

exp([A]) - 2 * r_R1^3

we set:

ExtraNonlinearWatch(
    stoichiometries={
        "x_A": (1.0, "exp"),
        "R1": (-2, "power3"),
    },
)

Allowed non-linear functions are currently 'powerX' (with X as a float-readable exponent), 'exp' and 'log'. If you just want the normal value, 'same' can be used (i.e. multiply with 1). The name of the watch is set as the dictionary key in the model's extra_nonlinear_watches member variable.

Source code in cobrak/dataclasses.py
@dataclass
class ExtraNonlinearWatch:
    """Represents a non-linear 'watch', i.e. a variable that shows a sum of non-linearly transformed variables.

    Important note: Setting such a non-linear watch makes any optimization non-linear and thus incompatible
    with linear solvers and computationally much more expensive!

    A watch can be not only about reactions, but also all other
    variables (except watches that are defined *after* this one in the Model's extra_nonlinear_watches
    member variable) set in a COBRAk model. E.g., if one wants (for whatever
    reason) a variable for the following expression:
    exp([A]) - 2 * r_R1^3, we set
    ExtraNonlinearWatch(
        stoichiometries = {
            "x_A": (1.0, "exp"),
            "R1": (-2, "power3"),
        },
    )

    Allowed non-linear functions are currently 'powerX' (with X as a float-readable exponent), 'exp' and 'log'. If you just want
    the normal value, 'same' can be used (i.e. multiply with 1).
    The name of the watch is set as the dictionary key in the model's extra_nonlinear_watches
    member variable.
    """

    stoichiometries: dict[str, tuple[float, str]]
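The term syntax shared by ExtraNonlinearConstraint and ExtraNonlinearWatch can be sketched by a small evaluator (illustrative, not COBRAk's implementation) that applies 'same', 'powerX', 'exp' or 'log' per variable before summing:

```python
from math import exp, log

def apply_func(value: float, func: str) -> float:
    """Apply one of the documented per-variable functions."""
    if func == "same":
        return value
    if func == "exp":
        return exp(value)
    if func == "log":
        return log(value)
    if func.startswith("power"):
        return value ** float(func[len("power"):])
    raise ValueError(f"Unknown function: {func}")

def evaluate(stoichiometries: dict[str, tuple[float, str]],
             solution: dict[str, float]) -> float:
    return sum(multiplier * apply_func(solution[var_id], func)
               for var_id, (multiplier, func) in stoichiometries.items())

# exp([A]) - 2 ⋅ r_R1³ with [A] = 0.0 and r_R1 = 1.0 evaluates to 1 - 2 = -1
value = evaluate({"x_A": (1.0, "exp"), "R1": (-2.0, "power3")},
                 {"x_A": 0.0, "R1": 1.0})
```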

HillCoefficients

Represents the Hill coefficients of a reaction, separated according to efficiency terms

Source code in cobrak/dataclasses.py
@dataclass
class HillCoefficients:
    """Represents the Hill coefficients of a reaction, separated according to efficiency terms"""

    kappa: dict[str, PositiveFloat] = Field(default_factory=dict)
    """Hill coefficients affecting the κ saturation term. Metabolite IDs are keys, coefficients values. Defaults to {}."""
    iota: dict[str, PositiveFloat] = Field(default_factory=dict)
    """Hill coefficients affecting the ι inhibition term. Metabolite IDs are keys, coefficients values. Defaults to {}."""
    alpha: dict[str, PositiveFloat] = Field(default_factory=dict)
    """Hill coefficients affecting the α activation term. Metabolite IDs are keys, coefficients values. Defaults to {}."""

alpha = Field(default_factory=dict) class-attribute instance-attribute

Hill coefficients affecting the α activation term. Metabolite IDs are keys, coefficients values. Defaults to {}.

iota = Field(default_factory=dict) class-attribute instance-attribute

Hill coefficients affecting the ι inhibition term. Metabolite IDs are keys, coefficients values. Defaults to {}.

kappa = Field(default_factory=dict) class-attribute instance-attribute

Hill coefficients affecting the κ saturation term. Metabolite IDs are keys, coefficients values. Defaults to {}.

HillParameterReferences

Represents the database references for the ι, α and κ Hill coefficients.

Source code in cobrak/dataclasses.py
@dataclass
class HillParameterReferences:
    """Represents the database references for the ι, α and κ Hill coefficients."""

    kappa: dict[str, list[ParameterReference]] = Field(default_factory=dict)
    """References for κ Hill coefficients."""
    iota: dict[str, list[ParameterReference]] = Field(default_factory=dict)
    """References for ι Hill coefficients."""
    alpha: dict[str, list[ParameterReference]] = Field(default_factory=dict)
    """References for α Hill coefficients."""

alpha = Field(default_factory=dict) class-attribute instance-attribute

References for α Hill coefficients.

iota = Field(default_factory=dict) class-attribute instance-attribute

References for ι Hill coefficients.

kappa = Field(default_factory=dict) class-attribute instance-attribute

References for κ Hill coefficients.

Metabolite

Represents a Model's metabolite.

Source code in cobrak/dataclasses.py
@dataclass
class Metabolite:
    """Represents a Model's metabolite."""

    log_min_conc: FiniteFloat = log(1e-6)
    """Minimal logarithmic concentration (only relevant for thermodynamic constraints); Default is log(1e-6 M)"""
    log_max_conc: FiniteFloat = log(0.02)
    """Maximal logarithmic concentration (only relevant for thermodynamic constraints); Default is log(0.02 M)"""
    annotation: dict[str, str | list[str]] = Field(default_factory=dict)
    """Annotation (e.g., CHEBI numbers, ...); Default is {}"""
    name: str = ""
    """Colloquial name of metabolite"""
    formula: str = ""
    """Chemical formula of metabolite"""
    charge: int = 0
    """Electron charge of metabolite"""
    smiles: str = ""
    """SMILES string of metabolite"""
    compartment: str = ""
    """Identifier for a metabolite's compartment"""
    molar_mass: None | float = None
    """Molar mass of metabolite (g⋅mol⁻¹)"""

annotation = Field(default_factory=dict) class-attribute instance-attribute

Annotation (e.g., CHEBI numbers, ...); Default is {}

charge = 0 class-attribute instance-attribute

Electron charge of metabolite

compartment = '' class-attribute instance-attribute

Identifier for a metabolite's compartment

formula = '' class-attribute instance-attribute

Chemical formula of metabolite

log_max_conc = log(0.02) class-attribute instance-attribute

Maximal logarithmic concentration (only relevant for thermodynamic constraints); Default is log(0.02 M)

log_min_conc = log(1e-06) class-attribute instance-attribute

Minimal logarithmic concentration (only relevant for thermodynamic constraints); Default is log(1e-6 M)

molar_mass = None class-attribute instance-attribute

Molar mass of metabolite (g⋅mol⁻¹)

name = '' class-attribute instance-attribute

Colloquial name of metabolite

smiles = '' class-attribute instance-attribute

SMILES string of metabolite
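Since concentration bounds are stored logarithmically (in M), a typical workflow derives log_min_conc/log_max_conc from linear concentration ranges such as STANDARD_CONC_RANGES. A sketch of this conversion (assumed workflow, not COBRAk API code):

```python
from math import log

def log_bounds(met_id: str,
               conc_ranges: dict[str, tuple[float, float]]) -> tuple[float, float]:
    """Return (log_min_conc, log_max_conc) for a metabolite, with a DEFAULT fallback."""
    lower, upper = conc_ranges.get(met_id, conc_ranges["DEFAULT"])
    return log(lower), log(upper)

ranges = {"DEFAULT": (1e-6, 0.2), "h_c": (1.0, 1.0)}
lo, hi = log_bounds("atp_c", ranges)    # wide default range, in log space
h_lo, h_hi = log_bounds("h_c", ranges)  # protons fixed at log(1.0) == 0.0
```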

Model

Represents a metabolic model in COBRAk.

This includes its Reaction instances (which define the reaction stoichiometries), its Metabolite instances (which are referenced in the mentioned stoichiometries), as well as optional enzymatic and thermodynamic data.

Source code in cobrak/dataclasses.py
@dataclass
class Model:
    """Represents a metabolic model in COBRAk.

    This includes its Reaction instances (which define the reaction stoichiometries),
    its Metabolite instances (which are referenced in the mentioned stoichiometries),
    as well as optional enzymatic and thermodynamic data.
    """

    metabolites: dict[str, Metabolite]
    """Keys: Metabolite IDs; Children: Metabolite instances"""
    reactions: dict[str, Reaction]
    """Keys: Reaction IDs; Children: Reaction instances"""
    enzymes: dict[str, Enzyme] = Field(default_factory=dict)
    """[Only neccessary with enzymatic constraints] Keys: Enzyme IDs; Children: Enzyme instances; default is {}"""
    max_prot_pool: PositiveFloat = Field(default=1e9)
    """[Only neccessary with enzymatic constraints] Maximal usable protein pool in g/gDW; default is 1e9, i.e. basically unrestricted"""
    extra_linear_watches: dict[str, ExtraLinearWatch] = Field(default_factory=dict)
    """[Optional] Extra non-linear watches. Keys are watch names, children the watch definition."""
    extra_nonlinear_watches: dict[str, ExtraNonlinearWatch] = Field(
        default_factory=dict
    )
    """[Optional] Extra non-linear watches. Keys are watch names, children the watch definition."""
    extra_linear_constraints: list[ExtraLinearConstraint] = Field(default_factory=list)
    """[Optional] Extra linear constraints"""
    extra_nonlinear_constraints: list[ExtraNonlinearConstraint] = Field(
        default_factory=list
    )
    """[Optional] Extra non-linear constraints"""
    kinetic_ignored_metabolites: list[str] = Field(default_factory=list)
    """[Optional and only works with saturation term constraints] Metabolite IDs for which no k_m is neccessary"""
    R: PositiveFloat = Field(default=STANDARD_R)
    """[Optional and only works with thermodynamic constraints] Gas constant reference for dG'° in kJ⋅K⁻¹⋅mol⁻¹; default is STANDARD_R"""
    T: PositiveFloat = Field(default=STANDARD_T)
    """[Optional and only works with thermodynamic constraints] Temperature reference for dG'° in K; default is STANDARD_T"""
    annotation: dict[str, str | list[str]] = Field(default_factory=dict)
    """[Optional] Any annotation for the model itself (e.g., its name or references). Has no effect on calculations."""
    reac_enz_separator: str = REAC_ENZ_SEPARATOR
    """[Optional] String infix that separated reaction IDs of reaction with multiple enzyme variants from their enzyme ID. Defaults to '_ENZ_'"""
    fwd_suffix: str = REAC_FWD_SUFFIX
    """[Optional] Reaction ID suffix of forward reaction variants (e.g. in a reversible reaction A→B, for the direction A→B). Default is '_FWD'"""
    rev_suffix: str = REAC_REV_SUFFIX
    """[Optional] Reaction ID suffix of reverse reaction variants (e.g. in a reversible reaction A→B, for the direction B→A). Default is '_REV'"""
    max_conc_sum: float = float("inf")
    """[Optional and only works with thermodynamic constraints, and overridden with float("inf") when include_met_concs_sum_in_prot_pool==True] Maximal allowed sum of concentrations (for MILPs: linear approximation; for NLPs: Exact value). Inactive if set to default value of float('inf')"""
    conc_sum_ignore_prefixes: list[str] = Field(default_factory=list)
    """[Optional and only works with thermodynamic constraints] """
    conc_sum_include_suffixes: list[str] = Field(default_factory=list)
    """[Optional and only works with thermodynamic constraints] """
    conc_sum_max_rel_error: float = 0.05
    """[Optional and only works with MILPs with thermodynamic constraints] Maximal relative concentration sum approximation error"""
    conc_sum_min_abs_error: float = 1e-6
    """[Optional and only works with MILPs with thermodynamic constraints] Maximal absolute concentration sum approximation error"""
    include_mets_in_prot_pool: bool = False
    """[Experimental! Optional and only works with MILPs with enzyme and thermodynamic constraints] Whether or not metabolite masses are included in the protein (now generalized mass) pool (makes the problem non-linear!)"""

    def __enter__(self):  # noqa: ANN204
        """Method called when entering 'with' blocks"""
        # Return a deep copy of self
        return deepcopy(self)

    def __exit__(self, a, b, c):  # noqa: ANN001, ANN204
        """Method called when leaving a 'with' block"""
        return  # Return None to propagate any exceptions

R = Field(default=STANDARD_R) class-attribute instance-attribute

[Optional and only works with thermodynamic constraints] Gas constant reference for dG'° in kJ⋅K⁻¹⋅mol⁻¹; default is STANDARD_R

T = Field(default=STANDARD_T) class-attribute instance-attribute

[Optional and only works with thermodynamic constraints] Temperature reference for dG'° in K; default is STANDARD_T

annotation = Field(default_factory=dict) class-attribute instance-attribute

[Optional] Any annotation for the model itself (e.g., its name or references). Has no effect on calculations.

conc_sum_ignore_prefixes = Field(default_factory=list) class-attribute instance-attribute

[Optional and only works with thermodynamic constraints] Metabolite ID prefixes that are excluded from the concentration sum

conc_sum_include_suffixes = Field(default_factory=list) class-attribute instance-attribute

[Optional and only works with thermodynamic constraints] Metabolite ID suffixes that are included in the concentration sum

conc_sum_max_rel_error = 0.05 class-attribute instance-attribute

[Optional and only works with MILPs with thermodynamic constraints] Maximal relative concentration sum approximation error

conc_sum_min_abs_error = 1e-06 class-attribute instance-attribute

[Optional and only works with MILPs with thermodynamic constraints] Minimal absolute concentration sum approximation error

enzymes = Field(default_factory=dict) class-attribute instance-attribute

[Only necessary with enzymatic constraints] Keys: Enzyme IDs; Children: Enzyme instances; default is {}

extra_linear_constraints = Field(default_factory=list) class-attribute instance-attribute

[Optional] Extra linear constraints

extra_linear_watches = Field(default_factory=dict) class-attribute instance-attribute

[Optional] Extra linear watches. Keys are watch names, children the watch definition.

extra_nonlinear_constraints = Field(default_factory=list) class-attribute instance-attribute

[Optional] Extra non-linear constraints

extra_nonlinear_watches = Field(default_factory=dict) class-attribute instance-attribute

[Optional] Extra non-linear watches. Keys are watch names, children the watch definition.

fwd_suffix = REAC_FWD_SUFFIX class-attribute instance-attribute

[Optional] Reaction ID suffix of forward reaction variants (e.g. in a reversible reaction A→B, for the direction A→B). Default is '_FWD'

include_mets_in_prot_pool = False class-attribute instance-attribute

[Experimental! Optional and only works with MILPs with enzyme and thermodynamic constraints] Whether or not metabolite masses are included in the protein (now generalized mass) pool (makes the problem non-linear!)

kinetic_ignored_metabolites = Field(default_factory=list) class-attribute instance-attribute

[Optional and only works with saturation term constraints] Metabolite IDs for which no k_m is necessary

max_conc_sum = float('inf') class-attribute instance-attribute

[Optional and only works with thermodynamic constraints, and overridden with float("inf") when include_mets_in_prot_pool==True] Maximal allowed sum of concentrations (for MILPs: linear approximation; for NLPs: exact value). Inactive if set to default value of float('inf')

max_prot_pool = Field(default=1000000000.0) class-attribute instance-attribute

[Only necessary with enzymatic constraints] Maximal usable protein pool in g/gDW; default is 1e9, i.e. basically unrestricted

metabolites instance-attribute

Keys: Metabolite IDs; Children: Metabolite instances

reac_enz_separator = REAC_ENZ_SEPARATOR class-attribute instance-attribute

[Optional] String infix that separates reaction IDs of reactions with multiple enzyme variants from their enzyme ID. Defaults to '_ENZ_'

reactions instance-attribute

Keys: Reaction IDs; Children: Reaction instances

rev_suffix = REAC_REV_SUFFIX class-attribute instance-attribute

[Optional] Reaction ID suffix of reverse reaction variants (e.g. in a reversible reaction A→B, for the direction B→A). Default is '_REV'

__enter__()

Method called when entering 'with' blocks

Source code in cobrak/dataclasses.py
def __enter__(self):  # noqa: ANN204
    """Method called when entering 'with' blocks"""
    # Return a deep copy of self
    return deepcopy(self)

__exit__(a, b, c)

Method called when leaving a 'with' block

Source code in cobrak/dataclasses.py
def __exit__(self, a, b, c):  # noqa: ANN001, ANN204
    """Method called when leaving a 'with' block"""
    return  # Return None to propagate any exceptions
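
Together, __enter__ and __exit__ make a model usable in a with block that hands you a deep copy, so temporary modifications never touch the original. The pattern can be illustrated with a small stand-in class (not the actual COBRAk Model):

```python
from copy import deepcopy


class ScratchCopy:
    """Context manager that yields a deep copy of itself, mirroring
    the __enter__/__exit__ pattern of COBRAk's Model."""

    def __init__(self) -> None:
        self.reactions: dict[str, float] = {"R1": 1000.0}

    def __enter__(self) -> "ScratchCopy":
        # Entering the block returns a deep copy, not self
        return deepcopy(self)

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        return None  # returning None propagates any exception


model = ScratchCopy()
with model as scratch:
    scratch.reactions["R1"] = 0.0  # modify the copy only
print(model.reactions["R1"])  # original is unchanged: 1000.0
```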

ParameterReference

Represents the database reference for a kinetic parameter.

Source code in cobrak/dataclasses.py
@dataclass
class ParameterReference:
    """Represents the database reference for a kinetic parameter."""

    database: str = ""
    """(If given) The database from which this parameter was read. Defaults to ''."""
    comment: str = "(no refs)"
    """Any comment given for this value (e.g. literature)? Defaults to '(no refs)'."""
    species: str = ""
    """Scientific name of the species where this value was measured. Defaults to ''."""
    substrate: str = ""
    """The metabolite (or reaction substrate) for which this value was measured. Defaults to ''."""
    pubs: list[str] = Field(default_factory=list)
    """"""
    tax_distance: int | None = None
    value: float | None = None

comment = '(no refs)' class-attribute instance-attribute

Any comment given for this value (e.g. literature)? Defaults to '(no refs)'.

database = '' class-attribute instance-attribute

(If given) The database from which this parameter was read. Defaults to ''.

pubs = Field(default_factory=list) class-attribute instance-attribute

Publications (e.g., literature references) in which this value is reported. Defaults to [].

species = '' class-attribute instance-attribute

Scientific name of the species where this value was measured. Defaults to ''.

substrate = '' class-attribute instance-attribute

The metabolite (or reaction substrate) for which this value was measured. Defaults to ''.

Reaction

Represents a Model's reaction.

E.g., a reaction A -> B [0; 1000], ΔG'°=12.1 kJ⋅mol⁻¹, catalyzed by E1 with k_cat=1000 h⁻¹ would be

Reaction(
    stoichiometries={
        "A": -1,
        "B": +1,
    },
    min_flux=0,
    max_flux=1000,
    dG0=12.1,
    dG0_uncertainty=None,
    enzyme_reaction_data=EnzymeReactionData(
        identifiers=["E1"],
        k_cat=1000,
        k_ms=None,
        k_is=None,
        k_as=None,
        hill_coefficients=None,
    ),
    annotation={},  # Can also be omitted
)

Source code in cobrak/dataclasses.py
@dataclass
class Reaction:
    """Represents a Model's reaction.

    E.g., a reaction
    A -> B [0; 1000], ΔG'°=12.1 kJ⋅mol⁻¹, catalyzed by E1 with k_cat=1000 h⁻¹
    would be
    Reaction(
        stoichiometries={
            "A": -1,
            "B": +1,
        },
        min_flux=0,
        max_flux=1000,
        dG0=12.1,
        dG0_uncertainty=None,
        enzyme_reaction_data=EnzymeReactionData(
            identifiers=["E1"],
            k_cat=1000,
            k_ms=None,
            k_is=None,
            k_as=None,
            hill_coefficients=None,
        ),
        annotation={},  # Can also be omitted
    )
    """

    stoichiometries: dict[str, float]
    """Metabolite stoichiometries"""
    min_flux: float = 0.0
    """Minimal flux (for COBRA-k, this must be ≥ 0). Defaults to 0.0."""
    max_flux: float = 1_000.0
    """Maximal flux (must be >= min_flux). Defaults to 1_000.0."""
    dG0: FiniteFloat | None = None
    """If given, the Gibb's free energy of the reaction (only relevant for thermodynamic constraints); Default is None"""
    dG0_uncertainty: FiniteFloat | None = None
    """If given, the Gibb's free energy's uncertainty (only relevant for thermodynamic constraints); Default is None"""
    enzyme_reaction_data: EnzymeReactionData | None = None
    """If given, enzymatic data (only relevant for enzymatic constraints); Default is None"""
    annotation: dict[str, str | list[str]] = Field(default_factory=dict)
    """Optional annotation (e.g., KEGG identifiers, ...)"""
    name: str = ""
    """Colloquial name of reaction"""

annotation = Field(default_factory=dict) class-attribute instance-attribute

Optional annotation (e.g., KEGG identifiers, ...)

dG0 = None class-attribute instance-attribute

If given, the Gibbs free energy of the reaction (only relevant for thermodynamic constraints); Default is None

dG0_uncertainty = None class-attribute instance-attribute

If given, the uncertainty of the Gibbs free energy (only relevant for thermodynamic constraints); Default is None

enzyme_reaction_data = None class-attribute instance-attribute

If given, enzymatic data (only relevant for enzymatic constraints); Default is None

max_flux = 1000.0 class-attribute instance-attribute

Maximal flux (must be >= min_flux). Defaults to 1_000.0.

min_flux = 0.0 class-attribute instance-attribute

Minimal flux (for COBRA-k, this must be ≥ 0). Defaults to 0.0.

name = '' class-attribute instance-attribute

Colloquial name of reaction

stoichiometries instance-attribute

Metabolite stoichiometries

Solver

Represents options for a pyomo-compatible solver

Source code in cobrak/dataclasses.py
@dataclass
class Solver:
    """Represents options for a pyomo-compatible solver"""

    name: str
    """The solver's name. E.g. 'scip' for SCIP and 'cplex_direct' for CPLEX."""
    solver_options: dict[str, float | int | str] = Field(default_factory=dict)
    """[Optional] Options transmitted to the solver itself."""
    solver_attrs: dict[str, float | int | str] = Field(default_factory=dict)
    """[Optional] Options set on the solver object in pyomo."""
    solve_extra_options: dict[str, Any] = Field(default_factory=dict)
    """[Optional] Options set on pyomo's solve function."""
    solver_factory_args: dict[str, float | int | str] = Field(default_factory=dict)
    """[Optional] Arguments for pyomo's SolverFactory function"""

name instance-attribute

The solver's name. E.g. 'scip' for SCIP and 'cplex_direct' for CPLEX.

solve_extra_options = Field(default_factory=dict) class-attribute instance-attribute

[Optional] Options set on pyomo's solve function.

solver_attrs = Field(default_factory=dict) class-attribute instance-attribute

[Optional] Options set on the solver object in pyomo.

solver_factory_args = Field(default_factory=dict) class-attribute instance-attribute

[Optional] Arguments for pyomo's SolverFactory function

solver_options = Field(default_factory=dict) class-attribute instance-attribute

[Optional] Options transmitted to the solver itself.

equilibrator_functionality

This script is a wrapper for the ΔG'° determination with the eQuilibrator API.

This wrapper intends to work with BiGG-styled cobrapy metabolic models.

equilibrator_get_model_dG0_and_uncertainty_values_for_sbml(sbml_path, inner_to_outer_compartments, phs, pmgs, ionic_strengths, potential_differences, exclusion_prefixes=[], exclusion_inner_parts=[], ignore_uncertainty=False, max_uncertainty=1000.0, calculate_multicompartmental=True, ignored_metabolites=[])

Cobrapy model wrapper for the ΔG'° determination of reactions using the eQuilibrator-API.

Reactions are identified according to all annotations (in the cobrapy reaction's annotation member variable) given in this module's global USED_IDENTIFIERS_FOR_EQUILIBRATOR list.

Parameters:

Name Type Description Default
sbml_path str

The path to the SBML-encoded constraint-based metabolic model for which ΔG'° values are determined.

required
inner_to_outer_compartments List[str]

A list with compartment IDs going from inner (e.g., in E. coli, the cytosol or 'c' in iML1515) to outer (e.g., the extracellular component or 'e' in iML1515). Used for the ΔG'° calculation in multi-compartmental reactions.

required
phs Dict[str, float]

A dictionary with compartment IDs as keys and the compartment pHs as values.

required
pmgs Dict[str, float]

A dictionary with compartment IDs as keys and the compartment pMgs as values.

required
ionic_strengths Dict[str, float]

A dictionary with compartment IDs as keys and the ionic strengths as values.

required
potential_differences Dict[Tuple[str, str], float]

A dictionary whose keys are 2-tuples with the IDs of an inner and outer compartment, and whose values are the potential difference between them.

required
exclusion_prefixes list[str]

Reaction ID prefixes; reactions whose ID starts with one of them get no ΔG'°. Defaults to [].

[]
exclusion_inner_parts list[str]

Substrings; reactions whose ID contains one of them get no ΔG'°. Defaults to [].

[]
ignore_uncertainty bool

If True, all returned uncertainties are set to 0.0. Defaults to False.

False
max_uncertainty float

The maximal accepted uncertainty value (defaults to 1000 kJ⋅mol⁻¹). If a calculated uncertainty is higher than this value, the associated ΔG'° is not used (i.e., the specific reaction gets no ΔG'°).

1000.0
calculate_multicompartmental bool

If True, multicompartmental reactions also get a ΔG'° using the eQuilibrator's special routine for them. Defaults to True.

True
ignored_metabolites list[str]

List of metabolites that shall be ignored in reaction stoichiometries (e.g., for pseudo-metabolites) such as enzyme_pool in certain enzyme-constrained models. Defaults to [].

[]

Returns:

Type Description
tuple[dict[str, float], dict[str, float]]

A 2-tuple of dictionaries, both keyed by reaction ID: the first maps each reaction to its ΔG'° and the second to the calculated uncertainty.

Source code in cobrak/equilibrator_functionality.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def equilibrator_get_model_dG0_and_uncertainty_values_for_sbml(
    sbml_path: str,
    inner_to_outer_compartments: list[str],
    phs: dict[str, float],
    pmgs: dict[str, float],
    ionic_strengths: dict[str, float],
    potential_differences: dict[tuple[str, str], float],
    exclusion_prefixes: list[str] = [],
    exclusion_inner_parts: list[str] = [],
    ignore_uncertainty: bool = False,
    max_uncertainty: float = 1_000.0,
    calculate_multicompartmental: bool = True,
    ignored_metabolites: list[str] = [],
) -> tuple[dict[str, float], dict[str, float]]:
    """Cobrapy model wrapper for the ΔG'° determination of reactions using the eQuilibrator-API.

    Reactions are identified according to all annotations (in the cobrapy reaction's annotation member variable)
    given in this module's global USED_IDENTIFIERS_FOR_EQUILIBRATOR list.

    Args:
        sbml_path (str): The path to the SBML-encoded constraint-based metabolic model for which ΔG'° values are determined.
        inner_to_outer_compartments (List[str]): A list with compartment IDs going from inner (e.g., in E. coli,
            the cytosol or 'c' in iML1515) to outer (e.g., the extracellular component or 'e' in iML1515). Used
            for the ΔG'° calculation in multi-compartmental reactions.
        phs (Dict[str, float]): A dictionary with compartment IDs as keys and the compartment pHs as values.
        pmgs (Dict[str, float]): A dictionary with compartment IDs as keys and the compartment pMgs as values.
        ionic_strengths (Dict[str, float]): A dictionary with compartment IDs as keys and the ionic strengths as values.
        potential_differences (Dict[Tuple[str, str], float]): A dictionary whose keys are 2-tuples with the IDs of
            an inner and outer compartment, and whose values are the potential difference between them.
        exclusion_prefixes (list[str]): Reaction ID prefixes; reactions whose ID starts with one of them get no ΔG'°. Defaults to [].
        exclusion_inner_parts (list[str]): Substrings; reactions whose ID contains one of them get no ΔG'°. Defaults to [].
        ignore_uncertainty (bool): If True, all returned uncertainties are set to 0.0. Defaults to False.
        max_uncertainty (float): The maximal accepted uncertainty value (defaults to 1000 kJ⋅mol⁻¹). If a calculated uncertainty
            is higher than this value, the associated ΔG'° is *not* used (i.e., the specific reaction gets no ΔG'°).
        calculate_multicompartmental (bool): If True, multicompartmental reactions also get a ΔG'° using the eQuilibrator's special
            routine for them. Defaults to True.
        ignored_metabolites (list[str]): List of metabolites that shall be ignored in reaction stoichiometries (e.g., for pseudo-metabolites)
            such as enzyme_pool in certain enzyme-constrained models. Defaults to [].

    Returns:
        tuple[dict[str, float], dict[str, float]]: A 2-tuple of dictionaries, both keyed by reaction ID: the first
            maps each reaction to its ΔG'° and the second to the calculated uncertainty.
    """
    cobra_model = cobra.io.read_sbml_model(sbml_path)

    reaction_dG0s: dict[str, float] = {}
    reaction_dG0_uncertainties: dict[str, float] = {}
    cc = ComponentContribution()
    for reaction_x in cobra_model.reactions:
        reaction: cobra.Reaction = reaction_x

        stop = False
        for exclusion_prefix in exclusion_prefixes:
            if reaction.id.startswith(exclusion_prefix):
                stop = True
        for exclusion_inner_part in exclusion_inner_parts:
            if exclusion_inner_part in reaction.id:
                stop = True
        if stop:
            continue

        stoichiometries: list[float] = []
        compartments: list[str] = []
        identifiers: list[str] = []
        identifier_keys: list[str] = []
        for metabolite_x in reaction.metabolites:
            metabolite: cobra.Metabolite = metabolite_x
            if metabolite.id in ignored_metabolites:
                continue
            stoichiometries.append(reaction.metabolites[metabolite])
            compartments.append(metabolite.compartment)
            identifier = ""
            for used_identifier in USED_IDENTIFIERS_FOR_EQUILIBRATOR:
                if used_identifier not in metabolite.annotation:
                    continue
                metabolite_identifiers = metabolite.annotation[used_identifier]
                identifier_temp = ""
                if isinstance(metabolite_identifiers, list):
                    identifier_temp = metabolite_identifiers[0]
                elif isinstance(metabolite_identifiers, str):
                    identifier_temp = metabolite_identifiers
                if used_identifier == "inchi":
                    compound = cc.get_compound_by_inchi(identifier_temp)
                elif used_identifier == "inchi_key":
                    compound_list = cc.search_compound_by_inchi_key(identifier_temp)
                    compound = compound_list[0] if len(compound_list) > 0 else None
                else:
                    identifier_temp = used_identifier + ":" + identifier_temp
                    compound = cc.get_compound(identifier_temp)
                if compound is not None:
                    identifier_key = used_identifier
                    identifier = identifier_temp
                    break
            if not identifier:
                break
            identifier_keys.append(identifier_key)
            identifiers.append(identifier)

        if not identifier:
            print(
                f"ERROR: Metabolite {metabolite_x.id} has no identifier of the given types!"
            )
            print(metabolite_x.annotation)
            continue

        # Check for three cases:
        # 1: Single-compartment reaction
        # 2: Double-compartment reaction
        # 3: Multi-compartment reaction (not possible)
        unique_reaction_compartments = list(set(compartments))
        num_compartments = len(unique_reaction_compartments)
        if num_compartments == 1:
            # Set compartment conditions
            compartment = unique_reaction_compartments[0]
            cc.p_h = Q_(phs[compartment])
            cc.p_mg = Q_(pmgs[compartment])
            cc.ionic_strength = Q_(str(ionic_strengths[compartment]) + "mM")

            # Build together reaction
            reaction_dict: dict[Any, float] = {}
            for i in range(len(stoichiometries)):
                identifier_string = identifiers[i]
                identifier_key = identifier_keys[i]
                stoichiometry = stoichiometries[i]
                if identifier_key == "inchi":
                    compound = cc.get_compound_by_inchi(identifier_string)
                elif identifier_key == "inchi_key":
                    compound = cc.search_compound_by_inchi_key(identifier_string)[0]
                else:
                    compound = cc.get_compound(identifier_string)
                reaction_dict[compound] = stoichiometry
            cc_reaction = Reaction(reaction_dict)

            # Check whether or not the reaction is balanced and...
            if not cc_reaction.is_balanced():
                print(f"INFO: Reaction {reaction.id} is not balanced")
                continue

            standard_dg_prime = cc.standard_dg_prime(cc_reaction)
            uncertainty = standard_dg_prime.error.m_as("kJ/mol")
            if uncertainty < max_uncertainty:
                dG0 = standard_dg_prime.value.m_as("kJ/mol")
                reaction_dG0s[reaction.id] = dG0
                if ignore_uncertainty:
                    reaction_dG0_uncertainties[reaction.id] = 0.0
                else:
                    reaction_dG0_uncertainties[reaction.id] = abs(uncertainty)

                print(
                    f"No error with reaction {reaction.id}, ΔG'° succesfully calculated!"
                )
            else:
                print(
                    f"INFO: Reaction {reaction.id} uncertainty is too high with {uncertainty} kJ⋅mol⁻¹; ΔG'° not assigned for this reaction"
                )
        elif calculate_multicompartmental and num_compartments == 2:
            index_zero = inner_to_outer_compartments.index(
                unique_reaction_compartments[0]
            )
            index_one = inner_to_outer_compartments.index(
                unique_reaction_compartments[1]
            )

            if index_one > index_zero:
                outer_compartment = unique_reaction_compartments[1]
                inner_compartment = unique_reaction_compartments[0]
            else:
                outer_compartment = unique_reaction_compartments[0]
                inner_compartment = unique_reaction_compartments[1]

            ph_inner = Q_(phs[inner_compartment])
            ph_outer = Q_(phs[outer_compartment])
            ionic_strength_inner = Q_(str(ionic_strengths[inner_compartment]) + " mM")
            ionic_strength_outer = Q_(str(ionic_strengths[outer_compartment]) + " mM")
            pmg_inner = Q_(pmgs[inner_compartment])
            pmg_outer = Q_(pmgs[outer_compartment])

            if (inner_compartment, outer_compartment) in potential_differences:
                potential_difference = Q_(
                    str(potential_differences[(inner_compartment, outer_compartment)])
                    + " V"
                )
            elif (outer_compartment, inner_compartment) in potential_differences:
                potential_difference = Q_(
                    str(potential_differences[(outer_compartment, inner_compartment)])
                    + " V"
                )
            else:
                print("ERROR")
                continue

            inner_reaction_dict: dict[Any, float] = {}
            outer_reaction_dict: dict[Any, float] = {}
            for i in range(len(stoichiometries)):
                key = identifiers[i]
                stoichiometry = stoichiometries[i]
                try:
                    compound_key = cc.get_compound(key)
                except Exception:  # e.g., sqlalchemy.orm.exc.MultipleResultsFound
                    print(f"ERROR: Could not resolve compound for identifier {key}")
                    continue

                if compound_key is None:
                    print(f"ERROR: No compound found for identifier {key}")
                    continue

                if compartments[i] == inner_compartment:
                    inner_reaction_dict[compound_key] = stoichiometry
                else:
                    outer_reaction_dict[compound_key] = stoichiometry

            cc_inner_reaction = Reaction(inner_reaction_dict)
            cc_outer_reaction = Reaction(outer_reaction_dict)

            cc.p_h = ph_inner
            cc.ionic_strength = ionic_strength_inner
            cc.p_mg = pmg_inner
            try:
                standard_dg_prime = cc.multicompartmental_standard_dg_prime(
                    cc_inner_reaction,
                    cc_outer_reaction,
                    e_potential_difference=potential_difference,
                    p_h_outer=ph_outer,
                    p_mg_outer=pmg_outer,
                    ionic_strength_outer=ionic_strength_outer,
                )
                uncertainty = standard_dg_prime.error.m_as("kJ/mol")
                if uncertainty < max_uncertainty:
                    dG0 = standard_dg_prime.value.m_as("kJ/mol")
                    reaction_dG0s[reaction.id] = dG0
                    if ignore_uncertainty:
                        reaction_dG0_uncertainties[reaction.id] = 0.0
                    else:
                        reaction_dG0_uncertainties[reaction.id] = abs(uncertainty)
            except ValueError:
                print("ERROR: Multi-compartmental reaction is not balanced")
                continue
        else:
            print("ERROR: More than two compartments are not possible")
            continue

    return reaction_dG0s, reaction_dG0_uncertainties
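
For two-compartment reactions, the code above orders the pair of compartments by their position in inner_to_outer_compartments before applying the membrane potential and outer-compartment conditions. That ordering step can be sketched in isolation (the helper name is illustrative, not part of COBRAk's API):

```python
def order_compartments(
    inner_to_outer: list[str],
    pair: tuple[str, str],
) -> tuple[str, str]:
    """Return (inner_compartment, outer_compartment) for a 2-compartment
    reaction, using the model-wide inner-to-outer compartment ordering."""
    first, second = pair
    # A higher index in inner_to_outer means "more outer"
    if inner_to_outer.index(second) > inner_to_outer.index(first):
        return first, second
    return second, first


# E.g. iML1515-style compartments: cytosol 'c' -> periplasm 'p' -> extracellular 'e'
print(order_compartments(["c", "p", "e"], ("e", "c")))  # → ('c', 'e')
```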

evolution

Includes functions for calling COBRA-k's genetic algorithm for global NLP-based optimization.

The actual genetic algorithm can be found in the module 'genetic'.

COBRAKProblem

Represents a problem to be solved using evolutionary optimization techniques.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| cobrak_model | Model | The original COBRA-k model to optimize. | required |
| objective_target | dict[str, float] | The target values for the objectives. | required |
| objective_sense | int | The sense of the objective function (1 for maximization, -1 for minimization). | required |
| variability_dict | dict[str, tuple[float, float]] | The variability data for each reaction. | required |
| nlp_dict_list | list[dict[str, float]] | A list of initial NLP solutions. | required |
| best_value | float | The best value found so far. | required |
| with_kappa | bool | Whether to use the kappa parameter. Defaults to True. | True |
| with_gamma | bool | Whether to use the gamma parameter. Defaults to True. | True |
| with_iota | bool | Whether to use the iota parameter. Defaults to True. | True |
| with_alpha | bool | Whether to use the alpha parameter. Defaults to True. | True |
| num_gens | int | The number of generations in the evolutionary algorithm. Defaults to 5. | 5 |
| algorithm | Literal['genetic'] | The optimization algorithm to use. Defaults to "genetic", currently the only available algorithm. | 'genetic' |
| lp_solver | Solver | The linear programming solver to use. Defaults to SCIP. | SCIP |
| nlp_solver | Solver | The nonlinear programming solver to use. Defaults to IPOPT. | IPOPT |
| nlp_strict_mode | bool | Whether to use the <= heuristic (True) or to set all equations to == (False). Defaults to False. | False |
| nlp_single_strict_reacs | list[str] | Reactions that shall individually be in strict mode (see nlp_strict_mode above). Has no effect if nlp_strict_mode=True. Defaults to []. | [] |
| objvalue_json_path | str | The path to the JSON file for storing objective values. Defaults to "". | '' |
| max_rounds_same_objvalue | float | The maximum number of rounds with the same objective value before stopping. Defaults to float("inf"). | float('inf') |
| correction_config | CorrectionConfig | Configuration for corrections during optimization. Defaults to CorrectionConfig(). | CorrectionConfig() |
| min_abs_objvalue | float | The minimum absolute objective value to consider as valid. Defaults to 1e-6. | 1e-06 |
| pop_size | int \| None | The population size for the evolutionary algorithm. Defaults to None. | None |
| ignore_nonlinear_extra_terms_in_ectfbas | bool | Whether non-linear watches/constraints shall be ignored in ecTFBAs. Defaults to True. | True |

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| original_cobrak_model | Model | A deep copy of the original COBRA-k model. |
| blocked_reacs | list[str] | List of blocked reactions. |
| initial_xs_list | list[list[int \| float]] | Initial list of solutions for each NLP. |
| minimal_xs_dict | dict[float, list[float]] | Dictionary to store minimal solutions. |
| variability_data | dict[str, tuple[float, float]] | A deep copy of the variability data. |
| idx_to_reac_ids | dict[int, tuple[str, ...]] | Mapping from index to reaction IDs. |
| dim | int | The dimension of the problem. |
| lp_solver | Solver | The linear programming solver. |
| nlp_solver | Solver | The nonlinear programming solver. |
| temp_directory_name | str | Name of the temporary directory for storing results. |
| best_value | float | The best value found so far. |
| objvalue_json_path | str | Path to the JSON file for objective values. |
| max_rounds_same_objvalue | float | Maximum number of rounds with the same objective value. |
| correction_config | CorrectionConfig | Configuration for corrections. |
| min_abs_objvalue | float | Minimum absolute objective value considered valid. |
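
The attributes `idx_to_reac_ids`, `initial_xs_list`, and `dim` encode the genotype: each stoichiometrically coupled reaction group occupies one binary position, set to 1 when the group's first reaction carries positive flux in a given NLP solution (see the constructor source below). A stdlib-only sketch of that encoding, with invented reaction IDs and flux values:

```python
# Hypothetical coupled reaction groups and one NLP solution dict, invented for
# illustration; in COBRAk these come from
# get_stoichiometrically_coupled_reactions() and nlp_dict_list respectively.
reac_couples = [("PFK",), ("PGI", "PGK"), ("PYK",)]
nlp_solution = {"PFK": 1.3, "PGI": 0.0, "PYK": 0.7}

# One genotype index per coupled group
idx_to_reac_ids = {idx: couple for idx, couple in enumerate(reac_couples)}

# Binary genotype: 1 iff the group's first reaction has positive flux
initial_x = [
    1 if nlp_solution.get(couple[0], 0.0) > 0.0 else 0 for couple in reac_couples
]
# initial_x == [1, 0, 1]; the problem dimension equals len(reac_couples)
```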

Source code in cobrak/evolution.py
class COBRAKProblem:
    """Represents a problem to be solved using evolutionary optimization techniques.

    Args:
        cobrak_model (Model): The original COBRA-k model to optimize.
        objective_target (dict[str, float]): The target values for the objectives.
        objective_sense (int): The sense of the objective function (1 for maximization, -1 for minimization).
        variability_dict (dict[str, tuple[float, float]]): The variability data for each reaction.
        nlp_dict_list (list[dict[str, float]]): A list of initial NLP solutions.
        best_value (float): The best value found so far.
        with_kappa (bool, optional): Whether to use kappa parameter. Defaults to True.
        with_gamma (bool, optional): Whether to use gamma parameter. Defaults to True.
        with_iota (bool, optional): Whether to use iota parameter. Defaults to True.
        with_alpha (bool, optional): Whether to use alpha parameter. Defaults to True.
        num_gens (int, optional): The number of generations in the evolutionary algorithm. Defaults to 5.
        algorithm (Literal["genetic"], optional): The type of optimization algorithm to use. Defaults to "genetic", the only algorithm currently available.
        lp_solver (Solver, optional): The linear programming solver to use. Defaults to SCIP.
        nlp_solver (Solver, optional): The nonlinear programming solver to use. Defaults to IPOPT.
        nlp_strict_mode (bool, optional): Whether to use the <= heuristic (True) or to set all equations to == (False). Defaults to False.
        nlp_single_strict_reacs (list[str], optional): Reactions that shall individually be in strict mode (see ``nlp_strict_mode``). Has no effect if ``nlp_strict_mode=True``. Defaults to [].
        objvalue_json_path (str, optional): The path to the JSON file for storing objective values. Defaults to "".
        max_rounds_same_objvalue (float, optional): The maximum number of rounds with the same objective value before stopping. Defaults to float("inf").
        correction_config (CorrectionConfig, optional): Configuration for corrections during optimization. Defaults to CorrectionConfig().
        min_abs_objvalue (float, optional): The minimum absolute value of the objective function to consider as valid. Defaults to 1e-6.
        pop_size (int | None, optional): The population size for the evolutionary algorithm. Defaults to None.
        ignore_nonlinear_extra_terms_in_ectfbas (bool, optional): Whether non-linear watches/constraints shall be ignored in ecTFBAs. Defaults to True.

    Attributes:
        original_cobrak_model (Model): A deep copy of the original COBRA-k model.
        blocked_reacs (list[str]): List of blocked reactions.
        initial_xs_list (list[list[int | float]]): Initial list of solutions for each NLP.
        minimal_xs_dict (dict[float, list[float]]): Dictionary to store minimal solutions.
        variability_data (dict[str, tuple[float, float]]): A deep copy of the variability data.
        idx_to_reac_ids (dict[int, tuple[str, ...]]): Mapping from index to reaction IDs.
        dim (int): The dimension of the problem.
        lp_solver (Solver): The linear programming solver.
        nlp_solver (Solver): The nonlinear programming solver.
        temp_directory_name (str): Name of the temporary directory for storing results.
        best_value (float): The best value found so far.
        objvalue_json_path (str): Path to the JSON file for objective values.
        max_rounds_same_objvalue (float): Maximum number of rounds with same objective value.
        correction_config (CorrectionConfig): Configuration for corrections.
        min_abs_objvalue (float): Minimum absolute value of objective function to consider valid.
    """

    def __init__(
        self,
        cobrak_model: Model,
        objective_target: dict[str, float],
        objective_sense: int,
        variability_dict: dict[str, tuple[float, float]],
        nlp_dict_list: list[dict[str, float]],
        best_value: float,
        with_kappa: bool = True,
        with_gamma: bool = True,
        with_iota: bool = True,
        with_alpha: bool = True,
        num_gens: int = 5,
        algorithm: Literal["genetic"] = "genetic",
        lp_solver: Solver = SCIP,
        nlp_solver: Solver = IPOPT,
        nlp_strict_mode: bool = False,
        nlp_single_strict_reacs: list[str] = [],
        objvalue_json_path: str = "",
        max_rounds_same_objvalue: float = float("inf"),
        correction_config: CorrectionConfig = CorrectionConfig(),
        min_abs_objvalue: float = 1e-6,
        pop_size: int | None = None,
        ignore_nonlinear_extra_terms_in_ectfbas: bool = True,
    ) -> None:
        """Initializes a COBRAKProblem object.

        Args:
            cobrak_model (Model): The original COBRA-k model to optimize.
            objective_target (dict[str, float]): The target values for the objectives.
            objective_sense (int): The sense of the objective function (1 for maximization, -1 for minimization).
            variability_dict (dict[str, tuple[float, float]]): The variability data for each reaction.
            nlp_dict_list (list[dict[str, float]]): A list of initial NLP solutions.
            best_value (float): The best value found so far.
            with_kappa (bool, optional): Whether to use kappa parameter. Defaults to True.
            with_gamma (bool, optional): Whether to use gamma parameter. Defaults to True.
            with_iota (bool, optional): Whether to use iota parameter. Defaults to True.
            with_alpha (bool, optional): Whether to use alpha parameter. Defaults to True.
            num_gens (int, optional): The number of generations in the evolutionary algorithm. Defaults to 5.
            algorithm (Literal["genetic"], optional): The type of optimization algorithm to use. Defaults to "genetic", the only algorithm currently available.
            lp_solver (Solver, optional): The linear programming solver to use. Defaults to SCIP.
            nlp_solver (Solver, optional): The nonlinear programming solver to use. Defaults to IPOPT.
            nlp_strict_mode (bool, optional): Whether to use the <= heuristic (True) or to set all equations to == (False). Defaults to False.
            nlp_single_strict_reacs (list[str], optional): Reactions that shall individually be in strict mode (see ``nlp_strict_mode``). Has no effect if ``nlp_strict_mode=True``. Defaults to [].
            objvalue_json_path (str, optional): The path to the JSON file for storing objective values. Defaults to "".
            max_rounds_same_objvalue (float, optional): The maximum number of rounds with the same objective value before stopping. Defaults to float("inf").
            correction_config (CorrectionConfig, optional): Configuration for corrections during optimization. Defaults to CorrectionConfig().
            min_abs_objvalue (float, optional): The minimum absolute value of the objective function to consider as valid. Defaults to 1e-6.
            pop_size (int | None, optional): The population size for the evolutionary algorithm. Defaults to None.
            ignore_nonlinear_extra_terms_in_ectfbas (bool, optional): Whether non-linear watches/constraints shall be ignored in ecTFBAs. Defaults to True.
        """
        self.original_cobrak_model: Model = deepcopy(cobrak_model)
        self.objective_target = objective_target
        self.objective_sense = objective_sense
        self.blocked_reacs: list[str] = []
        self.initial_xs_list: list[list[int | float]] = [
            [] for _ in range(len(nlp_dict_list))
        ]
        self.minimal_xs_dict: dict[float, list[float]] = {}
        self.variability_data = deepcopy(variability_dict)
        self.algorithm = algorithm

        reac_couples = get_stoichiometrically_coupled_reactions(
            self.original_cobrak_model
        )

        objective_target_ids = list(objective_target.keys())
        filtered_reac_couples: list[tuple[str, ...]] = []
        for reac_couple in reac_couples:
            filtered_reac_couple = [
                reac_id
                for reac_id in reac_couple
                if (abs(self.variability_data[reac_id][1]) > 0.0)
                and (abs(self.variability_data[reac_id][0]) <= 0.0)
                and not (
                    cobrak_model.reactions[reac_id].dG0 is None
                    and cobrak_model.reactions[reac_id].enzyme_reaction_data is None
                )
            ]

            found_invalid_id = False
            for objective_target_id in objective_target_ids:
                if objective_target_id in filtered_reac_couple:
                    found_invalid_id = True
            for var_id in correction_config.error_scenario:
                if var_id in filtered_reac_couple:
                    found_invalid_id = True

            if found_invalid_id:
                continue

            if len(filtered_reac_couple) > 0:
                filtered_reac_couples.append(tuple(filtered_reac_couple))

        self.idx_to_reac_ids: dict[int, tuple[str, ...]] = {}
        couple_idx = 0
        for filtered_reac_couplex in filtered_reac_couples:
            nlp_idx = 0
            for nlp_dict in nlp_dict_list:
                first_reac_id = filtered_reac_couplex[0]
                if (first_reac_id in nlp_dict) and (nlp_dict[first_reac_id] > 0.0):
                    self.initial_xs_list[nlp_idx].append(
                        1.0 if algorithm != "genetic" else 1
                    )
                else:
                    self.initial_xs_list[nlp_idx].append(
                        0.0 if algorithm != "genetic" else 0
                    )

                nlp_idx += 1  # noqa: SIM113

            self.idx_to_reac_ids[couple_idx] = filtered_reac_couplex
            couple_idx += 1

        self.with_kappa = with_kappa
        self.with_gamma = with_gamma
        self.with_iota = with_iota
        self.with_alpha = with_alpha
        self.dim = couple_idx
        self.num_gens = num_gens
        self.lp_solver = lp_solver
        self.nlp_solver = nlp_solver
        self.nlp_strict_mode = nlp_strict_mode
        self.nlp_single_strict_reacs = nlp_single_strict_reacs
        self.temp_directory_name = ""
        self.best_value = best_value
        self.objvalue_json_path = objvalue_json_path
        self.max_rounds_same_objvalue = max_rounds_same_objvalue
        self.correction_config = correction_config
        self.min_abs_objvalue = min_abs_objvalue
        self.pop_size = pop_size
        self.ignore_nonlinear_extra_terms_in_ectfbas = (
            ignore_nonlinear_extra_terms_in_ectfbas
        )

    def fitness(
        self,
        x: list[float | int],
    ) -> list[tuple[float, list[float | int]]]:
        """Calculates the fitness of a given solution.

        Args:
            x (list[float | int]): The solution to evaluate.

        Returns:
            list[tuple[float, list[float | int]]]: A list of tuples, where each tuple contains the fitness value and the corresponding solution.
        """
        # Preliminary TFBA :3
        deactivated_reactions: list[str] = []
        for couple_idx, reac_ids in self.idx_to_reac_ids.items():
            if x[couple_idx] <= 0.02:
                deactivated_reactions.extend(reac_ids)

        try:
            first_ectfba_dict = perform_lp_optimization(
                cobrak_model=self.original_cobrak_model,
                objective_target=self.objective_target,
                objective_sense=self.objective_sense,
                with_enzyme_constraints=True,
                with_thermodynamic_constraints=True,
                with_loop_constraints=True,
                variability_dict=deepcopy(self.variability_data),
                ignored_reacs=deactivated_reactions,
                solver=self.lp_solver,
                correction_config=self.correction_config,
                ignore_nonlinear_terms=self.ignore_nonlinear_extra_terms_in_ectfbas,
            )
        except (ApplicationError, AttributeError, ValueError):
            first_ectfba_dict = {ALL_OK_KEY: False}
        if not first_ectfba_dict[ALL_OK_KEY]:
            return [(1_000_000.0, [])]

        nlp_results: list[dict[str, float]] = []

        if is_objsense_maximization(self.objective_sense):
            lower_value = first_ectfba_dict[OBJECTIVE_VAR_NAME] - 1e-12
            upper_value = None
        else:
            lower_value = None
            upper_value = first_ectfba_dict[OBJECTIVE_VAR_NAME] + 1e-12

        maxz_model = deepcopy(self.original_cobrak_model)
        maxz_model.extra_linear_constraints = [
            ExtraLinearConstraint(
                stoichiometries=self.objective_target,
                lower_value=lower_value,
                upper_value=upper_value,
            )
        ]
        maxz_model.extra_linear_constraints += [
            ExtraLinearConstraint(
                stoichiometries={f"{Z_VAR_PREFIX}{reac_id}": 1.0},
                upper_value=0.0,
            )
            for (reac_id, reac_data) in self.original_cobrak_model.reactions.items()
            if (reac_data.dG0 is not None) and (reac_id in deactivated_reactions)
        ]
        eligible_z_sum_objective = {
            f"{Z_VAR_PREFIX}{reac_id}": 1.0
            for (reac_id, reac_data) in self.original_cobrak_model.reactions.items()
            if (reac_data.dG0 is not None)
            and (self.variability_data[reac_id][1] > 0.0)
            and (reac_id not in deactivated_reactions)
        }
        try:
            maxz_ectfba_dict = perform_lp_optimization(
                cobrak_model=maxz_model,
                objective_target=eligible_z_sum_objective,
                objective_sense=+1,
                with_enzyme_constraints=True,
                with_thermodynamic_constraints=True,
                with_loop_constraints=True,
                variability_dict=deepcopy(self.variability_data),
                ignored_reacs=deactivated_reactions,
                solver=self.lp_solver,
                correction_config=self.correction_config,
                ignore_nonlinear_terms=self.ignore_nonlinear_extra_terms_in_ectfbas,
            )
        except (ApplicationError, AttributeError, ValueError):
            maxz_ectfba_dict = {ALL_OK_KEY: False}
        if maxz_ectfba_dict[ALL_OK_KEY]:
            used_maxz_tfba_dict: dict[str, float] = {}
            for var_id in maxz_ectfba_dict:
                if var_id not in self.original_cobrak_model.reactions:
                    continue
                reaction = self.original_cobrak_model.reactions[var_id]
                if (
                    (reaction.dG0 is None) and (var_id not in deactivated_reactions)
                ) or (
                    (reaction.dG0 is not None)
                    and (maxz_ectfba_dict[f"{Z_VAR_PREFIX}{var_id}"] > 0.0)
                ):
                    used_maxz_tfba_dict[var_id] = 1.0
                else:
                    used_maxz_tfba_dict[var_id] = 0.0

            used_maxz_tfba_dict[ALL_OK_KEY] = True

            if used_maxz_tfba_dict[ALL_OK_KEY]:
                try:
                    second_nlp_dict = (
                        perform_nlp_irreversible_optimization_with_active_reacs_only(
                            cobrak_model=self.original_cobrak_model,
                            objective_target=self.objective_target,
                            objective_sense=self.objective_sense,
                            optimization_dict=deepcopy(used_maxz_tfba_dict),
                            variability_dict=deepcopy(self.variability_data),
                            with_kappa=self.with_kappa,
                            with_gamma=self.with_gamma,
                            with_iota=self.with_iota,
                            with_alpha=self.with_alpha,
                            solver=self.nlp_solver,
                            correction_config=self.correction_config,
                            strict_mode=self.nlp_strict_mode,
                            single_strict_reacs=self.nlp_single_strict_reacs,
                        )
                    )
                    if second_nlp_dict[ALL_OK_KEY] and (
                        abs(second_nlp_dict[OBJECTIVE_VAR_NAME]) > self.min_abs_objvalue
                    ):
                        nlp_results.append(second_nlp_dict)
                except (ApplicationError, AttributeError, ValueError):
                    pass

        ####
        try:
            minz_ectfba_dict = perform_lp_optimization(
                cobrak_model=maxz_model,
                objective_target=eligible_z_sum_objective,
                objective_sense=-1,
                with_enzyme_constraints=True,
                with_thermodynamic_constraints=True,
                with_loop_constraints=True,
                variability_dict=deepcopy(self.variability_data),
                ignored_reacs=deactivated_reactions,
                solver=self.lp_solver,
                correction_config=self.correction_config,
                ignore_nonlinear_terms=self.ignore_nonlinear_extra_terms_in_ectfbas,
            )
        except (ApplicationError, AttributeError, ValueError):
            minz_ectfba_dict = {ALL_OK_KEY: False}
        if minz_ectfba_dict[ALL_OK_KEY]:
            used_minz_tfba_dict: dict[str, float] = {}
            for var_id in minz_ectfba_dict:
                if var_id not in self.original_cobrak_model.reactions:
                    continue
                reaction = self.original_cobrak_model.reactions[var_id]
                if (
                    (reaction.dG0 is None) and (var_id not in deactivated_reactions)
                ) or (
                    (reaction.dG0 is not None)
                    and (minz_ectfba_dict[f"{Z_VAR_PREFIX}{var_id}"] > 0.0)
                ):
                    used_minz_tfba_dict[var_id] = 1.0
                else:
                    used_minz_tfba_dict[var_id] = 0.0

            used_minz_tfba_dict[ALL_OK_KEY] = True

            if used_minz_tfba_dict[ALL_OK_KEY]:
                try:
                    third_nlp_dict = (
                        perform_nlp_irreversible_optimization_with_active_reacs_only(
                            cobrak_model=self.original_cobrak_model,
                            objective_target=self.objective_target,
                            objective_sense=self.objective_sense,
                            optimization_dict=deepcopy(used_minz_tfba_dict),
                            variability_dict=deepcopy(self.variability_data),
                            with_kappa=self.with_kappa,
                            with_gamma=self.with_gamma,
                            with_iota=self.with_iota,
                            with_alpha=self.with_alpha,
                            solver=self.nlp_solver,
                            correction_config=self.correction_config,
                            strict_mode=self.nlp_strict_mode,
                            single_strict_reacs=self.nlp_single_strict_reacs,
                        )
                    )
                    if third_nlp_dict[ALL_OK_KEY] and (
                        abs(third_nlp_dict[OBJECTIVE_VAR_NAME]) > self.min_abs_objvalue
                    ):
                        nlp_results.append(third_nlp_dict)
                except (ApplicationError, AttributeError, ValueError):
                    pass
        ####

        output: list[tuple[float, list[float | int]]] = [(1_000_000, [])]
        for nlp_result in nlp_results:
            objvalues = [nlp_result[OBJECTIVE_VAR_NAME] for nlp_result in nlp_results]
            if is_objsense_maximization(self.objective_sense):
                opt_idx = objvalues.index(max(objvalues))
            else:
                opt_idx = objvalues.index(min(objvalues))
            opt_nlp_dict = nlp_results[opt_idx]

            objective_value = opt_nlp_dict[OBJECTIVE_VAR_NAME]

            if self.temp_directory_name:
                filename = f"{self.temp_directory_name}{objective_value}{time()}{randint(0, 1_000_000_000)}.json"  # noqa: NPY002
                json_write(filename, opt_nlp_dict)

            if is_objsense_maximization(self.objective_sense):
                objective_value *= -1

            print("No error, objective value is:", objective_value)

            active_nlp_x: list[float | int] = [
                0 for _ in range(len(list(self.idx_to_reac_ids.keys())))
            ]
            for couple_idx, reac_ids in self.idx_to_reac_ids.items():
                reac_id = reac_ids[0]
                if reac_id not in opt_nlp_dict or opt_nlp_dict[reac_id] < 1e-11:
                    set_value = 0
                else:
                    set_value = 1
                active_nlp_x[couple_idx] = set_value
            output.append((objective_value, active_nlp_x))

        return output

    def optimize(self) -> dict[float, list[dict[str, float]]]:
        """Performs the optimization process.

        Returns:
            dict[float, list[dict[str, float]]]: A dictionary containing the optimization results.
        """
        temp_directory = TemporaryDirectory()
        self.temp_directory_name = standardize_folder(temp_directory.name)

        match self.algorithm:
            case "genetic":
                evolution = COBRAKGENETIC(
                    fitness_function=self.fitness,
                    xs_dim=self.dim,
                    extra_xs=self.initial_xs_list,
                    gen=self.num_gens,
                    objvalue_json_path=self.objvalue_json_path,
                    max_rounds_same_objvalue=self.max_rounds_same_objvalue,
                    pop_size=self.pop_size,
                )
            case _:
                print(
                    f"ERROR: Evolution algorithm {self.algorithm} does not exist! Use 'genetic'."
                )
                raise ValueError
        evolution.run()

        result_dict: dict[float, list[dict[str, float]]] = {}
        for json_filename in get_files(self.temp_directory_name):
            json_data = json_load(f"{self.temp_directory_name}{json_filename}", Any)
            objective_value = json_data[OBJECTIVE_VAR_NAME]
            if objective_value not in result_dict:
                result_dict[objective_value] = []
            result_dict[objective_value].append(deepcopy(json_data))

        temp_directory.cleanup()

        return {
            key: result_dict[key] for key in sorted(result_dict.keys(), reverse=True)
        }

__init__(cobrak_model, objective_target, objective_sense, variability_dict, nlp_dict_list, best_value, with_kappa=True, with_gamma=True, with_iota=True, with_alpha=True, num_gens=5, algorithm='genetic', lp_solver=SCIP, nlp_solver=IPOPT, nlp_strict_mode=False, nlp_single_strict_reacs=[], objvalue_json_path='', max_rounds_same_objvalue=float('inf'), correction_config=CorrectionConfig(), min_abs_objvalue=1e-06, pop_size=None, ignore_nonlinear_extra_terms_in_ectfbas=True)

Initializes a COBRAKProblem object.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| cobrak_model | Model | The original COBRA-k model to optimize. | required |
| objective_target | dict[str, float] | The target values for the objectives. | required |
| objective_sense | int | The sense of the objective function (1 for maximization, -1 for minimization). | required |
| variability_dict | dict[str, tuple[float, float]] | The variability data for each reaction. | required |
| nlp_dict_list | list[dict[str, float]] | A list of initial NLP solutions. | required |
| best_value | float | The best value found so far. | required |
| with_kappa | bool | Whether to use the kappa parameter. Defaults to True. | True |
| with_gamma | bool | Whether to use the gamma parameter. Defaults to True. | True |
| with_iota | bool | Whether to use the iota parameter. Defaults to True. | True |
| with_alpha | bool | Whether to use the alpha parameter. Defaults to True. | True |
| num_gens | int | The number of generations in the evolutionary algorithm. Defaults to 5. | 5 |
| algorithm | Literal['genetic'] | The optimization algorithm to use. Defaults to "genetic", the only algorithm currently available. | 'genetic' |
| lp_solver | Solver | The linear programming solver to use. Defaults to SCIP. | SCIP |
| nlp_solver | Solver | The nonlinear programming solver to use. Defaults to IPOPT. | IPOPT |
| nlp_strict_mode | bool | Whether to use the <= heuristic (True) or to set all equations to == (False). Defaults to False. | False |
| nlp_single_strict_reacs | list[str] | Reactions that shall individually be in strict mode (see nlp_strict_mode above). Has no effect if nlp_strict_mode=True. Defaults to []. | [] |
| objvalue_json_path | str | The path to the JSON file for storing objective values. Defaults to "". | '' |
| max_rounds_same_objvalue | float | The maximum number of rounds with the same objective value before stopping. Defaults to float("inf"). | float('inf') |
| correction_config | CorrectionConfig | Configuration for corrections during optimization. Defaults to CorrectionConfig(). | CorrectionConfig() |
| min_abs_objvalue | float | The minimum absolute objective value to consider as valid. Defaults to 1e-6. | 1e-06 |
| pop_size | int \| None | The population size for the evolutionary algorithm. Defaults to None. | None |
| ignore_nonlinear_extra_terms_in_ectfbas | bool | Whether non-linear watches/constraints shall be ignored in ecTFBAs. Defaults to True. | True |
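
As shown in `optimize` above, the solution JSONs written during the run are grouped by objective value and returned sorted with the best (highest) objective value first. A minimal stdlib sketch of that grouping step, with invented solution dicts; `OBJECTIVE_VAR_NAME` here merely stands in for COBRAk's constant of the same role:

```python
OBJECTIVE_VAR_NAME = "objective"  # illustrative stand-in for COBRAk's constant

# Hypothetical solutions as written by fitness() during a run; the real dicts
# hold full variable assignments keyed by reaction/variable IDs.
solutions = [
    {OBJECTIVE_VAR_NAME: 0.87, "PFK": 1.3},
    {OBJECTIVE_VAR_NAME: 0.91, "PFK": 1.1},
    {OBJECTIVE_VAR_NAME: 0.87, "PFK": 1.2},
]

# Group solutions by their objective value
result_dict: dict[float, list[dict[str, float]]] = {}
for sol in solutions:
    result_dict.setdefault(sol[OBJECTIVE_VAR_NAME], []).append(sol)

# Best objective values first, mirroring optimize()'s return statement
sorted_results = {key: result_dict[key] for key in sorted(result_dict, reverse=True)}
```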
Source code in cobrak/evolution.py
def __init__(
    self,
    cobrak_model: Model,
    objective_target: dict[str, float],
    objective_sense: int,
    variability_dict: dict[str, tuple[float, float]],
    nlp_dict_list: list[dict[str, float]],
    best_value: float,
    with_kappa: bool = True,
    with_gamma: bool = True,
    with_iota: bool = True,
    with_alpha: bool = True,
    num_gens: int = 5,
    algorithm: Literal["genetic"] = "genetic",
    lp_solver: Solver = SCIP,
    nlp_solver: Solver = IPOPT,
    nlp_strict_mode: bool = False,
    nlp_single_strict_reacs: list[str] = [],
    objvalue_json_path: str = "",
    max_rounds_same_objvalue: float = float("inf"),
    correction_config: CorrectionConfig = CorrectionConfig(),
    min_abs_objvalue: float = 1e-6,
    pop_size: int | None = None,
    ignore_nonlinear_extra_terms_in_ectfbas: bool = True,
) -> None:
    """Initializes a COBRAKProblem object.

    Args:
        cobrak_model (Model): The original COBRA-k model to optimize.
        objective_target (dict[str, float]): The target values for the objectives.
        objective_sense (int): The sense of the objective function (1 for maximization, -1 for minimization).
        variability_dict (dict[str, tuple[float, float]]): The variability data for each reaction.
        nlp_dict_list (list[dict[str, float]]): A list of initial NLP solutions.
        best_value (float): The best value found so far.
        with_kappa (bool, optional): Whether to use kappa parameter. Defaults to True.
        with_gamma (bool, optional): Whether to use gamma parameter. Defaults to True.
        with_iota (bool, optional): Whether to use iota parameter. Defaults to True.
        with_alpha (bool, optional): Whether to use alpha parameter. Defaults to True.
        num_gens (int, optional): The number of generations in the evolutionary algorithm. Defaults to 5.
        algorithm (Literal["genetic"], optional): The type of optimization algorithm to use. Defaults to "genetic", the only algorithm currently available.
        lp_solver (Solver, optional): The linear programming solver to use. Defaults to SCIP.
        nlp_solver (Solver, optional): The nonlinear programming solver to use. Defaults to IPOPT.
        nlp_strict_mode (bool, optional): Whether or not the <= heuristic (True) or not (False; i.e. setting all equations to ==) shall be used. Defaults to False.
        nlp_single_strict_reacs (list[str], optional): List of single reactions that shall be in strict mode (see ```nlp_strict_mode```argument above).
            If ```nlp_strict_mode=True```, this has no effect. Defaults to [].
        objvalue_json_path (str, optional): The path to the JSON file for storing objective values. Defaults to "".
        max_rounds_same_objvalue (float, optional): The maximum number of rounds with the same objective value before stopping. Defaults to float("inf").
        correction_config (CorrectionConfig, optional): Configuration for corrections during optimization. Defaults to CorrectionConfig().
        min_abs_objvalue (float, optional): The minimum absolute value of the objective function to consider as valid. Defaults to 1e-6.
        pop_size (int | None, optional): The population size for the evolutionary algorithm. Defaults to None.
        ignore_nonlinear_extra_terms_in_ectfbas (bool, optional): Whether non-linear watches/constraints shall be ignored in ecTFBAs. Defaults to True.
    """
    self.original_cobrak_model: Model = deepcopy(cobrak_model)
    self.objective_target = objective_target
    self.objective_sense = objective_sense
    self.blocked_reacs: list[str] = []
    self.initial_xs_list: list[list[int | float]] = [
        [] for _ in range(len(nlp_dict_list))
    ]
    self.minimal_xs_dict: dict[float, list[float]] = {}
    self.variability_data = deepcopy(variability_dict)
    self.algorithm = algorithm

    reac_couples = get_stoichiometrically_coupled_reactions(
        self.original_cobrak_model
    )

    objective_target_ids = list(objective_target.keys())
    filtered_reac_couples: list[tuple[str, ...]] = []
    for reac_couple in reac_couples:
        filtered_reac_couple = [
            reac_id
            for reac_id in reac_couple
            if (abs(self.variability_data[reac_id][1]) > 0.0)
            and (abs(self.variability_data[reac_id][0]) <= 0.0)
            and not (
                cobrak_model.reactions[reac_id].dG0 is None
                and cobrak_model.reactions[reac_id].enzyme_reaction_data is None
            )
        ]

        found_invalid_id = False
        for objective_target_id in objective_target_ids:
            if objective_target_id in filtered_reac_couple:
                found_invalid_id = True
        for var_id in correction_config.error_scenario:
            if var_id in filtered_reac_couple:
                found_invalid_id = True

        if found_invalid_id:
            continue

        if len(filtered_reac_couple) > 0:
            filtered_reac_couples.append(tuple(filtered_reac_couple))

    self.idx_to_reac_ids: dict[int, tuple[str, ...]] = {}
    couple_idx = 0
    for filtered_reac_couplex in filtered_reac_couples:
        nlp_idx = 0
        for nlp_dict in nlp_dict_list:
            first_reac_id = filtered_reac_couplex[0]
            if (first_reac_id in nlp_dict) and (nlp_dict[first_reac_id] > 0.0):
                self.initial_xs_list[nlp_idx].append(
                    1.0 if algorithm != "genetic" else 1
                )
            else:
                self.initial_xs_list[nlp_idx].append(
                    0.0 if algorithm != "genetic" else 0
                )

            nlp_idx += 1  # noqa: SIM113

        self.idx_to_reac_ids[couple_idx] = filtered_reac_couplex
        couple_idx += 1

    self.with_kappa = with_kappa
    self.with_gamma = with_gamma
    self.with_iota = with_iota
    self.with_alpha = with_alpha
    self.dim = couple_idx
    self.num_gens = num_gens
    self.lp_solver = lp_solver
    self.nlp_solver = nlp_solver
    self.nlp_strict_mode = nlp_strict_mode
    self.nlp_single_strict_reacs = nlp_single_strict_reacs
    self.temp_directory_name = ""
    self.best_value = best_value
    self.objvalue_json_path = objvalue_json_path
    self.max_rounds_same_objvalue = max_rounds_same_objvalue
    self.correction_config = correction_config
    self.min_abs_objvalue = min_abs_objvalue
    self.pop_size = pop_size
    self.ignore_nonlinear_extra_terms_in_ectfbas = (
        ignore_nonlinear_extra_terms_in_ectfbas
    )
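
The constructor above collapses each group of stoichiometrically coupled reactions into a single binary decision variable, seeded from the given NLP start solutions. A minimal sketch of that mapping, using hypothetical reaction IDs and a plain dict in place of the actual COBRAk data structures:

```python
# Hedged sketch (not the actual COBRAk API): one 0/1 gene per reaction couple,
# set to 1 if the couple's first reaction carries flux in the start solution.
def build_initial_x(
    reac_couples: list[tuple[str, ...]],
    nlp_dict: dict[str, float],
) -> list[int]:
    x: list[int] = []
    for couple in reac_couples:
        first_reac_id = couple[0]
        # Mirrors the constructor's check: present in the start solution
        # and strictly positive flux -> the couple starts as "active".
        x.append(1 if nlp_dict.get(first_reac_id, 0.0) > 0.0 else 0)
    return x

couples = [("R1", "R1b"), ("R2",), ("R3",)]
start_solution = {"R1": 0.5, "R2": 0.0}
print(build_initial_x(couples, start_solution))  # -> [1, 0, 0]
```

Because coupled reactions can only carry flux together, one gene per couple keeps the search space of the evolutionary algorithm much smaller than one gene per reaction.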

fitness(x)

Calculates the fitness of a given solution.

Parameters:

Name Type Description Default
x list[float | int]

The solution to evaluate.

required

Returns:

Type Description
list[tuple[float, list[float | int]]]

list[tuple[float, list[float | int]]]: A list of tuples, where each tuple contains the fitness value and the corresponding solution.

Source code in cobrak/evolution.py
def fitness(
    self,
    x: list[float | int],
) -> list[tuple[float, list[float | int]]]:
    """Calculates the fitness of a given solution.

    Args:
        x (list[float | int]): The solution to evaluate.

    Returns:
        list[tuple[float, list[float | int]]]: A list of tuples, where each tuple contains the fitness value and the corresponding solution.
    """
    # Preliminary TFBA :3
    deactivated_reactions: list[str] = []
    for couple_idx, reac_ids in self.idx_to_reac_ids.items():
        if x[couple_idx] <= 0.02:
            deactivated_reactions.extend(reac_ids)

    try:
        first_ectfba_dict = perform_lp_optimization(
            cobrak_model=self.original_cobrak_model,
            objective_target=self.objective_target,
            objective_sense=self.objective_sense,
            with_enzyme_constraints=True,
            with_thermodynamic_constraints=True,
            with_loop_constraints=True,
            variability_dict=deepcopy(self.variability_data),
            ignored_reacs=deactivated_reactions,
            solver=self.lp_solver,
            correction_config=self.correction_config,
            ignore_nonlinear_terms=self.ignore_nonlinear_extra_terms_in_ectfbas,
        )
    except (ApplicationError, AttributeError, ValueError):
        first_ectfba_dict = {ALL_OK_KEY: False}
    if not first_ectfba_dict[ALL_OK_KEY]:
        return [(1_000_000.0, [])]

    nlp_results: list[dict[str, float]] = []

    if is_objsense_maximization(self.objective_sense):
        lower_value = first_ectfba_dict[OBJECTIVE_VAR_NAME] - 1e-12
        upper_value = None
    else:
        lower_value = None
        upper_value = first_ectfba_dict[OBJECTIVE_VAR_NAME] + 1e-12

    maxz_model = deepcopy(self.original_cobrak_model)
    maxz_model.extra_linear_constraints = [
        ExtraLinearConstraint(
            stoichiometries=self.objective_target,
            lower_value=lower_value,
            upper_value=upper_value,
        )
    ]
    maxz_model.extra_linear_constraints += [
        ExtraLinearConstraint(
            stoichiometries={f"{Z_VAR_PREFIX}{reac_id}": 1.0},
            upper_value=0.0,
        )
        for (reac_id, reac_data) in self.original_cobrak_model.reactions.items()
        if (reac_data.dG0 is not None) and (reac_id in deactivated_reactions)
    ]
    eligible_z_sum_objective = {
        f"{Z_VAR_PREFIX}{reac_id}": 1.0
        for (reac_id, reac_data) in self.original_cobrak_model.reactions.items()
        if (reac_data.dG0 is not None)
        and (self.variability_data[reac_id][1] > 0.0)
        and (reac_id not in deactivated_reactions)
    }
    try:
        maxz_ectfba_dict = perform_lp_optimization(
            cobrak_model=maxz_model,
            objective_target=eligible_z_sum_objective,
            objective_sense=+1,
            with_enzyme_constraints=True,
            with_thermodynamic_constraints=True,
            with_loop_constraints=True,
            variability_dict=deepcopy(self.variability_data),
            ignored_reacs=deactivated_reactions,
            solver=self.lp_solver,
            correction_config=self.correction_config,
            ignore_nonlinear_terms=self.ignore_nonlinear_extra_terms_in_ectfbas,
        )
    except (ApplicationError, AttributeError, ValueError):
        maxz_ectfba_dict = {ALL_OK_KEY: False}
    if maxz_ectfba_dict[ALL_OK_KEY]:
        used_maxz_tfba_dict: dict[str, float] = {}
        for var_id in maxz_ectfba_dict:
            if var_id not in self.original_cobrak_model.reactions:
                continue
            reaction = self.original_cobrak_model.reactions[var_id]
            if (
                (reaction.dG0 is None) and (var_id not in deactivated_reactions)
            ) or (
                (reaction.dG0 is not None)
                and (maxz_ectfba_dict[f"{Z_VAR_PREFIX}{var_id}"] > 0.0)
            ):
                used_maxz_tfba_dict[var_id] = 1.0
            else:
                used_maxz_tfba_dict[var_id] = 0.0

        used_maxz_tfba_dict[ALL_OK_KEY] = True

        if used_maxz_tfba_dict[ALL_OK_KEY]:
            try:
                second_nlp_dict = (
                    perform_nlp_irreversible_optimization_with_active_reacs_only(
                        cobrak_model=self.original_cobrak_model,
                        objective_target=self.objective_target,
                        objective_sense=self.objective_sense,
                        optimization_dict=deepcopy(used_maxz_tfba_dict),
                        variability_dict=deepcopy(self.variability_data),
                        with_kappa=self.with_kappa,
                        with_gamma=self.with_gamma,
                        with_iota=self.with_iota,
                        with_alpha=self.with_alpha,
                        solver=self.nlp_solver,
                        correction_config=self.correction_config,
                        strict_mode=self.nlp_strict_mode,
                        single_strict_reacs=self.nlp_single_strict_reacs,
                    )
                )
                if second_nlp_dict[ALL_OK_KEY] and (
                    abs(second_nlp_dict[OBJECTIVE_VAR_NAME]) > self.min_abs_objvalue
                ):
                    nlp_results.append(second_nlp_dict)
            except (ApplicationError, AttributeError, ValueError):
                pass

    ####
    try:
        minz_ectfba_dict = perform_lp_optimization(
            cobrak_model=maxz_model,
            objective_target=eligible_z_sum_objective,
            objective_sense=-1,
            with_enzyme_constraints=True,
            with_thermodynamic_constraints=True,
            with_loop_constraints=True,
            variability_dict=deepcopy(self.variability_data),
            ignored_reacs=deactivated_reactions,
            solver=self.lp_solver,
            correction_config=self.correction_config,
            ignore_nonlinear_terms=self.ignore_nonlinear_extra_terms_in_ectfbas,
        )
    except (ApplicationError, AttributeError, ValueError):
        minz_ectfba_dict = {ALL_OK_KEY: False}
    if minz_ectfba_dict[ALL_OK_KEY]:
        used_minz_tfba_dict: dict[str, float] = {}
        for var_id in minz_ectfba_dict:
            if var_id not in self.original_cobrak_model.reactions:
                continue
            reaction = self.original_cobrak_model.reactions[var_id]
            if (
                (reaction.dG0 is None) and (var_id not in deactivated_reactions)
            ) or (
                (reaction.dG0 is not None)
                and (minz_ectfba_dict[f"{Z_VAR_PREFIX}{var_id}"] > 0.0)
            ):
                used_minz_tfba_dict[var_id] = 1.0
            else:
                used_minz_tfba_dict[var_id] = 0.0

        used_minz_tfba_dict[ALL_OK_KEY] = True

        if used_minz_tfba_dict[ALL_OK_KEY]:
            try:
                third_nlp_dict = (
                    perform_nlp_irreversible_optimization_with_active_reacs_only(
                        cobrak_model=self.original_cobrak_model,
                        objective_target=self.objective_target,
                        objective_sense=self.objective_sense,
                        optimization_dict=deepcopy(used_minz_tfba_dict),
                        variability_dict=deepcopy(self.variability_data),
                        with_kappa=self.with_kappa,
                        with_gamma=self.with_gamma,
                        with_iota=self.with_iota,
                        with_alpha=self.with_alpha,
                        solver=self.nlp_solver,
                        correction_config=self.correction_config,
                        strict_mode=self.nlp_strict_mode,
                        single_strict_reacs=self.nlp_single_strict_reacs,
                    )
                )
                if third_nlp_dict[ALL_OK_KEY] and (
                    abs(third_nlp_dict[OBJECTIVE_VAR_NAME]) > self.min_abs_objvalue
                ):
                    nlp_results.append(third_nlp_dict)
            except (ApplicationError, AttributeError, ValueError):
                pass
    ####

    output: list[tuple[float, list[float | int]]] = [(1_000_000, [])]
    for nlp_result in nlp_results:
        objvalues = [nlp_result[OBJECTIVE_VAR_NAME] for nlp_result in nlp_results]
        if is_objsense_maximization(self.objective_sense):
            opt_idx = objvalues.index(max(objvalues))
        else:
            opt_idx = objvalues.index(min(objvalues))
        opt_nlp_dict = nlp_results[opt_idx]

        objective_value = opt_nlp_dict[OBJECTIVE_VAR_NAME]

        if self.temp_directory_name:
            filename = f"{self.temp_directory_name}{objective_value}{time()}{randint(0, 1_000_000_000)}.json"  # noqa: NPY002
            json_write(filename, opt_nlp_dict)

        if is_objsense_maximization(self.objective_sense):
            objective_value *= -1

        print("No error, objective value is:", objective_value)

        active_nlp_x: list[float | int] = [
            0 for _ in range(len(list(self.idx_to_reac_ids.keys())))
        ]
        for couple_idx, reac_ids in self.idx_to_reac_ids.items():
            reac_id = reac_ids[0]
            if reac_id not in opt_nlp_dict or opt_nlp_dict[reac_id] < 1e-11:
                set_value = 0
            else:
                set_value = 1
            active_nlp_x[couple_idx] = set_value
        output.append((objective_value, active_nlp_x))

    return output

optimize()

Performs the optimization process.

Returns:

Type Description
dict[float, list[dict[str, float]]]

dict[float, list[dict[str, float]]]: A dictionary containing the optimization results.

Source code in cobrak/evolution.py
def optimize(self) -> dict[float, list[dict[str, float]]]:
    """Performs the optimization process.

    Returns:
        dict[float, list[dict[str, float]]]: A dictionary containing the optimization results.
    """
    temp_directory = TemporaryDirectory()
    self.temp_directory_name = standardize_folder(temp_directory.name)

    match self.algorithm:
        case "genetic":
            evolution = COBRAKGENETIC(
                fitness_function=self.fitness,
                xs_dim=self.dim,
                extra_xs=self.initial_xs_list,
                gen=self.num_gens,
                objvalue_json_path=self.objvalue_json_path,
                max_rounds_same_objvalue=self.max_rounds_same_objvalue,
                pop_size=self.pop_size,
            )
        case _:
            print(
                f"ERROR: Evolution algorithm {self.algorithm} does not exist! Use 'genetic'."
            )
            raise ValueError
    evolution.run()

    result_dict: dict[float, list[dict[str, float]]] = {}
    for json_filename in get_files(self.temp_directory_name):
        json_data = json_load(f"{self.temp_directory_name}{json_filename}", Any)
        objective_value = json_data[OBJECTIVE_VAR_NAME]
        if objective_value not in result_dict:
            result_dict[objective_value] = []
        result_dict[objective_value].append(deepcopy(json_data))

    temp_directory.cleanup()

    return {
        key: result_dict[key] for key in sorted(result_dict.keys(), reverse=True)
    }
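
optimize() thus returns a mapping from each objective value to the list of solution dicts that reached it, with keys sorted in descending order. A sketch of consuming such a result (the numbers and keys below are hypothetical):

```python
# Hedged sketch: the dict returned by optimize() is keyed by objective value,
# best value first (descending sort), each mapping to one or more solutions.
result_dict: dict[float, list[dict[str, float]]] = {
    0.91: [{"objective": 0.91, "R1": 1.2}],
    0.87: [{"objective": 0.87, "R1": 1.0}, {"objective": 0.87, "R1": 0.9}],
}
best_value = next(iter(result_dict))  # first key = best objective value
best_solutions = result_dict[best_value]
print(best_value, len(best_solutions))  # -> 0.91 1
```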

perform_nlp_evolutionary_optimization(cobrak_model, objective_target, objective_sense, variability_dict={}, with_kappa=True, with_gamma=True, with_iota=False, with_alpha=False, sampling_wished_num_feasible_starts=3, sampling_max_metarounds=3, sampling_rounds_per_metaround=2, sampling_max_deactivated_reactions=5, sampling_always_deactivated_reactions=[], evolution_num_gens=5, algorithm='genetic', lp_solver=SCIP, nlp_solver=IPOPT, nlp_strict_mode=False, nlp_single_strict_reacs=[], objvalue_json_path='', max_rounds_same_objvalue=float('inf'), correction_config=CorrectionConfig(), min_abs_objvalue=1e-13, pop_size=None, working_results=[], ignore_nonlinear_extra_terms_in_ectfbas=True)

Performs NLP evolutionary optimization on the given COBRA-k model.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k model to optimize.

required
objective_target str | dict[str, float]

Target value(s) for the objective function.

required
objective_sense int

Sense of the objective function (1 for maximization, -1 for minimization).

required
variability_dict dict[str, tuple[float, float]]

Variability data for each reaction. Defaults to {}.

{}
with_kappa bool

Whether to use kappa parameter. Defaults to True.

True
with_gamma bool

Whether to use gamma parameter. Defaults to True.

True
with_iota bool

Whether to use iota parameter. Defaults to False.

False
with_alpha bool

Whether to use alpha parameter. Defaults to False.

False
sampling_wished_num_feasible_starts int

The desired number of feasible start solutions. Defaults to 3.

3
sampling_max_metarounds int

Maximum number of meta rounds for sampling. Defaults to 3.

3
sampling_rounds_per_metaround int

Number of rounds per meta round for sampling. Defaults to 2.

2
sampling_max_deactivated_reactions int

Maximum number of deactivated reactions allowed. Defaults to 5.

5
sampling_always_deactivated_reactions list[str]

List of reactions that should always be deactivated. Defaults to [].

[]
evolution_num_gens int

Number of generations for the evolutionary algorithm. Defaults to 5.

5
algorithm Literal['genetic']

Type of optimization algorithm to use. Defaults to "genetic", which is also the only algorithm currently available.

'genetic'
lp_solver Solver

The linear programming solver to use. Defaults to SCIP.

SCIP
nlp_solver Solver

The nonlinear programming solver to use. Defaults to IPOPT.

IPOPT
nlp_strict_mode bool

Whether the <= heuristic is used (True) or all equations are set to == (False). Defaults to False.

False
nlp_single_strict_reacs list[str]

List of single reactions that shall be in strict mode (see the nlp_strict_mode argument above). If nlp_strict_mode=True, this has no effect. Defaults to [].

[]
objvalue_json_path str

Path to the JSON file for objective values. Defaults to "".

''
max_rounds_same_objvalue float

Maximum number of rounds with same objective value before stopping. Defaults to float("inf").

float('inf')
correction_config CorrectionConfig

Configuration for corrections during optimization. Defaults to CorrectionConfig().

CorrectionConfig()
min_abs_objvalue float

Minimum absolute value of objective function to consider valid. Defaults to 1e-13.

1e-13
pop_size int | None

Population size for the evolutionary algorithm. Defaults to None.

None
working_results list[dict[str, float]]

List of initial feasible results. Defaults to [].

[]
ignore_nonlinear_extra_terms_in_ectfbas bool

Whether non-linear watches/constraints shall be ignored in ecTFBAs. Defaults to True.

True

Returns:

Type Description
dict[float, list[dict[str, float]]]

dict[float, list[dict[str, float]]]: Dictionary of objective values and corresponding solutions.

Source code in cobrak/evolution.py
def perform_nlp_evolutionary_optimization(
    cobrak_model: Model,
    objective_target: str | dict[str, float],
    objective_sense: int,
    variability_dict: dict[str, tuple[float, float]] = {},
    with_kappa: bool = True,
    with_gamma: bool = True,
    with_iota: bool = False,
    with_alpha: bool = False,
    sampling_wished_num_feasible_starts: int = 3,
    sampling_max_metarounds: int = 3,
    sampling_rounds_per_metaround: int = 2,
    sampling_max_deactivated_reactions: int = 5,
    sampling_always_deactivated_reactions: list[str] = [],
    evolution_num_gens: int = 5,
    algorithm: Literal["genetic"] = "genetic",
    lp_solver: Solver = SCIP,
    nlp_solver: Solver = IPOPT,
    nlp_strict_mode: bool = False,
    nlp_single_strict_reacs: list[str] = [],
    objvalue_json_path: str = "",
    max_rounds_same_objvalue: float = float("inf"),
    correction_config: CorrectionConfig = CorrectionConfig(),
    min_abs_objvalue: float = 1e-13,
    pop_size: int | None = None,
    working_results: list[dict[str, float]] = [],
    ignore_nonlinear_extra_terms_in_ectfbas: bool = True,
) -> dict[float, list[dict[str, float]]]:
    """Performs NLP evolutionary optimization on the given COBRA-k model.

    Args:
        cobrak_model (Model): The COBRA-k model to optimize.
        objective_target (str | dict[str, float]): Target value(s) for the objective function.
        objective_sense (int): Sense of the objective function (1 for maximization, -1 for minimization).
        variability_dict (dict[str, tuple[float, float]], optional): Variability data for each reaction. Defaults to {}.
        with_kappa (bool, optional): Whether to use kappa parameter. Defaults to True.
        with_gamma (bool, optional): Whether to use gamma parameter. Defaults to True.
        with_iota (bool, optional): Whether to use iota parameter. Defaults to False.
        with_alpha (bool, optional): Whether to use alpha parameter. Defaults to False.
        sampling_wished_num_feasible_starts (int, optional): The desired number of feasible start solutions. Defaults to 3.
        sampling_max_metarounds (int, optional): Maximum number of meta rounds for sampling. Defaults to 3.
        sampling_rounds_per_metaround (int, optional): Number of rounds per meta round for sampling. Defaults to 2.
        sampling_max_deactivated_reactions (int, optional): Maximum number of deactivated reactions allowed. Defaults to 5.
        sampling_always_deactivated_reactions (list[str], optional): List of reactions that should always be deactivated. Defaults to [].
        evolution_num_gens (int, optional): Number of generations for the evolutionary algorithm. Defaults to 5.
        algorithm (Literal["genetic"], optional): Type of optimization algorithm to use. Defaults to "genetic", which is also the only algorithm currently available.
        lp_solver (Solver, optional): The linear programming solver to use. Defaults to SCIP.
        nlp_solver (Solver, optional): The nonlinear programming solver to use. Defaults to IPOPT.
        nlp_strict_mode (bool, optional): Whether the <= heuristic shall be used (True) or all equations shall be set to == (False). Defaults to False.
        nlp_single_strict_reacs (list[str], optional): List of single reactions that shall be in strict mode (see the ```nlp_strict_mode``` argument above).
            If ```nlp_strict_mode=True```, this has no effect. Defaults to [].
        objvalue_json_path (str, optional): Path to the JSON file for objective values. Defaults to "".
        max_rounds_same_objvalue (float, optional): Maximum number of rounds with same objective value before stopping. Defaults to float("inf").
        correction_config (CorrectionConfig, optional): Configuration for corrections during optimization. Defaults to CorrectionConfig().
        min_abs_objvalue (float, optional): Minimum absolute value of objective function to consider valid. Defaults to 1e-13.
        pop_size (int | None, optional): Population size for the evolutionary algorithm. Defaults to None.
        working_results (list[dict[str, float]], optional): List of initial feasible results. Defaults to [].
        ignore_nonlinear_extra_terms_in_ectfbas (bool, optional): Whether non-linear watches/constraints shall be ignored in ecTFBAs. Defaults to True.

    Returns:
        dict[float, list[dict[str, float]]]: Dictionary of objective values and corresponding solutions.
    """
    if variability_dict == {}:
        variability_dict = perform_lp_variability_analysis(
            cobrak_model=cobrak_model,
            with_enzyme_constraints=True,
            with_thermodynamic_constraints=True,
            active_reactions=[],
            solver=lp_solver,
            ignore_nonlinear_terms=ignore_nonlinear_extra_terms_in_ectfbas,
        )
    else:
        variability_dict = deepcopy(variability_dict)

    # Initial sampling
    if isinstance(objective_target, str):
        objective_target = {objective_target: 1.0}
    objective_target_ids = list(objective_target.keys())  # type: ignore

    deactivatable_reactions = [
        var_id
        for var_id in variability_dict
        if (var_id in cobrak_model.reactions)
        and (variability_dict[var_id][0] == 0.0)
        and (var_id not in objective_target_ids)
        and (var_id not in sampling_always_deactivated_reactions)
        and (var_id not in correction_config.error_scenario)
    ]
    distinct_feasible_start_solutions: dict[tuple[str, ...], dict[str, float]] = {}
    for current_round in range(sampling_max_metarounds):
        # Get deactivated reaction lists
        all_deactivated_reaction_lists = [
            [
                sample(
                    deactivatable_reactions,
                    randint(1, sampling_max_deactivated_reactions + 1),  # noqa: NPY002
                )
                + sampling_always_deactivated_reactions
                for _ in range(sampling_rounds_per_metaround)
            ]
            for _ in range(cpu_count())
        ]
        if current_round == 0:
            all_deactivated_reaction_lists[0][0] = deepcopy(
                sampling_always_deactivated_reactions
            )

        # run sampling
        results = Parallel(n_jobs=-1, verbose=10)(
            delayed(_sampling_routine)(
                cobrak_model,
                objective_target,
                objective_sense,
                variability_dict,
                with_kappa,
                with_gamma,
                with_iota,
                with_alpha,
                deactivated_reaction_lists,
                lp_solver,
                nlp_solver,
                nlp_strict_mode,
                nlp_single_strict_reacs,
                correction_config,
                min_abs_objvalue,
                ignore_nonlinear_extra_terms_in_ectfbas,
            )
            for deactivated_reaction_lists in all_deactivated_reaction_lists
        )
        if len(working_results) > 0:
            results.append(working_results)
        best_result = (
            -float("inf") if is_objsense_maximization(objective_sense) else float("inf")
        )
        for result in results:
            for nlp_dict in result:
                active_reacs_tuple = tuple(
                    sorted(
                        get_active_reacs_from_optimization_dict(cobrak_model, nlp_dict)
                    )
                )
                distinct_feasible_start_solutions[active_reacs_tuple] = deepcopy(
                    nlp_dict
                )
                if is_objsense_maximization(objective_sense):
                    best_result = max(nlp_dict[OBJECTIVE_VAR_NAME], best_result)
                else:
                    best_result = min(nlp_dict[OBJECTIVE_VAR_NAME], best_result)

        if (
            len(distinct_feasible_start_solutions.keys())
            >= sampling_wished_num_feasible_starts
        ):
            break

    if len(distinct_feasible_start_solutions.keys()) == 0:
        print(
            "ERROR in initial sampling: No feasible sampling solution found! Check feasibility of problem and/or adjust sampling settings."
        )
        raise ValueError
    if (
        len(distinct_feasible_start_solutions.keys())
        < sampling_wished_num_feasible_starts
    ):
        print("INFO: Fewer feasible sampling solutions found than wished.")

    # Evolutionary algorithm
    problem = COBRAKProblem(
        cobrak_model=cobrak_model,
        objective_target=objective_target,  # type: ignore
        objective_sense=objective_sense,
        variability_dict=variability_dict,
        nlp_dict_list=list(distinct_feasible_start_solutions.values()),
        best_value=best_result,
        with_kappa=with_kappa,
        with_gamma=with_gamma,
        with_iota=with_iota,
        with_alpha=with_alpha,
        num_gens=evolution_num_gens,
        algorithm=algorithm,
        lp_solver=lp_solver,
        nlp_solver=nlp_solver,
        objvalue_json_path=objvalue_json_path,
        max_rounds_same_objvalue=max_rounds_same_objvalue,
        correction_config=correction_config,
        min_abs_objvalue=min_abs_objvalue,
        pop_size=pop_size,
        ignore_nonlinear_extra_terms_in_ectfbas=ignore_nonlinear_extra_terms_in_ectfbas,
        nlp_strict_mode=nlp_strict_mode,
        nlp_single_strict_reacs=nlp_single_strict_reacs,
    )

    return problem.optimize()
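
During the initial sampling shown above, each metaround hands cpu_count() parallel workers sampling_rounds_per_metaround random deactivation lists each. A minimal sketch of that bookkeeping (a simplified stand-in, not the actual _sampling_routine setup; parameter names mirror the function's arguments):

```python
# Hedged sketch: build the nested deactivated-reaction lists, one outer list
# per worker, each holding one random deactivation set per sampling round.
from random import randint, sample

def build_deactivation_lists(
    deactivatable: list[str],
    always_off: list[str],
    rounds_per_metaround: int,
    n_workers: int,
    max_deactivated: int,
) -> list[list[list[str]]]:
    return [
        [
            # 1..max_deactivated randomly sampled reactions, plus the
            # reactions that must always stay deactivated.
            sample(deactivatable, randint(1, max_deactivated)) + always_off
            for _ in range(rounds_per_metaround)
        ]
        for _ in range(n_workers)
    ]

lists = build_deactivation_lists(
    ["R1", "R2", "R3", "R4"], ["R_off"],
    rounds_per_metaround=2, n_workers=3, max_deactivated=2,
)
print(len(lists), len(lists[0]))  # -> 3 2
```

Each inner list then feeds one ecTFBA-plus-NLP sampling call; distinct active-reaction sets among the feasible results become the start population of the evolutionary algorithm.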

postprocess(cobrak_model, opt_dict, objective_target, objective_sense, variability_data, with_kappa=True, with_gamma=True, with_iota=False, with_alpha=False, lp_solver=SCIP, nlp_solver=IPOPT, nlp_strict_mode=False, nlp_single_strict_reacs=[], verbose=False, correction_config=CorrectionConfig(), onlytested='', ignore_nonlinear_extra_terms_in_ectfbas=True)

Postprocesses the optimization results to find feasible switches.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k model to optimize.

required
opt_dict dict[str, float]

Optimization result dictionary.

required
objective_target str | dict[str, float]

Target value(s) for the objective function.

required
objective_sense int

Sense of the objective function (1 for maximization, -1 for minimization).

required
variability_data dict[str, tuple[float, float]]

Variability data for each reaction.

required
with_kappa bool

Whether to use kappa parameter. Defaults to True.

True
with_gamma bool

Whether to use gamma parameter. Defaults to True.

True
with_iota bool

Whether to use iota parameter. Defaults to False.

False
with_alpha bool

Whether to use alpha parameter. Defaults to False.

False
lp_solver Solver

The linear programming solver to use. Defaults to SCIP.

SCIP
nlp_solver Solver

The nonlinear programming solver to use. Defaults to IPOPT.

IPOPT
nlp_strict_mode bool

Whether the <= heuristic is used (True) or all equations are set to == (False). Defaults to False.

False
nlp_single_strict_reacs list[str]

List of single reactions that shall be in strict mode (see the nlp_strict_mode argument above). If nlp_strict_mode=True, this has no effect. Defaults to [].

[]
verbose bool

Whether to enable verbose output. Defaults to False.

False
correction_config CorrectionConfig

Configuration for corrections during optimization. Defaults to CorrectionConfig().

CorrectionConfig()
onlytested str

Specific reactions to test during postprocessing. Defaults to "".

''
ignore_nonlinear_extra_terms_in_ectfbas bool

Whether non-linear watches/constraints shall be ignored in ecTFBAs. Defaults to True.

True

Returns:

Type Description
tuple[float, list[float | int]]

tuple[float, list[float | int]]: A list of feasible switches and the best result.

Source code in cobrak/evolution.py
def postprocess(
    cobrak_model: Model,
    opt_dict: dict[str, float],
    objective_target: str | dict[str, float],
    objective_sense: int,
    variability_data: dict[str, tuple[float, float]],
    with_kappa: bool = True,
    with_gamma: bool = True,
    with_iota: bool = False,
    with_alpha: bool = False,
    lp_solver: Solver = SCIP,
    nlp_solver: Solver = IPOPT,
    nlp_strict_mode: bool = False,
    nlp_single_strict_reacs: list[str] = [],
    verbose: bool = False,
    correction_config: CorrectionConfig = CorrectionConfig(),
    onlytested: str = "",
    ignore_nonlinear_extra_terms_in_ectfbas: bool = True,
) -> tuple[float, list[float | int]]:
    """Postprocesses the optimization results to find feasible switches.

    Args:
        cobrak_model (Model): The COBRA-k model to optimize.
        opt_dict (dict[str, float]): Optimization result dictionary.
        objective_target (str | dict[str, float]): Target value(s) for the objective function.
        objective_sense (int): Sense of the objective function (1 for maximization, -1 for minimization).
        variability_data (dict[str, tuple[float, float]]): Variability data for each reaction.
        with_kappa (bool, optional): Whether to use kappa parameter. Defaults to True.
        with_gamma (bool, optional): Whether to use gamma parameter. Defaults to True.
        with_iota (bool, optional): Whether to use iota parameter. Defaults to False.
        with_alpha (bool, optional): Whether to use alpha parameter. Defaults to False.
        lp_solver (Solver, optional): The linear programming solver to use. Defaults to SCIP.
        nlp_solver (Solver, optional): The nonlinear programming solver to use. Defaults to IPOPT.
        nlp_strict_mode (bool, optional): Whether the <= heuristic shall be used (True) or all equations shall be set to == (False). Defaults to False.
        nlp_single_strict_reacs (list[str], optional): List of single reactions that shall be in strict mode (see the ```nlp_strict_mode``` argument above).
            If ```nlp_strict_mode=True```, this has no effect. Defaults to [].
        verbose (bool, optional): Whether to enable verbose output. Defaults to False.
        correction_config (CorrectionConfig, optional): Configuration for corrections during optimization. Defaults to CorrectionConfig().
        onlytested (str, optional): Specific reactions to test during postprocessing. Defaults to "".
        ignore_nonlinear_extra_terms_in_ectfbas (bool, optional): Whether non-linear watches/constraints shall be ignored in ecTFBAs. Defaults to True.

    Returns:
        tuple[float, list[float | int]]: A list of feasible switches and the best result.
    """
    if variability_data == {}:
        variability_data = perform_lp_variability_analysis(
            cobrak_model=cobrak_model,
            with_enzyme_constraints=True,
            with_thermodynamic_constraints=True,
            active_reactions=[],
            solver=lp_solver,
            ignore_nonlinear_terms=ignore_nonlinear_extra_terms_in_ectfbas,
        )
    else:
        variability_data = deepcopy(variability_data)

    pyomo_lp_solver = get_solver(
        lp_solver,
    )

    cobrak_model = deepcopy(cobrak_model)
    if type(objective_target) is str:
        objective_target = {objective_target: 1.0}
    obj_value = 0.0
    for obj_target_id, obj_target_multiplier in objective_target.items():
        obj_value += opt_dict[obj_target_id] * obj_target_multiplier
    epsilon = 1e-6 if is_objsense_maximization(objective_sense) else -1e-6
    cobrak_model = add_objective_value_as_extra_linear_constraint(
        cobrak_model,
        obj_value + epsilon,
        objective_target,
        objective_sense,
    )

    reac_couples = get_stoichiometrically_coupled_reactions(cobrak_model)
    active_reacs = [
        active_reac
        for active_reac in get_active_reacs_from_optimization_dict(
            cobrak_model, opt_dict
        )
        if (active_reac in [reac_ids[0] for reac_ids in reac_couples])
        and (active_reac not in objective_target)  # and opt_dict[active_reac] > 1e-8
    ]
    active_reac_couples: list[list[str]] = [
        reac_couple
        for reac_couple in reac_couples
        if reac_couple[0] in active_reacs and variability_data[reac_couple[0]][0] == 0.0
    ]
    inactive_reac_couples: list[list[str]] = [
        reac_couple
        for reac_couple in reac_couples
        if reac_couple[0] not in active_reacs
        and variability_data[reac_couple[0]][1] > 0.0
    ]
    targets = []
    for max_target_num in (0, 5):
        targets += [("deac", x, max_target_num) for x in active_reac_couples]
        targets += [("ac", x, max_target_num) for x in inactive_reac_couples]
    all_feasible_switches_metalist = Parallel(n_jobs=-1, verbose=10)(
        delayed(_postprocess_batch)(
            reac_couples,
            targets_batch,
            active_reacs,
            cobrak_model,
            objective_target,
            objective_sense,
            variability_data,
            pyomo_lp_solver,
            with_kappa,
            with_gamma,
            with_iota,
            with_alpha,
            lp_solver,
            nlp_solver,
            nlp_strict_mode,
            nlp_single_strict_reacs,
            verbose,
            correction_config,
            onlytested,
            ignore_nonlinear_extra_terms_in_ectfbas,
        )
        for targets_batch in split_list(targets, cpu_count())
    )
    all_feasible_switches = []
    for sublist in all_feasible_switches_metalist:
        all_feasible_switches.extend(sublist)

    if len(all_feasible_switches) > 0:
        best_result = all_feasible_switches[0][2]
        for result in [x[2] for x in all_feasible_switches[1:]]:
            if is_objsense_maximization(objective_sense):
                if result[OBJECTIVE_VAR_NAME] > best_result[OBJECTIVE_VAR_NAME]:
                    best_result = result
            else:
                if result[OBJECTIVE_VAR_NAME] < best_result[OBJECTIVE_VAR_NAME]:
                    best_result = result
    else:
        best_result = {}

    return all_feasible_switches, best_result
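The best-result selection at the end of postprocess can be sketched standalone. This is a hedged illustration with hypothetical result dictionaries; the literal key `"objective"` stands in for the real OBJECTIVE_VAR_NAME constant:

```python
# Each feasible switch carries its result dictionary as the third element;
# the best one is picked by comparing the objective variable's value.
OBJ = "objective"  # stands in for OBJECTIVE_VAR_NAME
feasible_switches = [
    ("deac", ["R1"], {OBJ: 0.82}),
    ("ac", ["R2"], {OBJ: 0.91}),
    ("deac", ["R3"], {OBJ: 0.77}),
]
maximize = True  # mirrors is_objsense_maximization(objective_sense)

best_result = feasible_switches[0][2]
for result in [switch[2] for switch in feasible_switches[1:]]:
    if (maximize and result[OBJ] > best_result[OBJ]) or (
        not maximize and result[OBJ] < best_result[OBJ]
    ):
        best_result = result
print(best_result[OBJ])  # → 0.91
```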

example_models

Contains the toy model from COBRAk's documentation and publication as an example model, as well as iCH360_cobrak.

__getattr__(name)

Called by Python when an attribute that does not yet exist in the module namespace is requested.

We use it to load the JSON the first time iCH360_cobrak is asked for, then store the result in globals() so the load happens only once.

Source code in cobrak/example_models.py
def __getattr__(name: str) -> Any:  # called only for missing attrs  # noqa: ANN401
    """Called by Python when an attribute that does not yet exist in the
    module namespace is requested.

    We use it to load the JSON the first time ``iCH360_cobrak`` is asked
    for, then store the result in ``globals()`` so the load happens only
    once.
    """
    if name == "iCH360_cobrak":
        data = json_load(
            str(r.files("cobrak").joinpath("data/iCH360_cobrak.json"))
        )  # <-load the file now
        globals()[name] = data  # cache for future accesses
        return data
    raise AttributeError(name)
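The PEP 562 lazy-loading pattern used here can be demonstrated on a throwaway module. This is a sketch with hypothetical names (`lazy_demo`, `heavy_data`); a plain dict stands in for the JSON load:

```python
import sys
from types import ModuleType

# Build a synthetic module whose attribute "heavy_data" is computed only on
# first access, then cached in the module's namespace -- the same pattern
# example_models uses for iCH360_cobrak.
mod = ModuleType("lazy_demo")

def _module_getattr(name):
    if name == "heavy_data":
        data = {"loaded": True}   # stands in for the json_load(...) call
        setattr(mod, name, data)  # cache: __getattr__ is not called again
        return data
    raise AttributeError(name)

mod.__getattr__ = _module_getattr
sys.modules["lazy_demo"] = mod

import lazy_demo
print(lazy_demo.heavy_data)             # → {'loaded': True}
assert "heavy_data" in vars(lazy_demo)  # cached after the first access
```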

expasy_functionality

This module provides functionality to parse Expasy enzyme RDF files and extract EC number transfers.

get_ec_number_transfers(expasy_enzyme_rdf_path)

Parses an Expasy enzyme RDF file to extract enzyme EC number transfers.

Parameters:

Name Type Description Default
expasy_enzyme_rdf_path str

Path to the Expasy enzyme RDF file.

required

Returns:

Type Description
dict[str, str]

dict[str, str]: A dictionary where each key is an EC number, and its corresponding value is the EC number it is transferred to. The dictionary includes both directions of the transfer (old to new and new to old).

Source code in cobrak/expasy_functionality.py
@validate_call(validate_return=True)
def get_ec_number_transfers(expasy_enzyme_rdf_path: str) -> dict[str, str]:
    """Parses an Expasy enzyme RDF file to extract enzyme EC number transfers.

    Args:
        expasy_enzyme_rdf_path (str): Path to the Expasy enzyme RDF file.

    Returns:
        dict[str, str]: A dictionary where each key is an EC number, and its corresponding value is the EC number it is transferred to.
                        The dictionary includes both directions of the transfer (old to new and new to old).
    """
    tree = ET.parse(expasy_enzyme_rdf_path)
    root = tree.getroot()

    ec_number_transfers: dict[str, str] = {}
    for child in root:
        for subchild in child:
            if "replaces" not in subchild.tag:
                continue
            new_ec_numbers = list(child.attrib.values())
            old_ec_numbers = list(subchild.attrib.values())
            for new_ec_number in new_ec_numbers:
                for old_ec_number in old_ec_numbers:
                    ec_number_transfers[old_ec_number] = new_ec_number
                    ec_number_transfers[new_ec_number] = old_ec_number
    return ec_number_transfers
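The traversal above can be exercised on a minimal, hypothetical RDF snippet that mimics the Expasy layout (the new EC number sits in the element's attributes, the old one in a "replaces" subelement):

```python
import xml.etree.ElementTree as ET

rdf = """
<root>
  <enzyme about="1.1.1.2">
    <replaces resource="1.1.1.1"/>
  </enzyme>
</root>
"""
root = ET.fromstring(rdf)

transfers = {}
for child in root:
    for subchild in child:
        if "replaces" not in subchild.tag:
            continue  # only "replaces" subelements encode a transfer
        for new_ec in child.attrib.values():
            for old_ec in subchild.attrib.values():
                # record both directions, as get_ec_number_transfers does
                transfers[old_ec] = new_ec
                transfers[new_ec] = old_ec

print(transfers)  # → {'1.1.1.1': '1.1.1.2', '1.1.1.2': '1.1.1.1'}
```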

genetic

Methods for the genetic algorithm used in COBRA-k's evolutionary algorithm.

COBRAKGENETIC

A class for performing genetic algorithm optimization.

Attributes:

Name Type Description
fitness_function Callable

A function that takes a list of integers/floats and returns a tuple containing the fitness score and a list of integers/floats.

xs_dim int

The dimensionality of the search space.

gen int

The number of generations to run the algorithm for.

seed int | None

The seed for the random number generator.

objvalue_json_path str

The path to a JSON file to store objective values.

max_rounds_same_objvalue float

The maximum number of rounds with the same objective value before stopping the algorithm.

pop_size int | None

The size of the population. If None, defaults to the number of CPUs.
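A minimal, hypothetical fitness function with the shape that run() consumes: each call returns (fitness, solution) pairs for the evaluated bit vectors. The toy scoring is an assumption for illustration only:

```python
def toy_fitness(x):
    # Toy score: count of active bits. Real COBRAk fitness functions return
    # (fitness, solution) pairs for each evaluated variant of x.
    return [(float(sum(x)), list(x))]

# run() unpacks the return as `for fitness, xs in ...`:
for fitness, xs in toy_fitness([1, 0, 1, 1]):
    print(fitness, xs)  # → 3.0 [1, 0, 1, 1]
```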

Source code in cobrak/genetic.py
class COBRAKGENETIC:
    """A class for performing genetic algorithm optimization.

    Attributes:
        fitness_function (Callable): A function that takes a list of integers/floats
            and returns a tuple containing the fitness score and a list of integers/floats.
        xs_dim (int): The dimensionality of the search space.
        gen (int): The number of generations to run the algorithm for.
        seed (int | None): The seed for the random number generator.
        objvalue_json_path (str): The path to a JSON file to store objective values.
        max_rounds_same_objvalue (float): The maximum number of rounds with the same
            objective value before stopping the algorithm.
        pop_size (int | None): The size of the population. If None, defaults to the
            number of CPUs.
    """

    def __init__(
        self,
        fitness_function: Callable[
            [list[float | int]], tuple[float, list[float | int]]
        ],
        xs_dim: int,
        gen: int,
        extra_xs: list[list[int]] = [],
        seed: int | None = None,
        objvalue_json_path: str = "",
        max_rounds_same_objvalue: float = float("inf"),
        pop_size: int | None = None,
    ) -> None:
        """Initializes the COBRAKGENETIC object.

        Args:
            fitness_function (Callable): The fitness function to evaluate solutions.
            xs_dim (int): The dimensionality of the search space.
            gen (int): The number of generations to run.
            extra_xs (list[list[int]], optional): Extra particles to initialize the population.
                Defaults to [].
            seed (int | None, optional): Seed for the random number generator. Defaults to None.
            objvalue_json_path (str, optional): Path to a JSON file to store objective values.
                Defaults to "".
            max_rounds_same_objvalue (float, optional): Maximum rounds with the same objective
                value before stopping. Defaults to infinity.
            pop_size (int | None, optional): Population size. Defaults to None.
        """
        # Parameters
        self.fitness_function = fitness_function
        self.xs_dim = xs_dim
        self.gen = gen
        self.seed = seed
        if seed is not None:
            np.random.seed(seed)  # noqa: NPY002

        # Initialization of random particles
        cpu_count_value = cpu_count() if pop_size is None else pop_size
        if cpu_count_value is None:
            self.cpu_count = 1
        else:
            self.cpu_count = cpu_count_value
        self.init_xs = [
            [randint(0, 1) for _ in range(xs_dim)]
            for _ in range(self.cpu_count - len(extra_xs))
        ]

        # Addition of user-defined extra particles
        if extra_xs != []:
            self.init_xs.extend(extra_xs)

        self.tested_xs: dict[tuple[int, ...], float] = {}
        self.all_xs: dict[tuple[int, ...], float] = {}
        self.objvalue_json_path = objvalue_json_path
        self.objvalue_json_data: dict[float, list[float]] = {}
        self.max_rounds_same_objvalue = max_rounds_same_objvalue

    def _get_sorted_list_from_tested_xs(self) -> list[tuple[float, tuple[int, ...]]]:
        """Returns a sorted list of tuples containing fitness scores and solutions.

        Returns:
            list[tuple[float, tuple[int, ...]]]: A sorted list of (fitness, solution) tuples.
        """
        return sorted(
            [(fitness, x) for (x, fitness) in self.tested_xs.items()],
            key=operator.itemgetter(0),
        )

    def run(self) -> tuple[float, tuple[int, ...]]:
        """Runs the genetic algorithm optimization.

        Returns:
            tuple[float, tuple[int, ...]]: A tuple containing the best fitness score and the
            corresponding solution.
        """
        init_fitnesses = Parallel(n_jobs=-1)(
            delayed(self.fitness_function)(x) for x in self.init_xs
        )
        if init_fitnesses is not None:
            self.tested_xs = {}
            for init_fitness in init_fitnesses:
                for fitness, xs in init_fitness:
                    self.tested_xs[tuple(xs)] = fitness
        else:
            print("ERROR: Something went wrong during initialization")
            raise ValueError

        if self.objvalue_json_path:
            start_time = time()
            self.objvalue_json_data[0.0] = sorted(self.tested_xs.values())
            json_write(self.objvalue_json_path, self.objvalue_json_data)

        # Actual algorithm
        max_objvalues = []
        for _ in range(self.gen):
            max_objvalues.append(max(self.tested_xs.values()))
            if last_n_elements_equal(max_objvalues, self.max_rounds_same_objvalue):  # type: ignore
                break

            xs_list = self._get_sorted_list_from_tested_xs()
            xs_list = [xs for xs in xs_list if len(xs) > 0]

            chosen_xs: list[tuple[int, ...]] = []
            # Choose some of the top 3
            for _ in range(self.cpu_count // 4):
                chosen_xs.append(choice([x[1] for x in xs_list if len(x[1]) > 0][:3]))

            # Choose some of the 25% best
            for _ in range(self.cpu_count // 2):
                chosen_xs.append(
                    choice([x[1] for x in xs_list][: round(len(xs_list) * 0.25)])
                )

            # Choose some from the worst 75%
            for _ in range(self.cpu_count // 4):
                addlength = 0
                while not addlength:
                    added_xs = deepcopy(
                        choice([x[1] for x in xs_list][round(len(xs_list) * 0.25) :])
                    )
                    addlength = len(added_xs)
                chosen_xs.append(added_xs)

            # Random crossovers
            for _ in range(round(len(chosen_xs) * 0.2)):
                target = randint(0, len(chosen_xs) - 1)
                source = randint(0, len(chosen_xs) - 1)
                cut = randint(0, self.xs_dim - 1)
                chosen_xs[target] = deepcopy(chosen_xs[target][:cut]) + deepcopy(
                    chosen_xs[source][cut:]
                )

            # Test Xs in parallel
            results = Parallel(n_jobs=-1, verbose=10)(
                delayed(self.update_particle)(
                    chosen_x,
                    count_last_equal_elements(max_objvalues),
                )
                for chosen_x in chosen_xs
            )

            if results is None:
                print("ERROR: Something went wrong during fitness calculations")
                raise ValueError

            # Unpack results
            for fitnesses_and_active_xs, mutated_x in results:
                for fitness, active_x in fitnesses_and_active_xs:
                    if active_x is None or active_x == []:
                        continue
                    self.tested_xs[tuple(active_x)] = fitness
                    self.all_xs[tuple(active_x)] = fitness
                self.all_xs[tuple(mutated_x)] = max(
                    fitness for (fitness, _) in fitnesses_and_active_xs
                )

            if self.objvalue_json_path:
                self.objvalue_json_data[time() - start_time] = sorted(
                    self.tested_xs.values()
                )
                json_write(self.objvalue_json_path, self.objvalue_json_data)

        best_f_and_x = self._get_sorted_list_from_tested_xs()[0]
        return best_f_and_x[0], best_f_and_x[1]

    def update_particle(
        self,
        chosen_x: list[int],
        num_rounds_without_best_change: int,
    ) -> tuple[float, list[int], list[int]]:
        """Updates a single particle by introducing mutations.

        Args:
            chosen_x (list[int]): The current solution represented as a list of integers.
            num_rounds_without_best_change (int): The number of rounds without a change in the
                best fitness score.

        Returns:
            tuple[list[list[float]], list[int]]: A tuple containing a list of fitness scores
            and the mutated solution.
        """
        if not len(chosen_x):
            return [[1_000_000, []]], []

        min_change_p = 0.1 * 0.95**num_rounds_without_best_change
        max_change_p = 0.1 * 1.05**num_rounds_without_best_change
        change_p = uniform(min_change_p, max_change_p)
        change_p = max(0.001, change_p)
        change_p = min(0.999, change_p)

        mutation_tries = 0
        while True:
            mutated_x: list[float] = []
            match randint(0, 2):
                case 0:  # Extend
                    for x in chosen_x:
                        if x == 1:
                            mutated_x.append(1)
                            continue
                        if uniform(0.0, 1.0) < change_p:
                            mutated_x.append(1)
                        else:
                            mutated_x.append(x)
                case 1:  # Decrease
                    for x in chosen_x:
                        if x == 0:
                            mutated_x.append(0)
                            continue
                        if uniform(0.0, 1.0) < change_p:
                            mutated_x.append(0)
                        else:
                            mutated_x.append(x)
                case 2:  # Extend and decrease
                    for x in chosen_x:
                        if x == 1:
                            if uniform(0.0, 1.0) < change_p:
                                mutated_x.append(0)
                            else:
                                mutated_x.append(x)
                        else:
                            if uniform(0.0, 1.0) < change_p:
                                mutated_x.append(1)
                            else:
                                mutated_x.append(x)
                # case 3:  # Random
                #    mutated_x = [randint(0, 1) for _ in range(len(chosen_x))]

            if mutation_tries > 250:
                return [[1_000_000, []]], []
            if tuple(mutated_x) in self.all_xs:
                mutation_tries += 1
            else:
                break

        # Evaluate new position
        fitnesses_and_active_xs = self.fitness_function(mutated_x)

        return fitnesses_and_active_xs, mutated_x

__init__(fitness_function, xs_dim, gen, extra_xs=[], seed=None, objvalue_json_path='', max_rounds_same_objvalue=float('inf'), pop_size=None)

Initializes the COBRAKGENETIC object.

Parameters:

Name Type Description Default
fitness_function Callable

The fitness function to evaluate solutions.

required
xs_dim int

The dimensionality of the search space.

required
gen int

The number of generations to run.

required
extra_xs list[list[int]]

Extra particles to initialize the population. Defaults to [].

[]
seed int | None

Seed for the random number generator. Defaults to None.

None
objvalue_json_path str

Path to a JSON file to store objective values. Defaults to "".

''
max_rounds_same_objvalue float

Maximum rounds with the same objective value before stopping. Defaults to infinity.

float('inf')
pop_size int | None

Population size. Defaults to None.

None
Source code in cobrak/genetic.py
def __init__(
    self,
    fitness_function: Callable[
        [list[float | int]], tuple[float, list[float | int]]
    ],
    xs_dim: int,
    gen: int,
    extra_xs: list[list[int]] = [],
    seed: int | None = None,
    objvalue_json_path: str = "",
    max_rounds_same_objvalue: float = float("inf"),
    pop_size: int | None = None,
) -> None:
    """Initializes the COBRAKGENETIC object.

    Args:
        fitness_function (Callable): The fitness function to evaluate solutions.
        xs_dim (int): The dimensionality of the search space.
        gen (int): The number of generations to run.
        extra_xs (list[list[int]], optional): Extra particles to initialize the population.
            Defaults to [].
        seed (int | None, optional): Seed for the random number generator. Defaults to None.
        objvalue_json_path (str, optional): Path to a JSON file to store objective values.
            Defaults to "".
        max_rounds_same_objvalue (float, optional): Maximum rounds with the same objective
            value before stopping. Defaults to infinity.
        pop_size (int | None, optional): Population size. Defaults to None.
    """
    # Parameters
    self.fitness_function = fitness_function
    self.xs_dim = xs_dim
    self.gen = gen
    self.seed = seed
    if seed is not None:
        np.random.seed(seed)  # noqa: NPY002

    # Initialization of random particles
    cpu_count_value = cpu_count() if pop_size is None else pop_size
    if cpu_count_value is None:
        self.cpu_count = 1
    else:
        self.cpu_count = cpu_count_value
    self.init_xs = [
        [randint(0, 1) for _ in range(xs_dim)]
        for _ in range(self.cpu_count - len(extra_xs))
    ]

    # Addition of user-defined extra particles
    if extra_xs != []:
        self.init_xs.extend(extra_xs)

    self.tested_xs: dict[tuple[int, ...], float] = {}
    self.all_xs: dict[tuple[int, ...], float] = {}
    self.objvalue_json_path = objvalue_json_path
    self.objvalue_json_data: dict[float, list[float]] = {}
    self.max_rounds_same_objvalue = max_rounds_same_objvalue
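The population setup can be sketched standalone (hypothetical sizes; `randint` stands in for the same call in the constructor):

```python
from random import randint

xs_dim, pop_size = 5, 4
extra_xs = [[1, 1, 1, 1, 1]]  # one user-supplied particle

# pop_size - len(extra_xs) random bit vectors, then the extras appended,
# mirroring the initialization in __init__.
init_xs = [
    [randint(0, 1) for _ in range(xs_dim)]
    for _ in range(pop_size - len(extra_xs))
]
init_xs.extend(extra_xs)

assert len(init_xs) == pop_size
assert all(len(x) == xs_dim for x in init_xs)
```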

run()

Runs the genetic algorithm optimization.

Returns:

Type Description
tuple[float, tuple[int, ...]]

A tuple containing the best fitness score and the corresponding solution.

Source code in cobrak/genetic.py
def run(self) -> tuple[float, tuple[int, ...]]:
    """Runs the genetic algorithm optimization.

    Returns:
        tuple[float, tuple[int, ...]]: A tuple containing the best fitness score and the
        corresponding solution.
    """
    init_fitnesses = Parallel(n_jobs=-1)(
        delayed(self.fitness_function)(x) for x in self.init_xs
    )
    if init_fitnesses is not None:
        self.tested_xs = {}
        for init_fitness in init_fitnesses:
            for fitness, xs in init_fitness:
                self.tested_xs[tuple(xs)] = fitness
    else:
        print("ERROR: Something went wrong during initialization")
        raise ValueError

    if self.objvalue_json_path:
        start_time = time()
        self.objvalue_json_data[0.0] = sorted(self.tested_xs.values())
        json_write(self.objvalue_json_path, self.objvalue_json_data)

    # Actual algorithm
    max_objvalues = []
    for _ in range(self.gen):
        max_objvalues.append(max(self.tested_xs.values()))
        if last_n_elements_equal(max_objvalues, self.max_rounds_same_objvalue):  # type: ignore
            break

        xs_list = self._get_sorted_list_from_tested_xs()
        xs_list = [xs for xs in xs_list if len(xs) > 0]

        chosen_xs: list[tuple[int, ...]] = []
        # Choose some of the top 3
        for _ in range(self.cpu_count // 4):
            chosen_xs.append(choice([x[1] for x in xs_list if len(x[1]) > 0][:3]))

        # Choose some of the 25% best
        for _ in range(self.cpu_count // 2):
            chosen_xs.append(
                choice([x[1] for x in xs_list][: round(len(xs_list) * 0.25)])
            )

        # Choose some from the worst 75%
        for _ in range(self.cpu_count // 4):
            addlength = 0
            while not addlength:
                added_xs = deepcopy(
                    choice([x[1] for x in xs_list][round(len(xs_list) * 0.25) :])
                )
                addlength = len(added_xs)
            chosen_xs.append(added_xs)

        # Random crossovers
        for _ in range(round(len(chosen_xs) * 0.2)):
            target = randint(0, len(chosen_xs) - 1)
            source = randint(0, len(chosen_xs) - 1)
            cut = randint(0, self.xs_dim - 1)
            chosen_xs[target] = deepcopy(chosen_xs[target][:cut]) + deepcopy(
                chosen_xs[source][cut:]
            )

        # Test Xs in parallel
        results = Parallel(n_jobs=-1, verbose=10)(
            delayed(self.update_particle)(
                chosen_x,
                count_last_equal_elements(max_objvalues),
            )
            for chosen_x in chosen_xs
        )

        if results is None:
            print("ERROR: Something went wrong during fitness calculations")
            raise ValueError

        # Unpack results
        for fitnesses_and_active_xs, mutated_x in results:
            for fitness, active_x in fitnesses_and_active_xs:
                if active_x is None or active_x == []:
                    continue
                self.tested_xs[tuple(active_x)] = fitness
                self.all_xs[tuple(active_x)] = fitness
            self.all_xs[tuple(mutated_x)] = max(
                fitness for (fitness, _) in fitnesses_and_active_xs
            )

        if self.objvalue_json_path:
            self.objvalue_json_data[time() - start_time] = sorted(
                self.tested_xs.values()
            )
            json_write(self.objvalue_json_path, self.objvalue_json_data)

    best_f_and_x = self._get_sorted_list_from_tested_xs()[0]
    return best_f_and_x[0], best_f_and_x[1]
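The single-point crossover step inside run() can be isolated as follows (a sketch with hypothetical four-bit solutions):

```python
from random import randint, seed

seed(0)  # reproducible sketch
chosen_xs = [(1, 0, 1, 0), (0, 1, 0, 1)]
xs_dim = 4

# Splice a prefix of the target solution onto the suffix of the source
# solution at a random cut point, as in the crossover loop above.
target = randint(0, len(chosen_xs) - 1)
source = randint(0, len(chosen_xs) - 1)
cut = randint(0, xs_dim - 1)
child = tuple(chosen_xs[target][:cut]) + tuple(chosen_xs[source][cut:])

assert len(child) == xs_dim
assert set(child) <= {0, 1}
```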

update_particle(chosen_x, num_rounds_without_best_change)

Updates a single particle by introducing mutations.

Parameters:

Name Type Description Default
chosen_x list[int]

The current solution represented as a list of integers.

required
num_rounds_without_best_change int

The number of rounds without a change in the best fitness score.

required

Returns:

Type Description
tuple[list[list[float]], list[int]]

A tuple containing a list of fitness scores and the mutated solution.

Source code in cobrak/genetic.py
def update_particle(
    self,
    chosen_x: list[int],
    num_rounds_without_best_change: int,
) -> tuple[float, list[int], list[int]]:
    """Updates a single particle by introducing mutations.

    Args:
        chosen_x (list[int]): The current solution represented as a list of integers.
        num_rounds_without_best_change (int): The number of rounds without a change in the
            best fitness score.

    Returns:
        tuple[list[list[float]], list[int]]: A tuple containing a list of fitness scores
        and the mutated solution.
    """
    if not len(chosen_x):
        return [[1_000_000, []]], []

    min_change_p = 0.1 * 0.95**num_rounds_without_best_change
    max_change_p = 0.1 * 1.05**num_rounds_without_best_change
    change_p = uniform(min_change_p, max_change_p)
    change_p = max(0.001, change_p)
    change_p = min(0.999, change_p)

    mutation_tries = 0
    while True:
        mutated_x: list[float] = []
        match randint(0, 2):
            case 0:  # Extend
                for x in chosen_x:
                    if x == 1:
                        mutated_x.append(1)
                        continue
                    if uniform(0.0, 1.0) < change_p:
                        mutated_x.append(1)
                    else:
                        mutated_x.append(x)
            case 1:  # Decrease
                for x in chosen_x:
                    if x == 0:
                        mutated_x.append(0)
                        continue
                    if uniform(0.0, 1.0) < change_p:
                        mutated_x.append(0)
                    else:
                        mutated_x.append(x)
            case 2:  # Extend and decrease
                for x in chosen_x:
                    if x == 1:
                        if uniform(0.0, 1.0) < change_p:
                            mutated_x.append(0)
                        else:
                            mutated_x.append(x)
                    else:
                        if uniform(0.0, 1.0) < change_p:
                            mutated_x.append(1)
                        else:
                            mutated_x.append(x)
            # case 3:  # Random
            #    mutated_x = [randint(0, 1) for _ in range(len(chosen_x))]

        if mutation_tries > 250:
            return [[1_000_000, []]], []
        if tuple(mutated_x) in self.all_xs:
            mutation_tries += 1
        else:
            break

    # Evaluate new position
    fitnesses_and_active_xs = self.fitness_function(mutated_x)

    return fitnesses_and_active_xs, mutated_x
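The mutation probability above is drawn from a window that widens the longer the best fitness stagnates. A minimal, self-contained sketch of that sampling (plain Python, no COBRAk import needed):

```python
from random import uniform

def change_probability(num_rounds_without_best_change: int) -> float:
    """Sample a mutation probability from a window that widens with stagnation."""
    min_p = 0.1 * 0.95**num_rounds_without_best_change
    max_p = 0.1 * 1.05**num_rounds_without_best_change
    p = uniform(min_p, max_p)
    return min(0.999, max(0.001, p))  # clamp into [0.001, 0.999]

p0 = change_probability(0)    # degenerate window [0.1, 0.1] -> exactly 0.1
p20 = change_probability(20)  # window roughly [0.036, 0.265]
print(p0, p20)
```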

io

General (COBRAk-independent) helper functions, primarily for I/O tasks such as pickle and JSON file handling.

convert_cobrak_model_to_annotated_cobrapy_model(cobrak_model, combine_base_reactions=False, add_enzyme_constraints=False)

Converts a COBRAk model to an annotated COBRApy model.

This function takes a COBRAk model and converts it to a COBRApy model, adding annotations and constraints as specified by the input parameters.

The function adds the following annotation keys to the COBRApy model:

  • cobrak_Cmin: The minimum concentration of a metabolite.
  • cobrak_Cmax: The maximum concentration of a metabolite.
  • cobrak_id_<version>: The ID of the reaction in the COBRAk model.
  • cobrak_dG0_<version>: The standard Gibbs free energy change of a reaction.
  • cobrak_dG0_uncertainty_<version>: The uncertainty of the standard Gibbs free energy change of a reaction.
  • cobrak_k_cat_<version>: The turnover number of an enzyme.
  • cobrak_k_ms_<version>: The Michaelis constant of an enzyme.
  • cobrak_k_is_<version>: The inhibition constant of an enzyme.
  • cobrak_k_as_<version>: The activation constant of an enzyme.
  • cobrak_special_stoichiometries_<version>: Special stoichiometries of a reaction.
  • cobrak_max_prot_pool: The maximum protein pool size.
  • cobrak_R: The gas constant.
  • cobrak_T: The temperature.
  • cobrak_kinetic_ignored_metabolites: A list of metabolites that are ignored in kinetic simulations.
  • cobrak_extra_linear_constraints: A list of extra linear constraints.
  • cobrak_mw: The molecular weight of an enzyme.
  • cobrak_min_conc: The minimum concentration of an enzyme.
  • cobrak_max_conc: The maximum concentration of an enzyme.

The conversion process also involves the merging of forward and reverse reactions, as well as isomeric alternatives, into a single reaction in the COBRApy model. When the combine_base_reactions parameter is set to True, the function combines these reactions into a single entity while still preserving the unique characteristics of each original reaction. To achieve this, the function uses a versioning system, denoted by the <version> suffix, to differentiate between the annotations of the original reactions. For example, the cobrak_id_<version> annotation key will contain the ID of the original reaction, with <version> indicating whether it corresponds to the forward or reverse direction, or an isomeric alternative. This versioning system allows the model to retain the distinct properties of each reaction, such as their standard Gibbs free energy changes or enzyme kinetics, while still representing them as a single, unified reaction. The <version> suffix can take on values such as V0, V1, etc., with each value corresponding to a specific original reaction.
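
For illustration, the versioned annotation layout might look as follows for a reaction merged from a forward and a reverse variant (all IDs and ΔG'° values here are hypothetical, not taken from a real model):

```python
# Hypothetical annotation dict of a merged COBRApy reaction "PGI",
# combined from the COBRAk variants "PGI_FWD" (V0) and "PGI_REV" (V1).
merged_annotation = {
    "cobrak_id_V0": "PGI_FWD",   # version V0: forward direction
    "cobrak_dG0_V0": -2.5,
    "cobrak_id_V1": "PGI_REV",   # version V1: reverse direction
    "cobrak_dG0_V1": 2.5,
}

# Recover the original variant IDs from the versioned keys:
variant_ids = [
    value
    for key, value in merged_annotation.items()
    if key.startswith("cobrak_id_")
]
print(variant_ids)  # ['PGI_FWD', 'PGI_REV']
```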

The conversion of a COBRAk model to a COBRApy model also includes the optional direct addition of enzyme constraints in the style of GECKO [1] (or expanded sMOMENT [2]), which can be enabled through the add_enzyme_constraints parameter. When this parameter is set to True, the function introduces new pseudo-metabolites and pseudo-reactions to the model, allowing for the simulation of enzyme kinetics and protein expression. Specifically, a protein pool pseudo-metabolite is added, which represents the total amount of protein available in the system. Additionally, pseudo-reactions are created to deliver enzymes to the protein pool, taking into account the molecular weight and concentration of each enzyme. The function also adds pseudo-reactions to form enzyme complexes, which are essential for simulating the k_cat-based kinetics of enzymatic reactions.

[1] https://doi.org/10.15252/msb.20167411 [2] https://doi.org/10.1186/s12859-019-3329-9
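
The enzyme-constraint bookkeeping can be sketched without COBRApy: an enzyme delivery flux drains the protein pool in proportion to the enzyme's molecular weight, and a catalyzed reaction drains its enzyme (complex) at 1/k_cat per flux unit. All numbers below are hypothetical:

```python
# Illustrative GECKO-style bookkeeping (all numbers hypothetical).
max_prot_pool = 0.5      # g * gDW^-1, total protein pool
molecular_weight = 40.0  # g * mmol^-1, enzyme E
k_cat = 100.0            # turnovers per enzyme unit and hour

# Maximum enzyme amount deliverable from the pool (mmol * gDW^-1):
max_enzyme_conc = max_prot_pool / molecular_weight

# A flux v consumes v / k_cat units of enzyme, so the
# enzyme-limited maximum flux is:
max_flux = max_enzyme_conc * k_cat
print(max_flux)  # about 1.25 mmol * gDW^-1 * h^-1
```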

Parameters

cobrak_model : Model
    The COBRAk model to be converted.
combine_base_reactions : bool, optional
    Whether to combine base reactions into a single reaction (default: False).
add_enzyme_constraints : bool, optional
    Whether to add enzyme constraints to the model (default: False).

Returns

cobra.Model
    The converted COBRApy model.

Raises

ValueError
    If combine_base_reactions and add_enzyme_constraints are both True.

Source code in cobrak/io.py
@validate_call
def convert_cobrak_model_to_annotated_cobrapy_model(
    cobrak_model: Model,
    combine_base_reactions: bool = False,
    add_enzyme_constraints: bool = False,
) -> cobra.Model:
    """Converts a COBRAk model to an annotated COBRApy model.

    This function takes a COBRAk model and converts it to a COBRApy model,
    adding annotations and constraints as specified by the input parameters.

    The function adds the following annotation keys to the COBRApy model:

    * `cobrak_Cmin`: The minimum concentration of a metabolite.
    * `cobrak_Cmax`: The maximum concentration of a metabolite.
    * `cobrak_id_<version>`: The ID of the reaction in the COBRAk model.
    * `cobrak_dG0_<version>`: The standard Gibbs free energy change of a reaction.
    * `cobrak_dG0_uncertainty_<version>`: The uncertainty of the standard Gibbs free energy change of a reaction.
    * `cobrak_k_cat_<version>`: The turnover number of an enzyme.
    * `cobrak_k_ms_<version>`: The Michaelis constant of an enzyme.
    * `cobrak_k_is_<version>`: The inhibition constant of an enzyme.
    * `cobrak_k_as_<version>`: The activation constant of an enzyme.
    * `cobrak_special_stoichiometries_<version>`: Special stoichiometries of a reaction.
    * `cobrak_max_prot_pool`: The maximum protein pool size.
    * `cobrak_R`: The gas constant.
    * `cobrak_T`: The temperature.
    * `cobrak_kinetic_ignored_metabolites`: A list of metabolites that are ignored in kinetic simulations.
    * `cobrak_extra_linear_constraints`: A list of extra linear constraints.
    * `cobrak_mw`: The molecular weight of an enzyme.
    * `cobrak_min_conc`: The minimum concentration of an enzyme.
    * `cobrak_max_conc`: The maximum concentration of an enzyme.

    The conversion process also involves the merging of forward and reverse reactions, as well as isomeric alternatives,
    into a single reaction in the COBRApy model. When the combine_base_reactions parameter is set to True,
    the function combines these reactions into a single entity, while still preserving the unique characteristics
    of each original reaction. To achieve this, the function uses a versioning system, denoted by the <version> suffix,
    to differentiate between the annotations of the original reactions. For example, the cobrak_id_<version> annotation
    key will contain the ID of the original reaction, with <version> indicating whether it corresponds to the forward or
    reverse direction, or an isomeric alternative. This versioning system allows the model to retain the distinct properties
    of each reaction, such as their standard Gibbs free energy changes or enzyme kinetics, while still representing them as
    a single, unified reaction. The <version> suffix can take on values such as V0, V1, etc., with each value corresponding
    to a specific original reaction.

    The conversion of a COBRAk model to a COBRApy model also includes the optional direct addition of enzyme constraints
    in the style of GECKO [1] (or expanded sMOMENT [2]),
    which can be enabled through the add_enzyme_constraints parameter. When this parameter is set to True,
    the function introduces new pseudo-metabolites and pseudo-reactions to the model, allowing for the simulation
    of enzyme kinetics and protein expression. Specifically, a protein pool pseudo-metabolite is added, which
    represents the total amount of protein available in the system. Additionally, pseudo-reactions are created
    to deliver enzymes to the protein pool, taking into account the molecular weight and concentration of each enzyme.
    The function also adds pseudo-reactions to form enzyme complexes, which are essential for simulating the k_cat-based kinetics
    of enzymatic reactions.

    [1] https://doi.org/10.15252/msb.20167411
    [2] https://doi.org/10.1186/s12859-019-3329-9

    Parameters
    ----------
    cobrak_model : Model
        The COBRAk model to be converted.
    combine_base_reactions : bool, optional
        Whether to combine base reactions into a single reaction (default: False).
    add_enzyme_constraints : bool, optional
        Whether to add enzyme constraints to the model (default: False).

    Returns
    -------
    cobra.Model
        The converted COBRApy model.

    Raises
    ------
    ValueError
        If combine_base_reactions and add_enzyme_constraints are both True.
    """
    cobrak_model = deepcopy(cobrak_model)
    if combine_base_reactions and add_enzyme_constraints:
        print(
            "ERROR: Stoichiometric enzyme constraints do not work with combined base reactions\n"
            "       as for these enzyme constraints, reactions must remain irreversible."
        )
        raise ValueError

    cobra_model = cobra.Model()

    # Add metabolites
    added_metabolites: list[cobra.Metabolite] = []
    for met_id, met_data in cobrak_model.metabolites.items():
        cobra_metabolite: cobra.Metabolite = cobra.Metabolite(
            id=met_id,
            compartment=met_id.split("_")[-1] if "_" in met_id else "c",
            name=met_data.name,
            formula=met_data.formula,
            charge=met_data.charge,
        )
        cobra_metabolite.annotation = met_data.annotation

        # Add full annotation
        cobra_metabolite.annotation["cobrak_Cmin"] = exp(met_data.log_min_conc)
        cobra_metabolite.annotation["cobrak_Cmax"] = exp(met_data.log_max_conc)
        if met_data.smiles:
            cobra_metabolite.annotation["cobrak_smiles"] = met_data.smiles
        if met_data.compartment:
            cobra_metabolite.compartment = met_data.compartment
        if met_data.molar_mass:
            cobra_metabolite.annotation["cobrak_molar_mass"] = met_data.molar_mass

        added_metabolites.append(cobra_metabolite)
    cobra_model.add_metabolites(added_metabolites)

    if add_enzyme_constraints:
        enzyme_reacs: list[cobra.Reaction] = []

        # If set: Add protein pool reaction (flux in g⋅gDW⁻¹)
        prot_pool_met = cobra.Metabolite(id="prot_pool", compartment="c")
        cobra_model.add_metabolites([prot_pool_met])

        prot_pool_reac = cobra.Reaction(
            "prot_pool_delivery",
            lower_bound=0.0,
            upper_bound=cobrak_model.max_prot_pool,
        )
        prot_pool_reac.add_metabolites(
            {
                prot_pool_met: 1.0,
            }
        )
        enzyme_reacs.append(prot_pool_reac)

        # If set: Add enzyme concentration delivery reactions (flux in mmol⋅gDW⁻¹)
        for enzyme_id, enzyme_data in cobrak_model.enzymes.items():
            enzyme_met = cobra.Metabolite(id=enzyme_id, compartment="c")

            lower_bound = (
                enzyme_data.min_conc if enzyme_data.min_conc is not None else 0.0
            )
            upper_bound = (
                enzyme_data.max_conc if enzyme_data.max_conc is not None else 100_000.0
            )
            enzyme_reac = cobra.Reaction(
                id="enzyme_delivery_" + enzyme_id,
                lower_bound=lower_bound,
                upper_bound=upper_bound,
            )
            enzyme_reac.add_metabolites(
                {
                    prot_pool_met: -enzyme_data.molecular_weight,
                    enzyme_met: 1.0,
                }
            )
            enzyme_reacs.append(enzyme_reac)

        cobra_model.add_reactions(enzyme_reacs)

    # If set: Add enzyme complex metabolites and delivery reactions
    if add_enzyme_constraints:
        enzyme_complexes: set[tuple[str, ...]] = {
            tuple(
                enzyme_id
                for enzyme_id in reaction_data.enzyme_reaction_data.identifiers
            )
            for reaction_data in cobrak_model.reactions.values()
            if reaction_data.enzyme_reaction_data is not None
        }
        complex_reacs: list[cobra.Reaction] = []
        for enzyme_complex in enzyme_complexes:
            if len(enzyme_complex) <= 1:
                continue
            if enzyme_complex == ("",):
                continue
            complex_met = cobra.Metabolite(id="_".join(enzyme_complex), compartment="c")
            complex_reac = cobra.Reaction(
                id="complex_delivery_" + "_".join(enzyme_complex),
                lower_bound=0.0,
                upper_bound=10_000.0,
            )
            complex_reac.add_metabolites(
                {
                    cobra_model.metabolites.get_by_id(enzyme_id): -1
                    for enzyme_id in enzyme_complex
                    if enzyme_id
                }
            )
            complex_reac.add_metabolites(
                {
                    complex_met: 1.0,
                }
            )
            complex_reacs.append(complex_reac)
        cobra_model.add_reactions(complex_reacs)

    # Add reactions
    added_reactions: list[cobra.Reaction] = []
    if not combine_base_reactions:
        for reac_id, reac_data in cobrak_model.reactions.items():
            cobra_reaction = cobra.Reaction(
                id=reac_id,
                lower_bound=reac_data.min_flux,
                upper_bound=reac_data.max_flux,
                name=reac_data.name,
            )
            cobra_reaction.add_metabolites(
                {
                    cobra_model.metabolites.get_by_id(met_id): stoich
                    for met_id, stoich in reac_data.stoichiometries.items()
                }
            )

            # Add full annotation
            _add_annotation_to_cobra_reaction(cobra_reaction, reac_id, reac_data, "V0")

            if (
                reac_data.enzyme_reaction_data is not None
                and add_enzyme_constraints
                and reac_data.enzyme_reaction_data.identifiers != []
            ):
                complex_met_id = "_".join(reac_data.enzyme_reaction_data.identifiers)
                if complex_met_id:
                    cobra_reaction.add_metabolites(
                        {
                            cobra_model.metabolites.get_by_id(complex_met_id): -1
                            / reac_data.enzyme_reaction_data.k_cat
                        }
                    )
            added_reactions.append(cobra_reaction)
    else:
        base_id_to_reac_ids: dict[str, list[str]] = {}
        for reac_id in cobrak_model.reactions:
            base_id = get_base_id(
                reac_id,
                cobrak_model.fwd_suffix,
                cobrak_model.rev_suffix,
                cobrak_model.reac_enz_separator,
            )
            if base_id not in base_id_to_reac_ids:
                base_id_to_reac_ids[base_id] = []
            base_id_to_reac_ids[base_id].append(reac_id)

        for base_id, reac_ids in base_id_to_reac_ids.items():
            rev_ids = [
                reac_id
                for reac_id in reac_ids
                if reac_id.endswith(cobrak_model.rev_suffix)
            ]
            fwd_ids = [
                reac_id
                for reac_id in reac_ids
                if not reac_id.endswith(cobrak_model.rev_suffix)
            ]

            if len(rev_ids) > 0:
                min_flux = -max(
                    cobrak_model.reactions[rev_id].max_flux for rev_id in rev_ids
                )
                name = cobrak_model.reactions[rev_ids[0]].name
            else:
                min_flux = max(
                    cobrak_model.reactions[fwd_id].min_flux for fwd_id in fwd_ids
                )
            if len(fwd_ids) > 0:
                max_flux = max(
                    cobrak_model.reactions[fwd_id].max_flux for fwd_id in fwd_ids
                )
                met_stoichiometries = {
                    cobra_model.metabolites.get_by_id(met_id): stoich
                    for met_id, stoich in cobrak_model.reactions[
                        fwd_ids[0]
                    ].stoichiometries.items()
                }
                name = cobrak_model.reactions[fwd_ids[0]].name
            else:
                max_flux = min(
                    cobrak_model.reactions[rev_id].max_flux for rev_id in rev_ids
                )
                met_stoichiometries = {
                    cobra_model.metabolites.get_by_id(met_id): -stoich
                    for met_id, stoich in cobrak_model.reactions[
                        rev_ids[0]
                    ].stoichiometries.items()
                }

            cobra_reaction = cobra.Reaction(
                id=base_id,
                lower_bound=min_flux,
                upper_bound=max_flux,
                name=name,
            )
            cobra_reaction.add_metabolites(met_stoichiometries)
            for number, reac_id in enumerate(reac_ids):
                version = f"V{number}"
                reac_data = cobrak_model.reactions[reac_id]
                _add_annotation_to_cobra_reaction(
                    cobra_reaction, reac_id, reac_data, version
                )

            added_reactions.append(cobra_reaction)

    # Add global information reaction
    added_reactions.append(
        cobra.Reaction(
            id="cobrak_global_settings",
            lower_bound=0.0,
            upper_bound=0.0,
        )
    )
    added_reactions[-1].annotation["cobrak_max_prot_pool"] = cobrak_model.max_prot_pool
    added_reactions[-1].annotation["cobrak_R"] = cobrak_model.R
    added_reactions[-1].annotation["cobrak_T"] = cobrak_model.T
    added_reactions[-1].annotation["cobrak_kinetic_ignored_metabolites"] = str(
        cobrak_model.kinetic_ignored_metabolites
    )
    added_reactions[-1].annotation["cobrak_reac_rev_suffix"] = cobrak_model.rev_suffix
    added_reactions[-1].annotation["cobrak_reac_fwd_suffix"] = cobrak_model.fwd_suffix
    added_reactions[-1].annotation["cobrak_reac_enz_separator"] = (
        cobrak_model.reac_enz_separator
    )
    added_reactions[-1].annotation["cobrak_extra_linear_constraints"] = str(
        [asdict(x) for x in cobrak_model.extra_linear_constraints]
    )
    added_reactions[-1].annotation["cobrak_extra_linear_watches"] = str(
        {key: asdict(value) for key, value in cobrak_model.extra_linear_watches.items()}
    )
    added_reactions[-1].annotation["cobrak_extra_nonlinear_constraints"] = str(
        [asdict(x) for x in cobrak_model.extra_nonlinear_constraints]
    )
    added_reactions[-1].annotation["cobrak_extra_nonlinear_watches"] = str(
        {
            key: asdict(value)
            for key, value in cobrak_model.extra_nonlinear_watches.items()
        }
    )

    cobra_model.add_reactions(added_reactions)

    gene_ids = [x.id for x in cobra_model.genes]
    for enzyme_id, enzyme_data in cobrak_model.enzymes.items():
        if enzyme_id not in gene_ids:
            cobra_model.genes.append(cobra.Gene(enzyme_id, name=enzyme_data.name))
        gene = cobra_model.genes.get_by_id(enzyme_id)
        gene.annotation["cobrak_mw"] = enzyme_data.molecular_weight
        if enzyme_data.min_conc is not None:
            gene.annotation["cobrak_min_conc"] = enzyme_data.min_conc
        if enzyme_data.max_conc is not None:
            gene.annotation["cobrak_max_conc"] = enzyme_data.max_conc
        for key, text in enzyme_data.annotation.items():
            gene.annotation[key] = text
        gene.annotation["cobrak_sequence"] = enzyme_data.sequence

    return cobra_model

ensure_folder_existence(folder)

Checks if the given folder exists. If not, the folder is created.

Argument
  • folder: str ~ The folder whose existence shall be enforced.
Source code in cobrak/io.py
@validate_call
def ensure_folder_existence(folder: str) -> None:
    """Checks if the given folder exists. If not, the folder is created.

    Argument
    ----------
    * folder: str ~ The folder whose existence shall be enforced.
    """
    if os.path.isdir(folder):
        return
    with contextlib.suppress(FileExistsError):
        os.makedirs(folder)

ensure_json_existence(path)

Ensures that a JSON file exists at the specified path.

If the file does not exist, it creates an empty JSON file with "{}" as its content.

Parameters:

Name Type Description Default
path str

The file path where the JSON file should exist.

required
Source code in cobrak/io.py
@validate_call
def ensure_json_existence(path: str) -> None:
    """Ensures that a JSON file exists at the specified path.

    If the file does not exist, it creates an empty JSON file with "{}" as its content.

    Args:
        path (str): The file path where the JSON file should exist.
    """
    if os.path.isfile(path):
        return
    with open(path, "w", encoding="utf-8") as f:  # noqa: FURB103
        f.write("{}")
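
The contract can be exercised with a temporary directory; the function is re-stated here in plain stdlib Python rather than imported from cobrak.io:

```python
import os
import tempfile

def ensure_json_existence(path: str) -> None:
    """Same logic as cobrak.io.ensure_json_existence (re-stated, not imported)."""
    if os.path.isfile(path):
        return
    with open(path, "w", encoding="utf-8") as f:
        f.write("{}")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")

    ensure_json_existence(path)  # file missing: it is created with "{}"
    with open(path, encoding="utf-8") as f:
        created = f.read()

    with open(path, "w", encoding="utf-8") as f:
        f.write('{"a": 1}')
    ensure_json_existence(path)  # file exists: left untouched
    with open(path, encoding="utf-8") as f:
        untouched = f.read()

print(created, untouched)  # {} {"a": 1}
```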

get_base_id(reac_id, fwd_suffix=REAC_FWD_SUFFIX, rev_suffix=REAC_REV_SUFFIX, reac_enz_separator=REAC_ENZ_SEPARATOR)

Extract the base ID from a reaction ID by removing specified suffixes and separators.

Processes a reaction ID to remove forward and reverse suffixes as well as any enzyme separators, to obtain the base reaction ID.

Parameters:

Name Type Description Default
reac_id str

The reaction ID to be processed.

required
fwd_suffix str

The suffix indicating forward reactions. Defaults to REAC_FWD_SUFFIX.

REAC_FWD_SUFFIX
rev_suffix str

The suffix indicating reverse reactions. Defaults to REAC_REV_SUFFIX.

REAC_REV_SUFFIX
reac_enz_separator str

The separator used between reaction and enzyme identifiers. Defaults to REAC_ENZ_SEPARATOR.

REAC_ENZ_SEPARATOR

Returns:

Name Type Description
str str

The base reaction ID with specified suffixes and separators removed.

Source code in cobrak/io.py
@validate_call
def get_base_id(
    reac_id: str,
    fwd_suffix: str = REAC_FWD_SUFFIX,
    rev_suffix: str = REAC_REV_SUFFIX,
    reac_enz_separator: str = REAC_ENZ_SEPARATOR,
) -> str:
    """Extract the base ID from a reaction ID by removing specified suffixes and separators.

    Processes a reaction ID to remove forward and reverse suffixes
    as well as any enzyme separators, to obtain the base reaction ID.

    Args:
        reac_id (str): The reaction ID to be processed.
        fwd_suffix (str, optional): The suffix indicating forward reactions. Defaults to REAC_FWD_SUFFIX.
        rev_suffix (str, optional): The suffix indicating reverse reactions. Defaults to REAC_REV_SUFFIX.
        reac_enz_separator (str, optional): The separator used between reaction and enzyme identifiers. Defaults to REAC_ENZ_SEPARATOR.

    Returns:
        str: The base reaction ID with specified suffixes and separators removed.
    """
    reac_id_split = reac_id.split(reac_enz_separator)
    return (
        (reac_id_split[0] + "\b")
        .replace(f"{fwd_suffix}\b", "")
        .replace(f"{rev_suffix}\b", "")
        .replace("\b", "")
    )
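
The appended `\b` sentinel anchors the suffix replacement at the end of the ID, so a suffix occurring mid-ID is left untouched. Re-stating the logic with illustrative suffix defaults (the real defaults come from cobrak's constants):

```python
def get_base_id(
    reac_id: str,
    fwd_suffix: str = "_FWD",  # illustrative defaults, not COBRAk's actual constants
    rev_suffix: str = "_REV",
    reac_enz_separator: str = "_ENZ_",
) -> str:
    # Drop any enzyme part, then append a sentinel so that the
    # suffix replacement only matches at the very end of the ID.
    reac_id_split = reac_id.split(reac_enz_separator)
    return (
        (reac_id_split[0] + "\b")
        .replace(f"{fwd_suffix}\b", "")
        .replace(f"{rev_suffix}\b", "")
        .replace("\b", "")
    )

print(get_base_id("PGI_FWD"))            # PGI
print(get_base_id("PGI_REV_ENZ_b4025"))  # PGI
print(get_base_id("A_FWD_B"))            # A_FWD_B (suffix not at end: kept)
```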

get_files(path)

Returns the names of the files in the given folder as a list of strings.

Arguments
  • path: str ~ The path to the folder of which the file names shall be returned
Source code in cobrak/io.py
@validate_call
def get_files(path: str) -> list[str]:
    """Returns the names of the files in the given folder as a list of strings.

    Arguments
    ----------
    * path: str ~ The path to the folder of which the file names shall be returned
    """
    files: list[str] = []
    for _, _, filenames in os.walk(path):
        files.extend(filenames)
    return files

get_folders(path)

Returns the names of the folders in the given folder as a list of strings.

Arguments
  • path: str ~ The path to the folder whose folders shall be returned
Source code in cobrak/io.py
@validate_call
def get_folders(path: str) -> list[str]:
    """Returns the names of the folders in the given folder as a list of strings.

    Arguments
    ----------
    * path: str ~ The path to the folder whose folders shall be returned
    """
    return [
        folder
        for folder in os.listdir(path)
        if os.path.isdir(os.path.join(path, folder))
    ]

gzip_load_file(filepath, remove_newlines=True)

Loads a gzipped file and returns its content as a list of strings,
where each string is a line from the file.

This function uses gzip.open in text mode ('rt') and readlines() to efficiently
load the file content.

Args:
    filepath: The path to the compressed (.tsv.gz) file.
    remove_newlines: Whether newlines (\n) shall be removed. Defaults to True.

Returns:
    A list of strings, where each string is a line from the file.
    Returns an empty list if the file is not found or an error occurs.
Source code in cobrak/io.py
@validate_call
def gzip_load_file(filepath: str, remove_newlines: bool = True) -> list[str]:
    """
    Loads a gzipped file and returns its content as a list of strings,
    where each string is a line from the file.

    This function uses gzip.open in text mode ('rt') and readlines() to efficiently
    load the file content.

    Args:
        filepath: The path to the compressed (.tsv.gz) file.
        remove_newlines: Whether newlines (\n) shall be removed. Defaults to True.

    Returns:
        A list of strings, where each string is a line from the file.
        Returns an empty list if the file is not found or an error occurs.
    """
    lines: list[str] = []
    try:
        # Open the gzipped file in read text mode ('rt')
        # This allows reading line by line directly without manual decompression.
        with gzip.open(filepath, "rt", encoding="utf-8") as f:
            # Read all lines into a list
            lines = f.readlines()
        print(f"Successfully loaded {len(lines)} lines from '{filepath}'.")
    except FileNotFoundError:
        print(f"Error: Gzipped file '{filepath}' not found.")
    except Exception as e:
        print(f"An error occurred while loading '{filepath}': {e}")

    if remove_newlines:
        return [x.replace("\n", "") for x in lines]
    return lines

gzip_write_file(filepath, lines)

Writes a gzipped (.gz) file from the given content.

Parameters:

Name Type Description Default
filepath str

The path of the gzipped file (.gz ending has to be added)

required
lines list[str]

A list of strings with the file content. Newlines have to be added.

required
Source code in cobrak/io.py
@validate_call
def gzip_write_file(filepath: str, lines: list[str]) -> None:
    """Writes a gzipped (.gz) file from the given content.

    Args:
        filepath: The path of the gzipped file (.gz ending has to be added)
        lines: A list of strings with the file content. Newlines have to be added.
    """
    with gzip.open(filepath, "wt", encoding="utf-8") as f_out:
        f_out.writelines(lines)
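
A round trip of the two gzip helpers, re-stated here with the same stdlib gzip.open calls as in the sources above:

```python
import gzip
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "table.tsv.gz")

    # Write: lines must already carry their newlines (as documented above).
    with gzip.open(path, "wt", encoding="utf-8") as f_out:
        f_out.writelines(["a\t1\n", "b\t2\n"])

    # Read back and strip newlines, mirroring remove_newlines=True:
    with gzip.open(path, "rt", encoding="utf-8") as f_in:
        lines = [x.replace("\n", "") for x in f_in.readlines()]

print(lines)  # ['a\t1', 'b\t2']
```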

json_load(path, dataclass_type=Any)

Load JSON data from a file and validate it against a specified dataclass type.

This function reads the content of a JSON file located at the given path, parses it, and validates the parsed data against the provided dataclass_type. If the data is valid according to the dataclass schema, it returns an instance of the dataclass populated with the data. Otherwise, it raises an exception.

Parameters:

path : str
    The file path to the JSON file that needs to be loaded.

dataclass_type : Type[T]
    A dataclass type against which the JSON data should be validated and deserialized.

Returns:

T
    An instance of the specified dataclass_type populated with the data from the JSON file.

Raises:

JSONDecodeError
    If the content of the file is not a valid JSON string.

ValidationError
    If the parsed JSON data does not conform to the schema defined by dataclass_type.

Examples:

>>> @dataclass
... class Person:
...     name: str
...     age: int

>>> person = json_load('person.json', Person)
>>> print(person.name, person.age)
John Doe 30

Source code in cobrak/io.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def json_load(path: str, dataclass_type: T = Any) -> T:
    """Load JSON data from a file and validate it against a specified dataclass type.

    This function reads the content of a JSON file located at the given `path`, parses it,
    and validates the parsed data against the provided `dataclass_type`. If the data is valid
    according to the dataclass schema, it returns an instance of the dataclass populated with
    the data. Otherwise, it raises an exception.

    Parameters:
    ----------
    path : str
        The file path to the JSON file that needs to be loaded.

    dataclass_type : Type[T]
        A dataclass type against which the JSON data should be validated and deserialized.

    Returns:
    -------
    T
        An instance of the specified `dataclass_type` populated with the data from the JSON file.

    Raises:
    ------
    JSONDecodeError
        If the content of the file is not a valid JSON string.

    ValidationError
        If the parsed JSON data does not conform to the schema defined by `dataclass_type`.

    Examples:
    --------
    >>> @dataclass
    ... class Person:
    ...     name: str
    ...     age: int

    >>> person = json_load('person.json', Person)
    >>> print(person.name, person.age)
    John Doe 30
    """
    with open(path, encoding="utf-8") as f:  # noqa: FURB101
        data = f.read()

    return TypeAdapter(dataclass_type).validate_json(data)

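The load-and-validate pattern above can be sketched with only the standard library. Note that COBRAk's real json_load delegates validation to pydantic's TypeAdapter; this sketch (with a hypothetical Person dataclass and json_load_sketch helper) only filters the parsed dict down to the dataclass's fields:

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def json_load_sketch(path: str, dataclass_type):
    # Minimal stand-in for cobrak's json_load: read the file, parse the
    # JSON, and build the dataclass from the resulting dict. Unlike the
    # real function, no type coercion or schema validation is performed
    # beyond what the dataclass constructor itself enforces.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    field_names = {f.name for f in fields(dataclass_type)}
    return dataclass_type(**{k: v for k, v in data.items() if k in field_names})
```

Given a file person.json containing `{"name": "John Doe", "age": 30}`, `json_load_sketch("person.json", Person)` yields a populated Person instance, mirroring the doctest above.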
json_write(path, json_data)

Writes a JSON file at the given path with the given data as content.

Can also be used with any of COBRAk's dataclasses, as well as with any dictionary of the form dict[str, dict[str, T] | None], where T stands for a COBRAk dataclass or any other JSON-compatible object type.

Arguments
  • path: str ~ The path of the JSON file that shall be written
  • json_data: Any ~ The dictionary or list which shall be the content of the created JSON file
Source code in cobrak/io.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def json_write(path: str, json_data: Any) -> None:  # noqa: ANN401
    """Writes a JSON file at the given path with the given data as content.

    Can also be used for any of COBRAk's dataclasses as well as any
    dictionary of the form dict[str, dict[str, T] | None] where
    T stands for a COBRAk dataclass or any other JSON-compatible
    object type.

    Arguments
    ----------
    * path: str ~  The path of the JSON file that shall be written
    * json_data: Any ~ The dictionary or list which shall be the content of
      the created JSON file
    """
    if is_dataclass(json_data):
        json_write(path, asdict(json_data))
    elif isinstance(json_data, BaseModel):
        json_output = json_data.model_dump_json(indent=2)
        with open(path, "w+", encoding="utf-8") as f:
            f.write(json_output)
    elif isinstance(json_data, dict) and sum(
        is_dataclass(value) for value in json_data.values()
    ):
        json_dict: dict[str, dict[str, Any] | None] = {}
        for key, data in json_data.items():
            if data is None:
                json_dict[key] = None
            elif is_dataclass(data):
                json_dict[key] = asdict(data)
            else:
                json_dict[key] = data
        json_write(path, json_dict)
    else:
        json_output = json.dumps(json_data, indent=4)
        with open(path, "w+", encoding="utf-8") as f:
            f.write(json_output)

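The dataclass branch of json_write can be sketched in isolation: a dataclass instance is flattened to a plain dict via asdict() before serialization. The Settings dataclass and json_write_sketch helper here are hypothetical illustrations, not part of COBRAk's API:

```python
import json
from dataclasses import asdict, dataclass, is_dataclass


@dataclass
class Settings:
    name: str
    tolerance: float


def json_write_sketch(path: str, json_data) -> None:
    # Mirrors json_write's first branch: dataclass instances are
    # converted to plain dicts with asdict() and then dumped as
    # indented JSON text.
    if is_dataclass(json_data):
        json_data = asdict(json_data)
    with open(path, "w+", encoding="utf-8") as f:
        f.write(json.dumps(json_data, indent=4))
```

Calling `json_write_sketch("settings.json", Settings(name="run1", tolerance=1e-6))` produces a JSON object with the dataclass's fields as keys.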
json_zip_load(path)

Loads the given zipped JSON file and returns it as json_data (a list or a dictionary).

Arguments
  • path: str ~ The path of the JSON file without ".zip" at the end
Returns

dict or list ~ The loaded JSON data

Source code in cobrak/io.py
@validate_call
def json_zip_load(path: str) -> dict:
    """Loads the given zipped JSON file and returns it as json_data (a list
    or a dictionary).

    Arguments
    ----------
    * path: str ~ The path of the JSON file without ".zip" at the end

    Returns
    -------
    dict or list ~ The loaded JSON data
    """
    # Create a temporary directory to extract the zip file contents
    with tempfile.TemporaryDirectory() as temp_dir:
        # Construct the full path to the zip file
        zip_path = f"{path}.zip"

        # Open the zip file and extract its contents to the temporary directory
        with zipfile.ZipFile(zip_path, "r") as zip_file:
            zip_file.extractall(temp_dir)

        # Construct the full path to the JSON file in the temporary directory
        json_path = os.path.join(temp_dir, os.path.basename(path))

        # Open and load the JSON file
        with open(json_path, encoding="utf-8") as json_file:
            json_data = json.load(json_file)

    return json_data

json_zip_write(path, json_data, zip_method=zipfile.ZIP_LZMA)

Writes a zipped JSON file at the given path with the given dictionary as content.

Arguments
  • path: str ~ The path of the JSON file that shall be written without ".zip" at the end
  • json_data: Any ~ The dictionary or list which shall be the content of the created JSON file
Source code in cobrak/io.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def json_zip_write(
    path: str,
    json_data: Any,  # noqa: ANN401
    zip_method: int = zipfile.ZIP_LZMA,  # noqa: ANN401
) -> None:
    """Writes a zipped JSON file at the given path with the given dictionary as content.

    Arguments
    ----------
    * path: str ~  The path of the JSON file that shall be written without ".zip" at the end
    * json_data: Any ~ The dictionary or list which shall be the content of
      the created JSON file
    """
    json_output = json.dumps(json_data, indent=4).encode("utf-8")
    with ZipFile(path + ".zip", "w", compression=zip_method) as zip_file:
        zip_file.writestr(os.path.basename(path), json_output)

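json_zip_write and json_zip_load are symmetric: the path is passed without the ".zip" suffix, and the archive stores one JSON member named after the path's basename. A self-contained round trip using the same stdlib calls (the file name and payload are arbitrary examples):

```python
import json
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "result")  # path WITHOUT ".zip" at the end
    payload = {"growth_rate": 0.87, "solver": "SCIP"}

    # Equivalent of json_zip_write(base, payload): one JSON member,
    # named like the path's basename, inside "<base>.zip".
    data = json.dumps(payload, indent=4).encode("utf-8")
    with zipfile.ZipFile(base + ".zip", "w", compression=zipfile.ZIP_LZMA) as zf:
        zf.writestr(os.path.basename(base), data)

    # Equivalent of json_zip_load(base): open the archive and parse
    # the member back into Python data.
    with zipfile.ZipFile(base + ".zip") as zf:
        with zf.open(os.path.basename(base)) as member:
            loaded = json.load(member)

assert loaded == payload
```

Because both directions agree on the member name convention, data written by one function can always be read back by the other.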
load_annotated_cobrapy_model_as_cobrak_model(cobra_model, exclude_enzyme_constraints=True, mw_for_enzymes_without_cobrak_mw_annotation=1000000.0, deactivate_mw_warning=False)

Converts a COBRApy model with (and also without :-) annotations into a COBRAk Model.

This function takes a COBRApy model, which may contain specific annotations for metabolites, reactions, and genes, and converts it into a COBRAk model. The conversion involves extracting relevant annotations and constructing COBRAk-specific data structures for metabolites, reactions, and enzymes.

Parameters:
  • cobra_model (cobra.Model): The COBRApy model to be converted. This model should contain annotations that are compatible with the COBRAk model structure.
  • exclude_enzyme_constraints (bool): Whether or not to exclude all stoichiometric enzyme constraint additions. Defaults to True.

Returns:
  • Model: A COBRAk model constructed from the annotated COBRApy model, including metabolites, reactions, and enzymes with their respective parameters and constraints.

Notes:
  • The function assumes that certain annotations (e.g., "cobrak_Cmin", "cobrak_dG0") are present in the COBRApy model. Missing annotations will result in default values being used.
  • Reactions with IDs like "prot_pool_delivery" and those starting with "enzyme_delivery_" are ignored.
  • Ensure that the COBRApy model is correctly annotated to fully leverage the conversion process.

Source code in cobrak/io.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def load_annotated_cobrapy_model_as_cobrak_model(
    cobra_model: cobra.Model,
    exclude_enzyme_constraints: bool = True,
    mw_for_enzymes_without_cobrak_mw_annotation: float = 1e6,
    deactivate_mw_warning: bool = False,
) -> Model:
    """Converts a COBRApy model with (and also without :-) annotations into a COBRAk Model.

    This function takes a COBRApy model, which may contain specific annotations for metabolites,
    reactions, and genes, and converts it into a COBRAk model. The conversion involves extracting
    relevant annotations and constructing COBRAk-specific data structures for metabolites, reactions,
    and enzymes.

    Parameters:
    - cobra_model (cobra.Model): The COBRApy model to be converted. This model should contain
      annotations that are compatible with the COBRAk model structure.
    - exclude_enzyme_constraints (bool): Whether or not to exclude all stoichiometric enzyme constraint additions.
      Defaults to True.

    Returns:
    - Model: A COBRAk model constructed from the annotated COBRApy model, including metabolites,
      reactions, and enzymes with their respective parameters and constraints.

    Notes:
    - The function assumes that certain annotations (e.g., "cobrak_Cmin", "cobrak_dG0") are present
      in the COBRApy model. Missing annotations will result in default values being used.
    - Reactions with IDs like "prot_pool_delivery" and those starting with "enzyme_delivery_" are ignored.
    - Ensure that the COBRApy model is correctly annotated to fully leverage the conversion process.
    """
    if exclude_enzyme_constraints:
        gene_ids = [gene.id for gene in cobra_model.genes]

    if "cobrak_global_settings" in [x.id for x in cobra_model.reactions]:
        global_settings_reac = cobra_model.reactions.get_by_id("cobrak_global_settings")
        max_prot_pool = float(global_settings_reac.annotation["cobrak_max_prot_pool"])
        kinetic_ignored_metabolites = literal_eval(
            global_settings_reac.annotation["cobrak_kinetic_ignored_metabolites"]
        )
        if "cobrak_extra_linear_constraints" in global_settings_reac.annotation:
            extra_linear_constraints = [
                ExtraLinearConstraint(**x)
                for x in literal_eval(
                    global_settings_reac.annotation["cobrak_extra_linear_constraints"]
                )
            ]
        else:
            extra_linear_constraints = []
        if "cobrak_extra_linear_watches" in global_settings_reac.annotation:
            extra_linear_watches_raw = literal_eval(
                global_settings_reac.annotation["cobrak_extra_linear_watches"]
            )
            extra_linear_watches = {
                key: ExtraLinearWatch(**watch)
                for key, watch in extra_linear_watches_raw.items()
            }
        else:
            extra_linear_watches = {}
        if "cobrak_extra_nonlinear_constraints" in global_settings_reac.annotation:
            extra_nonlinear_constraints = [
                ExtraNonlinearConstraint(**x)
                for x in literal_eval(
                    global_settings_reac.annotation[
                        "cobrak_extra_nonlinear_constraints"
                    ]
                )
            ]
        else:
            extra_nonlinear_constraints = []
        if "cobrak_extra_nonlinear_watches" in global_settings_reac.annotation:
            extra_nonlinear_watches_raw = literal_eval(
                global_settings_reac.annotation["cobrak_extra_nonlinear_watches"]
            )
            extra_nonlinear_watches = {
                key: ExtraNonlinearWatch(**watch)
                for key, watch in extra_nonlinear_watches_raw.items()
            }
        else:
            extra_nonlinear_watches = {}
        R = float(global_settings_reac.annotation["cobrak_R"])
        T = float(global_settings_reac.annotation["cobrak_T"])
        reac_fwd_suffix = global_settings_reac.annotation["cobrak_reac_fwd_suffix"]
        reac_rev_suffix = global_settings_reac.annotation["cobrak_reac_rev_suffix"]
        reac_enz_separator = global_settings_reac.annotation[
            "cobrak_reac_enz_separator"
        ]
    else:
        max_prot_pool = STANDARD_MAX_PROT_POOL
        extra_linear_constraints = []
        extra_nonlinear_constraints = []
        kinetic_ignored_metabolites = []
        R = STANDARD_R
        T = STANDARD_T
        reac_fwd_suffix = REAC_FWD_SUFFIX
        reac_rev_suffix = REAC_REV_SUFFIX
        reac_enz_separator = REAC_ENZ_SEPARATOR
        extra_linear_constraints = []
        extra_linear_watches = {}
        extra_nonlinear_constraints = []
        extra_nonlinear_watches = {}

    cobrak_metabolites: dict[str, Metabolite] = {}
    for metabolite_x in cobra_model.metabolites:
        metabolite: cobra.Metabolite = metabolite_x
        if exclude_enzyme_constraints and sum(
            met_split in gene_ids for met_split in metabolite.id.split("_")
        ):
            continue

        log_min_conc = (
            log(float(metabolite.annotation["cobrak_Cmin"]))
            if "cobrak_Cmax" in metabolite.annotation
            else log(1e-6)
        )
        log_max_conc = (
            log(float(metabolite.annotation["cobrak_Cmax"]))
            if "cobrak_Cmax" in metabolite.annotation
            else log(0.01)
        )
        smiles = metabolite.annotation.get("cobrak_smiles", "")
        molar_mass = (
            float(metabolite.annotation["cobrak_molar_mass"])
            if "cobrak_molar_mass" in metabolite.annotation
            else None
        )
        compartment = metabolite.compartment

        cobrak_metabolites[metabolite.id] = Metabolite(
            log_min_conc=log_min_conc,
            log_max_conc=log_max_conc,
            annotation={
                key: literal_eval(value) if "[" in value else value
                for key, value in metabolite.annotation.items()
                if not key.startswith("cobrak_")
            },
            formula="" if not metabolite.formula else metabolite.formula,
            charge=metabolite.charge,
            name=metabolite.name,
            smiles=smiles,
            compartment=compartment,
            molar_mass=molar_mass,
        )

    cobrak_reactions: dict[str, Reaction] = {}
    for reaction in cobra_model.reactions:
        if (
            reaction.id == "prot_pool_delivery"
            or reaction.id.startswith("enzyme_delivery_")
            or reaction.id.startswith("complex_delivery_")
            or reaction.id.startswith("cobrak_global_settings")
        ):
            continue

        version_data = [
            (key.replace("cobrak_id_", ""), reaction.annotation[key])
            for key in reaction.annotation
            if key.startswith("cobrak_id_")
        ]
        if version_data == []:
            version_data = [("0", reaction.id)]
        for version, version_reac_id in version_data:
            if f"cobrak_dG0_{version}" in reaction.annotation:
                dG0 = float(reaction.annotation[f"cobrak_dG0_{version}"])
            else:
                dG0 = None
            if f"cobrak_dG0_uncertainty_{version}" in reaction.annotation:
                dG0_uncertainty = float(
                    reaction.annotation[f"cobrak_dG0_uncertainty_{version}"]
                )
            else:
                dG0_uncertainty = None

            if f"cobrak_k_cat_{version}" in reaction.annotation:
                if reac_enz_separator in version_reac_id:
                    identifiers = (
                        (
                            version_reac_id.replace("_and", "").split(
                                reac_enz_separator
                            )[1]
                            + "\b"
                        )
                        .replace(f"{reac_fwd_suffix}\b", "")
                        .replace(f"{reac_rev_suffix}\b", "")
                        .replace(f"{REAC_FWD_SUFFIX}\b", "")
                        .replace(f"{REAC_REV_SUFFIX}\b", "")
                        .replace("\b", "")
                        .split("_")
                    )
                else:
                    identifiers = reaction.gene_reaction_rule.split(" and ")

                k_cat = float(reaction.annotation[f"cobrak_k_cat_{version}"])
                if f"cobrak_k_cat_references_{version}" in reaction.annotation:
                    k_cat_references_raw = literal_eval(
                        reaction.annotation[f"cobrak_k_cat_references_{version}"]
                    )
                    k_cat_references = [
                        ParameterReference(**reference)
                        for reference in k_cat_references_raw
                    ]
                else:
                    k_cat_references = []
                if f"cobrak_k_ms_{version}" in reaction.annotation:
                    k_ms = literal_eval(reaction.annotation[f"cobrak_k_ms_{version}"])
                else:
                    k_ms = {}
                if f"cobrak_k_m_references_{version}" in reaction.annotation:
                    k_m_references_raw = literal_eval(
                        reaction.annotation[f"cobrak_k_m_references_{version}"]
                    )
                    k_m_references = {
                        met_id: [
                            ParameterReference(**reference) for reference in references
                        ]
                        for met_id, references in k_m_references_raw.items()
                    }
                else:
                    k_m_references = {}
                if f"cobrak_k_is_{version}" in reaction.annotation:
                    k_is = literal_eval(reaction.annotation[f"cobrak_k_is_{version}"])
                else:
                    k_is = {}
                if f"cobrak_k_i_references_{version}" in reaction.annotation:
                    k_i_references_raw = literal_eval(
                        reaction.annotation[f"cobrak_k_i_references_{version}"]
                    )
                    k_i_references = {
                        met_id: [
                            ParameterReference(**reference) for reference in references
                        ]
                        for met_id, references in k_i_references_raw.items()
                    }
                else:
                    k_i_references = {}
                if f"cobrak_k_as_{version}" in reaction.annotation:
                    k_as = literal_eval(reaction.annotation[f"cobrak_k_as_{version}"])
                else:
                    k_as = {}
                if f"cobrak_k_a_references_{version}" in reaction.annotation:
                    k_a_references_raw = literal_eval(
                        reaction.annotation[f"cobrak_k_a_references_{version}"]
                    )
                    k_a_references = {
                        met_id: [
                            ParameterReference(**reference) for reference in references
                        ]
                        for met_id, references in k_a_references_raw.items()
                    }
                else:
                    k_a_references = {}
                if f"cobrak_special_stoichiometries_{version}" in reaction.annotation:
                    special_stoichiometries = literal_eval(
                        reaction.annotation[f"cobrak_special_stoichiometries_{version}"]
                    )
                else:
                    special_stoichiometries = {}
                hill_coefficients = HillCoefficients()
                hill_coefficient_references = HillParameterReferences()
                if f"cobrak_hills_kappa_{version}" in reaction.annotation:
                    hill_coefficients.kappa = literal_eval(
                        reaction.annotation[f"cobrak_hills_kappa_{version}"]
                    )
                if f"cobrak_hills_kappa_references_{version}" in reaction.annotation:
                    hill_kappa_references_raw = literal_eval(
                        reaction.annotation[f"cobrak_hills_kappa_references_{version}"]
                    )
                    hill_coefficient_references.kappa = {
                        met_id: [
                            ParameterReference(**reference) for reference in references
                        ]
                        for met_id, references in hill_kappa_references_raw.items()
                    }
                if f"cobrak_hills_iota_{version}" in reaction.annotation:
                    hill_coefficients.iota = literal_eval(
                        reaction.annotation[f"cobrak_hills_iota_{version}"]
                    )
                if f"cobrak_hills_iota_references_{version}" in reaction.annotation:
                    hill_iota_references_raw = literal_eval(
                        reaction.annotation[f"cobrak_hills_iota_references_{version}"]
                    )
                    hill_coefficient_references.iota = {
                        met_id: [
                            ParameterReference(**reference) for reference in references
                        ]
                        for met_id, references in hill_iota_references_raw.items()
                    }
                if f"cobrak_hills_alpha_{version}" in reaction.annotation:
                    hill_coefficients.alpha = literal_eval(
                        reaction.annotation[f"cobrak_hills_alpha_{version}"]
                    )
                if f"cobrak_hills_alpha_references_{version}" in reaction.annotation:
                    hill_alpha_references_raw = literal_eval(
                        reaction.annotation[f"cobrak_hills_alpha_references_{version}"]
                    )
                    hill_coefficient_references.alpha = {
                        met_id: [
                            ParameterReference(**reference) for reference in references
                        ]
                        for met_id, references in hill_alpha_references_raw.items()
                    }
                enzyme_reaction_data = EnzymeReactionData(
                    identifiers=identifiers,
                    k_cat=k_cat,
                    k_cat_references=k_cat_references,
                    k_ms=k_ms,
                    k_m_references=k_m_references,
                    k_is=k_is,
                    k_i_references=k_i_references,
                    k_as=k_as,
                    k_a_references=k_a_references,
                    special_stoichiometries=special_stoichiometries,
                    hill_coefficients=hill_coefficients,
                )
            else:
                if reaction.gene_reaction_rule:
                    identifiers = reaction.gene_reaction_rule.split(" and ")
                    enzyme_reaction_data = (
                        EnzymeReactionData(
                            identifiers=identifiers,
                        )
                        if identifiers != [""]
                        else None
                    )
                else:
                    enzyme_reaction_data = None

            if len(version_data) > 1:
                if version_reac_id.endswith(reac_rev_suffix):
                    stoich_multiplier = -1
                    min_flux = 0.0
                    max_flux = -reaction.lower_bound
                else:
                    stoich_multiplier = +1
                    min_flux = 0.0
                    max_flux = reaction.upper_bound
            else:
                min_flux = reaction.lower_bound
                max_flux = reaction.upper_bound
                stoich_multiplier = +1

            cobrak_reactions[version_reac_id] = Reaction(
                min_flux=min_flux,
                max_flux=max_flux,
                stoichiometries={
                    metabolite.id: stoich_multiplier * value
                    for (metabolite, value) in reaction.metabolites.items()
                    if (not exclude_enzyme_constraints)
                    or (
                        not sum(
                            met_split in gene_ids
                            for met_split in metabolite.id.split("_")
                        )
                    )
                },
                dG0=dG0,
                dG0_uncertainty=dG0_uncertainty,
                enzyme_reaction_data=enzyme_reaction_data,
                annotation={
                    key: literal_eval(value) if "[" in value else value
                    for key, value in reaction.annotation.items()
                    if not key.startswith("cobrak_")
                },
                name=reaction.name,
            )

    cobrak_enzymes: dict[str, Enzyme] = {}
    for gene in cobra_model.genes:
        if "cobrak_mw" in gene.annotation:
            mw = float(gene.annotation["cobrak_mw"])
        else:
            if not deactivate_mw_warning:
                print(
                    f"INFO: No molecular weight given as cobrak_mw annotation for {gene.id}. Setting to standard value {mw_for_enzymes_without_cobrak_mw_annotation}."
                )
                print(
                    " Please change this value later to a reasonable value if you use enzyme constraints, e.g. through COBRA-k's Uniprot functionality."
                )
            mw: float = mw_for_enzymes_without_cobrak_mw_annotation
        if "cobrak_min_conc" in gene.annotation:
            min_conc = float(gene.annotation["cobrak_min_conc"])
        else:
            min_conc = None
        if "cobrak_max_conc" in gene.annotation:
            max_conc = float(gene.annotation["cobrak_max_conc"])
        else:
            max_conc = None
        if "cobrak_sequence" in gene.annotation:
            sequence = gene.annotation["cobrak_sequence"]
        else:
            sequence = ""
        cobrak_enzymes[gene.id] = Enzyme(
            molecular_weight=mw,
            min_conc=min_conc,
            max_conc=max_conc,
            name=gene.name
            if not gene.name.startswith("G_")
            else gene.name[len("G_") :],
            annotation={
                key: value
                for key, value in gene.annotation.items()
                if not key.startswith("cobrak_")
            },
            sequence=sequence,
        )

    # Clean reaction identifiers
    for reaction in cobrak_reactions.values():
        if reaction.enzyme_reaction_data:
            reaction.enzyme_reaction_data.identifiers = [
                identifier
                for identifier in reaction.enzyme_reaction_data.identifiers
                if identifier in cobrak_enzymes
            ]
            if reaction.enzyme_reaction_data.identifiers == []:
                reaction.enzyme_reaction_data = None

    return Model(
        reactions=cobrak_reactions,
        metabolites=cobrak_metabolites,
        enzymes=cobrak_enzymes,
        max_prot_pool=max_prot_pool,
        extra_linear_constraints=extra_linear_constraints,
        extra_linear_watches=extra_linear_watches,
        extra_nonlinear_constraints=extra_nonlinear_constraints,
        extra_nonlinear_watches=extra_nonlinear_watches,
        kinetic_ignored_metabolites=kinetic_ignored_metabolites,
        R=R,
        T=T,
        fwd_suffix=reac_fwd_suffix,
        rev_suffix=reac_rev_suffix,
        reac_enz_separator=reac_enz_separator,
    )

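The versioned-annotation scheme handled above (keys like "cobrak_id_0" naming reaction variants, with per-variant parameters suffixed by the same version tag) can be illustrated on a plain dict. The annotation values below are hypothetical examples, not taken from a real model:

```python
# Sketch of the versioned-annotation lookup used by
# load_annotated_cobrapy_model_as_cobrak_model: every "cobrak_id_<version>"
# key names one reaction variant, and per-variant parameters live under
# keys carrying the same version suffix.
annotation = {  # hypothetical reaction.annotation contents
    "cobrak_id_0": "PFK_FWD_ENZ_pfkA",
    "cobrak_id_1": "PFK_REV_ENZ_pfkA",
    "cobrak_dG0_0": "-14.2",
    "cobrak_k_cat_0": "120.0",
}

version_data = [
    (key.removeprefix("cobrak_id_"), annotation[key])
    for key in annotation
    if key.startswith("cobrak_id_")
]

parsed_dG0s = {}
for version, version_reac_id in version_data:
    dG0_key = f"cobrak_dG0_{version}"
    # Missing per-version keys fall back to None, as in the loader.
    parsed_dG0s[version_reac_id] = (
        float(annotation[dG0_key]) if dG0_key in annotation else None
    )
```

Here variant "0" carries a dG0 value while variant "1" falls back to None, which is why missing annotations merely produce default values rather than errors.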
load_annotated_sbml_model_as_cobrak_model(filepath, do_model_fullsplit=True, exclude_enzyme_constraints=True, mw_for_enzymes_without_cobrak_mw_annotation=1000000.0, deactivate_mw_warning=False)

Load an annotated (and also plain un-annotated :-) SBML model from a file and convert it into a COBRAk Model.

This function reads an SBML file containing a metabolic model with specific annotations and converts it into a COBRAk Model. It uses the COBRApy library to read the SBML file and then uses the load_annotated_cobrapy_model_as_cobrak_model function to perform the conversion.

Parameters:
  • filepath (str): The path to the SBML file containing the annotated metabolic model.
  • do_model_fullsplit (bool, optional): Whether or not the model shall be "fullsplit" (i.e., every reversible reaction and enzyme reaction variant becomes its own separate reaction). Defaults to True.

Returns:
  • Model: A COBRAk Model constructed from the annotated SBML model, ready for further kinetic and thermodynamic analyses.
Source code in cobrak/io.py
@validate_call
def load_annotated_sbml_model_as_cobrak_model(
    filepath: str,
    do_model_fullsplit: bool = True,
    exclude_enzyme_constraints: bool = True,
    mw_for_enzymes_without_cobrak_mw_annotation: float = 1e6,
    deactivate_mw_warning: bool = False,
) -> Model:
    """
    Load an annotated (and also plain un-annotated :-) SBML model from a file and convert it into a COBRAk Model.

    This function reads an SBML file containing a metabolic model with specific annotations
    and converts it into a COBRAk Model. It uses the COBRApy library to read the SBML
    file and then uses the `load_annotated_cobrapy_model_as_cobrak_model` function to perform
    the conversion.

    Parameters:
    - filepath (str): The path to the SBML file containing the annotated metabolic model.
    - do_model_fullsplit (bool, optional): Whether or not the model shall be "fullsplit" (i.e., any
      reversible reaction and enzyme reaction variant becomes its own separate reaction). Defaults to True.

    Returns:
    - Model: A COBRAk Model constructed from the annotated SBML model, ready for further
      kinetic and thermodynamic analyses.
    """
    if do_model_fullsplit:
        return load_annotated_cobrapy_model_as_cobrak_model(
            get_fullsplit_cobra_model(cobra.io.read_sbml_model(filepath)),
            exclude_enzyme_constraints=exclude_enzyme_constraints,
            mw_for_enzymes_without_cobrak_mw_annotation=mw_for_enzymes_without_cobrak_mw_annotation,
            deactivate_mw_warning=deactivate_mw_warning,
        )
    return load_annotated_cobrapy_model_as_cobrak_model(
        cobra.io.read_sbml_model(filepath),
        exclude_enzyme_constraints=exclude_enzyme_constraints,
        mw_for_enzymes_without_cobrak_mw_annotation=mw_for_enzymes_without_cobrak_mw_annotation,
        deactivate_mw_warning=deactivate_mw_warning,
    )

load_unannotated_sbml_as_cobrapy_model(path)

Loads an unannotated SBML model from a file into a COBRApy model.

This function reads an SBML file that contains a metabolic model without specific annotations and loads it into a COBRApy model object. It utilizes the COBRApy library's read_sbml_model function to perform the loading.

Parameters:
  • path (str): The file path to the SBML file containing the metabolic model.

Returns:
  • cobra.Model: A COBRApy model object representing the metabolic network described in the SBML file.

Source code in cobrak/io.py
@validate_call
def load_unannotated_sbml_as_cobrapy_model(path: str) -> cobra.Model:
    """Loads an unannotated SBML model from a file into a COBRApy model.

    This function reads an SBML file that contains a metabolic model without specific annotations
    and loads it into a COBRApy model object. It utilizes the COBRApy library's `read_sbml_model`
    function to perform the loading.

    Parameters:
    - path (str): The file path to the SBML file containing the metabolic model.

    Returns:
    - cobra.Model: A COBRApy model object representing the metabolic network described in the SBML file.
    """
    return cobra.io.read_sbml_model(path)

pickle_load(path)

Returns the value of the given pickle file.

Arguments
  • path: str ~ The path to the pickle file.
Source code in cobrak/io.py
@validate_call
def pickle_load(path: str) -> Any:  # noqa: ANN401
    """Returns the value of the given pickle file.

    Arguments
    ----------
    * path: str ~ The path to the pickle file.
    """
    with open(path, "rb") as pickle_file:
        return pickle.load(pickle_file)

pickle_write(path, pickled_object)

Writes the given object as pickled file with the given path

Arguments
  • path: str ~ The path of the pickled file that shall be created
  • pickled_object: Any ~ The object which shall be saved in the pickle file
Source code in cobrak/io.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def pickle_write(path: str, pickled_object: Any) -> None:  # noqa: ANN401
    """Writes the given object as pickled file with the given path

    Arguments
    ----------
    * path: str ~ The path of the pickled file that shall be created
    * pickled_object: Any ~ The object which shall be saved in the pickle file
    """
    with open(path, "wb") as pickle_file:
        pickle.dump(pickled_object, pickle_file)
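A minimal round-trip sketch of these two helpers (stdlib-only stand-ins with the same behavior as `pickle_write` and `pickle_load`; the example path and dictionary are hypothetical):

```python
import os
import pickle
import tempfile

def pickle_write(path: str, pickled_object) -> None:
    # Same behavior as cobrak's helper: dump the object to the given path
    with open(path, "wb") as pickle_file:
        pickle.dump(pickled_object, pickle_file)

def pickle_load(path: str):
    # Same behavior as cobrak's helper: load and return the pickled value
    with open(path, "rb") as pickle_file:
        return pickle.load(pickle_file)

# Round trip: what goes in comes back out unchanged
path = os.path.join(tempfile.mkdtemp(), "fluxes.pickle")
pickle_write(path, {"PGI": 2.5, "PFK": 1.0})
loaded = pickle_load(path)
print(loaded)  # {'PGI': 2.5, 'PFK': 1.0}
```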

save_cobrak_model_as_annotated_sbml_model(cobrak_model, filepath, combine_base_reactions=False, add_enzyme_constraints=False)

Exports a COBRAk model to an annotated SBML file.

This function converts a Model to a COBRApy model and writes it to an SBML file at the specified file path. Optionally, stoichiometric GECKO [1]-like enzyme constraints can be added during the conversion.

[1] Sánchez et al. Molecular systems biology, 13(8), 935. https://doi.org/10.15252/msb.20167411

Parameters:

Name Type Description Default
cobrak_model Model

The Model to be exported.

required
filepath str

The file path where the SBML file will be saved.

required
add_enzyme_constraints bool

Whether to add enzyme constraints during the conversion. Defaults to False.

False
Source code in cobrak/io.py
@validate_call
def save_cobrak_model_as_annotated_sbml_model(
    cobrak_model: Model,
    filepath: str,
    combine_base_reactions: bool = False,
    add_enzyme_constraints: bool = False,
) -> None:
    """Exports a COBRAk model to an annotated SBML file.

    This function converts a `Model` to a COBRApy model and writes it to an SBML file at the specified file path.
    Optionally, stoichiometric GECKO [1]-like enzyme constraints can be added during the conversion.

    [1] Sánchez et al. Molecular systems biology, 13(8), 935. https://doi.org/10.15252/msb.20167411

    Args:
        cobrak_model (Model): The `Model` to be exported.
        filepath (str): The file path where the SBML file will be saved.
        add_enzyme_constraints (bool, optional): Whether to add enzyme constraints during the conversion. Defaults to False.
    """
    cobra.io.write_sbml_model(
        convert_cobrak_model_to_annotated_cobrapy_model(
            cobrak_model,
            combine_base_reactions,
            add_enzyme_constraints,
        ),
        filepath,
    )

standardize_folder(folder)

Returns the given folder path in a standardized form.

I.e., folder paths with potential \ are replaced with /. In addition, if a path does not end with /, a / is appended. If the given folder path is empty (''), it returns just ''.

Argument
  • folder: str ~ The folder path that shall be standardized.
Source code in cobrak/io.py
@validate_call
def standardize_folder(folder: str) -> str:
    """Returns for the given folder path is returned in a more standardized way.

    I.e., folder paths with potential \\ are replaced with /. In addition, if
    a path does not end with / will get an added /.
    If the given folder path is empty (''), it returns just ''.

    Argument
    ----------
    * folder: str ~ The folder path that shall be standardized.
    """
    # Catch empty folders as they don't need to be standardized
    if not folder:
        return ""

    # Standardize for \ or / as path separator character.
    folder = folder.replace("\\", "/")

    # If the last character is not a path separator, it is
    # added so that all standardized folder path strings
    # contain it.
    if folder[-1] != "/":
        folder += "/"

    return folder
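For illustration, the function behaves as follows (the tiny body is re-stated here so the snippet is self-contained; the example paths are hypothetical):

```python
def standardize_folder(folder: str) -> str:
    # Same logic as above: normalize \ to / and ensure a trailing /
    if not folder:
        return ""
    folder = folder.replace("\\", "/")
    if folder[-1] != "/":
        folder += "/"
    return folder

print(standardize_folder("C:\\models\\ecoli"))  # C:/models/ecoli/
print(standardize_folder("results"))            # results/
print(standardize_folder(""))                   # (empty string stays empty)
```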

lps

COBRAk LPs and MILPs.

This file contains all linear program (LP) and mixed-integer linear program (MILP) functions that can be used with COBRAk models. With LP, one can integrate stoichiometric and enzymatic constraints. With MILP, one can additionally integrate thermodynamic constraints. For non-linear-programs (NLP), see nlps.py in the same folder.

add_flux_sum_var(model, cobrak_model)

Add a flux sum variable to a (N/MI)LP model.

This function introduces a flux sum variable to a given (N/MI)LP Pyomo model, which represents the total sum of absolute fluxes across all reactions in the COBRAk model. The methodology is based on the pFBA (Parsimonious Flux Balance Analysis) approach [1].

[1] Lewis et al. Molecular systems biology 6.1 (2010): 390. https://doi.org/10.1038/msb.2010.47

Parameters:

Name Type Description Default
model ConcreteModel

The Pyomo instance of the (N/MI)LP model.

required
cobrak_model Model

The associated metabolic model containing reaction data.

required

Returns:

Name Type Description
ConcreteModel ConcreteModel

The modified Pyomo model with the added flux sum variable and constraint.

Source code in cobrak/lps.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def add_flux_sum_var(model: ConcreteModel, cobrak_model: Model) -> ConcreteModel:
    """Add a flux sum variable to a (N/MI)LP model.

    This function introduces a flux sum variable to a given (N/MI)LP Pyomo model, which represents
    the total sum of absolute fluxes across all reactions in the COBRAk model. The methodology is based on
    the pFBA (Parsimonious Flux Balance Analysis) approach [1].

    [1] Lewis et al. Molecular systems biology 6.1 (2010): 390. https://doi.org/10.1038/msb.2010.47

    Args:
        model (ConcreteModel): The Pyomo instance of the (N/MI)LP model.
        cobrak_model (Model): The associated metabolic model containing reaction data.

    Returns:
        ConcreteModel: The modified Pyomo model with the added flux sum variable and constraint.
    """
    flux_sum_expr = 0.0
    for reac_id in cobrak_model.reactions:
        try:
            flux_sum_expr += getattr(model, reac_id)
        except AttributeError:
            continue

    setattr(model, FLUX_SUM_VAR_ID, Var(within=Reals, bounds=(0.0, 1e9)))
    setattr(
        model,
        "FLUX_SUM_CONSTRAINT",
        Constraint(rule=getattr(model, FLUX_SUM_VAR_ID) == flux_sum_expr),
    )

    return model
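The `try`/`except AttributeError` collection loop above can be illustrated with plain Python objects (the reaction IDs below are hypothetical; in the real function, the attributes are Pyomo `Var` objects rather than floats):

```python
class FakeLpModel:
    """Stand-in for a Pyomo ConcreteModel that only carries some reaction variables."""

model = FakeLpModel()
model.PGI = 2.5   # reaction with an LP variable
model.PFK = 1.0   # reaction with an LP variable
reaction_ids = ["PGI", "PFK", "EX_glc__D_e"]  # EX_glc__D_e has no variable

flux_sum = 0.0
for reac_id in reaction_ids:
    try:
        flux_sum += getattr(model, reac_id)  # reactions without a variable are skipped
    except AttributeError:
        continue

print(flux_sum)  # 3.5
```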

add_loop_constraints_to_lp(model, cobrak_model, only_nonthermodynamic, ignored_reacs=[])

Add mixed-integer loop constraints to a (N/MI)LP model to prevent thermodynamically infeasible cycles.

This function incorporates loop constraints into a given (N/MI)LP Pyomo model based on the COBRAk model's reaction data. It follows the ll-COBRA methodology described in [1] to prevent the formation of thermodynamically infeasible cycles in metabolic networks.

[1] Schellenberger et al. (2011). Biophysical journal, 100(3), 544-553. https://doi.org/10.1016/j.bpj.2010.12.3707

Parameters:

Name Type Description Default
model ConcreteModel

The Pyomo instance of the (N/MI)LP model.

required
cobrak_model Model

The associated metabolic model containing reaction data.

required
only_nonthermodynamic bool

If True, only add constraints to reactions without thermodynamic data.

required

Returns:

Name Type Description
ConcreteModel ConcreteModel

The modified Pyomo model with added loop constraints.

Source code in cobrak/lps.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def add_loop_constraints_to_lp(
    model: ConcreteModel,
    cobrak_model: Model,
    only_nonthermodynamic: bool,
    ignored_reacs: list[str] = [],
) -> ConcreteModel:
    """Add mixed-integer loop constraints to a (N/MI)LP model to prevent thermodynamically infeasible cycles.

    This function incorporates loop constraints into a given (N/MI)LP Pyomo model based on the COBRAk model's
    reaction data. It follows the ll-COBRA methodology described in [1] to prevent the formation
    of thermodynamically infeasible cycles in metabolic networks.

    [1] Schellenberger et al. (2011). Biophysical journal, 100(3), 544-553. https://doi.org/10.1016/j.bpj.2010.12.3707

    Args:
        model (ConcreteModel): The Pyomo instance of the (N/MI)LP model.
        cobrak_model (Model): The associated metabolic model containing reaction data.
        only_nonthermodynamic (bool): If True, only add constraints to reactions without thermodynamic data.

    Returns:
        ConcreteModel: The modified Pyomo model with added loop constraints.
    """
    base_id_constraints: dict[str, Expression] = {}
    num_elements_per_constraint = {}
    for reac_id, reaction in cobrak_model.reactions.items():
        if reac_id in ignored_reacs:
            continue
        if (only_nonthermodynamic) and (reaction.dG0 is not None):
            continue

        base_id = get_base_id(reac_id, cobrak_model.fwd_suffix, cobrak_model.rev_suffix)
        if base_id not in base_id_constraints:
            base_id_constraints[base_id] = 0.0
            num_elements_per_constraint[base_id] = 0

        zv_var_id = "zV_var_" + reac_id
        setattr(model, zv_var_id, Var(within=Binary))
        setattr(
            model,
            reac_id + "_base",
            Constraint(
                rule=getattr(model, reac_id) <= BIG_M * getattr(model, zv_var_id)
            ),
        )

        base_id_constraints[base_id] += getattr(model, zv_var_id)
        num_elements_per_constraint[base_id] += 1

    for base_id, constraint_lhs in base_id_constraints.items():
        if num_elements_per_constraint[base_id] > 1:
            setattr(
                model,
                base_id + "_base_constraint",
                Constraint(rule=constraint_lhs <= 1.0),
            )

    return model
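The big-M logic above can be checked with plain numbers: per direction, `v <= BIG_M * z` forces the flux to zero whenever its binary `z` is zero, and `z_fwd + z_rev <= 1` bars both directions of one base reaction from running at once (the flux values below are hypothetical):

```python
BIG_M = 1e5  # assumed big-M constant, large enough to never bind an allowed flux

def loop_constraints_hold(v_fwd: float, v_rev: float, z_fwd: int, z_rev: int) -> bool:
    # The two per-direction big-M constraints plus the per-base-reaction
    # exclusivity constraint added by add_loop_constraints_to_lp
    return (
        v_fwd <= BIG_M * z_fwd
        and v_rev <= BIG_M * z_rev
        and z_fwd + z_rev <= 1
    )

print(loop_constraints_hold(5.0, 0.0, 1, 0))  # True: only the forward direction runs
print(loop_constraints_hold(0.0, 3.0, 0, 1))  # True: only the reverse direction runs
print(loop_constraints_hold(5.0, 3.0, 1, 1))  # False: both directions at once are excluded
```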

get_lp_from_cobrak_model(cobrak_model, with_enzyme_constraints, with_thermodynamic_constraints, with_loop_constraints, with_flux_sum_var=False, ignored_reacs=[], min_mdf=STANDARD_MIN_MDF, add_thermobottleneck_analysis_vars=False, strict_kappa_products_equality=False, add_extra_linear_constraints=True, correction_config=CorrectionConfig(), ignore_nonlinear_terms=False)

Construct a linear programming (LP) model from a COBRAk model with various constraints and configurations.

This function creates a steady-state LP model from the provided COBRAk Model and enhances it with different types of constraints and variables based on the specified parameters. It allows for the inclusion of enzyme constraints, thermodynamic constraints, loop constraints, and additional linear constraints. Furthermore, it supports the addition of flux sum variables and error handling configurations.

See the following chapters of COBRAk's documentation for more on these constraints:

  • Steady-state and extra linear constraints ⇒ Chapter "Linear Programs"
  • Enzyme constraints ⇒ Chapter "Linear Programs"
  • Thermodynamic constraints ⇒ Chapter "Mixed-Integer Linear Programs"
Parameters

cobrak_model : Model
    The COBRAk Model from which to construct the LP model.
with_enzyme_constraints : bool
    If True, adds enzyme-pool constraints to the model.
with_thermodynamic_constraints : bool
    If True, adds thermodynamic MILP constraints to the model, ensuring that reaction fluxes are thermodynamically feasible by considering Gibbs free energy changes.
with_loop_constraints : bool
    If True, adds loop constraints to prevent or control flux loops in the metabolic network. This constraint makes the LP a MILP, as a binary variable controls whether the forward or the reverse reaction is running.
with_flux_sum_var : bool, optional
    If True, adds a flux sum variable to the model, which aggregates the total flux through all reactions for optimization or analysis purposes. Defaults to False.
ignored_reacs : list[str], optional
    List of reaction IDs to exclude from the model. Defaults to [].
min_mdf : float, optional
    Minimum value for the Max-Min Driving Force (MDF). Only relevant with thermodynamic constraints. Defaults to STANDARD_MIN_MDF.
add_thermobottleneck_analysis_vars : bool, optional
    If True, adds variables for thermodynamic bottleneck analysis, helping to identify potential bottlenecks in the metabolic network where thermodynamic constraints might limit flux. Defaults to False.
strict_kappa_products_equality : bool, optional
    If True, enforces strict equality for kappa products, ensuring consistency in thermodynamic parameters related to reaction products. Defaults to False.
add_extra_linear_constraints : bool, optional
    If True, adds the extra linear constraints defined in the COBRAk Model. Defaults to True.
correction_config : CorrectionConfig, optional
    Configuration for parameter correction handling in the model, allowing for the inclusion of error terms in constraints related to enzyme activity, thermodynamics, etc. Defaults to CorrectionConfig().
ignore_nonlinear_terms : bool, optional
    Whether non-linear extra watches and constraints shall be excluded. Defaults to False. Note: If such non-linear terms exist and are included, the whole problem becomes non-linear, making it incompatible with any purely linear solver!

Returns

ConcreteModel
    The constructed LP model with the specified constraints and configurations.

Source code in cobrak/lps.py
@validate_call
def get_lp_from_cobrak_model(
    cobrak_model: Model,
    with_enzyme_constraints: bool,
    with_thermodynamic_constraints: bool,
    with_loop_constraints: bool,
    with_flux_sum_var: bool = False,
    ignored_reacs: list[str] = [],
    min_mdf: float = STANDARD_MIN_MDF,
    add_thermobottleneck_analysis_vars: bool = False,
    strict_kappa_products_equality: bool = False,
    add_extra_linear_constraints: bool = True,
    correction_config: CorrectionConfig = CorrectionConfig(),
    ignore_nonlinear_terms: bool = False,
) -> ConcreteModel:
    """Construct a linear programming (LP) model from a COBRAk model with various constraints and configurations.

    This function creates a steady-state LP model from the provided COBRAk Model and enhances it with
    different types of constraints and variables based on the specified parameters. It allows for the
    inclusion of enzyme constraints, thermodynamic constraints, loop constraints, and additional
    linear constraints. Furthermore, it supports the addition of flux sum variables and error handling
    configurations.

    See the following chapters of COBRAk's documentation for more on these constraints:

    * Steady-state and extra linear constraints ⇒ Chapter "Linear Programs"
    * Enzyme constraints ⇒ Chapter "Linear Programs"
    * Thermodynamic constraints ⇒ Chapter "Mixed-Integer Linear Programs"

    Parameters
    ----------
    cobrak_model : Model
        The COBRAk Model from which to construct the LP model.
    with_enzyme_constraints : bool
        If True, adds enzyme-pool constraints to the model.
    with_thermodynamic_constraints : bool
        If True, adds thermodynamic MILP constraints to the model, ensuring that reaction fluxes are
        thermodynamically feasible by considering Gibbs free energy changes.
    with_loop_constraints : bool
        If True, adds loop constraints to prevent or control flux loops in the metabolic network.
        This constraint makes the LP a MILP as a binary variable controls whether either the
        forward or the reverse reaction is running.
    with_flux_sum_var : bool, optional
        If True, adds a flux sum variable to the model, which aggregates the total flux through
        all reactions for optimization or analysis purposes. Defaults to False.
    ignored_reacs : list[str], optional
        List of reaction IDs to ignore in the model, which will be excluded. Defaults to [].
    min_mdf : float, optional
        Minimum value for Max-Min Driving Force (MDF). Only relevant with thermodynamic
        constraints. Defaults to STANDARD_MIN_MDF.
    add_thermobottleneck_analysis_vars : bool, optional
        If True, adds variables for thermodynamic bottleneck analysis, helping to identify
        potential bottlenecks in the metabolic network where thermodynamic constraints might limit
        flux. Defaults to False.
    strict_kappa_products_equality : bool, optional
        If True, enforces strict equality for kappa products, ensuring consistency in
        thermodynamic parameters related to reaction products. Defaults to False.
    add_extra_linear_constraints : bool, optional
        If True, adds the extra linear constraints defined in the COBRAk Model.
        Defaults to True.
    correction_config : CorrectionConfig, optional
        Configuration for parameter correction handling in the model, allowing for the inclusion of error terms
        in constraints related to enzyme activity, thermodynamics, etc. Defaults to CorrectionConfig().
    ignore_nonlinear_terms: bool, optional
        Whether non-linear extra watches and constraints shall be excluded. Defaults to False.
        Note: If such non-linear terms exist and are included, the whole problem becomes *non-linear*, making it
        incompatible with any purely linear solver!

    Returns
    -------
    ConcreteModel
        The constructed LP model with the specified constraints and configurations.
    """
    # Initialize the steady-state LP model from the COBRA model, ignoring specified reactions
    model: ConcreteModel = _get_steady_state_lp_from_cobrak_model(
        cobrak_model=cobrak_model,
        ignored_reacs=ignored_reacs,
    )

    # Add enzyme constraints if enabled
    if with_enzyme_constraints:
        model = _add_enzyme_constraints_to_lp(
            model=model,
            cobrak_model=cobrak_model,
            ignored_reacs=ignored_reacs,
            add_error_term=correction_config.add_kcat_times_e_error_term,
            error_cutoff=correction_config.kcat_times_e_error_cutoff,
            max_rel_correction=correction_config.max_rel_kcat_times_e_correction,
        )

    # Add thermodynamic constraints if enabled
    if with_thermodynamic_constraints:
        model = _add_thermodynamic_constraints_to_lp(
            model=model,
            cobrak_model=cobrak_model,
            add_thermobottleneck_analysis_vars=add_thermobottleneck_analysis_vars,
            min_mdf=min_mdf,
            strict_kappa_products_equality=strict_kappa_products_equality,
            add_dG0_error_term=correction_config.add_dG0_error_term,
            dG0_error_cutoff=correction_config.dG0_error_cutoff,
            max_abs_dG0_correction=correction_config.max_abs_dG0_correction,
            add_km_error_term=correction_config.add_km_error_term,
            km_error_cutoff=correction_config.km_error_cutoff,
            max_rel_km_correction=correction_config.max_rel_km_correction,
            ignored_reacs=ignored_reacs,
        )

        if (
            cobrak_model.max_conc_sum < float("inf")
            or cobrak_model.include_mets_in_prot_pool
        ):
            model = _add_conc_sum_constraints(cobrak_model, model)

    # Add loop constraints if enabled
    if with_loop_constraints:
        model = add_loop_constraints_to_lp(
            model,
            cobrak_model,
            only_nonthermodynamic=with_thermodynamic_constraints,
            ignored_reacs=ignored_reacs,
        )

    # Add flux sum variable if enabled
    if with_flux_sum_var:
        model = add_flux_sum_var(
            model,
            cobrak_model,
        )

    # Apply error scenarios and add error sum term if error handling is configured
    if is_any_error_term_active(correction_config):
        if correction_config.error_scenario != {}:
            _apply_error_scenario(
                model,
                cobrak_model,
                correction_config,
            )

        if correction_config.add_error_sum_term:
            model = _add_error_sum_to_model(
                model,
                cobrak_model,
                correction_config,
            )

    # Add extra linear constraints if enabled
    if add_extra_linear_constraints:
        model = _add_extra_watches_and_constraints_to_lp(
            model=model,
            cobrak_model=cobrak_model,
            ignore_nonlinear_terms=ignore_nonlinear_terms,
        )

    return model

perform_lp_dG0_varying_thermodynamic_bottleneck_analysis(cobrak_model, dG0_variation=-100, min_mdf_advantage=1e-06, with_enzyme_constraints=False, solver=SCIP, ignore_nonlinear_terms=False, verbose=False, parallel_verbosity_level=0)

Perform thermodynamic bottleneck analysis on a COBRA-k model using mixed-integer linear programming with ΔG'° variations.

This is an alternative to perform_lp_thermodynamic_bottleneck_analysis.

This function identifies the current set of thermodynamic bottlenecks in a COBRAk model by lowering the ΔG'° of each single reaction by the given amount (in kJ/mol). Typically, the minimal MDF to be reached would be a previously calculated optimal network-wide MDF (also called OptMDF). The basic methodology was first described in [1]. To prevent thermodynamic cycles, the ΔG'° of a potential reverse reaction is raised by the amount by which the forward ΔG'° was lowered. To speed up calculations, this bottleneck analysis is performed in a parallelized fashion.

[1] Bekiaris et al. (2021). PLOS Computational Biology, 14(1), https://doi.org/10.1371/journal.pcbi.1009093

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model to analyze for thermodynamic bottlenecks.

required
dG0_variation float

The amount in kJ/mol by which a reaction's ΔG'° is lowered. Defaults to -100.

-100
min_mdf_advantage float

The minimal OptMDF advantage gained by weakening this bottleneck. Defaults to 1e-6.

1e-06
with_enzyme_constraints bool

Whether to include enzyme constraints in the analysis.

False
verbose bool

If True, print immediate information about identified bottlenecks. Defaults to False.

False
solver Solver

The COBRA-k Solver instance describing the used MILP solver. Defaults to SCIP.

SCIP
parallel_verbosity_level int

Sets the verbosity level for the analysis parallelization. The higher the value, the more is printed. Default: 0.

0
ignore_nonlinear_terms bool

Whether non-linear extra watches and constraints shall be excluded. Defaults to False. Note: If such non-linear terms exist and are included, the whole problem becomes non-linear, making it incompatible with any purely linear solver!

False

Returns:

Type Description
list[str]

list[str]: A list of reaction IDs identified as thermodynamic bottlenecks.

Source code in cobrak/lps.py
@validate_call(validate_return=True)
def perform_lp_dG0_varying_thermodynamic_bottleneck_analysis(
    cobrak_model: Model,
    dG0_variation: float = -100,
    min_mdf_advantage: float = 1e-6,
    with_enzyme_constraints: bool = False,
    solver: Solver = SCIP,
    ignore_nonlinear_terms: bool = False,
    verbose: bool = False,
    parallel_verbosity_level: int = 0,
) -> list[str]:
    """Perform thermodynamic bottleneck analysis on a COBRA-k model using mixed-integer linear programming *with ΔG'° variations*.

    This is an alternative to ```perform_lp_thermodynamic_bottleneck_analysis```.

    This function identifies the *current* set of thermodynamic bottlenecks in a COBRAk model by lowering the ΔG'° of
    each single reaction by the given amount (in kJ/mol). Typically, the minimal MDF to be reached would be a previously calculated
    optimal network-wide MDF (also called OptMDF). The basic methodology was first described in [1].
    To prevent thermodynamic cycles, the ΔG'° of a potential reverse reaction is raised by the amount by which the forward ΔG'° was lowered.
    To speed up calculations, this bottleneck analysis is performed in a parallelized fashion.

    [1] Bekiaris et al. (2021). PLOS Computational Biology, 14(1), https://doi.org/10.1371/journal.pcbi.1009093

    Args:
        cobrak_model (Model): The COBRAk model to analyze for thermodynamic bottlenecks.
        dG0_variation (float, optional): The amount in kJ/mol by which a reaction's ΔG'° is lowered. Defaults to -100.
        min_mdf_advantage (float, optional): The minimal OptMDF advantage gained by weakening this bottleneck. Defaults to 1e-6.
        with_enzyme_constraints (bool, optional): Whether to include enzyme constraints in the analysis.
        verbose (bool, optional): If True, print immediate information about identified bottlenecks. Defaults to False.
        solver (Solver, optional): The COBRA-k Solver instance describing the used MILP solver. Defaults to SCIP.
        parallel_verbosity_level (int, optional): Sets the verbosity level for the analysis parallelization. The higher
                                            the value, the more is printed. Default: 0.
        ignore_nonlinear_terms: bool, optional
            Whether non-linear extra watches and constraints shall be excluded. Defaults to False.
            Note: If such non-linear terms exist and are included, the whole problem becomes *non-linear*, making it
            incompatible with any purely linear solver!

    Returns:
        list[str]: A list of reaction IDs identified as thermodynamic bottlenecks.
    """
    cobrak_model = deepcopy(cobrak_model)

    old_mdf = perform_lp_optimization(
        cobrak_model=cobrak_model,
        objective_target=MDF_VAR_ID,
        objective_sense=+1,
        with_enzyme_constraints=with_enzyme_constraints,
        with_thermodynamic_constraints=True,
        solver=solver,
        ignore_nonlinear_terms=ignore_nonlinear_terms,
    )[OBJECTIVE_VAR_NAME]

    target_reac_ids = [
        reac_id
        for reac_id, reac in cobrak_model.reactions.items()
        if reac.dG0 is not None
    ]
    results: list[str] = Parallel(n_jobs=-1, verbose=parallel_verbosity_level)(
        delayed(_batch_dG0_varying_bottleneck_calculation)(
            solver,
            old_mdf,
            min_mdf_advantage,
            dG0_variation,
            cobrak_model,
            with_enzyme_constraints,
            target_reac_id,
            verbose,
            ignore_nonlinear_terms,
        )
        for target_reac_id in target_reac_ids
    )
    return [reac_id for reac_id in results if len(reac_id) > 0]

perform_lp_min_active_reactions_analysis(cobrak_model, with_enzyme_constraints, variability_dict, min_mdf=0.0, verbose=False, solver=SCIP, ignore_nonlinear_terms=False)

Run a mixed-integer linear program to determine the minimum number of active reactions.

This function constructs and solves a mixed-integer linear programming model to find the minimum number of reactions that need to be active to satisfy the given variability constraints. It uses a binary variable for each reaction to indicate whether it is active, and the objective is to minimize the sum of these binary variables. The model includes constraints based on enzyme data, thermodynamic feasibility, and loop prevention, depending on the specified parameters.

Parameters

cobrak_model : Model
    The COBRA model containing the metabolic network and reaction data.
with_enzyme_constraints : bool
    If True, includes enzyme-pool constraints in the model.
variability_dict : dict[str, tuple[float, float]]
    A dictionary where keys are reaction IDs and values are tuples specifying (lower bound, upper bound) for reaction fluxes.
min_mdf : float, optional
    Minimum value for the Max-Min Driving Force (MDF), setting a lower bound on reaction driving forces. Defaults to 0.0.
verbose : bool, optional
    If True, enables solver output. Defaults to False.
solver : Solver
    The MILP solver used for this analysis. Defaults to SCIP.
ignore_nonlinear_terms : bool, optional
    Whether non-linear extra watches and constraints shall be excluded. Defaults to False. Note: If such non-linear terms exist and are included, the whole problem becomes non-linear, making it incompatible with any purely linear solver!

Returns

float
    The minimum number of active reactions required to satisfy the constraints.

Source code in cobrak/lps.py
@validate_call(validate_return=True)
def perform_lp_min_active_reactions_analysis(
    cobrak_model: Model,
    with_enzyme_constraints: bool,
    variability_dict: dict[str, tuple[float, float]],
    min_mdf: float = 0.0,
    verbose: bool = False,
    solver: Solver = SCIP,
    ignore_nonlinear_terms: bool = False,
) -> float:
    """Run a mixed-integer linear program to determine the minimum number of active reactions.

    This function constructs and solves a mixed-integer linear programming model to find the minimum number of
    reactions that need to be active to satisfy the given variability constraints. It uses a binary
    variable for each reaction to indicate whether it is active, and the objective is to minimize
    the sum of these binary variables. The model includes constraints based on enzyme data,
    thermodynamic feasibility, and loop prevention, depending on the specified parameters.

    Parameters
    ----------
    cobrak_model : Model
        The COBRA model containing the metabolic network and reaction data.
    with_enzyme_constraints : bool
        If True, includes enzyme-pool constraints in the model.
    variability_dict : dict[str, tuple[float, float]]
        A dictionary where keys are reaction IDs and values are tuples specifying (lower bound,
        upper bound) for reaction fluxes.
    min_mdf : float, optional
        Minimum required Max-min Driving Force (MDF), setting a lower bound on all driving forces.
        Defaults to 0.0.
    verbose : bool, optional
        If True, enables solver output. Defaults to False.
    solver: Solver
        The MILP solver used for this analysis. Defaults to SCIP.
    ignore_nonlinear_terms : bool, optional
        If True, non-linear extra watches and constraints are *not* included. Defaults to False.
        Note: If such non-linear terms exist and are included, the whole problem becomes *non-linear*, making it
        incompatible with any purely linear solver!

    Returns
    -------
    float
        The minimum number of active reactions required to satisfy the constraints.
    """
    # Create a deep copy of the COBRAk model to avoid modifying the original model
    cobrak_model = deepcopy(cobrak_model)

    # Remove reactions that are not present in the variability dictionary
    minz_cobrak_model = delete_unused_reactions_in_variability_dict(
        cobrak_model, variability_dict
    )

    # Construct the LP model with the specified constraints
    minz_model, _ = get_lp_from_cobrak_model(
        minz_cobrak_model,
        with_enzyme_constraints=with_enzyme_constraints,
        with_thermodynamic_constraints=True,
        with_loop_constraints=False,
        min_mdf=min_mdf,
        ignore_nonlinear_terms=ignore_nonlinear_terms,
    )

    # Initialize the sum of binary variables to zero
    extrazsum_expression = 0.0

    # Iterate over all potentially active reactions
    for reac_id in get_potentially_active_reactions_in_variability_dict(
        cobrak_model, variability_dict
    ):
        # Create a binary variable for each reaction to indicate activity
        extraz_varname = f"extraz_var_{reac_id}"
        setattr(minz_model, extraz_varname, Var(within=Binary))

        # Add a constraint to relate reaction flux to the binary variable
        setattr(
            minz_model,
            f"extraz_const_{reac_id}",
            Constraint(
                rule=getattr(minz_model, reac_id)
                <= BIG_M * getattr(minz_model, extraz_varname)
            ),
        )

        # Accumulate the binary variables in the sum expression
        extrazsum_expression += getattr(minz_model, extraz_varname)

    # Add a variable to represent the total sum of active reactions
    setattr(minz_model, "extrazsum", Var(within=Reals))

    # Add a constraint to equate the sum variable to the sum expression
    setattr(
        minz_model,
        "extrazsum_const",
        Constraint(rule=getattr(minz_model, "extrazsum") == extrazsum_expression),
    )

    # Set the objective function to minimize the number of active reactions
    minz_model.obj = get_objective(minz_model, "extrazsum", minimize)

    # Initialize the solver with the specified options and attributes
    solver = get_solver(solver)

    # Solve the LP model
    solver.solve(minz_model, tee=verbose, **solver.solve_extra_options)

    # Retrieve the solution as a dictionary
    minz_dict = get_pyomo_solution_as_dict(minz_model)

    # Return the minimum number of active reactions
    return minz_dict["extrazsum"]
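The big-M indicator logic used above (one binary variable per reaction with flux <= BIG_M * z, then minimizing the sum of the binaries) can be illustrated with a small solver-free sketch. The reaction IDs, flux values, and the `min_active_reactions` helper below are purely illustrative and not part of COBRAk's API:

```python
# Sketch of the big-M indicator constraint v <= BIG_M * z: for a *fixed*
# flux vector, the cheapest feasible binary assignment sets z = 1 exactly
# for the flux-carrying reactions (in the real MILP, fluxes are variables too).
BIG_M = 1e5  # illustrative value; COBRAk defines its own BIG_M constant

def min_active_reactions(fluxes: dict[str, float]) -> int:
    """Smallest sum of binaries z with v_r <= BIG_M * z_r for every reaction r."""
    # Minimizing sum(z_r) forces z_r = 1 iff v_r > 0, so the optimum
    # simply counts reactions that carry flux.
    return sum(1 for v in fluxes.values() if v > 0)

fluxes = {"R1": 2.5, "R2": 0.0, "R3": 1e-3}
print(min_active_reactions(fluxes))  # 2 reactions carry flux
```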

perform_lp_optimization(cobrak_model, objective_target, objective_sense, with_enzyme_constraints=False, with_thermodynamic_constraints=False, with_loop_constraints=False, variability_dict={}, ignored_reacs=[], min_mdf=STANDARD_MIN_MDF, verbose=False, with_flux_sum_var=False, solver=SCIP, ignore_nonlinear_terms=False, correction_config=CorrectionConfig(), var_data_abs_epsilon=1e-05)

Perform linear programming optimization on a COBRAk model to determine flux distributions.

This function constructs and solves an LP problem for the given metabolic model using specified constraints, variables, and solver options. It supports various types of constraints such as enzyme constraints, thermodynamic constraints, and loop constraints. Additionally, it can handle variability dictionaries and ignored reactions.

Parameters:

Name Type Description Default
cobrak_model Model

A COBRAk Model object representing the metabolic network.

required
objective_target str | dict[str, float]

The target for optimization. Can be a reaction ID if optimizing a single reaction or a dictionary specifying flux values for multiple reactions.

required
objective_sense int

The sense of the optimization problem (+1: maximize, -1: minimize).

required
with_enzyme_constraints bool

Whether to include enzyme constraints in the model. Defaults to False.

False
with_thermodynamic_constraints bool

Whether to include thermodynamic constraints in the model. Defaults to False.

False
with_loop_constraints bool

Whether to include loop closure constraints in the model. Defaults to False.

False
variability_dict dict[str, tuple[float, float]]

A dictionary specifying variable bounds for reactions or metabolites. Defaults to an empty dict.

{}
ignored_reacs list[str]

List of reaction IDs to deactivate during optimization. Defaults to an empty list.

[]
min_mdf float

Minimum max-min driving force (MDF) threshold for thermodynamic constraints. Defaults to STANDARD_MIN_MDF.

STANDARD_MIN_MDF
verbose bool

Whether to print solver output information. Defaults to False.

False
with_flux_sum_var bool

Whether to include flux sum variable in the model. Defaults to False.

False
solver Solver

Solver used for LP. Default is SCIP.

SCIP
ignore_nonlinear_terms bool

If True, non-linear watches/constraints are ignored in ecTFBAs. Defaults to False. Note: If such non-linear values exist and are included, the whole problem becomes non-linear, making it incompatible with any purely linear solver!

False
correction_config CorrectionConfig

Configuration for handling parameter corrections and scenarios during optimization.

CorrectionConfig()
var_data_abs_epsilon float

Any value from the variability dict whose absolute value is below this threshold is treated as 0. Defaults to 1e-5.

1e-05

Returns:

Type Description
dict[str, float]

dict[str, float]: A dictionary containing the flux distribution results for each reaction in the model.

Source code in cobrak/lps.py
@validate_call
def perform_lp_optimization(
    cobrak_model: Model,
    objective_target: str | dict[str, float],
    objective_sense: int,
    with_enzyme_constraints: bool = False,
    with_thermodynamic_constraints: bool = False,
    with_loop_constraints: bool = False,
    variability_dict: dict[str, tuple[float, float]] = {},
    ignored_reacs: list[str] = [],
    min_mdf: float = STANDARD_MIN_MDF,
    verbose: bool = False,
    with_flux_sum_var: bool = False,
    solver: Solver = SCIP,
    ignore_nonlinear_terms: bool = False,
    correction_config: CorrectionConfig = CorrectionConfig(),
    var_data_abs_epsilon: float = 1e-5,
) -> dict[str, float]:
    """Perform linear programming optimization on a COBRAk model to determine flux distributions.

    This function constructs and solves an LP problem for the given metabolic model using specified constraints,
    variables, and solver options. It supports various types of constraints such as enzyme constraints, thermodynamic
    constraints, and loop constraints. Additionally, it can handle variability dictionaries and ignored reactions.

    Parameters:
        cobrak_model (Model): A COBRAk Model object representing the metabolic network.
        objective_target (str | dict[str, float]): The target for optimization. Can be a reaction ID if optimizing a single
            reaction or a dictionary specifying flux values for multiple reactions.
        objective_sense (int): The sense of the optimization problem (+1: maximize, -1: minimize).
        with_enzyme_constraints (bool, optional): Whether to include enzyme constraints in the model. Defaults to False.
        with_thermodynamic_constraints (bool, optional): Whether to include thermodynamic constraints in the model.
            Defaults to False.
        with_loop_constraints (bool, optional): Whether to include loop closure constraints in the model. Defaults to False.
        variability_dict (dict[str, tuple[float, float]], optional): A dictionary specifying variable bounds for reactions
            or metabolites. Defaults to an empty dict.
        ignored_reacs (list[str], optional): List of reaction IDs to deactivate during optimization. Defaults to an empty list.
        min_mdf (float, optional): Minimum max-min driving force (MDF) threshold for thermodynamic constraints. Defaults to STANDARD_MIN_MDF.
        verbose (bool, optional): Whether to print solver output information. Defaults to False.
        with_flux_sum_var (bool, optional): Whether to include flux sum variable in the model. Defaults to False.
        solver (Solver, optional): Solver used for LP. Default is SCIP.
        ignore_nonlinear_terms (bool, optional): If True, non-linear watches/constraints are ignored in ecTFBAs. Defaults to False.
            Note: If such non-linear values exist and are included, the whole problem becomes *non-linear*, making it incompatible with any
            purely linear solver!
        correction_config (CorrectionConfig, optional): Configuration for handling parameter corrections and scenarios during optimization.
        var_data_abs_epsilon (float, optional): Any value from the variability dict whose absolute value is below this threshold is treated as 0. Defaults to 1e-5.

    Returns:
        dict[str, float]: A dictionary containing the flux distribution results for each reaction in the model.
    """
    optimization_cobrak_model = deepcopy(cobrak_model)
    if variability_dict != {}:
        optimization_cobrak_model = delete_unused_reactions_in_variability_dict(
            cobrak_model,
            variability_dict,
        )
    optimization_model = get_lp_from_cobrak_model(
        cobrak_model=optimization_cobrak_model,
        with_enzyme_constraints=with_enzyme_constraints,
        with_thermodynamic_constraints=with_thermodynamic_constraints,
        with_loop_constraints=with_loop_constraints,
        with_flux_sum_var=with_flux_sum_var,
        min_mdf=min_mdf,
        ignore_nonlinear_terms=ignore_nonlinear_terms,
        correction_config=correction_config,
    )

    for deactivated_reaction in set(ignored_reacs):
        try:
            setattr(
                optimization_model,
                f"DEACTIVATE_{deactivated_reaction}",
                Constraint(
                    expr=getattr(optimization_model, deactivated_reaction) == 0.0
                ),
            )
        except AttributeError:
            continue

    optimization_model = apply_variability_dict(
        optimization_model,
        cobrak_model,
        variability_dict,
        correction_config.error_scenario,
        abs_epsilon=var_data_abs_epsilon,
    )
    optimization_model.obj = get_objective(
        optimization_model, objective_target, objective_sense
    )

    pyomo_solver = get_solver(solver)
    results = pyomo_solver.solve(
        optimization_model, tee=verbose, **solver.solve_extra_options
    )

    fba_dict = get_pyomo_solution_as_dict(optimization_model)

    return add_statuses_to_optimziation_dict(fba_dict, results)
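A minimal sketch of the `objective_target`/`objective_sense` semantics, assuming (as the signature and `get_objective` call suggest) that a string names a single variable while a dict defines a weighted linear combination; the `objective_value` helper and variable names below are hypothetical, not COBRAk API:

```python
# Sketch of how objective_target and objective_sense combine into a signed
# linear objective (pure Python; plain floats stand in for Pyomo variables).
def objective_value(values: dict[str, float],
                    objective_target, objective_sense: int) -> float:
    """Signed objective value: objective_sense +1 maximizes, -1 minimizes."""
    if isinstance(objective_target, str):
        expr = values[objective_target]           # single reaction/variable
    else:
        expr = sum(coeff * values[var_id]         # weighted combination
                   for var_id, coeff in objective_target.items())
    return objective_sense * expr

values = {"biomass": 0.9, "atp_maintenance": 8.0}
print(objective_value(values, "biomass", +1))     # 0.9
print(objective_value(values, {"biomass": 1.0, "atp_maintenance": 0.1}, -1))  # ≈ -1.7
```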

perform_lp_thermodynamic_bottleneck_analysis(cobrak_model, with_enzyme_constraints=False, min_mdf=STANDARD_MIN_MDF, verbose=False, solver=SCIP, ignore_nonlinear_terms=False)

Perform thermodynamic bottleneck analysis on a COBRAk model using mixed-integer linear programming.

This function identifies a minimal set of thermodynamic bottlenecks in a COBRAk model by minimizing the sum of newly introduced binary variables that indicate bottleneck reactions, i.e., reactions that do not allow the max-min driving force (MDF) to reach at least the set min_mdf. This methodology was first described in [1]. Keep in mind that results from this function are optimal, but not necessarily unique!

[1] Bekiaris et al. (2023). Nature Communications, 14(1), 4660. https://doi.org/10.1038/s41467-023-40297-8

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model to analyze for thermodynamic bottlenecks.

required
with_enzyme_constraints bool

Whether to include enzyme constraints in the analysis.

False
min_mdf float

Minimum max-min driving force (MDF) to be enforced. Defaults to STANDARD_MIN_MDF.

STANDARD_MIN_MDF
verbose bool

If True, print detailed information about identified bottlenecks. Defaults to False.

False
solver Solver

The COBRA-k Solver instance of the MILP solver. Defaults to "SCIP".

SCIP
ignore_nonlinear_terms bool

If True, non-linear extra watches and constraints are excluded. Defaults to False. Note: If such non-linear values exist and are included, the whole problem becomes non-linear, making it incompatible with any purely linear solver!

False

Returns:

Type Description
list[str]

list[str]: A list of reaction IDs identified as thermodynamic bottlenecks.

dict[str, float]

dict[str, float]: The MILP solution of this bottleneck search process.

Source code in cobrak/lps.py
@validate_call(validate_return=True)
def perform_lp_thermodynamic_bottleneck_analysis(
    cobrak_model: Model,
    with_enzyme_constraints: bool = False,
    min_mdf: float = STANDARD_MIN_MDF,
    verbose: bool = False,
    solver: Solver = SCIP,
    ignore_nonlinear_terms: bool = False,
) -> tuple[list[str], dict[str, float]]:
    """Perform thermodynamic bottleneck analysis on a COBRAk model using mixed-integer linear programming.

    This function identifies a minimal set of thermodynamic bottlenecks in a COBRAk model by minimizing the sum of
    newly introduced binary variables that indicate bottleneck reactions, i.e. reactions that do not allow the
    max-min driving force (MDF) to become at least the set min_mdf.
    This methodology was first described in [1]. Keep in mind that results from this function are optimal, but not
    necessarily unique!

    [1] Bekiaris et al. (2023). Nature Communications, 14(1), 4660.  https://doi.org/10.1038/s41467-023-40297-8

    Args:
        cobrak_model (Model): The COBRAk model to analyze for thermodynamic bottlenecks.
        with_enzyme_constraints (bool): Whether to include enzyme constraints in the analysis.
        min_mdf (float, optional): Minimum max-min driving force (MDF) to be enforced. Defaults to STANDARD_MIN_MDF.
        verbose (bool, optional): If True, print detailed information about identified bottlenecks. Defaults to False.
        solver (Solver, optional): The COBRA-k Solver instance of the MILP solver. Defaults to "SCIP".
        ignore_nonlinear_terms (bool, optional): If True, non-linear extra watches and constraints are excluded. Defaults to False.
            Note: If such non-linear values exist and are included, the whole problem becomes *non-linear*, making it
            incompatible with any purely linear solver!

    Returns:
        list[str]: A list of reaction IDs identified as thermodynamic bottlenecks.
        dict[str, float]: The MILP solution of this bottleneck search process.
    """
    cobrak_model = deepcopy(cobrak_model)
    thermo_constraint_lp = get_lp_from_cobrak_model(
        cobrak_model,
        with_enzyme_constraints=with_enzyme_constraints,
        with_thermodynamic_constraints=True,
        with_loop_constraints=False,
        add_thermobottleneck_analysis_vars=True,
        min_mdf=min_mdf,
        ignore_nonlinear_terms=ignore_nonlinear_terms,
    )

    thermo_constraint_lp.obj = get_objective(
        thermo_constraint_lp,
        "zb_sum",
        objective_sense=-1,
    )
    pyomo_solver = get_solver(solver)
    results = pyomo_solver.solve(
        thermo_constraint_lp, tee=verbose, **solver.solve_extra_options
    )
    solution_dict = get_pyomo_solution_as_dict(thermo_constraint_lp)

    bottleneck_counter = 1
    bottleneck_reactions = []
    for var_id, var_value in solution_dict.items():
        if not var_id.startswith("zb_var_"):
            continue
        if var_value <= 0.01:
            continue
        bottleneck_reac_id = var_id.replace("zb_var_", "")
        bottleneck_reactions.append(bottleneck_reac_id)
        if verbose:
            bottleneck_dG0 = cobrak_model.reactions[bottleneck_reac_id].dG0
            if bottleneck_dG0 is not None:
                printed_dG0 = round(bottleneck_dG0, 3)
            else:
                printed_dG0 = "N/A"
            printed_string = get_reaction_string(cobrak_model, bottleneck_reac_id)
            print(
                f"#{bottleneck_counter}: {bottleneck_reac_id} with ΔG'° of {printed_dG0} kJ/mol, {printed_string}"
            )
        bottleneck_counter += 1

    return bottleneck_reactions, add_statuses_to_optimziation_dict(
        solution_dict, results
    )
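The extraction loop at the end of the source above can be condensed into a small standalone sketch: binary indicator variables named `zb_var_<reaction id>` mark the reactions identified as bottlenecks. The solution dictionary below is made up for illustration:

```python
# Sketch of the bottleneck extraction step: keep every variable whose name
# starts with "zb_var_" and whose binary value is (numerically) 1.
def extract_bottlenecks(solution_dict: dict[str, float]) -> list[str]:
    return [
        var_id.removeprefix("zb_var_")
        for var_id, var_value in solution_dict.items()
        if var_id.startswith("zb_var_") and var_value > 0.01
    ]

solution = {"zb_var_PFK": 1.0, "zb_var_PGI": 0.0, "f_var_PFK": 0.3}
print(extract_bottlenecks(solution))  # ['PFK']
```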

perform_lp_variability_analysis(cobrak_model, with_enzyme_constraints=False, with_thermodynamic_constraints=False, active_reactions=[], min_active_flux=0.001, calculate_reacs=True, calculate_concs=True, calculate_rest=True, further_tested_vars=[], min_mdf=STANDARD_MIN_MDF, min_flux_cutoff=1e-05, abs_df_cutoff=1e-05, min_enzyme_cutoff=1e-05, max_active_enzyme_cutoff=0.0001, solver=SCIP, parallel_verbosity_level=0, ignore_nonlinear_terms=False, verbose=False)

Perform linear programming variability analysis on a COBRAk model.

This function conducts a variability analysis on a COBRAk model using linear programming (LP). It evaluates the range of possible flux values for each reaction and all other occurring variables in the model, considering optional enzyme and thermodynamic constraints. The methodology is based on the approach described by [1] and parallelized as outlined in [2].

[1] Mahadevan & Schilling. (2003). Metabolic Engineering, 5(4), 264-276. https://doi.org/10.1016/j.ymben.2003.09.002
[2] Gudmundsson & Thiele. BMC Bioinformatics 11, 489 (2010). https://doi.org/10.1186/1471-2105-11-489

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model to analyze.

required
with_enzyme_constraints bool

Whether to include enzyme constraints in the analysis.

False
with_thermodynamic_constraints bool

Whether to include thermodynamic constraints in the analysis.

False
active_reactions list[str]

List of reactions to be set as active with a minimum flux. Defaults to an empty list.

[]
min_active_flux float

Minimum flux value for active reactions. Defaults to 1e-3.

0.001
calculate_reacs bool

If True, analyze reaction fluxes. Default: True.

True
calculate_concs bool

If True, analyze concentrations. Default: True.

True
calculate_rest bool

If True, analyze all other parameters (e.g. kappa values and driving forces). Default: True.

True
min_mdf float

Minimum metabolic driving force (MDF) to be enforced. Defaults to STANDARD_MIN_MDF.

STANDARD_MIN_MDF
min_flux_cutoff float

Minimum flux cutoff for considering a reaction active. Defaults to 1e-5.

1e-05
solver Solver

MILP solver used for variability analysis. Defaults to SCIP; CPLEX_FOR_VARIABILITY_ANALYSIS or GUROBI_FOR_VARIABILITY_ANALYSIS is recommended if you have a CPLEX or Gurobi license.

SCIP
parallel_verbosity_level int

Sets the verbosity level for the analysis parallelization. The higher the value, the more output is printed. Default: 0.

0
ignore_nonlinear_terms bool

If True, non-linear watches/constraints are ignored in ecTFBAs. Defaults to False. Note: If such non-linear values exist and are included, the whole problem becomes non-linear, making it incompatible with any purely linear solver!

False
verbose bool

If True, the objective values of solved problems are shown, together with the computation time in seconds. Defaults to False.

False

Returns:

Type Description
dict[str, tuple[float, float]]

dict[str, tuple[float, float]]: A dictionary mapping variable IDs to their minimum and maximum values determined by the variability analysis.
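The parallelization shortcut from [2] used by this function can be sketched as follows: a direction (min or max) of a reaction only needs its own LP if the corresponding model bound is not already attained in one of the two flux-sum solutions. The `plan_variability_lps` helper and data below are illustrative only, not COBRAk API:

```python
# Sketch of the bound-skipping heuristic: bounds already attained by the
# flux-sum solutions are recorded directly; everything else is scheduled
# as an (objective sense, variable id) LP target.
def plan_variability_lps(bounds, min_sum_fluxes, max_sum_fluxes):
    fixed, lp_targets = {}, []
    for reac_id, (lb, ub) in bounds.items():
        seen = (min_sum_fluxes[reac_id], max_sum_fluxes[reac_id])
        if lb in seen:
            fixed[(reac_id, "min")] = lb        # lower bound already attained
        else:
            lp_targets.append((-1, reac_id))    # needs a minimization LP
        if ub in seen:
            fixed[(reac_id, "max")] = ub        # upper bound already attained
        else:
            lp_targets.append((+1, reac_id))    # needs a maximization LP
    return fixed, lp_targets

bounds = {"R1": (0.0, 10.0), "R2": (0.0, 5.0)}
fixed, todo = plan_variability_lps(bounds, {"R1": 0.0, "R2": 2.0},
                                   {"R1": 10.0, "R2": 3.0})
print(fixed)  # both R1 bounds already attained
print(todo)   # R2 still needs two LPs
```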

Source code in cobrak/lps.py
@validate_call
def perform_lp_variability_analysis(
    cobrak_model: Model,
    with_enzyme_constraints: bool = False,
    with_thermodynamic_constraints: bool = False,
    active_reactions: list[str] = [],
    min_active_flux: float = 1e-3,
    calculate_reacs: bool = True,
    calculate_concs: bool = True,
    calculate_rest: bool = True,
    further_tested_vars: list[str] = [],
    min_mdf: float = STANDARD_MIN_MDF,
    min_flux_cutoff: float = 1e-5,
    abs_df_cutoff: float = 1e-5,
    min_enzyme_cutoff: float = 1e-5,
    max_active_enzyme_cutoff: float = 1e-4,
    solver: Solver = SCIP,
    parallel_verbosity_level: int = 0,
    ignore_nonlinear_terms: bool = False,
    verbose: bool = False,
) -> dict[str, tuple[float, float]]:
    """Perform linear programming variability analysis on a COBRAk model.

    This function conducts a variability analysis on a COBRAk model using linear programming (LP).
    It evaluates the range of possible flux values for each reaction and all other occurring variables in the model,
    considering optional enzyme and thermodynamic constraints. The methodology is based on the approach
    described by [1] and parallelized as outlined in [2].

    [1] Mahadevan & Schilling. (2003). Metabolic engineering, 5(4), 264-276. https://doi.org/10.1016/j.ymben.2003.09.002
    [2] Gudmundsson & Thiele. BMC Bioinformatics 11, 489 (2010). https://doi.org/10.1186/1471-2105-11-489

    Args:
        cobrak_model (Model): The COBRAk model to analyze.
        with_enzyme_constraints (bool): Whether to include enzyme constraints in the analysis.
        with_thermodynamic_constraints (bool): Whether to include thermodynamic constraints in the analysis.
        active_reactions (list[str], optional): List of reactions to be set as active with a minimum flux.
                                                Defaults to an empty list.
        min_active_flux (float, optional): Minimum flux value for active reactions. Defaults to 1e-3.
        calculate_reacs (bool, optional): If True, analyze reaction fluxes. Default: True.
        calculate_concs (bool, optional): If True, analyze concentrations. Default: True.
        calculate_rest (bool, optional): If True, analyze all other parameters (e.g. kappa values and driving forces). Default: True.
        min_mdf (float, optional): Minimum metabolic driving force (MDF) to be enforced. Defaults to STANDARD_MIN_MDF.
        min_flux_cutoff (float, optional): Minimum flux cutoff for considering a reaction active. Defaults to 1e-5.
        solver (Solver, optional): MILP solver used for variability analysis. Defaults to SCIP; CPLEX_FOR_VARIABILITY_ANALYSIS
                                   or GUROBI_FOR_VARIABILITY_ANALYSIS is recommended if you have a CPLEX or Gurobi license.
        parallel_verbosity_level (int, optional): Sets the verbosity level for the analysis parallelization. The higher
                                                  the value, the more output is printed. Default: 0.
        ignore_nonlinear_terms (bool, optional): If True, non-linear watches/constraints are ignored in ecTFBAs. Defaults to False.
            Note: If such non-linear values exist and are included, the whole problem becomes *non-linear*, making it incompatible with any
            purely linear solver!
        verbose (bool): If True, the objective values of solved problems are shown, together with computation time in s. Defaults to False.

    Returns:
        dict[str, tuple[float, float]]: A dictionary mapping variable IDs to their minimum and maximum values
                                        determined by the variability analysis.
    """
    cobrak_model = deepcopy(cobrak_model)
    for active_reaction in active_reactions:
        cobrak_model.reactions[active_reaction].min_flux = min_active_flux

    model = get_lp_from_cobrak_model(
        cobrak_model=cobrak_model,
        with_enzyme_constraints=with_enzyme_constraints,
        with_thermodynamic_constraints=with_thermodynamic_constraints,
        with_loop_constraints=True,
        min_mdf=min_mdf,
        strict_kappa_products_equality=True,
        ignore_nonlinear_terms=ignore_nonlinear_terms,
    )
    model_var_names = get_model_var_names(model)

    min_values: dict[str, float] = {}
    max_values: dict[str, float] = {}
    objective_targets: list[tuple[int, str]] = []

    max_flux_sum_result = perform_lp_optimization(
        cobrak_model,
        objective_target=FLUX_SUM_VAR_ID,
        objective_sense=+1,
        with_enzyme_constraints=with_enzyme_constraints,
        with_thermodynamic_constraints=with_thermodynamic_constraints,
        with_loop_constraints=True,
        with_flux_sum_var=True,
        solver=solver,
        ignore_nonlinear_terms=ignore_nonlinear_terms,
    )
    min_flux_sum_result = perform_lp_optimization(
        cobrak_model,
        objective_target=FLUX_SUM_VAR_ID,
        objective_sense=-1,
        with_enzyme_constraints=with_enzyme_constraints,
        with_thermodynamic_constraints=with_thermodynamic_constraints,
        with_loop_constraints=True,
        with_flux_sum_var=True,
        solver=solver,
        ignore_nonlinear_terms=ignore_nonlinear_terms,
    )

    if (calculate_concs or calculate_rest) and with_thermodynamic_constraints:
        min_mdf_result = perform_lp_optimization(
            cobrak_model,
            objective_target=MDF_VAR_ID,
            objective_sense=-1,
            with_enzyme_constraints=with_enzyme_constraints,
            with_thermodynamic_constraints=with_thermodynamic_constraints,
            with_loop_constraints=True,
            solver=solver,
            ignore_nonlinear_terms=ignore_nonlinear_terms,
        )
        max_mdf_result = perform_lp_optimization(
            cobrak_model,
            objective_target=MDF_VAR_ID,
            objective_sense=+1,
            with_enzyme_constraints=with_enzyme_constraints,
            with_thermodynamic_constraints=with_thermodynamic_constraints,
            with_loop_constraints=True,
            solver=solver,
            ignore_nonlinear_terms=ignore_nonlinear_terms,
        )

    if calculate_concs:
        for met_id, metabolite in cobrak_model.metabolites.items():
            met_var_name = f"{LNCONC_VAR_PREFIX}{met_id}"
            if met_var_name in model_var_names:
                min_mdf_conc = min_mdf_result[met_var_name]
                max_mdf_conc = max_mdf_result[met_var_name]
                if metabolite.log_min_conc in (min_mdf_conc, max_mdf_conc):
                    min_values[met_var_name] = metabolite.log_min_conc
                else:
                    objective_targets.append((-1, met_var_name))
                if metabolite.log_max_conc in (min_mdf_conc, max_mdf_conc):
                    max_values[met_var_name] = metabolite.log_max_conc
                else:
                    objective_targets.append((+1, met_var_name))

    for reac_id, reaction in cobrak_model.reactions.items():
        min_flux_sum_flux = min_flux_sum_result[reac_id]
        max_flux_sum_flux = max_flux_sum_result[reac_id]

        if calculate_reacs:
            if reaction.min_flux in (min_flux_sum_flux, max_flux_sum_flux):
                min_values[reac_id] = (
                    reaction.min_flux if reaction.min_flux >= min_flux_cutoff else 0.0
                )
            else:
                objective_targets.append((-1, reac_id))
            if reaction.max_flux in (min_flux_sum_flux, max_flux_sum_flux):
                max_values[reac_id] = reaction.max_flux
            else:
                objective_targets.append((+1, reac_id))

        if not calculate_rest:
            continue

        f_var_name = f"{DF_VAR_PREFIX}{reac_id}"
        kappa_substrates_var_name = f"{KAPPA_SUBSTRATES_VAR_PREFIX}{reac_id}"
        kappa_products_var_name = f"{KAPPA_PRODUCTS_VAR_PREFIX}{reac_id}"
        if f_var_name in model_var_names:
            if min_mdf in (min_mdf_result[f_var_name], max_mdf_result[f_var_name]):
                min_values[f_var_name] = min_mdf
            else:
                objective_targets.append((-1, f_var_name))
            objective_targets.append((+1, f_var_name))
        if kappa_substrates_var_name in model_var_names:
            objective_targets.extend(
                ((-1, kappa_substrates_var_name), (+1, kappa_substrates_var_name))
            )
        if kappa_products_var_name in model_var_names:
            objective_targets.extend(
                ((-1, kappa_products_var_name), (+1, kappa_products_var_name))
            )
        if (
            reaction.enzyme_reaction_data is not None
            and with_enzyme_constraints
            and reaction.enzyme_reaction_data.k_cat < 1e20
        ):
            full_enzyme_id = get_full_enzyme_id(
                reaction.enzyme_reaction_data.identifiers
            )
            if full_enzyme_id:
                enzyme_delivery_var_name = get_reaction_enzyme_var_id(reac_id, reaction)
                if 0.0 in (min_flux_sum_flux, max_flux_sum_flux):
                    min_values[enzyme_delivery_var_name] = 0.0
                else:
                    objective_targets.append((-1, enzyme_delivery_var_name))
                objective_targets.append((+1, enzyme_delivery_var_name))

    for further_tested_var in further_tested_vars:
        objective_targets.extend(((+1, further_tested_var), (-1, further_tested_var)))

    objectives_data: list[tuple[str, str]] = []
    for obj_sense, target_id in objective_targets:
        if obj_sense == -1:
            objective_name = f"MIN_OBJ_{target_id}"
            pyomo_sense = minimize
        else:
            objective_name = f"MAX_OBJ_{target_id}"
            pyomo_sense = maximize
        setattr(
            model,
            objective_name,
            Objective(expr=getattr(model, target_id), sense=pyomo_sense),
        )
        getattr(model, objective_name).deactivate()
        objectives_data.append((objective_name, target_id))

    objectives_data_batches = split_list(objectives_data, cpu_count())
    pyomo_solver = get_solver(solver)

    results_list = Parallel(n_jobs=-1, verbose=parallel_verbosity_level)(
        delayed(_batch_variability_optimization)(
            pyomo_solver, model, batch, solver.solve_extra_options, verbose
        )
        for batch in objectives_data_batches
    )
    for result in chain(*results_list):
        is_minimization = result[0]
        target_id = result[1]
        result_value = result[2]
        if is_minimization:
            min_values[target_id] = result_value
        else:
            max_values[target_id] = result_value

    for key, min_value in min_values.items():
        if key in cobrak_model.reactions:
            min_values[key] = min_value if min_value >= min_flux_cutoff else 0.0
        if key.startswith(ENZYME_VAR_PREFIX):
            min_values[key] = min_value if min_value >= min_enzyme_cutoff else 0.0
        if key.startswith(DF_VAR_PREFIX):
            min_values[key] = min_value if abs(min_value) >= abs_df_cutoff else 0.0

    enzyme_var_to_reac_id = {
        get_reaction_enzyme_var_id(reac_id, reaction): reac_id
        for reac_id, reaction in cobrak_model.reactions.items()
    }
    for key, max_value in max_values.items():
        if key.startswith(ENZYME_VAR_PREFIX) and (
            (max_values[key] != 0.0) or (max_values[enzyme_var_to_reac_id[key]] > 0.0)
        ):
            max_values[key] = max(max_value, max_active_enzyme_cutoff)
        if key.startswith(DF_VAR_PREFIX):
            max_values[key] = max_value if abs(max_value) >= abs_df_cutoff else 0.0

    all_target_ids = sorted(
        set(
            list(min_values.keys())
            + list(max_values.keys())
            + [obj_target[1] for obj_target in objective_targets]
        )
    )
    variability_dict: dict[str, tuple[float, float]] = {
        target_id: (min_values[target_id], max_values[target_id])
        for target_id in all_target_ids
    }

    return variability_dict

metanetx_functionality

Functionalities for reading out MetaNetX files.

add_smiles_annotation_to_metabolites(cobrak_model, chem_prop_json_filepath, chem_xref_json_filepath, print_found_smiles=False, print_not_found_smiles=False, allowed_annotation_keys=[])

Annotates metabolites in a COBRA-k model with SMILES strings using preprocessed MetaNetX files.

The function reads two gzipped JSON files (produced by COBRA-k's clean_and_compress_mnx_files function - see there): 1. chem_xref.json 2. chem_prop.json Note: The JSON files produced by COBRA-k's clean_and_compress_mnx_files are zipped, but you must not add the .zip suffix to the given file paths.

It iterates through the model's metabolites, uses their existing annotations to find a matching MetaNetX ID, and then uses the MetaNetX ID to retrieve the SMILES string. The SMILES string is then added to the metabolite's annotation dictionary under the specified key.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k Model object containing the metabolites to be annotated.

required
chem_prop_json_filepath str

Path to the zipped JSON file containing MetaNetX chemical properties (MetaNetX ID to properties), without the .zip file ending.

required
chem_xref_json_filepath str

Path to the zipped JSON file containing MetaNetX cross-references (External ID to MetaNetX ID), without the .zip file ending.

required
print_found_smiles bool

If True, prints a message for every metabolite where a SMILES string was successfully added.

False
print_not_found_smiles bool

If True, prints a message for every metabolite where no SMILES string could be found.

False
allowed_annotation_keys list[str]

An optional list of annotation keys (e.g., 'chebi', 'bigg.metabolite') to restrict the search to. If empty, all existing annotation keys are checked. Note: If a metabolite has multiple eligible annotations, the first annotation with a MetaNetX cross-reference is used; thus, the first annotation key in this list has the highest precedence. (default: [], i.e. all keys are considered)

[]
smiles_annotation_key

The key under which the SMILES string should be stored in the metabolite's annotation dictionary (default: 'smiles').

required

Returns:

Type Description
Model

The updated COBRA-k Model object with SMILES annotations added to

Model

the metabolites.

Source code in cobrak/metanetx_functionality.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def add_smiles_annotation_to_metabolites(
    cobrak_model: Model,
    chem_prop_json_filepath: str,
    chem_xref_json_filepath: str,
    print_found_smiles: bool = False,
    print_not_found_smiles: bool = False,
    allowed_annotation_keys: list[str] = [],
) -> Model:
    """Annotates metabolites in a COBRA-k model with SMILES strings using preprocessed MetaNetX files.

    The function reads two gzipped JSON files (produced by COBRA-k's
    `clean_and_compress_mnx_files` function - see there):
    1. `chem_xref.json`
    2. `chem_prop.json`
    Note: The JSON files produced by COBRA-k's clean_and_compress_mnx_files are zipped,
    but you must not add the .zip suffix to the given file paths.

    It iterates through the model's metabolites, uses their existing annotations
    to find a matching MetaNetX ID, and then uses the MetaNetX ID to retrieve
    the SMILES string. The SMILES string is then added to the metabolite's
    annotation dictionary under the specified key.

    Args:
        cobrak_model: The COBRA-k `Model` object containing the metabolites to
                      be annotated.
        chem_prop_json_filepath: Path to the zipped JSON file containing
                                 MetaNetX chemical properties (MetaNetX ID to properties),
                                 without the .zip file ending.
        chem_xref_json_filepath: Path to the zipped JSON file containing
                                 MetaNetX cross-references (External ID to MetaNetX ID),
                                 without the .zip file ending.
        print_found_smiles: If True, prints a message for every metabolite
                            where a SMILES string was successfully added.
        print_not_found_smiles: If True, prints a message for every metabolite
                                where no SMILES string could be found.
        allowed_annotation_keys: An optional list of annotation keys (e.g.,
                                 'chebi', 'bigg.metabolite') to be considered only.
                                 If empty, all existing annotation keys are checked.
                                 Note: If a metabolite has multiple eligible annotations,
                                 the first annotation with a MetaNetX cross-reference is
                                 used; thus, the first annotation key in this list has the
                                 highest precedence. (default: [], i.e. all keys are considered)
        smiles_annotation_key: The key under which the SMILES string should be
                               stored in the metabolite's annotation dictionary
                               (default: 'smiles').

    Returns:
        The updated COBRA-k `Model` object with SMILES annotations added to
        the metabolites.
    """
    chem_xref_dict: dict[tuple[str, str], str] = json_zip_load(chem_xref_json_filepath)
    chem_prop_dict: dict[str, dict[str, str]] = json_zip_load(chem_prop_json_filepath)

    for met_id, met_data in cobrak_model.metabolites.items():
        metanetx_id: str = ""
        eligible_keys: list[str] = (
            allowed_annotation_keys
            if allowed_annotation_keys
            else list(met_data.annotation.keys())
        )
        for key in eligible_keys:
            if key not in met_data.annotation:
                continue
            values_unknown_type: list[str] | str = met_data.annotation[key]
            if isinstance(values_unknown_type, str):
                values: list[str] = [values_unknown_type]
            else:
                values = values_unknown_type
            metanetx_id_found = False
            for value in values:
                annotation_id = f"{key}:{value.split(':')[0]}"
                if annotation_id not in chem_xref_dict:
                    continue
                metanetx_id = chem_xref_dict[annotation_id]
                metanetx_id_found = True
                break
            if metanetx_id_found:
                break

        if not metanetx_id or metanetx_id not in chem_prop_dict:
            if print_not_found_smiles:
                print(f"SMILES not found for {met_id}")
            continue
        smiles = chem_prop_dict[metanetx_id]["smiles"]
        met_data.smiles = smiles
        if print_found_smiles:
            print(f"SMILES found for {met_id}: {smiles}")

    return cobrak_model

clean_and_compress_mnx_files(chem_prop_filepath, chem_xref_filepath, output_dir)

Cleans data from two MetaNetX TSV files (chem_prop and chem_xref) and saves the cleaned versions as compressed JSON (.json.zip) files in a specified output directory.

These cleaned versions are small enough to be stored in a GitHub repository :-) and can be directly used with COBRA-k's other MetaNetX functions to add SMILES to metabolites.

The two files can be found here (as of Dec 2, 2025): https://www.metanetx.org/mnxdoc/mnxref.html
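The chem_xref.tsv parsing step works line by line: skip comments, split on tabs, and map the external ID (column 1) to the MetaNetX ID (column 2). A minimal sketch of that step, run on a hypothetical two-row TSV snippet instead of the real file:

```python
import io

# Hypothetical chem_xref.tsv snippet (the real file is much larger)
tsv_text = (
    "#source\tMNX_ID\tdescription\n"
    "bigg.metabolite:pyr\tMNXM23\tpyruvate\n"
    "chebi:15361\tMNXM23\tpyruvate\n"
)

chem_xref_dict: dict[str, str] = {}
for line in io.StringIO(tsv_text):
    # Skip comments and empty lines, as the real function does
    if line.startswith("#") or not line.strip():
        continue
    linesplit = line.strip().split("\t")
    # Ensure there are enough columns before accessing them
    if len(linesplit) > 1:
        chem_xref_dict[linesplit[0]] = linesplit[1]

print(chem_xref_dict["bigg.metabolite:pyr"])  # MNXM23
```

The chem_prop.tsv pass follows the same pattern but keeps four property columns (name, charge, mass, SMILES) per MetaNetX ID.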

Parameters:

Name Type Description Default
chem_prop_filepath str

The path to the 'chem_prop.tsv' file.

required
chem_xref_filepath str

The path to the 'chem_xref.tsv' file.

required
output_dir str

The path to the directory where the cleaned, compressed files will be saved.

required
Source code in cobrak/metanetx_functionality.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def clean_and_compress_mnx_files(
    chem_prop_filepath: str,
    chem_xref_filepath: str,
    output_dir: str,
) -> None:
    """
    Cleans data from two MetaNetX TSV files (chem_prop and chem_xref) and
    saves the cleaned versions as compressed JSON (.json.zip) files in a specified
    output directory.

    These cleaned versions are small enough to be stored in a GitHub repository :-)
    and can be directly used with COBRA-k's other MetaNetX functions to add SMILES to
    metabolites.

    The two files can be found here (as of Dec 2, 2025):
    https://www.metanetx.org/mnxdoc/mnxref.html

    Args:
        chem_prop_filepath: The path to the 'chem_prop.tsv' file.
        chem_xref_filepath: The path to the 'chem_xref.tsv' file.
        output_dir: The path to the directory where the cleaned, compressed
                    files will be saved.
    """

    # 1. Create the output directory if it doesn't exist
    ensure_folder_existence(output_dir)

    # --- Processing chem_xref.tsv ---
    chem_xref_output_path = os.path.join(output_dir, "chem_xref.json")
    print(f"Processing '{chem_xref_filepath}'...")

    chem_xref_dict: dict[str, str] = {}
    try:
        with open(chem_xref_filepath, encoding="utf-8") as f:
            for line in f:
                # Skip comments and empty lines
                if line.startswith("#") or len(line.strip()) == 0:
                    continue

                line = line.strip()  # noqa: PLW2901
                linesplit = line.split("\t")

                # Ensure there are enough columns
                if len(linesplit) > 1:
                    external_id = linesplit[0]
                    metanetx_id = linesplit[1]
                    chem_xref_dict[external_id] = metanetx_id

        # Write the cleaned data to a compressed JSON (.zip) file
        json_zip_write(chem_xref_output_path, chem_xref_dict)
        print(f"Successfully saved compressed file to '{chem_xref_output_path}'")

    except FileNotFoundError:
        print(f"Error: Input file '{chem_xref_filepath}' not found.")
    except Exception as e:
        print(
            f"The following error occurred while processing '{chem_xref_filepath}': {e}"
        )

    # --- Processing chem_prop.tsv ---
    chem_prop_output_path = os.path.join(output_dir, "chem_prop.json")
    print(f"Processing '{chem_prop_filepath}'...")

    chem_prop_dict: dict[str, dict[str, str]] = {}
    try:
        with open(chem_prop_filepath, encoding="utf-8") as f:
            for line in f:
                # Skip comments and empty lines
                if line.startswith("#") or len(line.strip()) == 0:
                    continue

                line = line.strip()  # noqa: PLW2901
                linesplit = line.split("\t")

                # Ensure there are enough columns before accessing them
                if len(linesplit) > 8:
                    metanetx_id = linesplit[0]
                    colloquial_name = linesplit[1]
                    charge = linesplit[4]
                    mass = linesplit[5]
                    smiles = linesplit[8]

                    chem_prop_dict[metanetx_id] = {
                        "colloquial_name": colloquial_name,
                        "charge": charge,
                        "mass": mass,
                        "smiles": smiles,
                    }

        # Write the cleaned data to a compressed JSON (.zip) file
        json_zip_write(chem_prop_output_path, chem_prop_dict)
        print(f"Successfully saved compressed file to '{chem_prop_output_path}'")

    except FileNotFoundError:
        print(f"Error: Input file '{chem_prop_filepath}' not found.")
    except Exception as e:
        print(
            f"The following error occurred while processing '{chem_prop_filepath}': {e}"
        )

    print("Cleanup and compression of MetaNetX tsv files complete!")

model_instantiation

This module contains the most convenient ways to create new Model instances from COBRApy models.

delete_enzymatically_suboptimal_reactions_in_cobrak_model(cobrak_model, ignored_ids=['s0001'], enz_reacs_to_keep=[])

Delete enzymatically suboptimal reactions in a COBRA-k model, similar to the idea in sMOMENT/AutoPACMEN [1].

This function processes each reaction in the provided COBRA-k model to determine whether it is enzymatically suboptimal based on its molecular weight to k_cat ratio (MW/k_cat). Suboptimal reactions are identified by comparing their MW/k_cat value with that of other reactions sharing the same base identifier, retaining only those with the lowest MW/k_cat. The function then removes these suboptimal reactions from the model and cleans up orphaned metabolites.

  • The function assumes that the 'enzyme_reaction_data' attribute of each reaction includes identifiers and k_cat information for enzyme-catalyzed reactions. If not, those reactions are skipped.
  • Reactions with identical base IDs (but different directional suffixes) are considered as variants of the same reaction.
  • After removing suboptimal reactions, the function calls delete_orphaned_metabolites_and_enzymes to clean up any orphaned metabolites and enzymes that may have been left behind.

[1] https://doi.org/10.1186/s12859-019-3329-9
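The pruning rule itself is simple: among the isozyme variants of one base reaction, keep only the variant with the lowest MW/k_cat ratio. A pure-Python sketch with hypothetical reaction IDs, molecular weights (g/mol), and k_cat values:

```python
# Hypothetical isozyme variants of one base reaction "PGK"
variants = {
    "PGK_ENZ_A": {"mw": 45000.0, "k_cat": 200.0},  # ratio: 225.0
    "PGK_ENZ_B": {"mw": 90000.0, "k_cat": 250.0},  # ratio: 360.0
}


def best_variant(variants: dict) -> str:
    """Return the variant ID with the minimal MW/k_cat ratio."""
    return min(variants, key=lambda rid: variants[rid]["mw"] / variants[rid]["k_cat"])


kept = best_variant(variants)
print(kept)  # PGK_ENZ_A
```

All other variants of the base reaction would be deleted from the model, since at a given enzyme budget the kept variant always supports at least as much flux.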

Parameters:

Name Type Description Default
cobrak_model Model

A COBRA-k model containing biochemical reactions.

required

Returns:

Type Description
Model

Model: The updated COBRA-k model after removing enzymatically suboptimal reactions.

Source code in cobrak/model_instantiation.py
def delete_enzymatically_suboptimal_reactions_in_cobrak_model(
    cobrak_model: Model,
    ignored_ids: list[str] = ["s0001"],
    enz_reacs_to_keep: list[str] = [],
) -> Model:
    """Delete enzymatically suboptimal reactions in a COBRA-k model, similar to the idea in sMOMENT/AutoPACMEN [1].

    This function processes each reaction in the provided COBRA-k model to
    determine whether it is enzymatically suboptimal based on its molecular weight
    to k_cat ratio (MW/k_cat). Suboptimal reactions are identified by comparing
    their MW/k_cat value with that of other reactions sharing the same base
    identifier, retaining only those with the lowest MW/k_cat. The function then
    removes these suboptimal reactions from the model and cleans up orphaned metabolites.

    - The function assumes that the 'enzyme_reaction_data' attribute of each reaction includes
      identifiers and k_cat information for enzyme-catalyzed reactions. If not, those reactions are skipped.
    - Reactions with identical base IDs (but different directional suffixes) are considered as variants of the same reaction.
    - After removing suboptimal reactions, the function calls `delete_orphaned_metabolites_and_enzymes` to clean up any orphaned metabolites and enzymes that may have been left behind.

    [1] https://doi.org/10.1186/s12859-019-3329-9

    Parameters:
        cobrak_model (Model): A COBRA-k model containing biochemical reactions.

    Returns:
        Model: The updated COBRA-k model after removing enzymatically suboptimal reactions.
    """
    reac_id_to_mw_by_kcat: dict[str, float] = {}
    reac_id_to_base_id: dict[str, str] = {}
    base_id_to_min_mw_by_kcat: dict[str, float] = {}
    ignored_reac_ids_with_mws: list[tuple[str, float]] = []
    for reac_id, reac_data in cobrak_model.reactions.items():
        if (
            reac_data.enzyme_reaction_data is None
            or reac_data.enzyme_reaction_data.identifiers in ([], [""])
        ):
            ignored_reac_ids_with_mws.append((reac_id, 0.0))
            continue
        if reac_data.enzyme_reaction_data.k_cat >= 1e19:
            ignored_reac_ids_with_mws.append(
                (reac_id, get_full_enzyme_mw(cobrak_model, reac_data))
            )
            continue
        if any(
            ignored_id in reac_data.enzyme_reaction_data.identifiers
            for ignored_id in ignored_ids
        ):
            ignored_reac_ids_with_mws.append((reac_id, 0.0))
            continue

        mw_by_kcat = (
            get_full_enzyme_mw(cobrak_model, reac_data)
            / reac_data.enzyme_reaction_data.k_cat
        )

        reac_id_to_mw_by_kcat[reac_id] = mw_by_kcat

        if reac_id.endswith(cobrak_model.fwd_suffix):
            direction_addition = cobrak_model.fwd_suffix
        elif reac_id.endswith(cobrak_model.rev_suffix):
            direction_addition = cobrak_model.rev_suffix
        else:
            direction_addition = ""
        base_id = reac_id.split(cobrak_model.reac_enz_separator)[0] + direction_addition

        reac_id_to_base_id[reac_id] = base_id
        if base_id not in base_id_to_min_mw_by_kcat:
            base_id_to_min_mw_by_kcat[base_id] = mw_by_kcat
        else:
            base_id_to_min_mw_by_kcat[base_id] = min(
                base_id_to_min_mw_by_kcat[base_id], mw_by_kcat
            )

    reacs_to_delete = [
        reac_id
        for reac_id, base_id in reac_id_to_base_id.items()
        if reac_id_to_mw_by_kcat[reac_id] != base_id_to_min_mw_by_kcat[base_id]
    ]
    extra_reacs_to_delete = _extra_reacs_to_delete(
        ignored_reac_ids=ignored_reac_ids_with_mws,
        enz_reacs_to_keep=enz_reacs_to_keep,
        rev_suffix=cobrak_model.rev_suffix,
        fwd_suffix=cobrak_model.fwd_suffix,
        reac_enz_separator=cobrak_model.reac_enz_separator,
    )
    for reac_to_delete in reacs_to_delete + extra_reacs_to_delete:
        del cobrak_model.reactions[reac_to_delete]

    return delete_orphaned_metabolites_and_enzymes(cobrak_model)

delete_enzymatically_suboptimal_reactions_in_fullsplit_cobrapy_model(cobra_model, enzyme_reaction_data, enzyme_molecular_weights, fwd_suffix=REAC_FWD_SUFFIX, rev_suffix=REAC_REV_SUFFIX, reac_enz_separator=REAC_ENZ_SEPARATOR, special_enzyme_stoichiometries={})

Removes enzymatically suboptimal reactions from a fullsplit COBRApy model.

This function identifies and deletes reactions in a COBRApy model that are enzymatically suboptimal based on enzyme reaction data and molecular weights. That is, it retains only the reaction with the minimum molecular weight to k_cat (MW/k_cat) ratio for each base reaction. A "base" reaction stands for any originally identical reaction; e.g., if there are now multiple phosphoglucokinase (PGK) reaction variants due to an enzyme fullsplit, only one of these PGK variants is retained in the returned model.
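The grouping into base reactions follows the ID convention visible in the source below: strip the enzyme part after the separator while keeping the direction suffix. A sketch of that mapping, where the separator and suffix values (`_ENZ_`, `_FWD`, `_REV`) are hypothetical stand-ins for `REAC_ENZ_SEPARATOR`, `REAC_FWD_SUFFIX`, and `REAC_REV_SUFFIX`:

```python
FWD_SUFFIX = "_FWD"      # hypothetical value of REAC_FWD_SUFFIX
REV_SUFFIX = "_REV"      # hypothetical value of REAC_REV_SUFFIX
ENZ_SEPARATOR = "_ENZ_"  # hypothetical value of REAC_ENZ_SEPARATOR


def base_id(reac_id: str) -> str:
    """Strip the enzyme part of a fullsplit reaction ID while keeping
    the direction suffix, mirroring the grouping logic below."""
    if reac_id.endswith(FWD_SUFFIX):
        direction = FWD_SUFFIX
    elif reac_id.endswith(REV_SUFFIX):
        direction = REV_SUFFIX
    else:
        direction = ""
    return reac_id.split(ENZ_SEPARATOR)[0] + direction


print(base_id("PGK_ENZ_b2926_FWD"))            # PGK_FWD
print(base_id("PGK_ENZ_b2926_and_b0001_REV"))  # PGK_REV
```

Two variants with the same base ID compete, and only the one with the lowest MW/k_cat survives; forward and reverse directions are kept separate.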

Parameters:

Name Type Description Default
cobra_model Model

The COBRApy model from which suboptimal reactions will be removed.

required
enzyme_reaction_data dict[str, EnzymeReactionData | None]

A dictionary mapping reaction IDs to EnzymeReactionData objects or None if the data is missing.

required
enzyme_molecular_weights dict[str, float]

A dictionary mapping enzyme identifiers to their molecular weights.

required

Returns:

Type Description
Model

cobra.Model: The modified COBRApy model with suboptimal reactions removed.

Source code in cobrak/model_instantiation.py
def delete_enzymatically_suboptimal_reactions_in_fullsplit_cobrapy_model(
    cobra_model: cobra.Model,
    enzyme_reaction_data: dict[str, EnzymeReactionData | None],
    enzyme_molecular_weights: dict[str, float],
    fwd_suffix: str = REAC_FWD_SUFFIX,
    rev_suffix: str = REAC_REV_SUFFIX,
    reac_enz_separator: str = REAC_ENZ_SEPARATOR,
    special_enzyme_stoichiometries: dict[str, dict[str, float]] = {},
) -> cobra.Model:
    """Removes enzymatically suboptimal reactions from a fullsplit COBRApy model.

    This function identifies and deletes reactions in a COBRApy model that are enzymatically suboptimal based on
    enzyme reaction data and molecular weights. That is, it retains only the reaction with the minimum molecular
    weight to k_cat (MW/k_cat) ratio for each base reaction. A "base" reaction stands for any originally identical
    reaction; e.g., if there are now multiple phosphoglucokinase (PGK) reaction variants due to an enzyme fullsplit,
    only one of these PGK variants is retained in the returned model.

    Args:
        cobra_model (cobra.Model): The COBRApy model from which suboptimal reactions will be removed.
        enzyme_reaction_data (dict[str, EnzymeReactionData | None]): A dictionary mapping reaction IDs to
            ```EnzymeReactionData``` objects or ```None``` if the data is missing.
        enzyme_molecular_weights (dict[str, float]): A dictionary mapping enzyme identifiers to their molecular weights.

    Returns:
        cobra.Model: The modified COBRApy model with suboptimal reactions removed.
    """
    reac_ids: list[str] = [reaction.id for reaction in cobra_model.reactions]
    ignored_reac_ids_with_mws: list[tuple[str, float]] = []
    base_reacs_to_min_mw_by_k_cat: dict[str, tuple[str, float]] = {}
    for reac_id in reac_ids:
        if reac_enz_separator not in reac_id:
            continue
        if reac_id not in enzyme_reaction_data:
            enzyme_ids = reac_id.split(reac_enz_separator)[1].split("_and_")
            enzyme_ids[-1] = (
                enzyme_ids[-1].replace(rev_suffix, "").replace(fwd_suffix, "")
            )
            if not all(
                enzyme_id in enzyme_molecular_weights for enzyme_id in enzyme_ids
            ):
                ignored_reac_ids_with_mws.append((reac_id, 0.0))
                continue
            enzyme_reaction_data[reac_id] = EnzymeReactionData(identifiers=enzyme_ids)

        try:
            current_enzyme_reaction_data = enzyme_reaction_data[reac_id]
        except KeyError:
            logging.warning(f"The dict enzyme_reaction_data does not have {reac_id}")  # noqa: G004, LOG015
            continue
        if current_enzyme_reaction_data is None:
            ignored_reac_ids_with_mws.append((reac_id, 0.0))
            continue

        mw = 0.0
        for identifier in current_enzyme_reaction_data.identifiers:
            if reac_id in special_enzyme_stoichiometries:
                if identifier in special_enzyme_stoichiometries[reac_id]:
                    stoichiometry = special_enzyme_stoichiometries[reac_id][identifier]
                else:
                    stoichiometry = 1.0
            else:
                stoichiometry = 1.0
            try:
                mw += stoichiometry * enzyme_molecular_weights[identifier]
            except KeyError:
                logging.warning(f"Cannot find {identifier} in enzyme_molecular_weights")  # noqa: G004, LOG015

        k_cat = current_enzyme_reaction_data.k_cat
        if k_cat > 1e19:
            ignored_reac_ids_with_mws.append((reac_id, mw))
            continue

        if reac_id.endswith(fwd_suffix):
            direction_addition = fwd_suffix
        elif reac_id.endswith(rev_suffix):
            direction_addition = rev_suffix
        else:
            direction_addition = ""
        base_id = reac_id.split(reac_enz_separator)[0] + direction_addition

        mw_by_k_cat = mw / k_cat
        if (
            base_id not in base_reacs_to_min_mw_by_k_cat
            or mw_by_k_cat < base_reacs_to_min_mw_by_k_cat[base_id][1]
        ):
            base_reacs_to_min_mw_by_k_cat[base_id] = (reac_id, mw_by_k_cat)
    enz_reacs_to_keep = [entry[0] for entry in base_reacs_to_min_mw_by_k_cat.values()]

    # Remove superfluous reactions
    extra_reacs_to_delete = _extra_reacs_to_delete(
        ignored_reac_ids=ignored_reac_ids_with_mws,
        enz_reacs_to_keep=enz_reacs_to_keep,
        rev_suffix=rev_suffix,
        fwd_suffix=fwd_suffix,
        reac_enz_separator=reac_enz_separator,
    )

    reacs_to_delete = [
        reac_id
        for reac_id in reac_ids
        if (reac_enz_separator in reac_id)
        and (reac_id not in enz_reacs_to_keep)
        and (reac_id not in [item[0] for item in ignored_reac_ids_with_mws])
    ] + extra_reacs_to_delete
    cobra_model.remove_reactions(reacs_to_delete)
    return cobra_model

get_cobrak_model_from_sbml_and_thermokinetic_data(sbml_path, extra_linear_constraints, dG0s, dG0_uncertainties, conc_ranges, enzyme_molecular_weights, enzyme_reaction_data, max_prot_pool=STANDARD_MAX_PROT_POOL, kinetic_ignored_metabolites=[], enzyme_conc_ranges={}, do_model_fullsplit=False, do_delete_enzymatically_suboptimal_reactions=True, R=STANDARD_R, T=STANDARD_T, fwd_suffix=REAC_FWD_SUFFIX, rev_suffix=REAC_REV_SUFFIX, reac_enz_separator=REAC_ENZ_SEPARATOR, omitted_metabolites=[], ignored_enzyme_ids=['s0001'], remove_enzyme_reaction_data_if_no_kcat_set=False, sequences={}, add_molar_masses=True)

Creates a COBRAk model from an SBML and given further thermokinetic (thermodynamic and enzymatic) data.

This function constructs a Model by integrating thermokinetic data and additional constraints into an existing COBRA-k model. It allows for the specification of concentration ranges, enzyme molecular weights, and reaction data, among other parameters.
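The thermokinetic inputs are plain dictionaries keyed by reaction, metabolite, or enzyme ID. A sketch of their expected shapes, with hypothetical IDs and numbers (the actual call is shown commented out, since it needs a real SBML file):

```python
# Hypothetical thermokinetic input data, keyed by reaction/metabolite/enzyme ID
dG0s: dict[str, float] = {"PGK": -18.5}              # standard dG'0 per reaction
dG0_uncertainties: dict[str, float] = {"PGK": 1.2}   # uncertainty per reaction
conc_ranges: dict[str, tuple[float, float]] = {
    "atp_c": (1e-6, 0.02),                           # (min, max) concentration
}
enzyme_molecular_weights: dict[str, float] = {"b2926": 41000.0}

# With these dicts prepared, the call would look like (not executed here,
# "model.xml" is a placeholder path):
# cobrak_model = get_cobrak_model_from_sbml_and_thermokinetic_data(
#     sbml_path="model.xml",
#     extra_linear_constraints=[],
#     dG0s=dG0s,
#     dG0_uncertainties=dG0_uncertainties,
#     conc_ranges=conc_ranges,
#     enzyme_molecular_weights=enzyme_molecular_weights,
#     enzyme_reaction_data={},
# )

low, high = conc_ranges["atp_c"]
print(low < high)  # True
```

Note that every metabolite needs a concentration range, either explicitly or via a default entry; otherwise a ValueError is raised (see Raises below).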

Parameters:

Name Type Description Default
sbml_path str

The SBML model to be converted.

required
extra_linear_constraints list[ExtraLinearConstraint]

Additional linear constraints to be applied to the model.

required
dG0s dict[str, float]

Standard Gibbs free energy changes for reactions.

required
dG0_uncertainties dict[str, float]

Uncertainties in the standard Gibbs free energy changes.

required
conc_ranges dict[str, tuple[float, float]]

Concentration ranges for metabolites.

required
enzyme_molecular_weights dict[str, float]

Molecular weights of enzymes.

required
enzyme_reaction_data dict[str, EnzymeReactionData | None]

Enzyme reaction data for reactions.

required
max_prot_pool float

Maximum protein pool constraint.

STANDARD_MAX_PROT_POOL
kinetic_ignored_metabolites list[str]

Metabolites to be ignored in kinetic calculations.

[]
enzyme_conc_ranges dict[str, tuple[float, float] | None]

Concentration ranges for enzymes. Defaults to {}.

{}
do_model_fullsplit bool

Whether to perform a full split of the model. Defaults to False.

False
do_delete_enzymatically_suboptimal_reactions bool

Whether to delete enzymatically suboptimal reactions. Defaults to True.

True
R float

Universal gas constant. Defaults to STANDARD_R.

STANDARD_R
T float

Temperature in Kelvin. Defaults to STANDARD_T.

STANDARD_T
omitted_metabolites list[str]

Metabolites that shall not be included in the model. Their stoichiometries will simply be deleted. Useful, e.g., to delete enzyme-constraint pseudo-metabolites. Defaults to [].

[]
ignored_enzyme_ids list[str]

Enzymes that shall not be included if their ID occurs in any identifiers part. Defaults to ["s0001"], i.e. spontaneously occurring reactions.

['s0001']
remove_enzyme_reaction_data_if_no_kcat_set bool

If no \(k_{cat}\) is set for a reaction, shall its EnzymeReactionData be set to None? If False, the default EnzymeReactionData with a very high (effectively non-existent) \(k_{cat}\) is used. Defaults to False.

False
sequences dict[str, str]

Data for protein sequences. Defaults to {}.

{}
add_molar_masses bool

Whether to calculate molar masses for all metabolites from their formula attribute. Defaults to True.

True

Raises:

ValueError: If a concentration range for a metabolite is not provided and no default is set.

Returns:

Name Type Description
Model Model

The constructed Model with integrated thermokinetic data and constraints.

Source code in cobrak/model_instantiation.py
def get_cobrak_model_from_sbml_and_thermokinetic_data(
    sbml_path: str,
    extra_linear_constraints: list[ExtraLinearConstraint],
    dG0s: dict[str, float],
    dG0_uncertainties: dict[str, float],
    conc_ranges: dict[str, tuple[float, float]],
    enzyme_molecular_weights: dict[str, float],
    enzyme_reaction_data: dict[str, EnzymeReactionData | None],
    max_prot_pool: float = STANDARD_MAX_PROT_POOL,
    kinetic_ignored_metabolites: list[str] = [],
    enzyme_conc_ranges: dict[str, tuple[float, float] | None] = {},
    do_model_fullsplit: bool = False,
    do_delete_enzymatically_suboptimal_reactions: bool = True,
    R: float = STANDARD_R,
    T: float = STANDARD_T,
    fwd_suffix: str = REAC_FWD_SUFFIX,
    rev_suffix: str = REAC_REV_SUFFIX,
    reac_enz_separator: str = REAC_ENZ_SEPARATOR,
    omitted_metabolites: list[str] = [],
    ignored_enzyme_ids: list[str] = ["s0001"],
    remove_enzyme_reaction_data_if_no_kcat_set: bool = False,
    sequences: dict[str, str] = {},
    add_molar_masses: bool = True,
) -> Model:
    """Creates a COBRAk model from an SBML and given further thermokinetic (thermodynamic and enzymatic) data.

    This function constructs a `Model` by integrating thermokinetic data and additional constraints
    into an existing COBRA-k model. It allows for the specification of concentration ranges, enzyme molecular weights, and
    reaction data, among other parameters.

    Args:
        sbml_path (str): The SBML model to be converted.
        extra_linear_constraints (list[ExtraLinearConstraint]): Additional linear constraints to be applied to the model.
        dG0s (dict[str, float]): Standard Gibbs free energy changes for reactions.
        dG0_uncertainties (dict[str, float]): Uncertainties in the standard Gibbs free energy changes.
        conc_ranges (dict[str, tuple[float, float]]): Concentration ranges for metabolites.
        enzyme_molecular_weights (dict[str, float]): Molecular weights of enzymes.
        enzyme_reaction_data (dict[str, EnzymeReactionData | None]): Enzyme reaction data for reactions.
        max_prot_pool (float): Maximum protein pool constraint.
        kinetic_ignored_metabolites (list[str]): Metabolites to be ignored in kinetic calculations.
        enzyme_conc_ranges (dict[str, tuple[float, float] | None], optional): Concentration ranges for enzymes. Defaults to {}.
        do_model_fullsplit (bool, optional): Whether to perform a full split of the model. Defaults to False.
        do_delete_enzymatically_suboptimal_reactions (bool, optional): Whether to delete enzymatically suboptimal reactions. Defaults to True.
        R (float, optional): Universal gas constant. Defaults to STANDARD_R.
        T (float, optional): Temperature in Kelvin. Defaults to STANDARD_T.
        omitted_metabolites (list[str], optional): Metabolites that shall not be included in the model. Their stoichiometric
         entries are simply deleted. Useful, e.g., to delete enzyme-constraint pseudo-metabolites. Defaults to [].
        ignored_enzyme_ids (list[str], optional): Enzymes that shall not be included if their ID occurs in any identifiers part. Defaults to ["s0001"],
         i.e. spontaneously occurring reactions.
        remove_enzyme_reaction_data_if_no_kcat_set (bool, optional): If no $k_{cat}$ is set for a reaction, shall its EnzymeReactionData
         be set to None? If False, the default EnzymeReactionData with a very high (effectively non-existent) $k_{cat}$ is used. Defaults to False.
        sequences (dict[str, str], optional): Protein sequence data for enzymes. Defaults to {}.
        add_molar_masses (bool, optional): Whether to calculate molar masses for all metabolites from their formula
         member variable. Defaults to True.

    Raises:
        ValueError: If a concentration range for a metabolite is not provided and no default is set.

    Returns:
        Model: The constructed `Model` with integrated thermokinetic data and constraints.
    """
    cobra_model = cobra.io.read_sbml_model(sbml_path)

    if do_model_fullsplit:
        cobra_model = get_fullsplit_cobra_model(cobra_model)

    cobrak_model = Model(
        reactions={},
        metabolites={},
        enzymes={},
        max_prot_pool=max_prot_pool,
        extra_linear_constraints=extra_linear_constraints,
        kinetic_ignored_metabolites=kinetic_ignored_metabolites,
        R=R,
        T=T,
        fwd_suffix=fwd_suffix,
        rev_suffix=rev_suffix,
        reac_enz_separator=reac_enz_separator,
    )

    for metabolite in cobra_model.metabolites:
        if metabolite.id in omitted_metabolites:
            continue

        if metabolite.id in conc_ranges:
            min_conc = conc_ranges[metabolite.id][0]
            max_conc = conc_ranges[metabolite.id][1]
        elif "DEFAULT" in conc_ranges:
            min_conc = conc_ranges["DEFAULT"][0]
            max_conc = conc_ranges["DEFAULT"][1]
        else:
            print(f"ERROR: No concentration range for metabolite {metabolite.id}.")
            print("Fixes: 1) Set its specific range; 2) Set a 'DEFAULT' range.")
            raise ValueError

        cobrak_model.metabolites[metabolite.id] = Metabolite(
            log_min_conc=log(min_conc),
            log_max_conc=log(max_conc),
            annotation={
                key: value
                for key, value in metabolite.annotation.items()
                if not key.startswith("cobrak_")
            },
            name=metabolite.name,
            formula="" if not metabolite.formula else metabolite.formula,
            charge=metabolite.charge,
            compartment=metabolite.compartment,
        )

    for reaction in cobra_model.reactions:
        dG0 = dG0s.get(reaction.id)

        dG0_uncertainty = dG0_uncertainties.get(reaction.id)

        used_enzyme_reaction_data = enzyme_reaction_data.get(reaction.id, None)
        if used_enzyme_reaction_data is None:
            identifiers = reaction.gene_reaction_rule.split(" and ")
            used_enzyme_reaction_data = (
                EnzymeReactionData(
                    identifiers=identifiers,
                )
                if identifiers != [""]
                else None
            )

        cobrak_model.reactions[reaction.id] = Reaction(
            min_flux=reaction.lower_bound,
            max_flux=reaction.upper_bound,
            stoichiometries={
                metabolite.id: value
                for (metabolite, value) in reaction.metabolites.items()
                if metabolite.id not in omitted_metabolites
            },
            dG0=dG0,
            dG0_uncertainty=dG0_uncertainty,
            enzyme_reaction_data=used_enzyme_reaction_data,
            annotation={
                key: value
                for key, value in reaction.annotation.items()
                if not key.startswith("cobrak_")
            },
            name=reaction.name,
        )

    cobra_gene_ids = [gene.id for gene in cobra_model.genes]
    for enzyme_id, molecular_weight in enzyme_molecular_weights.items():
        min_enzyme_conc = None
        max_enzyme_conc = None
        if enzyme_id in enzyme_conc_ranges:
            conc_range = enzyme_conc_ranges[enzyme_id]
            if conc_range is not None:
                min_enzyme_conc = conc_range[0]
                max_enzyme_conc = conc_range[1]
        if enzyme_id in cobra_gene_ids:
            name = cobra_model.genes.get_by_id(enzyme_id).id
            annotation = cobra_model.genes.get_by_id(enzyme_id).annotation
        else:
            name = ""
            annotation = {}
        sequence = sequences.get(enzyme_id, "")
        cobrak_model.enzymes[enzyme_id] = Enzyme(
            molecular_weight=molecular_weight,
            min_conc=min_enzyme_conc,
            max_conc=max_enzyme_conc,
            name=name,
            annotation=annotation,
            sequence=sequence,
        )

    if do_delete_enzymatically_suboptimal_reactions:
        cobrak_model = delete_enzymatically_suboptimal_reactions_in_cobrak_model(
            cobrak_model,
            ignored_ids=ignored_enzyme_ids,
        )

    if remove_enzyme_reaction_data_if_no_kcat_set:
        for reaction in cobrak_model.reactions.values():
            if reaction.enzyme_reaction_data is None:
                continue
            if reaction.enzyme_reaction_data.k_cat > 1e19:
                reaction.enzyme_reaction_data = None

    if add_molar_masses:
        cobrak_model = add_molar_masses_to_model_metabolites(cobrak_model)

    return cobrak_model

get_cobrak_model_with_kinetic_data_from_sbml_model_alone(sbml_path, database_data_folder, brenda_version, base_species, prefer_brenda=False, use_ec_number_transfers=True, max_prot_pool=STANDARD_MAX_PROT_POOL, conc_ranges=STANDARD_CONC_RANGES, inner_to_outer_compartments=EC_INNER_TO_OUTER_COMPARTMENTS, phs=EC_PHS, pmgs=EC_PMGS, ionic_strenghts=EC_IONIC_STRENGTHS, potential_differences=EC_POTENTIAL_DIFFERENCES, kinetic_ignored_enzymes=[], custom_kms_and_kcats={}, kinetic_ignored_metabolites=[], do_model_fullsplit=True, do_delete_enzymatically_suboptimal_reactions=True, ignore_dG0_uncertainty=True, enzyme_conc_ranges={}, dG0_exclusion_prefixes=[], dG0_exclusion_inner_parts=[], dG0_corrections={}, extra_linear_constraints=[], R=STANDARD_R, T=STANDARD_T, enzymes_to_delete=[], max_taxonomy_level=1000.0, add_hill_coefficients=True, add_protein_sequences=False, kis_and_kas_only_for_same_compartments=False, add_molar_masses=True)

Build a fully-featured Model from an SBML file and automatically retrieve all required kinetic and thermodynamic data from the local database_data_folder (or download it on the fly if missing).

The function orchestrates a multi-step pipeline:

  1. Load the SBML as an un-annotated COBRApy model and optionally delete user-specified enzymes (genes) from the model.
  2. Prepare the external data cache – ensure that the folder structure exists, locate cached JSON files, and (re)generate missing caches.
  3. Parse EC-number transfers (optional) to allow cross-species mapping of enzyme identifiers.
  4. Create a “full-split” model where each enzyme-specific reaction variant is represented as a separate COBRApy reaction (controlled by do_model_fullsplit).
  5. Collect enzyme kinetic parameters from BRENDA and SABIO-RK, optionally preferring one source over the other, and combine the two datasets.
  6. Fetch enzyme molecular weights from UniProt (cached for future runs).
  7. Optionally prune sub-optimal enzyme reactions based on the MW/k_cat criterion.
  8. Compute standard Gibbs free energies (ΔG⁰) and their uncertainties using eQuilibrator, applying user-defined compartment, pH, ionic-strength, and membrane-potential settings, as well as any exclusion rules.
  9. Apply user-provided ΔG⁰ corrections (e.g. literature adjustments).
  10. Assemble the final COBRA-k model by calling get_cobrak_model_from_sbml_and_thermokinetic_data with all gathered data, then clean up orphaned metabolites/enzymes.
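The source preference in step 5 can be sketched as follows: both datasets are passed to a combiner in preference order, and earlier entries win on conflicts. This is a hypothetical stand-in for COBRAk's combine_enzyme_reaction_datasets; the precedence rule and the toy k_cat values are assumptions for demonstration only.

```python
def combine_datasets(datasets: list[dict]) -> dict:
    """Merge per-reaction kinetic records; datasets earlier in the list
    take precedence on conflicting reaction IDs (assumed behavior)."""
    combined: dict = {}
    for data in datasets:
        for reac_id, record in data.items():
            combined.setdefault(reac_id, record)  # keep first-seen record
    return combined

brenda = {"PFK": {"k_cat": 60.0}, "PYK": {"k_cat": 200.0}}
sabio = {"PFK": {"k_cat": 85.0}, "ENO": {"k_cat": 30.0}}

prefer_brenda = True
ordered = [brenda, sabio] if prefer_brenda else [sabio, brenda]
print(combine_datasets(ordered))
```

With prefer_brenda=True the BRENDA value for PFK is kept, while reactions covered by only one source (PYK, ENO) are taken from whichever dataset has them.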
Parameters

sbml_path : str
    Path to the SBML file that will be converted into a COBRA-k model.
database_data_folder : str
    Root folder containing cached kinetic, thermodynamic, and annotation data. The function will create the folder if it does not exist.
brenda_version : str
    Version identifier of the BRENDA JSON archive (e.g. "2023.1").
base_species : str
    NCBI taxonomy identifier (or scientific name) of the organism for which kinetic data should be retrieved.
prefer_brenda : bool, optional
    If True, BRENDA data are used preferentially when both BRENDA and SABIO-RK contain information for the same reaction; otherwise SABIO-RK is preferred. Default: False.
use_ec_number_transfers : bool, optional
    Enable mapping of EC numbers between organisms using the enzyme.rdf file from Expasy. Default: True.
max_prot_pool : float, optional
    Upper bound on the total protein mass (g·gDW⁻¹) that can be allocated to enzymes. Default: STANDARD_MAX_PROT_POOL.
conc_ranges : dict[str, tuple[float, float]], optional
    Log-linear concentration bounds for metabolites (in M). Keys are metabolite IDs; the special key "DEFAULT" provides a fallback range. Default: STANDARD_CONC_RANGES.
inner_to_outer_compartments : list[str], optional
    Mapping of inner to outer compartments required by eQuilibrator for ΔG⁰ calculations. Default: EC_INNER_TO_OUTER_COMPARTMENTS.
phs : dict[str, float], optional
    pH values for each compartment. Default: EC_PHS.
pmgs : dict[str, float], optional
    Magnesium concentrations (M) for each compartment. Default: EC_PMGS.
ionic_strenghts : dict[str, float], optional
    Ionic strength (M) for each compartment. Default: EC_IONIC_STRENGTHS.
potential_differences : dict[tuple[str, str], float], optional
    Membrane potential differences (V) between compartment pairs. Default: EC_POTENTIAL_DIFFERENCES.
kinetic_ignored_enzymes : list[str], optional
    Enzyme identifiers that should be ignored when extracting kinetic data. Default: [].
custom_kms_and_kcats : dict[str, EnzymeReactionData | None], optional
    User-provided kinetic parameters that override any database values. Default: {}.
kinetic_ignored_metabolites : list[str], optional
    Metabolite IDs that shall be excluded from kinetic calculations (e.g., pseudo-metabolites). Default: [].
do_model_fullsplit : bool, optional
    Whether to split reactions per enzyme before further processing. Default: True.
do_delete_enzymatically_suboptimal_reactions : bool, optional
    If True, remove reactions that are not optimal with respect to the MW/k_cat criterion. Default: True.
ignore_dG0_uncertainty : bool, optional
    When True, discard ΔG⁰ uncertainty values after they have been computed. Default: True.
enzyme_conc_ranges : dict[str, tuple[float, float] | None], optional
    Optional concentration bounds for enzymes (in M). None means no bound. Default: {}.
dG0_exclusion_prefixes : list[str], optional
    Reaction IDs starting with any of these prefixes are removed from the ΔG⁰ dataset. Default: [].
dG0_exclusion_inner_parts : list[str], optional
    Substrings that, if present anywhere in a reaction ID, cause its ΔG⁰ entry to be removed. Default: [].
dG0_corrections : dict[str, float], optional
    Additive corrections (in kJ·mol⁻¹) applied to specific ΔG⁰ values after they have been computed. Default: {}.
extra_linear_constraints : list[ExtraLinearConstraint], optional
    Additional linear constraints (e.g., flux bounds) to be added to the model. Default: [].
R : float, optional
    Universal gas constant (kJ·mol⁻¹·K⁻¹). Default: STANDARD_R.
T : float, optional
    Temperature in Kelvin for thermodynamic calculations. Default: STANDARD_T.
enzymes_to_delete : list[str], optional
    Gene identifiers that should be removed from the initial COBRApy model before any further processing. Default: [].
max_taxonomy_level : float, optional
    Upper bound on the NCBI taxonomy distance used when selecting kinetic data from related organisms. Default: 1000.0.
add_hill_coefficients : bool, optional
    If True, include Hill coefficients from SABIO-RK where available. Default: True.
add_protein_sequences : bool, optional
    Whether to add protein sequences to Enzyme instances. Default: False.
kis_and_kas_only_for_same_compartments : bool, optional
    If True, kis and kas are only attributed to a reaction if the affected metabolite shares a compartment with one of the reaction's metabolites. Default: False.
add_molar_masses : bool, optional
    Whether to calculate molar masses for all metabolites from their formula member variable. Default: True.

Returns

Model
    A fully populated Model instance containing e.g.:
    * Metabolite objects with concentration bounds,
    * Reaction objects with flux bounds, ΔG⁰ values, and enzyme reaction data,
    * Enzyme objects with molecular weights and concentration bounds,
    * any extra linear constraints supplied by the user,
    * the global protein pool constraint.

Raises

FileNotFoundError
    If sbml_path does not exist or required external files (e.g. enzyme.rdf when use_ec_number_transfers is True) are missing.
ValueError
    When a required concentration range for a metabolite is not provided and no "DEFAULT" range exists.
RuntimeError
    If any of the external data retrieval steps (BRENDA, SABIO-RK, UniProt, eQuilibrator) fail unexpectedly.

Notes
  • The function relies heavily on caching to avoid repeated expensive web queries. Cache files are stored in database_data_folder with _cache_ prefixes.
  • The returned model is already cleaned of orphaned metabolites and enzymes via delete_orphaned_metabolites_and_enzymes.
  • Users can bypass the full pipeline by providing pre-computed cache files; in that case the function simply loads the cached data.
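The caching decision sketched in the notes boils down to a file-existence check: if any expected _cache_ file is missing, the corresponding pipeline step is rerun. A minimal self-contained sketch of that check, with a hypothetical helper name (the cache file names are taken from the source below):

```python
import os
import tempfile

# Cache files whose presence lets the pipeline skip data retrieval
CACHE_FILES = [
    "_cache_dG0.json",
    "_cache_dG0_uncertainties.json",
    "_cache_enzyme_reaction_data.json",
]

def missing_cache_files(folder: str) -> list[str]:
    """Return the cache files that still have to be (re)generated."""
    present = set(os.listdir(folder))
    return [name for name in CACHE_FILES if name not in present]

with tempfile.TemporaryDirectory() as folder:
    # Simulate a partially filled cache folder
    open(os.path.join(folder, "_cache_dG0.json"), "w").close()
    print(missing_cache_files(folder))
```

An empty result means all cached data can be loaded directly; a non-empty result means the expensive retrieval steps run and write fresh caches.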
Source code in cobrak/model_instantiation.py
def get_cobrak_model_with_kinetic_data_from_sbml_model_alone(
    sbml_path: str,
    database_data_folder: str,
    brenda_version: str,
    base_species: str,
    prefer_brenda: bool = False,
    use_ec_number_transfers: bool = True,
    max_prot_pool: float = STANDARD_MAX_PROT_POOL,
    conc_ranges: dict[str, tuple[float, float]] = STANDARD_CONC_RANGES,
    inner_to_outer_compartments: list[str] = EC_INNER_TO_OUTER_COMPARTMENTS,
    phs: dict[str, float] = EC_PHS,
    pmgs: dict[str, float] = EC_PMGS,
    ionic_strenghts: dict[str, float] = EC_IONIC_STRENGTHS,
    potential_differences: dict[tuple[str, str], float] = EC_POTENTIAL_DIFFERENCES,
    kinetic_ignored_enzymes: list[str] = [],
    custom_kms_and_kcats: dict[str, EnzymeReactionData | None] = {},
    kinetic_ignored_metabolites: list[str] = [],
    do_model_fullsplit: bool = True,
    do_delete_enzymatically_suboptimal_reactions: bool = True,
    ignore_dG0_uncertainty: bool = True,
    enzyme_conc_ranges: dict[str, tuple[float, float] | None] = {},
    dG0_exclusion_prefixes: list[str] = [],
    dG0_exclusion_inner_parts: list[str] = [],
    dG0_corrections: dict[str, float] = {},
    extra_linear_constraints: list[ExtraLinearConstraint] = [],
    R: float = STANDARD_R,
    T: float = STANDARD_T,
    enzymes_to_delete: list[str] = [],
    max_taxonomy_level: float = 1_000.0,
    add_hill_coefficients: bool = True,
    add_protein_sequences: bool = False,
    kis_and_kas_only_for_same_compartments: bool = False,
    add_molar_masses: bool = True,
) -> Model:
    """Build a fully-featured :class:`~cobrak.Model` from an SBML file **and** automatically
    retrieve all required kinetic and thermodynamic data from the local
    ``database_data_folder`` (or download it on-the-fly if missing).

    The function orchestrates a multi-step pipeline:

    1. **Load the SBML** as an un-annotated COBRApy model and optionally delete
       user-specified enzymes (genes) from the model.
    2. **Prepare the external data cache** – ensure that the folder structure
       exists, locate cached JSON files, and (re)generate missing caches.
    3. **Parse EC-number transfers** (optional) to allow cross-species mapping of
       enzyme identifiers.
    4. **Create a “full-split” model** where each enzyme-specific reaction variant
       is represented as a separate COBRApy reaction (controlled by
       ``do_model_fullsplit``).
    5. **Collect enzyme kinetic parameters** from BRENDA and SABIO-RK, optionally
       preferring one source over the other, and combine the two datasets.
    6. **Fetch enzyme molecular weights** from UniProt (cached for future runs).
    7. **Optionally prune sub-optimal enzyme reactions** based on the
       ``MW/k_cat`` criterion.
    8. **Compute standard Gibbs free energies** (ΔG⁰) and their uncertainties
       using eQuilibrator, applying user-defined compartment, pH, ionic-strength,
       and membrane-potential settings, as well as any exclusion rules.
    9. **Apply user-provided ΔG⁰ corrections** (e.g. literature adjustments).
    10. **Assemble the final COBRA-k model** by calling
        :func:`get_cobrak_model_from_sbml_and_thermokinetic_data` with all
        gathered data, then clean up orphaned metabolites/enzymes.

    Parameters
    ----------
    sbml_path : str
        Path to the SBML file that will be converted into a COBRA-k model.
    database_data_folder : str
        Root folder containing cached kinetic, thermodynamic and annotation data.
        The function will create the folder if it does not exist.
    brenda_version : str
        Version identifier of the BRENDA JSON archive (e.g. ``"2023.1"``).
    base_species : str
        NCBI taxonomy identifier (or scientific name) of the organism for which
        kinetic data should be retrieved.
    prefer_brenda : bool, optional
        If ``True`` BRENDA data are used preferentially when both BRENDA and
        SABIO-RK contain information for the same reaction; otherwise SABIO-RK
        is preferred. Default: ``False``.
    use_ec_number_transfers : bool, optional
        Enable mapping of EC numbers between organisms using the
        ``enzyme.rdf`` file from Expasy. Default: ``True``.
    max_prot_pool : float, optional
        Upper bound on the total protein mass (g·gDW⁻¹) that can be allocated to
        enzymes. Default: :data:`STANDARD_MAX_PROT_POOL`.
    conc_ranges : dict[str, tuple[float, float]], optional
        Log-linear concentration bounds for metabolites (in M). Keys are metabolite
        IDs; the special key ``"DEFAULT"`` provides a fallback range. Default:
        :data:`STANDARD_CONC_RANGES`.
    inner_to_outer_compartments : list[str], optional
        Mapping of inner to outer compartments required by eQuilibrator for
        ΔG⁰ calculations. Default: :data:`EC_INNER_TO_OUTER_COMPARTMENTS`.
    phs : dict[str, float], optional
        pH values for each compartment. Default: :data:`EC_PHS`.
    pmgs : dict[str, float], optional
        Magnesium concentrations (M) for each compartment. Default: :data:`EC_PMGS`.
    ionic_strenghts : dict[str, float], optional
        Ionic strength (M) for each compartment. Default: :data:`EC_IONIC_STRENGTHS`.
    potential_differences : dict[tuple[str, str], float], optional
        Membrane potential differences (V) between compartment pairs. Default:
        :data:`EC_POTENTIAL_DIFFERENCES`.
    kinetic_ignored_enzymes : list[str], optional
        Enzyme identifiers that should be ignored when extracting kinetic data.
        Default: ``[]``.
    custom_kms_and_kcats : dict[str, EnzymeReactionData | None], optional
        User-provided kinetic parameters that override any database values.
        Default: ``{}``.
    kinetic_ignored_metabolites : list[str], optional
        Metabolite IDs that shall be excluded from kinetic calculations
        (e.g., pseudo-metabolites). Default: ``[]``.
    do_model_fullsplit : bool, optional
        Whether to split reactions per enzyme before further processing.
        Default: ``True``.
    do_delete_enzymatically_suboptimal_reactions : bool, optional
        If ``True`` remove reactions that are not optimal with respect to the
        ``MW/k_cat`` criterion. Default: ``True``.
    ignore_dG0_uncertainty : bool, optional
        When ``True`` discard ΔG⁰ uncertainty values after they have been computed.
        Default: ``True``.
    enzyme_conc_ranges : dict[str, tuple[float, float] | None], optional
        Optional concentration bounds for enzymes (in M). ``None`` means no bound.
        Default: ``{}``.
    dG0_exclusion_prefixes : list[str], optional
        Reaction IDs starting with any of these prefixes are removed from the
        ΔG⁰ dataset. Default: ``[]``.
    dG0_exclusion_inner_parts : list[str], optional
        Sub-strings that, if present anywhere in a reaction ID, cause its ΔG⁰
        entry to be removed. Default: ``[]``.
    dG0_corrections : dict[str, float], optional
        Additive corrections (in kJ·mol⁻¹) to specific ΔG⁰ values after they have
        been computed. Default: ``{}``.
    extra_linear_constraints : list[ExtraLinearConstraint], optional
        Additional linear constraints (e.g., flux bounds) to be added to the model.
        Default: ``[]``.
    R : float, optional
        Universal gas constant (kJ·mol⁻¹·K⁻¹). Default: :data:`STANDARD_R`.
    T : float, optional
        Temperature in Kelvin for thermodynamic calculations. Default:
        :data:`STANDARD_T`.
    enzymes_to_delete : list[str], optional
        Gene identifiers that should be removed from the initial COBRApy model
        before any further processing. Default: ``[]``.
    max_taxonomy_level : float, optional
        Upper bound on the NCBI taxonomy distance used when selecting kinetic
        data from related organisms. Default: ``1_000.0``.
    add_hill_coefficients : bool, optional
        If ``True`` include Hill coefficients from SABIO-RK where available.
        Default: ``True``.
    add_protein_sequences: bool, optional
        Whether to add protein sequences or not to Enzyme instances. Default: ``False``
    kis_and_kas_only_for_same_compartments: bool, default False
        If True, kis and kas are only attributed to a reaction if the affected metabolite
        shares a compartment with one of the reaction's metabolites
    add_molar_masses: bool, default True
        Whether or not to calculate molar masses for all metabolites through their formula member variable

    Returns
    -------
    Model
        A fully populated :class:`~cobrak.Model` instance containing e.g.:
        * Metabolite objects with concentration bounds,
        * Reaction objects with flux bounds, ΔG⁰ values, and enzyme reaction data,
        * Enzyme objects with molecular weights and concentration bounds,
        * Any extra linear constraints supplied by the user,
        * The global protein pool constraint.

    Raises
    ------
    FileNotFoundError
        If ``sbml_path`` does not exist or required external files (e.g.
        ``enzyme.rdf`` when ``use_ec_number_transfers`` is ``True``) are missing.
    ValueError
        When a required concentration range for a metabolite is not provided and
        no ``"DEFAULT"`` range exists.
    RuntimeError
        If any of the external data retrieval steps (BRENDA, SABIO-RK,
        UniProt, eQuilibrator) fail unexpectedly.

    Notes
    -----
    * The function heavily relies on caching to avoid repeated expensive web
      queries. Cache files are stored alongside ``database_data_folder`` with
      ``_cache_`` prefixes.
    * The returned model is already cleaned of orphaned metabolites and enzymes
      via :func:`delete_orphaned_metabolites_and_enzymes`.
    * Users can bypass the full pipeline by providing pre-computed cache files;
      in that case the function will simply load the cached data.
    """
    cobra_model = load_unannotated_sbml_as_cobrapy_model(sbml_path)
    remove_genes(
        model=cobra_model,
        gene_list=enzymes_to_delete,
        remove_reactions=False,
    )

    database_data_folder = standardize_folder(database_data_folder)
    data_cache_files = get_files(database_data_folder)

    parse_external_resources(database_data_folder, brenda_version)
    if use_ec_number_transfers:
        transfer_json_path = f"{database_data_folder}ec_number_transfers.json"
        if not exists(transfer_json_path):
            if not exists(f"{database_data_folder}enzyme.rdf"):
                print(
                    f"ERROR: Argument use_ec_number_transfers is True, but no necessary enzyme.rdf can be found in {database_data_folder}"
                )
                print(
                    "You may download it from https://ftp.expasy.org/databases/enzyme/"
                )
                print(
                    f"After downloading, put it into the folder {database_data_folder}"
                )
                # Abort here: parsing would otherwise fail on the missing file
                raise FileNotFoundError(f"{database_data_folder}enzyme.rdf")
            ec_number_transfers = get_ec_number_transfers(
                f"{database_data_folder}enzyme.rdf"
            )
            json_write(transfer_json_path, ec_number_transfers)
    else:
        transfer_json_path = ""

    fullsplit_model = (
        get_fullsplit_cobra_model(cobra_model)
        if do_model_fullsplit
        else deepcopy(cobra_model)
    )

    enzyme_reaction_data: dict[str, EnzymeReactionData | None] = {}
    if (not database_data_folder) or (
        (database_data_folder)
        and (
            ("_cache_dG0.json" not in data_cache_files)
            or ("_cache_dG0_uncertainties.json" not in data_cache_files)
            or ("_cache_enzyme_reaction_data.json" not in data_cache_files)
        )
    ):
        with tempfile.TemporaryDirectory() as tmpdict:
            temp_sbml_path = f"{tmpdict}/temp.xml"  # keep the file inside the temporary directory
            cobra.io.write_sbml_model(fullsplit_model, temp_sbml_path)

            brenda_enzyme_reaction_data = brenda_select_enzyme_kinetic_data_for_sbml(
                sbml_path=temp_sbml_path,
                brenda_json_targz_file_path=f"{database_data_folder}brenda_{brenda_version}.json.tar.gz",
                bigg_metabolites_json_path=f"{database_data_folder}bigg_models_metabolites.json",
                brenda_version=brenda_version,
                base_species=base_species,
                ncbi_parsed_json_path=f"{database_data_folder}parsed_taxdmp.json",
                kinetic_ignored_metabolites=kinetic_ignored_metabolites,
                kinetic_ignored_enzyme_ids=kinetic_ignored_enzymes,
                custom_enzyme_kinetic_data=custom_kms_and_kcats,
                max_taxonomy_level=max_taxonomy_level,
                transfered_ec_number_json=transfer_json_path,
                kis_and_kas_only_for_same_compartments=kis_and_kas_only_for_same_compartments,
            )
            sabio_enzyme_reaction_data = sabio_select_enzyme_kinetic_data_for_sbml(
                sbml_path=temp_sbml_path,
                sabio_target_folder=database_data_folder,
                base_species=base_species,
                ncbi_parsed_json_path=f"{database_data_folder}parsed_taxdmp.json",
                bigg_metabolites_json_path=f"{database_data_folder}bigg_models_metabolites.json",
                kinetic_ignored_metabolites=kinetic_ignored_metabolites,
                kinetic_ignored_enzyme_ids=kinetic_ignored_enzymes,
                custom_enzyme_kinetic_data=custom_kms_and_kcats,
                max_taxonomy_level=max_taxonomy_level,
                add_hill_coefficients=add_hill_coefficients,
                transfered_ec_number_json=transfer_json_path,
                kis_and_kas_only_for_same_compartments=kis_and_kas_only_for_same_compartments,
            )

        enzyme_reaction_data = combine_enzyme_reaction_datasets(
            [
                (
                    brenda_enzyme_reaction_data
                    if prefer_brenda
                    else sabio_enzyme_reaction_data
                ),
                (
                    sabio_enzyme_reaction_data
                    if prefer_brenda
                    else brenda_enzyme_reaction_data
                ),
            ]
        )

        if database_data_folder:
            json_write(
                f"{database_data_folder}_cache_enzyme_reaction_data.json",
                enzyme_reaction_data,
            )
    else:
        enzyme_reaction_data = json_load(
            f"{database_data_folder}_cache_enzyme_reaction_data.json",
            dict[str, EnzymeReactionData | None],
        )

    with tempfile.TemporaryDirectory() as tmpdict:
        sbml_path = tmpdict + "temp.xml"
        cobra.io.write_sbml_model(fullsplit_model, sbml_path)
        enzyme_molecular_weights = uniprot_get_enzyme_molecular_weights_for_sbml(
            sbml_path=sbml_path,
            cache_basepath=database_data_folder,
            base_species=base_species,
        )
        if database_data_folder:
            json_write(
                f"{database_data_folder}_cache_uniprot_molecular_weights.json",
                enzyme_molecular_weights,
            )

        if add_protein_sequences:
            sequences = uniprot_get_enzyme_sequences_for_sbml(
                sbml_path=sbml_path,
                cache_basepath=database_data_folder,
                base_species=base_species,
            )
            if database_data_folder:
                json_write(
                    f"{database_data_folder}_cache_uniprot_sequences.json",
                    sequences,
                )
        else:
            sequences = {}

    if do_delete_enzymatically_suboptimal_reactions:
        fullsplit_model = (
            delete_enzymatically_suboptimal_reactions_in_fullsplit_cobrapy_model(
                fullsplit_model,
                enzyme_reaction_data,
                enzyme_molecular_weights,
            )
        )

    if (not database_data_folder) or (
        (database_data_folder)
        and (
            ("_cache_dG0.json" not in data_cache_files)
            or ("_cache_dG0_uncertainties.json" not in data_cache_files)
        )
    ):
        with tempfile.TemporaryDirectory() as tmpdict:
            cobra.io.write_sbml_model(fullsplit_model, tmpdict + "temp.xml")
            dG0s, dG0_uncertainties = (
                equilibrator_get_model_dG0_and_uncertainty_values_for_sbml(
                    tmpdict + "temp.xml",
                    inner_to_outer_compartments,
                    phs,
                    pmgs,
                    ionic_strenghts,
                    potential_differences,
                    dG0_exclusion_prefixes,
                    dG0_exclusion_inner_parts,
                    ignore_dG0_uncertainty,
                )
            )
        if database_data_folder:
            json_write(f"{database_data_folder}_cache_dG0.json", dG0s)
            json_write(
                f"{database_data_folder}_cache_dG0_uncertainties.json",
                dG0_uncertainties,
            )
    else:
        dG0s = json_load(f"{database_data_folder}_cache_dG0.json", dict[str, float])
        dG0_uncertainties = json_load(
            f"{database_data_folder}_cache_dG0_uncertainties.json",
            dict[str, float],
        )

        dG0_keys = list(dG0s.keys())
        for dG0_key in dG0_keys:
            if any(
                dG0_key.startswith(dG0_exclusion_prefix)
                for dG0_exclusion_prefix in dG0_exclusion_prefixes
            ) or any(
                dG0_exclusion_inner_part in dG0_key
                for dG0_exclusion_inner_part in dG0_exclusion_inner_parts
            ):
                del dG0s[dG0_key]
                if dG0_key in dG0_uncertainties:
                    del dG0_uncertainties[dG0_key]

    for key, value in dG0_corrections.items():
        dG0s[key] += value

    with tempfile.TemporaryDirectory() as tmpdict:
        cobra.io.write_sbml_model(fullsplit_model, tmpdict + "temp.xml")
        return delete_orphaned_metabolites_and_enzymes(
            get_cobrak_model_from_sbml_and_thermokinetic_data(
                sbml_path=tmpdict + "temp.xml",
                extra_linear_constraints=extra_linear_constraints,
                dG0s=dG0s,
                dG0_uncertainties=dG0_uncertainties
                if not ignore_dG0_uncertainty
                else {},
                conc_ranges=conc_ranges,
                enzyme_molecular_weights=enzyme_molecular_weights,
                enzyme_reaction_data=enzyme_reaction_data,
                max_prot_pool=max_prot_pool,
                kinetic_ignored_metabolites=kinetic_ignored_metabolites,
                enzyme_conc_ranges=enzyme_conc_ranges,
                R=R,
                T=T,
                do_delete_enzymatically_suboptimal_reactions=False,
                sequences=sequences,
                add_molar_masses=add_molar_masses,
            )
        )
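The cached-dG0 branch above re-applies the exclusion filters to a loaded dictionary: any key matching an exclusion prefix or containing an exclusion inner part is dropped, together with its uncertainty entry. The filtering idiom can be sketched in isolation; the reaction IDs and values below are made up for illustration:

```python
# Standalone sketch of the dG0 exclusion filtering; IDs and values are hypothetical.
dG0s = {"EX_glc__D_e": -2.5, "PFK": -14.2, "ATPM_pseudo": 0.0}
dG0_uncertainties = {"EX_glc__D_e": 1.0, "PFK": 0.5}
exclusion_prefixes = ["EX_"]        # e.g. drop exchange reactions
exclusion_inner_parts = ["pseudo"]  # e.g. drop artificial pseudo-reactions

for key in list(dG0s.keys()):  # copy the keys, as we delete while iterating
    if any(key.startswith(p) for p in exclusion_prefixes) or any(
        part in key for part in exclusion_inner_parts
    ):
        del dG0s[key]
        dG0_uncertainties.pop(key, None)  # uncertainty entry may be absent

print(dG0s)  # only "PFK" survives both filters
```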

molmass_functionality

Functionality for using the molmass library

add_molar_masses_to_model_metabolites(model, verbose=False)

Calculates and assigns molar masses to metabolites in a Model instance.

This function iterates through all metabolites in the provided model, parses their chemical formulas using the molmass library, and updates the molar_mass attribute for each metabolite.

Parameters:

Name Type Description Default
model Model

A COBRA-k Model instance.

required
verbose bool

If True, prints status messages regarding missing formulas, successful calculations, or parsing errors.

False

Returns:

Name Type Description
Model Model

The modified model object with updated 'molar_mass' attributes.

Raises:

Type Description
Note

molmass's FormulaError (raised e.g. for a malformed formula with unknown element symbols) is handled internally and does not propagate.

Source code in cobrak/molmass_functionality.py
@validate_call(validate_return=True)
def add_molar_masses_to_model_metabolites(
    model: Model,
    verbose: bool = False,
) -> Model:
    """Calculates and assigns molar masses to metabolites in a Model instance.

    This function iterates through all metabolites in the provided model, parses
    their chemical formulas using the `molmass` library, and updates the
    `molar_mass` attribute for each metabolite.

    Args:
        model: A COBRA-k Model instance.
        verbose: If True, prints status messages regarding missing formulas,
            successful calculations, or parsing errors.

    Returns:
        Model: The modified model object with updated 'molar_mass' attributes.

    Raises:
        Note: molmass's FormulaError (raised e.g. for a malformed formula with unknown element symbols) is handled internally
    """
    for met_id, metabolite in model.metabolites.items():
        if not metabolite.formula:
            if verbose:
                print(f"No formula given for {met_id} - no molar mass computable!")
            continue
        try:
            molmass_formula = Formula(metabolite.formula)
            metabolite.molar_mass = molmass_formula.mass
            if verbose:
                print(f"Average Mass (g⋅mol⁻¹) of {met_id}: {molmass_formula.mass:.4f}")
        except FormulaError:
            if verbose:
                print(
                    f"FormulaError with {met_id} with formula {metabolite.formula} - no molar mass computable!"
                )
    return model
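The function delegates the actual mass computation to molmass's `Formula` class. The underlying idea — summing average atomic masses parsed from a formula string — can be sketched with the standard library alone. The two-element mass table and the helper below are illustrative only; the real library covers the full periodic table and far richer formula syntax:

```python
import re

# Hypothetical, tiny stand-in for molmass: average atomic masses in g/mol.
ATOMIC_MASSES = {"H": 1.008, "C": 12.011, "O": 15.999}

def simple_molar_mass(formula: str) -> float:
    """Sum element masses for simple formulas like 'C6H12O6' (no parentheses)."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += ATOMIC_MASSES[element] * (int(count) if count else 1)
    return mass

print(f"{simple_molar_mass('H2O'):.3f}")  # water: 2*1.008 + 15.999
```

Unknown element symbols raise a `KeyError` here, mirroring the role of molmass's `FormulaError` in the function above.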

ncbi_taxonomy_functionality

ncbi_taxonomy.py

This module contains functions for accessing the NCBI Taxonomy database.

get_taxonomy_dict_from_nbci_taxonomy(organisms, parsed_json_data)

Generates a taxonomy dictionary from NCBI taxonomy data.

This function constructs a dictionary mapping each organism to its taxonomy path based on the provided NCBI taxonomy data.

Parameters:

Name Type Description Default
organisms list[str]

A list of organism names for which taxonomy paths are to be retrieved.

required
parsed_json_data dict[str, Any]

Parsed JSON data containing taxonomy information, including:
  • "number_to_names_dict": A dictionary mapping taxonomy numbers to names.
  • "names_to_number_dict": A dictionary mapping organism names to taxonomy numbers.
  • "nodes_dict": A dictionary representing the taxonomy tree structure.

required

Returns:

Type Description
dict[str, list[str]]

A dictionary where each key is an organism name and the value is a list of taxonomy names representing the path from the organism to the root of the taxonomy tree.

Source code in cobrak/ncbi_taxonomy_functionality.py
@validate_call(validate_return=True)
def get_taxonomy_dict_from_nbci_taxonomy(
    organisms: list[str],
    parsed_json_data: dict[str, Any],
) -> dict[str, list[str]]:
    """Generates a taxonomy dictionary from NCBI taxonomy data.

    This function constructs a dictionary mapping each organism to its taxonomy path based on the provided NCBI taxonomy data.

    Args:
        organisms (list[str]): A list of organism names for which taxonomy paths are to be retrieved.
        parsed_json_data (dict[str, Any]): Parsed JSON data containing taxonomy information, including:
            - "number_to_names_dict": A dictionary mapping taxonomy numbers to names.
            - "names_to_number_dict": A dictionary mapping organism names to taxonomy numbers.
            - "nodes_dict": A dictionary representing the taxonomy tree structure.

    Returns:
        dict[str, list[str]]: A dictionary where each key is an organism name and the value is a list of taxonomy names
        representing the path from the organism to the root of the taxonomy tree.
    """
    number_to_names_dict = parsed_json_data["number_to_names_dict"]
    names_to_number_dict = parsed_json_data["names_to_number_dict"]
    nodes_dict = parsed_json_data["nodes_dict"]

    organism_to_taxonomy_dicts: dict[str, list[str]] = {}
    for organism in organisms:
        try:
            node_train = [names_to_number_dict[organism]]
        except KeyError:
            organism_to_taxonomy_dicts[organism] = [organism, "all"]
            continue
        current_number = names_to_number_dict[organism]
        while True:
            next_number = nodes_dict[current_number]
            if next_number == "END":
                break
            node_train.append(next_number)
            current_number = next_number
        node_train_names = [number_to_names_dict[x][0] for x in node_train]
        organism_to_taxonomy_dicts[organism] = node_train_names
    return organism_to_taxonomy_dicts
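The while-loop above follows parent pointers in `nodes_dict` until it reaches the self-referencing root (stored as "END"), and unknown organisms fall back to the catch-all path `[organism, "all"]`. A minimal standalone sketch of that walk, with made-up taxonomy IDs:

```python
# Hypothetical, minimal NCBI-style parsed data (real taxonomy IDs differ).
names_to_number = {"Escherichia coli": "562"}
number_to_names = {"562": ["Escherichia coli"], "561": ["Escherichia"],
                   "543": ["Enterobacteriaceae"], "1": ["root"]}
nodes = {"562": "561", "561": "543", "543": "1", "1": "END"}  # child -> parent

def taxonomy_path(organism: str) -> list[str]:
    if organism not in names_to_number:
        return [organism, "all"]  # unknown organisms get a catch-all path
    train = [names_to_number[organism]]
    current = train[0]
    while (parent := nodes[current]) != "END":
        train.append(parent)
        current = parent
    return [number_to_names[number][0] for number in train]

print(taxonomy_path("Escherichia coli"))
```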

get_taxonomy_scores(base_species, taxonomy_dict)

Returns a dictionary with a taxonomic distance from the given organism.

e.g. if base_species is "Escherichia coli" and taxonomy_dict is

{
    "Escherichia coli": ["Escherichia", "Bacteria", "Organism"],
    "Pseudomonas": ["Pseudomonas", "Bacteria", "Organism"],
    "Homo sapiens": ["Homo", "Mammalia", "Animalia", "Organism"],
}

this function would return

{
    "Escherichia coli": 0,
    "Pseudomonas": 1,
    "Homo sapiens": 2,
}
Arguments
  • base_species: str ~ The species to which a relation is made.
  • taxonomy_dict: dict[str, list[str]] ~ A dictionary with organism names as keys and their taxonomic levels (sorted from nearest to farthest) as string list.
Source code in cobrak/ncbi_taxonomy_functionality.py
@validate_call(validate_return=True)
def get_taxonomy_scores(
    base_species: str,
    taxonomy_dict: dict[str, list[str]],
) -> dict[str, NonNegativeInt]:
    """Returns a dictionary with a taxonomic distance from the given organism.

    e.g. if base_species is "Escherichia coli" and taxonomy_dict is
    <pre>
    {
        "Escherichia coli": ["Escherichia", "Bacteria", "Organism"],
        "Pseudomonas": ["Pseudomonas", "Bacteria", "Organism"],
        "Homo sapiens": ["Homo", "Mammalia", "Animalia", "Organism"],
    }
    </pre>
    this function would return
    <pre>
    {
        "Escherichia coli": 0,
        "Pseudomonas": 1,
        "Homo sapiens": 2,
    }
    </pre>

    Arguments
    ----------
    * base_species: str ~ The species to which a relation is made.
    * taxonomy_dict: dict[str, list[str]] ~ A dictionary with organism names as keys and
      their taxonomic levels (sorted from nearest to farthest) as string list.
    """
    base_species_taxonomy = taxonomy_dict[base_species]
    taxonomy_scores: dict[str, int] = {
        base_species: 0,
    }
    for other_species_name, other_species_taxonomy in taxonomy_dict.items():
        score = 0
        for taxonomy_part in base_species_taxonomy:
            if taxonomy_part in other_species_taxonomy:
                break
            score += 1
        taxonomy_scores[other_species_name] = score

    return taxonomy_scores
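The scoring loop counts how many of the base species' taxonomy levels must be walked (nearest to farthest) before a level shared with the other species is found. A standalone sketch of that per-pair computation, using the docstring's example lineages:

```python
def taxonomy_score(base_taxonomy: list[str], other_taxonomy: list[str]) -> int:
    """Count base-taxonomy levels walked before reaching a shared level."""
    score = 0
    for level in base_taxonomy:
        if level in other_taxonomy:
            break
        score += 1
    return score

base = ["Escherichia", "Bacteria", "Organism"]
print(taxonomy_score(base, base))                                     # same species
print(taxonomy_score(base, ["Pseudomonas", "Bacteria", "Organism"]))  # shares "Bacteria"
```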

most_taxonomic_similar(base_species, taxonomy_dict)

Returns a dictionary with a score of taxonomic distance from the given organism.

e.g. if base_species is "Escherichia coli" and taxonomy_dict is

{
    "Escherichia coli": ["Escherichia", "Bacteria", "Organism"],
    "Pseudomonas": ["Pseudomonas", "Bacteria", "Organism"],
    "Homo sapiens": ["Homo", "Mammalia", "Animalia", "Organism"],
}

this function would return

{
    "Escherichia coli": 0,
    "Pseudomonas": 1,
    "Homo sapiens": 2,
}
Arguments
  • base_species: str ~ The species to which a relation is made.
  • taxonomy_dict: dict[str, list[str]] ~ A dictionary with organism names as keys and their taxonomic levels (sorted from nearest to farthest) as string list.
Source code in cobrak/ncbi_taxonomy_functionality.py
@validate_call(validate_return=True)
def most_taxonomic_similar(
    base_species: str, taxonomy_dict: dict[str, list[str]]
) -> dict[str, int]:
    """Returns a dictionary with a score of taxonomic distance from the given organism.

    e.g. if base_species is "Escherichia coli" and taxonomy_dict is
    <pre>
    {
        "Escherichia coli": ["Escherichia", "Bacteria", "Organism"],
        "Pseudomonas": ["Pseudomonas", "Bacteria", "Organism"],
        "Homo sapiens": ["Homo", "Mammalia", "Animalia", "Organism"],
    }
    </pre>
    this function would return
    <pre>
    {
        "Escherichia coli": 0,
        "Pseudomonas": 1,
        "Homo sapiens": 2,
    }
    </pre>

    Arguments
    ----------
    * base_species: str ~ The species to which a relation is made.
    * taxonomy_dict: dict[str, list[str]] ~ A dictionary with organism names as keys and
      their taxonomic levels (sorted from nearest to farthest) as string list.
    """
    base_taxonomy = taxonomy_dict[base_species]
    level_dict: dict[str, int] = {}
    for level, taxonomic_level in enumerate(base_taxonomy):
        level_dict[taxonomic_level] = level

    score_dict: dict[str, int] = {}
    for species, taxonomic_levels in taxonomy_dict.items():
        for taxonomic_level in taxonomic_levels:
            if taxonomic_level in level_dict:
                score_dict[species] = level_dict[taxonomic_level]
                break

    return score_dict
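Unlike `get_taxonomy_scores`, this function indexes the base species' lineage once (level name → position) and then assigns each species the index of the *first* level it shares with the base lineage. A standalone sketch of that logic, reusing the docstring's example data:

```python
def similarity_levels(
    base_taxonomy: list[str], taxonomy_dict: dict[str, list[str]]
) -> dict[str, int]:
    """Map each species to the base-taxonomy index of its first shared level."""
    level_of = {level: i for i, level in enumerate(base_taxonomy)}
    scores: dict[str, int] = {}
    for species, levels in taxonomy_dict.items():
        for level in levels:
            if level in level_of:
                scores[species] = level_of[level]
                break  # only the nearest shared level counts
    return scores

taxonomy_dict = {
    "Escherichia coli": ["Escherichia", "Bacteria", "Organism"],
    "Pseudomonas": ["Pseudomonas", "Bacteria", "Organism"],
    "Homo sapiens": ["Homo", "Mammalia", "Animalia", "Organism"],
}
print(similarity_levels(taxonomy_dict["Escherichia coli"], taxonomy_dict))
```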

parse_ncbi_taxonomy(ncbi_taxdmp_zipfile_path, ncbi_parsed_json_path)

Parses NCBI taxonomy data from a taxdump zip file and saves it as a JSON file.

This function extracts the necessary files from the NCBI taxdump zip archive, parses the taxonomy data, and writes the parsed data to a JSON file. The parsed data includes mappings from taxonomy numbers to names and vice versa, as well as the taxonomy tree structure.

Parameters:

Name Type Description Default
ncbi_taxdmp_zipfile_path str

The file path to the NCBI taxdump zip archive.

required
ncbi_parsed_json_path str

The file path where the parsed JSON data will be saved.

required
Source code in cobrak/ncbi_taxonomy_functionality.py
@validate_call(validate_return=True)
def parse_ncbi_taxonomy(
    ncbi_taxdmp_zipfile_path: str,
    ncbi_parsed_json_path: str,
) -> None:
    """Parses NCBI taxonomy data from a taxdump zip file and saves it as a JSON file.

    This function extracts the necessary files from the NCBI taxdump zip archive, parses the taxonomy data,
    and writes the parsed data to a JSON file. The parsed data includes mappings from taxonomy numbers to names
    and vice versa, as well as the taxonomy tree structure.

    Args:
        ncbi_taxdmp_zipfile_path (str): The file path to the NCBI taxdump zip archive.
        ncbi_parsed_json_path (str): The file path where the parsed JSON data will be saved.
    """
    old_wd = os.getcwd()
    folder = standardize_folder(os.path.dirname(ncbi_taxdmp_zipfile_path))
    filename = os.path.basename(ncbi_taxdmp_zipfile_path)
    os.chdir(folder)

    with ZipFile(filename, "r") as zipfile:
        zipfile.extract("names.dmp")
        zipfile.extract("nodes.dmp")

    with open("names.dmp", encoding="utf-8") as f:
        name_lines = f.readlines()
    with open("nodes.dmp", encoding="utf-8") as f:
        node_lines = f.readlines()

    os.remove("names.dmp")
    os.remove("nodes.dmp")
    os.chdir(old_wd)

    parsed_json_data = {}

    number_to_names_dict: dict[str, Any] = {}
    names_to_number_dict = {}
    for line in name_lines:
        if ("scientific name" not in line) and ("synonym" not in line):
            continue
        number = line.split("|")[0].lstrip().rstrip()
        name = line.split("|")[1].lstrip().rstrip()
        if number not in number_to_names_dict:
            number_to_names_dict[number] = []
        number_to_names_dict[number].append(name)
        names_to_number_dict[name] = number

    parsed_json_data["number_to_names_dict"] = number_to_names_dict
    parsed_json_data["names_to_number_dict"] = names_to_number_dict

    nodes_dict = {}
    for line in node_lines:
        begin = line.split("|")[0].lstrip().rstrip()
        end = line.split("|")[1].lstrip().rstrip()
        if begin == end:
            nodes_dict[begin] = "END"
        else:
            nodes_dict[begin] = end
    parsed_json_data["nodes_dict"] = nodes_dict
    json_zip_write(ncbi_parsed_json_path, parsed_json_data)
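The `names.dmp` file is pipe-separated with tab padding, and only lines of class "scientific name" or "synonym" are kept. The split-and-strip parsing above can be sketched on a few hypothetical lines in the taxdump format:

```python
# Hypothetical names.dmp-style lines (tab/pipe-separated, as in an NCBI taxdump).
name_lines = [
    "562\t|\tEscherichia coli\t|\t\t|\tscientific name\t|\n",
    "562\t|\tE. coli\t|\t\t|\tsynonym\t|\n",
    "562\t|\tBacillus coli\t|\t\t|\tin-part\t|\n",  # skipped: unwanted name class
]

number_to_names: dict[str, list[str]] = {}
names_to_number: dict[str, str] = {}
for line in name_lines:
    if ("scientific name" not in line) and ("synonym" not in line):
        continue
    number = line.split("|")[0].strip()  # taxonomy ID before the first pipe
    name = line.split("|")[1].strip()    # name between first and second pipe
    number_to_names.setdefault(number, []).append(name)
    names_to_number[name] = number

print(number_to_names)  # both kept names map to taxonomy ID "562"
```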

nlps

This file contains all non-linear program (NLP) functions, including the evolutionary NLP optimization algorithm, that can be used with COBRAk models. With NLPs, all types of constraints (stoichiometric, enzymatic, κ, γ, ι, ...) can be integrated. However, NLPs can be very slow. For linear programs (LPs) and mixed-integer linear programs (MILPs), see lps.py in the same folder.

add_loop_constraints_to_nlp(model, cobrak_model)

Adds loop constraints to a non-linear program (NLP) model.

The loop constraints are of the nonlinear form v_fwd * v_rev = 0.0 for any forward/reverse pair of split reversible reactions.

Parameters

  • model (ConcreteModel): The NLP model to add constraints to.
  • cobrak_model (Model): The COBRAk model associated with the NLP model.

Returns

  • ConcreteModel: The NLP model with added loop constraints.

Source code in cobrak/nlps.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def add_loop_constraints_to_nlp(
    model: ConcreteModel,
    cobrak_model: Model,
) -> ConcreteModel:
    """Adds loop constraints to a non-linear program (NLP) model.

    The loop constraints are of the nonlinear form v_fwd * v_rev = 0.0
    for any forward/reverse pair of split reversible reactions.

    Parameters
    * `model` (`ConcreteModel`): The NLP model to add constraints to.
    * `cobrak_model` (`Model`): The COBRAk model associated with the NLP model.

    Returns
    * `ConcreteModel`: The NLP model with added loop constraints.
    """
    model_var_names = [v.name for v in model.component_objects(Var)]
    for reac_id, reaction in cobrak_model.reactions.items():
        if reaction.dG0 is not None:
            continue
        if not reac_id.endswith(cobrak_model.rev_suffix):
            continue
        other_reac_id = reac_id.replace(
            cobrak_model.rev_suffix, cobrak_model.fwd_suffix
        )
        if other_reac_id not in model_var_names:
            continue

        setattr(
            model,
            f"loop_constraint_{reac_id}",
            Constraint(
                rule=getattr(model, reac_id) * getattr(model, other_reac_id) == 0.0
            ),
        )

    return model
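The complementarity constraint v_fwd * v_rev = 0 forbids simultaneous flux through both directions of a split reversible reaction. Its effect can be checked on plain numbers, independently of Pyomo; the flux values below are made up:

```python
def satisfies_loop_constraint(v_fwd: float, v_rev: float, tol: float = 1e-9) -> bool:
    """True iff at most one direction of the split reaction carries flux."""
    return abs(v_fwd * v_rev) <= tol

print(satisfies_loop_constraint(1.5, 0.0))  # only forward active: allowed
print(satisfies_loop_constraint(0.0, 0.0))  # no flux at all: allowed
print(satisfies_loop_constraint(1.5, 2.0))  # both active: a forbidden loop
```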

get_nlp_from_cobrak_model(cobrak_model, ignored_reacs=[], with_kappa=True, with_gamma=True, with_iota=False, with_alpha=False, approximation_value=0.0001, irreversible_mode=False, variability_data={}, strict_mode=False, single_strict_reacs=[], irreversible_mode_min_mdf=STANDARD_MIN_MDF, with_flux_sum_var=False, correction_config=CorrectionConfig())

Creates a pyomo non-linear program (NLP) model instance from a COBRAk Model.

For more, see COBRAk's NLP documentation chapter.

Parameters

  • cobrak_model (Model): The COBRAk model to create the NLP model from.
  • ignored_reacs (list[str], optional): List of reaction IDs to ignore. Defaults to [].
  • with_kappa (bool, optional): Whether to include κ saturation terms. Defaults to True.
  • with_gamma (bool, optional): Whether to include γ thermodynamic terms. Defaults to True.
  • with_iota (bool, optional): Whether to include ι inhibition terms. Defaults to False; this option is untested!
  • with_alpha (bool, optional): Whether to include α activation terms. Defaults to False; this option is untested!
  • approximation_value (float, optional): Approximation value for κ, γ, ι, and α terms. Defaults to 0.0001. This value is the minimal value for κ, γ, ι, and α terms, and can lead to an overapproximation in this regard.
  • irreversible_mode (bool, optional): Whether to use irreversible mode. Defaults to False.
  • variability_data (dict[str, tuple[float, float]], optional): Variability data for reactions. Defaults to {}.
  • strict_mode (bool, optional): Whether to use strict mode (i.e. all <= heuristics become == relations). Defaults to False.
  • single_strict_reacs (list[str], optional): If strict_mode is False, only reactions whose IDs are in this list are set to strict mode.
  • irreversible_mode_min_mdf (float, optional): Minimum MDF value for irreversible mode. Defaults to STANDARD_MIN_MDF.
  • with_flux_sum_var (bool, optional): Whether to include a flux sum variable of name cobrak.constants.FLUX_SUM_VAR. Defaults to False.
  • correction_config (CorrectionConfig, optional): Parameter correction configuration. Defaults to CorrectionConfig().

Returns

  • ConcreteModel: The created NLP model.
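One preprocessing step inside this function: any reaction direction that the variability data forces to carry flux (min_flux above the 1e-6 threshold) makes its opposite forward/reverse partner ignorable, since the loop constraint would forbid both directions being active anyway. A standalone sketch of that pairing logic — the reaction IDs are hypothetical, and the real code derives the suffixes from the model (shown here hard-coded as "_FWD"/"_REV"):

```python
# Hypothetical variability data: reaction ID -> (min_flux, max_flux).
variability = {"PGI_FWD": (0.5, 2.0), "PYK_REV": (0.0, 1.0)}
reac_ids = ["PGI_FWD", "PGI_REV", "PYK_FWD", "PYK_REV"]

enforced: list[str] = []
ignored: list[str] = []
for reac_id, (min_flux, _) in variability.items():
    if min_flux < 1e-6:  # this direction is not strictly enforced
        continue
    enforced.append(reac_id)
    # The enforced direction's partner can never carry flux simultaneously.
    if reac_id.endswith("_REV"):
        other_id = reac_id.removesuffix("_REV") + "_FWD"
    elif reac_id.endswith("_FWD"):
        other_id = reac_id.removesuffix("_FWD") + "_REV"
    else:
        continue
    if other_id in reac_ids:
        ignored.append(other_id)

print(enforced, ignored)  # PGI_FWD is enforced, so PGI_REV can be ignored
```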
Source code in cobrak/nlps.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def get_nlp_from_cobrak_model(
    cobrak_model: Model,
    ignored_reacs: list[str] = [],
    with_kappa: bool = True,
    with_gamma: bool = True,
    with_iota: bool = False,
    with_alpha: bool = False,
    approximation_value: float = 0.0001,
    irreversible_mode: bool = False,
    variability_data: dict[str, tuple[float, float]] = {},
    strict_mode: bool = False,
    single_strict_reacs: list[str] = [],
    irreversible_mode_min_mdf: float = STANDARD_MIN_MDF,
    with_flux_sum_var: bool = False,
    correction_config: CorrectionConfig = CorrectionConfig(),
) -> ConcreteModel:
    """Creates a pyomo non-linear program (NLP) model instance from a COBRAk Model.

    For more, see COBRAk's NLP documentation chapter.

    # Parameters
    * `cobrak_model` (`Model`): The COBRAk model to create the NLP model from.
    * `ignored_reacs` (`list[str]`, optional): List of reaction IDs to ignore. Defaults to `[]`.
    * `with_kappa` (`bool`, optional): Whether to include κ saturation terms. Defaults to `True`.
    * `with_gamma` (`bool`, optional): Whether to include γ thermodynamic terms. Defaults to `True`.
    * `with_iota` (`bool`, optional): Whether to include ι inhibition terms. Defaults to `False`; this option is untested!
    * `with_alpha` (`bool`, optional): Whether to include α activation terms. Defaults to `False`; this option is untested!
    * `approximation_value` (`float`, optional): Approximation value for κ, γ, ι, and α terms. Defaults to `0.0001`. This value is the
       minimal value for κ, γ, ι, and α terms, and can lead to an overapproximation in this regard.
    * `irreversible_mode` (`bool`, optional): Whether to use irreversible mode. Defaults to `False`.
    * `variability_data` (`dict[str, tuple[float, float]]`, optional): Variability data for reactions. Defaults to `{}`.
    * `strict_mode` (`bool`, optional): Whether to use strict mode (i.e. all <= heuristics become == relations). Defaults to `False`.
    * `single_strict_reacs` (`list[str]`, optional): If `strict_mode` is `False`, only reactions whose IDs are in this list are set to strict mode.
    * `irreversible_mode_min_mdf` (`float`, optional): Minimum MDF value for irreversible mode. Defaults to `STANDARD_MIN_MDF`.
    * `with_flux_sum_var` (`bool`, optional): Whether to include a flux sum variable of name ```cobrak.constants.FLUX_SUM_VAR```. Defaults to `False`.
    * `correction_config` (`CorrectionConfig`, optional): Parameter correction configuration. Defaults to `CorrectionConfig()`.

    # Returns
    * `ConcreteModel`: The created NLP model.
    """
    cobrak_model = deepcopy(cobrak_model)

    reac_ids = list(cobrak_model.reactions.keys())
    enforced_reacs: list[str] = []
    ignored_reacs = deepcopy(ignored_reacs)
    for reac_id in variability_data:
        if reac_id not in reac_ids:
            continue
        min_flux = variability_data[reac_id][0]
        if min_flux < 1e-6:
            continue
        enforced_reacs.append(reac_id)

        if reac_id.endswith("_REV"):
            other_id = reac_id.replace("_REV", "_FWD")
        elif reac_id.endswith("_FWD"):
            other_id = reac_id.replace("_FWD", "_REV")
        else:
            continue
        if other_id in reac_ids:
            ignored_reacs.append(other_id)

    model = get_lp_from_cobrak_model(
        cobrak_model=cobrak_model,
        ignored_reacs=ignored_reacs,
        with_enzyme_constraints=True,
        with_thermodynamic_constraints=False,
        with_loop_constraints=False,
        add_extra_linear_constraints=False,
        with_flux_sum_var=with_flux_sum_var,
        correction_config=CorrectionConfig(
            add_kcat_times_e_error_term=correction_config.add_kcat_times_e_error_term,
            kcat_times_e_error_cutoff=correction_config.kcat_times_e_error_cutoff,
            max_rel_kcat_times_e_correction=correction_config.max_rel_kcat_times_e_correction,
            add_error_sum_term=False,
        ),
    )
    model = _add_concentration_vars_and_constraints(model, cobrak_model)

    if correction_config.add_kcat_times_e_error_term:
        model_vars = get_model_var_names(model)

    if correction_config.add_km_error_term:
        kms_lowbound, kms_highbound = _get_km_bounds(
            cobrak_model, correction_config.km_error_cutoff
        )
    else:
        kms_lowbound, kms_highbound = 0.0, 0.0

    if correction_config.add_dG0_error_term:
        dG0_highbound = _get_dG0_highbound(
            cobrak_model, correction_config.dG0_error_cutoff
        )
    else:
        dG0_highbound = 0.0

    setattr(
        model,
        MDF_VAR_ID,
        Var(within=Reals, bounds=(irreversible_mode_min_mdf, 1_000_000)),
    )
    # Set "MM" constraints
    if not irreversible_mode:
        reaction_couples = get_stoichiometrically_coupled_reactions(
            cobrak_model=cobrak_model,
        )
        reac_id_to_reac_couple_id: dict[str, str] = {}
        for couple in reaction_couples:
            for reac_id in couple:
                reac_id_to_reac_couple_id[reac_id] = "".join(couple)
        created_z_vars = []

    if with_alpha or with_iota:
        model_var_names = get_model_var_names(model)
    for reac_id, reaction in cobrak_model.reactions.items():
        if reac_id in ignored_reacs:
            continue

        if with_gamma and reaction.dG0 is not None:
            model, f_var_name = _add_df_and_dG0_var_for_reaction(
                model,
                reac_id,
                reaction,
                cobrak_model,
                strict_df_equality=strict_mode or reac_id in single_strict_reacs,
                add_error_term=correction_config.add_dG0_error_term
                and (reaction.dG0 >= dG0_highbound),
                max_abs_dG0_correction=correction_config.max_abs_dG0_correction,
            )

            if (
                not irreversible_mode
                and variability_data[reac_id][0] == 0.0
                and variability_data[reac_id][1] != 0.0
            ):
                z_varname = f"{Z_VAR_PREFIX}{reac_id_to_reac_couple_id[reac_id]}"
                if z_varname not in created_z_vars:
                    setattr(model, z_varname, Var(within=Binary))
                    created_z_vars.append(z_varname)

                # Big-M 0: r_i <= lb * z_i
                bigm_optmdfpathway_0_constraint = getattr(
                    model, reac_id
                ) <= reaction.max_flux * getattr(model, z_varname)
                setattr(
                    model,
                    f"bigm_optmdfpathway_0_{reac_id}",
                    Constraint(rule=bigm_optmdfpathway_0_constraint),
                )

                # Big-M 1: f_i + (1-z_i) * M_i >= var_B
                bigm_optmdfpathway_1_constraint = getattr(model, f_var_name) + (
                    1 - getattr(model, z_varname)
                ) * BIG_M >= getattr(model, MDF_VAR_ID)

                setattr(
                    model,
                    f"bigm_optmdfpathway_1_{reac_id}",
                    Constraint(rule=bigm_optmdfpathway_1_constraint),
                )
            elif reac_id in variability_data and variability_data[reac_id][1] != 0.0:
                mdf_constraint = getattr(model, f_var_name) >= getattr(
                    model, MDF_VAR_ID
                )

                setattr(
                    model,
                    f"mdf_constraint_{reac_id}",
                    Constraint(rule=mdf_constraint),
                )

        if (reaction.enzyme_reaction_data is None) or (
            reaction.enzyme_reaction_data.k_cat > 1e19
        ):
            continue

        # Determine whether or not κ, γ, ι and α are possible to add to the reaction
        # given its current kinetic and thermodynamic data.
        has_gamma = True
        has_kappa = True
        if not have_all_unignored_km(
            reaction, cobrak_model.kinetic_ignored_metabolites
        ):
            has_kappa = False
        if reaction.dG0 is None:
            has_gamma = False
        if (not has_kappa) and (not has_gamma):
            continue
        has_iota = reaction.enzyme_reaction_data.k_is != {}
        has_alpha = reaction.enzyme_reaction_data.k_as != {}

        reac_full_enzyme_id = get_full_enzyme_id(
            reaction.enzyme_reaction_data.identifiers
        )
        if not reac_full_enzyme_id:  # E.g., in ATPM
            continue
        enzyme_var_id = get_reaction_enzyme_var_id(reac_id, reaction)

        # V+
        k_cat = reaction.enzyme_reaction_data.k_cat

        if correction_config.add_kcat_times_e_error_term:
            kcat_times_e_error_var_id = f"{ERROR_VAR_PREFIX}_kcat_times_e_{reac_id}"
            if kcat_times_e_error_var_id in model_vars:
                v_plus = getattr(model, enzyme_var_id) * k_cat + getattr(
                    model, kcat_times_e_error_var_id
                )
            else:
                v_plus = getattr(model, enzyme_var_id) * k_cat
        else:
            v_plus = getattr(model, enzyme_var_id) * k_cat

        # κ (for solver stability, with a minimal value of 0.0001)
        if has_kappa and with_kappa:
            model, kappa_substrates_var_id, kappa_products_var_id = (
                _add_kappa_substrates_and_products_vars(
                    model,
                    reac_id,
                    reaction,
                    cobrak_model,
                    strict_kappa_products_equality=strict_mode
                    or reac_id in single_strict_reacs,
                    add_error_term=correction_config.add_km_error_term,
                    max_rel_km_correction=correction_config.max_rel_km_correction,
                    kms_lowbound=kms_lowbound,
                    kms_highbound=kms_highbound,
                )
            )

            kappa_var_id = f"{KAPPA_VAR_PREFIX}{reac_id}"
            setattr(
                model,
                kappa_var_id,
                Var(within=Reals, bounds=(approximation_value, 1.0)),
            )
            kappa_rhs = approximation_value + exp(
                getattr(model, kappa_substrates_var_id)
            ) / (
                1
                + exp(getattr(model, kappa_substrates_var_id))
                + exp(getattr(model, kappa_products_var_id))
            )
            if strict_mode or reac_id in single_strict_reacs:
                kappa_constraint = getattr(model, kappa_var_id) == kappa_rhs
            else:
                kappa_constraint = getattr(model, kappa_var_id) <= kappa_rhs
            setattr(
                model, f"kappa_constraint_{reac_id}", Constraint(rule=kappa_constraint)
            )

        # γ (for solver stability, with a minimal value of 0.0001)
        if has_gamma and with_gamma:
            gamma_var_name = f"{GAMMA_VAR_PREFIX}{reac_id}"

            min_gamma_value = (
                approximation_value if irreversible_mode else -float("inf")
            )
            setattr(
                model,
                gamma_var_name,
                Var(within=Reals, bounds=(min_gamma_value, 1.0)),
            )
            f_by_RT = getattr(model, f_var_name) / (cobrak_model.R * cobrak_model.T)

            if irreversible_mode:
                gamma_rhs = approximation_value + (1 - exp(-f_by_RT))
            else:
                gamma_rhs = (
                    approximation_value
                    + (
                        1
                        - exp(
                            -f_by_RT
                        )  # * getattr(model, f"{Z_VAR_PREFIX}{reac_id_to_reac_couple_id[reac_id]}")
                    )
                )  # (f_by_RT**2) / (1 + (f_by_RT**2)) would be a rough approximation

            if strict_mode or reac_id in single_strict_reacs:
                gamma_var_constraint_0 = getattr(model, gamma_var_name) == gamma_rhs
            else:
                gamma_var_constraint_0 = getattr(model, gamma_var_name) <= gamma_rhs
            setattr(
                model,
                f"gamma_var_constraint_{reac_id}_0",
                Constraint(rule=gamma_var_constraint_0),
            )

        # ι (for solver stability, with a minimal value of 0.0001)
        if with_iota and has_iota:
            iota_product = 1.0
            for met_id, k_i in reaction.enzyme_reaction_data.k_is.items():
                if met_id in cobrak_model.kinetic_ignored_metabolites:
                    continue
                var_id = f"{LNCONC_VAR_PREFIX}{met_id}"
                if var_id not in model_var_names:
                    continue
                stoichiometry = abs(
                    reaction.stoichiometries.get(met_id, 1.0)
                ) * reaction.enzyme_reaction_data.hill_coefficients.iota.get(
                    met_id, 1.0
                )
                term_without_error = True
                if (
                    correction_config.add_ki_error_term
                ):  # Error term to make k_I *higher*
                    all_kis = get_model_kis(cobrak_model)
                    if (
                        k_i
                        < all_kis[
                            : ceil(correction_config.ki_error_cutoff * len(all_kis))
                        ][-1]
                    ):
                        term_without_error = False
                        ki_error_var = (
                            f"{ERROR_VAR_PREFIX}____{reac_id}____{met_id}____iota"
                        )
                        # setattr() returns None; store the variable's name and
                        # register the Var under it so getattr() works below
                        setattr(
                            model,
                            ki_error_var,
                            Var(
                                within=Reals,
                                bounds=(
                                    0.0,
                                    log(correction_config.max_rel_ki_correction * k_i),
                                ),
                            ),
                        )
                        iota_product *= 1 / (
                            1
                            + exp(
                                stoichiometry * getattr(model, var_id)
                                - stoichiometry * log(k_i)
                                + stoichiometry * getattr(model, ki_error_var)
                            )
                        )
                if term_without_error:
                    iota_product *= 1 / (
                        1
                        + exp(
                            stoichiometry * getattr(model, var_id)
                            - stoichiometry * log(k_i)
                        )
                    )
            iota_var_name = f"{IOTA_VAR_PREFIX}{reac_id}"
            setattr(
                model,
                iota_var_name,
                Var(within=Reals, bounds=(approximation_value, 1.0)),
            )
            if strict_mode or reac_id in single_strict_reacs:
                iota_var_constraint_0 = (
                    getattr(model, iota_var_name) == approximation_value + iota_product
                )
            else:
                iota_var_constraint_0 = (
                    getattr(model, iota_var_name) <= approximation_value + iota_product
                )
            setattr(
                model,
                f"iota_var_constraint_{reac_id}_0",
                Constraint(rule=iota_var_constraint_0),
            )

        if with_alpha and has_alpha:
            alpha_product = 1.0
            for met_id, k_a in reaction.enzyme_reaction_data.k_as.items():
                if met_id in cobrak_model.kinetic_ignored_metabolites:
                    continue
                var_id = f"{LNCONC_VAR_PREFIX}{met_id}"
                if var_id not in model_var_names:
                    continue
                stoichiometry = abs(
                    reaction.stoichiometries.get(met_id, 1.0)
                ) * reaction.enzyme_reaction_data.hill_coefficients.alpha.get(
                    met_id, 1.0
                )

                term_without_error = True
                if (
                    correction_config.add_ki_error_term
                ):  # Error term to make k_A *lower*
                    all_kas = get_model_kas(cobrak_model)
                    if (
                        k_a
                        > all_kas[
                            floor(
                                (1 - correction_config.ka_error_cutoff) * len(all_kas)
                            ) :
                        ][0]
                    ):
                        term_without_error = False
                        ka_error_var = (
                            f"{ERROR_VAR_PREFIX}____{reac_id}____{met_id}____alpha"
                        )
                        # setattr() returns None; store the variable's name and
                        # register the Var under it so getattr() works below.
                        # The bound uses this loop's k_a, not the iota loop's k_i
                        setattr(
                            model,
                            ka_error_var,
                            Var(
                                within=Reals,
                                bounds=(
                                    0.0,
                                    log(correction_config.max_rel_ki_correction * k_a),
                                ),
                            ),
                        )
                        alpha_product *= 1 / (
                            1
                            + exp(
                                stoichiometry * log(k_a)
                                - stoichiometry * getattr(model, var_id)
                                - stoichiometry * getattr(model, ka_error_var)
                            )
                        )
                if term_without_error:
                    alpha_product *= 1 / (
                        1
                        + exp(
                            stoichiometry * log(k_a)
                            - stoichiometry * getattr(model, var_id)
                        )
                    )

            alpha_var_name = f"{ALPHA_VAR_PREFIX}{reac_id}"
            setattr(
                model,
                alpha_var_name,
                Var(within=Reals, bounds=(approximation_value, 1.0)),
            )

            if strict_mode or reac_id in single_strict_reacs:
                alpha_var_constraint_0 = (
                    getattr(model, alpha_var_name)
                    == approximation_value + alpha_product
                )
            else:
                alpha_var_constraint_0 = (
                    getattr(model, alpha_var_name)
                    <= approximation_value + alpha_product
                )
            setattr(
                model,
                f"alpha_var_constraint_{reac_id}_0",
                Constraint(rule=alpha_var_constraint_0),
            )

        # Build kinetic term for reaction according to included parts
        kinetic_rhs = v_plus
        if has_kappa and with_kappa:
            kinetic_rhs *= getattr(model, kappa_var_id)
        if has_gamma and with_gamma:
            kinetic_rhs *= getattr(model, gamma_var_name)
        if has_iota and with_iota:
            kinetic_rhs *= getattr(model, iota_var_name)
        if has_alpha and with_alpha:
            kinetic_rhs *= getattr(model, alpha_var_name)

        # Apply strict mode
        if strict_mode or reac_id in single_strict_reacs:
            setattr(
                model,
                f"full_reac_constraint_{reac_id}",
                Constraint(rule=getattr(model, reac_id) == kinetic_rhs),
            )
        else:
            setattr(
                model,
                f"full_reac_constraint_{reac_id}",
                Constraint(rule=getattr(model, reac_id) <= kinetic_rhs),
            )

    model = _add_extra_watches_and_constraints_to_lp(
        model, cobrak_model, ignore_nonlinear_terms=False
    )
    if is_any_error_term_active(correction_config):
        if correction_config.error_scenario != {}:
            _apply_error_scenario(
                model,
                cobrak_model,
                correction_config,
            )
        if correction_config.add_error_sum_term:
            model = _add_error_sum_to_model(
                model,
                cobrak_model,
                correction_config,
            )

    ########################
    if (
        cobrak_model.max_conc_sum < float("inf")
        or cobrak_model.include_mets_in_prot_pool
    ):
        met_sum_ids: list[str] = []
        for var_id in get_model_var_names(model):
            if not var_id.startswith(LNCONC_VAR_PREFIX):
                continue
            if not any(
                var_id.endswith(suffix)
                for suffix in cobrak_model.conc_sum_include_suffixes
            ):
                continue
            if any(
                var_id.replace(LNCONC_VAR_PREFIX, "").startswith(prefix)
                for prefix in cobrak_model.conc_sum_ignore_prefixes
            ):
                continue
            met_sum_ids.append(var_id)

        conc_sum_expr = 0.0
        for met_sum_id in met_sum_ids:
            met_id = met_sum_id[len(LNCONC_VAR_PREFIX) :]
            if (
                cobrak_model.include_mets_in_prot_pool
                and cobrak_model.metabolites[met_id].molar_mass
            ):
                conc_sum_expr += cobrak_model.metabolites[met_id].molar_mass * exp(
                    getattr(model, met_sum_id)
                )
            else:
                conc_sum_expr += exp(getattr(model, met_sum_id))

        setattr(
            model,
            "met_sum_var",
            Var(within=Reals, bounds=(1e-5, cobrak_model.max_conc_sum)),
        )

        if cobrak_model.include_mets_in_prot_pool:
            setattr(
                model,
                GENERALIZED_SUM_CONSTRAINT_NAME,
                Constraint(
                    rule=getattr(model, PROT_POOL_REAC_NAME)
                    + getattr(model, "met_sum_var")
                    <= cobrak_model.max_prot_pool
                ),
            )
        else:
            setattr(
                model,
                "met_sum_constraint",
                Constraint(rule=conc_sum_expr <= getattr(model, "met_sum_var")),
            )
    ################

    return model

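At a solution point, the κ and γ constraints constructed above reduce to simple closed-form expressions. A stdlib-only sketch of those expressions (the helper names and the R, T values here are illustrative assumptions; the actual model uses its own `cobrak_model.R` and `cobrak_model.T`):

```python
from math import exp

def kappa_value(ln_sub: float, ln_prod: float, eps: float = 0.0001) -> float:
    # κ = ε + e^S / (1 + e^S + e^P), as in kappa_constraint_<reac_id>;
    # S and P are the log-scale substrate and product terms
    return eps + exp(ln_sub) / (1 + exp(ln_sub) + exp(ln_prod))

def gamma_value(driving_force: float, R: float = 8.314e-3, T: float = 298.15,
                eps: float = 0.0001) -> float:
    # γ = ε + (1 - e^(-f/(RT))), as in gamma_var_constraint_<reac_id>_0
    # (irreversible mode); f is the reaction's driving force variable
    return eps + (1 - exp(-driving_force / (R * T)))

# Equal substrate and product terms give κ ≈ ε + 1/3:
print(round(kappa_value(0.0, 0.0), 4))  # 0.3334
# A strongly driven reaction (f = 50 kJ/mol) pushes γ toward its upper bound:
print(round(gamma_value(50.0), 4))  # 1.0001
```

The `eps` offset corresponds to `approximation_value`, which keeps the terms away from 0 for solver stability.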
perform_nlp_irreversible_optimization(cobrak_model, objective_target, objective_sense, variability_dict, with_kappa=True, with_gamma=True, with_iota=False, with_alpha=False, approximation_value=0.0001, verbose=False, strict_mode=False, single_strict_reacs=[], min_mdf=STANDARD_MIN_MDF, solver=IPOPT, min_flux=0.0, with_flux_sum_var=False, correction_config=CorrectionConfig(), var_data_abs_epsilon=1e-05)

Performs an irreversible non-linear program (NLP) optimization on a COBRAk model.

For more about the NLP, see the COBRAk documentation's NLP chapter.

Parameters

  • cobrak_model (Model): The COBRAk model to optimize.
  • objective_target (str | dict[str, float]): The objective target (reaction ID or dictionary of reaction IDs and coefficients).
  • objective_sense (int): The objective sense (1 for maximization, -1 for minimization).
  • variability_dict (dict[str, tuple[float, float]]): Dictionary of reaction IDs and their variability (lower and upper bounds).
  • with_kappa (bool, optional): Whether to include κ saturation terms. Defaults to True.
  • with_gamma (bool, optional): Whether to include γ thermodynamic terms. Defaults to True.
  • with_iota (bool, optional): Whether to include ι inhibition terms. Defaults to False and untested!
  • with_alpha (bool, optional): Whether to include α activation terms. Defaults to False and untested!
  • approximation_value (float, optional): Approximation value for κ, γ, ι, and α terms. Defaults to 0.0001. This value is the minimal value for κ, γ, ι, and α terms, and can lead to an overapproximation in this regard.
  • verbose (bool, optional): Whether to print solver output. Defaults to False.
  • strict_mode (bool, optional): Whether to use strict mode (i.e. all <= heuristics become == relations). Defaults to False.
  • single_strict_reacs (list[str], optional): If 'strict_mode==False', only reactions with an ID in this list are set to strict mode.
  • min_mdf (float, optional): Minimum MDF value. Defaults to STANDARD_MIN_MDF.
  • solver (Solver, optional): Used NLP solver. Defaults to IPOPT.
  • min_flux (float, optional): Minimum flux value. Defaults to 0.0.
  • with_flux_sum_var (bool, optional): Whether to include a reaction flux sum variable of name cobrak.constants.FLUX_SUM_VAR. Defaults to False.
  • correction_config (CorrectionConfig, optional): Parameter correction configuration. Defaults to CorrectionConfig().
  • var_data_abs_epsilon (float, optional): Any value from the variability dict with an absolute value below this threshold is treated as 0. Defaults to 1e-5.

Returns

  • dict[str, float]: The optimization results.
Source code in cobrak/nlps.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def perform_nlp_irreversible_optimization(
    cobrak_model: Model,
    objective_target: str | dict[str, float],
    objective_sense: int,
    variability_dict: dict[str, tuple[float, float]],
    with_kappa: bool = True,
    with_gamma: bool = True,
    with_iota: bool = False,
    with_alpha: bool = False,
    approximation_value: NonNegativeFloat = 0.0001,
    verbose: bool = False,
    strict_mode: bool = False,
    single_strict_reacs: list[str] = [],
    min_mdf: float = STANDARD_MIN_MDF,
    solver: Solver = IPOPT,
    min_flux: NonNegativeFloat = 0.0,
    with_flux_sum_var: bool = False,
    correction_config: CorrectionConfig = CorrectionConfig(),
    var_data_abs_epsilon: float = 1e-5,
) -> dict[str, float]:
    """Performs an irreversible non-linear program (NLP) optimization on a COBRAk model.

    For more about the NLP, see the COBRAk documentation's NLP chapter.

    # Parameters
    * `cobrak_model` (`Model`): The COBRAk model to optimize.
    * `objective_target` (`str | dict[str, float]`): The objective target (reaction ID or dictionary of reaction IDs and coefficients).
    * `objective_sense` (`int`): The objective sense (1 for maximization, -1 for minimization).
    * `variability_dict` (`dict[str, tuple[float, float]]`): Dictionary of reaction IDs and their variability (lower and upper bounds).
    * `with_kappa` (`bool`, optional): Whether to include κ saturation terms. Defaults to `True`.
    * `with_gamma` (`bool`, optional): Whether to include γ thermodynamic terms. Defaults to `True`.
    * `with_iota` (`bool`, optional): Whether to include ι inhibition terms. Defaults to `False` and untested!
    * `with_alpha` (`bool`, optional): Whether to include α activation terms. Defaults to `False` and untested!
    * `approximation_value` (`float`, optional): Approximation value for κ, γ, ι, and α terms. Defaults to `0.0001`. This value is the
       minimal value for κ, γ, ι, and α terms, and can lead to an overapproximation in this regard.
    * `verbose` (`bool`, optional): Whether to print solver output. Defaults to `False`.
    * `strict_mode` (`bool`, optional): Whether to use strict mode (i.e. all <= heuristics become == relations). Defaults to `False`.
    * `single_strict_reacs` (`list[str]`, optional): If 'strict_mode==False', only reactions with an ID in this list are set to strict mode.
    * `min_mdf` (`float`, optional): Minimum MDF value. Defaults to `STANDARD_MIN_MDF`.
    * `solver` (`Solver`, optional): Used NLP solver. Defaults to `IPOPT`.
    * `min_flux` (`float`, optional): Minimum flux value. Defaults to `0.0`.
    * `with_flux_sum_var` (`bool`, optional): Whether to include a reaction flux sum variable of name ```cobrak.constants.FLUX_SUM_VAR```. Defaults to `False`.
    * `correction_config` (`CorrectionConfig`, optional): Parameter correction configuration. Defaults to `CorrectionConfig()`.
    * `var_data_abs_epsilon` (`float`, optional): Any value from the variability dict with an absolute value below this threshold is treated as 0. Defaults to `1e-5`.

    # Returns
    * `dict[str, float]`: The optimization results.
    """
    nlp_model = get_nlp_from_cobrak_model(
        cobrak_model,
        with_kappa=with_kappa,
        with_gamma=with_gamma,
        with_iota=with_iota,
        with_alpha=with_alpha,
        approximation_value=approximation_value,
        irreversible_mode=True,
        variability_data=variability_dict,
        strict_mode=strict_mode,
        single_strict_reacs=single_strict_reacs,
        irreversible_mode_min_mdf=min_mdf,
        with_flux_sum_var=with_flux_sum_var,
        correction_config=correction_config,
    )
    variability_dict = deepcopy(variability_dict)
    if min_flux != 0.0:
        for reac_id in cobrak_model.reactions:
            if (reac_id in variability_dict) and (
                (variability_dict[reac_id][0] == 0.0)
                and (variability_dict[reac_id][1] >= min_flux)
            ):
                variability_dict[reac_id] = (min_flux, variability_dict[reac_id][1])

    nlp_model = apply_variability_dict(
        nlp_model,
        cobrak_model,
        variability_dict,
        correction_config.error_scenario,
        var_data_abs_epsilon,
    )
    nlp_model.obj = get_objective(nlp_model, objective_target, objective_sense)
    pyomo_solver = get_solver(solver)
    results = pyomo_solver.solve(nlp_model, tee=verbose, **solver.solve_extra_options)
    mmtfba_dict = get_pyomo_solution_as_dict(nlp_model)
    return add_statuses_to_optimziation_dict(mmtfba_dict, results)

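Before solving, `min_flux` rewrites the variability bounds: any reaction whose lower bound is 0.0 but whose upper bound can reach `min_flux` gets its lower bound lifted to `min_flux`. A stdlib-only sketch of that rule (the helper name is hypothetical; the real logic lives inline in `perform_nlp_irreversible_optimization`):

```python
def apply_min_flux(
    variability: dict[str, tuple[float, float]], min_flux: float
) -> dict[str, tuple[float, float]]:
    # Mirrors the min_flux handling: lift zero lower bounds that can
    # feasibly reach min_flux; leave all other bounds untouched
    out = dict(variability)
    if min_flux != 0.0:
        for reac_id, (lb, ub) in variability.items():
            if lb == 0.0 and ub >= min_flux:
                out[reac_id] = (min_flux, ub)
    return out

bounds = {"R1": (0.0, 10.0), "R2": (0.0, 1e-9), "R3": (0.5, 2.0)}
print(apply_min_flux(bounds, 1e-5))
# {'R1': (1e-05, 10.0), 'R2': (0.0, 1e-09), 'R3': (0.5, 2.0)}
```

R2 is unchanged because its upper bound cannot reach `min_flux`, and R3 already has a positive lower bound.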
perform_nlp_irreversible_optimization_with_active_reacs_only(cobrak_model, objective_target, objective_sense, optimization_dict, variability_dict, with_kappa=True, with_gamma=True, with_iota=False, with_alpha=False, approximation_value=0.0001, verbose=False, strict_mode=False, single_strict_reacs=[], min_mdf=STANDARD_MIN_MDF, solver=IPOPT, do_not_delete_with_z_var_one=False, correction_config=CorrectionConfig(), var_data_abs_epsilon=1e-05)

Performs an irreversible non-linear program (NLP) optimization on a COBRAk model, considering only active reactions of the optimization dict.

For more about the NLP, see the COBRAk documentation's NLP chapter.

Parameters

  • cobrak_model (Model): The COBRAk model to optimize.
  • objective_target (str | dict[str, float]): The objective target (reaction ID or dictionary of reaction IDs and coefficients).
  • objective_sense (int): The objective sense (1 for maximization, -1 for minimization).
  • optimization_dict (dict[str, float]): Dictionary of reaction IDs and their optimization values.
  • variability_dict (dict[str, tuple[float, float]]): Dictionary of reaction IDs and their variability (lower and upper bounds).
  • with_kappa (bool, optional): Whether to include κ terms. Defaults to True.
  • with_gamma (bool, optional): Whether to include γ terms. Defaults to True.
  • with_iota (bool, optional): Whether to include ι inhibition terms. Defaults to False and untested!
  • with_alpha (bool, optional): Whether to include α activation terms. Defaults to False and untested!
  • approximation_value (float, optional): Approximation value for κ, γ, ι, and α terms. Defaults to 0.0001. This value is the minimal value for κ, γ, ι, and α terms, and can lead to an overapproximation in this regard.
  • verbose (bool, optional): Whether to print solver output. Defaults to False.
  • strict_mode (bool, optional): Whether to use strict mode (i.e. all <= heuristics become == relations). Defaults to False.
  • single_strict_reacs (list[str], optional): If 'strict_mode==False', only reactions with an ID in this list are set to strict mode.
  • min_mdf (float, optional): Minimum MDF value. Defaults to STANDARD_MIN_MDF.
  • solver (Solver, optional): Used NLP solver. Defaults to IPOPT.
  • do_not_delete_with_z_var_one (bool, optional): Whether to keep (i.e., not delete) reactions whose associated Z variables (in the optimization dict) equal one. Defaults to False.
  • correction_config (CorrectionConfig, optional): Parameter correction configuration. Defaults to CorrectionConfig().
  • var_data_abs_epsilon (float, optional): Any value from the variability dict with an absolute value below this threshold is treated as 0. Defaults to 1e-5.

Returns

  • dict[str, float]: The optimization results.
Source code in cobrak/nlps.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def perform_nlp_irreversible_optimization_with_active_reacs_only(
    cobrak_model: Model,
    objective_target: str | dict[str, float],
    objective_sense: int,
    optimization_dict: dict[str, float],
    variability_dict: dict[str, tuple[float, float]],
    with_kappa: bool = True,
    with_gamma: bool = True,
    with_iota: bool = False,
    with_alpha: bool = False,
    approximation_value: float = 0.0001,
    verbose: bool = False,
    strict_mode: bool = False,
    single_strict_reacs: list[str] = [],
    min_mdf: float = STANDARD_MIN_MDF,
    solver: Solver = IPOPT,
    do_not_delete_with_z_var_one: bool = False,
    correction_config: CorrectionConfig = CorrectionConfig(),
    var_data_abs_epsilon: float = 1e-5,
) -> dict[str, float]:
    """Performs an irreversible non-linear program (NLP) optimization on a COBRAk model, considering only active reactions of the optimization dict.

    For more about the NLP, see the COBRAk documentation's NLP chapter.

    # Parameters
    * `cobrak_model` (`Model`): The COBRAk model to optimize.
    * `objective_target` (`str | dict[str, float]`): The objective target (reaction ID or dictionary of reaction IDs and coefficients).
    * `objective_sense` (`int`): The objective sense (1 for maximization, -1 for minimization).
    * `optimization_dict` (`dict[str, float]`): Dictionary of reaction IDs and their optimization values.
    * `variability_dict` (`dict[str, tuple[float, float]]`): Dictionary of reaction IDs and their variability (lower and upper bounds).
    * `with_kappa` (`bool`, optional): Whether to include κ terms. Defaults to `True`.
    * `with_gamma` (`bool`, optional): Whether to include γ terms. Defaults to `True`.
    * `with_iota` (`bool`, optional): Whether to include ι inhibition terms. Defaults to `False` and untested!
    * `with_alpha` (`bool`, optional): Whether to include α activation terms. Defaults to `False` and untested!
    * `approximation_value` (`float`, optional): Approximation value for κ, γ, ι, and α terms. Defaults to `0.0001`. This value is the
       minimal value for κ, γ, ι, and α terms, and can lead to an overapproximation in this regard.
    * `verbose` (`bool`, optional): Whether to print solver output. Defaults to `False`.
    * `strict_mode` (`bool`, optional): Whether to use strict mode (i.e. all <= heuristics become == relations). Defaults to `False`.
    * `single_strict_reacs` (`list[str]`, optional): If 'strict_mode==False', only reactions with an ID in this list are set to strict mode.
    * `min_mdf` (`float`, optional): Minimum MDF value. Defaults to `STANDARD_MIN_MDF`.
    * `solver` (Solver, optional): Used NLP solver. Defaults to IPOPT.
    * `do_not_delete_with_z_var_one` (`bool`, optional): Whether to keep (i.e., not delete) reactions whose associated Z variables (in the optimization dict) equal one.
      Defaults to `False`.
    * `correction_config` (`CorrectionConfig`, optional): Parameter correction configuration. Defaults to `CorrectionConfig()`.
    * `var_data_abs_epsilon` (`float`, optional): Any value from the variability dict with an absolute value below this threshold is treated as 0. Defaults to `1e-5`.

    # Returns
    * `dict[str, float]`: The optimization results.
    """
    optimization_dict = deepcopy(optimization_dict)
    for single_strict_reac in single_strict_reacs:
        optimization_dict[single_strict_reac] = 1.0
    nlp_cobrak_model = delete_unused_reactions_in_optimization_dict(
        cobrak_model=cobrak_model,
        optimization_dict=optimization_dict,
        do_not_delete_with_z_var_one=do_not_delete_with_z_var_one,
    )
    return perform_nlp_irreversible_optimization(
        cobrak_model=nlp_cobrak_model,
        objective_target=objective_target,
        objective_sense=objective_sense,
        variability_dict=variability_dict,
        with_kappa=with_kappa,
        with_gamma=with_gamma,
        with_iota=with_iota,
        with_alpha=with_alpha,
        approximation_value=approximation_value,
        verbose=verbose,
        strict_mode=strict_mode,
        single_strict_reacs=single_strict_reacs,
        min_mdf=min_mdf,
        solver=solver,
        correction_config=correction_config,
        var_data_abs_epsilon=var_data_abs_epsilon,
    )

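The pre-processing above forces every reaction in `single_strict_reacs` to count as active (flux 1.0 in the copied optimization dict) before `delete_unused_reactions_in_optimization_dict` removes the rest. A stdlib-only sketch of the resulting active set (the helper name and cutoff are illustrative assumptions, not part of the COBRAk API):

```python
def active_reaction_ids(
    optimization_dict: dict[str, float],
    single_strict_reacs: list[str],
    flux_cutoff: float = 0.0,
) -> set[str]:
    # Copy the dict, force single-strict reactions to be treated as active,
    # then keep only reactions whose flux exceeds the cutoff
    fluxes = dict(optimization_dict)
    for reac_id in single_strict_reacs:
        fluxes[reac_id] = 1.0
    return {r for r, v in fluxes.items() if v > flux_cutoff}

print(sorted(active_reaction_ids({"R1": 0.0, "R2": 2.0, "R3": 0.0}, ["R3"])))
# ['R2', 'R3']
```

R3 carries zero flux in the optimization dict but survives the reduction because it is listed in `single_strict_reacs`.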
perform_nlp_irreversible_variability_analysis_with_active_reacs_only(cobrak_model, optimization_dict, tfba_variability_dict, with_kappa=True, with_gamma=True, with_iota=False, with_alpha=False, active_reactions=[], min_active_flux=1e-05, calculate_reacs=True, calculate_concs=True, calculate_rest=True, extra_tested_vars_max=[], extra_tested_vars_min=[], strict_mode=False, single_strict_reacs=[], min_mdf=STANDARD_MIN_MDF, min_flux_cutoff=1e-08, solver=IPOPT, do_not_delete_with_z_var_one=False, parallel_verbosity_level=0, approximation_value=0.0001)

Performs an irreversible non-linear program (NLP) variability analysis on a COBRAk model, considering only active reactions.

This function calculates the minimum and maximum values of reaction fluxes, metabolite concentrations, and other variables in the model, given a set of active reactions and a variability dictionary. It uses a combination of NLP optimizations and parallel processing to efficiently compute the variability of the model.

Parameters

  • cobrak_model (Model): The COBRAk model to analyze.
  • optimization_dict (dict[str, float]): Dictionary of reaction IDs and their optimization values.
  • tfba_variability_dict (dict[str, tuple[float, float]]): Dictionary of reaction IDs and their TFBA variability (lower and upper bounds).
  • with_kappa (bool, optional): Whether to include κ saturation terms. Defaults to True.
  • with_gamma (bool, optional): Whether to include γ thermodynamic terms. Defaults to True.
  • with_iota (bool, optional): Whether to include ι inhibition terms. Defaults to False. Note: this option is untested!
  • with_alpha (bool, optional): Whether to include α activation terms. Defaults to False. Note: this option is untested!
  • active_reactions (list[str], optional): List of active reaction IDs. Defaults to [].
  • min_active_flux (float, optional): Minimum flux value for active reactions. Defaults to 1e-5.
  • calculate_reacs (bool, optional): Whether to calculate reaction flux variability. Defaults to True.
  • calculate_concs (bool, optional): Whether to calculate metabolite concentration variability. Defaults to True.
  • calculate_rest (bool, optional): Whether to calculate variability of other variables (e.g., enzyme delivery, κ, γ). Defaults to True.
  • extra_tested_vars_max (list[str], optional): Extra variable IDs whose maximum is additionally tested. Defaults to [].
  • extra_tested_vars_min (list[str], optional): Extra variable IDs whose minimum is additionally tested. Defaults to [].
  • strict_mode (bool, optional): Whether to use strict mode (i.e. all <= heuristics become == relations). Defaults to False.
  • single_strict_reacs (list[str], optional): If strict_mode is False, only reactions whose ID is in this list are set to strict mode. Defaults to [].
  • min_mdf (float, optional): Minimum MDF value. Defaults to STANDARD_MIN_MDF.
  • min_flux_cutoff (float, optional): Minimum flux cutoff value. Defaults to 1e-8.
  • solver (Solver, optional): Used NLP solver. Defaults to IPOPT.
  • do_not_delete_with_z_var_one (bool, optional): If True, reactions whose Z variable equals one are not deleted from the model. Defaults to False.
  • parallel_verbosity_level (int, optional): Verbosity level for parallel processing. Defaults to 0.
  • approximation_value (float, optional): Approximation value for κ, γ, ι, and α terms. Defaults to 0.0001. This value is the minimal value for κ, γ, ι, and α terms, and can lead to an overapproximation in this regard.

Returns

  • dict[str, tuple[float, float]]: A dictionary of variable IDs and their variability (lower and upper bounds).
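The min/max bookkeeping performed by this function can be illustrated with a minimal, self-contained sketch (plain Python, not the COBRAk API itself; `solve_fn` is a hypothetical stand-in for a single NLP optimization):

```python
# Illustrative sketch of the variability-analysis pattern: for every tested
# variable, one minimization (-1) and one maximization (+1) objective is
# collected; each objective is solved independently (in COBRAk: as parallel
# NLPs), and the results are merged into a {variable_id: (min, max)} dict.
# `solve_fn(sense, var_id)` is a hypothetical stand-in for one NLP solve.

def variability_analysis(var_ids, solve_fn):
    # Build one (sense, variable) objective pair per tested variable
    objective_targets = []
    for var_id in var_ids:
        objective_targets.extend(((-1, var_id), (+1, var_id)))

    min_values, max_values = {}, {}
    for sense, var_id in objective_targets:
        value = solve_fn(sense, var_id)
        if sense == -1:
            min_values[var_id] = value
        else:
            max_values[var_id] = value

    # Merge into the {id: (lower_bound, upper_bound)} result format
    return {var_id: (min_values[var_id], max_values[var_id]) for var_id in var_ids}


# Toy "solver": pretend each variable simply attains its fixed bounds
toy_bounds = {"R1": (0.0, 10.0), "x_atp": (-9.2, -2.3)}
result = variability_analysis(
    list(toy_bounds), lambda sense, v: toy_bounds[v][0 if sense == -1 else 1]
)
```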
Source code in cobrak/nlps.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def perform_nlp_irreversible_variability_analysis_with_active_reacs_only(
    cobrak_model: Model,
    optimization_dict: dict[str, float],
    tfba_variability_dict: dict[str, tuple[float, float]],
    with_kappa: bool = True,
    with_gamma: bool = True,
    with_iota: bool = False,
    with_alpha: bool = False,
    active_reactions: list[str] = [],
    min_active_flux: float = 1e-5,
    calculate_reacs: bool = True,
    calculate_concs: bool = True,
    calculate_rest: bool = True,
    extra_tested_vars_max: list[str] = [],
    extra_tested_vars_min: list[str] = [],
    strict_mode: bool = False,
    single_strict_reacs: list[str] = [],
    min_mdf: float = STANDARD_MIN_MDF,
    min_flux_cutoff: float = 1e-8,
    solver: Solver = IPOPT,
    do_not_delete_with_z_var_one: bool = False,
    parallel_verbosity_level: int = 0,
    approximation_value: float = 0.0001,
) -> dict[str, tuple[float, float]]:
    """Performs an irreversible non-linear program (NLP) variability analysis on a COBRAk model, considering only active reactions.

    This function calculates the minimum and maximum values of reaction fluxes, metabolite concentrations, and other variables in the model,
    given a set of active reactions and a variability dictionary.
    It uses a combination of NLP optimizations and parallel processing to efficiently compute the variability of the model.

    # Parameters
    * `cobrak_model` (`Model`): The COBRAk model to analyze.
    * `optimization_dict` (`dict[str, float]`): Dictionary of reaction IDs and their optimization values.
    * `tfba_variability_dict` (`dict[str, tuple[float, float]]`): Dictionary of reaction IDs and their TFBA variability (lower and upper bounds).
    * `with_kappa` (`bool`, optional): Whether to include κ saturation terms. Defaults to `True`.
    * `with_gamma` (`bool`, optional): Whether to include γ thermodynamic terms. Defaults to `True`.
    * `with_iota` (`bool`, optional): Whether to include ι inhibition terms. Defaults to `False` and untested!
    * `with_alpha` (`bool`, optional): Whether to include α activation terms. Defaults to `False` and untested!
    * `active_reactions` (`list[str]`, optional): List of active reaction IDs. Defaults to `[]`.
    * `min_active_flux` (`float`, optional): Minimum flux value for active reactions. Defaults to `1e-5`.
    * `calculate_reacs` (`bool`, optional): Whether to calculate reaction flux variability. Defaults to `True`.
    * `calculate_concs` (`bool`, optional): Whether to calculate metabolite concentration variability. Defaults to `True`.
    * `calculate_rest` (`bool`, optional): Whether to calculate variability of other variables (e.g., enzyme delivery, κ, γ). Defaults to `True`.
    * `strict_mode` (`bool`, optional): Whether to use strict mode (i.e. all <= heuristics become == relations). Defaults to `False`.
    * `single_strict_reacs` (`list[str]`, optional): If 'strict_mode==False', only reactions with an ID in this list are set to strict mode.
    * `min_mdf` (`float`, optional): Minimum MDF value. Defaults to `STANDARD_MIN_MDF`.
    * `min_flux_cutoff` (`float`, optional): Minimum flux cutoff value. Defaults to `1e-8`.
    * `solver` (Solver, optional): Used NLP solver. Defaults to IPOPT.
    * `do_not_delete_with_z_var_one` (`bool`, optional): If `True`, reactions whose Z variable equals one are not deleted from the model. Defaults to `False`.
    * `parallel_verbosity_level` (`int`, optional): Verbosity level for parallel processing. Defaults to `0`.
    * `approximation_value` (`float`, optional): Approximation value for κ, γ, ι, and α terms. Defaults to `0.0001`. This value is the
       minimal value for κ, γ, ι, and α terms, and can lead to an overapproximation in this regard.

    # Returns
    * `dict[str, tuple[float, float]]`: A dictionary of variable IDs and their variability (lower and upper bounds).
    """
    cobrak_model = deepcopy(cobrak_model)
    cobrak_model = delete_unused_reactions_in_optimization_dict(
        cobrak_model=cobrak_model,
        optimization_dict=optimization_dict,
        do_not_delete_with_z_var_one=do_not_delete_with_z_var_one,
    )

    for active_reaction in active_reactions:
        cobrak_model.reactions[active_reaction].min_flux = min_active_flux

    model: ConcreteModel = get_nlp_from_cobrak_model(
        cobrak_model=deepcopy(cobrak_model),
        with_kappa=with_kappa,
        with_gamma=with_gamma,
        with_iota=with_iota,
        with_alpha=with_alpha,
        approximation_value=approximation_value,
        variability_data=deepcopy(tfba_variability_dict),
        strict_mode=strict_mode,
        irreversible_mode_min_mdf=min_mdf,
    )
    model_var_names = get_model_var_names(model)

    min_values: dict[str, float] = {}
    max_values: dict[str, float] = {}
    objective_targets: list[tuple[int, str]] = []

    """
    min_flux_sum_result = perform_nlp_irreversible_optimization(
        deepcopy(cobrak_model),
        objective_target=FLUX_SUM_VAR_ID,
        objective_sense=-1,
        with_kappa=with_kappa,
        with_gamma=with_gamma,
        with_iota=with_iota,
        with_alpha=with_alpha,
        approximation_value=approximation_value,
        variability_dict=deepcopy(tfba_variability_dict),
        strict_mode=strict_mode,
        min_mdf=min_mdf,
        with_flux_sum_var=True,
        solver=solver,
    )
    """

    if calculate_concs or calculate_rest:
        min_mdf_result = perform_nlp_irreversible_optimization(
            deepcopy(cobrak_model),
            objective_target=MDF_VAR_ID,
            objective_sense=-1,
            with_kappa=with_kappa,
            with_gamma=with_gamma,
            with_iota=with_iota,
            with_alpha=with_alpha,
            approximation_value=approximation_value,
            variability_dict=deepcopy(tfba_variability_dict),
            strict_mode=strict_mode,
            solver=solver,
        )

    if calculate_concs:
        for met_id, metabolite in cobrak_model.metabolites.items():
            met_var_name = f"{LNCONC_VAR_PREFIX}{met_id}"
            if met_var_name in model_var_names:
                min_mdf_conc = min_mdf_result[met_var_name]
                max_mdf_conc = min_mdf_result[met_var_name]
                if metabolite.log_min_conc in (min_mdf_conc, max_mdf_conc):
                    min_values[met_var_name] = metabolite.log_min_conc
                else:
                    objective_targets.append((-1, met_var_name))
                if metabolite.log_max_conc in (min_mdf_conc, max_mdf_conc):
                    max_values[met_var_name] = metabolite.log_max_conc
                else:
                    objective_targets.append((+1, met_var_name))

    for reac_id, reaction in cobrak_model.reactions.items():
        # min_flux_sum_flux = min_flux_sum_result[reac_id]
        if calculate_reacs:
            # if reaction.min_flux in (min_flux_sum_flux,):
            #    min_values[reac_id] = (
            #        reaction.min_flux if reaction.min_flux >= min_flux_cutoff else 0.0
            #    )
            # else:
            # if reaction.max_flux in (min_flux_sum_flux,):
            #    max_values[reac_id] = reaction.max_flux
            # else:
            objective_targets.extend(((-1, reac_id), (+1, reac_id)))

        if not calculate_rest:
            continue

        kappa_var_name = f"{KAPPA_VAR_PREFIX}{reac_id}"
        gamma_var_name = f"{GAMMA_VAR_PREFIX}{reac_id}"
        if kappa_var_name in model_var_names:
            objective_targets.extend(((-1, kappa_var_name), (+1, kappa_var_name)))
        if gamma_var_name in model_var_names:
            objective_targets.extend(((-1, gamma_var_name), (+1, gamma_var_name)))
        if reaction.enzyme_reaction_data is not None:
            full_enzyme_id = get_full_enzyme_id(
                reaction.enzyme_reaction_data.identifiers
            )
            if full_enzyme_id:
                enzyme_delivery_var_name = get_reaction_enzyme_var_id(reac_id, reaction)
                # if 0.0 in (min_flux_sum_flux,):
                #    min_values[enzyme_delivery_var_name] = 0.0
                # else:
                objective_targets.extend(
                    ((-1, enzyme_delivery_var_name), (+1, enzyme_delivery_var_name))
                )

    if len(extra_tested_vars_min) > 0:
        for extra_tested_var in extra_tested_vars_min:
            if extra_tested_var in model_var_names:
                objective_targets.append((-1, extra_tested_var))

    if len(extra_tested_vars_max) > 0:
        for extra_tested_var in extra_tested_vars_max:
            if extra_tested_var in model_var_names:
                objective_targets.append((+1, extra_tested_var))

    objectives_data: list[tuple[str, str]] = []
    for obj_sense, target_id in objective_targets:
        if obj_sense == -1:
            objective_name = f"MIN_OBJ_{target_id}"
            pyomo_sense = minimize
        else:
            objective_name = f"MAX_OBJ_{target_id}"
            pyomo_sense = maximize
        setattr(
            model,
            objective_name,
            Objective(expr=getattr(model, target_id), sense=pyomo_sense),
        )
        getattr(model, objective_name).deactivate()
        objectives_data.append((objective_name, target_id))

    objectives_data_batches = split_list(
        objectives_data, len(objectives_data)
    )  # cpu_count())

    results_list = Parallel(n_jobs=-1, verbose=parallel_verbosity_level)(
        delayed(_batch_nlp_variability_optimization)(
            batch,
            cobrak_model,
            with_kappa,
            with_gamma,
            with_iota,
            with_alpha,
            approximation_value,
            tfba_variability_dict,
            strict_mode,
            single_strict_reacs,
            min_mdf,
            solver,
        )
        for batch in objectives_data_batches
    )
    for result in chain(*results_list):
        is_minimization = result[0]
        target_id = result[1]
        result_value = result[2]
        if is_minimization:
            min_values[target_id] = result_value
        else:
            max_values[target_id] = result_value

    for key, min_value in min_values.items():
        if (key in cobrak_model.reactions) or (
            key.startswith(ENZYME_VAR_PREFIX) and (min_value is not None)
        ):
            min_values[key] = min_value if min_value >= min_flux_cutoff else 0.0

    all_target_ids = sorted(
        set(
            list(min_values.keys())
            + list(max_values.keys())
            + [obj_target[1] for obj_target in objective_targets]
        )
    )
    all_target_ids = [x[1] for x in objectives_data]
    variability_dict: dict[str, tuple[float, float]] = {
        target_id: (min_values[target_id], max_values[target_id])
        for target_id in all_target_ids
    }

    return variability_dict

perform_nlp_reversible_optimization(cobrak_model, objective_target, objective_sense, variability_dict, with_kappa=True, with_gamma=True, with_iota=False, with_alpha=False, approximation_value=0.0001, strict_mode=False, single_strict_reacs=[], verbose=False, solver=SCIP, with_flux_sum_var=False, correction_config=CorrectionConfig(), show_variable_count=False, var_data_abs_epsilon=1e-05)

Performs a reversible MILP-based non-linear program (NLP) optimization on a COBRAk model.

For more on the MINLP, see the COBRAk documentation's NLP chapter.

Parameters
  • cobrak_model (Model): The COBRAk model to optimize.
  • objective_target (str | dict[str, float]): The objective target (reaction ID or dictionary of reaction IDs and coefficients).
  • objective_sense (int): The objective sense (1 for maximization, -1 for minimization).
  • variability_dict (dict[str, tuple[float, float]]): Dictionary of reaction IDs and their variability (lower and upper bounds).
  • with_kappa (bool, optional): Whether to include κ saturation terms. Defaults to True.
  • with_gamma (bool, optional): Whether to include γ thermodynamic terms. Defaults to True.
  • with_iota (bool, optional): Whether to include ι inhibition terms. Defaults to False. Note: this option is untested!
  • with_alpha (bool, optional): Whether to include α activation terms. Defaults to False. Note: this option is untested!
  • approximation_value (float, optional): Approximation value for κ, γ, ι, and α terms. Defaults to 0.0001. This value is the minimal value for κ, γ, ι, and α terms, and can lead to an overapproximation in this regard.
  • strict_mode (bool, optional): Whether to use strict mode (i.e. all <= heuristics become == relations). Defaults to False.
  • single_strict_reacs (list[str], optional): If strict_mode is False, only reactions whose ID is in this list are set to strict mode. Defaults to [].
  • verbose (bool, optional): Whether to print solver output. Defaults to False.
  • solver (Solver, optional): The MINLP solver to use. Defaults to SCIP.
  • with_flux_sum_var (bool, optional): Whether to include a reaction flux sum variable of name cobrak.constants.FLUX_SUM_VAR. Defaults to False.
  • correction_config (CorrectionConfig, optional): Parameter correction configuration. Defaults to CorrectionConfig().
  • var_data_abs_epsilon (float, optional): Any value from the variability dict whose absolute value is below this threshold is treated as 0. Defaults to 1e-5.
Returns
  • dict[str, float]: The optimization results.
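The role of var_data_abs_epsilon can be sketched with a minimal stand-alone illustration (the clamp_variability helper below is hypothetical; in COBRAk, this logic is part of applying the variability dict to the Pyomo model):

```python
# Minimal illustration of the var_data_abs_epsilon parameter: any bound from
# the variability dict whose absolute value lies below the epsilon is treated
# as exactly 0 before being applied to the model.
# (Hypothetical helper; the real logic lives in apply_variability_dict.)

def clamp_variability(variability_dict, var_data_abs_epsilon=1e-5):
    return {
        var_id: tuple(
            0.0 if abs(bound) < var_data_abs_epsilon else bound for bound in bounds
        )
        for var_id, bounds in variability_dict.items()
    }


clamped = clamp_variability({"R1": (1e-9, 2.5), "R2": (-1e-7, 4e-6)})
```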
Source code in cobrak/nlps.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def perform_nlp_reversible_optimization(
    cobrak_model: Model,
    objective_target: str | dict[str, float],
    objective_sense: int,
    variability_dict: dict[str, tuple[float, float]],
    with_kappa: bool = True,
    with_gamma: bool = True,
    with_iota: bool = False,
    with_alpha: bool = False,
    approximation_value: NonNegativeFloat = 0.0001,
    strict_mode: bool = False,
    single_strict_reacs: list[str] = [],
    verbose: bool = False,
    solver: Solver = SCIP,
    with_flux_sum_var: bool = False,
    correction_config: CorrectionConfig = CorrectionConfig(),
    show_variable_count: bool = False,
    var_data_abs_epsilon: float = 1e-5,
) -> dict[str, float]:
    """Performs a reversible MILP-based non-linear program (NLP) optimization on a COBRAk model.

    For more on the MINLP, see the COBRAk documentation's NLP chapter.

    #### Parameters
    * `cobrak_model` (`Model`): The COBRAk model to optimize.
    * `objective_target` (`str | dict[str, float]`): The objective target (reaction ID or dictionary of reaction IDs and coefficients).
    * `objective_sense` (`int`): The objective sense (1 for maximization, -1 for minimization).
    * `variability_dict` (`dict[str, tuple[float, float]]`): Dictionary of reaction IDs and their variability (lower and upper bounds).
    * `with_kappa` (`bool`, optional): Whether to include κ saturation terms. Defaults to `True`.
    * `with_gamma` (`bool`, optional): Whether to include γ thermodynamic terms. Defaults to `True`.
    * `with_iota` (`bool`, optional): Whether to include ι inhibition terms. Defaults to `False` and untested!
    * `with_alpha` (`bool`, optional): Whether to include α activation terms. Defaults to `False` and untested!
    * `approximation_value` (`float`, optional): Approximation value for κ, γ, ι, and α terms. Defaults to `0.0001`. This value is the
       minimal value for κ, γ, ι, and α terms, and can lead to an overapproximation in this regard.
    * `strict_mode` (`bool`, optional): Whether to use strict mode (i.e. all <= heuristics become == relations). Defaults to `False`.
    * `single_strict_reacs` (`list[str]`, optional): If 'strict_mode==False', only reactions with an ID in this list are set to strict mode.
    * `verbose` (`bool`, optional): Whether to print solver output. Defaults to `False`.
    * `solver` (`Solver`, optional): The MINLP solver to use. Defaults to `SCIP`.
    * `with_flux_sum_var` (`bool`, optional): Whether to include a reaction flux sum variable of name ```cobrak.constants.FLUX_SUM_VAR```. Defaults to `False`.
    * `correction_config` (`CorrectionConfig`, optional): Parameter correction configuration. Defaults to `CorrectionConfig()`.
    * `var_data_abs_epsilon` (`float`, optional): Any value from the variability dict whose absolute value is below this threshold is treated as 0. Defaults to `1e-5`.

    #### Returns
    * `dict[str, float]`: The optimization results.
    """
    nlp_model = get_nlp_from_cobrak_model(
        cobrak_model,
        with_kappa=with_kappa,
        with_gamma=with_gamma,
        with_iota=with_iota,
        with_alpha=with_alpha,
        approximation_value=approximation_value,
        irreversible_mode=False,
        variability_data=variability_dict,
        strict_mode=strict_mode,
        single_strict_reacs=single_strict_reacs,
        with_flux_sum_var=with_flux_sum_var,
        correction_config=correction_config,
    )

    nlp_model = apply_variability_dict(
        nlp_model,
        cobrak_model,
        variability_dict,
        correction_config.error_scenario,
        var_data_abs_epsilon,
    )
    nlp_model.obj = get_objective(nlp_model, objective_target, objective_sense)
    pyomo_solver = get_solver(solver)

    if show_variable_count:
        float_vars = [v for v in nlp_model.component_objects(Var) if v.domain == Reals]
        num_float_vars = sum(1 for v in float_vars for i in v)
        binary_vars = [
            v for v in nlp_model.component_objects(Var) if v.domain == Binary
        ]
        num_binary_vars = sum(1 for v in binary_vars for i in v)
        print("# FLOAT VARS:", num_float_vars)
        print("# BINARY VARS:", num_binary_vars)

    results = pyomo_solver.solve(nlp_model, tee=verbose, **solver.solve_extra_options)

    nlp_result = get_pyomo_solution_as_dict(nlp_model)
    return add_statuses_to_optimziation_dict(nlp_result, results)

plotting

Functions for plotting different types of data or reaction kinetics, all using matplotlib.

distinct_colors(n)

Produce n distinct Matplotlib colour specifications.

Parameters

n (int): Number of colours required (must be > 0).

Returns

list[str]: A list of n colour strings. The list is deterministic: calling distinct_colors(5) today and tomorrow returns exactly the same five colours.

Notes
  • The first 10 colours are the Tableau palette (tab:blue, tab:orange, …) – the same palette that Matplotlib uses for its default colour cycle.
  • If n > 10, the function continues with the CSS-4 colour dictionary, sorted by hue (HSV) so that successive colours are as dissimilar as possible.
  • All colours are returned as hex strings (e.g. '#1f77b4') because hex codes are universally accepted by Matplotlib.
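The selection strategy can be reproduced in a few lines of stand-alone Python; this sketch uses a small hand-picked extra palette in place of the full CSS-4 dictionary:

```python
import colorsys

# The 10 Tableau colours that start Matplotlib's default colour cycle
TABLEAU_HEX = [
    "#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
    "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf",
]
# Small stand-in for the CSS-4 colour dictionary used by the real function
EXTRA_HEX = ["#ff0000", "#00ff00", "#0000ff", "#ffff00", "#00ffff", "#ff00ff"]


def hex_to_hsv(hexcol):
    # Convert "#rrggbb" to an (h, s, v) tuple with components in [0, 1]
    r, g, b = (int(hexcol[i:i + 2], 16) / 255 for i in (1, 3, 5))
    return colorsys.rgb_to_hsv(r, g, b)


def distinct_colors_sketch(n):
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # 1. Tableau colours cover the first ten requests
    if n <= len(TABLEAU_HEX):
        return TABLEAU_HEX[:n]
    # 2. Continue with extra colours sorted by hue, as in the real function
    extra_sorted = sorted(EXTRA_HEX, key=lambda h: hex_to_hsv(h)[0])
    return (TABLEAU_HEX + extra_sorted)[:n]
```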
Source code in cobrak/plotting.py
@validate_call(validate_return=True)
def distinct_colors(n: int) -> list[str]:
    """Produce *n* distinct Matplotlib colour specifications.

    Parameters
    ----------
    n : int
        Number of colours required (must be > 0).

    Returns
    -------
    List[str]
        A list of *n* colour strings.  The list is deterministic:
        calling ``distinct_colors(5)`` today and tomorrow returns
        exactly the same five colours.

    Notes
    -----
    * The first 10 colours are the Tableau palette (``tab:blue``, ``tab:orange``, …) – the
      same palette that Matplotlib uses for its default colour cycle.
    * If ``n`` > 10 the function continues with the CSS-4 colour
      dictionary, sorted by hue (HSV) so that successive colours
      are as dissimilar as possible.
    * All colours are returned as hex strings (e.g. ``'#1f77b4'``)
      because hex codes are universally accepted by Matplotlib.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")

    # ----------------------------------------------------------
    # 1️⃣  Tableau colours – the first 10 highly-distinct colours
    # ----------------------------------------------------------
    tableau_hex = list(mcolors.TABLEAU_COLORS.values())  # 10 entries
    if n <= len(tableau_hex):
        return tableau_hex[:n]

    # ----------------------------------------------------------
    # 2️⃣  Prepare the remaining colours (CSS-4) sorted by hue
    # ----------------------------------------------------------
    # Convert every CSS-4 colour to RGB → HSV, keep the original hex
    css_items = [
        (mcolors.rgb_to_hsv(mcolors.to_rgb(hexcol)), hexcol)
        for hexcol in mcolors.CSS4_COLORS.values()
    ]
    # Sort by hue (the first component of HSV)
    css_items.sort(key=lambda pair: pair[0][0])

    # Extract the sorted hex strings
    css_sorted_hex = [hexcol for _, hexcol in css_items]

    # ----------------------------------------------------------
    # 3️⃣  Concatenate Tableau + sorted CSS-4 and slice to *n*
    # ----------------------------------------------------------
    all_colours = tableau_hex + css_sorted_hex
    return all_colours[:n]

dual_axis_plot(xpoints, leftaxis_ypoints_list, rightaxis_ypoints_list, xaxis_caption='', leftaxis_caption='', rightaxis_caption='', leftaxis_colors=[], rightaxis_colors=[], leftaxis_titles=[], rightaxis_titles=[], extrapoints=[], has_legend=True, legend_direction='', legend_position=(), is_leftaxis_logarithmic=False, is_rightaxis_logarithmic=False, point_style='', line_style='-', max_digits_after_comma=4, savepath='', left_ylim=None, right_ylim=None, xlim=None, left_axis_in_front=True, left_legend_position=[], right_legend_position=[], figure_size_inches=None, special_figure_mode=False, axistitle_labelsize=14, axisticks_labelsize=13, legend_labelsize=13, extrahlines=[])

Creates a plot with a dual Y-axis.

Parameters

  • xpoints (list[float]): X-axis data points.
  • leftaxis_ypoints_list (list[list[float]]): List of Y-axis data points for the left axis.
  • rightaxis_ypoints_list (list[list[float]]): List of Y-axis data points for the right axis.
  • xaxis_caption (str, optional): X-axis caption. Defaults to "".
  • leftaxis_caption (str, optional): Left Y-axis caption. Defaults to "".
  • rightaxis_caption (str, optional): Right Y-axis caption. Defaults to "".
  • leftaxis_colors (list[str], optional): Colors for left axis lines. Defaults to [].
  • rightaxis_colors (list[str], optional): Colors for right axis lines. Defaults to [].
  • leftaxis_titles (list[str], optional): Legend titles for left axis lines. Defaults to [].
  • rightaxis_titles (list[str], optional): Legend titles for right axis lines. Defaults to [].
  • extrapoints (list[tuple[float, float, bool, str, str, str, float]], optional): List of single points, described by tuples with the content [x, y, is_left_axis, color, marker, label, yerr]. If yerr=0, no error bar is drawn at all. Defaults to [].
  • has_legend (bool, optional): Whether to show the legend. Defaults to True.
  • legend_direction (str, optional): Legend direction. Defaults to "".
  • legend_position (tuple[float, float], optional): Legend position. Defaults to ().
  • is_leftaxis_logarithmic (bool, optional): Whether to use a logarithmic scale for the left axis. Defaults to False.
  • is_rightaxis_logarithmic (bool, optional): Whether to use a logarithmic scale for the right axis. Defaults to False.
  • point_style (str, optional): Style for points. Defaults to "".
  • line_style (str, optional): Style for lines. Defaults to "-".
  • max_digits_after_comma (int, optional): Max digits after comma shown. Defaults to 4.
  • savepath (str, optional): If given, the plot is not shown but saved at the given path. Defaults to "".

Returns

  • None (displays the plot, or saves it if savepath is given)

Source code in cobrak/plotting.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def dual_axis_plot(
    xpoints: list[float],
    leftaxis_ypoints_list: list[list[float]],
    rightaxis_ypoints_list: list[list[float]],
    xaxis_caption: str = "",
    leftaxis_caption: str = "",
    rightaxis_caption: str = "",
    leftaxis_colors: list[str] = [],
    rightaxis_colors: list[str] = [],
    leftaxis_titles: list[str] = [],
    rightaxis_titles: list[str] = [],
    extrapoints: list[tuple[float, float, bool, str, str, str, float]] = [],
    has_legend: bool = True,
    legend_direction: str = "",
    legend_position: tuple[Any, ...] = (),
    is_leftaxis_logarithmic: bool = False,
    is_rightaxis_logarithmic: bool = False,
    point_style: str = "",
    line_style: str = "-",
    max_digits_after_comma: int = 4,
    savepath: str = "",
    left_ylim: None | tuple[float, float] = None,
    right_ylim: None | tuple[float, float] = None,
    xlim: None | tuple[float, float] = None,
    left_axis_in_front: bool = True,
    left_legend_position: list[int] = [],
    right_legend_position: list[int] = [],
    figure_size_inches: None | tuple[float, float] = None,
    special_figure_mode: bool = False,
    axistitle_labelsize: float = 14,
    axisticks_labelsize: float = 13,
    legend_labelsize: float = 13,
    extrahlines: list[tuple[float, str, str, str | None]] = [],
) -> None:
    """Creates a plot with a dual Y-axis.

    Args:
        xpoints (list[float]): X-axis data points.
        leftaxis_ypoints_list (list[list[float]]): List of Y-axis data points for the left axis.
        rightaxis_ypoints_list (list[list[float]]): List of Y-axis data points for the right axis.
        xaxis_caption (str, optional): X-axis caption. Defaults to "".
        leftaxis_caption (str, optional): Left Y-axis caption. Defaults to "".
        rightaxis_caption (str, optional): Right Y-axis caption. Defaults to "".
        leftaxis_colors (list[str], optional): Colors for left axis lines. Defaults to [].
        rightaxis_colors (list[str], optional): Colors for right axis lines. Defaults to [].
        leftaxis_titles (list[str], optional): Legend titles for left axis lines. Defaults to [].
        rightaxis_titles (list[str], optional): Legend titles for right axis lines. Defaults to [].
        extrapoints (list[tuple[float, float, bool, str, str, str, float]], optional): List of single points,
            described by tuples with the content [x, y, is_left_axis, color, marker, label, yerr]. If yerr=0,
            no error bar is drawn at all. Defaults to [].
        has_legend (bool, optional): Whether to show the legend. Defaults to True.
        legend_direction (str, optional): Legend direction. Defaults to "".
        legend_position (tuple[float, float], optional): Legend position. Defaults to ().
        is_leftaxis_logarithmic (bool, optional): Whether to use a logarithmic scale for the left axis. Defaults to False.
        is_rightaxis_logarithmic (bool, optional): Whether to use a logarithmic scale for the right axis. Defaults to False.
        point_style (str, optional): Style for points. Defaults to "".
        line_style (str, optional): Style for lines. Defaults to "-".
        max_digits_after_comma (int, optional): Max digits after comma shown. Defaults to 4.
        savepath (str): If given, the plot is not shown but saved at the given path. Defaults to ""

    Returns:
        None (displays the plot)
    """

    fig, ax1 = plt.subplots()
    if figure_size_inches is not None:
        fig.set_size_inches(figure_size_inches[0], figure_size_inches[1])

    # Left Axis Plotting
    for y, color, linestyle, label in extrahlines:
        ax1.axhline(
            y=y,
            color=color,
            linestyle=linestyle,
            label=label,
        )

    for i, ypoints in enumerate(leftaxis_ypoints_list):
        color = leftaxis_colors[i] if leftaxis_colors else None
        title = leftaxis_titles[i] if leftaxis_titles else None
        ax1.plot(
            xpoints,
            ypoints,
            color=color,
            linestyle=line_style,
            marker=point_style,
            label=title,
        )

    ax1.set_xlabel(xaxis_caption, fontsize=axistitle_labelsize)
    ax1.set_ylabel(leftaxis_caption, fontsize=axistitle_labelsize)
    if is_leftaxis_logarithmic:
        ax1.set_yscale("log")
    if left_ylim is not None:
        ax1.set_ylim(left_ylim[0], left_ylim[1])

    plt.xticks(fontsize=axisticks_labelsize)
    plt.yticks(fontsize=axisticks_labelsize)

    # Right Axis Plotting
    if len(rightaxis_ypoints_list) > 0:
        ax2 = ax1.twinx()
        for i, ypoints in enumerate(rightaxis_ypoints_list):
            color = rightaxis_colors[i] if rightaxis_colors else None
            title = rightaxis_titles[i] if rightaxis_titles else None
            ax2.plot(
                xpoints,
                ypoints,
                color=color,
                linestyle=line_style,
                marker=point_style,
                label=title,
            )

        ax2.set_ylabel(rightaxis_caption, fontsize=axistitle_labelsize)
        if is_rightaxis_logarithmic:
            ax2.set_yscale("log")
        if right_ylim is not None:
            ax2.set_ylim(right_ylim[0], right_ylim[1])

        if left_axis_in_front:
            ax1.set_zorder(ax2.get_zorder() + 1)
            ax1.patch.set_visible(False)

    if xlim is not None:
        ax1.set_xlim(xlim[0], xlim[1])

    for i, extrapoint in enumerate(extrapoints):
        axis = ax1 if extrapoint[2] else ax2
        if extrapoint[6] != 0.0:
            axis.errorbar(
                extrapoint[0],
                extrapoint[1],
                yerr=extrapoint[6],
                ecolor=extrapoint[3],
                capsize=5,
                linestyle="",
                color=extrapoint[3],
                marker=extrapoint[4],
                label=extrapoint[5],
            )
        else:
            axis.plot(
                extrapoint[0],
                extrapoint[1],
                linestyle="",
                color=extrapoint[3],
                marker=extrapoint[4],
                label=extrapoint[5],
            )

    # Legend
    if has_legend:
        handles, labels = ax1.get_legend_handles_labels()
        if len(rightaxis_ypoints_list) > 0:
            handles2, labels2 = ax2.get_legend_handles_labels()

            if left_legend_position != []:
                oldhandles, oldlabels = deepcopy(handles), deepcopy(labels)
                for i, pos in enumerate(left_legend_position):
                    handles[pos] = oldhandles[i]
                    labels[pos] = oldlabels[i]

            if right_legend_position != []:
                oldhandles2, oldlabels2 = deepcopy(handles2), deepcopy(labels2)
                for i, pos in enumerate(right_legend_position):
                    handles2[pos] = oldhandles2[i]
                    labels2[pos] = oldlabels2[i]
            if special_figure_mode:
                # Just for COBRA-k's initial publication :-)
                # del handles[1]
                # del labels[1]
                # handles2.append(oldhandles[-2])
                # labels2.append(oldlabels[-2])
                pass

            handles = handles + handles2
            labels = labels + labels2
        extraargs = {"loc": legend_position} if legend_position != () else {}
        if legend_direction:
            extraargs["loc"] = legend_direction
        plt.legend(
            handles,
            labels,
            bbox_to_anchor=(0.5, 0.5)
            if not legend_position and not legend_direction
            else None,
            fontsize=legend_labelsize,
            **extraargs,
        )

    plt.xticks(fontsize=axisticks_labelsize)
    plt.yticks(fontsize=axisticks_labelsize)

    # Format axis ticks
    ax1.xaxis.set_major_formatter(
        plt.FuncFormatter(lambda x, _: f"{x:.{max_digits_after_comma}f}")
    )
    ax1.yaxis.set_major_formatter(
        plt.FuncFormatter(lambda x, _: f"{x:.{max_digits_after_comma}f}")
    )
    if len(rightaxis_ypoints_list) > 0:
        ax2.yaxis.set_major_formatter(
            plt.FuncFormatter(lambda x, _: f"{x:.{max_digits_after_comma}f}")
        )

    plt.tight_layout()  # Adjust layout to prevent labels from overlapping

    if not savepath:
        plt.show()
    else:
        plt.savefig(savepath, dpi=300)

    # Close the plot to free up memory
    plt.close()

multi_step_histogram(data, *, bins=10, range_=None, density=False, labels=None, colors=None, linewidth=1.5, alpha=1.0, linestyle='-', title=None, xlabel='Value', ylabel=None, legend_loc='best', ax=None, logmode=False, **hist_kwargs)

Plot several 1-D data sets as step histograms on a single Axes.

Parameters

data : sequence of iterables
    Each element is a collection of numbers (list, np.ndarray, pd.Series …).
bins : int, sequence of scalars, or str, default 10
    Passed straight to np.histogram / plt.hist. Use e.g. 'auto' or an explicit array of bin edges for full control.
range_ : (float, float), optional
    Lower and upper range of the bins. If None, the range is inferred from the data.
density : bool, default False
    If True, the histogram is normalized to form a probability density, i.e. the integral of the histogram is 1.
labels : sequence of str, optional
    Human-readable names for the data sets. If omitted, generic names Dataset 0, Dataset 1, … are used.
colors : sequence of str, optional
    Matplotlib colour specifications. If omitted, a distinct colour is generated automatically for each data set.
linewidth : float, default 1.5
    Width of the step lines.
alpha : float in [0, 1], default 1.0
    Transparency of the lines.
linestyle : str, default '-'
    Any valid matplotlib line style ('-', '--', ':', …).
title, xlabel, ylabel : str, optional
    Axis titles.
legend_loc : str or None, default 'best'
    Location of the legend; set to None to suppress the legend.
ax : matplotlib.axes.Axes, optional
    Provide an existing Axes to plot into; otherwise a new figure/axes pair is created.
logmode : bool, default False
    If True, the data are natural-log-transformed before binning and the x-axis tick labels are formatted back to the original scale.
**hist_kwargs
    Additional keyword arguments forwarded to plt.hist (e.g. log=True for a log-scale y-axis).

Returns

matplotlib.axes.Axes
    The Axes object containing the plot (useful for further tweaking).

Example

import numpy as np
from cobrak.plotting import multi_step_histogram
rng = np.random.default_rng()
d1 = rng.normal(size=1000)
d2 = rng.exponential(scale=2, size=1000)
multi_step_histogram([d1, d2],
                     bins=40,
                     density=True,
                     labels=['Normal', 'Exp'],
                     colors=['tab:blue', 'tab:red'],
                     title='Density step-histograms')

Source code in cobrak/plotting.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def multi_step_histogram(
    data: list[list[float]],
    *,
    bins: int | Sequence[float] | str = 10,
    range_: tuple[float, float] | None = None,
    density: bool = False,
    labels: Sequence[str] | None = None,
    colors: Sequence[str] | None = None,
    linewidth: float = 1.5,
    alpha: float = 1.0,
    linestyle: str = "-",
    title: str | None = None,
    xlabel: str | None = "Value",
    ylabel: str | None = None,
    legend_loc: str | None = "best",
    ax: plt.Axes | None = None,
    logmode: bool = False,
    **hist_kwargs,  # noqa: ANN003
) -> plt.Axes:
    """Plot several 1-D data sets as *step* histograms on a single Axes.

    Parameters
    ----------
    data : sequence of iterables
        Each element is a collection of numbers (list, np.ndarray, pd.Series …).
    bins : int, sequence of scalars, or str, default 10
        Passed straight to ``np.histogram`` / ``plt.hist``.
        Use e.g. ``'auto'`` or an explicit array of bin edges for full control.
    range_ : (float, float), optional
        Lower and upper range of the bins.  If ``None`` the range is
        inferred from the data.
    density : bool, default False
        If True, the histogram is normalized to form a probability density,
        i.e. the integral of the histogram is 1.
    labels : sequence of str, optional
        Human-readable names for the data sets.  If omitted, generic names
        ``Dataset 0``, ``Dataset 1`` … are used.
    colors : sequence of str, optional
        Matplotlib colour specifications.  If omitted, a distinct colour is
        generated automatically for each data set.
    linewidth : float, default 1.5
        Width of the step lines.
    alpha : float in [0,1], default 1.0
        Transparency of the lines.
    linestyle : str, default '-'
        Any valid matplotlib line style (``'-'``, ``'--'``, ``':'`` …).
    title, xlabel, ylabel : str, optional
        Axis titles.
    legend_loc : str or None, default 'best'
        Location of the legend; set to ``None`` to suppress the legend.
    ax : matplotlib.axes.Axes, optional
        Provide an existing Axes to plot into; otherwise a new figure/axes
        pair is created.
    logmode : bool, default False
        If ``True`` the data are natural-log-transformed before binning and
        the x-axis tick labels are formatted back to the original scale.
    **hist_kwargs
        Additional keyword arguments forwarded to ``plt.hist`` (e.g.
        ``log=True`` for a log-scale y-axis).

    Returns
    -------
    matplotlib.axes.Axes
        The Axes object containing the plot (useful for further tweaking).

    Example
    -------
    import numpy as np
    from cobrak.plotting import multi_step_histogram
    rng = np.random.default_rng()
    d1 = rng.normal(size=1000)
    d2 = rng.exponential(scale=2, size=1000)
    multi_step_histogram([d1, d2],
                        bins=40,
                        density=True,
                        labels=['Normal', 'Exp'],
                        colors=['tab:blue', 'tab:red'],
                        title='Density step-histograms')
    """

    def _fmt(val: Any, _: Any) -> str:  # noqa: ANN401
        return f"{np.exp(val):.0e}"

    # ------------------------------------------------------------------
    # 1️⃣  Prepare the Axes
    # ------------------------------------------------------------------
    if ax is None:
        fig, ax = plt.subplots(figsize=(8, 5))
    else:
        fig = ax.figure

    # ------------------------------------------------------------------
    # 2️⃣  Normalise input arguments
    # ------------------------------------------------------------------
    n_sets = len(data)
    if labels is None:
        labels = [f"Dataset {i}" for i in range(n_sets)]
    if len(labels) != n_sets:
        raise ValueError("Length of `labels` must match number of data sets.")

    if colors is not None and len(colors) != n_sets:
        raise ValueError("Length of `colors` must match number of data sets.")
    if colors is None:
        colors = distinct_colors(n_sets)

    # ------------------------------------------------------------------
    # 3️⃣  Plot each histogram as a step line
    # ------------------------------------------------------------------
    for idx, (ds, lbl) in enumerate(zip(data, labels)):
        arr = _as_numpy_array(ds)
        if logmode:
            arr = np.log(arr)
        # plt.hist with `histtype='step'` draws exactly what we need.
        # We forward any extra **hist_kwargs** (e.g. log=True) to give the user
        # full flexibility.
        counts, bin_edges, _ = ax.hist(
            arr,
            bins=bins,
            range=range_,
            density=density,
            histtype="step",
            label=lbl,
            color=None if colors is None else colors[idx],
            linewidth=linewidth,
            alpha=alpha,
            linestyle=linestyle,
            **hist_kwargs,
        )

        # --------------------------------------------------------------
        #  Median line – stops at the histogram step
        # --------------------------------------------------------------
        med = np.median(arr)  # median of the data set
        # Find the bin that contains the median
        bin_idx = np.searchsorted(bin_edges, med, side="right") - 1
        # Guard against edge-cases (median exactly on the rightmost edge)
        bin_idx = np.clip(bin_idx, 0, len(counts) - 1)

        # Height of the histogram at the median (count or density)
        med_height = counts[bin_idx]

        # Draw a vertical line from y=0 up to the histogram line
        ax.vlines(
            med,
            0,
            med_height,
            colors=colors[idx] if colors is not None else None,
            linestyles="dashed",
            linewidth=1.5,
        )

    # ------------------------------------------------------------------
    # 4️⃣  Tidy up the figure
    # ------------------------------------------------------------------
    if title:
        ax.set_title(title, fontsize=14, pad=12)

    if xlabel:
        ax.set_xlabel(xlabel, fontsize=12)

    # If the user did not provide a custom ylabel we choose a sensible default.
    if ylabel is None:
        ylabel = "Density" if density else "Count"
    ax.set_ylabel(ylabel, fontsize=12)

    if legend_loc is not None:
        ax.legend(loc=legend_loc, fontsize=10)

    ax.grid(True, which="both", ls=":", linewidth=0.5, alpha=0.7)

    # Tight layout so labels are not clipped.
    fig.tight_layout()

    if logmode:
        ax.xaxis.set_major_formatter(mtick.FuncFormatter(_fmt))

    plt.show()

    return ax

plot_combinations(func, min_values, max_values, num_subplots_per_window=18, num_subplots_per_row=6)

Plot all unique combinations of 2 variable arguments, holding the remaining arguments constant.

For every pair of arguments, a 3D surface subplot is drawn: the x- and y-axes span the two variable arguments (each sampled on a 100x100 grid between its minimum and maximum value) and the z-axis shows the function value. The remaining arguments are held at constant values derived from their minimum and maximum values; each constant combination yields its own subplot. Subplots are distributed over windows of at most num_subplots_per_window plots, with up to num_subplots_per_row subplots per row.

Example usage:
from cobrak.plotting import plot_combinations
min_values = [-1.0, 0.0, 0.0]
max_values = [10.0, 5.0, 10.0]
def example_func(args: list[float]) -> float:
    return args[0] + args[1] + args[2]
plot_combinations(example_func, min_values, max_values)

Args:

  • func: The function to be plotted. It takes a list of floats and returns a float.
  • min_values: A list of minimum possible values for each argument.
  • max_values: A list of maximum possible values for each argument.
  • num_subplots_per_window: The maximum number of subplots per window. Defaults to 18.
  • num_subplots_per_row: The maximum number of subplots per row in a window. Defaults to 6.

Returns:

  None
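The pair enumeration described above ("all unique combinations of 2 variable arguments") is equivalent to itertools.combinations over the argument indices; a minimal stdlib-only sketch, assuming a three-argument function:

```python
from itertools import combinations

# For n function arguments, every unordered index pair (i, j) with i < j
# becomes one variable combination; the remaining arguments are held constant.
n_args = 3  # e.g. len(min_values)
variable_combinations = list(combinations(range(n_args), 2))
print(variable_combinations)  # [(0, 1), (0, 2), (1, 2)]
```

With three arguments this yields three subplot families, one per argument pair.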

Source code in cobrak/plotting.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def plot_combinations(
    func: Callable[[list[float]], float],
    min_values: list[float],
    max_values: list[float],
    num_subplots_per_window: int = 18,
    num_subplots_per_row: int = 6,
) -> None:
    """Plot all unique combinations of 2 variable arguments and constant values for the other arguments.

    The plot is a scatter plot with different colors for each category in the hue column. The x-axis represents the x-data,
    the y-axis represents the y-data, and the hue axis represents the category.

    The plot has the following features:

    * A title at the top of the plot with the specified title.
    * Labels for the x-axis and y-axis with the specified labels.
    * A legend on the right side of the plot with the specified hue label.
    * Different colors for each category in the hue column, specified by the palette.
    * A scatter plot with points representing the data.

    Example usage:
    from cobrak.plotting import plot_combinations
    min_values = [-1.0, 0.0, 0.0]
    max_values = [10.0, 5.0, 10.0]
    def example_func(args: List[float]) -> float:
       return args[0] + args[1] + args[2]
    plot_combinations(example_func, min_values, max_values)

    Args:
    - func: The function to be plotted. It takes a list of floats and returns a float.
    - min_values: A list of minimum possible values for each argument.
    - max_values: A list of maximum possible values for each argument.
    - num_subplots_per_window: The maximum number of subplots per window. Defaults to 18.
    - num_subplots_per_row: The maximum number of subplots per row in a window. Defaults to 6.

    Returns:
    - None
    """

    # Generate all possible combinations of 2 variable arguments
    variable_combinations = []
    for i in range(len(min_values)):
        for j in range(i + 1, len(min_values)):
            variable_combinations.append((i, j))

    # Generate all unique combinations of variable and constant arguments
    combinations = []
    for variable_combination in variable_combinations:
        constant_combinations = _get_constant_combinations(
            len(min_values), variable_combination, min_values, max_values
        )
        for constant_combination in constant_combinations:
            combinations.append((variable_combination, constant_combination))

    # Plot each combination
    num_windows = int(np.ceil(len(combinations) / num_subplots_per_window))
    for window_index in range(num_windows):
        num_subplots = min(
            num_subplots_per_window,
            len(combinations) - window_index * num_subplots_per_window,
        )
        num_rows = int(np.ceil(num_subplots / num_subplots_per_row))
        _, axs = plt.subplots(
            num_rows,
            num_subplots_per_row,
            figsize=(20, 5 * num_rows),
            subplot_kw={"projection": "3d"},
        )
        if num_subplots_per_row == 1:
            axs = [[ax] for ax in axs]
        elif num_rows == 1:
            axs = [axs]
        else:
            axs = [list(axs_row) for axs_row in axs]

        z_mins = {}
        z_maxs = {}
        for combination in combinations[
            window_index * num_subplots_per_window : (window_index + 1)
            * num_subplots_per_window
        ]:
            variable_combination, constant_combination = combination
            if variable_combination not in z_mins:
                z_mins[variable_combination] = float("inf")
                z_maxs[variable_combination] = float("-inf")
            x = np.linspace(
                min_values[variable_combination[0]],
                max_values[variable_combination[0]],
                100,
            )
            y = np.linspace(
                min_values[variable_combination[1]],
                max_values[variable_combination[1]],
                100,
            )
            X, Y = np.meshgrid(x, y)
            Z = np.zeros(X.shape)
            for i in range(X.shape[0]):
                for j in range(X.shape[1]):
                    args = [
                        (
                            X[i, j]
                            if k == variable_combination[0]
                            else (
                                Y[i, j]
                                if k == variable_combination[1]
                                else constant_combination[k]
                            )
                        )
                        for k in range(len(min_values))
                    ]
                    Z[i, j] = func(args)
            z_mins[variable_combination] = min(z_mins[variable_combination], np.min(Z))  # type: ignore
            z_maxs[variable_combination] = max(z_maxs[variable_combination], np.max(Z))  # type: ignore

        for subplot_index, combination in enumerate(
            combinations[
                window_index * num_subplots_per_window : (window_index + 1)
                * num_subplots_per_window
            ]
        ):
            variable_combination, constant_combination = combination
            x = np.linspace(
                min_values[variable_combination[0]],
                max_values[variable_combination[0]],
                100,
            )
            y = np.linspace(
                min_values[variable_combination[1]],
                max_values[variable_combination[1]],
                100,
            )
            X, Y = np.meshgrid(x, y)
            Z = np.zeros(X.shape)
            for i in range(X.shape[0]):
                for j in range(X.shape[1]):
                    args = [
                        (
                            X[i, j]
                            if k == variable_combination[0]
                            else (
                                Y[i, j]
                                if k == variable_combination[1]
                                else constant_combination[k]
                            )
                        )
                        for k in range(len(min_values))
                    ]
                    Z[i, j] = func(args)

            row_index = subplot_index // num_subplots_per_row
            col_index = subplot_index % num_subplots_per_row

            axs[row_index][col_index].plot_surface(
                X, Y, Z, cmap="viridis", edgecolor="none"
            )
            axs[row_index][col_index].set_xlabel(f"Argument {variable_combination[0]}")
            axs[row_index][col_index].set_ylabel(f"Argument {variable_combination[1]}")
            axs[row_index][col_index].set_zlim(
                z_mins[variable_combination], z_maxs[variable_combination]
            )

            constant_title = ", ".join(
                [
                    f"{i}: {constant_combination[i]}"
                    for i in range(len(min_values))
                    if i not in variable_combination
                ]
            )
            axs[row_index][col_index].set_title(
                (
                    f"Variable: {variable_combination[0]}, {variable_combination[1]}\n"
                    f"Constant: {constant_title}"
                    if constant_title
                    else f"Variable: {variable_combination[0]}, {variable_combination[1]}"
                ),
                fontsize=8,
            )

            # Hide empty plots
            if subplot_index >= num_subplots:
                axs[row_index][col_index].axis("off")

        plt.tight_layout()
        plt.show()

plot_objvalue_evolution(json_path, output_path, ylabel='Objective value', objvalue_multiplicator=-1.0, with_legend=False, precision=4)

Plots the evolution of the objective value over computational time.

Parameters:

Name Type Description Default
json_path str

Path to the JSON file containing the data.

required
output_path str

Path to save the plot.

required
ylabel str

Label for the Y-axis. Defaults to "Objective value".

'Objective value'
objvalue_multiplicator float

Multiplier to apply to the objective value. Defaults to -1.0.

-1.0
with_legend bool

Whether to display the legend. Defaults to False.

False
precision int

The number of decimal places to display on the Y-axis. Defaults to 4.

4

Returns:

Type Description
None

None. Saves the plot to the specified output path.
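Judging from the loading code below, the JSON file maps time points (stored as string keys) to lists whose first entry is the objective value at that time. A minimal stdlib-only sketch of writing such a file (file name and values hypothetical):

```python
import json

# Hypothetical optimization log: computational time [s] -> [objective value, ...].
# plot_objvalue_evolution multiplies each value by objvalue_multiplicator
# (default -1.0) before plotting, so minimization logs plot as positive values.
progress = {"0.0": [-1.0], "10.5": [-2.5], "60.0": [-3.1]}
with open("progress.json", "w") as f:
    json.dump(progress, f)
```

The resulting file could then be passed as json_path, e.g. plot_objvalue_evolution("progress.json", "evolution.png").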

Source code in cobrak/plotting.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def plot_objvalue_evolution(
    json_path: str,
    output_path: str,
    ylabel: str = "Objective value",
    objvalue_multiplicator: float = -1.0,
    with_legend: bool = False,
    precision: int = 4,
) -> None:
    """Plots the evolution of the objective value over computational time.

    Args:
        json_path (str): Path to the JSON file containing the data.
        output_path (str): Path to save the plot.
        ylabel (str, optional): Label for the Y-axis. Defaults to "Objective value".
        objvalue_multiplicator (float, optional): Multiplier to apply to the objective value. Defaults to -1.0.
        with_legend (bool, optional): Whether to display the legend. Defaults to False.
        precision (int, optional): The number of decimal places to display on the Y-axis. Defaults to 4.

    Returns:
        None. Saves the plot to the specified output path.
    """

    def format_decimal(x, _) -> str:  # noqa: ANN001
        return f"{x:.{precision}f}"  # Use the specified precision

    # Load data from JSON file
    data = json_load(json_path, Any)

    # Extract timepoints
    timepoints = tuple(float(key) for key in data)

    # Initialize objvalues list
    objvalues = [[]]

    # Populate objvalues list
    for values in data.values():
        objvalues[0].append(objvalue_multiplicator * values[0])

    plt.clf()
    plt.cla()
    plt.plot(timepoints, objvalues[0], linestyle="-", marker=None, label="Best value")

    # Customize the plot
    plt.xlabel("Computational Time [s]")
    plt.ylabel(ylabel)
    plt.title(f"{ylabel} Evolution Over Time")
    if with_legend:
        plt.legend()

    plt.gca().yaxis.set_major_formatter(FuncFormatter(format_decimal))

    # Save the plot
    plt.savefig(output_path)

    # Close the plot to free up memory
    plt.close()

plot_range_bars(data_captions, data_labels, data_ranges, data_colors, *, cap_len=0.2, line_width=3.0, figsize=(10, 6), title='Range.Bar Plot', ylabel='Label', xlabel='Value', ax=None, highlight_means=None, log_y=False, legend_pos=None, marker_size=80, title_labelsize=16, axes_labelsize=13, ticks_labelsize=13, legend_labelsize=11, legend_bbox_to_anchor=None, ylim=None)

Plot vertical range bars with categorical labels on the x‑axis.

Parameters

data_captions : list[str]
    Labels for the legend; length must match data_colors.
data_labels : list[str]
    Labels for the x-axis. The same strings are also used as the x-axis tick labels after alphabetical sorting.
data_ranges : list[list[tuple[float, float]]]
    Outer list length = number of groups (must equal len(data_colors)). Each inner list must have the same length as data_labels. (low, high) defines the numeric range for the corresponding label.
data_colors : list[str]
    Colour for each group; length must match the outer dimension of data_ranges.
cap_len : float, optional
    Half-width of the horizontal caps at each end of a bar (default 0.2).
line_width : float, optional
    Thickness of the vertical bars and caps (default 3.0).
figsize : tuple[float, float], optional
    Figure size passed to plt.subplots (default (10, 6)).
title, ylabel, xlabel : str, optional
    Plot title and axis labels.
ax : matplotlib.axes.Axes, optional
    Axes to draw on; if None a new figure and axes are created.
highlight_means : list[bool] | None, optional
    True for a group means that the mean of each range (low+high)/2 is highlighted with a larger circular marker. Length must equal len(data_ranges). If None, no means are highlighted.
log_y : bool, optional
    If True the y-axis is set to a logarithmic scale.
legend_pos : str | None, optional
    If not None, the given matplotlib legend position is used.
marker_size : float, optional
    Size of the markers used to highlight range means (default 80).
title_labelsize, axes_labelsize, ticks_labelsize, legend_labelsize : float, optional
    Font sizes for the title, axis labels, tick labels and legend text.
legend_bbox_to_anchor : tuple[float, float] | None, optional
    If given, passed as bbox_to_anchor to the legend.
ylim : tuple[float, float] | None, optional
    If given, fixed limits for the y-axis.

Returns

matplotlib.axes.Axes
    The axes containing the generated plot.

Raises

ValueError
    If the lengths of the input sequences are inconsistent.
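The length constraints above can be illustrated with a stdlib-only sketch that builds valid inputs for two groups over three labels (all names and values hypothetical):

```python
# Two groups over three x-axis labels; one (low, high) tuple per label and group.
data_captions = ["Measured", "Predicted"]
data_labels = ["glc", "ac", "etoh"]
data_ranges = [
    [(0.1, 0.4), (0.0, 0.2), (0.3, 0.9)],  # group 0
    [(0.2, 0.5), (0.1, 0.3), (0.2, 0.8)],  # group 1
]
data_colors = ["tab:blue", "tab:orange"]

# Two of the invariants plot_range_bars verifies before plotting:
assert len(data_ranges) == len(data_colors)
assert all(len(inner) == len(data_labels) for inner in data_ranges)
```

With these inputs, plot_range_bars(data_captions, data_labels, data_ranges, data_colors) would draw two coloured range bars per label.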

Source code in cobrak/plotting.py
def plot_range_bars(
    data_captions: list[str],
    data_labels: list[str],
    data_ranges: list[list[tuple[float, float]]],
    data_colors: list[str],
    *,
    cap_len: float = 0.2,
    line_width: float = 3.0,
    figsize: tuple[float, float] = (10, 6),
    title: str = "Range.Bar Plot",
    ylabel: str = "Label",
    xlabel: str = "Value",
    ax: plt.Axes | None = None,
    highlight_means: list[bool] | None = None,
    log_y: bool = False,
    legend_pos: str | None = None,
    marker_size: float = 80,
    title_labelsize: float = 16,
    axes_labelsize: float = 13,
    ticks_labelsize: float = 13,
    legend_labelsize: float = 11,
    legend_bbox_to_anchor: None | tuple[float, float] = None,
    ylim: None | tuple[float, float] = None,
) -> plt.Axes:
    """Plot vertical range bars with categorical labels on the x‑axis.

    Parameters
    ----------
    data_captions : list[str]
        Labels for the *legend*; length must match ``data_colors``.
    data_labels : list[str]
        Labels for the x axis.  The same strings are also
        used as the x‑axis tick labels after alphabetical sorting.
    data_ranges : list[list[tuple[float, float]]]
        Outer list length = number of groups (must equal ``len(data_colors)``).
        Each inner list must have the same length as ``data_labels``.
        ``(low, high)`` defines the numeric range for the corresponding label.
    data_colors : list[str]
        Colour for each group; length must match the outer dimension of
        ``data_ranges``.
    cap_len : float, optional
        Half‑width of the horizontal caps at each end of a bar (default 0.2).
    line_width : float, optional
        Thickness of the vertical bars and caps (default 3.0).
    figsize : tuple[float, float], optional
        Figure size passed to ``plt.subplots`` (default (10, 6)).
    title, ylabel, xlabel : str, optional
        Plot title and axis labels.
    ax : matplotlib.axes.Axes, optional
        Axes to draw on; if ``None`` a new figure and axes are created.
    highlight_means : list[bool] | None, optional
        ``True`` for a group means that the mean of each range
        ``(low+high)/2`` is highlighted with a larger circular marker.
        Length must equal ``len(data_ranges)``.  If ``None`` no means are
        highlighted.
    log_y : bool, optional
        If ``True`` the y‑axis is set to a logarithmic scale.
    legend_pos : str | None, optional
        If not ``None``, the given matplotlib legend position is used.
    marker_size : float, optional
        Size of the markers used to highlight range means (default 80).
    title_labelsize, axes_labelsize, ticks_labelsize, legend_labelsize : float, optional
        Font sizes for the title, axis labels, tick labels and legend text.
    legend_bbox_to_anchor : tuple[float, float] | None, optional
        If given, passed as ``bbox_to_anchor`` to the legend.
    ylim : tuple[float, float] | None, optional
        If given, fixed limits for the y‑axis.

    Returns
    -------
    matplotlib.axes.Axes
        The axes containing the generated plot.

    Raises
    ------
    ValueError
        If the lengths of the input sequences are inconsistent.
    """
    # ------------------------------------------------------------------ #
    # 1️⃣  Sanity checks
    # ------------------------------------------------------------------ #
    n_groups = len(data_ranges)

    if n_groups != len(data_colors):
        raise ValueError("len(data_colors) must equal the outer length of data_ranges")
    if any(len(inner) != len(data_labels) for inner in data_ranges):
        raise ValueError(
            "Every inner list in data_ranges must have the same length as data_labels"
        )
    if highlight_means is not None and len(highlight_means) != n_groups:
        raise ValueError(
            "highlight_means must be ``None`` or a list with length equal to the number of groups"
        )

    # ------------------------------------------------------------------ #
    # 2️⃣  Alphabetical ordering of the categorical labels (x‑axis)
    # ------------------------------------------------------------------ #
    sorted_idx = sorted(range(len(data_labels)), key=lambda i: data_labels[i])
    sorted_labels = [data_labels[i] for i in sorted_idx]
    # reorder each group’s ranges to match the sorted label order
    sorted_ranges = [[grp[i] for i in sorted_idx] for grp in data_ranges]

    # ------------------------------------------------------------------ #
    # 3️⃣  Figure / Axes handling
    # ------------------------------------------------------------------ #
    if ax is None:
        fig, ax = plt.subplots(figsize=figsize)

    x_pos = range(len(sorted_labels))

    # ------------------------------------------------------------------ #
    # 4️⃣  Plot each group
    # ------------------------------------------------------------------ #
    for grp_idx, (grp_ranges, colour) in enumerate(zip(sorted_ranges, data_colors)):
        lows = [rng[0] for rng in grp_ranges]
        highs = [rng[1] for rng in grp_ranges]

        # vertical range bars
        ax.vlines(
            x=x_pos,
            ymin=lows,
            ymax=highs,
            color=colour,
            linewidth=line_width,
        )
        # caps – low end
        ax.hlines(
            y=lows,
            xmin=[xp - cap_len / 2 for xp in x_pos],
            xmax=[xp + cap_len / 2 for xp in x_pos],
            color=colour,
            linewidth=line_width,
        )
        # caps – high end
        ax.hlines(
            y=highs,
            xmin=[xp - cap_len / 2 for xp in x_pos],
            xmax=[xp + cap_len / 2 for xp in x_pos],
            color=colour,
            linewidth=line_width,
        )

        # ------------------------------------------------------------------
        # Optional mean highlighting
        # ------------------------------------------------------------------
        if highlight_means and highlight_means[grp_idx]:
            means = [(low + high) / 2 for low, high in zip(lows, highs)]
            ax.scatter(
                x=list(x_pos),
                y=means,
                color=colour,
                edgecolor="k",
                zorder=5,
                s=marker_size,
                marker="_",
                linewidth=1.5,
                label="_mean",  # dummy label – we will build the legend ourselves
            )

    # ------------------------------------------------------------------ #
    # 5️⃣  Cosmetics
    # ------------------------------------------------------------------ #
    ax.set_xticks(list(x_pos))
    ax.set_xticklabels(sorted_labels, rotation=45, ha="right")
    ax.set_xlabel(xlabel, fontsize=axes_labelsize)
    ax.set_ylabel(ylabel, fontsize=axes_labelsize)
    ax.set_title(title, loc="left", fontweight="bold", fontsize=title_labelsize)
    ax.yaxis.grid(True, which="both", linestyle="--", alpha=0.5)
    if ylim:
        ax.set_ylim(ylim[0], ylim[1])
    ax.tick_params(axis="x", labelsize=ticks_labelsize)
    ax.tick_params(axis="y", labelsize=ticks_labelsize)

    if log_y:
        ax.set_yscale("log")

    # ------------------------------------------------------------------ #
    # 6️⃣  Legend – built from *data_captions* (one entry per group) with the supplied colours
    # ------------------------------------------------------------------ #
    legend_handles = [
        Line2D([0], [0], color=col, lw=line_width, label=lbl)
        for lbl, col in zip(data_captions, data_colors)
    ]
    ax.legend(
        handles=legend_handles,
        loc="best" if legend_pos is None else legend_pos,
        fontsize=legend_labelsize,
        bbox_to_anchor=legend_bbox_to_anchor,
    )
    ax.margins(x=0.01)

    plt.tight_layout()
    return ax

plot_variabilities(variabilities, variability_names, variability_titles, colors, xlabel='', ylabel='', yscale='log', plot_mean=True, save_path=None)

Plots the mean values and whisker bars for multiple variabilities.

This function generates a plot where each variability is represented by a series of points and whisker bars. Each point (drawn if plot_mean==True) marks the mean value of a data point in the variability, and the whisker bars span the lower and upper bounds. The variabilities are grouped together for each data point, with a gap between groups to clearly distinguish them.

Parameters:

variabilities : List[List[Tuple[float, float, float]]]
A list of lists, where each inner list represents a variability. Each tuple in the inner list contains (lower_bound, upper_bound, mean_value) for each data point in the variability.

variability_names : List[str]
A list of strings labeling the data-point groups; used as the x-axis tick labels.

variability_titles : List[str]
A list of strings used as the legend entries, one per variability.

colors : List[str]
A list of strings representing the colors for each variability, e.g. using names from https://matplotlib.org/stable/gallery/color/named_colors.html

xlabel : str, optional
Label for the x-axis. Default is ''.

ylabel : str, optional
Label for the y-axis. Default is ''.

yscale : str, optional
Matplotlib scale for the y-axis. Default is 'log'.

plot_mean : bool, optional
If True, the mean value is plotted as a point. If False, only the whisker bars are plotted. Default is True.

save_path : str, optional
The file path where the plot should be saved. If None, the plot is displayed. Default is None.

Returns:

None The function either displays the plot or saves it to the specified path.

Example:

from cobrak.plotting import plot_variabilities
in_vivo = [(1.0, 3.0, 2.0), (2.0, 4.0, 3.0), (3.0, 5.0, 4.0)]
in_silico = [(1.5, 3.5, 2.5), (2.5, 4.5, 3.5), (3.5, 5.5, 4.5)]
another_variability = [(1.2, 3.2, 2.2), (2.2, 4.2, 3.2), (3.2, 5.2, 4.2)]
variabilities = [in_vivo, in_silico, another_variability]
variability_names = ['point_1', 'point_2', 'point_3']
variability_titles = ['in_vivo', 'in_silico', 'another_variability']
colors = ['blue', 'orange', 'green']
plot_variabilities(variabilities, variability_names, variability_titles, colors)
plot_variabilities(variabilities, variability_names, variability_titles, colors, plot_mean=False)
plot_variabilities(variabilities, variability_names, variability_titles, colors, save_path='plot.png')
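The grouped x-positions described above (one slot per variability, plus one empty slot between data-point groups) can be sketched as follows; the helper name is illustrative, not part of COBRAk:

```python
def group_positions(n_points: int, n_variabilities: int) -> list[list[int]]:
    """Illustrative re-implementation of the x-position layout:
    each data point gets n_variabilities adjacent integer slots,
    followed by a one-slot gap before the next group."""
    return [
        list(
            range(
                i * (n_variabilities + 1),
                i * (n_variabilities + 1) + n_variabilities,
            )
        )
        for i in range(n_points)
    ]

# Three data points, three variabilities -> gaps at x = 3 and x = 7
print(group_positions(3, 3))  # [[0, 1, 2], [4, 5, 6], [8, 9, 10]]
```

The separator lines drawn by plot_variabilities sit at the midpoints of these gaps.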

Source code in cobrak/plotting.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def plot_variabilities(
    variabilities: list[list[tuple[float, float, float]]],
    variability_names: list[str],
    variability_titles: list[str],
    colors: list[str],
    xlabel: str = "",
    ylabel: str = "",
    yscale: str = "log",
    plot_mean: bool = True,
    save_path: str | None = None,
) -> None:
    """Plots the mean values and whisker bars for multiple variabilities.

    This function generates a plot where each variability is represented by a series of points and whisker bars.
    Each point (drawn if plot_mean==True) marks the mean value of a data point in the variability, and the whisker bars span the lower and upper bounds.
    The variabilities are grouped together for each data point, with a gap between groups to clearly distinguish them.

    Parameters:
    -----------
    variabilities : List[List[Tuple[float, float, float]]]
    A list of lists, where each inner list represents a variability. Each tuple in the inner list contains
    (lower_bound, upper_bound, mean_value) for each data point in the variability.

    variability_names : List[str]
    A list of strings labeling the data-point groups; used as the x-axis tick labels.

    variability_titles : List[str]
    A list of strings used as the legend entries, one per variability.

    colors : List[str]
    A list of strings representing the colors for each variability, e.g. using names from https://matplotlib.org/stable/gallery/color/named_colors.html

    xlabel : str, optional
    Label for the x-axis. Default is "".

    ylabel : str, optional
    Label for the y-axis. Default is "".

    yscale : str, optional
    Matplotlib scale for the y-axis. Default is "log".

    plot_mean : bool, optional
    If True, the mean value is plotted as a point. If False, only the whisker bars are plotted. Default is True.

    save_path : str, optional
    The file path where the plot should be saved. If None, the plot is displayed. Default is None.

    Returns:
    --------
    None
    The function either displays the plot or saves it to the specified path.

    Example:
    --------
    from cobrak.plotting import plot_variabilities
    in_vivo = [(1.0, 3.0, 2.0), (2.0, 4.0, 3.0), (3.0, 5.0, 4.0)]
    in_silico = [(1.5, 3.5, 2.5), (2.5, 4.5, 3.5), (3.5, 5.5, 4.5)]
    another_variability = [(1.2, 3.2, 2.2), (2.2, 4.2, 3.2), (3.2, 5.2, 4.2)]
    variabilities = [in_vivo, in_silico, another_variability]
    variability_names = ['point_1', 'point_2', 'point_3']
    variability_titles = ['in_vivo', 'in_silico', 'another_variability']
    colors = ['blue', 'orange', 'green']
    plot_variabilities(variabilities, variability_names, variability_titles, colors)
    plot_variabilities(variabilities, variability_names, variability_titles, colors, plot_mean=False)
    plot_variabilities(variabilities, variability_names, variability_titles, colors, save_path='plot.png')
    """
    # Number of data points per variability, and number of variabilities
    n = len(variabilities[0])
    num_variabilities = len(variabilities)

    # Create a figure and axis
    _, ax = plt.subplots()

    # Define the positions for the groups
    positions = [
        list(
            range(
                i * (num_variabilities + 1),
                i * (num_variabilities + 1) + num_variabilities,
            )
        )
        for i in range(n)
    ]

    # Plot each variability
    for i, (pos_group, variability) in enumerate(zip(positions, zip(*variabilities))):
        for j, (pos, (lower, upper, mean)) in enumerate(zip(pos_group, variability)):
            if plot_mean:
                ax.errorbar(
                    pos,
                    mean,
                    yerr=[[mean - lower], [upper - mean]],
                    fmt="o",
                    capsize=5,
                    color=colors[j],
                    ecolor=colors[j],
                    label=variability_titles[j] if i == 0 else "",
                )
            else:
                ax.errorbar(
                    pos,
                    mean,
                    yerr=[[mean - lower], [upper - mean]],
                    fmt="none",
                    capsize=5,
                    ecolor=colors[j],
                    label=variability_titles[j] if i == 0 else "",
                )

    # Calculate midpoints between groups for vertical lines
    for i in range(len(positions) - 1):
        # Get the end of the current group and the start of the next group
        current_group_end = positions[i][-1]
        next_group_start = positions[i + 1][0]
        # Calculate the midpoint
        midpoint = (current_group_end + next_group_start) / 2
        # Draw a thin vertical black line at the midpoint
        ax.axvline(x=midpoint, color="black", linestyle="-", linewidth=0.5, alpha=0.7)

    # Set the x-axis labels
    ax.set_xticks([pos[0] + (num_variabilities - 1) / 2 for pos in positions])
    ax.set_xticklabels(variability_names)

    # Add labels and title
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.set_title("Comparison of Variabilities")
    ax.set_yscale(yscale)

    # Add legend
    ax.legend()

    # Save or show the plot
    if save_path is not None:
        plt.savefig(save_path)
    else:
        plt.show()

scatterplot_with_labels(x_data, y_data, labels, x_label=None, y_label=None, y_log=True, x_log=True, add_labels=False, identical_axis_lims=True, xlim_overwrite=None, ylim_overwrite=None, ax=None, save_path=None, title=None, extratext=None, x_labelsize=13, y_labelsize=13, major_tick_labelsize=13, minor_tick_labelsize=10, legend_labelsize=13, title_labelsize=16, extratext_labelsize=14, label_fontsize=13, labelcoords=(0, 10))

Generates a scatter plot with error bars and optional point labels.

Can be used standalone ("one-off" plot with plt.show()), or for subplotting by passing an Axes object. Optionally saves the figure if save_path is provided.

Parameters

x_data : list[tuple[float, float, float]]
Each tuple is (lower bound, upper bound, drawn value) for x.

y_data : list[tuple[float, float, float]]
Each tuple is (lower bound, upper bound, drawn value) for y.

labels : list[str]
Labels for each point (used if add_labels is True).

x_label : str, optional
X-axis label.

y_label : str, optional
Y-axis label.

y_log : bool, default True
Use log scale for y-axis.

x_log : bool, default True
Use log scale for x-axis.

add_labels : bool, default False
Annotate points with corresponding label.

identical_axis_lims : bool, default True
Make x and y axis limits identical and auto-scale them.

ax : matplotlib.axes.Axes, optional
If provided, plot is drawn on this Axes (for subplotting).

save_path : str, optional
If provided and ax is None (standalone plotting), save the figure at this path instead of showing.

Returns

ax : matplotlib.axes.Axes The axis object containing the plot.
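The (lower bound, upper bound, drawn value) tuple convention maps onto Matplotlib's asymmetric error bars; a minimal sketch of that conversion (plain Python, illustrative helper name, no plotting):

```python
def to_asymmetric_errors(
    data: list[tuple[float, float, float]],
) -> tuple[list[float], list[float], list[float]]:
    """Split (lower, upper, drawn) tuples into the drawn values and the
    lower/upper error *distances* that matplotlib's errorbar() expects."""
    drawn = [d[2] for d in data]
    err_low = [d[2] - d[0] for d in data]   # distance from drawn down to lower bound
    err_high = [d[1] - d[2] for d in data]  # distance from drawn up to upper bound
    return drawn, err_low, err_high

drawn, err_low, err_high = to_asymmetric_errors([(1.0, 3.0, 2.0), (2.0, 4.0, 2.5)])
print(drawn, err_low, err_high)  # [2.0, 2.5] [1.0, 0.5] [1.0, 1.5]
```

These distance lists correspond to the `xerr`/`yerr` pairs passed to `ax.errorbar` in the source below.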

Source code in cobrak/plotting.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def scatterplot_with_labels(
    x_data: list[tuple[float, float, float]],
    y_data: list[tuple[float, float, float]],
    labels: list[str],
    x_label: str | None = None,
    y_label: str | None = None,
    y_log: bool = True,
    x_log: bool = True,
    add_labels: bool = False,
    identical_axis_lims: bool = True,
    xlim_overwrite: None | tuple[float, float] = None,
    ylim_overwrite: None | tuple[float, float] = None,
    ax: plt.Axes | None = None,
    save_path: str | None = None,
    title: str | None = None,
    extratext: str | None = None,
    x_labelsize: float = 13,
    y_labelsize: float = 13,
    major_tick_labelsize: float = 13,
    minor_tick_labelsize: float = 10,
    legend_labelsize: float = 13,  # noqa: ARG001
    title_labelsize: float = 16,
    extratext_labelsize: float = 14,
    label_fontsize: float = 13,
    labelcoords: tuple[float, float] = (0, 10),
) -> plt.Axes:
    """Generates a scatter plot with error bars and optional point labels.

    Can be used standalone ("one-off" plot with plt.show()), or for subplotting by passing an Axes object.
    Optionally saves the figure if save_path is provided.

    Parameters
    ----------
    x_data : list[tuple[float, float, float]]
        Each tuple is (lower bound, upper bound, drawn value) for x.
    y_data : list[tuple[float, float, float]]
        Each tuple is (lower bound, upper bound, drawn value) for y.
    labels : list[str]
        Labels for each point (used if add_labels is True).
    x_label : str, optional
        X-axis label.
    y_label : str, optional
        Y-axis label.
    y_log : bool, default True
        Use log scale for y-axis.
    x_log : bool, default True
        Use log scale for x-axis.
    add_labels : bool, default False
        Annotate points with corresponding label.
    identical_axis_lims : bool, default True
        Make x and y axis limits identical and auto-scale them.
    ax : matplotlib.axes.Axes, optional
        If provided, plot is drawn on this Axes (for subplotting).
    save_path : str, optional
        If provided and `ax` is None (standalone plotting), save the figure at this path instead of showing.

    Returns
    -------
    ax : matplotlib.axes.Axes
        The axis object containing the plot.
    """
    # Calculate midpoints and error sizes for x and y coordinates
    x_drawn = [x[2] for x in x_data]
    x_low = [x[0] for x in x_data]
    x_high = [x[1] for x in x_data]
    x_err_low = [x_drawn[i] - x_low[i] for i in range(len(x_data))]
    x_err_high = [x_high[i] - x_drawn[i] for i in range(len(x_data))]

    y_drawn = [y[2] for y in y_data]
    y_low = [y[0] for y in y_data]
    y_high = [y[1] for y in y_data]
    y_err_low = [y_drawn[i] - y_low[i] for i in range(len(y_data))]
    y_err_high = [y_high[i] - y_drawn[i] for i in range(len(y_data))]

    n_points = len(x_drawn)
    colors = get_cmap("viridis")(np.linspace(0, 1, n_points))

    _created_fig = False
    if ax is None:
        fig, ax = plt.subplots(figsize=(10, 6))
        _created_fig = True

    # Plot each point individually to assign different colors
    for i in range(n_points):
        ax.errorbar(
            x_drawn[i],
            y_drawn[i],
            xerr=[[x_err_low[i]], [x_err_high[i]]],
            yerr=[[y_err_low[i]], [y_err_high[i]]],
            fmt="o",
            markersize=7,
            color=colors[i],
            capsize=4,
            capthick=2,
            elinewidth=2,
        )

    # Add labels to each point
    if add_labels:
        for i, (xi, yi) in enumerate(zip(x_drawn, y_drawn)):
            ax.annotate(
                labels[i],
                (xi, yi),
                textcoords="offset points",
                xytext=labelcoords,
                ha="center",
                fontsize=label_fontsize,
            )

    # Axis limits & unity line
    all_x_values = [x_datapoint[0] for x_datapoint in x_data] + [
        x_datapoint[1] for x_datapoint in x_data
    ]
    all_y_values = [y_datapoint[0] for y_datapoint in y_data] + [
        y_datapoint[1] for y_datapoint in y_data
    ]

    min_val = min(*all_y_values, *all_x_values) * 0.99
    max_val = max(*all_y_values, *all_x_values) * 1.2

    if identical_axis_lims:
        ax.set_xlim(min_val, max_val)
        ax.set_ylim(min_val, max_val)

    if xlim_overwrite is not None:
        ax.set_xlim(xlim_overwrite[0], xlim_overwrite[1])
    if ylim_overwrite is not None:
        ax.set_ylim(ylim_overwrite[0], ylim_overwrite[1])

    x_unity = np.linspace(0, max_val * 100, 10)
    y_unity = x_unity
    ax.plot(x_unity, y_unity, "-", color="black", linewidth=1)

    if y_log:
        ax.set_yscale("log")
    if x_log:
        ax.set_xscale("log")

    if x_label:
        ax.set_xlabel(x_label, fontsize=x_labelsize)
    if y_label:
        ax.set_ylabel(y_label, fontsize=y_labelsize)

    if title is not None:
        ax.set_title(title, loc="left", fontweight="bold", fontsize=title_labelsize)

    if extratext:
        ax.text(
            0.025,
            0.975,
            extratext,
            horizontalalignment="left",
            verticalalignment="top",
            transform=ax.transAxes,
            fontsize=extratext_labelsize,
            fontweight="bold",
        )

    ax.grid(True)

    ax.tick_params(axis="both", which="major", labelsize=major_tick_labelsize)
    ax.tick_params(axis="both", which="minor", labelsize=minor_tick_labelsize)
    ax.yaxis.set_major_locator(ax.xaxis.get_major_locator())

    if _created_fig:
        plt.tight_layout()
        if save_path is not None:
            plt.savefig(save_path)
        else:
            plt.show()
        plt.close(fig)
    return ax

printing

Pretty-print summaries of optimization and variability results as well as COBRAk Model instances.

For results, its functions generate rich tables that display flux values and variability information for each category. For models, its functions generate rich tables that display the model's structure and parameters.

print_dict(dictionary, indent=4)

Pretty-print a dictionary in a JSON formatted string with the specified indentation.

Args:

dictionary (dict[Any, Any]): The dictionary to print.
indent (int, optional): The number of spaces for indentation. Defaults to 4.
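As the source below shows, this is a json.dumps call routed through Rich's console. A stdlib-only sketch of the formatting step (minus Rich's colouring; the helper name is illustrative, not part of COBRAk):

```python
from json import dumps


def format_dict_plain(dictionary: dict, indent: int = 4) -> str:
    """Stdlib-only sketch of print_dict's formatting: JSON-serialize
    the dict with the given indentation (returned instead of printed)."""
    return dumps(dictionary, indent=indent)


print(format_dict_plain({"flux": 1.5, "id": "R1"}, indent=2))
```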

Source code in cobrak/printing.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def print_dict(dictionary: dict[Any, Any], indent: int = 4) -> None:
    """Pretty-print a dictionary in a JSON formatted string with the specified indentation.

    Args:
    dictionary (dict[Any, Any]): The dictionary to print.
    indent (int, optional): The number of spaces for indentation. Defaults to 4.
    """
    console.print(dumps(dictionary, indent=indent))

print_model(cobrak_model, print_reacs=True, print_enzymes=True, print_mets=True, print_extra_linear_constraints=True, print_settings=True, conc_rounding=6)

Pretty-print a detailed summary of the model, including reactions, enzymes, metabolites, and settings.

Args:

cobrak_model (Model): The model to print.
print_reacs (bool, optional): Whether to print reactions. Defaults to True.
print_enzymes (bool, optional): Whether to print enzymes. Defaults to True.
print_mets (bool, optional): Whether to print metabolites. Defaults to True.
print_extra_linear_constraints (bool, optional): Whether to print extra linear constraints. Defaults to True.
print_settings (bool, optional): Whether to print general settings. Defaults to True.
conc_rounding (int, optional): Number of decimal places to round concentrations to. Defaults to 6.
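As the source below shows, metabolite bounds are stored on a natural-log scale (log_min_conc/log_max_conc) and converted back for display; conc_rounding controls the rounding of that conversion. A minimal sketch (illustrative helper name, not a COBRAk function):

```python
from math import exp, log


def displayed_conc(log_conc: float, conc_rounding: int = 6) -> float:
    """Convert a log-scale concentration bound back to linear scale and
    round it, mirroring print_model's metabolite table columns."""
    return round(exp(log_conc), conc_rounding)


print(displayed_conc(log(0.02)))  # 0.02
print(displayed_conc(-6.9, 3))    # 0.001
```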

Source code in cobrak/printing.py
@validate_call
def print_model(
    cobrak_model: Model,
    print_reacs: bool = True,
    print_enzymes: bool = True,
    print_mets: bool = True,
    print_extra_linear_constraints: bool = True,
    print_settings: bool = True,
    conc_rounding: int = 6,
) -> None:
    """Pretty-print a detailed summary of the model, including reactions, enzymes, metabolites, and settings.

    Args:
    cobrak_model (Model): The model to print.
    print_reacs (bool, optional): Whether to print reactions. Defaults to True.
    print_enzymes (bool, optional): Whether to print enzymes. Defaults to True.
    print_mets (bool, optional): Whether to print metabolites. Defaults to True.
    print_extra_linear_constraints (bool, optional): Whether to print extra linear constraints. Defaults to True.
    print_settings (bool, optional): Whether to print general settings. Defaults to True.
    conc_rounding (int, optional): Number of decimal places to round concentrations to. Defaults to 6.
    """

    console.print("\n[b u]Model[/b u]")

    if print_reacs:
        reac_table = Table(title="Reactions", title_justify="left")
        reac_table.add_column("ID")
        reac_table.add_column("String")
        reac_table.add_column("ΔG'°")
        reac_table.add_column("kcat")
        reac_table.add_column("kM")
        reac_table.add_column("kI")
        reac_table.add_column("kA")
        reac_table.add_column("Hills")
        reac_table.add_column("Name")
        reac_table.add_column("Annotation")

        for reac_id, reaction in sort_dict_keys(cobrak_model.reactions).items():
            arguments = [
                reac_id,
                get_reaction_string(cobrak_model, reac_id),
                _none_as_na(reaction.dG0),
                (
                    "N/A"
                    if reaction.enzyme_reaction_data is None
                    else str(reaction.enzyme_reaction_data.k_cat)
                ),
                (
                    "N/A"
                    if reaction.enzyme_reaction_data is None
                    else str(reaction.enzyme_reaction_data.k_ms)
                ),
                (
                    "N/A"
                    if reaction.enzyme_reaction_data is None
                    else str(reaction.enzyme_reaction_data.k_is)
                ),
                (
                    "N/A"
                    if reaction.enzyme_reaction_data is None
                    else str(reaction.enzyme_reaction_data.k_as)
                ),
                (
                    "N/A"
                    if reaction.enzyme_reaction_data is None
                    else str(reaction.enzyme_reaction_data.hill_coefficients)
                ),
                reaction.name,
                str(reaction.annotation),
            ]
            reac_table.add_row(*arguments)
        console.print(reac_table)

    if print_enzymes and cobrak_model.enzymes != {}:
        enzyme_table = Table(title="Enzymes", title_justify="left")
        enzyme_table.add_column("ID")
        enzyme_table.add_column("MW")
        enzyme_table.add_column("min([E])")
        enzyme_table.add_column("max([E])")
        enzyme_table.add_column("Name")
        enzyme_table.add_column("Annotation")

        for enzyme_id, enzyme in sort_dict_keys(cobrak_model.enzymes).items():
            arguments = [
                enzyme_id,
                str(enzyme.molecular_weight),
                _none_as_na(enzyme.min_conc),
                _none_as_na(enzyme.max_conc),
                enzyme.name,
                str(enzyme.annotation),
            ]
            enzyme_table.add_row(*arguments)
        console.print(enzyme_table)

    if print_mets:
        met_table = Table(title="Metabolites", title_justify="left")
        met_table.add_column("ID")
        met_table.add_column("min(c)")
        met_table.add_column("max(c)")
        met_table.add_column("Name")
        met_table.add_column("Annotation")

        for met_id, metabolite in sort_dict_keys(cobrak_model.metabolites).items():
            arguments = [
                met_id,
                str(round(exp(metabolite.log_min_conc), conc_rounding)),
                str(round(exp(metabolite.log_max_conc), conc_rounding)),
                metabolite.name,
                str(metabolite.annotation),
            ]
            met_table.add_row(*arguments)
        console.print(met_table)

    if print_extra_linear_constraints and cobrak_model.extra_linear_constraints != []:
        console.print("\n[b u]Extra linear constraints[/b u]")
        for extra_linear_constraint in cobrak_model.extra_linear_constraints:
            console.print(get_extra_linear_constraint_string(extra_linear_constraint))

    if print_settings:
        console.print("\n[i]General settings[/i]")
        print_strkey_dict_as_table(
            {
                "Protein pool": cobrak_model.T,
                "R [kJ⋅K⁻¹⋅mol⁻¹]": cobrak_model.R,
                "T [K]": cobrak_model.T,
                "Kinetic-ignored mets": ", ".join(
                    cobrak_model.kinetic_ignored_metabolites
                ),
            }
        )

print_optimization_result(cobrak_model, optimization_dict, print_exchanges=True, print_reactions=True, print_enzymes=True, print_mets=True, print_error_values_if_existing=True, add_stoichiometries=False, rounding=3, conc_rounding=6, ignore_unused=False, multiple_tables_per_line=True, unused_limit=0.0001)

Pretty-print the results of an optimization, including exchanges, reactions, enzymes, and metabolites.

Args:

cobrak_model (Model): The model used for optimization.
optimization_dict (dict[str, float]): A dictionary containing the optimization results.
print_exchanges (bool, optional): Whether to print exchange reactions. Defaults to True.
print_reactions (bool, optional): Whether to print non-exchange reactions. Defaults to True.
print_enzymes (bool, optional): Whether to print enzyme usage. Defaults to True.
print_mets (bool, optional): Whether to print metabolite concentrations. Defaults to True.
print_error_values_if_existing (bool, optional): Whether to print error values if they exist in the result. Defaults to True.
add_stoichiometries (bool, optional): Whether to include reaction stoichiometries. Defaults to False.
rounding (int, optional): Number of decimal places to round to. Defaults to 3.
conc_rounding (int, optional): Number of decimal places to round concentrations to. Defaults to 6.
ignore_unused (bool, optional): Whether to skip reactions whose flux does not exceed unused_limit. Defaults to False.
multiple_tables_per_line (bool, optional): Whether to display multiple tables side by side. Defaults to True.
unused_limit (float, optional): Flux threshold at or below which a reaction counts as unused. Defaults to 1e-4.
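The ignore_unused/unused_limit pair determines which rows appear in the tables; the filter amounts to the following sketch (illustrative helper, not a COBRAk function):

```python
def visible_fluxes(
    fluxes: dict[str, float],
    ignore_unused: bool = True,
    unused_limit: float = 1e-4,
) -> dict[str, float]:
    """Sketch of the ignore_unused/unused_limit filter: when ignore_unused
    is set, keep only entries whose flux exceeds unused_limit."""
    if not ignore_unused:
        return dict(fluxes)
    return {reac_id: v for reac_id, v in fluxes.items() if v > unused_limit}


print(visible_fluxes({"EX_glc": 10.0, "R_dead": 0.0, "R_tiny": 5e-5}))
# {'EX_glc': 10.0}
```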

Source code in cobrak/printing.py
@validate_call
def print_optimization_result(
    cobrak_model: Model,
    optimization_dict: dict[str, float],
    print_exchanges: bool = True,
    print_reactions: bool = True,
    print_enzymes: bool = True,
    print_mets: bool = True,
    print_error_values_if_existing: bool = True,
    add_stoichiometries: bool = False,
    rounding: int = 3,
    conc_rounding: int = 6,
    ignore_unused: bool = False,
    multiple_tables_per_line: bool = True,
    unused_limit: float = 1e-4,
) -> None:
    """Pretty-print the results of an optimization, including exchanges, reactions, enzymes, and metabolites.

    Args:
    cobrak_model (Model): The model used for optimization.
    optimization_dict (dict[str, float]): A dictionary containing the optimization results.
    print_exchanges (bool, optional): Whether to print exchange reactions. Defaults to True.
    print_reactions (bool, optional): Whether to print non-exchange reactions. Defaults to True.
    print_enzymes (bool, optional): Whether to print enzyme usage. Defaults to True.
    print_mets (bool, optional): Whether to print metabolite concentrations. Defaults to True.
    print_error_values_if_existing (bool, optional): Whether to print error values if they exist in the result. Defaults to True.
    add_stoichiometries (bool, optional): Whether to include reaction stoichiometries. Defaults to False.
    rounding (int, optional): Number of decimal places to round to. Defaults to 3.
    conc_rounding (int, optional): Number of decimal places to round concentrations to. Defaults to 6.
    ignore_unused (bool, optional): Whether to skip reactions whose flux does not exceed unused_limit. Defaults to False.
    multiple_tables_per_line (bool, optional): Whether to display multiple tables side by side. Defaults to True.
    unused_limit (float, optional): Flux threshold at or below which a reaction counts as unused. Defaults to 1e-4.
    """

    table_columns: list[Table] = []

    all_fluxes = [
        optimization_dict[reac_id]
        for reac_id in cobrak_model.reactions
        if reac_id in optimization_dict
    ]
    min_flux = min(all_fluxes)
    max_flux = max(all_fluxes)
    all_dfs = [
        optimization_dict[key]
        for key in optimization_dict
        if key.startswith(DF_VAR_PREFIX)
    ]

    substrate_reac_ids, product_reac_ids = (
        get_substrate_and_product_exchanges(cobrak_model, optimization_dict)
        if print_exchanges
        else ([""], [""])
    )

    if print_exchanges:
        for title, exchange_ids in (
            ("Substrates", substrate_reac_ids),
            ("Products", product_reac_ids),
        ):
            exchange_table = Table(title=title, title_justify="left")
            exchange_table.add_column("ID")
            exchange_table.add_column("Flux")
            for exchange_id in exchange_ids:
                exchange_flux = optimization_dict[exchange_id]
                if ignore_unused and exchange_flux <= unused_limit:
                    continue

                exchange_table.add_row(
                    exchange_id,
                    _mapcolored(
                        round(optimization_dict[exchange_id], rounding),
                        min_flux,
                        max_flux,
                        prefix=_zero_prefix(exchange_flux),
                        suffix=_zero_suffix(exchange_flux),
                    ),
                )
            table_columns.append(exchange_table)

    if print_reactions:
        reac_table = Table(
            title="Non-exchange reactions" if print_exchanges else "Reactions",
            title_justify="left",
        )
        reac_table.add_column("ID")
        reac_table.add_column("v")
        if add_stoichiometries:
            reac_table.add_column("Stoichiometries")
        reac_table.add_column("df")
        reac_table.add_column("κ")
        reac_table.add_column("γ")
        reac_table.add_column("ι")
        reac_table.add_column("α")
        for reac_id in sort_dict_keys(cobrak_model.reactions):
            if ignore_unused and (
                reac_id not in optimization_dict
                or optimization_dict[reac_id] <= unused_limit
            ):
                continue

            if (
                (reac_id not in optimization_dict)
                or (reac_id in product_reac_ids)
                or (reac_id in substrate_reac_ids)
            ):
                continue
            arguments: list[str] = [reac_id]
            if add_stoichiometries:
                arguments.append(get_reaction_string(cobrak_model, reac_id))

            reac_flux = optimization_dict[reac_id]
            prefix, suffix = _zero_prefix(reac_flux), _zero_suffix(reac_flux)

            arguments.extend(
                (
                    _mapcolored(
                        round(reac_flux, rounding),
                        min_flux,
                        max_flux,
                        prefix=prefix,
                        suffix=suffix,
                    ),
                    _get_mapcolored_value_or_na(
                        f"{DF_VAR_PREFIX}{reac_id}",
                        optimization_dict,
                        min(all_dfs) if len(all_dfs) > 0 else 0.0,
                        max(all_dfs) if len(all_dfs) > 0 else 0.0,
                        rounding=rounding,
                        prefix=prefix,
                        suffix=suffix,
                    ),
                    _get_mapcolored_value_or_na(
                        f"{KAPPA_VAR_PREFIX}{reac_id}",
                        optimization_dict,
                        0.0,
                        1.0,
                        rounding=rounding,
                        prefix=prefix,
                        suffix=suffix,
                    ),
                    _get_mapcolored_value_or_na(
                        f"{GAMMA_VAR_PREFIX}{reac_id}",
                        optimization_dict,
                        0.0,
                        1.0,
                        rounding=rounding,
                        prefix=prefix,
                        suffix=suffix,
                    ),
                    _get_mapcolored_value_or_na(
                        f"{IOTA_VAR_PREFIX}{reac_id}",
                        optimization_dict,
                        0.0,
                        1.0,
                        rounding=rounding,
                        prefix=prefix,
                        suffix=suffix,
                    ),
                    _get_mapcolored_value_or_na(
                        f"{ALPHA_VAR_PREFIX}{reac_id}",
                        optimization_dict,
                        0.0,
                        1.0,
                        rounding=rounding,
                        prefix=prefix,
                        suffix=suffix,
                    ),
                )
            )
            reac_table.add_row(*arguments)
        table_columns.append(reac_table)

    if print_enzymes:
        enzyme_table = Table(title="Enzyme usage", title_justify="left")
        enzyme_table.add_column("Pool %")
        enzyme_table.add_column("Enzyme IDs")

        enzyme_usage = get_enzyme_usage_by_protein_pool_fraction(
            cobrak_model, optimization_dict
        )
        for pool_fraction, enzyme_ids in enzyme_usage.items():
            if ignore_unused and pool_fraction <= unused_limit:
                continue

            enzyme_table.add_row(
                _mapcolored(
                    round(pool_fraction * 100, rounding),
                    0.0,
                    100.0,
                    prefix=_zero_prefix(pool_fraction),
                    suffix=_zero_suffix(pool_fraction),
                ),
                "; ".join(enzyme_ids),
            )
        table_columns.append(enzyme_table)

    if print_mets:
        met_table = Table(title="Metabolites", title_justify="left")
        met_table.add_column("ID")
        met_table.add_column("Concentration")
        met_table.add_column("Consumption")
        met_table.add_column("Production")
        for met_id, metabolite in sort_dict_keys(cobrak_model.metabolites).items():
            met_var_id = f"{LNCONC_VAR_PREFIX}{met_id}"

            consumption, production = get_metabolite_consumption_and_production(
                cobrak_model, met_id, optimization_dict
            )

            if ignore_unused and production <= unused_limit:
                continue

            prefix, suffix = _zero_prefix(consumption), _zero_suffix(consumption)
            arguments = [met_id]
            arguments.append(
                _get_mapcolored_value_or_na(
                    met_var_id,
                    optimization_dict,
                    metabolite.log_min_conc,
                    metabolite.log_max_conc,
                    apply=exp,
                    special_value=1.0,
                    rounding=conc_rounding,
                    prefix=prefix,
                    suffix=suffix,
                )
            )

            arguments.append(_none_as_na(consumption, prefix=prefix, suffix=suffix))
            arguments.append(_none_as_na(production, prefix=prefix, suffix=suffix))

            met_table.add_row(*arguments)
        table_columns.append(met_table)

    if (
        print_error_values_if_existing
        and sum(
            key.startswith(ERROR_VAR_PREFIX) for key in list(optimization_dict.keys())
        )
        > 0
    ):
        error_table = Table(title="Errors", title_justify="left")
        error_table.add_column("ID")
        sorted_error_values = sort_dict_keys(
            {
                key[len(ERROR_VAR_PREFIX) + 1 :]: value
                for key, value in optimization_dict.items()
                if key.startswith(ERROR_VAR_PREFIX) and key != ERROR_SUM_VAR_ID
            }
        )
        min_error_value = min(list(sorted_error_values.values()))
        max_error_value = max(list(sorted_error_values.values()))
        for error_name, error_value in sorted_error_values.items():
            if ignore_unused and (error_value <= unused_limit):
                continue

            prefix, suffix = _zero_prefix(error_value), _zero_suffix(error_value)
            arguments = []
            arguments.append(error_name)
            arguments.append(
                _get_mapcolored_value_or_na(
                    error_name,
                    sorted_error_values,
                    min_value=min_error_value,
                    max_value=max_error_value,
                    prefix=prefix,
                    suffix=suffix,
                )
            )
            error_table.add_row(*arguments)
        error_table.add_row(*["SUM", str(optimization_dict[ERROR_SUM_VAR_ID])])
        table_columns.append(error_table)

    if multiple_tables_per_line:
        console.print(Columns(table_columns))
    else:
        for table in table_columns:
            console.print(table)

    console.print(
        "OBJECTIVE VALUE:",
        str(optimization_dict[OBJECTIVE_VAR_NAME]),
        "| SOLVE STATUS OK?",
        str(optimization_dict[ALL_OK_KEY]),
    )
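The `ignore_unused`/`unused_limit` filtering applied throughout this function can be sketched in isolation. `filter_unused` below is a hypothetical helper (not part of COBRAk) that mirrors the skip logic: when `ignore_unused` is set, entries at or below `unused_limit` are dropped.

```python
def filter_unused(
    values: dict[str, float],
    ignore_unused: bool,
    unused_limit: float = 1e-4,
) -> dict[str, float]:
    """Drop entries at or below unused_limit when ignore_unused is set."""
    if not ignore_unused:
        return dict(values)
    return {key: value for key, value in values.items() if value > unused_limit}

print(filter_unused({"EX_glc": 10.0, "EX_o2": 0.0}, ignore_unused=True))  # drops EX_o2
```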

print_strkey_dict_as_table(dictionary, table_title='', key_title='', value_title='')

Print a dictionary as a formatted table.

Args:

- dictionary (dict[str, Any]): The dictionary to print.
- table_title (str, optional): The title of the table. Defaults to "".
- key_title (str, optional): The title for the key column. Defaults to "".
- value_title (str, optional): The title for the value column. Defaults to "".

Source code in cobrak/printing.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def print_strkey_dict_as_table(
    dictionary: dict[str, Any],
    table_title: str = "",
    key_title: str = "",
    value_title: str = "",
) -> None:
    """Print a dictionary as a formatted table.

    Args:
    dictionary (dict[str, Any]): The dictionary to print.
    table_title (str, optional): The title of the table. Defaults to "".
    key_title (str, optional): The title for the key column. Defaults to "".
    value_title (str, optional): The title for the value column. Defaults to "".
    """
    table = Table(title=table_title, title_justify="left", show_header=False)
    table.add_column(key_title, style="cyan", no_wrap=True)
    table.add_column(value_title, style="magenta")
    for key, value in sort_dict_keys(dictionary).items():
        table.add_row(key, str(value))
    console.print(table)
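The function itself renders through Rich's `Table`; the underlying idea (sort the keys, emit one two-column row per entry) can be sketched without Rich. `format_strkey_dict` is a hypothetical dependency-free stand-in, not part of COBRAk:

```python
from typing import Any

def format_strkey_dict(dictionary: dict[str, Any]) -> str:
    """Render a dict as plain-text two-column rows, keys sorted alphabetically."""
    width = max(len(key) for key in dictionary)  # pad keys to align the value column
    return "\n".join(
        f"{key.ljust(width)}  {value}" for key, value in sorted(dictionary.items())
    )

print(format_strkey_dict({"glc__D_e": 10.0, "ac_e": 3.5}))
```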

print_variability_result(cobrak_model, variability_dict, print_exchanges=True, print_reacs=True, print_enzymes=False, print_mets=True, ignore_unused=False, add_stoichiometries=False, rounding=3, multiple_tables_per_line=True)

Print the variability analysis results, including exchanges, reactions, enzymes, and metabolites.

Args:

- cobrak_model (Model): The model used for variability analysis.
- variability_dict (dict[str, tuple[float, float]]): A dictionary containing the variability results.
- print_exchanges (bool, optional): Whether to print exchange reactions. Defaults to True.
- print_reacs (bool, optional): Whether to print non-exchange reactions. Defaults to True.
- print_enzymes (bool, optional): Whether to print enzyme usage. Defaults to False.
- print_mets (bool, optional): Whether to print metabolite concentrations. Defaults to True.
- ignore_unused (bool, optional): Whether to ignore reactions with zero flux. Defaults to False.
- add_stoichiometries (bool, optional): Whether to include reaction stoichiometries. Defaults to False.
- rounding (int, optional): Number of decimal places to round to. Defaults to 3.
- multiple_tables_per_line (bool, optional): Whether to display multiple tables side by side. Defaults to True.

Source code in cobrak/printing.py
@validate_call
def print_variability_result(
    cobrak_model: Model,
    variability_dict: dict[str, tuple[float, float]],
    print_exchanges: bool = True,
    print_reacs: bool = True,
    print_enzymes: bool = False,
    print_mets: bool = True,
    ignore_unused: bool = False,
    add_stoichiometries: bool = False,
    rounding: int = 3,
    multiple_tables_per_line: bool = True,
) -> None:
    """Print the variability analysis results, including exchanges, reactions, enzymes, and metabolites.

    Args:
    cobrak_model (Model): The model used for variability analysis.
    variability_dict (dict[str, tuple[float, float]]): A dictionary containing the variability results.
    print_exchanges (bool, optional): Whether to print exchange reactions. Defaults to True.
    print_reacs (bool, optional): Whether to print non-exchange reactions. Defaults to True.
    print_enzymes (bool, optional): Whether to print enzyme usage. Defaults to False.
    print_mets (bool, optional): Whether to print metabolite concentrations. Defaults to True.
    ignore_unused (bool, optional): Whether to ignore reactions with zero flux. Defaults to False.
    add_stoichiometries (bool, optional): Whether to include reaction stoichiometries. Defaults to False.
    rounding (int, optional): Number of decimal places to round to. Defaults to 3.
    multiple_tables_per_line (bool, optional): Whether to display multiple tables side by side. Defaults to True.
    """

    table_columns: list[Table] = []

    substrate_reac_ids, product_reac_ids = (
        get_substrate_and_product_exchanges(cobrak_model, variability_dict)
        if print_exchanges
        else ([""], [""])
    )

    reac_columns = [
        "ID",
        "min(vᵢ)",
        "max(vᵢ)",
        "min(dfᵢ)",
        "max(dfᵢ)",
    ]
    if add_stoichiometries:
        reac_columns.insert(1, "Reac string")

    if print_exchanges:
        for title, exchange_ids in (
            ("Substrates", substrate_reac_ids),
            ("Products", product_reac_ids),
        ):
            exchange_table = Table(title=title, title_justify="left")
            for reac_column in reac_columns:
                exchange_table.add_column(reac_column)
            for exchange_reac_id in exchange_ids:
                prefix, suffix = _varcolor(exchange_reac_id, variability_dict)
                flux_range = _get_var_or_na(
                    exchange_reac_id, variability_dict, rounding, prefix, suffix
                )
                if ignore_unused and flux_range[1] == 0.0:
                    continue
                arguments: list[str] = [
                    exchange_reac_id,
                    *flux_range,
                    *_get_var_or_na(
                        f"{DF_VAR_PREFIX}{exchange_reac_id}",
                        variability_dict,
                        rounding,
                        prefix,
                        suffix,
                    ),
                ]
                exchange_table.add_row(*arguments)
            table_columns.append(exchange_table)

    if print_reacs:
        reacs_table = Table(
            title="Non-exchange reactions" if print_exchanges else "Reactions",
            title_justify="left",
        )
        for reac_column in reac_columns:
            reacs_table.add_column(reac_column)
        for reac_id in sort_dict_keys(cobrak_model.reactions):
            if reac_id in [*substrate_reac_ids, *product_reac_ids]:
                continue
            prefix, suffix = _varcolor(reac_id, variability_dict)

            flux_range = _get_var_or_na(
                reac_id, variability_dict, rounding, prefix, suffix
            )
            if ignore_unused and flux_range[1] == 0.0:
                continue

            arguments = [
                reac_id,
                *flux_range,
                *_get_var_or_na(
                    f"{DF_VAR_PREFIX}{reac_id}",
                    variability_dict,
                    rounding,
                    prefix,
                    suffix,
                ),
            ]
            reacs_table.add_row(*arguments)
        table_columns.append(reacs_table)

    if print_enzymes:
        enzymes_table = Table(title="Enzymes", title_justify="left")
        enzymes_table.add_column("ID")
        enzymes_table.add_column("min(Eᵢ)")
        enzymes_table.add_column("max(Eᵢ)")

        for reac_id, reaction in sort_dict_keys(cobrak_model.reactions).items():
            if reaction.enzyme_reaction_data is None:
                continue
            enzyme_var_id = get_reaction_enzyme_var_id(reac_id, reaction)
            prefix, suffix = _varcolor(enzyme_var_id, variability_dict)
            conc_range = _get_var_or_na(
                enzyme_var_id, variability_dict, rounding, prefix, suffix
            )
            if ignore_unused and conc_range[1] == 0.0:
                continue
            enzymes_table.add_row(
                enzyme_var_id,
                *conc_range,
            )
        table_columns.append(enzymes_table)

    if print_mets:
        mets_table = Table(title="Metabolites", title_justify="left")
        mets_table.add_column("ID")
        mets_table.add_column("min(cᵢ)")
        mets_table.add_column("max(cᵢ)")
        for met_id in sort_dict_keys(cobrak_model.metabolites):
            min_conc_str, max_conc_str = _get_var_or_na(
                f"{LNCONC_VAR_PREFIX}{met_id}", variability_dict, rounding=1_000
            )
            try:
                min_conc = str(round(exp(float(min_conc_str)), rounding))
                max_conc = str(round(exp(float(max_conc_str)), rounding))
            except ValueError:
                min_conc = min_conc_str
                max_conc = max_conc_str
            color = "[blue]" if min_conc != max_conc else "[red]"
            mets_table.add_row(
                *[
                    met_id,
                    f"{color} {min_conc}",
                    f"{color} {max_conc}",
                ]
            )
        table_columns.append(mets_table)

    if multiple_tables_per_line:
        console.print(Columns(table_columns))
    else:
        for table in table_columns:
            console.print(table)

pyomo_functionality

Utilities to work with pyomo ConcreteModel instances directly.

ApproximationPoint

Represents a point in a linear approximation.

This dataclass is used to store the slope, intercept, and x-coordinate of a point in a linear approximation.

Attributes:

- slope (float): The slope of the line passing through this point.
- intercept (float): The y-intercept of the line passing through this point.
- x_point (float): The x-coordinate of this point.

Source code in cobrak/pyomo_functionality.py
@dataclass
class ApproximationPoint:
    """Represents a point in a linear approximation.

    This dataclass is used to store the slope, intercept, and x-coordinate of a point in a linear approximation.

    Attributes:
    - slope (float): The slope of the line passing through this point.
    - intercept (float): The y-intercept of the line passing through this point.
    - x_point (float): The x-coordinate of this point.
    """

    slope: float
    intercept: float
    x_point: float
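Each `ApproximationPoint` stores a tangent line in slope-intercept form: slope = f'(x) and intercept = f(x) - f'(x)·x, so that slope·x + intercept reproduces f(x) at the tangency point. A minimal sketch (the dataclass is restated here so the snippet is self-contained; `tangent_point` is a hypothetical helper):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ApproximationPoint:
    slope: float
    intercept: float
    x_point: float

def tangent_point(
    f: Callable[[float], float], f_prime: Callable[[float], float], x: float
) -> ApproximationPoint:
    """Tangent line to f at x, stored the way the approximation code builds its points."""
    return ApproximationPoint(slope=f_prime(x), intercept=f(x) - f_prime(x) * x, x_point=x)

p = tangent_point(lambda x: x * x, lambda x: 2 * x, 2.0)
# tangent to x**2 at x=2: slope 4.0, intercept -4.0, i.e. y = 4x - 4
```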

add_linear_approximation_to_pyomo_model(model, y_function, y_function_derivative, x_reference_var_id, new_y_var_name, min_x, max_x, max_rel_difference, max_num_segments=float('inf'), min_abs_error=1e-06)

Add a linear approximation of a given function to a Pyomo model.

This function approximates the provided function y_function with a piecewise linear function and adds the approximation to the given Pyomo model. The approximation is based on the derivative of the function y_function_derivative. The approximation is added as a new variable and a set of constraints to the model.

Parameters:

- model (ConcreteModel): The Pyomo model to which the approximation will be added.
- y_function (Callable[[float], float]): The function to be approximated.
- y_function_derivative (Callable[[float], float]): The derivative of the function to be approximated.
- x_reference_var_id (str): The name of the variable in the model that will be used as the independent variable for the approximation.
- new_y_var_name (str): The name of the new variable that will be added to the model to represent the approximation.
- min_x (float): The minimum value of the independent variable for the approximation.
- max_x (float): The maximum value of the independent variable for the approximation.
- max_rel_difference (float): The maximum allowed relative difference between the approximation and the original function.
- max_num_segments (int, optional): The maximum number of segments to use for the piecewise linear approximation. Defaults to infinity.
- min_abs_error (float, optional): The minimum absolute error allowed between the approximation and the original function. Defaults to 1e-6.

Returns:

- ConcreteModel: The Pyomo model with the added approximation.

Source code in cobrak/pyomo_functionality.py
def add_linear_approximation_to_pyomo_model(
    model: ConcreteModel,
    y_function: Callable[[float], float],
    y_function_derivative: Callable[[float], float],
    x_reference_var_id: str,
    new_y_var_name: str,
    min_x: float,
    max_x: float,
    max_rel_difference: float,
    max_num_segments: float = float("inf"),
    min_abs_error: float = 1e-6,
) -> ConcreteModel:
    """Add a linear approximation of a given function to a Pyomo model.

    This function approximates the provided function `y_function` with a piecewise linear function
    and adds the approximation to the given Pyomo model. The approximation is based on the derivative
    of the function `y_function_derivative`. The approximation is added as a new variable and a set
    of constraints to the model.

    Parameters:
    - model (ConcreteModel): The Pyomo model to which the approximation will be added.
    - y_function (Callable[[float], float]): The function to be approximated.
    - y_function_derivative (Callable[[float], float]): The derivative of the function to be approximated.
    - x_reference_var_id (str): The name of the variable in the model that will be used as the independent variable for the approximation.
    - new_y_var_name (str): The name of the new variable that will be added to the model to represent the approximation.
    - min_x (float): The minimum value of the independent variable for the approximation.
    - max_x (float): The maximum value of the independent variable for the approximation.
    - max_rel_difference (float): The maximum allowed relative difference between the approximation and the original function.
    - max_num_segments (int, optional): The maximum number of segments to use for the piecewise linear approximation. Defaults to infinity.
    - min_abs_error (float, optional): The minimum absolute error allowed between the approximation and the original function. Defaults to 1e-6.

    Returns:
    - ConcreteModel: The Pyomo model with the added approximation.
    """
    # Find fitting approximation
    num_segments = 2
    approximation_points: list[ApproximationPoint] = []
    while True:
        ignored_is = []
        x_points = linspace(min_x, max_x, num_segments)
        approximation_points = [
            ApproximationPoint(
                slope=y_function_derivative(x_point),
                intercept=y_function(x_point)
                - y_function_derivative(x_point) * x_point,
                x_point=x_point,
            )
            for x_point in x_points
        ]

        max_found_min_rel_difference = -float("inf")
        x_midpoints_data: list[tuple[int, int, float]] = []
        for i in range(len(x_points) - 1):
            first_index, second_index = i, i + 1
            if (
                approximation_points[first_index].slope
                - approximation_points[second_index].slope
                == 0
            ):
                continue
            x_midpoint = (
                approximation_points[second_index].intercept
                - approximation_points[first_index].intercept
            ) / (
                approximation_points[first_index].slope
                - approximation_points[second_index].slope
            )
            x_midpoints_data.append((first_index, second_index, x_midpoint))

        for first_index, second_index, x_value in x_midpoints_data:
            real_y = y_function(x_value)
            y_approx_one = (
                approximation_points[first_index].slope * x_value
                + approximation_points[first_index].intercept
            )
            y_approx_two = (
                approximation_points[second_index].slope * x_value
                + approximation_points[second_index].intercept
            )
            errors_absolute = (real_y - y_approx_one, real_y - y_approx_two)
            if max(errors_absolute) < min_abs_error:
                ignored_is.append(first_index)
            errors_relative = (
                abs(errors_absolute[0] / real_y),
                abs(errors_absolute[1] / real_y),
            )
            max_found_min_rel_difference = max(
                max_found_min_rel_difference, min(errors_relative)
            )

        if (max_found_min_rel_difference <= max_rel_difference) or (
            num_segments == max_num_segments
        ):
            break

        num_segments += 1
    # Add approximation to model
    min_approx_y = (
        approximation_points[0].slope * x_points[0] + approximation_points[0].intercept
    )
    max_approx_y = (
        approximation_points[-1].slope * x_points[-1]
        + approximation_points[-1].intercept
    )
    setattr(
        model, new_y_var_name, Var(within=Reals, bounds=(min_approx_y, max_approx_y))
    )
    for approx_i, approximation_point in enumerate(approximation_points):
        if approx_i in ignored_is:
            continue
        setattr(
            model,
            f"{new_y_var_name}_constraint_{approx_i}",
            Constraint(
                rule=getattr(model, new_y_var_name)
                >= approximation_point.slope * getattr(model, x_reference_var_id)
                + approximation_point.intercept
            ),
        )
    return model
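The refinement loop above can be illustrated without Pyomo. For a convex function, tangent lines underestimate it, and the worst gap between the function and the lower envelope of adjacent tangents occurs where two neighbouring tangents intersect; adding tangents shrinks that gap, which is what incrementing `num_segments` exploits. `max_rel_gap` below is a hypothetical standalone sketch of that error measurement:

```python
from math import exp
from typing import Callable

def max_rel_gap(
    f: Callable[[float], float],
    f_prime: Callable[[float], float],
    min_x: float,
    max_x: float,
    n: int,
) -> float:
    """Worst relative gap between convex f and the envelope of n tangent lines."""
    xs = [min_x + i * (max_x - min_x) / (n - 1) for i in range(n)]
    lines = [(f_prime(x), f(x) - f_prime(x) * x) for x in xs]  # (slope, intercept)
    worst = 0.0
    for (m1, b1), (m2, b2) in zip(lines, lines[1:]):
        if m1 == m2:
            continue
        x_cross = (b2 - b1) / (m1 - m2)  # intersection of adjacent tangents
        approx = m1 * x_cross + b1       # envelope value at the intersection
        worst = max(worst, abs((f(x_cross) - approx) / f(x_cross)))
    return worst

# More tangents give a smaller worst-case relative gap for exp on [0, 2]:
print(max_rel_gap(exp, exp, 0.0, 2.0, 3), max_rel_gap(exp, exp, 0.0, 2.0, 6))
```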

add_objective_to_model(model, objective_target, objective_sense, objective_name, objective_var_name=OBJECTIVE_VAR_NAME)

Add an objective function to a Pyomo model.

This function adds an objective function to the given Pyomo model based on the provided target and sense. The target can be a single variable name or a dictionary of variable names with their corresponding multipliers. The sense can be either maximization (as int, value > 0) or minimization (as int, value < 0).

Parameters:

- model (ConcreteModel): The Pyomo model to which the objective function will be added.
- objective_target (str | dict[str, float]): The target for the objective function. It can be a single variable name or a dictionary of variable names with their corresponding multipliers.
- objective_sense (int): The sense of the objective function. It can be an integer (positive for maximization, negative for minimization, zero for no objective).
- objective_name (str): The name of the new objective function that will be added to the model.
- objective_var_name (str, optional): The name of the new variable that will be added to the model to represent the objective function. Defaults to OBJECTIVE_VAR_NAME.

Returns:

- ConcreteModel: The Pyomo model with the added objective function.

Source code in cobrak/pyomo_functionality.py
def add_objective_to_model(
    model: ConcreteModel,
    objective_target: str | dict[str, float],
    objective_sense: int,
    objective_name: str,
    objective_var_name: str = OBJECTIVE_VAR_NAME,
) -> ConcreteModel:
    """Add an objective function to a Pyomo model.

    This function adds an objective function to the given Pyomo model based on the provided target and sense.
    The target can be a single variable name or a dictionary of variable names with their corresponding multipliers.
    The sense can be either maximization (as int, value > 0) or minimization (as int, value < 0).

    Parameters:
    - model (ConcreteModel): The Pyomo model to which the objective function will be added.
    - objective_target (str | dict[str, float]): The target for the objective function. It can be a single variable name or a dictionary of variable names with their corresponding multipliers.
    - objective_sense (int): The sense of the objective function. It can be an integer (positive for maximization, negative for minimization, zero for no objective).
    - objective_name (str): The name of the new objective function that will be added to the model.
    - objective_var_name (str, optional): The name of the new variable that will be added to the model to represent the objective function. Defaults to OBJECTIVE_VAR_NAME.

    Returns:
    - ConcreteModel: The Pyomo model with the added objective function.
    """
    setattr(
        model,
        objective_name,
        get_objective(
            model,
            objective_target,
            objective_sense,
            objective_var_name,
        ),
    )
    return model

get_model_var_names(model)

Extracts and returns a list of names of all variable components from a Pyomo model.

This function iterates over all variable objects (Var) defined in the given Pyomo concrete model instance. It collects the name attribute of each variable object and returns these names as a list of strings.

Parameters:

Name Type Description Default
model ConcreteModel

A Pyomo concrete model instance containing various components, including variables.

required

Returns:

Type Description
list[str]

list[str]: A list of string names representing all variable objects in the provided Pyomo model.

Examples:

>>> from pyomo.environ import ConcreteModel, Var
>>> m = ConcreteModel()
>>> m.x = Var(initialize=1.0)
>>> m.y = Var([1, 2], initialize=lambda m,i: i)  # Creates two variables y[1] and y[2]
>>> var_names = get_model_var_names(m)
>>> print(var_names)
['x', 'y[1]', 'y[2]']
Source code in cobrak/pyomo_functionality.py
def get_model_var_names(model: ConcreteModel) -> list[str]:
    """Extracts and returns a list of names of all variable components from a Pyomo model.

    This function iterates over all variable objects (`Var`) defined in the given Pyomo concrete model instance.
    It collects the name attribute of each variable object and returns these names as a list of strings.

    Parameters:
        model (ConcreteModel): A Pyomo concrete model instance containing various components, including variables.

    Returns:
        list[str]: A list of string names representing all variable objects in the provided Pyomo model.

    Examples:

        >>> from pyomo.environ import ConcreteModel, Var
        >>> m = ConcreteModel()
        >>> m.x = Var(initialize=1.0)
        >>> m.y = Var([1, 2], initialize=lambda m,i: i)  # Creates two variables y[1] and y[2]
        >>> var_names = get_model_var_names(m)
        >>> print(var_names)
        ['x', 'y[1]', 'y[2]']
    """
    return [v.name for v in model.component_objects(Var)]

get_objective(model, objective_target, objective_sense, objective_var_name=OBJECTIVE_VAR_NAME)

Create and return a pyomo objective function for the given model.

Sets up an objective function based on the provided target and sense. The target can be a single variable or a weighted sum of multiple variables. The sense can be either maximization (as int, value > 0) or minimization (as int, value < 0).

Parameters:
- model (ConcreteModel): The Pyomo model to which the objective function will be added.
- objective_target (str | dict[str, float]): The target for the objective function. It can be a single variable name or a dictionary of variable names with their corresponding multipliers.
- objective_sense (int): The sense of the objective function. It can be an integer (positive for maximization, negative for minimization, zero for no objective).

Returns:
- Objective: The Pyomo Objective object representing the objective function.

Source code in cobrak/pyomo_functionality.py
def get_objective(
    model: ConcreteModel,
    objective_target: str | dict[str, float],
    objective_sense: int,
    objective_var_name: str = OBJECTIVE_VAR_NAME,
) -> Objective:
    """Create and return a pyomo objective function for the given model.

    Sets up an objective function based on the provided target and sense.
    The target can be a single variable or a weighted sum of multiple variables.
    The sense can be either maximization (as int, value > 0) or minimization (as int, value < 0).

    Parameters:
    - model (ConcreteModel): The Pyomo model to which the objective function will be added.
    - objective_target (str | dict[str, float]): The target for the objective function. It can be a single variable name or a dictionary of
                                                 variable names with their corresponding multipliers.
    - objective_sense (int): The sense of the objective function. It can be an integer
                                        (positive for maximization, negative for minimization, zero for no objective).

    Returns:
    - Objective: The Pyomo Objective object representing the objective function.
    """
    model, expr = set_target_as_var_and_value(
        model,
        objective_target,
        objective_var_name,
        "constraint_of_" + objective_var_name,
    )

    if isinstance(objective_sense, int):
        if objective_sense > 0:
            expr *= objective_sense
            pyomo_sense = maximize
        elif objective_sense < 0:
            expr *= abs(objective_sense)
            pyomo_sense = minimize
        else:  # objective_sense == 0
            expr = 0.0
            pyomo_sense = minimize
    else:
        print(f"ERROR: Objective sense is {objective_sense}, but must be an integer.")
        raise ValueError
    return Objective(expr=expr, sense=pyomo_sense)
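The sense handling above maps an integer onto a Pyomo optimization direction while folding its magnitude into the expression as a weight. A minimal stand-alone sketch of that logic (plain Python, no Pyomo required; `resolve_sense` is an illustrative helper, not part of COBRA-k):

```python
def resolve_sense(objective_sense: int) -> tuple[float, str]:
    """Sketch of get_objective's sense handling: returns the multiplier
    applied to the expression and the optimization direction."""
    if objective_sense > 0:
        return float(objective_sense), "maximize"
    if objective_sense < 0:
        return float(abs(objective_sense)), "minimize"
    return 0.0, "minimize"  # sense 0: the expression is replaced by a constant 0

# A sense of -3 minimizes the expression weighted by 3:
assert resolve_sense(-3) == (3.0, "minimize")
```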

get_solver(solver)

Create and configure a solver for the given solver name and options.

This function returns a Pyomo solver using the specified solver name and applies the provided options to it.

Parameters:
- solver (Solver): The COBRA-k Solver instance.

Returns:
- SolverFactory: The configured solver instance.

Source code in cobrak/pyomo_functionality.py
def get_solver(solver: Solver) -> SolverFactory:  # pyright: ignore[reportInvalidTypeForm]
    """Create and configure a solver for the given solver name and options.

    This function returns a Pyomo solver using the specified solver name and applies the provided options to it.

    Parameters:
    - solver: The COBRA-k Solver instance.

    Returns:
    - SolverFactory: The configured solver instance.
    """
    pyomo_solver = SolverFactory(solver.name, **solver.solver_factory_args)

    for attr_name, attr_value in solver.solver_attrs.items():
        setattr(pyomo_solver, attr_name, attr_value)
    for option_name, option_value in solver.solver_options.items():
        pyomo_solver.options[option_name] = option_value
    return pyomo_solver
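The configuration loop distinguishes two kinds of settings: entries in solver_attrs are set as Python attributes directly on the solver object, while entries in solver_options go into the solver's options mapping. A sketch with a hypothetical stand-in object (the attribute and option names below are illustrative, not real solver settings):

```python
class FakeSolver:
    """Hypothetical stand-in for a Pyomo solver instance."""

    def __init__(self) -> None:
        self.options: dict[str, object] = {}

solver_attrs = {"warmstart_capable": True}       # applied via setattr()
solver_options = {"mipgap": 1e-6, "threads": 4}  # applied via options[...]

pyomo_solver = FakeSolver()
for attr_name, attr_value in solver_attrs.items():
    setattr(pyomo_solver, attr_name, attr_value)
for option_name, option_value in solver_options.items():
    pyomo_solver.options[option_name] = option_value
```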

set_target_as_var_and_value(model, target, var_name, constraint_name)

Set a target as a variable and its value in a Pyomo model.

This function adds a new variable to the given Pyomo model and sets its value to the provided target. The target can be either a single variable name or a dictionary of variable names with their corresponding multipliers.

Parameters:
- model (ConcreteModel): The Pyomo model to which the variable and constraint will be added.
- target (str | dict[str, float]): The target for the new variable. It can be a single variable name or a dictionary of variable names with their corresponding multipliers.
- var_name (str): The name of the new variable that will be added to the model.
- constraint_name (str): The name of the new constraint that will be added to the model to set the value of the new variable.

Returns:
- tuple[ConcreteModel, Expression]: The Pyomo model with the added variable and constraint, and the expression representing the target.

Source code in cobrak/pyomo_functionality.py
def set_target_as_var_and_value(
    model: ConcreteModel,
    target: str | dict[str, float],
    var_name: str,
    constraint_name: str,
) -> tuple[ConcreteModel, Expression]:
    """Set a target as a variable and its value in a Pyomo model.

    This function adds a new variable to the given Pyomo model and sets its value to the provided target.
    The target can be either a single variable name or a dictionary of variable names with their corresponding multipliers.

    Parameters:
    - model (ConcreteModel): The Pyomo model to which the variable and constraint will be added.
    - target (str | dict[str, float]): The target for the new variable. It can be a single variable name or a dictionary of variable names with their corresponding multipliers.
    - var_name (str): The name of the new variable that will be added to the model.
    - constraint_name (str): The name of the new constraint that will be added to the model to set the value of the new variable.

    Returns:
    - tuple[ConcreteModel, Expression]: The Pyomo model with the added variable and constraint, and the expression representing the target.
    """
    if isinstance(target, str):
        expr = getattr(model, target)
    else:
        expr = 0.0
        for target_id, multiplier in target.items():  # type: ignore
            expr += multiplier * getattr(model, target_id)
    setattr(model, var_name, Var(within=Reals, bounds=(-QUASI_INF, QUASI_INF)))
    setattr(
        model,
        constraint_name,
        Constraint(expr=getattr(model, var_name) == expr),
    )
    return model, expr

sabio_rk_functionality

Functions and associated dataclasses for retrieving kinetic data from SABIO-RK

SabioDict dataclass

Contains all retrieved SabioEntry instances, grouped by the type of kinetic parameter they describe

Source code in cobrak/sabio_rk_functionality.py
@dataclass_json
@dataclass
class SabioDict:
    """Includes all retrieved SabioEntry instances and shows of which type they are"""

    kcat_entries: dict[str, list[SabioEntry]]
    """Turnover number entries"""
    km_entries: dict[str, list[SabioEntry]]
    """Michaelis-Menten constant entries"""
    ki_entries: dict[str, list[SabioEntry]]
    """Inhibition constant entries"""
    ka_entries: dict[str, list[SabioEntry]]
    """Activation constant entries"""
    hill_entries: dict[str, list[SabioEntry]]
    """Hill number entries"""

hill_entries instance-attribute

Hill number entries

ka_entries instance-attribute

Activation constant entries

kcat_entries instance-attribute

Turnover number entries

ki_entries instance-attribute

Inhibition constant entries

km_entries instance-attribute

Michaelis-Menten constant entries

SabioEntry dataclass

Represents the COBRAk-relevant data retrieved from a single SABIO-RK entry.

The entry's parameter type (k_cat, k_m, k_i, etc.) is not determined here; that is done in the SabioDict dataclass.

Source code in cobrak/sabio_rk_functionality.py
@dataclass_json
@dataclass
class SabioEntry:
    """Represents the COBRAk-relevant data retrieved from a single SABIO-RK entry.

    The entry's parameter type (k_cat, k_m, k_i, etc.) is not determined here;
    that is done in the SabioDict dataclass.
    """

    entry_id: int
    """The entry's ID number"""
    is_recombinant: bool
    """Whether or not the entry is from a recombinant enzyme"""
    kinetics_mechanism_type: str
    """The reaction's kinetic mechanism (e.g., "Michaelis-Menten")"""
    organism: str
    """The organism (latin-greek name) associated with this entry"""
    temperature: float | None
    """[None if not given] The measurement's temperature in °C"""
    ph: float | None
    """[None if not given] The measurement's pH"""
    parameter_value: float
    """The value of the parameter"""
    parameter_unit: str
    """The unit of the value"""
    parameter_associated_species: str
    """The species (metabolite) associated with the parameter"""
    substrates: list[str]
    """The list of substrate names"""
    products: list[str]
    """The list of product names"""
    chebi_ids: list[str]
    """The list of all CHEBI IDs"""

chebi_ids instance-attribute

The list of all CHEBI IDs

entry_id instance-attribute

The entry's ID number

is_recombinant instance-attribute

Whether or not the entry is from a recombinant enzyme

kinetics_mechanism_type instance-attribute

The reaction's kinetic mechanism (e.g., "Michaelis-Menten")

organism instance-attribute

The organism (latin-greek name) associated with this entry

parameter_associated_species instance-attribute

The species (metabolite) associated with the parameter

parameter_unit instance-attribute

The unit of the value

parameter_value instance-attribute

The value of the parameter

ph instance-attribute

[None if not given] The measurement's pH

products instance-attribute

The list of product names

substrates instance-attribute

The list of substrate names

temperature instance-attribute

[None if not given] The measurement's temperature in °C

SabioThread

Bases: Thread

Represents a single SABIO-RK connection, ready for multi-threading (on one CPU core) using the threading module

Source code in cobrak/sabio_rk_functionality.py
class SabioThread(threading.Thread):
    """Represents a single Sabio-RK connection, ready for multi-threading (on one CPU core) using the threading module"""

    def __init__(self, temp_folder: str, start_number: int, end_number: int) -> None:
        """Initializes a SabioThread instance.

        Args:
            temp_folder (str): The path to the temporary folder where the results will be saved.
            start_number (int): The starting number for the query range.
            end_number (int): The ending number for the query range.
        """
        super().__init__()

        self.temp_folder = standardize_folder(temp_folder)
        self.start_number = start_number
        self.end_number = end_number

    def run(self) -> None:
        """Executes the thread's SABIO-RK data request

        Constructs a query string, sends a POST request to the SABIO-RK web service,
        and writes the response to a file in the temporary folder.
        """
        txt_path = f"{self.temp_folder}zzz{self.start_number}.txt"
        if exists(txt_path):
            return

        query_numbers = " OR ".join(
            [str(i + 1) for i in range(self.start_number, self.end_number + 1)]
        )
        query_dict = {"EntryID": f"({query_numbers})"}
        query_string = " AND ".join([f"{k}:{v}" for k, v in query_dict.items()])
        query_string += ' AND Parametertype:("activation constant" OR "Ki" OR "kcat" OR "km" OR "Hill coefficient") AND EnzymeType:"wildtype"'
        query = {
            "fields[]": [
                "EntryID",
                "Organism",
                "IsRecombinant",
                "ECNumber",
                "KineticMechanismType",
                "SabioCompoundID",
                "ChebiID",
                "Parameter",
                "Substrate",
                "Product",
                "Temperature",
                "pH",
            ],
            "q": query_string,
        }
        try:
            t0 = time()
            request = requests.post(
                "http://sabiork.h-its.org/sabioRestWebServices/kineticlawsExportTsv",
                params=query,
                timeout=120,
            )
            t1 = time()
            print(
                f"SABIO-ID REQUEST FROM {self.start_number} TO {self.end_number} FINISHED IN {t1 - t0}"
            )
        except requests.exceptions.ReadTimeout:
            print(
                f"TIMEOUT :O IN REQUEST FROM {self.start_number} TO {self.end_number} IN 120 SEC. YOU MAY TRY THIS AGAIN BY RESTARTING YOUR SCRIPT..."
            )
            return
        request.raise_for_status()
        with open(  # noqa: FURB103
            txt_path, "w", encoding="utf-8"
        ) as f:
            f.write(request.text)

__init__(temp_folder, start_number, end_number)

Initializes a SabioThread instance.

Parameters:

Name Type Description Default
temp_folder str

The path to the temporary folder where the results will be saved.

required
start_number int

The starting number for the query range.

required
end_number int

The ending number for the query range.

required
Source code in cobrak/sabio_rk_functionality.py
def __init__(self, temp_folder: str, start_number: int, end_number: int) -> None:
    """Initializes a SabioThread instance.

    Args:
        temp_folder (str): The path to the temporary folder where the results will be saved.
        start_number (int): The starting number for the query range.
        end_number (int): The ending number for the query range.
    """
    super().__init__()

    self.temp_folder = standardize_folder(temp_folder)
    self.start_number = start_number
    self.end_number = end_number

run()

Executes the thread's SABIO-RK data request

Constructs a query string, sends a POST request to the SABIO-RK web service, and writes the response to a file in the temporary folder.

Source code in cobrak/sabio_rk_functionality.py
def run(self) -> None:
    """Executes the thread's SABIO-RK data request

    Constructs a query string, sends a POST request to the SABIO-RK web service,
    and writes the response to a file in the temporary folder.
    """
    txt_path = f"{self.temp_folder}zzz{self.start_number}.txt"
    if exists(txt_path):
        return

    query_numbers = " OR ".join(
        [str(i + 1) for i in range(self.start_number, self.end_number + 1)]
    )
    query_dict = {"EntryID": f"({query_numbers})"}
    query_string = " AND ".join([f"{k}:{v}" for k, v in query_dict.items()])
    query_string += ' AND Parametertype:("activation constant" OR "Ki" OR "kcat" OR "km" OR "Hill coefficient") AND EnzymeType:"wildtype"'
    query = {
        "fields[]": [
            "EntryID",
            "Organism",
            "IsRecombinant",
            "ECNumber",
            "KineticMechanismType",
            "SabioCompoundID",
            "ChebiID",
            "Parameter",
            "Substrate",
            "Product",
            "Temperature",
            "pH",
        ],
        "q": query_string,
    }
    try:
        t0 = time()
        request = requests.post(
            "http://sabiork.h-its.org/sabioRestWebServices/kineticlawsExportTsv",
            params=query,
            timeout=120,
        )
        t1 = time()
        print(
            f"SABIO-ID REQUEST FROM {self.start_number} TO {self.end_number} FINISHED IN {t1 - t0}"
        )
    except requests.exceptions.ReadTimeout:
        print(
            f"TIMEOUT :O IN REQUEST FROM {self.start_number} TO {self.end_number} IN 120 SEC. YOU MAY TRY THIS AGAIN BY RESTARTING YOUR SCRIPT..."
        )
        return
    request.raise_for_status()
    with open(  # noqa: FURB103
        txt_path, "w", encoding="utf-8"
    ) as f:
        f.write(request.text)
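The query string assembled in run() can be reproduced without any network access. Note the off-by-one shift: each index in the range is incremented, so a range from start_number=0 to end_number=2 requests entry IDs 1 through 3:

```python
start_number, end_number = 0, 2

# Same construction as in SabioThread.run():
query_numbers = " OR ".join(str(i + 1) for i in range(start_number, end_number + 1))
query_dict = {"EntryID": f"({query_numbers})"}
query_string = " AND ".join(f"{k}:{v}" for k, v in query_dict.items())
query_string += ' AND Parametertype:("activation constant" OR "Ki" OR "kcat" OR "km" OR "Hill coefficient") AND EnzymeType:"wildtype"'

assert query_string.startswith("EntryID:(1 OR 2 OR 3)")
```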

get_full_sabio_dict(sabio_target_folder)

Parses a SABIO-RK web query TSV file from the target folder to create a SabioDict instance containing SABIO-RK entries.

Parameters:

Name Type Description Default
sabio_target_folder str

The path to the folder containing the TSV file.

required

Returns:

Name Type Description
SabioDict SabioDict

A SabioDict instance which, in turn, contains SabioEntry instances

Source code in cobrak/sabio_rk_functionality.py
def get_full_sabio_dict(sabio_target_folder: str) -> SabioDict:
    """Parses a SABIO-RK web query TSV file from the target folder to create a SabioDict instance containing SABIO-RK entries.

    Args:
        sabio_target_folder (str): The path to the folder containing the TSV file.

    Returns:
        SabioDict: A SabioDict instance which, in turn, contains SabioEntry instances
    """
    tsv_str = _get_sabio_tsv_str(sabio_target_folder)

    tsv_lines = tsv_str.split("\n")
    titles = tsv_lines[0].split("\t")
    del tsv_lines[0]

    sabio_dict = SabioDict({}, {}, {}, {}, {})
    for tsv_line in tsv_lines:
        line = tsv_line.split("\t")

        parameter_value_str = line[titles.index("parameter.startValue")]
        if not parameter_value_str:
            continue
        parameter_value = float(parameter_value_str)
        if parameter_value <= 0.0:
            continue  # There is no kinetic parameter that is just 0 or below

        parameter_type_str = line[titles.index("parameter.type")]
        match parameter_type_str.lower():
            case "kcat":
                sabio_dict_pointer = sabio_dict.kcat_entries
            case "km":
                sabio_dict_pointer = sabio_dict.km_entries
            case "ki":
                sabio_dict_pointer = sabio_dict.ki_entries
            case "activation constant":
                sabio_dict_pointer = sabio_dict.ka_entries
            case "hill coefficient":
                sabio_dict_pointer = sabio_dict.hill_entries
            case _:
                continue

        ec_number = line[titles.index("ECNumber")]
        entry_id = int(line[titles.index("EntryID")])
        organism = line[titles.index("Organism")]
        is_recombinant = line[titles.index("IsRecombinant")].lower() == "true"
        kinetics_mechanism_type = line[titles.index("KineticMechanismType")]
        parameter_unit = line[titles.index("parameter.unit")]
        parameter_associated_species = line[titles.index("parameter.associatedSpecies")]
        substrates = line[titles.index("Substrate")].split(";")
        products = line[titles.index("Product")].split(";")
        chebi_ids = line[titles.index("ChebiID")].split(";")
        try:
            temperature = float(line[titles.index("Temperature")])
        except (ValueError, IndexError):
            temperature = None
        try:
            ph = float(line[titles.index("pH")])
        except (ValueError, IndexError):
            ph = None

        if ec_number not in sabio_dict_pointer:
            sabio_dict_pointer[ec_number] = []
        sabio_dict_pointer[ec_number].append(
            SabioEntry(
                entry_id=entry_id,
                is_recombinant=is_recombinant,
                kinetics_mechanism_type=kinetics_mechanism_type,
                organism=organism,
                temperature=temperature,
                ph=ph,
                parameter_unit=parameter_unit,
                parameter_value=parameter_value,
                parameter_associated_species=parameter_associated_species,
                substrates=substrates,
                products=products,
                chebi_ids=chebi_ids,
            )
        )
    return sabio_dict
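The parsing loop's filtering behavior (skip empty or non-positive parameter values, skip unrecognized parameter types) can be illustrated on a tiny hand-written TSV; the rows below are made up for illustration:

```python
tsv_str = (
    "EntryID\tparameter.type\tparameter.startValue\n"
    "101\tkcat\t12.5\n"
    "102\tkm\t0\n"           # non-positive values are skipped
    "103\tsomething\t3.0\n"  # unrecognized parameter types are skipped
)
tsv_lines = tsv_str.split("\n")
titles = tsv_lines[0].split("\t")

kept = []
for tsv_line in tsv_lines[1:]:
    if not tsv_line:  # guard against a trailing empty line
        continue
    line = tsv_line.split("\t")
    value_str = line[titles.index("parameter.startValue")]
    if not value_str or float(value_str) <= 0.0:
        continue
    if line[titles.index("parameter.type")].lower() not in (
        "kcat", "km", "ki", "activation constant", "hill coefficient"
    ):
        continue
    kept.append((int(line[titles.index("EntryID")]), float(value_str)))

assert kept == [(101, 12.5)]
```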

sabio_select_enzyme_kinetic_data_for_sbml(sbml_path, sabio_target_folder, base_species, ncbi_parsed_json_path, bigg_metabolites_json_path, kinetic_ignored_metabolites=[], kinetic_ignored_enzyme_ids=[], custom_enzyme_kinetic_data={}, min_ph=-float('inf'), max_ph=float('inf'), accept_nan_ph=True, min_temperature=-float('inf'), max_temperature=float('inf'), accept_nan_temperature=True, kcat_overwrite={}, transfered_ec_number_json='', max_taxonomy_level=float('inf'), add_hill_coefficients=True, kis_and_kas_only_for_same_compartments=True)

Selects enzyme kinetic data for a given SBML model using SABIO-RK data.

If this data cannot be found locally, a connection to SABIO-RK is established and the relevant data is downloaded, which may take some time (on the order of dozens of minutes). If you want to download the full SABIO-RK data in advance, run get_full_sabio_dict() from this module with the same sabio_target_folder.

Collected data includes k_cat, k_m, k_i, k_a and Hill coefficients for all EC numbers that occur in the model's BiGG-compliant EC number annotation.

Parameters:

Name Type Description Default
sbml_path str

Path to the SBML file.

required
sabio_target_folder str

The path to the folder containing SABIO-RK data.

required
base_species str

The base species for taxonomy comparison.

required
ncbi_parsed_json_path str

The path to the NCBI parsed JSON file.

required
bigg_metabolites_json_path str

The path to the BIGG metabolites JSON file.

required
kinetic_ignored_metabolites list[str]

List of metabolites to ignore. Defaults to [].

[]
kinetic_ignored_enzyme_ids list[str]

List of enzyme IDs to ignore. Defaults to [].

[]
custom_enzyme_kinetic_data dict[str, EnzymeReactionData | None]

Custom enzyme kinetic data. Defaults to {}.

{}
min_ph float

Minimum pH value for filtering. Defaults to -float("inf").

-float('inf')
max_ph float

Maximum pH value for filtering. Defaults to float("inf").

float('inf')
accept_nan_ph bool

Whether to accept entries with NaN pH values. Defaults to True.

True
min_temperature float

Minimum temperature value for filtering. Defaults to -float("inf").

-float('inf')
max_temperature float

Maximum temperature value for filtering. Defaults to float("inf").

float('inf')
accept_nan_temperature bool

Whether to accept entries with NaN temperature values. Defaults to True.

True
kcat_overwrite dict[str, float]

Dictionary to overwrite kcat values. Defaults to {}.

{}
add_hill_coefficients bool

Whether Hill coefficients shall be collected (True) or not (False). Defaults to True.

True

kis_and_kas_only_for_same_compartments bool

If True, k_i and k_a values are only attributed to a reaction if the affected metabolite shares one of the reaction metabolites' compartments. Defaults to True.

True

Returns:
- dict[str, EnzymeReactionData | None]: A dictionary mapping reaction IDs to enzyme kinetic data.

Source code in cobrak/sabio_rk_functionality.py
def sabio_select_enzyme_kinetic_data_for_sbml(
    sbml_path: str,
    sabio_target_folder: str,
    base_species: str,
    ncbi_parsed_json_path: str,
    bigg_metabolites_json_path: str,
    kinetic_ignored_metabolites: list[str] = [],
    kinetic_ignored_enzyme_ids: list[str] = [],
    custom_enzyme_kinetic_data: dict[str, EnzymeReactionData | None] = {},
    min_ph: float = -float("inf"),
    max_ph: float = float("inf"),
    accept_nan_ph: bool = True,
    min_temperature: float = -float("inf"),
    max_temperature: float = float("inf"),
    accept_nan_temperature: bool = True,
    kcat_overwrite: dict[str, float] = {},
    transfered_ec_number_json: str = "",
    max_taxonomy_level: int | float = float("inf"),
    add_hill_coefficients: bool = True,
    kis_and_kas_only_for_same_compartments: bool = True,
) -> dict[str, EnzymeReactionData | None]:
    """Selects enzyme kinetic data for a given SBML model using SABIO-RK data.

    If this data cannot be found locally, a connection to SABIO-RK is established and the
    relevant data is downloaded, which may take some time (on the order of dozens of minutes).
    If you want to download the full SABIO-RK data in advance, run get_full_sabio_dict() from
    this module with the same sabio_target_folder.

    Collected data includes k_cat, k_m, k_i, k_a and Hill coefficients for all EC numbers that
    occur in the model's BiGG-compliant EC number annotation.

    Args:
        sbml_path (str): Path to the SBML file.
        sabio_target_folder (str): The path to the folder containing SABIO-RK data.
        base_species (str): The base species for taxonomy comparison.
        ncbi_parsed_json_path (str): The path to the NCBI parsed JSON file.
        bigg_metabolites_json_path (str): The path to the BIGG metabolites JSON file.
        kinetic_ignored_metabolites (list[str], optional): List of metabolites to ignore. Defaults to [].
        kinetic_ignored_enzyme_ids (list[str], optional): List of enzyme IDs to ignore. Defaults to [].
        custom_enzyme_kinetic_data (dict[str, EnzymeReactionData | None], optional): Custom enzyme kinetic data. Defaults to {}.
        min_ph (float, optional): Minimum pH value for filtering. Defaults to -float("inf").
        max_ph (float, optional): Maximum pH value for filtering. Defaults to float("inf").
        accept_nan_ph (bool, optional): Whether to accept entries with NaN pH values. Defaults to True.
        min_temperature (float, optional): Minimum temperature value for filtering. Defaults to -float("inf").
        max_temperature (float, optional): Maximum temperature value for filtering. Defaults to float("inf").
        accept_nan_temperature (bool, optional): Whether to accept entries with NaN temperature values. Defaults to True.
        kcat_overwrite (dict[str, float], optional): Dictionary to overwrite kcat values. Defaults to {}.
        add_hill_coefficients (bool, optional): Whether Hill coefficients shall be collected (True) or not (False). Defaults to True.
        kis_and_kas_only_for_same_compartments (bool, optional): If True, k_i and k_a values are only attributed to a reaction if the affected metabolite shares one of the reaction metabolites' compartments. Defaults to True.
    Returns:
        dict[str, EnzymeReactionData | None]: A dictionary mapping reaction IDs to enzyme kinetic data.
    """
    cobra_model = cobra.io.read_sbml_model(sbml_path)
    sabio_dict = get_full_sabio_dict(
        sabio_target_folder,
    )
    ncbi_parsed_json_data = json_zip_load(ncbi_parsed_json_path)
    name_to_bigg_id_dict: dict[str, str] = json_load(
        bigg_metabolites_json_path, dict[str, str]
    )

    # Get reaction<->enzyme reaction data mapping
    enzyme_reaction_data: dict[str, EnzymeReactionData | None] = {}
    transfered_ec_codes: dict[str, str] = (
        json_load(transfered_ec_number_json, dict[str, str])
        if transfered_ec_number_json
        else {}
    )
    for reaction in cobra_model.reactions:
        if "ec-code" not in reaction.annotation:
            continue

        enzyme_identifiers = reaction.gene_reaction_rule.split(" and ")
        has_found_ignored_enzyme = False
        for enzyme_identifier in enzyme_identifiers:
            if enzyme_identifier in kinetic_ignored_enzyme_ids:
                has_found_ignored_enzyme = True
                break
        if has_found_ignored_enzyme:
            continue

        reac_met_ids = [met.id for met in reaction.metabolites]
        substrate_bigg_ids = [
            met_id[: met_id.rfind("_")]
            for met_id in reac_met_ids
            if reaction.metabolites[cobra_model.metabolites.get_by_id(met_id)] < 0
        ]
        product_bigg_ids = [
            met_id[: met_id.rfind("_")]
            for met_id in reac_met_ids
            if reaction.metabolites[cobra_model.metabolites.get_by_id(met_id)] > 0
        ]

        ec_codes = reaction.annotation["ec-code"]
        if isinstance(ec_codes, str):
            ec_codes = [ec_codes]
        reaction_transfered_ec_codes = [
            transfered_ec_codes[ec_code]
            for ec_code in ec_codes
            if ec_code in transfered_ec_codes
        ]
        ec_codes += reaction_transfered_ec_codes

        all_entries = (
            (
                "kcat",
                _get_ec_code_entries(
                    sabio_dict.kcat_entries,
                    ec_codes,
                    min_ph,
                    max_ph,
                    accept_nan_ph,
                    min_temperature,
                    max_temperature,
                    accept_nan_temperature,
                    substrate_bigg_ids,
                    product_bigg_ids,
                    name_to_bigg_id_dict,
                ),
            ),
            (
                "km",
                _get_ec_code_entries(
                    sabio_dict.km_entries,
                    ec_codes,
                    min_ph,
                    max_ph,
                    accept_nan_ph,
                    min_temperature,
                    max_temperature,
                    accept_nan_temperature,
                    substrate_bigg_ids,
                    product_bigg_ids,
                    name_to_bigg_id_dict,
                ),
            ),
            (
                "ki",
                _get_ec_code_entries(
                    sabio_dict.ki_entries,
                    ec_codes,
                    min_ph,
                    max_ph,
                    accept_nan_ph,
                    min_temperature,
                    max_temperature,
                    accept_nan_temperature,
                    substrate_bigg_ids,
                    product_bigg_ids,
                    name_to_bigg_id_dict,
                ),
            ),
            (
                "ka",
                _get_ec_code_entries(
                    sabio_dict.ka_entries,
                    ec_codes,
                    min_ph,
                    max_ph,
                    accept_nan_ph,
                    min_temperature,
                    max_temperature,
                    accept_nan_temperature,
                    substrate_bigg_ids,
                    product_bigg_ids,
                    name_to_bigg_id_dict,
                ),
            ),
            (
                "hill",
                _get_ec_code_entries(
                    sabio_dict.hill_entries,
                    ec_codes,
                    min_ph,
                    max_ph,
                    accept_nan_ph,
                    min_temperature,
                    max_temperature,
                    accept_nan_temperature,
                    substrate_bigg_ids,
                    product_bigg_ids,
                    name_to_bigg_id_dict,
                ),
            ),
        )

        # {'mol', 'katal*g^(-1)', 'M', 'M^2', 'g', 'mol/mol', 'J/mol', '-',
        # 's^(-1)', 's^(-1)*g^(-1)', 'mg/ml', 'mol*s^(-1)*mol^(-1)', 'M^(-1)', 'Pa',
        # 'M^(-1)*s^(-1)', 'mol*s^(-1)*g^(-1)', 'katal'}
        k_cat_per_tax_score: dict[int, list[float]] = {}
        k_cat_refs_per_tax_score: dict[int, list[ParameterReference]] = {}
        k_ms_per_tax_score: dict[str, dict[int, list[float]]] = {}
        k_m_refs_per_tax_score: dict[str, dict[int, list[ParameterReference]]] = {}
        k_is_per_tax_score: dict[str, dict[int, list[float]]] = {}
        k_i_refs_per_tax_score: dict[str, dict[int, list[ParameterReference]]] = {}
        k_as_per_tax_score: dict[str, dict[int, list[float]]] = {}
        k_a_refs_per_tax_score: dict[str, dict[int, list[ParameterReference]]] = {}
        hills_per_tax_score: dict[str, dict[int, list[float]]] = {}
        hill_refs_per_tax_score: dict[str, dict[int, list[ParameterReference]]] = {}
        reaction_compartments = [met.compartment for met in reaction.metabolites]
        for entries_type, entries in all_entries:
            if entries_type == "kcat":  # Reaction-wide search
                for entry in entries:
                    match entry.parameter_unit:
                        case "s^(-1)":
                            multiplier = 3_600
                        case _:
                            continue

                    taxonomy_dict = get_taxonomy_dict_from_nbci_taxonomy(
                        [base_species, entry.organism], ncbi_parsed_json_data
                    )
                    taxonomy_score = get_taxonomy_scores(base_species, taxonomy_dict)[
                        entry.organism
                    ]
                    if taxonomy_score > max_taxonomy_level:
                        continue
                    if taxonomy_score not in k_cat_per_tax_score:
                        k_cat_per_tax_score[taxonomy_score] = []
                        k_cat_refs_per_tax_score[taxonomy_score] = []
                    k_cat_per_tax_score[taxonomy_score].append(
                        entry.parameter_value * multiplier
                    )
                    k_cat_refs_per_tax_score[taxonomy_score].append(
                        ParameterReference(
                            database="SABIO-RK",
                            comment="SabioEntryID: " + str(entry.entry_id),
                            species=entry.organism,
                            substrate=entry.parameter_associated_species,
                            value=entry.parameter_value * multiplier,
                            tax_distance=taxonomy_score,
                        )
                    )
            else:  # Metabolite-wide search
                match entries_type:
                    case "ka":
                        values_pointer = k_as_per_tax_score
                        ref_pointer = k_a_refs_per_tax_score
                    case "ki":
                        values_pointer = k_is_per_tax_score
                        ref_pointer = k_i_refs_per_tax_score
                    case "km":
                        values_pointer = k_ms_per_tax_score
                        ref_pointer = k_m_refs_per_tax_score
                    case "hill":
                        if not add_hill_coefficients:
                            continue
                        values_pointer = hills_per_tax_score
                        ref_pointer = hill_refs_per_tax_score
                    case _:
                        raise ValueError
                for met in cobra_model.metabolites:
                    if met.id in kinetic_ignored_metabolites:
                        continue
                    if (entries_type == "km") and met not in reaction.metabolites:
                        continue
                    if (
                        met.compartment not in reaction_compartments
                        and (entries_type != "km")
                        and kis_and_kas_only_for_same_compartments
                    ):
                        continue
                    bigg_id = met.id[: met.id.rfind("_")]
                    for entry in entries:
                        entry_met_id = (
                            entry.parameter_associated_species.lower().strip()
                        )
                        if entry_met_id in name_to_bigg_id_dict:
                            entry_bigg_id = name_to_bigg_id_dict[entry_met_id]
                        else:
                            entry_bigg_id = _search_metname_in_bigg_ids(
                                met_id=entry_met_id,
                                bigg_id="",
                                entry=entry,
                                name_to_bigg_id_dict=name_to_bigg_id_dict,
                            )
                            if not entry_bigg_id:
                                continue
                        if entry_bigg_id != bigg_id:
                            continue

                        match entry.parameter_unit:
                            case "M^2":
                                applier = sqrt
                            case "M^(-1)":
                                applier = lambda x: 1 / x  # noqa: E731
                            case "M":
                                applier = lambda x: x  # noqa: E731
                            case "-":  # e.g. for Hill coefficients
                                applier = lambda x: x  # noqa: E731
                            case _:  # unknown unit
                                continue
                        taxonomy_dict = get_taxonomy_dict_from_nbci_taxonomy(
                            [base_species, entry.organism], ncbi_parsed_json_data
                        )
                        taxonomy_score = get_taxonomy_scores(
                            base_species, taxonomy_dict
                        )[entry.organism]
                        if taxonomy_score > max_taxonomy_level:
                            continue

                        if met.id not in values_pointer:
                            values_pointer[met.id] = {}
                            ref_pointer[met.id] = {}
                        if taxonomy_score not in values_pointer[met.id]:
                            values_pointer[met.id][taxonomy_score] = []
                            ref_pointer[met.id][taxonomy_score] = []
                        values_pointer[met.id][taxonomy_score].append(
                            applier(entry.parameter_value)
                        )
                        ref_pointer[met.id][taxonomy_score].append(
                            ParameterReference(
                                database="SABIO-RK",
                                comment="SabioEntryID: " + str(entry.entry_id),
                                species=entry.organism,
                                substrate=entry.parameter_associated_species,
                                tax_distance=taxonomy_score,
                                value=applier(entry.parameter_value),
                            )
                        )

        if reaction.id in kcat_overwrite:
            k_cat = kcat_overwrite[reaction.id]
            k_cat_references = [
                ParameterReference(database="OVERWRITE", tax_distance=-1)
            ]
        elif (
            (reaction.id not in kcat_overwrite) and (kcat_overwrite != {})
        ) or not k_cat_per_tax_score:
            continue
        else:
            min_k_cat_tax_score = min(k_cat_per_tax_score.keys())
            k_cat = median(k_cat_per_tax_score[min_k_cat_tax_score])
            k_cat_references = k_cat_refs_per_tax_score[min_k_cat_tax_score]

        k_ms: dict[str, float] = {}
        k_m_references: dict[str, list[ParameterReference]] = {}
        for met_id, k_m_per_tax_score in k_ms_per_tax_score.items():
            k_ms[met_id] = median(k_m_per_tax_score[min(k_m_per_tax_score.keys())])
            k_m_references[met_id] = k_m_refs_per_tax_score[met_id][
                min(k_m_per_tax_score.keys())
            ]

        k_is: dict[str, float] = {}
        k_i_references: dict[str, list[ParameterReference]] = {}
        for met_id, k_i_per_tax_score in k_is_per_tax_score.items():
            k_is[met_id] = median(k_i_per_tax_score[min(k_i_per_tax_score.keys())])
            k_i_references[met_id] = k_i_refs_per_tax_score[met_id][
                min(k_i_per_tax_score.keys())
            ]

        k_as: dict[str, float] = {}
        k_a_references: dict[str, list[ParameterReference]] = {}
        for met_id, k_a_per_tax_score in k_as_per_tax_score.items():
            k_as[met_id] = median(k_a_per_tax_score[min(k_a_per_tax_score.keys())])
            k_a_references[met_id] = k_a_refs_per_tax_score[met_id][
                min(k_a_per_tax_score.keys())
            ]

        hills: HillCoefficients = HillCoefficients()
        hill_references: dict[str, list[ParameterReference]] = {}
        for met_id, hill_per_tax_score in hills_per_tax_score.items():
            min_hill_tax_score = min(hill_per_tax_score.keys())
            hills.kappa[met_id] = median(hill_per_tax_score[min_hill_tax_score])
            hills.iota[met_id] = median(hill_per_tax_score[min_hill_tax_score])
            hills.alpha[met_id] = median(hill_per_tax_score[min_hill_tax_score])
            hill_references[met_id] = hill_refs_per_tax_score[met_id][
                min_hill_tax_score
            ]

        enzyme_reaction_data[reaction.id] = EnzymeReactionData(
            identifiers=enzyme_identifiers,
            k_cat=k_cat,
            k_cat_references=k_cat_references,
            k_ms=k_ms,
            k_m_references=k_m_references,
            k_is=k_is,
            k_i_references=k_i_references,
            k_as=k_as,
            k_a_references=k_a_references,
            hill_coefficients=hills,
            hill_coefficient_references=HillParameterReferences(
                kappa=hill_references,
                iota=hill_references,
                alpha=hill_references,
            ),
        )

    enzyme_reaction_data = {**enzyme_reaction_data, **custom_enzyme_kinetic_data}

    for reac_id in kcat_overwrite:  # noqa: PLC0206
        if reac_id not in enzyme_reaction_data:
            reaction = cobra_model.reactions.get_by_id(reac_id)
            enzyme_identifiers = reaction.gene_reaction_rule.split(" and ")
            enzyme_reaction_data[reac_id] = EnzymeReactionData(
                identifiers=enzyme_identifiers,
                k_cat=kcat_overwrite[reac_id],
                k_cat_references=[
                    ParameterReference(database="OVERWRITE", tax_distance=-1)
                ],
                k_ms={},
                k_is={},
            )

    return enzyme_reaction_data
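Throughout the function above, parameter values are grouped by taxonomic distance to the base species, and only the values from the taxonomically closest organisms (lowest score) are kept and condensed with the median. A minimal stand-alone sketch of that selection rule (the function name is illustrative, not part of the COBRAk API):

```python
from statistics import median


def closest_taxonomy_median(values_per_tax_score: dict[int, list[float]]) -> float:
    """Aggregate parameter values by taking the median of the values
    reported for the taxonomically closest organisms (lowest score)."""
    best_score = min(values_per_tax_score)
    return median(values_per_tax_score[best_score])


# Values at distance 0 (same species) win over values at distance 2
print(closest_taxonomy_median({2: [5.0, 7.0], 0: [1.0, 3.0]}))  # → 2.0
```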

spreadsheet_functionality

Functions for generating spreadsheet overviews of variability/optimization results

ABS_EPSILON = 1e-12 module-attribute

Lower absolute values are shown as 0 in the spreadsheet
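In other words, a value such as 1e-15 is rendered as 0 while 1e-3 is shown unchanged. The cutoff behaves like this hypothetical helper (a sketch, not COBRAk code):

```python
ABS_EPSILON = 1e-12


def shown_value(value: float) -> float:
    """Return 0.0 for values below the absolute display threshold."""
    return 0.0 if abs(value) < ABS_EPSILON else value


print(shown_value(1e-15))  # → 0.0
print(shown_value(-0.5))   # → -0.5
```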

EMPTY_CELL = SpreadsheetCell(None) module-attribute

Represents an empty spreadsheet cell without content

FONT_BOLD = Font(name='Calibri', bold=True) module-attribute

Bold font for spreadsheet cells

FONT_BOLD_AND_UNDERLINED = Font(name='Calibri', bold=True, underline='single') module-attribute

Bold and underlined font for spreadsheet cells

FONT_DEFAULT = Font(name='Calibri') module-attribute

Default font for spreadsheet cells

WIDTH_DEFAULT = 12 module-attribute

Default spreadsheet column width

OptimizationDataset dataclass

Represents an optimization result and which of its data shall be shown in the spreadsheet

Source code in cobrak/spreadsheet_functionality.py
@dataclass
class OptimizationDataset:
    """Represents an optimization result and which of its data shall be shown in the spreadsheet"""

    data: dict[str, float]
    """The optimization result"""
    with_df: bool = False
    """Shall driving forces be shown in the spreadsheet?"""
    with_vplus: bool = False
    """Shall V+ values be shown in the spreadsheet?"""
    with_kappa: bool = False
    """Shall saturation term values be shown in the spreadsheet?"""
    with_gamma: bool = False
    """Shall gamma values be shown in the spreadsheet?"""
    with_iota: bool = False
    """Shall iota values (inhibition terms) be shown in the spreadsheet?"""
    with_alpha: bool = False
    """Shall alpha values (activation terms) be shown in the spreadsheet?"""
    with_kinetic_differences: bool = False
    """Shall differences between NLP fluxes and 'real' fluxes from kinetics be shown in the spreadsheet?"""
    with_error_corrections: bool = False
    """Shall error corrections be shown as their own sheet?"""

data instance-attribute

The optimization result

with_alpha = False class-attribute instance-attribute

Shall alpha values (activation terms) be shown in the spreadsheet?

with_df = False class-attribute instance-attribute

Shall driving forces be shown in the spreadsheet?

with_error_corrections = False class-attribute instance-attribute

Shall error corrections be shown as their own sheet?

with_gamma = False class-attribute instance-attribute

Shall gamma values be shown in the spreadsheet?

with_iota = False class-attribute instance-attribute

Shall iota values (inhibition terms) be shown in the spreadsheet?

with_kappa = False class-attribute instance-attribute

Shall saturation term values be shown in the spreadsheet?

with_kinetic_differences = False class-attribute instance-attribute

Shall differences between NLP fluxes and 'real' fluxes from kinetics be shown in the spreadsheet?

with_vplus = False class-attribute instance-attribute

Shall V+ values be shown in the spreadsheet?
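As a sketch of how such a dataset is configured, using a simplified stand-in for the dataclass and a made-up flux dict (field names match the documentation above; the reaction ids are hypothetical):

```python
from dataclasses import dataclass


# Simplified stand-in for cobrak.spreadsheet_functionality.OptimizationDataset
@dataclass
class OptimizationDataset:
    data: dict[str, float]
    with_df: bool = False
    with_vplus: bool = False
    with_kappa: bool = False


dataset = OptimizationDataset(
    data={"R_pgi": 1.25, "R_pfk": 0.98},  # hypothetical reaction fluxes
    with_df=True,  # also show driving forces
)
print(dataset.with_df, dataset.with_vplus)  # → True False
```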

SpreadsheetCell dataclass

Represents the content of a spreadsheet cell.

Includes the shown value, background color, font style and border setting.

Source code in cobrak/spreadsheet_functionality.py
@dataclass
class SpreadsheetCell:
    """Represents the content of a spreadsheet cell.

    Includes the shown value, background color, font style
    and border setting.
    """

    value: float | str | int | bool | None
    """The cell's shown content value (if None, nothing is shown)"""
    bg_color: PatternFill = field(default=BG_COLOR_DEFAULT)
    """The cell's background color (default: BG_COLOR_DEFAULT)"""
    font: Font = field(default=FONT_DEFAULT)
    """The cell's font style (default: FONT_DEFAULT)"""
    border: Border | None = field(default=None)
    """The cell's border style (None if no style given; default: None)"""

bg_color = field(default=BG_COLOR_DEFAULT) class-attribute instance-attribute

The cell's background color (default: BG_COLOR_DEFAULT)

border = field(default=None) class-attribute instance-attribute

The cell's border style (None if no style given; default: None)

font = field(default=FONT_DEFAULT) class-attribute instance-attribute

The cell's font style (default: FONT_DEFAULT)

value instance-attribute

The cell's shown content value (if None, nothing is shown)

Title dataclass

Represents a title or metatitle used in visualizations.

Source code in cobrak/spreadsheet_functionality.py
@dataclass
class Title:
    """Represents a title or metatitle used in visualizations."""

    text: str
    """Title text content"""
    width: float
    """With of column"""
    is_metatitle: bool = field(default=False)
    """If True, the title is shown *under* a the major title line in a second line. Defaults to False."""

is_metatitle = field(default=False) class-attribute instance-attribute

If True, the title is shown under the major title line, in a second line. Defaults to False.

text instance-attribute

Title text content

width instance-attribute

Width of the column

VariabilityDataset dataclass

Represents a dataset with variability for plotting, including error bars or ranges.

Source code in cobrak/spreadsheet_functionality.py
@dataclass
class VariabilityDataset:
    """Represents a dataset with variability for plotting, including error bars or ranges."""

    data: dict[str, tuple[float, float]]
    """The variability data dict, as returned by COBRAk's variability functions"""
    with_df: bool = False
    """Shall driving force variabilities be shown?"""

data instance-attribute

The variability data dict, as returned by COBRAk's variability functions

with_df = False class-attribute instance-attribute

Shall driving force variabilities be shown?
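The `data` dict maps each variable id to its `(minimum, maximum)` tuple. A sketch with a simplified stand-in for the dataclass and hypothetical flux ranges:

```python
from dataclasses import dataclass


# Simplified stand-in for cobrak.spreadsheet_functionality.VariabilityDataset
@dataclass
class VariabilityDataset:
    data: dict[str, tuple[float, float]]  # (min, max) per variable id
    with_df: bool = False


fva_like = VariabilityDataset(
    data={"R_pgi": (0.8, 1.4), "R_pfk": (0.0, 2.1)},  # hypothetical ranges
)
width = fva_like.data["R_pfk"][1] - fva_like.data["R_pfk"][0]
print(width)  # → 2.1
```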

create_cobrak_spreadsheet(path, cobrak_model, variability_datasets, optimization_datasets, is_maximization=True, sheet_description=[], min_var_value=1e-06, min_rel_correction=0.01, kinetic_difference_precision=6, objective_overwrite=None, extra_optstatistics_data={}, show_regulation_coefficients=True)

Generates a comprehensive Excel spreadsheet summarizing variability and optimization results for a COBRAk model.

This function creates an Excel file that organizes and visualizes various aspects of the model's reactions, metabolites, enzymes, and optimization results. It includes multiple sheets, each focusing on different components of the model and their corresponding data.

In particular, the generated Excel workbook includes the following sheets:

  1. Index: Provides an overview of the different sections in the spreadsheet.
  2. A) Optimization statistics: Displays statistical summaries of the optimization results, including objective values, solver status, and flux comparisons.
  3. B) Model settings: Lists the model's parameters such as protein pool, gas constant, temperature, and annotations.
  4. C) Reactions: Details each reaction's properties, including reaction strings, ΔG'° values, enzyme associations, and kinetic parameters.
  5. D) Metabolites: Shows metabolite concentrations, their ranges, and annotations.
  6. E) Enzymes: Lists individual enzymes with their molecular weights and concentration ranges.
  7. F) Complexes: Provides information on enzyme complexes, including associated reactions and molecular weights.
  8. G) Corrections (optional): If error corrections are included in the optimization datasets, this sheet displays the corrections applied.

Each sheet is populated with data from the provided variability and optimization datasets, formatted for readability with appropriate styling, including background colors and borders to highlight important information.

The function also handles various edge cases, such as missing data and low-flux reactions, ensuring that the spreadsheet remains organized and informative.

Parameters:

Name Type Description Default
path str

The file path where the Excel workbook will be saved.

required
cobrak_model Model

The COBRAk model containing reactions, metabolites, and enzymes.

required
variability_datasets dict[str, VariabilityDataset]

A dictionary of variability datasets, where each key is a dataset name and the value contains the data and flags for what to display.

required
optimization_datasets dict[str, OptimizationDataset]

A dictionary of optimization results, where each key is a dataset name and the value contains the optimization data and flags for what to display.

required
is_maximization bool

Indicates whether the optimization is a maximization problem. Defaults to True.

True
sheet_description list[str]

A list of description lines to include in the index sheet. Defaults to an empty list.

[]
min_var_value float

Where applicable (e.g. for fluxes), the minimum value to display a variable's value. Does not apply to error correction values (see the next argument for that). Defaults to 1e-6.

1e-06
min_rel_correction float

Minimal relative change to associated original value for which an error correction value is shown.

0.01
kinetic_difference_precision int

The number of decimal places to round kinetic differences. Defaults to 6.

6

Returns:

Name Type Description
None None

The function does not return any value but saves the Excel workbook to the specified path.

Source code in cobrak/spreadsheet_functionality.py
@validate_call
def create_cobrak_spreadsheet(
    path: str,
    cobrak_model: Model,
    variability_datasets: dict[str, VariabilityDataset],
    optimization_datasets: dict[str, OptimizationDataset],
    is_maximization: bool = True,
    sheet_description: list[str] = [],
    min_var_value: float = 1e-6,
    min_rel_correction: float = 0.01,
    kinetic_difference_precision: int = 6,
    objective_overwrite: None | str = None,
    extra_optstatistics_data: dict[str, list[str | float | int | bool | None]] = {},
    show_regulation_coefficients: bool = True,
) -> None:
    """Generates a comprehensive Excel spreadsheet summarizing variability and optimization results for a COBRAk model.

    This function creates an Excel file that organizes and visualizes various aspects of the model's reactions, metabolites, enzymes, and optimization results.
    It includes multiple sheets, each focusing on different components of the model and their corresponding data.

    In particular, the generated Excel workbook includes the following sheets:

    1. **Index**: Provides an overview of the different sections in the spreadsheet.
    2. **A) Optimization statistics**: Displays statistical summaries of the optimization results, including objective values, solver status, and flux comparisons.
    3. **B) Model settings**: Lists the model's parameters such as protein pool, gas constant, temperature, and annotations.
    4. **C) Reactions**: Details each reaction's properties, including reaction strings, ΔG'° values, enzyme associations, and kinetic parameters.
    5. **D) Metabolites**: Shows metabolite concentrations, their ranges, and annotations.
    6. **E) Enzymes**: Lists individual enzymes with their molecular weights and concentration ranges.
    7. **F) Complexes**: Provides information on enzyme complexes, including associated reactions and molecular weights.
    8. **G) Corrections (optional)**: If error corrections are included in the optimization datasets, this sheet displays the corrections applied.

    Each sheet is populated with data from the provided variability and optimization datasets, formatted for readability with appropriate styling,
    including background colors and borders to highlight important information.

    The function also handles various edge cases, such as missing data and low-flux reactions, ensuring that the spreadsheet remains organized and informative.

    Args:
        path (str): The file path where the Excel workbook will be saved.
        cobrak_model (Model): The COBRAk model containing reactions, metabolites, and enzymes.
        variability_datasets (dict[str, VariabilityDataset]): A dictionary of variability datasets, where each key is a dataset name and the value contains
                                                             the data and flags for what to display.
        optimization_datasets (dict[str, OptimizationDataset]): A dictionary of optimization results, where each key is a dataset name and the value contains
                                                                the optimization data and flags for what to display.
        is_maximization (bool, optional): Indicates whether the optimization is a maximization problem. Defaults to True.
        sheet_description (list[str], optional): A list of description lines to include in the index sheet. Defaults to an empty list.
        min_var_value (float, optional): Where applicable (e.g. for fluxes), the minimum value above which a variable's value is displayed. Does not apply to error correction values (see the next argument for those).
                                         Defaults to 1e-6.
        min_rel_correction (float, optional): Minimal relative change, compared to the associated original value, above which an error correction value is shown.
        kinetic_difference_precision (int, optional): The number of decimal places to which kinetic differences are rounded. Defaults to 6.
        objective_overwrite (str | None, optional): If given, the value of this variable is reported in the statistics sheet instead of the objective value.
        show_regulation_coefficients (bool, optional): Whether to include the kis, kas, and Hill coefficient columns in the reactions sheet.
        extra_optstatistics_data (dict, optional): Extra statistics rows, mapping a row title to the values shown for each optimization dataset.

    Returns:
        None: The function does not return any value but saves the Excel workbook to the specified path.
    """
    all_reac_ids = list(cobrak_model.reactions.keys())
    all_met_ids = list(cobrak_model.metabolites.keys())
    all_enzyme_ids = list(cobrak_model.enzymes.keys())
    all_met_var_ids = [LNCONC_VAR_PREFIX + met_id for met_id in all_met_ids]
    all_enzcomplex_ids = []
    for reac_id, reaction in cobrak_model.reactions.items():
        if reaction.enzyme_reaction_data is None:
            continue
        all_enzcomplex_ids.append(get_reaction_enzyme_var_id(reac_id, reaction))

    has_any_vplus = any(
        opt_data.with_vplus for opt_data in optimization_datasets.values()
    )
    has_any_df = any(opt_data.with_df for opt_data in optimization_datasets.values())
    has_any_kappa = any(
        opt_data.with_kappa for opt_data in optimization_datasets.values()
    )
    has_any_gamma = any(
        opt_data.with_gamma for opt_data in optimization_datasets.values()
    )
    has_any_iota = any(
        opt_data.with_iota for opt_data in optimization_datasets.values()
    )
    has_any_alpha = any(
        opt_data.with_alpha for opt_data in optimization_datasets.values()
    )
    has_any_kinetic_differences = any(
        opt_data.with_kinetic_differences for opt_data in optimization_datasets.values()
    )

    kappa_gamma_iota_alpha_str_list = []
    if has_any_kappa:
        kappa_gamma_iota_alpha_str_list.append("κ")
    if has_any_gamma:
        kappa_gamma_iota_alpha_str_list.append("γ")
    if has_any_iota:
        kappa_gamma_iota_alpha_str_list.append("ι")
    if has_any_alpha:
        kappa_gamma_iota_alpha_str_list.append("α")
    kappa_gamma_iota_alpha_str = "⋅".join(kappa_gamma_iota_alpha_str_list)
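    # (Illustrative note) With e.g. only κ and γ active across the datasets, the
    # joined label is "κ⋅γ"; it names the combined-efficiency statistics rows below.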

    # Index sheet
    index_titles: list[Title] = []
    index_cells: dict[str, list[str | float | int | bool | None | SpreadsheetCell]] = {}
    sheet_line = 1
    for description_line in sheet_description:
        index_cells[_num_to_sheet_letter(sheet_line)] = [
            SpreadsheetCell(
                description_line,
            ),
        ]
        sheet_line += 1

    if sheet_line == 1:
        sheet_line = 0  # No description lines were provided; entry A) below can start on the first line
    index_cells |= {
        _num_to_sheet_letter(sheet_line + 1): [
            SpreadsheetCell(
                "A) Optimization statistics: Objective values, minimal/maximal occurring kinetic values, ...",
            ),
        ],
        _num_to_sheet_letter(sheet_line + 2): [
            SpreadsheetCell(
                "B) Global setting: Model settings such as the temperature, protein pool, ...",
            ),
        ],
        _num_to_sheet_letter(sheet_line + 3): [
            SpreadsheetCell(
                "C) Reactions: Their fluxes, driving forces, kinetic values...",
            ),
        ],
        _num_to_sheet_letter(sheet_line + 4): [
            SpreadsheetCell(
                "D) Metabolites: Their concentrations, formulas, ...",
            ),
        ],
        _num_to_sheet_letter(sheet_line + 5): [
            SpreadsheetCell(
                "E) Enzymes: The single enzymes occurring in the model with their concentration settings (if any given)",
            ),
        ],
        _num_to_sheet_letter(sheet_line + 6): [
            SpreadsheetCell(
                "F) Complexes: The (multi- or single-)enzyme complexes occurring in the model with protein pool fraction data",
            ),
        ],
    }

    if has_any_kappa or has_any_gamma or has_any_iota or has_any_alpha:
        index_cells |= {
            _num_to_sheet_letter(sheet_line + 7): [
                SpreadsheetCell(
                    "G) Corrections: Corrections applied to the optimization datasets (if any)",
                ),
            ],
        }

    # Model settings sheet
    model_titles: list[Title] = []
    model_cells: dict[str, list[str | float | int | bool | None | SpreadsheetCell]] = {
        "A": [
            SpreadsheetCell("Protein pool [g⋅gDW⁻¹]", font=FONT_BOLD),
            SpreadsheetCell(cobrak_model.max_prot_pool),
        ],
        "B": [
            SpreadsheetCell("R [kJ⋅K⁻¹⋅mol⁻¹)]", font=FONT_BOLD),
            SpreadsheetCell(cobrak_model.R),
        ],
        "C": [
            SpreadsheetCell("T [K]", font=FONT_BOLD),
            SpreadsheetCell(cobrak_model.T),
        ],
        "D": [
            SpreadsheetCell("R⋅T [kJ⋅mol⁻¹]", font=FONT_BOLD),
            SpreadsheetCell(cobrak_model.R * cobrak_model.T),
        ],
        "E": [
            SpreadsheetCell("κ-ignored metabolites", font=FONT_BOLD),
            SpreadsheetCell(str(cobrak_model.kinetic_ignored_metabolites)),
        ],
        "F": [
            SpreadsheetCell("Model annotation", font=FONT_BOLD),
            SpreadsheetCell(str(cobrak_model.annotation)),
        ],
        "G": [
            SpreadsheetCell("Maximal concentration sum [M]", font=FONT_BOLD),
            SpreadsheetCell(str(cobrak_model.max_conc_sum)),
        ],
        "H": [
            SpreadsheetCell("Metabolite pool [M]", font=FONT_BOLD),
            SpreadsheetCell(cobrak_model.max_conc_sum),
        ],
        "I": [
            SpreadsheetCell(
                "Metabolite pool ignore prefixes (i.e. metabolites with this prefix are not counted)",
                font=FONT_BOLD,
            ),
            SpreadsheetCell("; ".join(cobrak_model.conc_sum_ignore_prefixes)),
        ],
        "J": [
            SpreadsheetCell(
                "Metabolite pool include suffixes (i.e. only metabolites with this suffix are counted)",
                font=FONT_BOLD,
            ),
            SpreadsheetCell("; ".join(cobrak_model.conc_sum_include_suffixes)),
        ],
    }

    # Statistics sheet
    comparisons = compare_multiple_results_to_best(
        cobrak_model,
        [dataset.data for dataset in optimization_datasets.values()],
        is_maximization,
        min_var_value,
    )

    stats_titles: list[Title] = [Title("", WIDTH_DEFAULT)]
    stats_cells: dict[str, list[str | float | int | bool | None | SpreadsheetCell]] = {
        "0": [
            SpreadsheetCell(
                "Objective value"
                if objective_overwrite is None
                else f"{objective_overwrite} value",
                font=FONT_BOLD,
            ),
        ],
        "1": [
            SpreadsheetCell("Solver status (see COBRAk documentation)", font=FONT_BOLD),
        ],
        "2": [
            SpreadsheetCell(
                "Termination condition (see COBRAk documentation)", font=FONT_BOLD
            ),
        ],
    }

    statline = 3
    if has_any_vplus:
        stats_cells |= {
            f"{statline}": [
                SpreadsheetCell("Used protein pool [g⋅gDW⁻¹]", font=FONT_BOLD),
            ],
        }
        statline += 1

    if has_any_df or has_any_gamma:
        stats_cells |= {
            f"{statline}": [
                SpreadsheetCell(
                    "Used metabolite concentration pool [M]", font=FONT_BOLD
                ),
            ],
        }
        statline += 1

    if has_any_df:
        stats_cells |= {
            f"{statline}": [
                SpreadsheetCell("Min driving force [kJ⋅mol⁻¹]", font=FONT_BOLD),
            ],
            f"{statline + 1}": [
                SpreadsheetCell("Max driving force [kJ⋅mol⁻¹]", font=FONT_BOLD),
            ],
            f"{statline + 2}": [
                SpreadsheetCell("Mean driving force [kJ⋅mol⁻¹]", font=FONT_BOLD),
            ],
            f"{statline + 3}": [
                SpreadsheetCell("Median driving force [kJ⋅mol⁻¹]", font=FONT_BOLD),
            ],
        }
        statline += 4

    if has_any_gamma:
        stats_cells |= {
            f"{statline}": [
                SpreadsheetCell("Min γ", font=FONT_BOLD),
            ],
            f"{statline + 1}": [
                SpreadsheetCell("Max γ", font=FONT_BOLD),
            ],
            f"{statline + 2}": [
                SpreadsheetCell("Mean γ", font=FONT_BOLD),
            ],
            f"{statline + 3}": [
                SpreadsheetCell("Median γ", font=FONT_BOLD),
            ],
        }
        statline += 4

    if has_any_kappa:
        stats_cells |= {
            f"{statline}": [
                SpreadsheetCell("Min κ", font=FONT_BOLD),
            ],
            f"{statline + 1}": [
                SpreadsheetCell("Max κ", font=FONT_BOLD),
            ],
            f"{statline + 2}": [
                SpreadsheetCell("Mean κ", font=FONT_BOLD),
            ],
            f"{statline + 3}": [
                SpreadsheetCell("Median κ", font=FONT_BOLD),
            ],
        }
        statline += 4

    if has_any_iota:
        stats_cells |= {
            f"{statline}": [
                SpreadsheetCell("Min ι", font=FONT_BOLD),
            ],
            f"{statline + 1}": [
                SpreadsheetCell("Max ι", font=FONT_BOLD),
            ],
            f"{statline + 2}": [
                SpreadsheetCell("Mean ι", font=FONT_BOLD),
            ],
            f"{statline + 3}": [
                SpreadsheetCell("Median ι", font=FONT_BOLD),
            ],
        }
        statline += 4

    if has_any_alpha:
        stats_cells |= {
            f"{statline}": [
                SpreadsheetCell("Min α", font=FONT_BOLD),
            ],
            f"{statline + 1}": [
                SpreadsheetCell("Max α", font=FONT_BOLD),
            ],
            f"{statline + 2}": [
                SpreadsheetCell("Mean α", font=FONT_BOLD),
            ],
            f"{statline + 3}": [
                SpreadsheetCell("Median α", font=FONT_BOLD),
            ],
        }
        statline += 4

    if has_any_kappa or has_any_gamma or has_any_alpha or has_any_iota:
        stats_cells |= {
            f"{statline}": [
                SpreadsheetCell(f"Min {kappa_gamma_iota_alpha_str}", font=FONT_BOLD),
            ],
            f"{statline + 1}": [
                SpreadsheetCell(f"Max {kappa_gamma_iota_alpha_str}", font=FONT_BOLD),
            ],
            f"{statline + 2}": [
                SpreadsheetCell(f"Mean {kappa_gamma_iota_alpha_str}", font=FONT_BOLD),
            ],
            f"{statline + 3}": [
                SpreadsheetCell(f"Median {kappa_gamma_iota_alpha_str}", font=FONT_BOLD),
            ],
        }
        statline += 4

    stats_cells |= {
        f"{statline}": [
            SpreadsheetCell("Min flux difference to best", font=FONT_BOLD),
        ],
        f"{statline + 1}": [
            SpreadsheetCell("Max flux difference to best", font=FONT_BOLD),
        ],
        f"{statline + 2}": [
            SpreadsheetCell("Sum of flux differences to best", font=FONT_BOLD),
        ],
        f"{statline + 3}": [
            SpreadsheetCell("Mean flux difference to best", font=FONT_BOLD),
        ],
        f"{statline + 4}": [
            SpreadsheetCell("Median flux difference to best", font=FONT_BOLD),
        ],
        f"{statline + 5}": [
            SpreadsheetCell("Objective difference to best", font=FONT_BOLD),
        ],
        f"{statline + 6}": [
            SpreadsheetCell(
                "Only in this to best (regarding active reactions)", font=FONT_BOLD
            ),
        ],
        f"{statline + 7}": [
            SpreadsheetCell(
                "Only in best to this (regarding active reactions)", font=FONT_BOLD
            ),
        ],
    }
    statline += 8

    if has_any_kinetic_differences:
        stats_cells |= {
            f"{statline}": [
                SpreadsheetCell("'Really' used protein pool [g⋅gDW⁻¹]", font=FONT_BOLD),
            ]
        }
        statline += 1

    for extrai, extratitle in enumerate(extra_optstatistics_data.keys()):
        stats_cells[f"{statline + extrai}"] = [
            SpreadsheetCell(extratitle, font=FONT_BOLD)
        ]
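
    # Layout invariant: each optimization dataset column below must advance
    # `statline` by exactly as many rows as the header column above; datasets
    # lacking a value are therefore padded with empty cells to keep rows aligned.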

    # Optimization data
    for current_dataset_i, (opt_dataset_name, opt_dataset) in enumerate(
        optimization_datasets.items()
    ):
        statline = 0
        stats_titles.append(Title(opt_dataset_name, WIDTH_DEFAULT))
        if objective_overwrite is None:
            stats_cells[f"{statline}"].append(opt_dataset.data[OBJECTIVE_VAR_NAME])
        else:
            stats_cells[f"{statline}"].append(opt_dataset.data[objective_overwrite])
        stats_cells[f"{statline + 1}"].append(opt_dataset.data[SOLVER_STATUS_KEY])
        stats_cells[f"{statline + 2}"].append(
            opt_dataset.data[TERMINATION_CONDITION_KEY]
        )
        statline += 3

        if has_any_vplus:
            if PROT_POOL_REAC_NAME in opt_dataset.data:
                stats_cells[f"{statline}"].append(opt_dataset.data[PROT_POOL_REAC_NAME])
            else:
                stats_cells[f"{statline}"].append(_get_empty_cell())
            statline += 1

        if has_any_df or has_any_gamma:
            if any(x.startswith(LNCONC_VAR_PREFIX) for x in opt_dataset.data):
                stats_cells[f"{statline}"].append(
                    sum_concs(
                        opt_dataset.data,
                        cobrak_model.conc_sum_include_suffixes,
                        cobrak_model.conc_sum_ignore_prefixes,
                    )
                )
            else:
                stats_cells[f"{statline}"].append(_get_empty_cell())
            statline += 1

        if opt_dataset.with_df:
            df_stats, _, _, _, _, _ = get_df_and_efficiency_factors_sorted_lists(
                cobrak_model,
                opt_dataset.data,
                min_var_value,
            )
            stats_cells[f"{statline}"].append(min(df_stats.values()))
            stats_cells[f"{statline + 1}"].append(max(df_stats.values()))
            stats_cells[f"{statline + 2}"].append(mean(df_stats.values()))
            stats_cells[f"{statline + 3}"].append(median(df_stats.values()))
            statline += 4
        elif has_any_df:
            for line_letter in (f"{statline + j}" for j in range(4)):
                stats_cells[line_letter].append(_get_empty_cell())
            statline += 4

        if opt_dataset.with_gamma:
            _, _, gamma_stats, _, _, _ = get_df_and_efficiency_factors_sorted_lists(
                cobrak_model,
                opt_dataset.data,
                min_var_value,
            )
            stats_cells[f"{statline}"].append(min(gamma_stats.values()))
            stats_cells[f"{statline + 1}"].append(max(gamma_stats.values()))
            stats_cells[f"{statline + 2}"].append(mean(gamma_stats.values()))
            stats_cells[f"{statline + 3}"].append(median(gamma_stats.values()))
            statline += 4
        elif has_any_gamma:
            for line_letter in (f"{statline + j}" for j in range(4)):
                stats_cells[line_letter].append(_get_empty_cell())
            statline += 4

        if opt_dataset.with_kappa:
            _, kappa_stats, _, _, _, _ = get_df_and_efficiency_factors_sorted_lists(
                cobrak_model,
                opt_dataset.data,
                min_var_value,
            )
            stats_cells[f"{statline}"].append(min(kappa_stats.values()))
            stats_cells[f"{statline + 1}"].append(max(kappa_stats.values()))
            stats_cells[f"{statline + 2}"].append(mean(kappa_stats.values()))
            stats_cells[f"{statline + 3}"].append(median(kappa_stats.values()))
            statline += 4
        elif has_any_kappa:
            for line_letter in (f"{statline + j}" for j in range(4)):
                stats_cells[line_letter].append(_get_empty_cell())
            statline += 4

        if opt_dataset.with_iota:
            iota_values = [
                opt_dataset.data[x]
                for x in opt_dataset.data
                if x.startswith(IOTA_VAR_PREFIX)
                and (opt_dataset.data[x[len(IOTA_VAR_PREFIX) :]] > min_var_value)
            ]
            stats_cells[f"{statline}"].append(min(iota_values))
            stats_cells[f"{statline + 1}"].append(max(iota_values))
            stats_cells[f"{statline + 2}"].append(mean(iota_values))
            stats_cells[f"{statline + 3}"].append(median(iota_values))
            statline += 4
        elif has_any_iota:
            for line_letter in (f"{statline + j}" for j in range(4)):
                stats_cells[line_letter].append(_get_empty_cell())
            statline += 4

        if opt_dataset.with_alpha:
            alpha_values = [
                opt_dataset.data[x]
                for x in opt_dataset.data
                if x.startswith(ALPHA_VAR_PREFIX)
                and (opt_dataset.data[x[len(ALPHA_VAR_PREFIX) :]] > min_var_value)
            ]
            stats_cells[f"{statline}"].append(min(alpha_values))
            stats_cells[f"{statline + 1}"].append(max(alpha_values))
            stats_cells[f"{statline + 2}"].append(mean(alpha_values))
            stats_cells[f"{statline + 3}"].append(median(alpha_values))
            statline += 4
        elif has_any_alpha:
            for line_letter in (f"{statline + j}" for j in range(4)):
                stats_cells[line_letter].append(_get_empty_cell())
            statline += 4

        if (
            opt_dataset.with_kappa
            or opt_dataset.with_gamma
            or opt_dataset.with_alpha
            or opt_dataset.with_iota
        ):
            _, _, _, _, _, multiplier_stats = (
                get_df_and_efficiency_factors_sorted_lists(
                    cobrak_model,
                    opt_dataset.data,
                    min_var_value,
                )
            )
            efficiencies_product_stats_values = [
                x[0] for x in multiplier_stats.values()
            ]
            stats_cells[f"{statline}"].append(min(efficiencies_product_stats_values))
            stats_cells[f"{statline + 1}"].append(
                max(efficiencies_product_stats_values)
            )
            stats_cells[f"{statline + 2}"].append(
                mean(efficiencies_product_stats_values)
            )
            stats_cells[f"{statline + 3}"].append(
                median(efficiencies_product_stats_values)
            )
            statline += 4
        elif has_any_kappa or has_any_gamma or has_any_alpha or has_any_iota:
            for line_letter in (f"{statline + j}" for j in range(4)):
                stats_cells[line_letter].append(_get_empty_cell())
            statline += 4

        if current_dataset_i in comparisons:
            dataset_comparison_stats, dataset_unique_reacs = comparisons[
                current_dataset_i
            ]
            for j, comparison_value in enumerate(dataset_comparison_stats.values()):
                stats_cells[f"{statline + j}"].append(comparison_value)
            statline += len(dataset_comparison_stats)
            stats_cells[f"{statline}"].append(
                str(list(dataset_unique_reacs.values())[0])
            )
            stats_cells[f"{statline + 1}"].append(
                str(list(dataset_unique_reacs.values())[1])
            )
            statline += 2
        else:
            for line_letter in (f"{statline + j}" for j in range(8)):
                stats_cells[line_letter].append("(is best)")
            statline += 8

        if opt_dataset.with_kinetic_differences:
            unoptimized_reactions = get_unoptimized_reactions_in_nlp_solution(
                cobrak_model,
                opt_dataset.data,
                regard_iota=has_any_iota,
                regard_alpha=has_any_alpha,
            )
            prot_pool_sum = 0.0
            for reac_id, reac_data in cobrak_model.reactions.items():
                if reac_id not in opt_dataset.data:
                    continue
                if opt_dataset.data[reac_id] < min_var_value:
                    continue
                if reac_data.enzyme_reaction_data is None:
                    continue
                enzyme_var_id = get_reaction_enzyme_var_id(reac_id, reac_data)
                if enzyme_var_id not in opt_dataset.data:
                    continue
                enzyme_conc = opt_dataset.data[enzyme_var_id]
                mw = get_full_enzyme_mw(cobrak_model, reac_data)
                if reac_id in unoptimized_reactions:
                    ratio = max(
                        1.0,
                        unoptimized_reactions[reac_id][0]
                        / unoptimized_reactions[reac_id][1],
                    )
                    prot_pool_sum += mw * enzyme_conc * ratio
                else:
                    prot_pool_sum += mw * enzyme_conc

            stats_cells[f"{statline}"].append(prot_pool_sum)
            statline += 1
        elif has_any_kinetic_differences:
            stats_cells[f"{statline}"].append(_get_empty_cell())
            statline += 1

    for extrai, extravalues in enumerate(extra_optstatistics_data.values()):
        stats_cells[f"{statline + extrai}"].extend(
            [SpreadsheetCell(extravalue) for extravalue in extravalues]
        )

    # Reaction sheet
    reac_titles: list[Title] = [
        Title("ID", WIDTH_DEFAULT),
        Title("String", WIDTH_DEFAULT),
        Title("ΔG'° [kJ⋅mol⁻¹]", WIDTH_DEFAULT),
        Title("Enzyme(s)", WIDTH_DEFAULT),
        Title("kcat [h⁻¹]", WIDTH_DEFAULT),
    ]
    if show_regulation_coefficients:
        reac_titles += [
            Title("kms [M]", WIDTH_DEFAULT),
            Title("kis [M]", WIDTH_DEFAULT),
            Title("kas [M]", WIDTH_DEFAULT),
            Title("Hill coefficients [-]", WIDTH_DEFAULT),
        ]
    reac_cells: dict[str, list[str | float | int | bool | None | SpreadsheetCell]] = {
        reac_id: [] for reac_id in all_reac_ids
    }
    # Reaction data
    for reac_id in all_reac_ids:
        reaction = cobrak_model.reactions[reac_id]
        # Reac ID
        reac_cells[reac_id].append(reac_id)
        # Reac string
        reac_cells[reac_id].append(get_reaction_string(cobrak_model, reac_id))
        # Reac ΔG'°
        reac_cells[reac_id].append(str(_na_str_or_value(reaction.dG0)))
        enzyme_reaction_data = reaction.enzyme_reaction_data
        match enzyme_reaction_data:
            case None:
                enzyme_id = None
                k_cat = None
                k_ms = None
                k_is = None
                k_as = None
                hills = None
            case _:
                enzyme_id = str(enzyme_reaction_data.identifiers)
                k_cat = enzyme_reaction_data.k_cat
                k_ms = str(enzyme_reaction_data.k_ms)
                k_is = str(enzyme_reaction_data.k_is)
                k_as = str(enzyme_reaction_data.k_as)
                hills = str(enzyme_reaction_data.hill_coefficients)
        # Enzyme ID
        reac_cells[reac_id].append(enzyme_id)
        # kcat
        reac_cells[reac_id].append(k_cat)
        # kms
        reac_cells[reac_id].append(k_ms)
        if show_regulation_coefficients:
            # kis
            reac_cells[reac_id].append(k_is)
            # kas
            reac_cells[reac_id].append(k_as)
            # Hill coefficients
            reac_cells[reac_id].append(hills)

    # Variability data
    for var_dataset_name, var_dataset in variability_datasets.items():
        reac_titles.extend(
            (
                Title(var_dataset_name, WIDTH_DEFAULT, is_metatitle=True),
                Title("Min flux [mmol⋅gDW⁻¹⋅h⁻¹]", WIDTH_DEFAULT),
                Title("Max flux [mmol⋅gDW⁻¹⋅h⁻¹]", WIDTH_DEFAULT),
            )
        )
        if var_dataset.with_df:
            reac_titles.extend(
                (
                    Title("Min driving force [kJ⋅mol⁻¹]", WIDTH_DEFAULT),
                    Title("Max driving force [kJ⋅mol⁻¹]", WIDTH_DEFAULT),
                )
            )
        var_reac_ids = set(all_reac_ids) & set(var_dataset.data.keys())
        for reac_id in var_reac_ids:
            variability_tuple = var_dataset.data[reac_id]
            min_flux = variability_tuple[0]
            max_flux = variability_tuple[1]
            bg_color = _get_variability_bg_color(min_flux, max_flux)
            reac_cells[reac_id].append(
                SpreadsheetCell(min_flux, bg_color=bg_color, border=BORDER_BLACK_LEFT)
            )
            reac_cells[reac_id].append(SpreadsheetCell(max_flux, bg_color=bg_color))
            if var_dataset.with_df:
                df_var_id = f"{DF_VAR_PREFIX}{reac_id}"
                if df_var_id in var_dataset.data:
                    min_df = str(round(var_dataset.data[df_var_id][0], 4))
                    max_df = str(round(var_dataset.data[df_var_id][1], 4))
                else:
                    min_df = " "
                    max_df = " "
            else:
                min_df = " "
                max_df = " "
            reac_cells[reac_id].append(SpreadsheetCell(min_df, bg_color=bg_color))
            reac_cells[reac_id].append(SpreadsheetCell(max_df, bg_color=bg_color))
        missing_reac_ids = set(all_reac_ids) - set(var_dataset.data.keys())
        for missing_reac_id in missing_reac_ids:
            reac_cells[missing_reac_id].append(_get_empty_cell())
            reac_cells[missing_reac_id].append(_get_empty_cell())
            if var_dataset.with_df:
                reac_cells[missing_reac_id].append(_get_empty_cell())
                reac_cells[missing_reac_id].append(_get_empty_cell())

    # Optimization data
    for opt_dataset_name, opt_dataset in optimization_datasets.items():
        reac_titles.extend(
            (
                Title(opt_dataset_name, WIDTH_DEFAULT, is_metatitle=True),
                Title("Flux", WIDTH_DEFAULT),
            )
        )
        if opt_dataset.with_df:
            reac_titles.append(Title("Driving force [kJ⋅mol⁻¹]", WIDTH_DEFAULT))
        if opt_dataset.with_vplus:
            reac_titles.append(Title("V⁺ [mmol⋅gDW⁻¹⋅h⁻¹]", WIDTH_DEFAULT))
        if opt_dataset.with_kappa:
            reac_titles.append(Title("κ [0,1]", WIDTH_DEFAULT))
        if opt_dataset.with_gamma:
            reac_titles.append(Title("γ [0,1]", WIDTH_DEFAULT))
        if opt_dataset.with_iota:
            reac_titles.append(Title("ι [0,1]", WIDTH_DEFAULT))
        if opt_dataset.with_alpha:
            reac_titles.append(Title("α [0,1]", WIDTH_DEFAULT))
        if opt_dataset.with_kinetic_differences:
            reac_titles.append(Title('"Real" flux', WIDTH_DEFAULT))
            unoptimized_reactions = get_unoptimized_reactions_in_nlp_solution(
                cobrak_model,
                opt_dataset.data,
                regard_alpha=True,
                regard_iota=True,
            )
        opt_reac_ids = set(all_reac_ids) & set(opt_dataset.data.keys())
        reacs_with_too_low_flux = []
        for reac_id in opt_reac_ids:
            flux = get_fwd_rev_corrected_flux(
                reac_id=reac_id,
                usable_reac_ids=opt_reac_ids,
                result=opt_dataset.data,
                fwd_suffix=cobrak_model.fwd_suffix,
                rev_suffix=cobrak_model.rev_suffix,
            )
            if flux < min_var_value:
                reacs_with_too_low_flux.append(reac_id)
                continue
            bg_color = _get_optimization_bg_color(flux)
            reac_cells[reac_id].append(
                SpreadsheetCell(flux, bg_color=bg_color, border=BORDER_BLACK_LEFT)
            )
            enzyme_reaction_data = cobrak_model.reactions[reac_id].enzyme_reaction_data
            reaction = cobrak_model.reactions[reac_id]
            if opt_dataset.with_df:
                df_var_id = f"{DF_VAR_PREFIX}{reac_id}"
                if df_var_id in opt_dataset.data:
                    df_value = str(round(opt_dataset.data[df_var_id], 4))
                else:
                    df_value = " "
                reac_cells[reac_id].append(SpreadsheetCell(df_value, bg_color=bg_color))
            if opt_dataset.with_vplus:
                if (
                    enzyme_reaction_data is not None
                    and enzyme_reaction_data.k_cat < 1e20
                    and any(
                        identifier in cobrak_model.enzymes
                        for identifier in enzyme_reaction_data.identifiers
                    )
                ):
                    vplus = str(
                        enzyme_reaction_data.k_cat
                        * opt_dataset.data[
                            get_reaction_enzyme_var_id(reac_id, reaction)
                        ]
                    )
                else:
                    vplus = " "
                reac_cells[reac_id].append(SpreadsheetCell(vplus, bg_color=bg_color))
            if opt_dataset.with_kappa:
                kappa_var_id = KAPPA_VAR_PREFIX + reac_id
                if kappa_var_id in opt_dataset.data:
                    kappa_value = str(round(opt_dataset.data[kappa_var_id], 4))
                else:
                    kappa_value = " "
                reac_cells[reac_id].append(
                    SpreadsheetCell(kappa_value, bg_color=bg_color)
                )
            if opt_dataset.with_gamma:
                gamma_var_id = GAMMA_VAR_PREFIX + reac_id
                if gamma_var_id in opt_dataset.data:
                    gamma_value = str(round(opt_dataset.data[gamma_var_id], 4))
                else:
                    gamma_value = " "
                reac_cells[reac_id].append(
                    SpreadsheetCell(gamma_value, bg_color=bg_color)
                )
            if opt_dataset.with_iota:
                iota_var_id = IOTA_VAR_PREFIX + reac_id
                if iota_var_id in opt_dataset.data:
                    iota_value = str(opt_dataset.data[iota_var_id])
                else:
                    iota_value = " "
                reac_cells[reac_id].append(
                    SpreadsheetCell(iota_value, bg_color=bg_color)
                )
            if opt_dataset.with_alpha:
                alpha_var_id = ALPHA_VAR_PREFIX + reac_id
                if alpha_var_id in opt_dataset.data:
                    alpha_value = str(opt_dataset.data[alpha_var_id])
                else:
                    alpha_value = " "
                reac_cells[reac_id].append(
                    SpreadsheetCell(alpha_value, bg_color=bg_color)
                )
            if opt_dataset.with_kinetic_differences:
                if reac_id in unoptimized_reactions and (
                    round(
                        unoptimized_reactions[reac_id][1], kinetic_difference_precision
                    )
                    != round(
                        unoptimized_reactions[reac_id][0], kinetic_difference_precision
                    )
                ):
                    reac_cells[reac_id].append(
                        SpreadsheetCell(
                            unoptimized_reactions[reac_id][1], bg_color=bg_color
                        )
                    )
                else:
                    reac_cells[reac_id].append(
                        SpreadsheetCell(flux, bg_color=bg_color, font=FONT_ITALIC)
                    )
        missing_reac_ids = set(all_reac_ids) - set(opt_dataset.data.keys())
        missing_reac_ids |= set(reacs_with_too_low_flux)
        for missing_reac_id in missing_reac_ids:
            reac_cells[missing_reac_id].append(_get_empty_cell())
            num_extra = sum(
                [
                    opt_dataset.with_df,
                    opt_dataset.with_vplus,
                    opt_dataset.with_kappa,
                    opt_dataset.with_gamma,
                    opt_dataset.with_iota,
                    opt_dataset.with_alpha,
                    opt_dataset.with_kinetic_differences,
                ]
            )
            for _ in range(num_extra):
                reac_cells[missing_reac_id].append(_get_empty_cell())

    # Single enzyme sheet
    enzyme_titles: list[Title] = [
        Title("ID", WIDTH_DEFAULT),
        Title("MW", WIDTH_DEFAULT),
        Title("Conc. range [mmol⋅gDW⁻¹]", WIDTH_DEFAULT),
    ]
    enzyme_cells: dict[str, list[str | float | int | bool | None | SpreadsheetCell]] = {
        enzyme_id: [] for enzyme_id in all_enzyme_ids
    }
    # Single enzyme data
    for enzyme_id in all_enzyme_ids:
        enzyme: Enzyme = cobrak_model.enzymes[enzyme_id]
        # Enzyme ID
        enzyme_cells[enzyme_id].append(enzyme_id)
        # Enzyme MW
        enzyme_cells[enzyme_id].append(enzyme.molecular_weight)
        # Enzyme concentration range
        match enzyme.min_conc:
            case None:
                min_conc = None
                bg_color = BG_COLOR_BLACK
            case _:
                min_conc = enzyme.min_conc
                bg_color = BG_COLOR_DEFAULT
        match enzyme.max_conc:
            case None:
                max_conc = None
                bg_color = BG_COLOR_BLACK
            case _:
                max_conc = enzyme.max_conc
                bg_color = BG_COLOR_DEFAULT

        enzyme_cells[enzyme_id].append(SpreadsheetCell(min_conc, bg_color=bg_color))
        enzyme_cells[enzyme_id].append(SpreadsheetCell(max_conc, bg_color=bg_color))

    # Enzyme complexes sheet
    enzcomplex_titles: list[Title] = [
        Title("ID", WIDTH_DEFAULT),
        Title("Reactions", WIDTH_DEFAULT),
        Title("MW", WIDTH_DEFAULT),
    ]
    enzcomplex_cells: dict[
        str, list[str | float | int | bool | None | SpreadsheetCell]
    ] = {enzcomplex_id: [] for enzcomplex_id in all_enzcomplex_ids}
    # Enzyme complex data
    for enzcomplex_id in all_enzcomplex_ids:
        reac_id, reaction = _get_enzcomplex_reaction(cobrak_model, enzcomplex_id)
        if reaction.enzyme_reaction_data is None:
            raise ValueError(
                f"No enzyme reaction data found for enzyme complex {enzcomplex_id}"
            )
        if not all(
            identifier in cobrak_model.enzymes
            for identifier in reaction.enzyme_reaction_data.identifiers
        ):  # e.g., s0001 (spontaneous reactions)
            continue
        if reaction.enzyme_reaction_data.identifiers == [""]:
            continue
        # Enzyme complex ID
        enzcomplex_cells[enzcomplex_id].append(
            enzcomplex_id.replace(ENZYME_VAR_PREFIX, "").split(ENZYME_VAR_INFIX)[0]
        )
        # Associated reaction
        enzcomplex_cells[enzcomplex_id].append(reac_id)
        # Enzyme complex MW
        full_mw = get_full_enzyme_mw(cobrak_model, reaction)
        enzcomplex_cells[enzcomplex_id].append(full_mw)

    # Variability data
    for var_dataset_name, var_dataset in variability_datasets.items():
        enzcomplex_titles.extend(
            (
                Title(var_dataset_name, WIDTH_DEFAULT, is_metatitle=True),
                Title("Min conc. [mmol⋅gDW⁻¹]", WIDTH_DEFAULT),
                Title("Max conc. [mmolgDW⁻¹]", WIDTH_DEFAULT),
            )
        )
        var_enzcomplex_ids = set(all_enzcomplex_ids) & set(var_dataset.data.keys())
        for enzcomplex_id in var_enzcomplex_ids:
            _, reaction = _get_enzcomplex_reaction(cobrak_model, enzcomplex_id)
            variability_tuple = var_dataset.data[enzcomplex_id]
            min_conc = variability_tuple[0]
            max_conc = variability_tuple[1]
            bg_color = _get_variability_bg_color(min_conc, max_conc)
            enzcomplex_cells[enzcomplex_id].append(
                SpreadsheetCell(min_conc, bg_color=bg_color, border=BORDER_BLACK_LEFT)
            )
            enzcomplex_cells[enzcomplex_id].append(
                SpreadsheetCell(max_conc, bg_color=bg_color)
            )
        missing_enzcomplex_ids = set(all_enzcomplex_ids) - set(var_dataset.data.keys())
        for missing_enzcomplex_id in missing_enzcomplex_ids:
            enzcomplex_cells[missing_enzcomplex_id].append(_get_empty_cell())
            enzcomplex_cells[missing_enzcomplex_id].append(_get_empty_cell())

    # Enzyme complex data
    for opt_dataset_name, opt_dataset in optimization_datasets.items():
        enzcomplex_titles.extend(
            (
                Title(opt_dataset_name, WIDTH_DEFAULT, is_metatitle=True),
                Title("Concentration [mmol⋅gDW⁻¹]", WIDTH_DEFAULT),
                Title("% of pool", WIDTH_DEFAULT),
            )
        )
        opt_enzcomplex_ids = set(all_enzcomplex_ids) & set(opt_dataset.data.keys())
        for enzcomplex_id in opt_enzcomplex_ids:
            _, reaction = _get_enzcomplex_reaction(cobrak_model, enzcomplex_id)
            complexconc = opt_dataset.data[enzcomplex_id]
            pool_pct = (
                100
                * complexconc
                * get_full_enzyme_mw(cobrak_model, reaction)
                / cobrak_model.max_prot_pool
            )
            bg_color = _get_optimization_bg_color(complexconc)
            enzcomplex_cells[enzcomplex_id].append(
                SpreadsheetCell(
                    complexconc, bg_color=bg_color, border=BORDER_BLACK_LEFT
                )
            )
            enzcomplex_cells[enzcomplex_id].append(
                SpreadsheetCell(round(pool_pct, 4), bg_color=bg_color)
            )
        missing_enzcomplex_ids = set(all_enzcomplex_ids) - set(opt_dataset.data.keys())
        for missing_enzcomplex_id in missing_enzcomplex_ids:
            enzcomplex_cells[missing_enzcomplex_id].append(_get_empty_cell())
            enzcomplex_cells[missing_enzcomplex_id].append(_get_empty_cell())

    # Metabolite sheet
    met_titles: list[Title] = [
        Title("ID", WIDTH_DEFAULT),
        Title("Min set concentration [mmol⋅gDW⁻¹⋅h⁻¹)]", WIDTH_DEFAULT),
        Title("Max set concentration [mmolgDW⁻¹⋅h⁻¹)]", WIDTH_DEFAULT),
        Title("Annotation", WIDTH_DEFAULT),
    ]
    met_cells: dict[str, list[str | float | int | bool | None | SpreadsheetCell]] = {
        met_id: [] for met_id in all_met_ids
    }
    # Metabolite data
    for met_id in all_met_ids:
        met: Metabolite = cobrak_model.metabolites[met_id]
        # Met ID
        met_cells[met_id].append(met_id)
        # Min conc
        met_cells[met_id].append(exp(met.log_min_conc))
        # Max conc
        met_cells[met_id].append(exp(met.log_max_conc))
        # Annotation
        met_cells[met_id].append(str(met.annotation))

    # Variability data
    for var_dataset_name, var_dataset in variability_datasets.items():
        met_titles.extend(
            (
                Title(var_dataset_name, WIDTH_DEFAULT, is_metatitle=True),
                Title("Min concentration [mmol⋅gDW⁻¹⋅h⁻¹)]", WIDTH_DEFAULT),
                Title("Max concentration [mmol⋅gDW⁻¹⋅h⁻¹)]", WIDTH_DEFAULT),
            )
        )
        all_met_var_ids = [LNCONC_VAR_PREFIX + met_id for met_id in all_met_ids]
        var_met_ids = set(all_met_var_ids) & set(var_dataset.data.keys())
        for met_var_id in var_met_ids:
            variability_tuple = var_dataset.data[met_var_id]
            min_conc = exp(variability_tuple[0])
            max_conc = exp(variability_tuple[1])
            bg_color = BG_COLOR_RED if min_conc == max_conc else BG_COLOR_GREEN
            met_cells[_get_met_id_from_met_var_id(met_var_id)].append(
                SpreadsheetCell(min_conc, bg_color=bg_color, border=BORDER_BLACK_LEFT)
            )
            met_cells[_get_met_id_from_met_var_id(met_var_id)].append(
                SpreadsheetCell(max_conc, bg_color=bg_color)
            )
        missing_met_var_ids = set(all_met_var_ids) - set(var_dataset.data.keys())
        for missing_met_var_id in missing_met_var_ids:
            met_cells[_get_met_id_from_met_var_id(missing_met_var_id)].append(
                _get_empty_cell()
            )
            met_cells[_get_met_id_from_met_var_id(missing_met_var_id)].append(
                _get_empty_cell()
            )

    # Optimization data
    for opt_dataset_name, opt_dataset in optimization_datasets.items():
        met_titles.extend(
            (
                Title(opt_dataset_name, WIDTH_DEFAULT, is_metatitle=True),
                Title("Concentration [M]", WIDTH_DEFAULT),
                Title("Consumption [mmol⋅gDW⁻¹⋅h⁻¹]", WIDTH_DEFAULT),
                Title("Production [mmol⋅gDW⁻¹⋅h⁻¹]", WIDTH_DEFAULT),
            )
        )
        all_met_var_ids = [LNCONC_VAR_PREFIX + met_id for met_id in all_met_ids]
        opt_met_ids = set(all_met_var_ids) & set(opt_dataset.data.keys())
        for met_var_id in opt_met_ids:
            conc = exp(opt_dataset.data[met_var_id])
            consumption, production = get_metabolite_consumption_and_production(
                cobrak_model, _get_met_id_from_met_var_id(met_var_id), opt_dataset.data
            )
            bg_color = _get_optimization_bg_color(consumption)
            met_cells[_get_met_id_from_met_var_id(met_var_id)].append(
                SpreadsheetCell(conc, bg_color=bg_color, border=BORDER_BLACK_LEFT)
            )
            met_cells[_get_met_id_from_met_var_id(met_var_id)].append(
                SpreadsheetCell(consumption, bg_color=bg_color)
            )
            met_cells[_get_met_id_from_met_var_id(met_var_id)].append(
                SpreadsheetCell(production, bg_color=bg_color)
            )
        missing_met_ids = set(all_met_var_ids) - set(opt_dataset.data.keys())
        for missing_met_id in missing_met_ids:
            for _ in range(3):
                met_cells[_get_met_id_from_met_var_id(missing_met_id)].append(
                    _get_empty_cell()
                )

    # κ and γ statistics
    kgstats_titles: list[Title] = [Title("Rank", WIDTH_DEFAULT, is_metatitle=False)]
    kgstats_cells: dict[
        str, list[str | float | int | bool | None | SpreadsheetCell]
    ] = {str(i): [i + 1] for i in range(len(cobrak_model.reactions))}
    for opt_dataset_name, opt_dataset in optimization_datasets.items():
        kgstats_titles.extend(
            (
                Title(opt_dataset_name, WIDTH_DEFAULT, is_metatitle=True),
                Title("Reaction ID", WIDTH_DEFAULT),
                Title("κ", WIDTH_DEFAULT),
                Title("Reaction ID", WIDTH_DEFAULT),
                Title("γ", WIDTH_DEFAULT),
                Title("Reaction ID", WIDTH_DEFAULT),
                Title("ι", WIDTH_DEFAULT),
                Title("Reaction ID", WIDTH_DEFAULT),
                Title("α", WIDTH_DEFAULT),
                Title("Reaction ID", WIDTH_DEFAULT),
                Title(kappa_gamma_iota_alpha_str, WIDTH_DEFAULT),
            )
        )

        _, kappa_stats, gamma_stats, iota_stats, alpha_stats, multiplier_stats = (
            get_df_and_efficiency_factors_sorted_lists(
                cobrak_model,
                opt_dataset.data,
                min_var_value,
            )
        )
        kappa_stats_titles = list(kappa_stats.keys())
        gamma_stats_titles = list(gamma_stats.keys())
        iota_stats_titles = list(iota_stats.keys())
        alpha_stats_titles = list(alpha_stats.keys())
        kappa_times_gamma_stats_titles = list(multiplier_stats.keys())
        for key, cell_list in kgstats_cells.items():
            # κ
            if len(kappa_stats_titles) > int(key):
                cell_list.extend(
                    (
                        kappa_stats_titles[int(key)],
                        kappa_stats[kappa_stats_titles[int(key)]],
                    )
                )
            else:
                cell_list.extend((None, None))
            # γ
            if len(gamma_stats_titles) > int(key):
                cell_list.extend(
                    (
                        gamma_stats_titles[int(key)],
                        gamma_stats[gamma_stats_titles[int(key)]],
                    )
                )
            else:
                cell_list.extend((None, None))
            # ι
            if len(iota_stats_titles) > int(key):
                cell_list.extend(
                    (
                        iota_stats_titles[int(key)],
                        iota_stats[iota_stats_titles[int(key)]],
                    )
                )
            else:
                cell_list.extend((None, None))
            # α
            if len(alpha_stats_titles) > int(key):
                cell_list.extend(
                    (
                        alpha_stats_titles[int(key)],
                        alpha_stats[alpha_stats_titles[int(key)]],
                    )
                )
            else:
                cell_list.extend((None, None))
            # κ⋅γ⋅ι⋅α
            if len(kappa_times_gamma_stats_titles) > int(key):
                cell_list.extend(
                    (
                        kappa_times_gamma_stats_titles[int(key)],
                        multiplier_stats[kappa_times_gamma_stats_titles[int(key)]][0],
                    )
                )
            else:
                cell_list.extend((None, None))

    titles_and_data_dict: dict[
        str,
        tuple[
            list[Title],
            dict[str, list[str | float | int | bool | None | SpreadsheetCell]],
        ],
    ] = {
        "Index": (index_titles, index_cells),
        "A) Optimization statistics": (stats_titles, stats_cells),
        "B) Model settings": (model_titles, model_cells),
        "C) Reactions": (reac_titles, reac_cells),
        "D) Metabolites": (met_titles, met_cells),
        "E) Enzymes": (enzyme_titles, enzyme_cells),
        "F) Complexes": (enzcomplex_titles, enzcomplex_cells),
    }
    if has_any_gamma or has_any_kappa or has_any_iota or has_any_alpha:
        titles_and_data_dict |= {
            "G) Efficiency factor statistics": (kgstats_titles, kgstats_cells),
        }

    # Correction data (if given)
    correction_titles: list[Title] = [
        Title("Affected parameter", WIDTH_DEFAULT),
        Title("Original value", WIDTH_DEFAULT),
    ]
    correction_cells: dict[
        str, list[str | float | int | bool | None | SpreadsheetCell]
    ] = {}

    num_processed_datasets = 0
    for opt_dataset_name, opt_dataset in optimization_datasets.items():
        if not opt_dataset.with_error_corrections:
            num_processed_datasets += 1
            continue
        correction_titles.append(Title(opt_dataset_name, WIDTH_DEFAULT))

        for var_name, var_value in opt_dataset.data.items():
            displayed_var = var_name.replace(ERROR_VAR_PREFIX + "_", "")
            if not var_name.startswith(ERROR_VAR_PREFIX):
                if displayed_var in correction_cells:
                    correction_cells[displayed_var].append(None)
                continue

            round_value = 12
            is_dataset_dependent: bool = False
            is_relative: bool = True
            original_value: float = 0.0
            if "kcat_times_e_" in displayed_var:
                reac_id = displayed_var.split("kcat_times_e_")[1]
                enzyme_id = get_reaction_enzyme_var_id(
                    reac_id, cobrak_model.reactions[reac_id]
                )
                original_value = (
                    cobrak_model.reactions[reac_id].enzyme_reaction_data.k_cat
                    * opt_dataset.data[enzyme_id]
                )
                displayed_original_value = "(see comments)"
                error_value = var_value - original_value
                is_dataset_dependent = True
            elif displayed_var.endswith(("_substrate", "_product")):
                reac_id = displayed_var.split("____")[0]
                met_id = (
                    displayed_var.split("____")[1]
                    .replace("_substrate", "")
                    .replace("_product", "")
                )
                original_value = cobrak_model.reactions[
                    reac_id
                ].enzyme_reaction_data.k_ms[met_id]
                displayed_original_value = original_value
                error_mult = +1 if displayed_var.endswith("_product") else -1
                error_value = error_mult * -(
                    original_value - exp(log(original_value) + error_mult * var_value)
                )
            elif displayed_var.startswith("dG0_"):
                reac_id = displayed_var[len("dG0_") :]
                original_value = cobrak_model.reactions[reac_id].dG0
                is_relative = False
            elif displayed_var.endswith(("_plus", "_minus")):
                valueblock = displayed_var.split("_origstart_")[1].split("_origend_")[0]
                min_value = float(valueblock.split("__")[0].replace("-", "."))
                max_value = float(valueblock.split("__")[1].replace("-", "."))
                displayed_original_value = f"({min_value}, {max_value}"
                min_difference = abs(var_value - min_value)
                max_difference = abs(var_value - max_value)
                original_value = (
                    min_value if min_difference < max_difference else max_value
                )
                error_value = min(max_difference, min_difference)
            else:
                continue

            if not (
                (is_relative and (error_value / original_value) >= min_rel_correction)
                or (not is_relative and error_value >= min_var_value)
            ):
                if displayed_var in correction_cells:
                    correction_cells[displayed_var].append(None)
                continue

            if displayed_var not in correction_cells:
                correction_cells[displayed_var] = [
                    displayed_var,
                    displayed_original_value,
                ] + [None for _ in range(num_processed_datasets)]
            correction_cells[displayed_var].append(
                f"{round(error_value, round_value)}{f' from {round(original_value, round_value)}' if is_dataset_dependent else ''}"
            )

        num_processed_datasets += 1

    if correction_cells != {}:
        titles_and_data_dict[
            f"{'H' if has_any_alpha or has_any_iota or has_any_gamma or has_any_kappa else 'G'}) Corrections"
        ] = (correction_titles, correction_cells)

    _create_xlsx_from_datadicts(
        path=path,
        titles_and_data_dict=titles_and_data_dict,
    )

sum_concs(result, conc_sum_include_suffixes, conc_sum_ignore_prefixes)

Returns the sum of the exponentiated concentrations of all matching metabolites in the result.

Source code in cobrak/spreadsheet_functionality.py
@validate_call()
def sum_concs(
    result: dict[str, float],
    conc_sum_include_suffixes: list[str],
    conc_sum_ignore_prefixes: list[str],
) -> float:
    """Returns the exponentiated concentration of all metabolites in  the result."""
    concsum = 0.0
    for key, value in result.items():
        if key.startswith(LNCONC_VAR_PREFIX):
            met_id = key[len(LNCONC_VAR_PREFIX) :]
            if any(met_id.startswith(prefix) for prefix in conc_sum_ignore_prefixes):
                continue
            if not any(met_id.endswith(suffix) for suffix in conc_sum_include_suffixes):
                continue
            concsum += exp(value)
    return concsum
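The filtering above can be sketched in isolation. The following is a standalone reimplementation for illustration only; it assumes the log-concentration variable prefix is `x_` (a hypothetical value, not necessarily COBRAk's actual `LNCONC_VAR_PREFIX`):

```python
from math import exp

# Hypothetical stand-in for COBRAk's LNCONC_VAR_PREFIX constant
LNCONC_VAR_PREFIX = "x_"

def sum_concs_sketch(
    result: dict[str, float],
    conc_sum_include_suffixes: list[str],
    conc_sum_ignore_prefixes: list[str],
) -> float:
    """Sum exp(ln-concentration) over all matching metabolite variables."""
    concsum = 0.0
    for key, value in result.items():
        if not key.startswith(LNCONC_VAR_PREFIX):
            continue  # only log-concentration variables contribute
        met_id = key[len(LNCONC_VAR_PREFIX):]
        if any(met_id.startswith(p) for p in conc_sum_ignore_prefixes):
            continue  # e.g., ignore pseudo-metabolites by prefix
        if not any(met_id.endswith(s) for s in conc_sum_include_suffixes):
            continue  # e.g., restrict to one compartment suffix such as "_c"
        concsum += exp(value)  # result stores ln(concentration)
    return concsum

result = {"x_glc__D_c": 0.0, "x_glc__D_e": -1.0, "f_PGI": 2.5}
print(sum_concs_sketch(result, ["_c"], ["enzyme_"]))  # only x_glc__D_c counts → exp(0.0) = 1.0
```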

standard_solvers

Includes definitions of some (MI)LP and NLP solvers.

Instead of these pre-definitions, you can also use pyomo's solver definitions.

tellurium_functionality

Functions for exporting COBRA-k model and solution to a kinetic model with the help of Tellurium.

Note: Tellurium's description language for kinetic models is called 'Antimony'.
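For orientation, a minimal Antimony fragment in the style the exporter emits (all identifiers and numeric values here are hypothetical, not taken from a real model):

```
# General constants
R = 8.314e-3
T = 298.15
# External metabolite, held constant; amounts in mmol⋅gDW⁻¹
const substanceOnly species glc__D_e = 0.02
# Derived molar concentration via an assignment rule (factor = cell_density / 1000)
glc__D_e_molar := glc__D_e * 0.34
```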

get_tellurium_string_from_cobrak_model_and_solution(cobrak_model, cell_density, e_concs, met_concs, nlp_results)

Convert a complete COBRA‑k model and its optimisation solution into an Antimony string that can be loaded by Tellurium.

The function iterates over all reactions, skips those with negligible net flux, and concatenates the Antimony fragments produced by :func:_get_reaction_string_of_cobrak_reaction. After processing reactions, it adds definitions for metabolites (either user‑provided concentrations or concentrations inferred from the NLP solution) and the global constants R and T.

Parameters

cobrak_model : Model
    The COBRA‑k model containing reactions, metabolites, and model‑wide parameters.
cell_density : float
    Cell density (g L⁻¹) used to convert between substance‑only and molar concentrations.
e_concs : dict[str, float]
    Optional enzyme concentrations keyed by reaction ID. Missing entries default to 1.0.
met_concs : dict[str, float]
    Optional metabolite concentrations (mol L⁻¹) keyed by metabolite ID.
nlp_results : dict[str, float]
    Optimisation variables returned by the NLP solver (log‑concentrations, fluxes, etc.).

Returns

str
    A complete Antimony model string ready for tellurium.loada.

Source code in cobrak/tellurium_functionality.py
@validate_call(validate_return=True)
def get_tellurium_string_from_cobrak_model_and_solution(
    cobrak_model: Model,
    cell_density: float,
    e_concs: dict[str, float],
    met_concs: dict[str, float],
    nlp_results: dict[str, float],
) -> str:
    """Convert a complete COBRA‑k model and its optimisation solution into an
    Antimony string that can be loaded by Tellurium.

    The function iterates over all reactions, skips those with negligible net
    flux, and concatenates the Antimony fragments produced by
    :func:`_get_reaction_string_of_cobrak_reaction`.  After processing reactions,
    it adds definitions for metabolites (either user‑provided concentrations or
    concentrations inferred from the NLP solution) and the global constants
    ``R`` and ``T``.

    Parameters
    ----------
    cobrak_model : Model
        The COBRA‑k model containing reactions, metabolites, and model‑wide
        parameters.
    cell_density : float
        Cell density (g L⁻¹) used to convert between substance‑only and molar
        concentrations.
    e_concs : dict[str, float]
        Optional enzyme concentrations keyed by reaction ID. Missing entries
        default to ``1.0``.
    met_concs : dict[str, float]
        Optional metabolite concentrations (mol L⁻¹) keyed by metabolite ID.
    nlp_results : dict[str, float]
        Optimisation variables returned by the NLP solver (log‑concentrations,
        fluxes, etc.).

    Returns
    -------
    str
        A complete Antimony model string ready for ``tellurium.loada``.
    """
    unoptimized_reactions = get_unoptimized_reactions_in_nlp_solution(
        cobrak_model,
        nlp_results,
    )
    tellurium_string = (
        "# General constants\n" + f"R = {STANDARD_R}\n" + f"T = {STANDARD_T}\n"
    )
    for reac_id, cobrak_reaction in cobrak_model.reactions.items():
        if reac_id.endswith(cobrak_model.fwd_suffix):
            reverse_id = reac_id.replace(
                cobrak_model.fwd_suffix, cobrak_model.rev_suffix
            )
        elif reac_id.endswith(cobrak_model.rev_suffix):
            reverse_id = reac_id.replace(
                cobrak_model.rev_suffix, cobrak_model.fwd_suffix
            )
        else:
            reverse_id = ""

        if reverse_id in nlp_results:
            reac_flux = nlp_results[reac_id] - nlp_results[reverse_id]
        else:
            reac_flux = nlp_results[reac_id]

        if reac_flux <= 1e-12:
            continue

        e_conc = e_concs.get(reac_id, 1.0)

        tellurium_string += _get_reaction_string_of_cobrak_reaction(
            cobrak_model=cobrak_model,
            reac_id=reac_id,
            cobrak_reaction=cobrak_reaction,
            e_conc=e_conc,
            met_concs=met_concs,
            reac_flux=reac_flux,
            nlp_results=nlp_results,
            kinetic_ignored_metabolites=cobrak_model.kinetic_ignored_metabolites,
            unoptimized_reactions=unoptimized_reactions,
        )

    cobrak_model = delete_orphaned_metabolites_and_enzymes(cobrak_model)

    for unsafe_met_id, metabolite in cobrak_model.metabolites.items():
        original_met_id = unsafe_met_id
        met_id = _get_numbersafe_id(unsafe_met_id)
        if met_id in met_concs:
            tellurium_string += (
                f"\nconst substanceOnly species {met_id} = {met_concs[met_id] * 1_000 / cell_density}"
                f"\n{met_id}_molar := {met_id} * {cell_density / 1_000}\n"
            )
        else:
            if ("x_" + original_met_id in nlp_results) and (
                metabolite.log_min_conc != metabolite.log_max_conc
            ):
                exp_conc = (
                    exp(nlp_results[LNCONC_VAR_PREFIX + original_met_id])
                    * 1_000
                    / cell_density
                )
                tellurium_string += (
                    f"\nsubstanceOnly species {met_id} = {exp_conc}"
                    f"\n{met_id}_molar := {met_id} * {cell_density / 1_000}"
                )
            else:
                prefix = "const " if original_met_id.endswith("_e") else ""
                tellurium_string += (
                    f"\n{prefix}substanceOnly species {met_id} = {exp(metabolite.log_min_conc) * 1_000 / cell_density}"
                    f"\n{met_id}_molar := {met_id} * {cell_density / 1_000}"
                )

    return tellurium_string
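The forward/reverse net-flux pairing used above can be sketched in isolation. This toy example uses hypothetical suffixes and flux values (not necessarily COBRAk's actual `fwd_suffix`/`rev_suffix` defaults):

```python
# Toy sketch of the forward/reverse net-flux pairing (hypothetical suffixes)
FWD_SUFFIX, REV_SUFFIX = "_FWD", "_REV"

def net_flux(reac_id: str, nlp_results: dict[str, float]) -> float:
    """Net flux of a split reaction: own flux minus its paired reverse flux."""
    if reac_id.endswith(FWD_SUFFIX):
        reverse_id = reac_id.replace(FWD_SUFFIX, REV_SUFFIX)
    elif reac_id.endswith(REV_SUFFIX):
        reverse_id = reac_id.replace(REV_SUFFIX, FWD_SUFFIX)
    else:
        reverse_id = ""  # unsplit reaction: no paired direction
    if reverse_id in nlp_results:
        return nlp_results[reac_id] - nlp_results[reverse_id]
    return nlp_results[reac_id]

fluxes = {"PGI_FWD": 2.0, "PGI_REV": 0.5, "PFK": 1.0}
print(net_flux("PGI_FWD", fluxes))  # 1.5
print(net_flux("PFK", fluxes))  # 1.0
```

A reaction is then skipped if its net flux is at or below a small tolerance (1e-12 in the source above), so only one direction of each split pair ends up in the Antimony output.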

write_kinetic_sbml_model_from_cobrak_model_and_solution(sbml_path, cobrak_model, cell_density, e_concs, met_concs, nlp_results)

Export a kinetic model derived from a COBRA‑k model to an SBML file.

The function first builds an Antimony string via :func:get_tellurium_string_from_cobrak_model_and_solution, loads it into a Tellurium RoadRunner instance, and then writes the model to the specified SBML file path.

Parameters

sbml_path : str
    Destination file path for the SBML document (e.g. "model.xml").
cobrak_model : Model
    The source COBRA‑k model.
cell_density : float
    Cell density used for concentration conversions.
e_concs : dict[str, float]
    Enzyme concentrations per reaction.
met_concs : dict[str, float]
    Metabolite concentrations per species.
nlp_results : dict[str, float]
    NLP optimisation results (log‑concentrations, fluxes, etc.).

Returns

None
    The function writes the SBML file as a side effect.

Source code in cobrak/tellurium_functionality.py
@validate_call
def write_kinetic_sbml_model_from_cobrak_model_and_solution(
    sbml_path: str,
    cobrak_model: Model,
    cell_density: float,
    e_concs: dict[str, float],
    met_concs: dict[str, float],
    nlp_results: dict[str, float],
) -> None:
    """Export a kinetic model derived from a COBRA‑k model to an SBML file.

    The function first builds an Antimony string via
    :func:`get_tellurium_string_from_cobrak_model_and_solution`, loads it into a
    Tellurium ``RoadRunner`` instance, and then writes the model to the specified
    SBML file path.

    Parameters
    ----------
    sbml_path : str
        Destination file path for the SBML document (e.g. ``"model.xml"``).
    cobrak_model : Model
        The source COBRA‑k model.
    cell_density : float
        Cell density used for concentration conversions.
    e_concs : dict[str, float]
        Enzyme concentrations per reaction.
    met_concs : dict[str, float]
        Metabolite concentrations per species.
    nlp_results : dict[str, float]
        NLP optimisation results (log‑concentrations, fluxes, etc.).

    Returns
    -------
    None
        The function writes the SBML file as a side effect.
    """
    tellurium_string = get_tellurium_string_from_cobrak_model_and_solution(
        cobrak_model=cobrak_model,
        cell_density=cell_density,
        e_concs=e_concs,
        met_concs=met_concs,
        nlp_results=nlp_results,
    )
    tellurium_runner = tellurium.loada(tellurium_string)
    tellurium_runner.exportToSBML(sbml_path)

thermokinetic_data_retrieval

Functions for directly retrieving thermokinetic data for and into a COBRA-k Model instance.

add_enzyme_reaction_data_to_cobrak_model(cobrak_model, enzyme_reaction_data, delete_old_enzyme_reaction_data=False, overwrite_existing_enzyme_reaction_data=True)

Insert pre‑computed :class:EnzymeReactionData objects into a model.

Parameters

cobrak_model : Model
    The model to be updated.
enzyme_reaction_data : dict[str, EnzymeReactionData]
    Mapping from reaction IDs to enzyme reaction data objects.
delete_old_enzyme_reaction_data : bool, default False
    If True, remove any existing data before inserting new data.
overwrite_existing_enzyme_reaction_data : bool, default True
    Overwrite existing data when a matching reaction ID is found.

Returns

Model
    The model with updated enzyme reaction data.

Source code in cobrak/thermokinetic_data_retrieval.py
@validate_call(validate_return=True)
def add_enzyme_reaction_data_to_cobrak_model(
    cobrak_model: Model,
    enzyme_reaction_data: dict[str, EnzymeReactionData],
    delete_old_enzyme_reaction_data: bool = False,
    overwrite_existing_enzyme_reaction_data: bool = True,
) -> Model:
    """Insert pre‑computed :class:`EnzymeReactionData` objects into a model

    Model
        The model to be updated.
    enzyme_reaction_data : dict[str, EnzymeReactionData]
        Mapping from reaction IDs to enzyme reaction data objects.
    delete_old_enzyme_reaction_data : bool, default False
        If True, remove any existing data before inserting new data.
    overwrite_existing_enzyme_reaction_data : bool, default True
        Overwrite existing data when a matching reaction ID is found.

    Returns
    -------
    Model
        The model with updated enzyme reaction data.
    """
    for reac_id, reaction in cobrak_model.reactions.items():
        if (reaction.enzyme_reaction_data is not None) and (
            not overwrite_existing_enzyme_reaction_data
        ):
            continue
        if (reaction.enzyme_reaction_data is not None) and (
            delete_old_enzyme_reaction_data
        ):
            reaction.enzyme_reaction_data = None
        if reac_id in enzyme_reaction_data:
            reaction.enzyme_reaction_data = enzyme_reaction_data[reac_id]
    return cobrak_model

add_thermokinetic_data_to_cobrak_model(cobrak_model, mws={}, sequences={}, kcats={}, kms={}, kis={}, kas={}, dG0s={}, dG0_uncertainties={}, conc_ranges={}, delete_old_dG0s=False, overwrite_existing_dG0s=True, overwrite_existing_enzyme_reaction_data=True)

Populate a COBRA-k model with thermodynamic and kinetic parameters.

Parameters

cobrak_model : Model
    The model to be updated.
mws : dict[str, float], optional
    Molecular weights for enzymes keyed by enzyme ID.
sequences : dict[str, str], optional
    Protein sequences keyed by enzyme ID.
kcats : dict[str, float], optional
    kcat values keyed by reaction ID.
kms : dict[str, dict[str, float]], optional
    Michaelis‑Menten constants keyed by reaction ID, then metabolite ID.
kis : dict[str, dict[str, float]], optional
    Inhibition constants keyed by reaction ID, then metabolite ID.
kas : dict[str, dict[str, float]], optional
    Activation constants keyed by reaction ID, then metabolite ID.
dG0s : dict[str, float], optional
    Standard Gibbs free energies keyed by reaction ID.
dG0_uncertainties : dict[str, float], optional
    Uncertainties of the standard Gibbs free energies keyed by reaction ID.
conc_ranges : dict[str, tuple[float, float]], optional
    Log‑concentration bounds for metabolites keyed by metabolite ID.
delete_old_dG0s : bool, default False
    If True, remove any existing dG0 values before adding new ones.
overwrite_existing_dG0s : bool, default True
    Overwrite existing dG0 values when a new value is supplied.
overwrite_existing_enzyme_reaction_data : bool, default True
    Overwrite existing enzyme reaction data when new data is supplied.

Returns

Model The updated model instance.

Source code in cobrak/thermokinetic_data_retrieval.py
@validate_call(validate_return=True)
def add_thermokinetic_data_to_cobrak_model(
    cobrak_model: Model,
    mws: dict[str, float] = {},
    sequences: dict[str, str] = {},
    kcats: dict[str, float] = {},
    kms: dict[str, dict[str, float]] = {},
    kis: dict[str, dict[str, float]] = {},
    kas: dict[str, dict[str, float]] = {},
    dG0s: dict[str, float] = {},
    dG0_uncertainties: dict[str, float] = {},
    conc_ranges: dict[str, tuple[float, float]] = {},
    delete_old_dG0s: bool = False,
    overwrite_existing_dG0s: bool = True,
    overwrite_existing_enzyme_reaction_data: bool = True,
) -> Model:
    """Populate a COBRA-k model with thermodynamic and kinetic parameters.

    Parameters
    ----------
    cobrak_model : Model
        The model to be updated.
    mws : dict[str, float], optional
        Molecular weights for enzymes keyed by enzyme ID.
    sequences : dict[str, str], optional
        Protein sequences keyed by enzyme ID.
    kcats : dict[str, float], optional
        kcat values keyed by reaction ID.
    kms : dict[str, dict[str, float]], optional
        Michaelis‑Menten constants keyed by reaction ID then metabolite ID.
    kis : dict[str, dict[str, float]], optional
        Inhibition constants keyed by reaction ID then metabolite ID.
    kas : dict[str, dict[str, float]], optional
        Activation constants keyed by reaction ID then metabolite ID.
    dG0s : dict[str, float], optional
        Standard Gibbs free energies keyed by reaction ID.
    dG0_uncertainties : dict[str, float], optional
        Uncertainties of the standard Gibbs free energies keyed by reaction ID.
    conc_ranges : dict[str, tuple[float, float]], optional
        Log‑concentration bounds for metabolites keyed by metabolite ID.
    delete_old_dG0s : bool, default False
        If True, remove any existing dG0 values before adding new ones.
    overwrite_existing_dG0s : bool, default True
        Overwrite existing dG0 values when a new value is supplied.
    overwrite_existing_enzyme_reaction_data : bool, default True
        Overwrite existing enzyme reaction data when new data is supplied.

    Returns
    -------
    Model
        The updated model instance.
    """
    # Molecular weights
    for enzyme_id, enzyme in cobrak_model.enzymes.items():
        if enzyme_id in mws:
            enzyme.molecular_weight = mws[enzyme_id]
        if enzyme_id in sequences:
            enzyme.sequence = sequences[enzyme_id]

    # dG0s, kcats, kms, kis and kas
    for reac_id, reaction in cobrak_model.reactions.items():
        # dG0s
        if delete_old_dG0s:
            reaction.dG0 = None
        elif (overwrite_existing_dG0s or reaction.dG0 is None) and (reac_id in dG0s):
            reaction.dG0 = dG0s[reac_id]
            if reac_id in dG0_uncertainties:
                reaction.dG0_uncertainty = dG0_uncertainties[reac_id]

        # enzyme_reaction_data
        if delete_old_dG0s:
            reaction.enzyme_reaction_data = None
        elif (
            overwrite_existing_enzyme_reaction_data
            or reaction.enzyme_reaction_data is None
        ):
            if reaction.enzyme_reaction_data is None:
                continue
            if reac_id in kcats:
                reaction.enzyme_reaction_data.k_cat = kcats[reac_id]
            if reac_id in kms:
                for met_id, value in kms[reac_id].items():
                    reaction.enzyme_reaction_data.kms[met_id] = value
            if reac_id in kis:
                for met_id, value in kis[reac_id].items():
                    reaction.enzyme_reaction_data.kis[met_id] = value
            if reac_id in kas:
                for met_id, value in kas[reac_id].items():
                    reaction.enzyme_reaction_data.kas[met_id] = value

    # concentration ranges
    for met_id, (min_log_conc, max_log_conc) in conc_ranges.items():
        if met_id not in cobrak_model.metabolites:
            continue
        cobrak_model.metabolites[met_id].min_log_conc = min_log_conc
        cobrak_model.metabolites[met_id].max_log_conc = max_log_conc

    return cobrak_model
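The `conc_ranges` argument expects per-metabolite log-concentration bounds, matching the `min_log_conc`/`max_log_conc` fields set above. Assuming natural-log bounds over molar concentrations, such ranges can be built with a small helper; `to_log_conc_ranges` is hypothetical and not part of the COBRAk API:

```python
from math import log

def to_log_conc_ranges(
    molar_ranges: dict[str, tuple[float, float]],
) -> dict[str, tuple[float, float]]:
    """Map {met_id: (min_M, max_M)} to {met_id: (ln(min_M), ln(max_M))}."""
    return {met: (log(lo), log(hi)) for met, (lo, hi) in molar_ranges.items()}

# E.g. ATP allowed between 1 µM and 20 mM:
conc_ranges = to_log_conc_ranges({"atp_c": (1e-6, 0.02)})
```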

automatically_add_database_thermokinetic_data_to_cobrak_model(cobrak_model, database_data_path, brenda_version, base_species, do_delete_enzymatically_suboptimal_reactions=True, use_brenda=True, use_sabio_rk=True, prefer_brenda=False, use_ec_number_transfers=True, max_taxonomy_level=1000, kinetic_ignored_enzyme_ids=['s0001'], inner_to_outer_compartments=EC_INNER_TO_OUTER_COMPARTMENTS, phs=EC_PHS, pmgs=EC_PMGS, ionic_strenghts=EC_IONIC_STRENGTHS, potential_differences=EC_POTENTIAL_DIFFERENCES, calculate_multicompartmental_dG0s=True, dG0_exclusion_prefixes=[], dG0_exclusion_inner_parts=[], ignore_dG0_uncertainty=False, max_dG0_uncertainty=1000.0, add_dG0_uncertainties=True, add_hill_coefficients=True, add_protein_sequences=False, kis_and_kas_only_for_same_compartments=False, add_molar_masses=True)

Retrieve kinetic and thermodynamic data from external databases and add them to a model.

Parameters

cobrak_model : Model
    The model to be enriched.
database_data_path : str
    Path to the folder containing the required database files.
brenda_version : str
    Version string of the BRENDA release to use.
base_species : str
    Species name of the model's organism, used for taxonomy-aware data selection.
do_delete_enzymatically_suboptimal_reactions : bool, default True
    If True, delete enzymatically suboptimal reactions after the data is added.
use_brenda : bool, default True
    Include BRENDA data if True.
use_sabio_rk : bool, default True
    Include SABIO-RK data if True.
prefer_brenda : bool, default False
    When both databases provide data, give precedence to BRENDA.
use_ec_number_transfers : bool, default True
    Use EC-number transfer mappings when searching for data.
max_taxonomy_level : int, default 1000
    Maximum taxonomic distance allowed for data transfer.
kinetic_ignored_enzyme_ids : list[str], default ["s0001"]
    Enzyme IDs to be ignored during kinetic data retrieval.
add_protein_sequences : bool, default False
    Whether protein sequences shall be read out.
kis_and_kas_only_for_same_compartments : bool, default False
    If True, kis and kas are only attributed to a reaction if the affected metabolite
    shares a compartment with one of the reaction's metabolites.
add_molar_masses : bool, default True
    Whether to calculate molar masses for all metabolites from their formula member variable.

Returns

Model The model populated with database-derived thermokinetic data.

Source code in cobrak/thermokinetic_data_retrieval.py
@validate_call(validate_return=True)
def automatically_add_database_thermokinetic_data_to_cobrak_model(
    cobrak_model: Model,
    database_data_path: str,
    brenda_version: str,
    base_species: str,
    do_delete_enzymatically_suboptimal_reactions: bool = True,
    use_brenda: bool = True,
    use_sabio_rk: bool = True,
    prefer_brenda: bool = False,
    use_ec_number_transfers: bool = True,
    max_taxonomy_level: int = 1_000,
    kinetic_ignored_enzyme_ids: list[str] = ["s0001"],
    inner_to_outer_compartments: list[str] = EC_INNER_TO_OUTER_COMPARTMENTS,
    phs: dict[str, float] = EC_PHS,
    pmgs: dict[str, float] = EC_PMGS,
    ionic_strenghts: dict[str, float] = EC_IONIC_STRENGTHS,
    potential_differences: dict[tuple[str, str], float] = EC_POTENTIAL_DIFFERENCES,
    calculate_multicompartmental_dG0s: bool = True,
    dG0_exclusion_prefixes: list[str] = [],
    dG0_exclusion_inner_parts: list[str] = [],
    ignore_dG0_uncertainty: bool = False,
    max_dG0_uncertainty: float = 1_000.0,
    add_dG0_uncertainties: bool = True,
    add_hill_coefficients: bool = True,
    add_protein_sequences: bool = False,
    kis_and_kas_only_for_same_compartments: bool = False,
    add_molar_masses: bool = True,
) -> Model:
    """Retrieve kinetic and thermodynamic data from external databases and add them to a model.

    Parameters
    ----------
    cobrak_model : Model
        The model to be enriched.
    database_data_path : str
        Path to the folder containing the required database files.
    brenda_version : str
        Version string of the BRENDA release to use.
    base_species : str
        Species name of the model's organism, used for taxonomy-aware data selection.
    do_delete_enzymatically_suboptimal_reactions : bool, default True
        If True, delete enzymatically suboptimal reactions after the data is added.
    use_brenda : bool, default True
        Include BRENDA data if True.
    use_sabio_rk : bool, default True
        Include SABIO-RK data if True.
    prefer_brenda : bool, default False
        When both databases provide data, give precedence to BRENDA.
    use_ec_number_transfers : bool, default True
        Use EC-number transfer mappings when searching for data.
    max_taxonomy_level : int, default 1000
        Maximum taxonomic distance allowed for data transfer.
    kinetic_ignored_enzyme_ids : list[str], default ["s0001"]
        Enzyme IDs to be ignored during kinetic data retrieval.
    add_protein_sequences : bool, default False
        Whether protein sequences shall be read out.
    kis_and_kas_only_for_same_compartments : bool, default False
        If True, kis and kas are only attributed to a reaction if the affected metabolite
        shares a compartment with one of the reaction's metabolites.
    add_molar_masses : bool, default True
        Whether to calculate molar masses for all metabolites from their formula member variable.

    Returns
    -------
    Model
        The model populated with database-derived thermokinetic data.
    """
    database_data_path = standardize_folder(database_data_path)

    if not exists(f"{database_data_path}_cache_enzyme_reaction_data.json"):
        enzyme_reaction_data = get_database_kcats_kms_kis_and_kas_for_cobrak_model(
            cobrak_model=cobrak_model,
            database_data_path=database_data_path,
            use_brenda=use_brenda,
            use_sabio_rk=use_sabio_rk,
            base_species=base_species,
            brenda_version=brenda_version,
            prefer_brenda=prefer_brenda,
            use_ec_number_transfers=use_ec_number_transfers,
            max_taxonomy_level=max_taxonomy_level,
            kinetic_ignored_enzyme_ids=kinetic_ignored_enzyme_ids,
            add_hill_coefficients=add_hill_coefficients,
            kis_and_kas_only_for_same_compartments=kis_and_kas_only_for_same_compartments,
        )
        json_write(
            f"{database_data_path}_cache_enzyme_reaction_data.json",
            enzyme_reaction_data,
        )
    else:
        enzyme_reaction_data = json_load(
            f"{database_data_path}_cache_enzyme_reaction_data.json",
            dict[str, EnzymeReactionData],
        )
    cobrak_model = add_enzyme_reaction_data_to_cobrak_model(
        cobrak_model=cobrak_model,
        enzyme_reaction_data=enzyme_reaction_data,
    )

    mws = get_database_mws_for_cobrak_model(
        cobrak_model=cobrak_model,
        base_species=base_species,
        database_data_path=database_data_path,
    )
    json_write(f"{database_data_path}_cache_uniprot_molecular_weights.json", mws)

    if add_protein_sequences:
        sequences = get_database_protein_sequences_for_cobrak_model(
            cobrak_model=cobrak_model,
            base_species=base_species,
            database_data_path=database_data_path,
        )
        json_write(f"{database_data_path}_cache_uniprot_sequences.json", sequences)
    else:
        sequences = {}

    if not exists(f"{database_data_path}_cache_dG0.json") or not exists(
        f"{database_data_path}_cache_dG0_uncertainties.json"
    ):
        dG0s, dG0_uncertainties = get_database_dG0s_for_cobrak_model(
            cobrak_model=cobrak_model,
            inner_to_outer_compartments=inner_to_outer_compartments,
            phs=phs,
            pmgs=pmgs,
            ionic_strenghts=ionic_strenghts,
            potential_differences=potential_differences,
            calculate_multicompartmental=calculate_multicompartmental_dG0s,
            exclusion_prefixes=dG0_exclusion_prefixes,
            exclusion_inner_parts=dG0_exclusion_inner_parts,
            ignore_uncertainty=ignore_dG0_uncertainty,
            max_uncertainty=max_dG0_uncertainty,
        )
        json_write(f"{database_data_path}_cache_dG0.json", dG0s)
        json_write(
            f"{database_data_path}_cache_dG0_uncertainties.json", dG0_uncertainties
        )
    else:
        dG0s = json_load(f"{database_data_path}_cache_dG0.json", dict[str, float])
        dG0_uncertainties = json_load(
            f"{database_data_path}_cache_dG0_uncertainties.json", dict[str, float]
        )
    cobrak_model = add_thermokinetic_data_to_cobrak_model(
        cobrak_model=cobrak_model,
        mws=mws,
        sequences=sequences,
        dG0s=dG0s,
        dG0_uncertainties=dG0_uncertainties if add_dG0_uncertainties else {},
    )
    if add_molar_masses:
        cobrak_model = add_molar_masses_to_model_metabolites(cobrak_model)
    if do_delete_enzymatically_suboptimal_reactions:
        return delete_enzymatically_suboptimal_reactions_in_cobrak_model(cobrak_model)

    return cobrak_model
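The `exists()`/`json_write()`/`json_load()` logic above follows a simple compute-or-load caching pattern. A standard-library sketch of that pattern (with `json` and `open` standing in for COBRAk's `json_write`/`json_load` helpers):

```python
# Generic compute-or-load cache mirroring the exists()/json_write()/
# json_load() pattern above, using only the standard library.
import json
from os.path import exists

def cached(cache_path: str, compute):
    """Return the JSON cache's content if present; else compute, store, and return it."""
    if exists(cache_path):
        with open(cache_path, encoding="utf-8") as f:
            return json.load(f)
    result = compute()
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(result, f)
    return result
```

As in the function above, deleting the cache file forces a fresh database retrieval on the next call.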

get_database_dG0s_for_cobrak_model(cobrak_model, inner_to_outer_compartments=EC_INNER_TO_OUTER_COMPARTMENTS, phs=EC_PHS, pmgs=EC_PMGS, ionic_strenghts=EC_IONIC_STRENGTHS, potential_differences=EC_POTENTIAL_DIFFERENCES, calculate_multicompartmental=True, exclusion_prefixes=[], exclusion_inner_parts=[], ignore_uncertainty=False, max_uncertainty=1000.0)

Compute standard Gibbs free energies (and uncertainties) for all reactions in a model.

Parameters

cobrak_model : Model
    The model for which dG⁰ values are to be calculated.
inner_to_outer_compartments : list[str], optional
    Ordering of compartments from innermost to outermost for multi-compartment calculations.
phs : dict[str, float], optional
    pH values per compartment.
pmgs : dict[str, float], optional
    pMg values per compartment.
ionic_strenghts : dict[str, float], optional
    Ionic strength per compartment.
potential_differences : dict[tuple[str, str], float], optional
    Electrical potential differences between compartment pairs.
calculate_multicompartmental : bool, default True
    Whether to compute dG⁰ for reactions spanning multiple compartments.
exclusion_prefixes : list[str], optional
    Reaction ID prefixes to exclude from calculation.
exclusion_inner_parts : list[str], optional
    Reaction ID substrings to exclude from calculation.
ignore_uncertainty : bool, default False
    If True, uncertainties are not calculated.
max_uncertainty : float, default 1000.0
    Upper bound for acceptable uncertainty; reactions exceeding this are omitted.

Returns

tuple[dict[str, float], dict[str, float]] Two dictionaries mapping reaction IDs to dG⁰ values and to uncertainties, respectively.

Source code in cobrak/thermokinetic_data_retrieval.py
@validate_call(validate_return=True)
def get_database_dG0s_for_cobrak_model(
    cobrak_model: Model,
    inner_to_outer_compartments: list[str] = EC_INNER_TO_OUTER_COMPARTMENTS,
    phs: dict[str, float] = EC_PHS,
    pmgs: dict[str, float] = EC_PMGS,
    ionic_strenghts: dict[str, float] = EC_IONIC_STRENGTHS,
    potential_differences: dict[tuple[str, str], float] = EC_POTENTIAL_DIFFERENCES,
    calculate_multicompartmental: bool = True,
    exclusion_prefixes: list[str] = [],
    exclusion_inner_parts: list[str] = [],
    ignore_uncertainty: bool = False,
    max_uncertainty: float = 1_000.0,
) -> tuple[dict[str, float], dict[str, float]]:
    """Compute standard Gibbs free energies (and uncertainties) for all reactions in a model.

    Parameters
    ----------
    cobrak_model : Model
        The model for which dG⁰ values are to be calculated.
    inner_to_outer_compartments : list[str], optional
        Ordering of compartments from innermost to outermost for multi-compartment calculations.
    phs : dict[str, float], optional
        pH values per compartment.
    pmgs : dict[str, float], optional
        pMg values per compartment.
    ionic_strenghts : dict[str, float], optional
        Ionic strength per compartment.
    potential_differences : dict[tuple[str, str], float], optional
        Electrical potential differences between compartment pairs.
    calculate_multicompartmental : bool, default True
        Whether to compute dG⁰ for reactions spanning multiple compartments.
    exclusion_prefixes : list[str], optional
        Reaction ID prefixes to exclude from calculation.
    exclusion_inner_parts : list[str], optional
        Reaction ID substrings to exclude from calculation.
    ignore_uncertainty : bool, default False
        If True, uncertainties are not calculated.
    max_uncertainty : float, default 1000.0
        Upper bound for acceptable uncertainty; reactions exceeding this are omitted.

    Returns
    -------
    tuple[dict[str, float], dict[str, float]]
        Two dictionaries mapping reaction IDs to dG⁰ values and to uncertainties,
        respectively.
    """
    with TemporaryDirectory() as tmpdir:
        sbml_path = tmpdir + "/temp.xml"
        save_cobrak_model_as_annotated_sbml_model(
            cobrak_model=cobrak_model,
            filepath=sbml_path,
        )
        return equilibrator_get_model_dG0_and_uncertainty_values_for_sbml(
            sbml_path=sbml_path,
            inner_to_outer_compartments=inner_to_outer_compartments,
            phs=phs,
            pmgs=pmgs,
            ionic_strengths=ionic_strenghts,
            potential_differences=potential_differences,
            exclusion_prefixes=exclusion_prefixes,
            exclusion_inner_parts=exclusion_inner_parts,
            ignore_uncertainty=ignore_uncertainty,
            max_uncertainty=max_uncertainty,
            calculate_multicompartmental=calculate_multicompartmental,
        )
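The `max_uncertainty` cutoff means reactions with too-uncertain dG⁰ estimates are omitted from the returned dictionaries. A hypothetical post-filter with the same effect (`filter_by_uncertainty` is illustrative, not part of the COBRAk API):

```python
# Hypothetical post-filter matching the max_uncertainty cutoff: drop
# reactions whose dG0 uncertainty exceeds the bound.
def filter_by_uncertainty(
    dG0s: dict[str, float],
    uncertainties: dict[str, float],
    max_uncertainty: float = 1000.0,
) -> tuple[dict[str, float], dict[str, float]]:
    kept = {
        reac_id: dG0
        for reac_id, dG0 in dG0s.items()
        if uncertainties.get(reac_id, 0.0) <= max_uncertainty
    }
    return kept, {r: uncertainties[r] for r in kept if r in uncertainties}
```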

get_database_kcats_kms_kis_and_kas_for_cobrak_model(cobrak_model, database_data_path, brenda_version, base_species, use_brenda=True, use_sabio_rk=True, prefer_brenda=False, use_ec_number_transfers=True, max_taxonomy_level=1000, kinetic_ignored_enzyme_ids=['s0001'], add_hill_coefficients=True, kis_and_kas_only_for_same_compartments=False)

Query BRENDA and/or SABIO‑RK for kinetic parameters and return a unified dataset of the parameters found.

Parameters

cobrak_model : Model
    The model for which kinetic data are required.
database_data_path : str
    Directory containing the database files.
brenda_version : str
    Version string of the BRENDA release to use.
base_species : str
    Species name of the model's organism, used for taxonomy-aware data selection.
use_brenda : bool, default True
    Retrieve data from BRENDA if True.
use_sabio_rk : bool, default True
    Retrieve data from SABIO-RK if True.
prefer_brenda : bool, default False
    When both sources contain data for a reaction, keep BRENDA's values.
use_ec_number_transfers : bool, default True
    Apply EC-number transfer mappings when searching for data.
max_taxonomy_level : NonNegativeInt, default 1000
    Maximum allowed taxonomic distance for data transfer.
kinetic_ignored_enzyme_ids : list[str], default ["s0001"]
    Enzyme IDs to be excluded from kinetic data retrieval.
kis_and_kas_only_for_same_compartments : bool, default False
    If True, kis and kas are only attributed to a reaction if the affected metabolite
    shares a compartment with one of the reaction's metabolites.

Returns

dict[str, EnzymeReactionData] Mapping from reaction IDs to populated EnzymeReactionData objects.

Source code in cobrak/thermokinetic_data_retrieval.py
@validate_call(validate_return=True)
def get_database_kcats_kms_kis_and_kas_for_cobrak_model(
    cobrak_model: Model,
    database_data_path: str,
    brenda_version: str,
    base_species: str,
    use_brenda: bool = True,
    use_sabio_rk: bool = True,
    prefer_brenda: bool = False,
    use_ec_number_transfers: bool = True,
    max_taxonomy_level: NonNegativeInt = 1_000,
    kinetic_ignored_enzyme_ids: list[str] = ["s0001"],
    add_hill_coefficients: bool = True,
    kis_and_kas_only_for_same_compartments: bool = False,
) -> dict[str, EnzymeReactionData]:
    """Query BRENDA and/or SABIO‑RK for kinetic parameters and return (if given) a unified dataset.

    Parameters
    ----------
    cobrak_model : Model
        The model for which kinetic data are required.
    database_data_path : str
        Directory containing the database files.
    brenda_version : str
        Version string of the BRENDA release to use.
    base_species : str
        Species name of the model's organism, used for taxonomy-aware data selection.
    use_brenda : bool, default True
        Retrieve data from BRENDA if True.
    use_sabio_rk : bool, default True
        Retrieve data from SABIO‑RK if True.
    prefer_brenda : bool, default False
        When both sources contain data for a reaction, keep BRENDA's values.
    use_ec_number_transfers : bool, default True
        Apply EC‑number transfer mappings when searching for data.
    max_taxonomy_level : NonNegativeInt, default 1000
        Maximum allowed taxonomic distance for data transfer.
    kinetic_ignored_enzyme_ids : list[str], default ["s0001"]
        Enzyme IDs to be excluded from kinetic data retrieval.
    kis_and_kas_only_for_same_compartments : bool, default False
        If True, kis and kas are only attributed to a reaction if the affected metabolite
        shares a compartment with one of the reaction's metabolites.

    Returns
    -------
    dict[str, EnzymeReactionData]
        Mapping from reaction IDs to populated :class:`EnzymeReactionData` objects.
    """
    database_data_path = standardize_folder(database_data_path)
    if not use_brenda and not use_sabio_rk:
        raise ValueError(
            "Arguments use_brenda and use_sabio_rk are both False, but at least one of the databases has to be used"
        )

    if use_ec_number_transfers:
        transfer_json_path = f"{database_data_path}ec_number_transfers.json"
        if not exists(transfer_json_path):
            if not exists(f"{database_data_path}enzyme.rdf"):
                print(
                    f"ERROR: Argument use_ec_number_transfers is True, but no necessary enzyme.rdf can be found in {database_data_path}"
                )
                print(
                    "You may download it from https://ftp.expasy.org/databases/enzyme/"
                )
                print(f"After downloading, put it into the folder {database_data_path}")
            ec_number_transfers = get_ec_number_transfers(
                f"{database_data_path}enzyme.rdf"
            )
            json_write(transfer_json_path, ec_number_transfers)
    else:
        transfer_json_path = ""

    parse_external_resources(
        path=database_data_path,
        brenda_version=brenda_version,
        parse_brenda=use_brenda,
    )

    with TemporaryDirectory() as tmpdir:
        sbml_path = tmpdir + "/temp.xml"
        save_cobrak_model_as_annotated_sbml_model(
            cobrak_model=cobrak_model,
            filepath=sbml_path,
        )

        brenda_enzyme_reaction_data = brenda_select_enzyme_kinetic_data_for_sbml(
            sbml_path=sbml_path,
            brenda_json_targz_file_path=f"{database_data_path}brenda_{brenda_version}.json.tar.gz",
            bigg_metabolites_json_path=f"{database_data_path}bigg_models_metabolites.json",
            brenda_version=brenda_version,
            base_species=base_species,
            ncbi_parsed_json_path=f"{database_data_path}parsed_taxdmp.json",
            kinetic_ignored_metabolites=cobrak_model.kinetic_ignored_metabolites,
            kinetic_ignored_enzyme_ids=kinetic_ignored_enzyme_ids,
            transfered_ec_number_json=transfer_json_path,
            max_taxonomy_level=max_taxonomy_level,
            kis_and_kas_only_for_same_compartments=kis_and_kas_only_for_same_compartments,
        )

        sabio_enzyme_reaction_data = sabio_select_enzyme_kinetic_data_for_sbml(
            sbml_path=sbml_path,
            sabio_target_folder=database_data_path,
            bigg_metabolites_json_path=f"{database_data_path}bigg_models_metabolites.json",
            base_species="Escherichia coli",
            ncbi_parsed_json_path=f"{database_data_path}parsed_taxdmp.json",
            kinetic_ignored_metabolites=cobrak_model.kinetic_ignored_metabolites,
            kinetic_ignored_enzyme_ids=kinetic_ignored_enzyme_ids,
            transfered_ec_number_json=transfer_json_path,
            max_taxonomy_level=max_taxonomy_level,
            add_hill_coefficients=add_hill_coefficients,
            kis_and_kas_only_for_same_compartments=kis_and_kas_only_for_same_compartments,
        )

    if use_brenda and use_sabio_rk:
        return combine_enzyme_reaction_datasets(
            [brenda_enzyme_reaction_data, sabio_enzyme_reaction_data]
            if prefer_brenda
            else [sabio_enzyme_reaction_data, brenda_enzyme_reaction_data],
        )
    if use_brenda:
        return brenda_enzyme_reaction_data
    return sabio_enzyme_reaction_data
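When both databases are used, `combine_enzyme_reaction_datasets` merges them with the preferred source listed first. Under the assumption that earlier list entries take precedence (consistent with `prefer_brenda` placing the BRENDA dataset first), the merge can be sketched with plain dicts; `combine_datasets` is an illustrative stand-in, not the actual COBRAk helper:

```python
# First-wins merge across per-reaction datasets: an assumed reading of
# combine_enzyme_reaction_datasets (earlier list entries take precedence,
# matching how prefer_brenda puts the BRENDA dataset first above).
def combine_datasets(datasets: list[dict[str, object]]) -> dict[str, object]:
    combined: dict[str, object] = {}
    for dataset in datasets:
        for reac_id, data in dataset.items():
            combined.setdefault(reac_id, data)  # keep the first occurrence
    return combined
```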

get_database_mws_for_cobrak_model(cobrak_model, base_species, database_data_path='')

Retrieve enzyme molecular weights from UniProt for a given model.

Parameters

cobrak_model : Model
    The model whose enzymes require molecular weights.
base_species : str
    Species name of the model's organism, used for the UniProt search.
database_data_path : str, optional
    Base path for caching UniProt queries (default: empty string).

Returns

dict[str, float] Mapping from enzyme IDs to molecular weight values (kDa).

Source code in cobrak/thermokinetic_data_retrieval.py
@validate_call(validate_return=True)
def get_database_mws_for_cobrak_model(
    cobrak_model: Model,
    base_species: str,
    database_data_path: str = "",
) -> dict[str, float]:
    """Retrieve enzyme molecular weights from UniProt for a given model.

    Parameters
    ----------
    cobrak_model : Model
        The model whose enzymes require molecular weights.
    base_species : str
        Species name of the model's organism, used for the UniProt search.
    database_data_path : str, optional
        Base path for caching UniProt queries (default: empty string).

    Returns
    -------
    dict[str, float]
        Mapping from enzyme IDs to molecular weight values (kDa).
    """
    database_data_path = standardize_folder(database_data_path)
    with TemporaryDirectory() as tmpdir:
        sbml_path = tmpdir + "/temp.xml"
        save_cobrak_model_as_annotated_sbml_model(
            cobrak_model=cobrak_model,
            filepath=sbml_path,
        )
        return uniprot_get_enzyme_molecular_weights_for_sbml(
            sbml_path=sbml_path,
            cache_basepath=database_data_path,
            base_species=base_species,
        )

get_database_protein_sequences_for_cobrak_model(cobrak_model, base_species, database_data_path='')

Retrieve enzyme sequences from UniProt for a given model.

Parameters

cobrak_model : Model
    The model whose enzymes require protein sequences.
base_species : str
    Species name of the model's organism, used for the UniProt search.
database_data_path : str, optional
    Base path for caching UniProt queries (default: empty string).

Returns

dict[str, str] Mapping from enzyme IDs to sequences.

Source code in cobrak/thermokinetic_data_retrieval.py
@validate_call(validate_return=True)
def get_database_protein_sequences_for_cobrak_model(
    cobrak_model: Model,
    base_species: str,
    database_data_path: str = "",
) -> dict[str, str]:
    """Retrieve enzyme sequences from UniProt for a given model.

    Parameters
    ----------
    cobrak_model : Model
        The model whose enzymes require protein sequences.
    base_species : str
        Species name of the model's organism, used for the UniProt search.
    database_data_path : str, optional
        Base path for caching UniProt queries (default: empty string).

    Returns
    -------
    dict[str, str]
        Mapping from enzyme IDs to sequences.
    """
    with TemporaryDirectory() as tmpdir:
        sbml_path = tmpdir + "/temp.xml"
        save_cobrak_model_as_annotated_sbml_model(
            cobrak_model=cobrak_model,
            filepath=sbml_path,
        )
        database_data_path = standardize_folder(database_data_path)
        return uniprot_get_enzyme_sequences_for_sbml(
            sbml_path=sbml_path,
            cache_basepath=database_data_path,
            base_species=base_species,
        )

uniprot_functionality

get_protein_mass_mapping.py

Functions for generating a mapping of a model's proteins to their masses.

uniprot_get_enzyme_molecular_weights_for_sbml(sbml_path, cache_basepath, base_species, multiplication_factor=1 / 1000)

Returns a mapping of protein IDs as keys, and as values the protein mass in kDa.

The protein masses are taken from UniProt (retrieved using UniProt's REST API) and cached in the JSON file cache_basepath + '_cache_uniprot_molecular_weights.json'.

Arguments
  • sbml_path: str ~ The SBML's file path
  • cache_basepath: str ~ Base folder for the UniProt cache file
  • base_species: str ~ Species name of the model's organism
  • multiplication_factor: float ~ Factor applied to UniProt's Dalton values (default 1/1000, i.e. kDa)

Output

A dict with the following structure:

{
    "$PROTEIN_ID": $PROTEIN_MASS_IN_KDA,
    (...),
}
Source code in cobrak/uniprot_functionality.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True), validate_return=True)
def uniprot_get_enzyme_molecular_weights_for_sbml(
    sbml_path: str,
    cache_basepath: str,
    base_species: str,
    multiplication_factor: float = 1 / 1000,
) -> dict[str, float]:
    """Returns a JSON with a mapping of protein IDs as keys, and as values the protein mass in kDa.

    The protein masses are taken  from UniProt (retrieved using
    UniProt's REST API).

    Arguments
    ----------
    * sbml_path: str ~ The SBML's file path

    Output
    ----------
    A JSON file with the path project_folder+project_name+'_protein_id_mass_mapping.json'
    and the following structure:
    <pre>
    {
        "$PROTEIN_ID": $PROTEIN_MASS_IN_KDA,
        (...),
    }
    </pre>
    """
    model = cobra.io.read_sbml_model(sbml_path)
    uniprot_id_protein_id_mapping = _get_uniprot_id_protein_id_mapping(model)

    # GET UNIPROT ID<->PROTEIN MASS MAPPING
    uniprot_id_protein_mass_mapping: dict[str, float] = {}
    # The cache stores UniProt masses for already searched
    # UniProt IDs in a single JSON file. This prevents searching
    # UniProt for already found protein masses. :-)
    cache_basepath = standardize_folder(cache_basepath)
    ensure_folder_existence(cache_basepath)
    cache_filepath = f"{cache_basepath}_cache_uniprot_molecular_weights.json"
    try:
        cache_json: dict[str, float] = json_load(cache_filepath, dict[str, float])
    except Exception:
        cache_json: dict[str, float] = {}
    original_cache_json_keys = deepcopy(list(cache_json.keys()))
    # Go through each batch of UniProt IDs (multiple UniProt IDs
    # are searched at once in order to save an amount of UniProt API calls)
    # and retrieve their masses.
    print("Starting UniProt ID<->Protein mass search using UniProt API...")
    uniprot_ids = list(uniprot_id_protein_id_mapping.keys())

    batch_size = 12
    batch_start = 0
    while batch_start < len(uniprot_ids):
        # Create the batch with all UniProt IDs
        prebatch = uniprot_ids[batch_start : batch_start + batch_size]
        batch = []
        # Remove all IDs which are present in the cache (i.e.,
        # which were already searched for).
        for uniprot_id in prebatch:
            if uniprot_id not in cache_json:
                batch.append(uniprot_id)
            else:
                uniprot_id_protein_mass_mapping[uniprot_id] = cache_json[uniprot_id]

        # If all IDs could be found in the cache, continue with the next batch.
        if len(batch) == 0:
            batch_start += batch_size
            continue

        # Create the UniProt query for the batch
        # With 'OR', all given IDs are searched, and subsequently in this script,
        # the right associated masses are being picked.
        query = " OR ".join(batch)
        uniprot_query_url = f"https://rest.uniprot.org/uniprotkb/search?query={query}&format=tsv&fields=accession,id,mass,gene_names,gene_orf,gene_oln,organism_name"
        print(f"UniProt batch search for: {query}")

        # Call UniProt's API :-)
        uniprot_data: list[str] = requests.get(
            uniprot_query_url, timeout=1e6
        ).text.split("\n")
        # Wait in order to cool down their server :-)
        time.sleep(1.0)

        # Read out the API-returned lines
        found_ids = []
        for line in uniprot_data[1:]:
            if not line:
                continue
            fields = line.split("\t")
            accession_id = fields[0].strip()
            entry_id = fields[1].strip()
            mass_string = fields[2].strip()
            gene_names = fields[3].strip().split(" ")
            gene_names_orf = fields[4].strip().split(" ")
            gene_names_ordered_locus = fields[5].strip().split(" ")
            organism_name = fields[6].strip()
            if base_species.lower() not in organism_name.lower():
                continue
            try:
                # Note that the mass entry from UniProt uses a comma as a thousand separator, so it has to be removed before parsing
                mass = float(mass_string.replace(",", ""))
            except ValueError:  # The entry may also be missing
                continue
            uniprot_id_protein_mass_mapping[accession_id] = float(mass)
            uniprot_id_protein_mass_mapping[entry_id] = float(mass)
            for extraname in [
                extraname
                for extraname in gene_names + gene_names_orf + gene_names_ordered_locus
                if len(extraname) > 0
            ]:
                uniprot_id_protein_mass_mapping[extraname] = float(mass)
                found_ids.append(extraname)
            found_ids.extend((accession_id, entry_id))

        # Continue with the next batch :D
        batch_start += batch_size

    # Create the final protein ID <-> mass mapping
    protein_id_mass_mapping: dict[str, float] = {}
    not_found_ids = set(uniprot_ids) - set(cache_json.keys())
    if len(not_found_ids):
        print(
            f"INFO: Molecular weights not found for the following IDs: {'; '.join(not_found_ids)}"
        )
        print(
            "You may try to re-run the UniProt MW search; this sometimes helps to find missing MWs."
        )
    for uniprot_id in list(uniprot_id_protein_mass_mapping.keys()):
        try:
            protein_ids = uniprot_id_protein_id_mapping[uniprot_id]
        except KeyError:
            continue
        for protein_id in protein_ids:
            if protein_id not in original_cache_json_keys:
                protein_id_mass_mapping[protein_id] = uniprot_id_protein_mass_mapping[
                    uniprot_id
                ] * (
                    multiplication_factor
                    if protein_id not in original_cache_json_keys
                    else 1.0
                )

    # Return the protein ID<->mass mapping :D
    return protein_id_mass_mapping | cache_json

uniprot_get_enzyme_sequences_for_sbml(sbml_path, cache_basepath, base_species)

Returns a JSON with a mapping of protein IDs as keys, and as values the protein sequences.

The sequences are taken from UniProt (retrieved using UniProt's REST API).

Arguments
  • sbml_path: str ~ The SBML's file path
Output

A JSON file with the path project_folder+project_name+'_protein_id_sequence_mapping.json' and the following structure:

{
    "$PROTEIN_ID": $PROTEIN_SEQUENCE_AS_STRING,
    (...),
}
Source code in cobrak/uniprot_functionality.py
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
@validate_call(config=ConfigDict(arbitrary_types_allowed=True), validate_return=True)
def uniprot_get_enzyme_sequences_for_sbml(
    sbml_path: str,
    cache_basepath: str,
    base_species: str,
) -> dict[str, str]:
    """Returns a JSON with a mapping of protein IDs as keys, and as values the protein sequences.

    The sequences are taken from UniProt (retrieved using UniProt's REST API).

    Arguments
    ----------
    * sbml_path: str ~ The SBML's file path

    Output
    ----------
    A JSON file with the path project_folder+project_name+'_protein_id_sequence_mapping.json'
    and the following structure:
    <pre>
    {
        "$PROTEIN_ID": $PROTEIN_SEQUENCE_AS_STRING,
        (...),
    }
    </pre>
    """
    model = cobra.io.read_sbml_model(sbml_path)
    uniprot_id_protein_id_mapping = _get_uniprot_id_protein_id_mapping(model)

    # GET UNIPROT ID<->SEQUENCE MAPPING
    uniprot_id_sequence_mapping: dict[str, str] = {}
    # The cache stores UniProt sequences for already-searched
    # UniProt IDs in a single JSON file. This prevents searching
    # UniProt for already found protein sequences. :-)
    cache_basepath = standardize_folder(cache_basepath)
    ensure_folder_existence(cache_basepath)
    cache_filepath = f"{cache_basepath}_cache_uniprot_sequences.json"
    try:
        cache_json: dict[str, str] = json_load(cache_filepath, dict[str, str])
    except Exception:
        cache_json = {}
    original_cache_json_keys = deepcopy(list(cache_json.keys()))
    # Go through each batch of UniProt IDs (multiple UniProt IDs
    # are searched at once in order to save an amount of UniProt API calls)
    # and retrieve the amino acid sequences.
    print("Starting UniProt ID<->Protein sequence search using UniProt API...")
    uniprot_ids = list(uniprot_id_protein_id_mapping.keys())

    batch_size = 12
    batch_start = 0
    while batch_start < len(uniprot_ids):
        # Create the batch with all UniProt IDs
        prebatch = uniprot_ids[batch_start : batch_start + batch_size]
        batch = []
        # Remove all IDs which are present in the cache (i.e.,
        # which were already searched for).
        for uniprot_id in prebatch:
            if uniprot_id not in cache_json:
                batch.append(uniprot_id)
            else:
                uniprot_id_sequence_mapping[uniprot_id] = cache_json[uniprot_id]

        # If all IDs could be found in the cache, continue with the next batch.
        if len(batch) == 0:
            batch_start += batch_size
            continue

        # Create the UniProt query for the batch
        # With 'OR', all given IDs are searched, and subsequently in this script,
        # the right associated sequences are being picked.
        query = " OR ".join(batch)
        uniprot_query_url = f"https://rest.uniprot.org/uniprotkb/search?query={query}&format=tsv&fields=accession,id,sequence,gene_names,gene_orf,gene_oln,organism_name"
        print(f"UniProt batch search for: {query}")

        # Call UniProt's API :-)
        uniprot_data: list[str] = requests.get(
            uniprot_query_url, timeout=1e6
        ).text.split("\n")
        # Wait in order to cool down their server :-)
        time.sleep(1.0)

        # Read out the API-returned lines
        found_ids = []
        for line in uniprot_data[1:]:
            if not line:
                continue
            fields = line.split("\t")
            accession_id = fields[0].strip()
            entry_id = fields[1].strip()
            sequence_string = fields[2].strip()
            gene_names = fields[3].strip().split(" ")
            gene_names_orf = fields[4].strip().split(" ")
            gene_names_ordered_locus = fields[5].strip().split(" ")
            organism_name = fields[6].strip()
            if base_species.lower() not in organism_name.lower():
                continue
            uniprot_id_sequence_mapping[accession_id] = sequence_string
            uniprot_id_sequence_mapping[entry_id] = sequence_string
            for extraname in [
                extraname
                for extraname in gene_names + gene_names_orf + gene_names_ordered_locus
                if len(extraname) > 0
            ]:
                uniprot_id_sequence_mapping[extraname] = sequence_string
                found_ids.append(extraname)
            found_ids.extend((accession_id, entry_id))

        # Continue with the next batch :D
        batch_start += batch_size

    # Create the final protein ID <-> sequence mapping
    protein_id_sequence_mapping: dict[str, str] = {}
    not_found_ids = set(uniprot_ids) - set(cache_json.keys())
    if len(not_found_ids):
        print(
            f"INFO: Protein sequences not found for the following IDs: {'; '.join(not_found_ids)}"
        )
        print(
            "You may try to re-run the UniProt sequence search; this sometimes helps to find missing sequences."
        )
    for uniprot_id in list(uniprot_id_sequence_mapping.keys()):
        try:
            protein_ids = uniprot_id_protein_id_mapping[uniprot_id]
        except KeyError:
            continue
        for protein_id in protein_ids:
            if protein_id not in original_cache_json_keys:
                protein_id_sequence_mapping[protein_id] = uniprot_id_sequence_mapping[
                    uniprot_id
                ]

    # Return the protein ID<->sequence mapping :D
    return protein_id_sequence_mapping | cache_json
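Note that both UniProt functions return `fresh_results | cache_json`. With Python's dict union operator, values from the right-hand operand win on shared keys, so cached entries take precedence over freshly retrieved ones. A small illustration (protein IDs and sequences are fabricated):

```python
# Fabricated protein IDs and sequences for illustration
fresh = {"prot_A": "MKTAYIAK", "prot_B": "MADEUPSEQ"}
cache = {"prot_B": "CACHEDSEQ"}

# Python's dict union: on shared keys, values from the RIGHT operand win,
# so cached entries override freshly retrieved ones.
merged = fresh | cache
```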

utilities

General utility functions for COBRAk dataclasses and more.

This module does not include I/O functions, which are found in COBRAk's "io" module.

add_objective_value_as_extra_linear_constraint(cobrak_model, objective_value, objective_target, objective_sense)

Adds a linear constraint to a COBRA-k model that enforces the objective value.

This function creates an extra linear constraint that limits the objective value to be within a small range around the original objective value. This can be useful for enforcing constraints during model manipulation or optimization.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k Model object to be modified.

required
objective_value float

The original objective value.

required
objective_target str | dict[str, float]

A string representing the objective variable or a dictionary mapping variables to their coefficients in the objective function.

required
objective_sense int

The sense of the objective function (1 for maximization, -1 for minimization).

required

Returns:

Type Description
Model

The modified COBRA-k Model object with the extra linear constraint added.

Source code in cobrak/utilities.py
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
@validate_call(validate_return=True)
def add_objective_value_as_extra_linear_constraint(
    cobrak_model: Model,
    objective_value: float,
    objective_target: str | dict[str, float],
    objective_sense: int,
) -> Model:
    """Adds a linear constraint to a COBRA-k model that enforces the objective value.

    This function creates an extra linear constraint that limits the objective
    value to be within a small range around the original objective value. This
    can be useful for enforcing constraints during model manipulation or
    optimization.

    Args:
        cobrak_model: The COBRA-k Model object to be modified.
        objective_value: The original objective value.
        objective_target: A string representing the objective variable or a dictionary
            mapping variables to their coefficients in the objective function.
        objective_sense: The sense of the objective function (1 for maximization, -1 for minimization).

    Returns:
        The modified COBRA-k Model object with the extra linear constraint added.
    """
    if is_objsense_maximization(objective_sense):
        lower_value = objective_value - 1e-12
        upper_value = None
    else:
        lower_value = None
        upper_value = objective_value + 1e-12

    if type(objective_target) is str:
        objective_target = {objective_target: 1.0}
    cobrak_model.extra_linear_constraints = [
        ExtraLinearConstraint(
            stoichiometries=objective_target,
            lower_value=lower_value,
            upper_value=upper_value,
        )
    ]
    return cobrak_model
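The bound logic above can be restated in a few lines of plain Python (a sketch, independent of COBRAk's classes): for a maximization, only a lower bound slightly below the achieved objective is enforced; for a minimization, only an upper bound slightly above it.

```python
def objective_bounds(objective_value: float, maximize: bool, eps: float = 1e-12):
    """Return (lower, upper) bounds that pin solutions near objective_value.

    For maximization only a lower bound is needed (the objective may not drop
    below the optimum); for minimization only an upper bound.
    """
    if maximize:
        return objective_value - eps, None
    return None, objective_value + eps

max_bounds = objective_bounds(0.87, maximize=True)
min_bounds = objective_bounds(0.87, maximize=False)
```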

add_statuses_to_optimziation_dict(optimization_dict, pyomo_results)

Adds solver statuses to the optimization dict.

This includes:

  • SOLVER_STATUS_KEY's value, which is 0 for ok, 1 for warning, and higher values for problems.
  • TERMINATION_CONDITION_KEY's value, which is 0.1 for globally optimal, 0.2 for optimal, 0.3 for locally optimal, and >=1 for any result with problems.
  • ALL_OK_KEY's value, which is True if SOLVER_STATUS_KEY's value is 0 and TERMINATION_CONDITION_KEY's value is >= 0 and < 1.

Parameters:

Name Type Description Default
optimization_dict dict[str, float]

The optimization dict

required
pyomo_results SolverResults

The pyomo results object

required

Raises:

Type Description
ValueError

Unknown pyomo_results.solver.status or termination_condition

Returns:

Type Description
dict[str, float]

dict[str, float]: The pyomo results dict with the added statuses.

Source code in cobrak/utilities.py
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
def add_statuses_to_optimziation_dict(
    optimization_dict: dict[str, float], pyomo_results: SolverResults
) -> dict[str, float]:
    """Adds solver statuses to the optimization dict.

    This includes:
    * SOLVER_STATUS_KEY's value, which is 0 for ok, 1 for warning
       and higher values for problems.
    * TERMINATION_CONDITION_KEY's value, which is 0.1 for globally optimal,
      0.2 for optimal, 0.3 for locally optimal and >=1 for any result with problems.
    * ALL_OK_KEY's value, which is True if SOLVER_STATUS_KEY's value == 0
      and 0 <= TERMINATION_CONDITION_KEY's value < 1.

    Args:
        optimization_dict (dict[str, float]): The optimization dict
        pyomo_results (SolverResults): The pyomo results object

    Raises:
        ValueError: Unknown pyomo_results.solver.status or termination_condition

    Returns:
        dict[str, float]: The pyomo results dict with the added statuses.
    """
    solver_status = get_solver_status_from_pyomo_results(pyomo_results)

    termination_condition = get_termination_condition_from_pyomo_results(pyomo_results)

    optimization_dict[SOLVER_STATUS_KEY] = solver_status
    optimization_dict[TERMINATION_CONDITION_KEY] = termination_condition
    optimization_dict[ALL_OK_KEY] = (
        termination_condition >= 0 and termination_condition < 1
    ) and (solver_status == 0)

    return optimization_dict
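The ALL_OK_KEY check reduces to a simple predicate; here is a self-contained restatement (the numeric encodings follow the status conventions described above):

```python
SOLVER_STATUS_OK = 0  # encoding of an "ok" solver status, as described above

def is_all_ok(solver_status: float, termination_condition: float) -> bool:
    """True iff the solver reported 'ok' (status 0) and the termination
    condition encodes some flavor of optimality (0 <= value < 1)."""
    return solver_status == SOLVER_STATUS_OK and 0 <= termination_condition < 1

# 0.1 = globally optimal, 0.2 = optimal, 0.3 = locally optimal, >= 1 = problem
```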

apply_error_correction_on_model(cobrak_model, correction_result, min_abs_error_value=0.01, min_rel_error_value=0.01, verbose=False)

Applies error corrections to a COBRA-k model based on a correction result dictionary.

This function iterates through the correction_result dictionary and applies corrections to reaction k_cat values, Michaelis-Menten constants (k_M), Gibbs free energy changes (ΔᵣG'°), as well as inhibition terms (k_I) and activation terms (k_A). Corrections are applied only if the relative (for all parameters except ΔᵣG'°) or absolute (for ΔᵣG'°) error exceeds the specified thresholds.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k model to be corrected.

required
correction_result dict[str, float]

A dictionary containing error correction values. Keys are expected to contain information about the reaction, metabolite or other variable value being corrected.

required
min_abs_error_value NonNegativeFloat

The minimum absolute error value for applying corrections.

0.01
min_rel_error_value NonNegativeFloat

The minimum relative error value for applying corrections.

0.01
verbose bool

If True, prints details of the corrections being applied.

False

Returns:

Type Description
Model

A deep copy of the COBRAk model with the error corrections applied.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def apply_error_correction_on_model(
    cobrak_model: Model,
    correction_result: dict[str, float],
    min_abs_error_value: NonNegativeFloat = 0.01,
    min_rel_error_value: NonNegativeFloat = 0.01,
    verbose: bool = False,
) -> Model:
    """Applies error corrections to a COBRA-k model based on a correction result dictionary.

    This function iterates through the `correction_result` dictionary and applies corrections
    to reaction k_cat values, Michaelis-Menten constants (k_M), Gibbs free energy changes (ΔᵣG'°),
    as well as inhibition terms (k_I) and activation terms (k_A).
    Corrections are applied only if the relative (for all parameters except ΔᵣG'°) or
    absolute (for ΔᵣG'°) error exceeds the specified thresholds.

    Args:
        cobrak_model: The COBRA-k model to be corrected.
        correction_result: A dictionary containing error correction values. Keys are expected to
            contain information about the reaction, metabolite, or other variable value being corrected.
        min_abs_error_value: The minimum absolute error value for applying corrections.
        min_rel_error_value: The minimum relative error value for applying corrections.
        verbose: If True, prints details of the corrections being applied.

    Returns:
        A deep copy of the COBRAk model with the error corrections applied.
    """
    changed_model = deepcopy(cobrak_model)
    error_entries = {
        key: value
        for key, value in correction_result.items()
        if key.startswith(ERROR_VAR_PREFIX)
    }
    for key, value in error_entries.items():
        if "_kcat_times_e_" in key:
            reac_id = key.split("_kcat_times_e_")[1]
            enzyme_id = get_reaction_enzyme_var_id(
                reac_id, cobrak_model.reactions[reac_id]
            )
            k_cat = cobrak_model.reactions[reac_id].enzyme_reaction_data.k_cat
            enzyme_conc = correction_result[enzyme_id]
            e_times_kcat = k_cat * enzyme_conc
            if e_times_kcat == 0.0:
                continue
            kcat_correction = (e_times_kcat + value) / e_times_kcat
            if (kcat_correction - 1.0) < min_rel_error_value:
                continue
            changed_model.reactions[
                reac_id
            ].enzyme_reaction_data.k_cat *= kcat_correction
            if verbose:
                print(
                    f"Correct kcat of {reac_id} from {k_cat} to {changed_model.reactions[reac_id].enzyme_reaction_data.k_cat}"
                )
        elif key.endswith(("_substrate", "_product")):
            reac_id = key.split("____")[0].replace(ERROR_VAR_PREFIX + "_", "")
            met_id = (
                key.split("____")[1].replace("_substrate", "").replace("_product", "")
            )
            original_km = cobrak_model.reactions[reac_id].enzyme_reaction_data.k_ms[
                met_id
            ]
            if key.endswith("_product"):
                new_value = exp(log(original_km) + value)
                if new_value / original_km < (min_rel_error_value + 1.0):
                    continue
                changed_model.reactions[reac_id].enzyme_reaction_data.k_ms[met_id] = (
                    exp(log(original_km) + value)
                )
            else:
                new_value = exp(log(original_km) - value)
                if (original_km / new_value) < (min_rel_error_value + 1.0):
                    continue
                changed_model.reactions[reac_id].enzyme_reaction_data.k_ms[met_id] = (
                    exp(log(original_km) - value)
                )
            if verbose:
                print(
                    f"Correct km of {met_id} in {reac_id} from {original_km} to {changed_model.reactions[reac_id].enzyme_reaction_data.k_ms[met_id]}"
                )
        elif key.endswith("_iota"):
            reac_id = key.split("____")[1]
            met_id = key.split("____")[2]
            original_ki = cobrak_model.reactions[reac_id].enzyme_reaction_data.k_is[
                met_id
            ]
            new_value = exp(log(original_ki) + value)
            if new_value / original_ki < (min_rel_error_value + 1.0):
                continue
            changed_model.reactions[reac_id].enzyme_reaction_data.k_is[met_id] = exp(
                log(original_ki) + value
            )
            if verbose:
                print(
                    f"Correct ki of {met_id} in {reac_id} from {original_ki} to {changed_model.reactions[reac_id].enzyme_reaction_data.k_is[met_id]}"
                )
        elif key.endswith("_alpha"):
            reac_id = key.split("____")[1]
            met_id = key.split("____")[2]
            original_ka = cobrak_model.reactions[reac_id].enzyme_reaction_data.k_as[
                met_id
            ]
            new_value = exp(log(original_ka) + value)
            if new_value / original_ka < (min_rel_error_value + 1.0):
                continue
            changed_model.reactions[reac_id].enzyme_reaction_data.k_as[met_id] = exp(
                log(original_ka) + value
            )
            if verbose:
                print(
                    f"Correct ka of {met_id} in {reac_id} from {original_ka} to {changed_model.reactions[reac_id].enzyme_reaction_data.k_as[met_id]}"
                )
        elif "dG0_" in key:
            if value < min_abs_error_value:
                continue
            reac_id = key.split("dG0_")[1]
            changed_model.reactions[reac_id].dG0 -= value
            if verbose:
                original_dG0 = cobrak_model.reactions[reac_id].dG0
                print(
                    f"Correct ΔG'° {reac_id} from {original_dG0} to {changed_model.reactions[reac_id].dG0}"
                )

    return changed_model

apply_variability_dict(model, cobrak_model, variability_dict, error_scenario={}, abs_epsilon=1e-05)

Applies the variability data as new variable bounds in the pyomo model

I.e., if the variability of a variable A is [-10;10], A is now set to be -10 <= A <= 10 by changing its lower and upper bound.

Parameters:

Name Type Description Default
model ConcreteModel

The pyomo model

required
variability_dict dict[str, tuple[float, float]]

The variability data

required
abs_epsilon NonNegativeFloat

Values with an absolute magnitude below this threshold are treated as 0.0. Defaults to 1e-5.

1e-05

Returns:

Name Type Description
ConcreteModel ConcreteModel

The pyomo model with newly set variable bounds

Source code in cobrak/utilities.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True), validate_return=True)
def apply_variability_dict(
    model: ConcreteModel,
    cobrak_model: Model,  # noqa: ARG001
    variability_dict: dict[str, tuple[float, float]],
    error_scenario: dict[str, tuple[float, float]] = {},
    abs_epsilon: NonNegativeFloat = 1e-5,
) -> ConcreteModel:
    """Applies the variability data as new variable bounds in the pyomo model

    I.e., if the variability of a variable A is [-10;10],
    A is now set to be -10 <= A <= 10 by changing
    its lower and upper bound.

    Args:
        model (ConcreteModel): The pyomo model
        variability_dict (dict[str, tuple[float, float]]): The variability data
        error_scenario (dict[str, tuple[float, float]], optional): Variables listed here are skipped. Defaults to {}.
        abs_epsilon (NonNegativeFloat, optional): Values with an absolute magnitude below this threshold are treated as 0.0. Defaults to 1e-5.

    Returns:
        ConcreteModel: The pyomo model with newly set variable bounds
    """
    model_varnames = get_model_var_names(model)
    for var_id, variability in variability_dict.items():
        if var_id in error_scenario:
            continue
        try:
            if abs(variability[0]) < abs_epsilon:
                getattr(model, var_id).setlb(0.0)
            else:
                lbchange_var_id = f"{ERROR_BOUND_LOWER_CHANGE_PREFIX}{var_id}"
                if lbchange_var_id in model_varnames:
                    getattr(model, var_id).setlb(
                        variability[0] - getattr(model, lbchange_var_id).value
                    )
                else:
                    getattr(model, var_id).setlb(variability[0])
            if abs(variability[1]) < abs_epsilon:
                getattr(model, var_id).setub(0.0)
            else:
                ubchange_var_id = f"{ERROR_BOUND_UPPER_CHANGE_PREFIX}{var_id}"
                if ubchange_var_id in model_varnames:
                    getattr(model, var_id).setub(
                        variability[1] + getattr(model, ubchange_var_id).value
                    )
                else:
                    getattr(model, var_id).setub(variability[1])
        except AttributeError:
            pass
    return model
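The per-variable bound update can be summarized without pyomo (a sketch; the error-bound change variables are omitted): variability limits with an absolute magnitude below abs_epsilon are snapped to exactly 0.0, so the solver sees a clean fixed bound instead of numerical noise.

```python
def clamp_bounds(lb: float, ub: float, abs_epsilon: float = 1e-5) -> tuple[float, float]:
    """Snap near-zero variability limits to exactly 0.0, as the bound update above does."""
    new_lb = 0.0 if abs(lb) < abs_epsilon else lb
    new_ub = 0.0 if abs(ub) < abs_epsilon else ub
    return new_lb, new_ub

bounds = clamp_bounds(-1e-7, 2.5)  # tiny negative lower limit is treated as 0.0
```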

combine_enzyme_reaction_datasets(datasets)

Combines the enzyme reaction data from the given sources

The first given dataset has precedence, meaning that its data (k_cats, k_ms, ...) will be set first. For any reaction/metabolite where data is missing, it is then looked up in the second given dataset, then in the third and so on.

Parameters:

Name Type Description Default
datasets list[dict[str, EnzymeReactionData | None]]

The enzyme reaction datasets

required

Returns:

Type Description
dict[str, EnzymeReactionData | None]

dict[str, EnzymeReactionData | None]: The combined enzyme reaction data

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def combine_enzyme_reaction_datasets(
    datasets: list[dict[str, EnzymeReactionData | None]],
) -> dict[str, EnzymeReactionData | None]:
    """Combines the enzyme reaction data from the given sources

    The first given dataset has precedence, meaning that its data (k_cats, k_ms, ...)
    will be set first. For any reaction/metabolite where data is missing, it is then looked
    up in the second given dataset, then in the third and so on.

    Args:
        datasets (list[dict[str, EnzymeReactionData  |  None]]): The enzyme reaction datasets

    Returns:
        dict[str, EnzymeReactionData | None]: The combined enzyme reaction data
    """
    combined_data: dict[str, EnzymeReactionData] = {}
    for dataset in datasets:
        for reac_id, enzyme_reaction_data in dataset.items():
            if enzyme_reaction_data is None:
                continue

            if (reac_id not in combined_data) or (
                combined_data[reac_id].k_cat_references[0].tax_distance
                > enzyme_reaction_data.k_cat_references[0].tax_distance
            ):
                combined_data[reac_id] = EnzymeReactionData(
                    identifiers=enzyme_reaction_data.identifiers,
                    k_cat=enzyme_reaction_data.k_cat,
                    k_cat_references=enzyme_reaction_data.k_cat_references,
                )

            for met_id, k_m in enzyme_reaction_data.k_ms.items():
                if met_id not in combined_data[reac_id].k_ms or (
                    combined_data[reac_id].k_m_references[met_id][0].tax_distance
                    > enzyme_reaction_data.k_m_references[met_id][0].tax_distance
                ):
                    combined_data[reac_id].k_ms[met_id] = k_m
                    combined_data[reac_id].k_m_references[met_id] = (
                        enzyme_reaction_data.k_m_references[met_id]
                    )

            for met_id, k_i in enzyme_reaction_data.k_is.items():
                if met_id not in combined_data[reac_id].k_is or (
                    combined_data[reac_id].k_i_references[met_id][0].tax_distance
                    > enzyme_reaction_data.k_i_references[met_id][0].tax_distance
                ):
                    combined_data[reac_id].k_is[met_id] = k_i
                    combined_data[reac_id].k_i_references[met_id] = (
                        enzyme_reaction_data.k_i_references[met_id]
                    )

            for met_id, k_a in enzyme_reaction_data.k_as.items():
                if met_id not in combined_data[reac_id].k_as or (
                    combined_data[reac_id].k_a_references[met_id][0].tax_distance
                    > enzyme_reaction_data.k_a_references[met_id][0].tax_distance
                ):
                    combined_data[reac_id].k_as[met_id] = k_a
                    combined_data[reac_id].k_a_references[met_id] = (
                        enzyme_reaction_data.k_a_references[met_id]
                    )

            hills = enzyme_reaction_data.hill_coefficients
            for met_id in hills.kappa:
                if met_id not in combined_data[reac_id].hill_coefficients.kappa or (
                    combined_data[reac_id]
                    .hill_coefficient_references.kappa[met_id][0]
                    .tax_distance
                    > enzyme_reaction_data.hill_coefficient_references.kappa[met_id][
                        0
                    ].tax_distance
                ):
                    combined_data[reac_id].hill_coefficients.kappa[met_id] = (
                        hills.kappa[met_id]
                    )
                    combined_data[reac_id].hill_coefficient_references.kappa[met_id] = (
                        enzyme_reaction_data.hill_coefficient_references.kappa[met_id]
                    )
            for met_id in hills.iota:
                if met_id not in combined_data[reac_id].hill_coefficients.iota or (
                    combined_data[reac_id]
                    .hill_coefficient_references.iota[met_id][0]
                    .tax_distance
                    > enzyme_reaction_data.hill_coefficient_references.iota[met_id][
                        0
                    ].tax_distance
                ):
                    combined_data[reac_id].hill_coefficients.iota[met_id] = hills.iota[
                        met_id
                    ]
                    combined_data[reac_id].hill_coefficient_references.iota[met_id] = (
                        enzyme_reaction_data.hill_coefficient_references.iota[met_id]
                    )
            for met_id in hills.alpha:
                if met_id not in combined_data[reac_id].hill_coefficients.alpha or (
                    combined_data[reac_id]
                    .hill_coefficient_references.alpha[met_id][0]
                    .tax_distance
                    > enzyme_reaction_data.hill_coefficient_references.alpha[met_id][
                        0
                    ].tax_distance
                ):
                    combined_data[reac_id].hill_coefficients.alpha[met_id] = (
                        hills.alpha[met_id]
                    )
                    combined_data[reac_id].hill_coefficient_references.alpha[met_id] = (
                        enzyme_reaction_data.hill_coefficient_references.alpha[met_id]
                    )

    return combined_data

compare_multiple_results_to_best(cobrak_model, results, is_maximization, min_reac_flux=1e-08)

Compares multiple optimization results to the best result and returns a dictionary with statistics and comparisons.

This function first identifies the best result based on the objective value. It then compares each result to the best result and calculates statistics and comparisons. The comparisons include the difference between the objective values and the reaction fluxes. Reactions with fluxes below the minimum reaction flux threshold are ignored.

Parameters:

- cobrak_model (Model): The COBRA-k model used for the optimization.
- results (list[dict[str, float]]): A list of optimization results.
- is_maximization (bool): Whether the optimization is a maximization problem.
- min_reac_flux (float, optional): The minimum reaction flux to consider. Defaults to 1e-8.

Returns:

- dict[int, tuple[dict[str, float], dict[int, list[str]]]]: A dictionary where each key is the index of a result and each value is a tuple containing:
  - A dictionary with reaction statistics, including:
    - "min": The minimum absolute flux difference.
    - "max": The maximum absolute flux difference.
    - "sum": The sum of all absolute flux differences.
    - "mean": The mean of all absolute flux differences.
    - "median": The median of all absolute flux differences.
    - "obj_difference": The difference between the objective value of the current result and the best result.
  - A dictionary with reaction comparisons, where each key is an integer indicating which result has a higher flux:
    - 0: The best result has a higher flux.
    - 1: The current result has a higher flux.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def compare_multiple_results_to_best(
    cobrak_model: Model,
    results: list[dict[str, float]],
    is_maximization: bool,
    min_reac_flux: float = 1e-8,
) -> dict[int, tuple[dict[str, float], dict[int, list[str]]]]:
    """Compares multiple optimization results to the best result and returns a dictionary with statistics and comparisons.

    This function first identifies the best result based on the objective value.
    It then compares each result to the best result and calculates statistics and comparisons.
    The comparisons include the difference between the objective values and the reaction fluxes.
    Reactions with fluxes below the minimum reaction flux threshold are ignored.

    Args:
    cobrak_model (Model): The COBRA-k model used for the optimization.
    results (list[dict[str, float]]): A list of optimization results.
    is_maximization (bool): Whether the optimization is a maximization problem.
    min_reac_flux (float, optional): The minimum reaction flux to consider. Defaults to 1e-8.

    Returns:
    dict[int, tuple[dict[str, float], dict[int, list[str]]]]: A dictionary where each key is the index of a result and each value is a tuple containing:
    - A dictionary with reaction statistics, including:
    - "min": The minimum absolute flux difference.
    - "max": The maximum absolute flux difference.
    - "sum": The sum of all absolute flux differences.
    - "mean": The mean of all absolute flux differences.
    - "median": The median of all absolute flux differences.
    - "obj_difference": The difference between the objective value of the current result and the best result.
    - A dictionary with reaction comparisons, where each key is an integer indicating which result has a higher flux:
    - 0: The best result has a higher flux.
    - 1: The current result has a higher flux.
    """
    objective_values = [x[OBJECTIVE_VAR_NAME] for x in results]
    best_objective = max(objective_values) if is_maximization else min(objective_values)
    best_idx = objective_values.index(best_objective)

    comparisons: dict[int, tuple[dict[str, float], dict[int, list[str]]]] = {}
    for idx in range(len(results)):
        if idx == best_idx:
            continue
        obj_difference = (
            objective_values[idx] - best_objective
            if is_maximization
            else best_objective - objective_values[idx]
        )
        reac_statistics, reac_comparisons = _compare_two_results_with_statistics(
            cobrak_model,
            results[idx],
            results[best_idx],
            min_reac_flux,
        )
        reac_statistics["obj_difference"] = obj_difference
        comparisons[idx] = (reac_statistics, reac_comparisons)

    return comparisons
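
The best-result selection and the sign convention of `obj_difference` can be sketched in isolation with plain Python (the objective values below are illustrative; no COBRAk model is needed):

```python
# Illustrative objective values for three optimization results (made up).
objective_values = [0.8, 1.0, 0.95]
is_maximization = True

# The best result is the maximum (or, for minimization problems, the minimum).
best_objective = max(objective_values) if is_maximization else min(objective_values)
best_idx = objective_values.index(best_objective)

# obj_difference is <= 0 for every non-best result in both optimization
# senses: it quantifies how much worse a result is than the best one.
obj_differences = {
    idx: (
        objective_values[idx] - best_objective
        if is_maximization
        else best_objective - objective_values[idx]
    )
    for idx in range(len(objective_values))
    if idx != best_idx
}
```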

compare_optimization_result_fluxes(cobrak_model, result_1, result_2, min_reac_flux=1e-08)

Compares the fluxes of two optimization results and returns a dictionary with the absolute differences and indicators of which result has a higher flux.

This function first corrects the fluxes of the two results by considering the forward and reverse reactions. It then calculates the absolute differences between the corrected fluxes and determines which result has a higher flux for each reaction. Reactions with fluxes below the minimum reaction flux threshold are ignored.

Parameters:

- cobrak_model (Model): The COBRA-k model used for the optimization.
- result_1 (dict[str, float]): The first optimization result.
- result_2 (dict[str, float]): The second optimization result.
- min_reac_flux (float, optional): The minimum reaction flux to consider. Defaults to 1e-8.

Returns:

- dict[str, tuple[float, int]]: A dictionary where each key is a reaction ID and each value is a tuple containing:
  - The absolute difference between the fluxes of the two results (or the single flux, see below).
  - An indicator of which result carries the flux:
    - 0: Both results carry the reaction above the threshold; their absolute flux difference is stored.
    - 1: Only the first result carries the reaction; its flux is stored.
    - 2: Only the second result carries the reaction; its flux is stored.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def compare_optimization_result_fluxes(
    cobrak_model: Model,
    result_1: dict[str, float],
    result_2: dict[str, float],
    min_reac_flux: float = 1e-8,
) -> dict[str, tuple[float, int]]:
    """Compares the fluxes of two optimization results and returns a dictionary with the absolute differences and indicators of which result has a higher flux.

    This function first corrects the fluxes of the two results by considering the forward and reverse reactions.
    It then calculates the absolute differences between the corrected fluxes and determines which result has a higher flux for each reaction.
    Reactions with fluxes below the minimum reaction flux threshold are ignored.

    Args:
    cobrak_model (Model): The COBRA-k model used for the optimization.
    result_1 (dict[str, float]): The first optimization result.
    result_2 (dict[str, float]): The second optimization result.
    min_reac_flux (float, optional): The minimum reaction flux to consider. Defaults to 1e-8.

    Returns:
    dict[str, tuple[float, int]]: A dictionary where each key is a reaction ID and each value is a tuple containing:
    - The absolute difference between the fluxes of the two results.
    - An indicator of which result carries the flux:
    - 0: Both results carry the reaction above the threshold; their absolute flux difference is stored.
    - 1: Only the first result carries the reaction; its flux is stored.
    - 2: Only the second result carries the reaction; its flux is stored.
    """
    corrected_result_1: dict[str, float] = {}
    corrected_result_2: dict[str, float] = {}
    for result, corrected_result in [
        (result_1, corrected_result_1),
        (result_2, corrected_result_2),
    ]:
        for var_id in result:
            if var_id not in cobrak_model.reactions:
                continue
            flux = get_fwd_rev_corrected_flux(
                var_id,
                list(result.keys()),
                result,
                cobrak_model.fwd_suffix,
                cobrak_model.rev_suffix,
            )
            if flux >= min_reac_flux:
                corrected_result[var_id] = flux

    abs_results = {}
    for reac_id in cobrak_model.reactions:
        if (reac_id in corrected_result_1) and (reac_id in corrected_result_2):
            flux_1, flux_2 = corrected_result_1[reac_id], corrected_result_2[reac_id]
            # if other_id not in abs_results:
            abs_results[reac_id] = (abs(flux_1 - flux_2), 0)
        elif reac_id in corrected_result_1:
            flux_1 = corrected_result_1[reac_id]
            # if other_id not in abs_results:
            abs_results[reac_id] = (flux_1, 1)
        elif reac_id in corrected_result_2:
            flux_2 = corrected_result_2[reac_id]
            # if other_id not in abs_results:
            abs_results[reac_id] = (flux_2, 2)

    return abs_results
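
The absolute-difference dictionary and its 0/1/2 indicator can be reproduced with plain dicts (hypothetical reaction IDs and already flux-corrected values):

```python
# Hypothetical threshold-filtered, flux-corrected results.
corrected_result_1 = {"R1": 2.0, "R2": 1.5}
corrected_result_2 = {"R1": 0.5, "R3": 3.0}

abs_results = {}
for reac_id in ("R1", "R2", "R3"):
    in_1 = reac_id in corrected_result_1
    in_2 = reac_id in corrected_result_2
    if in_1 and in_2:
        # 0: both results carry the reaction; store the flux difference
        abs_results[reac_id] = (
            abs(corrected_result_1[reac_id] - corrected_result_2[reac_id]),
            0,
        )
    elif in_1:
        # 1: only the first result carries the reaction
        abs_results[reac_id] = (corrected_result_1[reac_id], 1)
    elif in_2:
        # 2: only the second result carries the reaction
        abs_results[reac_id] = (corrected_result_2[reac_id], 2)
```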

compare_optimization_result_reaction_uses(cobrak_model, results, min_abs_flux=1e-06)

Compares the usage of reactions across multiple optimization (e.g. FBA) results.

This function analyzes the frequency of reaction usage in a set of optimization results from a COBRAk Model. It identifies which reactions are used in each solution and prints the number of solutions in which each reaction is active, considering a minimum absolute flux threshold.

Parameters:

- cobrak_model (Model): The COBRAk model containing the reactions to be analyzed.
- results (list[dict[str, float]]): A list of dictionaries, each representing an optimization result with reaction IDs as keys and their corresponding flux values as values.
- min_abs_flux (float, optional): The minimum absolute flux value to consider a reaction as used. Reactions with absolute flux values below this threshold are ignored. Defaults to 1e-6.

Returns:

- None: This function does not return a value. It prints the number of solutions in which each reaction is used, grouped by the number of solutions.
Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def compare_optimization_result_reaction_uses(
    cobrak_model: Model,
    results: list[dict[str, float]],
    min_abs_flux: NonNegativeFloat = 1e-6,
) -> None:
    """Compares the usage of reactions across multiple optimization (e.g. FBA) results.

    This function analyzes the frequency of reaction usage in a set of optimization results
    from a COBRAk Model. It identifies which reactions are used in each solution and prints
    the number of solutions in which each reaction is active, considering a minimum absolute
    flux threshold.

    Parameters:
    - cobrak_model (Model): The COBRAk model containing the reactions to be analyzed.
    - results (list[dict[str, float]]): A list of dictionaries, each representing an optimization
      result with reaction IDs as keys and their corresponding flux values as values.
    - min_abs_flux (float, optional): The minimum absolute flux value to consider a reaction as used.
      Reactions with absolute flux values below this threshold are ignored. Defaults to 1e-6.

    Returns:
    - None: This function does not return a value. It prints the number of solutions in which each
      reaction is used, grouped by the number of solutions.
    """
    results = deepcopy(results)
    results = [
        get_base_id_optimzation_result(
            cobrak_model,
            result,
        )
        for result in results
    ]

    reac_ids: list[str] = [
        get_base_id(
            reac_id,
            cobrak_model.fwd_suffix,
            cobrak_model.rev_suffix,
            cobrak_model.reac_enz_separator,
        )
        for reac_id in cobrak_model.reactions
    ]
    reacs_to_uses: dict[str, list[int]] = {reac_id: [] for reac_id in reac_ids}
    for num, result in enumerate(results):
        for reac_id in reac_ids:
            if reac_id not in result:
                continue
            if abs(result[reac_id]) <= min_abs_flux:
                continue
            if num in reacs_to_uses[reac_id]:
                continue
            reacs_to_uses[reac_id].append(num)
    min_num_results = min(len(i) for i in reacs_to_uses.values())
    max_num_results = max(len(i) for i in reacs_to_uses.values())
    print(min_num_results, max_num_results)
    for num_results in range(min_num_results, max_num_results + 1):
        print(f"Reactions used in {num_results} solutions:")
        print(
            [
                (reac_id, uses)
                for reac_id, uses in reacs_to_uses.items()
                if len(uses) == num_results
            ]
        )
        print("===")

count_last_equal_elements(lst)

Counts the number of consecutive equal elements from the end of the list.

Parameters:

- lst (list[Any]): A Python list.

Returns:

- int: The number of consecutive equal elements from the end of the list.

Examples:

>>> count_last_equal_elements([1.0, 2.0, 1.0, 3.0, 3.0, 3.0])
3
>>> count_last_equal_elements([1.0, 2.0, 2.0, 1.0])
1
>>> count_last_equal_elements([1.0, 1.0, 1.0, 1.0])
4
>>> count_last_equal_elements([])
0

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def count_last_equal_elements(lst: list[Any]) -> int:
    """Counts the number of consecutive equal elements from the end of the list.

    Parameters:
    lst (list[Any]): A Python list.

    Returns:
    int: The number of consecutive equal elements from the end of the list.

    Examples:
    >>> count_last_equal_elements([1.0, 2.0, 1.0, 3.0, 3.0, 3.0])
    3
    >>> count_last_equal_elements([1.0, 2.0, 2.0, 1.0])
    1
    >>> count_last_equal_elements([1.0, 1.0, 1.0, 1.0])
    4
    >>> count_last_equal_elements([])
    0
    """
    if not lst:
        return 0  # Return 0 if the list is empty

    count = 1  # Start with the last element
    last_element = lst[-1]

    # Iterate from the second last element to the beginning
    for i in range(len(lst) - 2, -1, -1):
        if lst[i] == last_element:
            count += 1
        else:
            break  # Stop counting when a different element is found

    return count

create_cnapy_scenario_out_of_optimization_dict(path, cobrak_model, optimization_dict, desplit_reactions=True)

Create a CNApy scenario file from an optimization dictionary and a COBRAk Model.

Parameters:

Name Type Description Default
path str

The file path where the CNApy scenario will be saved.

required
cobrak_model Model

The COBRAk Model.

required
optimization_dict dict[str, float]

An optimization result dict.

required
desplit_reactions bool

Whether the fluxes of split reversible reactions shall be recombined. Defaults to True.

True

Returns:

Name Type Description
None None

The function saves the CNApy scenario to the specified path.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def create_cnapy_scenario_out_of_optimization_dict(
    path: str,
    cobrak_model: Model,
    optimization_dict: dict[str, float],
    desplit_reactions: bool = True,
) -> None:
    """Create a CNApy scenario file from an optimization dictionary and a COBRAk Model.

    Args:
        path (str): The file path where the CNApy scenario will be saved.
        cobrak_model (Model): The COBRAk Model.
        optimization_dict (dict[str, float]): An optimization result dict.
        desplit_reactions (bool, optional): Whether the fluxes of split reversible reactions
                                 shall be recombined. Defaults to True.

    Returns:
        None: The function saves the CNApy scenario to the specified path.
    """
    base_id_result = (
        get_base_id_optimzation_result(
            cobrak_model,
            optimization_dict,
        )
        if desplit_reactions
        else optimization_dict
    )
    cnapy_scenario: dict[str, tuple[float, float]] = {
        key: (value, value) for key, value in base_id_result.items()
    }
    json_write(path, cnapy_scenario)
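
The resulting scenario maps each (desplit) reaction ID to a pinned flux pair `(value, value)`. A minimal sketch of the output format, using illustrative fluxes and the standard library's `json` in place of COBRAk's `json_write` helper (which presumably serializes to JSON):

```python
import json
import os
import tempfile

# Hypothetical desplit fluxes; a CNApy scenario pins each flux as (value, value).
base_id_result = {"R1": 1.2, "EX_glc": -10.0}
cnapy_scenario = {key: (value, value) for key, value in base_id_result.items()}

# Plain json.dump shown here as a stand-in for json_write.
path = os.path.join(tempfile.mkdtemp(), "scenario.scen")
with open(path, "w") as f:
    json.dump(cnapy_scenario, f)
```

Note that JSON has no tuple type, so the pairs are stored as two-element arrays.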

create_cnapy_scenario_out_of_variability_dict(path, cobrak_model, variability_dict, desplit_reactions=True)

Create a CNApy scenario file from a variability dictionary and a COBRAk model.

Parameters:

Name Type Description Default
path str

The file path where the CNApy scenario file will be saved.

required
cobrak_model Model

The COBRA-k model containing reactions.

required
variability_dict dict[str, tuple[float, float]]

A dictionary mapping reaction IDs to their minimum and maximum flux values.

required
desplit_reactions bool

Whether the fluxes of split reversible reactions shall be recombined. Defaults to True.

True

Returns:

- None: The function saves the CNApy scenario to the specified path.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def create_cnapy_scenario_out_of_variability_dict(
    path: str,
    cobrak_model: Model,
    variability_dict: dict[str, tuple[float, float]],
    desplit_reactions: bool = True,
) -> None:
    """Create a CNApy scenario file from a variability dictionary and a COBRAk model.

    Args:
        path (str): The file path where the CNApy scenario file will be saved.
        cobrak_model (Model): The COBRA-k model containing reactions.
        variability_dict (dict[str, tuple[float, float]]): A dictionary mapping reaction IDs to their minimum and maximum flux values.
        desplit_reactions (bool, optional): Whether the fluxes of split reversible reactions
                                 shall be recombined. Defaults to True.

    Returns:
        None: The function saves the CNApy scenario to the specified path.
    """
    cnapy_scenario: dict[str, list[float]] = {}

    for reac_id in cobrak_model.reactions:
        if reac_id not in variability_dict:
            continue
        base_id = (
            get_base_id(
                reac_id,
                cobrak_model.fwd_suffix,
                cobrak_model.rev_suffix,
                cobrak_model.reac_enz_separator,
            )
            if desplit_reactions
            else reac_id
        )

        multiplier = -1 if reac_id.endswith(cobrak_model.rev_suffix) else 1
        min_flux = variability_dict[reac_id][0]
        max_flux = variability_dict[reac_id][1]

        if base_id not in cnapy_scenario:
            cnapy_scenario[base_id] = [0.0, 0.0]

        cnapy_scenario[base_id][0] += multiplier * min_flux
        cnapy_scenario[base_id][1] += multiplier * max_flux

    json_write(path, cnapy_scenario)
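
For split reversible reactions, the recombination adds the reverse direction's bounds with a -1 multiplier. A minimal standalone sketch, assuming "_fwd"/"_rev" suffixes (the actual suffixes come from the model's `fwd_suffix`/`rev_suffix` attributes):

```python
# Hypothetical variability result for one reversible reaction split into
# forward and reverse parts ("_fwd"/"_rev" suffixes are assumptions here).
variability = {"R1_fwd": (0.0, 5.0), "R1_rev": (0.0, 2.0)}

scenario: dict[str, list[float]] = {}
for reac_id, (min_flux, max_flux) in variability.items():
    multiplier = -1 if reac_id.endswith("_rev") else 1
    base_id = reac_id.removesuffix("_fwd").removesuffix("_rev")
    if base_id not in scenario:
        scenario[base_id] = [0.0, 0.0]
    # Reverse bounds enter with a flipped sign, mirroring the function above.
    scenario[base_id][0] += multiplier * min_flux
    scenario[base_id][1] += multiplier * max_flux
```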

delete_orphaned_metabolites_and_enzymes(cobrak_model)

Removes orphaned metabolites and enzymes from a COBRAk model.

This function cleans up a COBRAk model by deleting metabolites and enzymes that are not used in any reactions. A metabolite is considered orphaned if it does not appear in the stoichiometries of any reactions. Similarly, an enzyme is considered orphaned if it is not associated with any enzyme reaction data in the model's reactions.

Parameters:

- cobrak_model (Model): The COBRAk model to be cleaned. This model contains reactions, metabolites, and enzymes that may include unused entries.

Returns:

- Model: The cleaned COBRAk model with orphaned metabolites and enzymes removed.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def delete_orphaned_metabolites_and_enzymes(cobrak_model: Model) -> Model:
    """Removes orphaned metabolites and enzymes from a COBRAk model.

    This function cleans up a COBRAk model by deleting metabolites and enzymes that are not used
    in any reactions. A metabolite is considered orphaned if it does not appear in the stoichiometries
    of any reactions. Similarly, an enzyme is considered orphaned if it is not associated with any
    enzyme reaction data in the model's reactions.

    Parameters:
    - cobrak_model (Model): The COBRAk model to be cleaned. This model contains reactions,
      metabolites, and enzymes that may include unused entries.

    Returns:
    - Model: The cleaned COBRAk model with orphaned metabolites and enzymes removed.
    """
    used_metabolites = []
    used_enzyme_ids = []
    for reaction in cobrak_model.reactions.values():
        used_metabolites += list(reaction.stoichiometries.keys())

        if reaction.enzyme_reaction_data is not None:
            used_enzyme_ids += reaction.enzyme_reaction_data.identifiers

    # Delete metabolites
    mets_to_delete = [
        met_id for met_id in cobrak_model.metabolites if met_id not in used_metabolites
    ]
    for met_to_delete in mets_to_delete:
        del cobrak_model.metabolites[met_to_delete]

    # Delete enzymes
    enzymes_to_delete = [
        enzyme_id
        for enzyme_id in cobrak_model.enzymes
        if enzyme_id not in used_enzyme_ids
    ]
    for enzyme_to_delete in enzymes_to_delete:
        del cobrak_model.enzymes[enzyme_to_delete]

    return cobrak_model
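
The orphan detection itself is a simple membership check. A standalone sketch with plain dicts standing in for the model's reactions and metabolites (all names illustrative):

```python
# Reactions map metabolite IDs to stoichiometric coefficients.
reactions = {"R1": {"A": -1.0, "B": 1.0}}
metabolites = {"A": {}, "B": {}, "C": {}}  # "C" appears in no reaction

used_metabolites: set[str] = set()
for stoichiometries in reactions.values():
    used_metabolites |= set(stoichiometries)

# Delete every metabolite that no reaction references.
for met_id in [m for m in metabolites if m not in used_metabolites]:
    del metabolites[met_id]
```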

delete_unused_reactions_in_optimization_dict(cobrak_model, optimization_dict, exception_prefix='', delete_missing_reactions=True, min_abs_flux=1e-15, do_not_delete_with_z_var_one=True)

Delete unused reactions in a COBRAk model based on an optimization dictionary.

This function creates a deep copy of the provided COBRAk model and removes reactions that are either not present in the optimization dictionary or have flux values below a specified threshold. Optionally, reactions with a specific prefix can be excluded from deletion. Additionally, orphaned metabolites (those not used in any remaining reactions) are also removed.

Parameters:

Name Type Description Default
cobrak_model Model

COBRAk model containing reactions and metabolites.

required
optimization_dict dict[str, float]

Dictionary mapping reaction IDs to their optimized flux values.

required
exception_prefix str

A prefix for reaction IDs that should not be deleted. Defaults to "".

''
delete_missing_reactions bool

Whether to delete reactions not present in the optimization dictionary. Defaults to True.

True
min_abs_flux float

The minimum absolute flux value below which reactions are considered unused. Defaults to 1e-15.

1e-15
do_not_delete_with_z_var_one bool

Whether reactions whose z variable value is one are kept even if their flux is below the threshold. Defaults to True.

True

Returns:

Name Type Description
Model Model

A new COBRAk model with unused reactions and orphaned metabolites removed.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def delete_unused_reactions_in_optimization_dict(
    cobrak_model: Model,
    optimization_dict: dict[str, float],
    exception_prefix: str = "",
    delete_missing_reactions: bool = True,
    min_abs_flux: NonNegativeFloat = 1e-15,
    do_not_delete_with_z_var_one: bool = True,
) -> Model:
    """Delete unused reactions in a COBRAk model based on an optimization dictionary.

    This function creates a deep copy of the provided COBRAk model and removes reactions that are either not present
    in the optimization dictionary or have flux values below a specified threshold. Optionally,
    reactions with a specific prefix can be excluded from deletion.
    Additionally, orphaned metabolites (those not used in any remaining reactions) are also removed.

    Args:
        cobrak_model (Model): COBRAk model containing reactions and metabolites.
        optimization_dict (dict[str, float]): Dictionary mapping reaction IDs to their optimized flux values.
        exception_prefix (str, optional): A prefix for reaction IDs that should not be deleted. Defaults to "".
        delete_missing_reactions (bool, optional): Whether to delete reactions not present in the optimization dictionary. Defaults to True.
        min_abs_flux (float, optional): The minimum absolute flux value below which reactions are considered unused. Defaults to 1e-15.
        do_not_delete_with_z_var_one (bool, optional): Whether reactions whose z variable value is one are kept even if their flux is below the threshold. Defaults to True.

    Returns:
        Model: A new COBRAk model with unused reactions and orphaned metabolites removed.
    """
    cobrak_model = deepcopy(cobrak_model)
    reacs_to_delete: list[str] = []
    for reac_id in cobrak_model.reactions:
        to_delete = False
        if (reac_id not in optimization_dict) and delete_missing_reactions:
            to_delete = True
        elif (reac_id in optimization_dict) and abs(
            optimization_dict[reac_id]
        ) <= min_abs_flux:
            z_var_id = f"{Z_VAR_PREFIX}{reac_id}"
            if z_var_id in optimization_dict:
                # Keep the reaction if its z variable is (close to) one and
                # do_not_delete_with_z_var_one is set; otherwise, delete it.
                to_delete = not (
                    do_not_delete_with_z_var_one
                    and optimization_dict[z_var_id] > 1e-6
                )
            else:
                to_delete = True
        if to_delete:
            reacs_to_delete.append(reac_id)
    for reac_to_delete in reacs_to_delete:
        if (exception_prefix) and (reac_to_delete.startswith(exception_prefix)):
            continue
        del cobrak_model.reactions[reac_to_delete]
    return delete_orphaned_metabolites_and_enzymes(cobrak_model)

delete_unused_reactions_in_variability_dict(cobrak_model, variability_dict, extra_reacs_to_delete=[])

Delete unused reactions in a COBRAk model based on a variability dictionary.

This function creates a deep copy of the provided COBRA-k model and removes reactions that have both minimum and maximum flux values equal to zero, as indicated in the variability dictionary. Additionally, any extra reactions specified in the extra_reacs_to_delete list are also removed. Orphaned metabolites (those not used in any remaining reactions) are subsequently deleted, too.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions and metabolites.

required
variability_dict dict[str, tuple[float, float]]

A dictionary mapping reaction IDs to their minimum and maximum flux values.

required
extra_reacs_to_delete list[str]

A list of additional reaction IDs to be deleted. Defaults to an empty list.

[]

Returns:

Name Type Description
Model Model

A new COBRAk model with unused reactions and orphaned metabolites removed.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def delete_unused_reactions_in_variability_dict(
    cobrak_model: Model,
    variability_dict: dict[str, tuple[float, float]],
    extra_reacs_to_delete: list[str] = [],
) -> Model:
    """Delete unused reactions in a COBRAk model based on a variability dictionary.

    This function creates a deep copy of the provided COBRA-k model and removes reactions that have both minimum and maximum flux values
    equal to zero, as indicated in the variability dictionary.
    Additionally, any extra reactions specified in the `extra_reacs_to_delete` list are also removed.
    Orphaned metabolites (those not used in any remaining reactions) are subsequently deleted, too.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions and metabolites.
        variability_dict (dict[str, tuple[float, float]]): A dictionary mapping reaction IDs to their minimum and maximum flux values.
        extra_reacs_to_delete (list[str], optional): A list of additional reaction IDs to be deleted. Defaults to an empty list.

    Returns:
        Model: A new COBRAk model with unused reactions and orphaned metabolites removed.
    """
    cobrak_model = deepcopy(cobrak_model)
    reacs_to_delete: list[str] = [] + extra_reacs_to_delete
    for reac_id in cobrak_model.reactions:
        if (variability_dict[reac_id][0] == 0.0) and (
            variability_dict[reac_id][1] == 0.0
        ):
            reacs_to_delete.append(reac_id)
    for reac_to_delete in reacs_to_delete:
        del cobrak_model.reactions[reac_to_delete]

    return delete_orphaned_metabolites_and_enzymes(cobrak_model)

get_active_reacs_from_optimization_dict(cobrak_model, fba_dict)

Get a list of active reactions from an optimization (e.g. FBA (Flux Balance Analysis)) dictionary.

This function iterates through the reactions in a COBRAk model and identifies those that have a positive flux value in the provided FBA dictionary. Only reactions present in the optimization dictionary and with a flux greater than zero are considered active.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions.

required
fba_dict dict[str, float]

A dictionary mapping reaction IDs to their flux values from an optimization.

required

Returns:

Type Description
list[str]

list[str]: A list of reaction IDs that are active (i.e., have a positive flux) according to the optimization dictionary.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_active_reacs_from_optimization_dict(
    cobrak_model: Model,
    fba_dict: dict[str, float],
) -> list[str]:
    """Get a list of active reactions from an optimization (e.g. FBA (Flux Balance Analysis)) dictionary.

    This function iterates through the reactions in a COBRAk model and identifies those that have a positive flux value in the provided FBA dictionary.
    Only reactions present in the optimization dictionary and with a flux greater than zero are considered active.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions.
        fba_dict (dict[str, float]): A dictionary mapping reaction IDs to their flux values from an optimization.

    Returns:
        list[str]: A list of reaction IDs that are active (i.e., have a positive flux) according to the optimization dictionary.
    """
    active_reacs: list[str] = []
    for reac_id in cobrak_model.reactions:
        if reac_id not in fba_dict:
            continue
        if fba_dict[reac_id] > 0.0:
            active_reacs.append(reac_id)
    return active_reacs
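The filtering logic amounts to keeping only reactions that appear in the flux dictionary with a strictly positive value. A minimal stand-alone sketch (reaction IDs and flux values are made-up examples, not COBRAk data):

```python
# Illustrative stand-in for get_active_reacs_from_optimization_dict:
# keep only reactions present in the flux dict with strictly positive flux.
model_reac_ids = ["PGI", "PFK", "FBP", "PYK"]
fba_dict = {"PGI": 2.5, "PFK": 0.0, "PYK": 1.2}  # "FBP" is absent

active_reacs = [
    reac_id
    for reac_id in model_reac_ids
    if reac_id in fba_dict and fba_dict[reac_id] > 0.0
]
print(active_reacs)  # ['PGI', 'PYK']
```

Note that reactions with zero flux ("PFK") and reactions missing from the result ("FBP") are both treated as inactive.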

get_base_id(reac_id, fwd_suffix=REAC_FWD_SUFFIX, rev_suffix=REAC_REV_SUFFIX, reac_enz_separator=REAC_ENZ_SEPARATOR)

Extract the base ID from a reaction ID by removing specified suffixes and separators.

Processes a reaction ID to remove forward and reverse suffixes as well as any enzyme separators, to obtain the base reaction ID.

Parameters:

Name Type Description Default
reac_id str

The reaction ID to be processed.

required
fwd_suffix str

The suffix indicating forward reactions. Defaults to REAC_FWD_SUFFIX.

REAC_FWD_SUFFIX
rev_suffix str

The suffix indicating reverse reactions. Defaults to REAC_REV_SUFFIX.

REAC_REV_SUFFIX
reac_enz_separator str

The separator used between reaction and enzyme identifiers. Defaults to REAC_ENZ_SEPARATOR.

REAC_ENZ_SEPARATOR

Returns:

Name Type Description
str str

The base reaction ID with specified suffixes and separators removed.

Source code in cobrak/utilities.py, lines 963–990
@validate_call(validate_return=True)
def get_base_id(
    reac_id: str,
    fwd_suffix: str = REAC_FWD_SUFFIX,
    rev_suffix: str = REAC_REV_SUFFIX,
    reac_enz_separator: str = REAC_ENZ_SEPARATOR,
) -> str:
    """Extract the base ID from a reaction ID by removing specified suffixes and separators.

    Processes a reaction ID to remove forward and reverse suffixes
    as well as any enzyme separators, to obtain the base reaction ID.

    Args:
        reac_id (str): The reaction ID to be processed.
        fwd_suffix (str, optional): The suffix indicating forward reactions. Defaults to REAC_FWD_SUFFIX.
        rev_suffix (str, optional): The suffix indicating reverse reactions. Defaults to REAC_REV_SUFFIX.
        reac_enz_separator (str, optional): The separator used between reaction and enzyme identifiers. Defaults to REAC_ENZ_SEPARATOR.

    Returns:
        str: The base reaction ID with specified suffixes and separators removed.
    """
    reac_id_split = reac_id.split(reac_enz_separator)
    return (
        (reac_id_split[0] + "\b")
        .replace(f"{fwd_suffix}\b", "")
        .replace(f"{rev_suffix}\b", "")
        .replace("\b", "")
    )
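The implementation above uses a backspace-character sentinel (`"\b"`) to anchor suffix replacement at the end of the string. An equivalent, more conventional sketch of the same idea; the concrete values `"_FWD"`, `"_REV"`, and `"_ENZ_"` are assumptions for illustration (the real defaults come from COBRAk's `REAC_FWD_SUFFIX`, `REAC_REV_SUFFIX`, and `REAC_ENZ_SEPARATOR` constants):

```python
def base_id(reac_id: str, fwd: str = "_FWD", rev: str = "_REV", enz_sep: str = "_ENZ_") -> str:
    # Drop any enzyme part, then strip a trailing direction suffix (if present).
    stem = reac_id.split(enz_sep)[0]
    for suffix in (fwd, rev):
        if stem.endswith(suffix):
            return stem[: -len(suffix)]
    return stem

print(base_id("PGI_FWD_ENZ_b4025"))  # PGI
print(base_id("PGI_REV"))            # PGI
print(base_id("EX_glc"))             # EX_glc (unchanged, no suffix)
```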

get_base_id_optimzation_result(cobrak_model, optimization_dict)

Converts an optimization result to a base reaction ID format in a COBRAk model.

This function processes an optimization result dictionary, which contains reaction IDs with their corresponding flux values, and consolidates these fluxes into base reaction IDs. It accounts for forward and reverse reaction suffixes to ensure that the net flux for each base reaction ID is calculated correctly.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing the reactions to be processed.

required
optimization_dict dict[str, float]

A dictionary mapping reaction IDs to their flux values from an optimization result.

required

Returns:

Type Description
dict[str, float]

A dictionary mapping base reaction IDs to their net flux values, consolidating forward and reverse reactions.
Source code in cobrak/utilities.py, lines 993–1034
@validate_call(validate_return=True)
def get_base_id_optimzation_result(
    cobrak_model: Model,
    optimization_dict: dict[str, float],
) -> dict[str, float]:
    """Converts an optimization result to a base reaction ID format in a COBRAk model.

    This function processes an optimization result dictionary, which contains reaction IDs with
    their corresponding flux values, and consolidates these fluxes into base reaction IDs. It
    accounts for forward and reverse reaction suffixes to ensure that the net flux for each base
    reaction ID is calculated correctly.

    Parameters:
    - cobrak_model (Model): The COBRAk model containing the reactions to be processed.
    - optimization_dict (dict[str, float]): A dictionary mapping reaction IDs to their flux values
      from an optimization result.

    Returns:
    - dict[str, float]: A dictionary mapping base reaction IDs to their net flux values, consolidating
      forward and reverse reactions.
    """
    base_id_scenario: dict[str, float] = {}

    for reac_id in cobrak_model.reactions:
        if reac_id not in optimization_dict:
            continue
        base_id = get_base_id(
            reac_id,
            cobrak_model.fwd_suffix,
            cobrak_model.rev_suffix,
            cobrak_model.reac_enz_separator,
        )

        multiplier = -1 if reac_id.endswith(cobrak_model.rev_suffix) else +1
        flux = optimization_dict[reac_id]

        if base_id not in base_id_scenario:
            base_id_scenario[base_id] = 0.0

        base_id_scenario[base_id] += multiplier * flux

    return base_id_scenario
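The consolidation step can be sketched on a plain flux dictionary: reverse fluxes enter with a negative sign and are summed into their base ID. The `"_FWD"`/`"_REV"` suffix values and reaction names are illustrative assumptions:

```python
# Illustrative consolidation of split reactions into net base-ID fluxes.
fluxes = {"PGI_FWD": 3.0, "PGI_REV": 0.5, "PYK_FWD": 1.0}

net: dict[str, float] = {}
for reac_id, flux in fluxes.items():
    sign = -1.0 if reac_id.endswith("_REV") else 1.0
    base = reac_id.removesuffix("_FWD").removesuffix("_REV")
    net[base] = net.get(base, 0.0) + sign * flux
print(net)  # {'PGI': 2.5, 'PYK': 1.0}
```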

get_cobrak_enzyme_reactions_string(cobrak_model, enzyme_id)

Get string of reaction IDs associated with a specific enzyme in the COBRAk model.

This function iterates through the reactions in a COBRAk model and collects the IDs of reactions that involve the specified enzyme. The collected reaction IDs are then concatenated into a single string, separated by semicolons.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions and enzyme data.

required
enzyme_id str

The ID of the enzyme for which associated reactions are to be found.

required

Returns:

Name Type Description
str str

A semicolon-separated string of reaction IDs that involve the specified enzyme.

Source code in cobrak/utilities.py, lines 1037–1058
@validate_call(validate_return=True)
def get_cobrak_enzyme_reactions_string(cobrak_model: Model, enzyme_id: str) -> str:
    """Get string of reaction IDs associated with a specific enzyme in the COBRAk model.

    This function iterates through the reactions in a COBRAk model and collects the IDs of reaction
    that involve the specified enzyme.
    The collected reaction IDs are then concatenated into a single string, separated by semicolons.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions and enzyme data.
        enzyme_id (str): The ID of the enzyme for which associated reactions are to be found.

    Returns:
        str: A semicolon-separated string of reaction IDs that involve the specified enzyme.
    """
    enzyme_reactions = []
    for reac_id, reaction in cobrak_model.reactions.items():
        if reaction.enzyme_reaction_data is None:
            continue
        if enzyme_id in reaction.enzyme_reaction_data.identifiers:
            enzyme_reactions.append(reac_id)
    return "; ".join(enzyme_reactions)
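A stand-alone sketch of the lookup-and-join pattern; enzyme identifiers and the reaction-to-enzyme mapping below are made-up examples (in COBRAk, the enzyme list would come from each reaction's `enzyme_reaction_data.identifiers`):

```python
# Collect reactions whose enzyme identifier list contains the query enzyme,
# skipping reactions without enzyme data (modeled here as None).
reaction_enzymes = {
    "PGI": ["b4025"],
    "PFK": ["b3916", "b1723"],
    "FBA": None,  # no enzyme data
}
enzyme_id = "b3916"
hits = [
    reac_id
    for reac_id, enzymes in reaction_enzymes.items()
    if enzymes is not None and enzyme_id in enzymes
]
result_string = "; ".join(hits)
print(result_string)  # PFK
```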

get_df_and_efficiency_factors_sorted_lists(cobrak_model, result, min_flux=0.0)

Extracts and sorts lists of flux values (df) and κ, γ, ι, α, κ⋅γ⋅ι⋅α values from a result.

This function processes the result dictionary of a COBRA-k optimization to extract and sort lists of flux values (df) and κ, γ, ι, and α values. It filters these values based on a minimum flux threshold and returns them as sorted dictionaries. The function also calculates and returns a dictionary of κ⋅γ⋅ι⋅α products, along with a status indicator counting how many of these efficiency factors are present for each reaction.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k Model object.

required
result dict[str, float]

A dictionary containing optimization results. Keys are expected to start with prefixes like 'DF_VAR_PREFIX', 'KAPPA_VAR_PREFIX', and 'GAMMA_VAR_PREFIX'.

required
min_flux NonNegativeFloat

The minimum flux value to consider when filtering the results. Values below this threshold are excluded. Defaults to 0.0.

0.0

Returns:

Type Description
tuple[dict[str, float], dict[str, float], dict[str, float], dict[str, float], dict[str, float], dict[str, tuple[float, int]]]

A tuple containing six dictionaries:

  1. A dictionary of sorted flux values (df) above the minimum flux.
  2. A dictionary of sorted κ values above the minimum flux.
  3. A dictionary of sorted γ values above the minimum flux.
  4. A dictionary of sorted ι values above the minimum flux.
  5. A dictionary of sorted α values above the minimum flux.
  6. A dictionary of sorted κ⋅γ⋅ι⋅α values, along with a status indicator. If, for a reaction, one or more of these efficiency factors is missing, the respective factor is assumed to be 1.0, thus having no effect on the multiplied value.

Source code in cobrak/utilities.py, lines 1263–1361
@validate_call(validate_return=True)
def get_df_and_efficiency_factors_sorted_lists(
    cobrak_model: Model,
    result: dict[str, float],
    min_flux: NonNegativeFloat = 0.0,
) -> tuple[
    dict[str, float],
    dict[str, float],
    dict[str, float],
    dict[str, float],
    dict[str, float],
    dict[str, tuple[float, int]],
]:
    """Extracts and sorts lists of flux values (df) and κ, γ, ι, α, κ⋅γ⋅ι⋅α values from a result.

    This function processes a dictionary of results of a COBRA-k optimization
    to extract and sort lists of flux values (df) and κ, γ, ι, α values. It filters
    these values based on a minimum flux threshold and returns them as sorted dictionaries.
    The function also calculates and returns a dictionary of kappa times gamma values,
    along with a status indicator representing the number of these values present for each reaction.

    Args:
        cobrak_model: The COBRA-k Model object.
        result: A dictionary containing optimization results.  Keys are expected to
            start with prefixes like 'DF_VAR_PREFIX', 'KAPPA_VAR_PREFIX', and 'GAMMA_VAR_PREFIX'.
        min_flux: The minimum flux value to consider when filtering the results.  Values below this
            threshold are excluded.  Defaults to 0.0.

    Returns:
        A tuple containing six dictionaries:
        1. A dictionary of sorted flux values (df) above the minimum flux.
        2. A dictionary of sorted κ values above the minimum flux.
        3. A dictionary of sorted γ values above the minimum flux.
        4. A dictionary of sorted ι values above the minimum flux.
        5. A dictionary of sorted α values above the minimum flux.
        6. A dictionary of sorted κ⋅γ⋅ι⋅α values, along with a status indicator. If, for a reaction,
        one or more of these efficiency factors is missing, the respective factor is assumed to be 1.0
        thus having no effect on the multiplied value.
    """
    dfs: dict[str, float] = {}
    kappas: dict[str, float] = {}
    gammas: dict[str, float] = {}
    iotas: dict[str, float] = {}
    alphas: dict[str, float] = {}
    for var_id, value in result.items():
        if var_id.startswith(DF_VAR_PREFIX):
            reac_id = var_id[len(DF_VAR_PREFIX) :]
            dfs[reac_id] = value
        if var_id.startswith(KAPPA_VAR_PREFIX):
            reac_id = var_id[len(KAPPA_VAR_PREFIX) :]
            kappas[reac_id] = value
        elif var_id.startswith(GAMMA_VAR_PREFIX):
            reac_id = var_id[len(GAMMA_VAR_PREFIX) :]
            gammas[reac_id] = value
        elif var_id.startswith(IOTA_VAR_PREFIX):
            reac_id = var_id[len(IOTA_VAR_PREFIX) :]
            iotas[reac_id] = value
        elif var_id.startswith(ALPHA_VAR_PREFIX):
            reac_id = var_id[len(ALPHA_VAR_PREFIX) :]
            alphas[reac_id] = value

    all_multiplied_dict: dict[str, tuple[float, int]] = {}
    for reac_id in cobrak_model.reactions:
        status = 0
        product = 1.0
        if reac_id in kappas:
            product *= kappas[reac_id]
            status += 1
        if reac_id in gammas:
            product *= gammas[reac_id]
            status += 1
        if reac_id in iotas:
            product *= iotas[reac_id]
            status += 1
        if reac_id in alphas:
            product *= alphas[reac_id]
            status += 1
        all_multiplied_dict[reac_id] = (product, status)

    sorted_df_keys = sorted(dfs, key=lambda k: dfs[k], reverse=False)
    sorted_kappa_keys = sorted(kappas, key=lambda k: kappas[k], reverse=False)
    sorted_gamma_keys = sorted(gammas, key=lambda k: gammas[k], reverse=False)
    sorted_iota_keys = sorted(iotas, key=lambda k: iotas[k], reverse=False)
    sorted_alpha_keys = sorted(alphas, key=lambda k: alphas[k], reverse=False)
    sorted_product_keys = sorted(
        all_multiplied_dict, key=lambda k: all_multiplied_dict[k], reverse=False
    )
    return (
        {key: dfs[key] for key in sorted_df_keys if result[key] > min_flux},
        {key: kappas[key] for key in sorted_kappa_keys if result[key] > min_flux},
        {key: gammas[key] for key in sorted_gamma_keys if result[key] > min_flux},
        {key: iotas[key] for key in sorted_iota_keys if result[key] > min_flux},
        {key: alphas[key] for key in sorted_alpha_keys if result[key] > min_flux},
        {
            key: all_multiplied_dict[key]
            for key in sorted_product_keys
            if key in result and result[key] > min_flux
        },
    )
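The core pattern here is extracting prefixed variables from a flat result dictionary and sorting them ascending by value. A minimal sketch; the `"kappa_"` prefix stands in for COBRAk's `KAPPA_VAR_PREFIX`, and the variable names are made up:

```python
# Extract all variables with a given prefix, then sort ascending by value.
result = {"kappa_PGI": 0.9, "kappa_PFK": 0.2, "flux_PGI": 1.0}

kappas = {
    var_id[len("kappa_"):]: value
    for var_id, value in result.items()
    if var_id.startswith("kappa_")
}
sorted_kappas = dict(sorted(kappas.items(), key=lambda kv: kv[1]))
print(sorted_kappas)  # {'PFK': 0.2, 'PGI': 0.9}
```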

get_elementary_conservation_relations(cobrak_model)

Calculate and return the elementary conservation relations (ECRs) of a COBRAk model as a string.

Computes the null space of the stoichiometric matrix of a COBRAk model to determine the elementary conservation relations. It then formats these relations into a human-readable string in which each term appears as "coefficient * metabolite" (e.g., "1.0 * ATP 1.0 * ADP").

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions and metabolites.

required

Returns:

Name Type Description
str str

A string representation of the elementary conservation relations, where each relation is expressed as a linear combination of metabolites.

Source code in cobrak/utilities.py, lines 1061–1096
@validate_call(validate_return=True)
def get_elementary_conservation_relations(
    cobrak_model: Model,
) -> str:
    """Calculate and return the elementary conservation relations (ECRs) of a COBRAk model as a string.

    Computes the null space of the stoichiometric matrix of a COBRAk model to determine the elementary conservation relations.
    It then formats these relations into a human-readable string such as "1 ATP * 1 ADP"

    Args:
        cobrak_model (Model): The COBRAk model containing reactions and metabolites.

    Returns:
        str: A string representation of the elementary conservation relations, where each relation is expressed as a linear combination of metabolites.
    """
    # Convert the list of lists to a sympy Matrix
    S_matrix = Matrix(get_stoichiometric_matrix(cobrak_model)).T  # type: ignore

    # Calculate the null space of the stoichiometric matrix
    null_space = S_matrix.nullspace()

    # Convert the null space vectors to a NumPy array
    ECRs = np.array([ns.T.tolist()[0] for ns in null_space], dtype=float)

    ecrs_list = ECRs.tolist()
    met_ids = list(cobrak_model.metabolites)
    conservation_relations = ""
    for current_ecr in range(len(ecrs_list)):
        ecr = ecrs_list[current_ecr]
        for current_met in range(len(met_ids)):
            value = ecr[current_met]
            if value != 0.0:
                conservation_relations += f" {value} * {met_ids[current_met]} "
        conservation_relations += "\n"

    return conservation_relations
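Mathematically, this is a left-null-space computation on the stoichiometric matrix. A stand-alone sketch with NumPy (the function above uses sympy's exact nullspace instead); the tiny A ⇌ B toy network is an assumption for illustration:

```python
import numpy as np

# Toy stoichiometric matrix: rows = metabolites (A, B), one reaction A -> B.
S = np.array([[-1.0], [1.0]])

# Conservation relations y satisfy y @ S == 0, i.e. they span the null space
# of S.T; take the SVD rows beyond the numerical rank.
_, singular_values, vt = np.linalg.svd(S.T)
rank = int(np.sum(singular_values > 1e-10))
ecrs = vt[rank:]  # each row is one elementary conservation relation

# Normalize the single relation; it is proportional to [1, 1]: A + B is conserved.
relation = ecrs[0] / ecrs[0][0]
```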

get_enzyme_usage_by_protein_pool_fraction(cobrak_model, result, min_conc=1e-12, rounding=5)

Return enzyme usage as a fraction of the total protein pool in a COBRAk model.

This function computes the fraction of the total protein pool used by each enzyme, based on the given result dictionary. It filters out enzymes with concentrations below a specified minimum and groups the reactions by their protein pool fractions. The returned dictionary is sorted by key, so low fractions occur first and high fractions last.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions and enzyme data.

required
result dict[str, float]

A dictionary mapping variable names to their values, typically from an optimization result.

required
min_conc float

The minimum concentration threshold for considering enzyme usage. Defaults to 1e-12.

1e-12
rounding int

The number of decimal places to round the protein pool fractions. Defaults to 5.

5

Returns:

Type Description
dict[NonNegativeFloat, list[str]]

dict[NonNegativeFloat, list[str]]: A dictionary where the keys are protein pool fractions and the values are lists of reaction IDs that use that fraction of the protein pool.

Source code in cobrak/utilities.py, lines 1099–1138
@validate_call(validate_return=True)
def get_enzyme_usage_by_protein_pool_fraction(
    cobrak_model: Model,
    result: dict[str, float],
    min_conc: NonNegativeFloat = 1e-12,
    rounding: NonNegativeInt = 5,
) -> dict[NonNegativeFloat, list[str]]:
    """Return enzyme usage as a fraction of the total protein pool in a COBRAk model.

    This function computes the fraction of the total protein pool used by each enzyme based on the given result dictionary.
    It filters out enzymes with concentrations below a specified minimum and groups the reactions by their protein pool fractions.
    The dictionary is sorted, i.e., low fractions occur first and high fractions last as keys.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions and enzyme data.
        result (dict[str, float]): A dictionary mapping variable names to their values, typically from an optimization result.
        min_conc (float, optional): The minimum concentration threshold for considering enzyme usage. Defaults to 1e-12.
        rounding (int, optional): The number of decimal places to round the protein pool fractions. Defaults to 5.

    Returns:
        dict[NonNegativeFloat, list[str]]: A dictionary where the keys are protein pool fractions and the values are lists of
                                reaction IDs that use that fraction of the protein pool.
    """
    protein_pool_fractions: dict[float, list[str]] = {}
    for var_name, value in result.items():
        if not var_name.startswith(ENZYME_VAR_PREFIX):
            continue
        reac_id = var_name.split(ENZYME_VAR_INFIX)[-1]
        full_mw = get_full_enzyme_mw(cobrak_model, cobrak_model.reactions[reac_id])
        if value > min_conc:
            protein_pool_fraction = round(
                (full_mw * value) / cobrak_model.max_prot_pool, rounding
            )
        else:
            continue
        if protein_pool_fraction not in protein_pool_fractions:
            protein_pool_fractions[protein_pool_fraction] = []
        protein_pool_fractions[protein_pool_fraction].append(reac_id)

    return dict(sorted(protein_pool_fractions.items()))
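The grouping step can be sketched without the full model: compute each enzyme's rounded pool fraction, drop near-zero concentrations, and bucket reaction IDs by fraction. All identifiers, concentrations, molecular weights, and the pool size below are made-up example values:

```python
# Illustrative grouping of enzyme usage into rounded protein-pool fractions.
max_prot_pool = 0.32                                   # assumed pool size
enzyme_conc = {"PGI": 1e-4, "PFK": 2e-4, "FBA": 1e-15}  # concentrations
enzyme_mw = {"PGI": 61.5, "PFK": 35.2, "FBA": 39.1}     # kDa

fractions: dict[float, list[str]] = {}
for reac_id, conc in enzyme_conc.items():
    if conc <= 1e-12:  # min_conc cutoff; "FBA" is filtered out here
        continue
    frac = round(enzyme_mw[reac_id] * conc / max_prot_pool, 5)
    fractions.setdefault(frac, []).append(reac_id)

sorted_fractions = dict(sorted(fractions.items()))  # ascending by fraction
```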

get_extra_linear_constraint_string(extra_linear_constraint)

Returns a string representation of an extra linear constraint.

The returned format is: "lower_value ≤ stoichiometry * var_id + ... ≤ upper_value"

Parameters:

Name Type Description Default
extra_linear_constraint ExtraLinearConstraint

The extra linear constraint to convert to a string.

required

Returns:

Name Type Description
str str

A string representation of the extra linear constraint

Source code in cobrak/utilities.py, lines 1141–1173
@validate_call(validate_return=True)
def get_extra_linear_constraint_string(
    extra_linear_constraint: ExtraLinearConstraint,
) -> str:
    """Returns a string representation of an extra linear constraint.

    The returned format is:
    "lower_value ≤ stoichiometry * var_id + ... ≤ upper_value"

    Args:
        extra_linear_constraint (ExtraLinearConstraint): The extra linear constraint to convert to a string.

    Returns:
        str: A string representation of the extra linear constraint
    """
    string = ""

    if extra_linear_constraint.lower_value is not None:
        string += f"{extra_linear_constraint.lower_value} ≤ "

    for var_id, stoichiometry in sort_dict_keys(
        extra_linear_constraint.stoichiometries
    ).items():
        if stoichiometry > 0:
            printed_stoichiometry = f" + {stoichiometry}"
        else:
            printed_stoichiometry = f" - {abs(stoichiometry)}"
        string += f"{printed_stoichiometry} {var_id}"

    if extra_linear_constraint.upper_value is not None:
        string += f"≤ {extra_linear_constraint.upper_value}"

    return string.lstrip()

get_full_enzyme_id(identifiers)

Generate a full enzyme ID by concatenating the list of enzyme identifiers with a specific separator.

Parameters:

Name Type Description Default
identifiers list[str]

A list of enzyme identifiers.

required

Returns:

Name Type Description
str str

A single string representing the full enzyme ID, with single identifiers separated by "_AND_".

Source code in cobrak/utilities.py, lines 1215–1225
@validate_call(validate_return=True)
def get_full_enzyme_id(identifiers: list[str]) -> str:
    """Generate a full enzyme ID by concatenating the list of enzyme identifiers with a specific separator.

    Args:
        identifiers (list[str]): A list of enzyme identifiers.

    Returns:
        str: A single string representing the full enzyme ID, with single identifiers separated by "_AND_".
    """
    return "_AND_".join(identifiers)

get_full_enzyme_mw(cobrak_model, reaction)

Calculate the full molecular weight of enzymes (in kDa) involved in a given reaction.

This function computes the total molecular weight of all enzymes associated with a specified reaction in the COBRAk model. If the reaction does not have any enzyme reaction data, a ValueError is raised.

  • If special (i.e. non-1) stoichiometries are provided in the reaction's enzyme_reaction_data, they are used to scale the molecular weights accordingly.
  • If no special stoichiometry is provided for an enzyme, a default stoichiometry of 1 is assumed.
  • The function sums up the molecular weights of all enzymes, multiplied by their respective stoichiometries, to compute the total.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k model containing enzyme data.

required
reaction Reaction

The reaction for which the full enzyme molecular weight is to be calculated.

required

Returns:

Name Type Description
float float

The total molecular weight of all enzymes involved in the reaction in kDa

Raises:

Type Description
ValueError

If the reaction does not have any enzyme reaction data.

Source code in cobrak/utilities.py, lines 1228–1260
@validate_call(validate_return=True)
def get_full_enzyme_mw(cobrak_model: Model, reaction: Reaction) -> float:
    """Calculate the full molecular weight of enzymes (in kDa) involved in a given reaction.

    This function computes the total molecular weight of all enzymes associated with a specified reaction in the COBRAk model.
    If the reaction does not have any enzyme reaction data, a ValueError is raised.

    - If special (i.e. non-1) stoichiometries are provided in the reaction's `enzyme_reaction_data`, they are used to scale the molecular weights accordingly.
    - If no special stoichiometry is provided for an enzyme, a default stoichiometry of 1 is assumed.
    - The function sums up the molecular weights of all enzymes, multiplied by their respective stoichiometries, to compute the total.

    Args:
        cobrak_model (Model): The COBRA-k model containing enzyme data.
        reaction (Reaction): The reaction for which the full enzyme molecular weight is to be calculated.

    Returns:
        float: The total molecular weight of all enzymes involved in the reaction in kDa

    Raises:
        ValueError: If the reaction does not have any enzyme reaction data.
    """
    if reaction.enzyme_reaction_data is None:
        raise ValueError
    full_mw = 0.0
    for identifier in reaction.enzyme_reaction_data.identifiers:
        if identifier in reaction.enzyme_reaction_data.special_stoichiometries:
            stoichiometry = reaction.enzyme_reaction_data.special_stoichiometries[
                identifier
            ]
        else:
            stoichiometry = 1
        full_mw += stoichiometry * cobrak_model.enzymes[identifier].molecular_weight
    return full_mw
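The weighted sum can be sketched with plain dictionaries; the enzyme identifiers, molecular weights, and the tetramer stoichiometry below are made-up examples:

```python
# Illustrative stand-in for the complex-MW sum: default subunit stoichiometry
# is 1 unless a special stoichiometry is given for that identifier.
enzyme_mws = {"b3916": 35.0, "b1723": 36.2}  # kDa per subunit
identifiers = ["b3916", "b1723"]
special_stoichiometries = {"b3916": 4}       # e.g. four copies of this subunit

full_mw = sum(
    special_stoichiometries.get(identifier, 1) * enzyme_mws[identifier]
    for identifier in identifiers
)  # 4 * 35.0 + 1 * 36.2
```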

get_fwd_rev_corrected_flux(reac_id, usable_reac_ids, result, fwd_suffix=REAC_FWD_SUFFIX, rev_suffix=REAC_REV_SUFFIX)

Calculates the direction-corrected flux for a reaction, taking into account the flux of its reverse reaction.

If the reverse reaction exists and its flux is greater than the flux of the forward reaction, the corrected flux is set to 0.0. Otherwise, the corrected flux is calculated as the difference between the flux of the forward reaction and the flux of the reverse reaction. If the reverse reaction does not exist or is not usable, the corrected flux is set to the flux of the forward reaction.

Parameters:

Name Type Description Default
reac_id str

The ID of the reaction.

required
usable_reac_ids list[str] | set[str]

A list or set of IDs of reactions that can be used for correction.

required
result dict[str, float]

A dictionary containing the flux values for each reaction.

required
fwd_suffix str

The suffix used to identify forward reactions. Defaults to REAC_FWD_SUFFIX.

REAC_FWD_SUFFIX
rev_suffix str

The suffix used to identify reverse reactions. Defaults to REAC_REV_SUFFIX.

REAC_REV_SUFFIX

Returns:

Name Type Description
float float

The corrected flux value for the reaction.

Source code in cobrak/utilities.py, lines 1176–1212
@validate_call(validate_return=True)
def get_fwd_rev_corrected_flux(
    reac_id: str,
    usable_reac_ids: list[str] | set[str],
    result: dict[str, float],
    fwd_suffix: str = REAC_FWD_SUFFIX,
    rev_suffix: str = REAC_REV_SUFFIX,
) -> float:
    """Calculates the direction-corrected flux for a reaction, taking into account the flux of its reverse reaction.

    If the reverse reaction exists and its flux is greater than the flux of the forward reaction, the corrected flux is set to 0.0.
    Otherwise, the corrected flux is calculated as the difference between the flux of the forward reaction and the flux of the reverse reaction.
    If the reverse reaction does not exist or is not usable, the corrected flux is set to the flux of the forward reaction.

    Args:
    reac_id (str): The ID of the reaction.
    usable_reac_ids (list[str] | set[str]): A list or set of IDs of reactions that can be used for correction.
    result (dict[str, float]): A dictionary containing the flux values for each reaction.
    fwd_suffix (str, optional): The suffix used to identify forward reactions. Defaults to REAC_FWD_SUFFIX.
    rev_suffix (str, optional): The suffix used to identify reverse reactions. Defaults to REAC_REV_SUFFIX.

    Returns:
    float: The corrected flux value for the reaction.
    """
    other_id = get_reverse_reac_id_if_existing(
        reac_id,
        fwd_suffix,
        rev_suffix,
    )
    if other_id in usable_reac_ids:
        other_flux = result[other_id]
        this_flux = result[reac_id]
        flux = 0.0 if other_flux > this_flux else this_flux - other_flux
    else:
        flux = result[reac_id]

    return flux
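The correction rule can be sketched without the suffix-resolution helper: if the reverse twin carries more flux, the corrected flux is 0.0; otherwise it is the positive net. The `"_FWD"`/`"_REV"` suffix values and reaction names are illustrative assumptions:

```python
# Illustrative direction-corrected flux for a split reaction pair.
result = {"PGI_FWD": 3.0, "PGI_REV": 0.5}

def corrected_flux(reac_id: str, result: dict[str, float]) -> float:
    # Build the ID of the opposite-direction twin.
    twin = (
        reac_id.removesuffix("_FWD") + "_REV"
        if reac_id.endswith("_FWD")
        else reac_id.removesuffix("_REV") + "_FWD"
    )
    if twin not in result:
        return result[reac_id]
    # Net flux, clipped at zero when the twin dominates.
    return max(result[reac_id] - result[twin], 0.0)

print(corrected_flux("PGI_FWD", result))  # 2.5
print(corrected_flux("PGI_REV", result))  # 0.0
```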

get_metabolite_consumption_and_production(cobrak_model, met_id, optimization_dict)

Calculate the consumption and production rates of a metabolite in a COBRAk model.

This function computes the total consumption and production of a specified metabolite based on the flux values provided in an optimization dictionary. It iterates through the reactions in the COBRAk model, checking the stoichiometries to determine the metabolite's consumption or production in each reaction.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions and metabolites.

required
met_id str

The ID of the metabolite for which consumption and production rates are to be calculated.

required
optimization_dict dict[str, float]

A dictionary mapping reaction IDs to their optimized flux values.

required

Returns:

Type Description
tuple[float, float]

tuple[float, float]: A tuple containing the total consumption and production rates of the specified metabolite.

Source code in cobrak/utilities.py, lines 1364–1395
@validate_call(validate_return=True)
def get_metabolite_consumption_and_production(
    cobrak_model: Model, met_id: str, optimization_dict: dict[str, float]
) -> tuple[float, float]:
    """Calculate the consumption and production rates of a metabolite in a COBRAk model.

    This function computes the total consumption and production of a specified metabolite
    based on the flux values provided in an optimization dictionary.
    It iterates through the reactions in the COBRAk model, checking the stoichiometries to determine the metabolite's
    consumption or production in each reaction.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions and metabolites.
        met_id (str): The ID of the metabolite for which consumption and production rates are to be calculated.
        optimization_dict (dict[str, float]): A dictionary mapping reaction IDs to their optimized flux values.

    Returns:
        tuple[float, float]: A tuple containing the total consumption and production rates of the specified metabolite.
    """
    consumption = 0.0
    production = 0.0
    for reac_id, reaction in cobrak_model.reactions.items():
        if reac_id not in optimization_dict:
            continue
        if met_id not in reaction.stoichiometries:
            continue
        stoichiometry = reaction.stoichiometries[met_id]
        if stoichiometry < 0.0:
            consumption += optimization_dict[reac_id] * stoichiometry
        else:
            production += optimization_dict[reac_id] * stoichiometry
    return consumption, production
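The sign convention above can be illustrated with plain dicts (toy data, not the COBRAk Model class): consumption is accumulated as flux × negative stoichiometry and is therefore reported as a non-positive number, while production sums flux × positive stoichiometry.

```python
# Minimal sketch of get_metabolite_consumption_and_production's sign
# convention, using plain dicts instead of a COBRAk Model (hypothetical data).
def consumption_and_production(stoichiometries, fluxes, met_id):
    consumption = 0.0
    production = 0.0
    for reac_id, stoich in stoichiometries.items():
        if reac_id not in fluxes or met_id not in stoich:
            continue
        coeff = stoich[met_id]
        if coeff < 0.0:
            consumption += fluxes[reac_id] * coeff  # negative contribution
        else:
            production += fluxes[reac_id] * coeff
    return consumption, production

stoichs = {"R1": {"A": -1.0, "B": 1.0}, "R2": {"B": -2.0, "C": 1.0}}
fluxes = {"R1": 3.0, "R2": 1.0}
print(consumption_and_production(stoichs, fluxes, "B"))  # (-2.0, 3.0)
```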

get_metabolites_in_elementary_conservation_relations(cobrak_model)

Identify metabolites involved in elementary conservation relations (ECRs) in a COBRAk model.

Calculates the null space of the stoichiometric matrix of a COBRAk model to determine the elementary conservation relations. It then identifies the metabolites that are part of these relations.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions and metabolites.

required

Returns:

Type Description
list[str]

list[str]: A list of metabolite IDs that are involved in elementary conservation relations.

Source code in cobrak/utilities.py, lines 2098–2134
@validate_call(validate_return=True)
def get_metabolites_in_elementary_conservation_relations(
    cobrak_model: Model,
) -> list[str]:
    """Identify metabolites involved in elementary conservation relations (ECRs) in a COBRAk model.

    Calculates the null space of the stoichiometric matrix of a COBRAk model to determine the elementary conservation relations.
    It then identifies the metabolites that are part of these relations.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions and metabolites.

    Returns:
        list[str]: A list of metabolite IDs that are involved in elementary conservation relations.
    """
    # Convert the stoichiometric matrix to a transposed NumPy array
    S_matrix = array(get_stoichiometric_matrix(cobrak_model)).T

    # Calculate the null space of the (transposed) stoichiometric matrix
    null_spacex = null_space(S_matrix)

    # Transpose so that each row is one conservation relation (ECR)
    ECRs = null_spacex.T

    # Simplify the ECRs by removing near-zero elements
    threshold = 1e-10
    ECRs[np.abs(ECRs) < threshold] = 0
    met_ids = list(cobrak_model.metabolites)

    dependencies = []
    for ecr in ECRs.tolist():
        for entry_num in range(len(ecr)):
            if ecr[entry_num] != 0.0:
                dependencies.append(met_ids[entry_num])
    return list(set(dependencies))
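A minimal numeric illustration of the computation above, using NumPy's SVD in place of the null_space call (toy one-reaction network, not a real model): for a single reaction A → B, the moiety A + B is conserved, so both metabolites appear in an ECR.

```python
import numpy as np

# Toy network: one reaction A -> B. After the transpose used above, rows are
# reactions and columns are metabolites, so null-space vectors of this matrix
# are conservation relations (here: A + B is conserved).
S_matrix = np.array([[-1.0, 1.0]])

# Null space via SVD (the standard numerically stable approach)
_, s, vh = np.linalg.svd(S_matrix)
rank = int((s > 1e-12).sum())
ECRs = vh[rank:]  # rows spanning the null space

ECRs[np.abs(ECRs) < 1e-10] = 0  # remove near-zero elements, as above
met_ids = ["A", "B"]
involved = sorted(
    {met_ids[i] for ecr in ECRs.tolist() for i, v in enumerate(ecr) if v != 0.0}
)
print(involved)  # ['A', 'B']
```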

get_model_dG0s(cobrak_model, abs_values=False, exclude_bw_reacs=True)

Extracts standard Gibbs free energy changes (dG0) from reactions in the model.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions with thermodynamic data.

required
abs_values bool

If True, returns absolute values of dG0. Defaults to False.

False

Returns:

Type Description
list[float]

list[float]: A list of dG0 values, possibly as absolute values if specified.

Source code in cobrak/utilities.py, lines 1495–1510
@validate_call(validate_return=True)
def get_model_dG0s(
    cobrak_model: Model, abs_values: bool = False, exclude_bw_reacs: bool = True
) -> list[float]:
    """Extracts standard Gibbs free energy changes (dG0) from reactions in the model.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions with thermodynamic data.
        abs_values (bool, optional): If True, returns absolute values of dG0. Defaults to False.

    Returns:
        list[float]: A list of dG0 values, possibly as absolute values if specified.
    """
    return [
        x[1] for x in get_sorted_model_dG0s(cobrak_model, abs_values, exclude_bw_reacs)
    ]

get_model_hill_coefficients(cobrak_model, return_only_values_with_reference=False)

Collects Hill coefficients from a COBRA-k model.

This function iterates through the reactions in a COBRA-k model and extracts the κ, ι, and α Hill coefficients associated with each metabolite.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k Model object.

required
return_only_values_with_reference bool

Returns only values with a given database reference. Defaults to False.

False

Returns:

Type Description
tuple[list[PositiveFloat], list[PositiveFloat], list[PositiveFloat]]

A tuple containing three lists: the first contains κ Hill coefficients, the second ι Hill coefficients, the third α Hill coefficients.

Source code in cobrak/utilities.py, lines 1719–1795
@validate_call(validate_return=True)
def get_model_hill_coefficients(
    cobrak_model: Model,
    return_only_values_with_reference: bool = False,
) -> tuple[list[PositiveFloat], list[PositiveFloat], list[PositiveFloat]]:
    """Collects Hill coefficients from a COBRA-k model.

    This function iterates through the reactions in a COBRA-k model and extracts the
    κ, ι, and α Hill coefficients associated with each metabolite.

    Args:
        cobrak_model: The COBRA-k Model object.
        return_only_values_with_reference: Returns only values with a given database reference. Defaults to False.

    Returns:
        A tuple containing three lists: the first list contains κ Hill coefficients, the second
        ι Hill coefficients, the third α Hill coefficients.
    """
    kappa_hills: list[PositiveFloat] = []
    iota_hills: list[PositiveFloat] = []
    alpha_hills: list[PositiveFloat] = []
    for reaction in cobrak_model.reactions.values():
        if reaction.enzyme_reaction_data is None:
            continue

        # κ Hills
        for (
            met_id,
            hill_coefficient,
        ) in reaction.enzyme_reaction_data.hill_coefficients.kappa.items():
            if return_only_values_with_reference:
                references = (
                    reaction.enzyme_reaction_data.hill_coefficient_references.kappa
                )
                if (met_id not in references) or (len(references[met_id]) == 0):
                    tax_distance = -1
                else:
                    tax_distance = references[met_id][0].tax_distance
                if tax_distance < 0:
                    continue
            kappa_hills.append(hill_coefficient)

        # ι Hills
        for (
            met_id,
            hill_coefficient,
        ) in reaction.enzyme_reaction_data.hill_coefficients.iota.items():
            if return_only_values_with_reference:
                references = (
                    reaction.enzyme_reaction_data.hill_coefficient_references.iota
                )
                if (met_id not in references) or (len(references[met_id]) == 0):
                    tax_distance = -1
                else:
                    tax_distance = references[met_id][0].tax_distance
                if tax_distance < 0:
                    continue
            iota_hills.append(hill_coefficient)

        # α Hills
        for (
            met_id,
            hill_coefficient,
        ) in reaction.enzyme_reaction_data.hill_coefficients.alpha.items():
            if return_only_values_with_reference:
                references = (
                    reaction.enzyme_reaction_data.hill_coefficient_references.alpha
                )
                if (met_id not in references) or (len(references[met_id]) == 0):
                    tax_distance = -1
                else:
                    tax_distance = references[met_id][0].tax_distance
                if tax_distance < 0:
                    continue
            alpha_hills.append(hill_coefficient)

    return kappa_hills, iota_hills, alpha_hills

get_model_kas(cobrak_model, return_only_values_with_reference=False)

Collects k_A values from a COBRA-k model.

This function iterates through the reactions in a COBRA-k model and extracts the k_A values associated with each metabolite.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k Model object.

required
return_only_values_with_reference bool

Returns only values with a given database reference. Defaults to False.

False

Returns:

Type Description
list[PositiveFloat]

A list containing the k_A values.

Source code in cobrak/utilities.py, lines 1696–1716
@validate_call(validate_return=True)
def get_model_kas(
    cobrak_model: Model,
    return_only_values_with_reference: bool = False,
) -> list[PositiveFloat]:
    """Collects k_A values from a COBRA-k model.

    This function iterates through the reactions in a COBRA-k model and extracts the
    k_A values associated with each metabolite.

    Args:
        cobrak_model: The COBRA-k Model object.
        return_only_values_with_reference: Returns only values with a given database reference. Defaults to False.

    Returns:
        A list containing the k_A values.
    """
    return [
        x[2]
        for x in get_sorted_model_kas(cobrak_model, return_only_values_with_reference)
    ]

get_model_kcats(cobrak_model)

Extracts k_cat values from reactions with enzyme data in the model.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions with enzyme data.

required

Returns:

Type Description
list[float]

list[float]: A list of k_cat values for reactions with available enzyme data.

Source code in cobrak/utilities.py, lines 1457–1467
@validate_call(validate_return=True)
def get_model_kcats(cobrak_model: Model) -> list[float]:
    """Extracts k_cat values from reactions with enzyme data in the model.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions with enzyme data.

    Returns:
        list[float]: A list of k_cat values for reactions with available enzyme data.
    """
    return [x[1] for x in get_sorted_model_kcats(cobrak_model)]

get_model_kis(cobrak_model, return_only_values_with_reference=False)

Collects k_I values from a COBRA-k model.

This function iterates through the reactions in a COBRA-k model and extracts the k_I values associated with each metabolite.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k Model object.

required
return_only_values_with_reference bool

Returns only values with a given database reference. Defaults to False.

False

Returns:

Type Description
list[PositiveFloat]

A list containing the k_I values.

Source code in cobrak/utilities.py, lines 1639–1659
@validate_call(validate_return=True)
def get_model_kis(
    cobrak_model: Model,
    return_only_values_with_reference: bool = False,
) -> list[PositiveFloat]:
    """Collects k_I values from a COBRA-k model.

    This function iterates through the reactions in a COBRA-k model and extracts the
    k_I values associated with each metabolite.

    Args:
        cobrak_model: The COBRA-k Model object.
        return_only_values_with_reference: Returns only values with a given database reference. Defaults to False.

    Returns:
        A list containing the k_I values.
    """
    return [
        x[2]
        for x in get_sorted_model_kis(cobrak_model, return_only_values_with_reference)
    ]

get_model_kms(cobrak_model, return_only_values_with_reference=False)

Extracts k_m values from reactions with enzyme data in the model.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions with enzyme data.

required
return_only_values_with_reference bool

If True, only include k_m values that have a database reference. Defaults to False.

False

Returns:

Type Description
list[float]

list[float]: A flat list of k_m values from all reactions with available enzyme data.

Source code in cobrak/utilities.py, lines 1513–1528
@validate_call(validate_return=True)
def get_model_kms(
    cobrak_model: Model, return_only_values_with_reference: bool = False
) -> list[float]:
    """Extracts k_m values from reactions with enzyme data in the model.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions with enzyme data.
        return_only_values_with_reference (bool, optional): If True, only include k_m values that have a database reference. Defaults to False.

    Returns:
        list[float]: A flat list of k_m values from all reactions with available enzyme data.
    """
    substrate_kms, product_kms = get_model_kms_by_usage(
        cobrak_model, return_only_values_with_reference
    )
    return substrate_kms + product_kms

get_model_kms_by_usage(cobrak_model, return_only_values_with_reference=False)

Collects k_M values from a COBRA-k model, separating them into substrate and product lists.

This function iterates through the reactions in a COBRA-k model and extracts the k_M values associated with each metabolite. It distinguishes between substrates (metabolites with negative stoichiometry) and products (metabolites with positive stoichiometry) and separates the corresponding k_M values into two lists.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k Model object.

required
return_only_values_with_reference bool

Returns only values with a given database reference. Defaults to False.

False

Returns:

Type Description
list[PositiveFloat]

A tuple containing two lists: the first list contains k_M values for substrates,

list[PositiveFloat]

and the second list contains k_M values for products.

Source code in cobrak/utilities.py, lines 1531–1555
@validate_call(validate_return=True)
def get_model_kms_by_usage(
    cobrak_model: Model,
    return_only_values_with_reference: bool = False,
) -> tuple[list[PositiveFloat], list[PositiveFloat]]:
    """Collects k_M values from a COBRA-k model, separating them into substrate and product lists.

    This function iterates through the reactions in a COBRA-k model and extracts the
    k_M values associated with each metabolite. It distinguishes between substrates
    (metabolites with negative stoichiometry) and products (metabolites with positive
    stoichiometry) and separates the corresponding k_M values into two lists.

    Args:
        cobrak_model: The COBRA-k Model object.
        return_only_values_with_reference: Returns only values with a given database reference. Defaults to False.

    Returns:
        A tuple containing two lists: the first list contains k_M values for substrates,
        and the second list contains k_M values for products.
    """
    substrate_kms, product_kms = get_sorted_model_kms_by_usage(
        cobrak_model=cobrak_model,
        return_only_values_with_reference=return_only_values_with_reference,
    )
    return [x[2] for x in substrate_kms], [x[2] for x in product_kms]
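The substrate/product split described above can be sketched with toy data (hypothetical stoichiometries and k_M values, not the COBRAk Model class): negative stoichiometry marks a substrate, positive a product.

```python
# Hypothetical hexokinase-like reaction: glc + atp -> g6p + adp
stoichiometries = {"glc": -1.0, "atp": -1.0, "g6p": 1.0, "adp": 1.0}
k_ms = {"glc": 1e-4, "atp": 5e-4, "g6p": 2e-3, "adp": 1e-3}

# Split k_M values by the sign of each metabolite's stoichiometry
substrate_kms = [k_ms[m] for m, s in stoichiometries.items() if s < 0.0]
product_kms = [k_ms[m] for m, s in stoichiometries.items() if s > 0.0]
print(substrate_kms, product_kms)  # [0.0001, 0.0005] [0.002, 0.001]
```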

get_model_max_kcat_times_e_values(cobrak_model)

Calculates the maximum k_cat * E (enzyme concentration in terms of its molecular weight) for each reaction with enzyme data and returns these values.

The maximal k_cat·E is Ω·k_cat/W, with Ω as the protein pool and W as the enzyme molecular weight.

Parameters:

Name Type Description Default
cobrak_model Model

A metabolic model instance that includes enzymatic constraints, which must contain Reaction instances with enzyme_reaction_data.

required

Returns:

Type Description
list[NonNegativeFloat]

List[float]: A list containing the calculated maximum k_cat * E values for reactions having enzyme reaction data.

Notes
  • The function requires 'reaction.enzyme_reaction_data.k_cat' and 'get_full_enzyme_mw(cobrak_model, reaction)' to be non-zero.
  • If a reaction lacks enzyme reaction data, it is skipped in the calculation.
Source code in cobrak/utilities.py, lines 1814–1846
@validate_call(validate_return=True)
def get_model_max_kcat_times_e_values(cobrak_model: Model) -> list[NonNegativeFloat]:
    """Calculates the maximum k_cat * E (enzyme concentration in terms of its molecular weight)
    for each reaction with enzyme data and returns these values.

    The maximal k_cat*E is Ω*k_cat/W, with Ω as protein pool and W as enzyme molecular weight.

    Parameters:
        cobrak_model (Model): A metabolic model instance that includes enzymatic constraints,
                              which must contain Reaction instances with enzyme_reaction_data.

    Returns:
        List[float]: A list containing the calculated maximum k_cat * E values for reactions
                     having enzyme reaction data.

    Notes:
        - The function requires 'reaction.enzyme_reaction_data.k_cat' and
          'get_full_enzyme_mw(cobrak_model, reaction)' to be non-zero.
        - If a reaction lacks enzyme reaction data, it is skipped in the calculation.
    """
    max_kcat_times_e_values: list[float] = []
    for reaction in cobrak_model.reactions.values():
        if (
            reaction.enzyme_reaction_data is None
            or reaction.enzyme_reaction_data.k_cat >= 1e19
        ):
            continue
        max_kcat_times_e_values.append(
            reaction.enzyme_reaction_data.k_cat
            * cobrak_model.max_prot_pool
            / get_full_enzyme_mw(cobrak_model, reaction)
        )
    return max_kcat_times_e_values
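A numeric sketch of the formula above, max(k_cat · E) = Ω · k_cat / W (all values below are made up for illustration; units must be chosen consistently with the model):

```python
# Upper bound on k_cat * E when the whole protein pool Ω is spent on one enzyme
max_prot_pool = 0.25   # Ω, e.g. g protein / gDW (hypothetical)
k_cat = 100.0 * 3600   # turnover number in 1/h (hypothetical)
mw = 50_000.0          # W, enzyme molecular weight in g/mol (hypothetical)

max_kcat_times_e = k_cat * max_prot_pool / mw
print(max_kcat_times_e)  # 1.8
```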

get_model_mws(cobrak_model)

Extracts molecular weights of enzymes from the model.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing enzyme data.

required

Returns:

Type Description
list[PositiveFloat]

list[PositiveFloat]: A list of molecular weights for each enzyme in the model.

Source code in cobrak/utilities.py, lines 1798–1811
@validate_call(validate_return=True)
def get_model_mws(cobrak_model: Model) -> list[PositiveFloat]:
    """Extracts molecular weights of enzymes from the model.

    Args:
        cobrak_model (Model): The COBRAk model containing enzyme data.

    Returns:
        list[PositiveFloat]: A list of molecular weights for each enzyme in the model.
    """
    mws = []
    for enzyme in cobrak_model.enzymes.values():
        mws.append(enzyme.molecular_weight)
    return mws

get_model_with_filled_missing_parameters(cobrak_model, add_dG0_extra_constraints=False, param_percentile=90, ignore_prefixes=['EX_'], use_median_for_kms=True, use_median_for_kcats=True, ignored_enzyme_ids=['s0001'], exclude_bw_reac_ids_for_dG0s=False, verbose=False, ignore_nameparts=['diffusion'], ignore_infixes=[])

Fills missing parameters in a COBRA-k model, including dG0, k_cat, and k_ms values.

This function iterates through the reactions in a COBRA-k model and fills in missing parameters based on percentile values from the entire model. Missing dG0 values are filled using a percentile of the absolute dG0 values. Missing k_cat values are filled using a percentile or median of the k_cat values. Missing k_ms values are filled using a percentile or median of the k_ms values, depending on whether the metabolite is a substrate or a product. Optionally, extra linear constraints can be added to enforce consistency between the dG0 values of coupled reversible reactions.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k Model object to be modified.

required
add_dG0_extra_constraints bool

Whether to add extra linear constraints for reversible reactions. Defaults to False.

False
param_percentile conint(ge=0, le=100)

The percentile to use for filling missing parameters. Defaults to 90.

90
ignore_prefixes list[str]

List of prefixes to ignore when processing reactions. Defaults to ["EX_"] (i.e. exchange reactions).

['EX_']
use_median_for_kms bool

Whether to use the median instead of the percentile for k_ms values. Defaults to True.

True
use_median_for_kcats bool

Whether to use the median instead of the percentile for k_cat values. Defaults to True.

True
ignored_enzyme_ids list[str]

Enzyme identifiers whose reactions have their enzyme_reaction_data removed and substituted. Defaults to ["s0001"].

['s0001']
exclude_bw_reac_ids_for_dG0s bool

Whether to exclude backward (reverse) reactions when collecting the dG0 distribution. Defaults to False.

False
verbose bool

Whether to print the number of filled parameters. Defaults to False.

False
ignore_nameparts list[str]

Reactions whose names contain any of these substrings are skipped. Defaults to ["diffusion"].

['diffusion']
ignore_infixes list[str]

Reactions whose IDs contain any of these substrings are skipped. Defaults to [].

[]

Returns:

Type Description
Model

A deep copy of the input COBRA-k model with missing parameters filled.

Source code in cobrak/utilities.py, lines 1849–2036
@validate_call(validate_return=True)
def get_model_with_filled_missing_parameters(
    cobrak_model: Model,
    add_dG0_extra_constraints: bool = False,
    param_percentile: conint(ge=0, le=100) = 90,  # pyright: ignore[reportInvalidTypeForm]
    ignore_prefixes: list[str] = ["EX_"],
    use_median_for_kms: bool = True,
    use_median_for_kcats: bool = True,
    ignored_enzyme_ids: list[str] = ["s0001"],
    exclude_bw_reac_ids_for_dG0s: bool = False,
    verbose: bool = False,
    ignore_nameparts: list[str] = ["diffusion"],
    ignore_infixes: list[str] = [],
) -> Model:
    """Fills missing parameters in a COBRA-k model, including dG0, k_cat, and k_ms values.

    This function iterates through the reactions in a COBRA-k model and fills in missing
    parameters based on percentile values from the entire model.  Missing dG0 values
    are filled using a percentile of the absolute dG0 values.  Missing k_cat values
    are filled using a percentile or median of the k_cat values.  Missing k_ms values
    are filled using a percentile or median of the k_ms values, depending on whether
    the metabolite is a substrate or a product.  Optionally, extra linear constraints
    can be added to enforce consistency between the dG0 values of coupled reversible
    reactions.

    Args:
        cobrak_model: The COBRA-k Model object to be modified.
        add_dG0_extra_constraints: Whether to add extra linear constraints for reversible reactions. Defaults to False.
        param_percentile: The percentile to use for filling missing parameters. Defaults to 90.
        ignore_prefixes: List of reaction ID prefixes to ignore when processing reactions. Defaults to ["EX_"] (i.e. exchange reactions).
        use_median_for_kms: Whether to use the median instead of the percentile for k_ms values. Defaults to True.
        use_median_for_kcats: Whether to use the median instead of the percentile for k_cat values. Defaults to True.
        ignored_enzyme_ids: Enzyme identifiers whose reactions have their enzyme_reaction_data removed and substituted. Defaults to ["s0001"].
        exclude_bw_reac_ids_for_dG0s: Whether to exclude backward (reverse) reactions when collecting the dG0 distribution. Defaults to False.
        verbose: Whether to print the number of filled parameters. Defaults to False.
        ignore_nameparts: Reactions whose names contain any of these substrings are skipped. Defaults to ["diffusion"].
        ignore_infixes: Reactions whose IDs contain any of these substrings are skipped. Defaults to [].

    Returns:
        A deep copy of the input COBRA-k model with missing parameters filled.
    """
    cobrak_model = deepcopy(cobrak_model)

    all_mws = get_model_mws(cobrak_model)
    all_kcats = get_model_kcats(cobrak_model)
    substrate_kms, product_kms = get_model_kms_by_usage(cobrak_model)
    all_abs_dG0s = [
        abs(dG0)
        for dG0 in get_model_dG0s(
            cobrak_model, exclude_bw_reacs=exclude_bw_reac_ids_for_dG0s
        )
    ]
    if verbose:
        filled_kcats = 0
        filled_dG0s = 0
        filled_substrate_kms = 0
        filled_product_kms = 0
    dG0_reverse_couples: set[tuple[str]] = set()
    for reac_id, reaction in cobrak_model.reactions.items():
        if sum(reac_id.startswith(ignore_prefix) for ignore_prefix in ignore_prefixes):
            continue
        if sum(
            ignore_namepart in reaction.name for ignore_namepart in ignore_nameparts
        ):
            continue
        if any(ignore_infix in reac_id for ignore_infix in ignore_infixes):
            continue
        if cobrak_model.reactions[reac_id].dG0 is None:
            reverse_id = get_reverse_reac_id_if_existing(
                reac_id, cobrak_model.fwd_suffix, cobrak_model.rev_suffix
            )
            reverse_id = reverse_id if reverse_id in cobrak_model.reactions else ""
            if add_dG0_extra_constraints and reverse_id:
                dG0_reverse_couples.add(tuple(sorted([reac_id, reverse_id])))
                cobrak_model.reactions[reac_id].dG0 = 0.0
                cobrak_model.reactions[reac_id].dG0_uncertainty = percentile(
                    all_abs_dG0s, param_percentile
                )
            else:
                cobrak_model.reactions[reac_id].dG0 = -percentile(
                    all_abs_dG0s, param_percentile
                )
            if verbose:
                filled_dG0s += 1
        if cobrak_model.reactions[reac_id].enzyme_reaction_data is not None:
            stop = False
            for ignored_enzyme_id in ignored_enzyme_ids:
                for identifier in cobrak_model.reactions[
                    reac_id
                ].enzyme_reaction_data.identifiers:
                    if ignored_enzyme_id in identifier:
                        cobrak_model.reactions[reac_id].enzyme_reaction_data = None
                        stop = True
                        break
                if stop:
                    break
        if (cobrak_model.reactions[reac_id].enzyme_reaction_data is None) or (
            "" in cobrak_model.reactions[reac_id].enzyme_reaction_data.identifiers
        ):
            enzyme_substitue_id = f"{reac_id}_enzyme_substitute"
            cobrak_model.enzymes[enzyme_substitue_id] = Enzyme(
                molecular_weight=percentile(all_mws, 100 - param_percentile),
            )
            identifiers = [enzyme_substitue_id]
        else:
            identifiers = cobrak_model.reactions[
                reac_id
            ].enzyme_reaction_data.identifiers

        if (
            (cobrak_model.reactions[reac_id].enzyme_reaction_data is None)
            or ("" in cobrak_model.reactions[reac_id].enzyme_reaction_data.identifiers)
            or (cobrak_model.reactions[reac_id].enzyme_reaction_data.k_cat > 1e19)
        ):
            enzyme_substitue_id = f"{reac_id}_enzyme_substitute"
            if not use_median_for_kcats:
                cobrak_model.reactions[
                    reac_id
                ].enzyme_reaction_data = EnzymeReactionData(
                    identifiers=identifiers,
                    k_cat=percentile(all_kcats, param_percentile),
                )
            else:
                cobrak_model.reactions[
                    reac_id
                ].enzyme_reaction_data = EnzymeReactionData(
                    identifiers=identifiers,
                    k_cat=median(all_kcats),
                )
            if verbose:
                filled_kcats += 1
        if not have_all_unignored_km(
            cobrak_model.reactions[reac_id], cobrak_model.kinetic_ignored_metabolites
        ):
            existing_kms: list[str] = list(
                cobrak_model.reactions[reac_id].enzyme_reaction_data.k_ms.keys()
            )
            for met_id, stoichiometry in cobrak_model.reactions[
                reac_id
            ].stoichiometries.items():
                if (met_id in cobrak_model.kinetic_ignored_metabolites) or (
                    met_id in existing_kms
                ):
                    continue
                if not use_median_for_kms:
                    cobrak_model.reactions[reac_id].enzyme_reaction_data.k_ms[
                        met_id
                    ] = float(
                        percentile(
                            product_kms if stoichiometry > 0.0 else substrate_kms,
                            param_percentile
                            if stoichiometry > 0.0
                            else 100 - param_percentile,
                        )
                    )
                else:
                    cobrak_model.reactions[reac_id].enzyme_reaction_data.k_ms[
                        met_id
                    ] = (
                        median(substrate_kms)
                        if stoichiometry < 0.0
                        else median(product_kms)
                    )
                if verbose:
                    if stoichiometry > 0.0:
                        filled_product_kms += 1
                    else:
                        filled_substrate_kms += 1

    for dG0_reverse_couple in dG0_reverse_couples:
        reac_id_1, reac_id_2 = dG0_reverse_couple
        cobrak_model.extra_linear_constraints.append(
            ExtraLinearConstraint(
                stoichiometries={
                    f"{DG0_VAR_PREFIX}{reac_id_1}": 1.0,
                    f"{DG0_VAR_PREFIX}{reac_id_2}": 1.0,
                },
                lower_value=0.0,
                upper_value=0.0,
            )
        )

    if verbose:
        print("# filled kcats:", filled_kcats)
        print("# filled substrate kms:", filled_substrate_kms)
        print("# filled product kms:", filled_product_kms)
        print("# filled kms in total:", filled_product_kms + filled_substrate_kms)
        print("# filled ΔG'° values:", filled_dG0s)

    return cobrak_model
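The fill heuristic can be sketched numerically (made-up parameter distributions): missing k_cats get the model-wide median when use_median_for_kcats=True (otherwise a percentile), and missing substrate k_Ms are drawn from the low end of the distribution (the (100 − p)-th percentile) while product k_Ms come from the high end (the p-th percentile).

```python
from statistics import median

import numpy as np

# Hypothetical model-wide parameter distributions
all_kcats = [10.0, 50.0, 200.0, 1000.0]  # 1/h
substrate_kms = [1e-5, 1e-4, 1e-3]       # M
param_percentile = 90

fill_kcat = median(all_kcats)  # median-based fill for missing k_cats
# Low-percentile fill for a missing substrate k_M (favors strong binding)
fill_substrate_km = float(np.percentile(substrate_kms, 100 - param_percentile))
print(fill_kcat)  # 125.0
```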

get_model_with_varied_parameters(model, max_km_variation=None, max_kcat_variation=None, max_ki_variation=None, max_ka_variation=None, max_dG0_variation=None, varied_reacs=[], change_unknown_values=True, change_known_values=True, use_shuffling_instead_of_uniform_random=False, use_shuffling_with_putting_back=False, shuffle_using_distribution_of_values_with_reference=True)

Generates a modified copy of the input Model with varied reaction parameters.

This function creates a deep copy of the input Model and introduces random variations to several reaction parameters, including dG0, k_cat, k_ms, k_is, and k_as. The magnitude of the variation is controlled by the provided max_..._variation parameters. If a max_..._variation parameter is not provided (i.e., is None), the corresponding parameter will not be varied. Variations are applied randomly using a uniform distribution. For reactions with a reverse reaction, the dG0 values of the forward and reverse reactions are updated to maintain thermodynamic consistency.

Parameters:

Name Type Description Default
model Model

The Model object to be modified.

required
max_km_variation NonNegativeFloat | None

Maximum factor by which to vary k_M values. Defaults to None (no variation). If use_shuffling_instead_of_uniform_random=True, the magnitude is ignored, but None still disables variation of this parameter.

None
max_kcat_variation NonNegativeFloat | None

Maximum factor by which to vary k_cat. Defaults to None (no variation). If use_shuffling_instead_of_uniform_random=True, the magnitude is ignored, but None still disables variation of this parameter.

None
max_ki_variation NonNegativeFloat | None

Maximum factor by which to vary k_I values. Defaults to None (no variation). If use_shuffling_instead_of_uniform_random=True, the magnitude is ignored, but None still disables variation of this parameter.

None
max_ka_variation NonNegativeFloat | None

Maximum factor by which to vary k_A values. Defaults to None (no variation). If use_shuffling_instead_of_uniform_random=True, the magnitude is ignored, but None still disables variation of this parameter.

None
max_dG0_variation NonNegativeFloat | None

Maximum factor by which to vary dG0. Defaults to None (no variation). If use_shuffling_instead_of_uniform_random=True, the magnitude is ignored, but None still disables variation of this parameter.

None
varied_reacs list[str]

If not [], only reactions with IDs in this list are varied. Defaults to [].

[]
change_known_values bool

Change values if they are set with a taxonomic distance in their reference. Defaults to True.

True
change_unknown_values bool

Change values if they are not set with a taxonomic distance in their reference. Defaults to True.

True
use_shuffling_instead_of_uniform_random bool

Overrides the max variation parameters and instead shuffles values within each parameter class (k_cats, substrate k_Ms, product k_Ms, and so on).

False
shuffle_using_distribution_of_values_with_reference bool

If True (the default), shuffled values are drawn only from the distribution of values that have a reference; note that if change_unknown_values=True, unknown values are still shuffled, but using that reference-backed distribution.

True

Returns:

Type Description
Model

A deep copy of the input model with varied reaction parameters.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_model_with_varied_parameters(
    model: Model,
    max_km_variation: NonNegativeFloat | None = None,
    max_kcat_variation: NonNegativeFloat | None = None,
    max_ki_variation: NonNegativeFloat | None = None,
    max_ka_variation: NonNegativeFloat | None = None,
    max_dG0_variation: NonNegativeFloat | None = None,
    varied_reacs: list[str] = [],
    change_unknown_values: bool = True,
    change_known_values: bool = True,
    use_shuffling_instead_of_uniform_random: bool = False,
    use_shuffling_with_putting_back: bool = False,
    shuffle_using_distribution_of_values_with_reference: bool = True,
) -> Model:
    """Generates a modified copy of the input Model with varied reaction parameters.

    This function creates a deep copy of the input Model and introduces random variations
    to several reaction parameters, including dG0, k_cat, k_ms, k_is, and k_as.  The
    magnitude of the variation is controlled by the provided `max_..._variation`
    parameters.  If a `max_..._variation` parameter is not provided (i.e., is None),
    the corresponding parameter will not be varied.  Variations are applied randomly
    using a uniform distribution.  For reactions with a reverse reaction, the dG0 values
    of the forward and reverse reactions are updated to maintain thermodynamic consistency.

    Args:
        model: The Model object to be modified.
        max_km_variation: Maximum factor by which to vary k_Ms. Defaults to None.
            When ```use_shuffling_instead_of_uniform_random=True```, the magnitude is ignored,
            but None still disables variation of this parameter.
        max_kcat_variation: Maximum factor by which to vary k_cat. Defaults to None.
            When ```use_shuffling_instead_of_uniform_random=True```, the magnitude is ignored,
            but None still disables variation of this parameter.
        max_ki_variation: Maximum factor by which to vary k_is. Defaults to None.
            When ```use_shuffling_instead_of_uniform_random=True```, the magnitude is ignored,
            but None still disables variation of this parameter.
        max_ka_variation: Maximum factor by which to vary k_as. Defaults to None.
            When ```use_shuffling_instead_of_uniform_random=True```, the magnitude is ignored,
            but None still disables variation of this parameter.
        max_dG0_variation: Maximum factor by which to vary dG0. Defaults to None.
            When ```use_shuffling_instead_of_uniform_random=True```, the magnitude is ignored,
            but None still disables variation of this parameter.
        varied_reacs: If not [], only reactions with IDs in this list are varied. Defaults to [].
        change_known_values: Change values if they *are* set with a
            taxonomic distance in their reference. Defaults to True.
        change_unknown_values: Change values if they are *not* set with a
            taxonomic distance in their reference. Defaults to True.
        use_shuffling_instead_of_uniform_random: If True, ignores the magnitudes of the
            ```max_..._variation``` parameters and instead shuffles values among the known
            k_cats, substrate k_Ms, product k_Ms, and so on. Defaults to False.
        use_shuffling_with_putting_back: If True, shuffling draws values with replacement,
            i.e. chosen values remain in the pool. Defaults to False.
        shuffle_using_distribution_of_values_with_reference: If True (the default), shuffled values
            are drawn only from the distribution of values that have a reference; note that if
            ```change_unknown_values=True```, unknown values are still shuffled, but using that
            reference-backed distribution.

    Returns:
        A deep copy of the input model with varied reaction parameters.
    """
    varied_model = deepcopy(model)
    tested_rev_reacs: list[str] = []
    if use_shuffling_instead_of_uniform_random:
        if max_km_variation is not None:
            substrate_kms, product_kms = get_model_kms_by_usage(
                model,
                return_only_values_with_reference=shuffle_using_distribution_of_values_with_reference,
            )
            all_substrate_km_indices = list(range(len(substrate_kms)))
            all_product_km_indices = list(range(len(product_kms)))
        if max_dG0_variation is not None:
            all_dG0s = get_model_dG0s(model)
            all_dG0_indices = list(range(len(all_dG0s)))
        if max_kcat_variation is not None:
            all_kcats = get_model_kcats(model)
            all_kcat_indices = list(range(len(all_kcats)))
        if max_ki_variation is not None:
            all_kis = get_model_kis(model)
            all_ki_indices = list(range(len(all_kis)))
        if max_ka_variation is not None:
            all_kas = get_model_kas(model)
            all_ka_indices = list(range(len(all_kas)))
    for reac_id, reaction in varied_model.reactions.items():
        if (varied_reacs != []) and (reac_id not in varied_reacs):
            continue
        if (
            max_dG0_variation is not None
            and reaction.dG0 is not None
            and reac_id not in tested_rev_reacs
        ):
            if use_shuffling_instead_of_uniform_random:
                if not use_shuffling_with_putting_back:
                    chosen_index = choice(all_dG0_indices)
                    reaction.dG0 = all_dG0s[chosen_index]
                    del all_dG0_indices[all_dG0_indices.index(chosen_index)]
                else:
                    reaction.dG0 = choice(all_dG0s)
            else:
                reaction.dG0 += uniform(-max_dG0_variation, +max_dG0_variation)  # noqa: NPY002
            rev_id = get_reverse_reac_id_if_existing(
                reac_id=reac_id,
                fwd_suffix=varied_model.fwd_suffix,
                rev_suffix=varied_model.rev_suffix,
            )
            if rev_id in varied_model.reactions:
                varied_model.reactions[rev_id].dG0 = -reaction.dG0
                tested_rev_reacs.append(rev_id)
        if reaction.enzyme_reaction_data is not None:
            if max_kcat_variation is not None:
                kcat_tax_distance = (
                    -1
                    if len(reaction.enzyme_reaction_data.k_cat_references) == 0
                    else reaction.enzyme_reaction_data.k_cat_references[0].tax_distance
                )
                if (change_known_values and kcat_tax_distance >= 0) or (
                    change_unknown_values and kcat_tax_distance < 0
                ):
                    if use_shuffling_instead_of_uniform_random:
                        if not use_shuffling_with_putting_back:
                            chosen_index = choice(all_kcat_indices)
                            reaction.enzyme_reaction_data.k_cat = all_kcats[
                                chosen_index
                            ]
                            del all_kcat_indices[all_kcat_indices.index(chosen_index)]
                        else:
                            reaction.enzyme_reaction_data.k_cat = choice(all_kcats)
                    else:
                        reaction.enzyme_reaction_data.k_cat *= max_kcat_variation ** (
                            uniform(-1, 1)  # noqa: NPY002
                        )  # noqa: NPY002
            if max_km_variation is not None:
                for met_id in reaction.enzyme_reaction_data.k_ms:
                    references = reaction.enzyme_reaction_data.k_m_references
                    km_tax_distance = (
                        -1
                        if met_id not in references or len(references[met_id]) == 0
                        else references[met_id][0].tax_distance
                    )
                    if not (
                        (change_known_values and km_tax_distance >= 0)
                        or (change_unknown_values and km_tax_distance < 0)
                    ):
                        continue
                    if (
                        met_id in reaction.stoichiometries
                        and reaction.stoichiometries[met_id] < 0.0
                    ):  # Substrate k_ms
                        if use_shuffling_instead_of_uniform_random:
                            chosen_index = choice(all_substrate_km_indices)
                            if not use_shuffling_with_putting_back:
                                reaction.enzyme_reaction_data.k_ms[met_id] = (
                                    substrate_kms[chosen_index]
                                )
                                del all_substrate_km_indices[
                                    all_substrate_km_indices.index(chosen_index)
                                ]
                            else:
                                reaction.enzyme_reaction_data.k_ms[met_id] = choice(
                                    substrate_kms
                                )
                        else:
                            reaction.enzyme_reaction_data.k_ms[met_id] *= (
                                max_km_variation ** (uniform(-1, 1))  # noqa: NPY002
                            )  # noqa: NPY002
                    else:  # Product k_ms
                        if use_shuffling_instead_of_uniform_random:
                            if not use_shuffling_with_putting_back:
                                chosen_index = choice(all_product_km_indices)
                                reaction.enzyme_reaction_data.k_ms[met_id] = (
                                    product_kms[chosen_index]
                                )
                                del all_product_km_indices[
                                    all_product_km_indices.index(chosen_index)
                                ]
                            else:
                                reaction.enzyme_reaction_data.k_ms[met_id] = choice(
                                    product_kms
                                )
                        else:
                            reaction.enzyme_reaction_data.k_ms[met_id] *= (
                                max_km_variation ** (uniform(-1, 1))  # noqa: NPY002
                            )  # noqa: NPY002
            if max_ki_variation is not None:
                references = reaction.enzyme_reaction_data.k_i_references
                for met_id in reaction.enzyme_reaction_data.k_is:
                    ki_tax_distance = (
                        -1
                        if met_id not in references or len(references[met_id]) == 0
                        else references[met_id][0].tax_distance
                    )
                    if not (
                        (change_known_values and ki_tax_distance >= 0)
                        or (change_unknown_values and ki_tax_distance < 0)
                    ):
                        continue
                    if use_shuffling_instead_of_uniform_random:
                        if not use_shuffling_with_putting_back:
                            chosen_index = choice(all_ki_indices)
                            reaction.enzyme_reaction_data.k_is[met_id] = all_kis[
                                chosen_index
                            ]
                            del all_ki_indices[all_ki_indices.index(chosen_index)]
                        else:
                            reaction.enzyme_reaction_data.k_is[met_id] = choice(all_kis)
                    else:
                        reaction.enzyme_reaction_data.k_is[met_id] *= (
                            max_ki_variation ** uniform(-1, 1)  # noqa: NPY002
                        )
            if max_ka_variation is not None:
                references = reaction.enzyme_reaction_data.k_a_references
                for met_id in reaction.enzyme_reaction_data.k_as:
                    ka_tax_distance = (
                        -1
                        if met_id not in references or len(references[met_id]) == 0
                        else references[met_id][0].tax_distance
                    )
                    if not (
                        (change_known_values and ka_tax_distance >= 0)
                        or (change_unknown_values and ka_tax_distance < 0)
                    ):
                        continue
                    if use_shuffling_instead_of_uniform_random:
                        if not use_shuffling_with_putting_back:
                            chosen_index = choice(all_ka_indices)
                            reaction.enzyme_reaction_data.k_as[met_id] = all_kas[
                                chosen_index
                            ]
                            del all_ka_indices[all_ka_indices.index(chosen_index)]
                        else:
                            reaction.enzyme_reaction_data.k_as[met_id] = choice(all_kas)
                    else:
                        reaction.enzyme_reaction_data.k_as[met_id] *= (
                            max_ka_variation ** uniform(-1, 1)  # noqa: NPY002
                        )
    return varied_model
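The two variation modes can be sketched independently of COBRAk: in uniform mode a parameter is multiplied by max_variation ** uniform(-1, 1), which bounds it to [value / max_variation, value * max_variation]; in shuffling mode without putting back, each drawn index is removed from the pool so every known value is reused at most once. A minimal standalone sketch (helper names are illustrative, not part of the COBRAk API):

```python
from random import choice, uniform

def vary_uniform(value: float, max_variation: float) -> float:
    """Multiply value by a random factor in [1/max_variation, max_variation]."""
    return value * max_variation ** uniform(-1, 1)

def draw_without_putting_back(values: list[float], index_pool: list[int]) -> float:
    """Pick a value via a random index and remove that index from the pool."""
    chosen_index = choice(index_pool)
    index_pool.remove(chosen_index)
    return values[chosen_index]

varied = vary_uniform(100.0, 10.0)
assert 10.0 <= varied <= 1000.0  # bounded by the variation factor

kcats = [1.0, 2.0, 3.0]
pool = list(range(len(kcats)))
picked = draw_without_putting_back(kcats, pool)
assert picked in kcats and len(pool) == 2  # index pool shrinks
```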

get_potentially_active_reactions_in_variability_dict(cobrak_model, variability_dict)

Identify potentially active reactions in a COBRAk model based on a variability dictionary.

This function returns a list of reaction IDs that are present in both the COBRAk model and the variability dictionary, and have a maximum flux greater than zero while having a minimum flux equal to zero. These reactions are considered potentially active.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions.

required
variability_dict dict[str, tuple[float, float]]

A dictionary mapping reaction IDs to their minimum and maximum flux values.

required

Returns:

Type Description
list[str]

list[str]: A list of reaction IDs that are potentially active.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_potentially_active_reactions_in_variability_dict(
    cobrak_model: Model, variability_dict: dict[str, tuple[float, float]]
) -> list[str]:
    """Identify potentially active reactions in a COBRAk model based on a variability dictionary.

    This function returns a list of reaction IDs that are present in both the COBRAk model and the variability dictionary,
    and have a maximum flux greater than zero while having a minimum flux equal to zero. These reactions are considered potentially active.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions.
        variability_dict (dict[str, tuple[float, float]]): A dictionary mapping reaction IDs to their minimum and maximum flux values.

    Returns:
        list[str]: A list of reaction IDs that are potentially active.
    """
    return [
        reac_id
        for reac_id in variability_dict
        if (reac_id in cobrak_model.reactions)
        and (variability_dict[reac_id][1] > 0.0)
        and (variability_dict[reac_id][0] <= 0.0)
    ]
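The comprehension keeps exactly those reactions whose variability interval touches zero at the lower end (min flux <= 0) while still permitting positive flux (max flux > 0). Replicated on toy data without a COBRAk Model:

```python
model_reactions = {"R1", "R2", "R3"}  # stand-in for cobrak_model.reactions
variability = {
    "R1": (0.0, 5.0),   # min <= 0, max > 0 -> potentially active
    "R2": (1.0, 5.0),   # min > 0 -> always active, excluded here
    "R3": (0.0, 0.0),   # max = 0 -> blocked, excluded
    "R4": (0.0, 9.0),   # not part of the model -> excluded
}
potentially_active = [
    reac_id
    for reac_id in variability
    if reac_id in model_reactions
    and variability[reac_id][1] > 0.0
    and variability[reac_id][0] <= 0.0
]
assert potentially_active == ["R1"]
```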

get_pyomo_solution_as_dict(model)

Returns the pyomo solution as a dictionary of { "$VAR_NAME": "$VAR_VALUE", ... }

Value is None for all uninitialized variables.

Parameters:

Name Type Description Default
model ConcreteModel

The pyomo model

required

Returns:

Type Description
dict[str, float]

dict[str, float]: The solution dictionary

Source code in cobrak/utilities.py
def get_pyomo_solution_as_dict(model: ConcreteModel) -> dict[str, float]:
    """Returns the pyomo solution as a dictionary of { "$VAR_NAME": "$VAR_VALUE", ... }

    Value is None for all uninitialized variables.

    Args:
        model (ConcreteModel): The pyomo model

    Returns:
        dict[str, float]: The solution dictionary
    """
    model_var_names = [v.name for v in model.component_objects(Var)]
    solution_dict = {}
    for model_var_name in model_var_names:
        try:
            var_value = getattr(model, model_var_name).value
        except ValueError:
            var_value = None  # Uninitialized variable (e.g., x_Biomass)
        solution_dict[model_var_name] = var_value
    return solution_dict

get_reaction_enzyme_var_id(reac_id, reaction)

Returns the pyomo model name of the reaction's enzyme

Parameters:

Name Type Description Default
reac_id str

Reaction ID

required
reaction Reaction

Reaction instance

required

Returns:

Name Type Description
str str

Reaction enzyme's name

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_reaction_enzyme_var_id(reac_id: str, reaction: Reaction) -> str:
    """Returns the pyomo model name of the reaction's enzyme

    Args:
        reac_id (str): Reaction ID
        reaction (Reaction): Reaction instance

    Returns:
        str: Reaction enzyme's name
    """
    if reaction.enzyme_reaction_data is None:
        return ""
    return (
        ENZYME_VAR_PREFIX
        + get_full_enzyme_id(reaction.enzyme_reaction_data.identifiers)
        + ENZYME_VAR_INFIX
        + reac_id
    )
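The variable ID is a plain string concatenation: prefix, full enzyme ID, infix, reaction ID. With placeholder values for the two constants (the actual ENZYME_VAR_PREFIX and ENZYME_VAR_INFIX are defined in COBRAk's constants module and may differ), the pattern looks like:

```python
# Placeholder values; COBRAk's real constants may differ
ENZYME_VAR_PREFIX = "enzyme_"
ENZYME_VAR_INFIX = "_of_"

def enzyme_var_id(reac_id: str, full_enzyme_id: str) -> str:
    """Mimics the concatenation done by get_reaction_enzyme_var_id."""
    return ENZYME_VAR_PREFIX + full_enzyme_id + ENZYME_VAR_INFIX + reac_id

assert enzyme_var_id("PGI", "b4025") == "enzyme_b4025_of_PGI"
```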

get_reaction_string(cobrak_model, reac_id)

Generate a string representation of a reaction in a COBRAk model.

This function constructs a string that represents the stoichiometry of a specified reaction, including the direction of the reaction based on its flux bounds. E.g., a reaction R1: A ⇒ B with flux bounds [0, 1000] is returned as "-1 A ⇒ 1 B" (stoichiometries are printed with their sign).

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing the reaction.

required
reac_id str

The ID of the reaction to be represented as a string.

required

Returns:

Name Type Description
str str

A string representation of the reaction, showing educts, products, and the reaction direction.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_reaction_string(cobrak_model: Model, reac_id: str) -> str:
    """Generate a string representation of a reaction in a COBRAk model.

    This function constructs a string that represents the stoichiometry of a specified reaction,
    including the direction of the reaction based on its flux bounds. E.g., a reaction
    R1: A ⇒ B, [0, 1000]
    is returned as "-1 A ⇒ 1 B" (stoichiometries are printed with their sign)

    Args:
        cobrak_model (Model): The COBRAk model containing the reaction.
        reac_id (str): The ID of the reaction to be represented as a string.

    Returns:
        str: A string representation of the reaction, showing educts, products, and the reaction direction.
    """
    reaction = cobrak_model.reactions[reac_id]
    educt_parts = []
    product_parts = []
    for met_id, stoichiometry in reaction.stoichiometries.items():
        met_string = f"{stoichiometry} {met_id}"
        if stoichiometry > 0:
            product_parts.append(met_string)
        else:
            educt_parts.append(met_string)
    if (reaction.min_flux < 0) and (reaction.max_flux > 0):
        arrow = "⇔"
    elif (reaction.min_flux < 0) and (reaction.max_flux <= 0):
        arrow = "⇐"
    else:
        arrow = "⇒"

    return " + ".join(educt_parts) + " " + arrow + " " + " + ".join(product_parts)
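The arrow depends only on the flux bounds: ⇔ if the reaction can run in both directions, ⇐ if only backward, ⇒ otherwise. A standalone re-implementation over a plain stoichiometry dict (a sketch, not the COBRAk API itself):

```python
def reaction_string(stoichiometries: dict[str, float], min_flux: float, max_flux: float) -> str:
    """Build an 'educts ARROW products' string from signed stoichiometries."""
    educt_parts = []
    product_parts = []
    for met_id, stoichiometry in stoichiometries.items():
        met_string = f"{stoichiometry} {met_id}"
        if stoichiometry > 0:
            product_parts.append(met_string)
        else:
            educt_parts.append(met_string)
    if (min_flux < 0) and (max_flux > 0):
        arrow = "⇔"  # reversible
    elif (min_flux < 0) and (max_flux <= 0):
        arrow = "⇐"  # backward only
    else:
        arrow = "⇒"  # forward only
    return " + ".join(educt_parts) + " " + arrow + " " + " + ".join(product_parts)

assert reaction_string({"A": -1, "B": 1}, 0.0, 1000.0) == "-1 A ⇒ 1 B"
assert reaction_string({"A": -1, "B": 1}, -1000.0, 1000.0) == "-1 A ⇔ 1 B"
```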

get_reverse_reac_id_if_existing(reac_id, fwd_suffix=REAC_FWD_SUFFIX, rev_suffix=REAC_REV_SUFFIX)

Returns the ID of the reverse reaction if it exists, otherwise returns an empty string.

Parameters:

Name Type Description Default
reac_id str

The ID of the reaction.

required
fwd_suffix str

The suffix used to identify forward reactions.

REAC_FWD_SUFFIX
rev_suffix str

The suffix used to identify reverse reactions.

REAC_REV_SUFFIX

Returns:

Name Type Description
str str

The ID of the reverse reaction if it exists, otherwise an empty string.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_reverse_reac_id_if_existing(
    reac_id: str,
    fwd_suffix: str = REAC_FWD_SUFFIX,
    rev_suffix: str = REAC_REV_SUFFIX,
) -> str:
    """Returns the ID of the reverse reaction if it exists, otherwise returns an empty string.

    Args:
        reac_id (str): The ID of the reaction.
        fwd_suffix (str, optional): The suffix used to identify forward reactions. Defaults to REAC_FWD_SUFFIX.
        rev_suffix (str, optional): The suffix used to identify reverse reactions. Defaults to REAC_REV_SUFFIX.

    Returns:
        str: The ID of the reverse reaction if it exists, otherwise an empty string.
    """
    if reac_id.endswith(fwd_suffix):
        return reac_id.replace(fwd_suffix, rev_suffix)
    if reac_id.endswith(rev_suffix):
        return reac_id.replace(rev_suffix, fwd_suffix)
    return ""
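The suffix swap can be exercised in isolation, here with illustrative "_FWD"/"_REV" suffixes standing in for REAC_FWD_SUFFIX and REAC_REV_SUFFIX (their actual values may differ):

```python
def reverse_reac_id(reac_id: str, fwd_suffix: str = "_FWD", rev_suffix: str = "_REV") -> str:
    """Swap the forward/reverse suffix; empty string if neither suffix matches."""
    if reac_id.endswith(fwd_suffix):
        return reac_id.replace(fwd_suffix, rev_suffix)
    if reac_id.endswith(rev_suffix):
        return reac_id.replace(rev_suffix, fwd_suffix)
    return ""

assert reverse_reac_id("PGI_FWD") == "PGI_REV"
assert reverse_reac_id("PGI_REV") == "PGI_FWD"
assert reverse_reac_id("PGI") == ""
```

Note that str.replace substitutes every occurrence of the suffix string, not only the trailing one, so the suffixes should not occur elsewhere in reaction IDs.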

get_solver_status_from_pyomo_results(pyomo_results)

Returns the solver status from the pyomo results as an integer code.

This function interprets the solver status from a SolverResults object and returns a corresponding integer code. The mapping is as follows: - 0 for SolverStatus.ok - 1 for SolverStatus.warning - 2 for SolverStatus.error - 3 for SolverStatus.aborted - 4 for SolverStatus.unknown

Parameters:

Name Type Description Default
pyomo_results SolverResults

The results object from a Pyomo solver containing the solver status.

required

Raises:

Type Description
ValueError

If the solver status is not recognized.

Returns:

Name Type Description
int NonNegativeInt

An integer code representing the solver status.

Source code in cobrak/utilities.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True), validate_return=True)
def get_solver_status_from_pyomo_results(
    pyomo_results: SolverResults,
) -> NonNegativeInt:
    """Returns the solver status from the pyomo results as an integer code.

    This function interprets the solver status from a `SolverResults` object and returns a corresponding integer code.
    The mapping is as follows:
    - 0 for `SolverStatus.ok`
    - 1 for `SolverStatus.warning`
    - 2 for `SolverStatus.error`
    - 3 for `SolverStatus.aborted`
    - 4 for `SolverStatus.unknown`

    Args:
        pyomo_results (SolverResults): The results object from a Pyomo solver containing the solver status.

    Raises:
        ValueError: If the solver status is not recognized.

    Returns:
        int: An integer code representing the solver status.
    """
    match pyomo_results.solver.status:
        case SolverStatus.ok:
            return 0
        case SolverStatus.warning:
            return 1
        case SolverStatus.error:
            return 2
        case SolverStatus.aborted:
            return 3
        case SolverStatus.unknown:
            return 4
        case _:
            raise ValueError

get_sorted_model_dG0s(cobrak_model, abs_values=False, exclude_bw_reacs=True)

Extracts standard Gibbs free energy changes (dG0) from reactions in the model and returns them, with reaction IDs, in ascending order.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions with thermodynamic data.

required
abs_values bool

If True, returns absolute values of dG0. Defaults to False.

False
exclude_bw_reacs bool

If True, reactions whose ID ends with the model's reverse suffix (rev_suffix) are skipped. Defaults to True.

True

Returns:

Type Description
list[tuple[str, float]]

list[tuple[str, float]]: A list of (reac_id, dG0) values, possibly as absolute values if specified.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_sorted_model_dG0s(
    cobrak_model: Model, abs_values: bool = False, exclude_bw_reacs: bool = True
) -> list[tuple[str, float]]:
    """Extracts standard Gibbs free energy changes (dG0) from reactions in the model and returns them,
       with reaction IDs, in ascending order.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions with thermodynamic data.
        abs_values (bool, optional): If True, returns absolute values of dG0. Defaults to False.
        exclude_bw_reacs (bool, optional): If True, reactions whose ID ends with the model's
            rev_suffix are skipped. Defaults to True.

    Returns:
        list[tuple[str, float]]: A list of (reac_id, dG0) values, possibly as absolute values if specified.
    """
    dG0s = []
    for reac_id, reaction in cobrak_model.reactions.items():
        if exclude_bw_reacs and reac_id.endswith(cobrak_model.rev_suffix):
            continue
        if reaction.dG0 is not None:
            dG0s.append(
                (reac_id, abs(reaction.dG0)) if abs_values else (reac_id, reaction.dG0)
            )
    return sorted(dG0s, key=operator.itemgetter(1))
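The ascending order comes from sorting the (reac_id, dG0) tuples on their second element via operator.itemgetter(1); on toy data:

```python
import operator

dG0s = [("R_a", 12.5), ("R_b", -30.1), ("R_c", 3.7)]
sorted_dG0s = sorted(dG0s, key=operator.itemgetter(1))
assert sorted_dG0s == [("R_b", -30.1), ("R_c", 3.7), ("R_a", 12.5)]

# With abs_values=True, the tuples carry |dG0| instead, changing the order:
abs_sorted = sorted(((r, abs(d)) for r, d in dG0s), key=operator.itemgetter(1))
assert abs_sorted == [("R_c", 3.7), ("R_a", 12.5), ("R_b", 30.1)]
```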

get_sorted_model_kas(cobrak_model, return_only_values_with_reference=False)

Collects k_A values from a COBRA-k model, in ascending order together with associated reaction and metabolite IDs.

This function iterates through the reactions in a COBRA-k model and extracts the k_A values associated with each metabolite.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k Model object.

required
return_only_values_with_reference bool

Returns only values with a given database reference. Defaults to False.

False

Returns:

Type Description
list[tuple[str, str, PositiveFloat]]

A list containing the (reac_id, met_id, k_A) values

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_sorted_model_kas(
    cobrak_model: Model,
    return_only_values_with_reference: bool = False,
) -> list[tuple[str, str, PositiveFloat]]:
    """Collects k_A values from a COBRA-k model, in ascending order together with associated reaction and metabolite IDs.

    This function iterates through the reactions in a COBRA-k model and extracts the
    k_A values associated with each metabolite

    Args:
        cobrak_model: The COBRA-k Model object.
        return_only_values_with_reference: Returns only values with a given database reference. Defaults to False.

    Returns:
        A list containing the (reac_id, met_id, k_A) values
    """
    all_kas: list[tuple[str, PositiveFloat]] = []
    for reac_id, reaction in cobrak_model.reactions.items():
        if reaction.enzyme_reaction_data is None:
            continue
        for met_id, k_a in reaction.enzyme_reaction_data.k_as.items():
            if return_only_values_with_reference:
                references = reaction.enzyme_reaction_data.k_a_references
                if (met_id not in references) or (len(references[met_id]) == 0):
                    tax_distance = -1
                else:
                    tax_distance = references[met_id][0].tax_distance
                if tax_distance < 0:
                    continue
            all_kas.append((reac_id, met_id, k_a))
    return sorted(all_kas, key=operator.itemgetter(2))

get_sorted_model_kcats(cobrak_model)

Extracts k_cat values from reactions with enzyme data in the model, in ascending order together with the associated reaction ID.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions with enzyme data.

required

Returns:

Type Description
list[tuple[str, float]]

list[tuple[str, float]]: A list of (reac_id, k_cat) values for reactions with available enzyme data.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_sorted_model_kcats(cobrak_model: Model) -> list[tuple[str, float]]:
    """Extracts k_cat values from reactions with enzyme data in the model, in ascending order
       together with the associated reaction ID.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions with enzyme data.

    Returns:
        list[tuple[str, float]]: A list of (reac_id, k_cat) values for reactions with available enzyme data.
    """
    kcats = []
    for reac_id, reaction in cobrak_model.reactions.items():
        if (
            reaction.enzyme_reaction_data is not None
            and reaction.enzyme_reaction_data.k_cat < 1e19
        ):
            kcats.append((reac_id, reaction.enzyme_reaction_data.k_cat))
    return sorted(kcats, key=operator.itemgetter(1))
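Note the k_cat < 1e19 guard: extremely large values appear to serve as a sentinel for missing k_cat data and are excluded before sorting. On toy data:

```python
import operator

kcats = [("R1", 50.0), ("R2", 1e20), ("R3", 7.5)]  # 1e20 stands for "no k_cat known"
kept = [(reac_id, k_cat) for reac_id, k_cat in kcats if k_cat < 1e19]
assert kept == [("R1", 50.0), ("R3", 7.5)]
assert sorted(kept, key=operator.itemgetter(1)) == [("R3", 7.5), ("R1", 50.0)]
```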

get_sorted_model_kis(cobrak_model, return_only_values_with_reference=False)

Collects k_I values from a COBRA-k model and returns them, with reaction and metabolite IDs, in ascending order.

This function iterates through the reactions in a COBRA-k model and extracts the k_I values associated with each metabolite.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k Model object.

required
return_only_values_with_reference bool

Returns only values with a given database reference. Defaults to False.

False

Returns:

Type Description
list[tuple[str, str, PositiveFloat]]

A list containing the (reac_id, met_id, k_I) values

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_sorted_model_kis(
    cobrak_model: Model,
    return_only_values_with_reference: bool = False,
) -> list[tuple[str, str, PositiveFloat]]:
    """Collects k_I values from a COBRA-k model and returns them, with reaction and metabolite IDs, in ascending order.

    This function iterates through the reactions in a COBRA-k model and extracts the
    k_I values associated with each metabolite

    Args:
        cobrak_model: The COBRA-k Model object.
        return_only_values_with_reference: Returns only values with a given database reference. Defaults to False.

    Returns:
        A list containing the (reac_id, met_id, k_I) values
    """
    all_kis: list[tuple[str, PositiveFloat]] = []
    for reac_id, reaction in cobrak_model.reactions.items():
        if reaction.enzyme_reaction_data is None:
            continue
        for met_id, k_i in reaction.enzyme_reaction_data.k_is.items():
            if return_only_values_with_reference:
                references = reaction.enzyme_reaction_data.k_i_references
                if (met_id not in references) or (len(references[met_id]) == 0):
                    tax_distance = -1
                else:
                    tax_distance = references[met_id][0].tax_distance
                if tax_distance < 0:
                    continue
            all_kis.append((reac_id, met_id, k_i))
    return sorted(all_kis, key=operator.itemgetter(2))
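The returned format can be illustrated with a minimal, self-contained sketch; the reaction IDs, metabolite IDs, and k_I values below are hypothetical, not taken from any COBRAk model:

```python
from operator import itemgetter

# Hypothetical (reac_id, met_id, k_I) triples in the format returned by
# get_sorted_model_kis; IDs and values are made up for illustration.
kis = [
    ("R_pfk", "M_atp_c", 1.2e-3),
    ("R_pyk", "M_atp_c", 4.0e-4),
    ("R_fba", "M_fdp_c", 2.5e-3),
]

# Sort ascending by the k_I value (tuple index 2), as the function does.
sorted_kis = sorted(kis, key=itemgetter(2))
print([t[0] for t in sorted_kis])  # ['R_pyk', 'R_pfk', 'R_fba']
```

The smallest k_I (i.e., the tightest inhibition) comes first in the result.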

get_sorted_model_kms_by_usage(cobrak_model, return_only_values_with_reference=False)

Collects k_M values from a COBRA-k model, separating them into substrate and product lists, and returns them in ascending order, together with their associated metabolite and reaction IDs.

This function iterates through the reactions in a COBRA-k model and extracts the k_M values associated with each metabolite. It distinguishes between substrates (metabolites with negative stoichiometry) and products (metabolites with positive stoichiometry) and separates the corresponding k_M values into two lists.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA-k Model object.

required
return_only_values_with_reference bool

Returns only values with a given database reference. Defaults to False.

False

Returns:

Type Description
list[tuple[str, str, PositiveFloat]]

A tuple containing two lists: the first list contains (reac_id, met_id, k_M) tuples for substrates,

list[tuple[str, str, PositiveFloat]]

and the second list contains the same for products.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_sorted_model_kms_by_usage(
    cobrak_model: Model,
    return_only_values_with_reference: bool = False,
) -> tuple[list[tuple[str, str, PositiveFloat]], list[tuple[str, str, PositiveFloat]]]:
    """Collects k_M values from a COBRA-k model, separating them into substrate and product lists,
       and returns them in ascending order, together with their associated metabolite and reaction IDs.

    This function iterates through the reactions in a COBRA-k model and extracts the
    k_M values associated with each metabolite. It distinguishes between substrates
    (metabolites with negative stoichiometry) and products (metabolites with positive
    stoichiometry) and separates the corresponding k_M values into two lists.

    Args:
        cobrak_model: The COBRA-k Model object.
        return_only_values_with_reference: Returns only values with a given database reference. Defaults to False.

    Returns:
        A tuple containing two lists: the first list contains (reac_id, met_id, k_m) tuples for substrates,
        and the second list contains the same for products.
    """
    substrate_kms: list[tuple[str, PositiveFloat]] = []
    product_kms: list[tuple[str, PositiveFloat]] = []
    for reac_id, reaction in cobrak_model.reactions.items():
        if reaction.enzyme_reaction_data is None:
            continue
        for met_id, stoichiometry in reaction.stoichiometries.items():
            if met_id not in reaction.enzyme_reaction_data.k_ms:
                continue
            if return_only_values_with_reference:
                references = reaction.enzyme_reaction_data.k_m_references
                if (met_id not in references) or (len(references[met_id]) == 0):
                    tax_distance = -1
                else:
                    tax_distance = references[met_id][0].tax_distance
                if tax_distance < 0:
                    continue
            met_km = reaction.enzyme_reaction_data.k_ms[met_id]
            if stoichiometry < 0:
                substrate_kms.append((reac_id, met_id, met_km))
            else:
                product_kms.append((reac_id, met_id, met_km))
    return sorted(substrate_kms, key=operator.itemgetter(2)), sorted(
        product_kms, key=operator.itemgetter(2)
    )

get_stoichiometric_matrix(cobrak_model)

Returns the model's stoichiometric matrix.

The matrix is returned as a list of float lists, where each inner float list stands for a metabolite and each entry in it for a reaction.

Parameters:

Name Type Description Default
cobrak_model Model

The model

required

Returns:

Type Description
list[list[float]]

list[list[float]]: The stoichiometric matrix

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_stoichiometric_matrix(cobrak_model: Model) -> list[list[float]]:
    """Returns the model's stoichiometric matrix.

    The matrix is returned as a list of float lists, where each float list
    stands for a metabolite and each entry in the float list for a reaction.

    Args:
        cobrak_model (Model): The model

    Returns:
        list[list[float]]: The stoichiometric matrix
    """
    matrix: list[list[float]] = []
    for met_id in cobrak_model.metabolites:
        met_row: list[float] = []
        for reac_data in cobrak_model.reactions.values():
            if met_id in reac_data.stoichiometries:
                met_row.append(reac_data.stoichiometries[met_id])
            else:
                met_row.append(0.0)
        matrix.append(met_row.copy())
    return matrix
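The row/column convention can be sketched with plain dictionaries (metabolite and reaction names below are hypothetical): one row per metabolite, one column per reaction, with 0.0 wherever a metabolite does not take part in a reaction.

```python
# Hypothetical toy network: R1 converts A -> B, R2 converts B -> C.
metabolites = ["A", "B", "C"]
reactions = {
    "R1": {"A": -1.0, "B": 1.0},
    "R2": {"B": -1.0, "C": 1.0},
}

# One row per metabolite, one column per reaction, 0.0 for absent pairs.
matrix = [
    [reactions[reac_id].get(met_id, 0.0) for reac_id in reactions]
    for met_id in metabolites
]
print(matrix)  # [[-1.0, 0.0], [1.0, -1.0], [0.0, 1.0]]
```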

get_stoichiometrically_coupled_reactions(cobrak_model, rounding=10)

Returns stoichiometrically coupled reactions.

The returned format is as follows: say that reactions (R1 & R2) as well as (R5 & R6 & R7) are stoichiometrically coupled (i.e., their fluxes are in a strict linear relationship to each other); then this function returns [["R1", "R2"], ["R5", "R6", "R7"]].

The identification of stoichiometrically coupled reactions happens through the calculation of the model's stoichiometric matrix nullspace.

Parameters:

Name Type Description Default
cobrak_model Model

The model

required
rounding int

Precision for the calculation of the nullspace. Defaults to 10.

10

Returns:

Type Description
list[list[str]]

list[list[str]]: The stoichiometrically coupled reactions

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_stoichiometrically_coupled_reactions(
    cobrak_model: Model, rounding: NonNegativeInt = 10
) -> list[list[str]]:
    """Returns stoichiometrically coupled reactions.

    The returned format is as follows: say that reactions (R1 & R2) as well
    as (R5 & R6 & R7) are stoichiometrically coupled (i.e., their fluxes are
    in a strict linear relationship to each other); then this function
    returns [["R1", "R2"], ["R5", "R6", "R7"]].

    The identification of stoichiometrically coupled reactions happens through
    the calculation of the model's stoichiometric matrix nullspace.

    Args:
        cobrak_model (Model): The model
        rounding (int, optional): Precision for the calculation of the nullspace. Defaults to 10.

    Returns:
        list[list[str]]: The stoichiometrically coupled reactions
    """
    # Calculate nullspace and convert each row to rounded tuples
    null_space_matrix = null_space(get_stoichiometric_matrix(cobrak_model))
    null_space_tuples = [
        tuple(round(value, rounding) for value in row) for row in null_space_matrix
    ]

    # Map the null space tuples to reaction indices
    occurrences: dict[tuple[float, ...], list[int]] = {}
    for reac_idx, null_space_tuple in enumerate(null_space_tuples):
        if null_space_tuple not in occurrences:
            occurrences[null_space_tuple] = []
        occurrences[null_space_tuple].append(reac_idx)

    # Map the reaction indices to the final coupled reactions list
    coupled_reacs: list[list[str]] = []
    reac_ids = list(cobrak_model.reactions.keys())
    for coupled_indices in occurrences.values():
        coupled_reacs.append([reac_ids[reac_idx] for reac_idx in coupled_indices])

    return coupled_reacs
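The grouping step at the heart of this function can be sketched without scipy: reactions whose (rounded) nullspace rows are identical end up in one group. The rows below are hand-written for illustration, not a real nullspace.

```python
# Hand-written "nullspace rows", one per reaction (hypothetical values).
null_space_rows = [
    [0.5773502692],  # R1
    [0.5773502692],  # R2: identical row -> coupled with R1
    [0.5773502693],  # R3: differs at the 10th decimal, so not grouped
]
reac_ids = ["R1", "R2", "R3"]
rounding = 10  # default precision of get_stoichiometrically_coupled_reactions

# Group reaction IDs by their rounded nullspace row.
groups: dict = {}
for reac_id, row in zip(reac_ids, null_space_rows):
    key = tuple(round(value, rounding) for value in row)
    groups.setdefault(key, []).append(reac_id)

coupled = list(groups.values())
print(coupled)  # [['R1', 'R2'], ['R3']]
```

With a coarser `rounding`, R3 would be merged into the first group, which is why the precision is exposed as a parameter.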

get_substrate_and_product_exchanges(cobrak_model, optimization_dict={})

Identifies and categorizes reactions as substrate or product exchanges based on reaction stoichiometries.

This function analyzes each reaction in the provided COBRAk model to determine whether it primarily represents substrate consumption or product formation. It categorizes reactions into substrate reactions (where all stoichiometries are positive, indicating metabolite consumption) and product reactions (where all stoichiometries are negative, indicating metabolite production).

  • A reaction is classified as a substrate reaction if all its stoichiometries are positive, indicating that all metabolites involved are being consumed.
  • A reaction is classified as a product reaction if all its stoichiometries are negative, indicating that all metabolites involved are being produced.
  • If the optimization_dict is provided, only the reactions listed in this dictionary are considered for classification.
  • The function returns tuples of reaction IDs, which can be used for further processing or analysis of substrate and product reactions.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions to be analyzed.

required
optimization_dict dict[str, Any]

An optional dictionary to filter reactions. Only reactions whose IDs are present in this dictionary will be considered.

{}

Returns:

Type Description
tuple[tuple[str, ...], tuple[str, ...]]

tuple[tuple[str, ...], tuple[str, ...]]: A tuple containing two elements: - The first element is a tuple of reaction IDs identified as substrate reactions. - The second element is a tuple of reaction IDs identified as product reactions.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_substrate_and_product_exchanges(
    cobrak_model: Model, optimization_dict: dict[str, Any] = {}
) -> tuple[tuple[str, ...], tuple[str, ...]]:
    """Identifies and categorizes reactions as substrate or product exchanges based on reaction stoichiometries.

    This function analyzes each reaction in the provided COBRAk model to determine whether it primarily represents substrate consumption or product formation.
    It categorizes reactions into substrate reactions (where all stoichiometries are positive, indicating metabolite consumption) and product reactions
    (where all stoichiometries are negative, indicating metabolite production).

    * A reaction is classified as a substrate reaction if all its stoichiometries are positive, indicating that all metabolites involved are being consumed.
    * A reaction is classified as a product reaction if all its stoichiometries are negative, indicating that all metabolites involved are being produced.
    * If the `optimization_dict` is provided, only the reactions listed in this dictionary are considered for classification.
    * The function returns tuples of reaction IDs, which can be used for further processing or analysis of substrate and product reactions.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions to be analyzed.
        optimization_dict (dict[str, Any], optional): An optional dictionary to filter reactions. Only reactions whose IDs are present in this dictionary will be considered.
        Defaults to {}.

    Returns:
        tuple[tuple[str, ...], tuple[str, ...]]: A tuple containing two elements:
            - The first element is a tuple of reaction IDs identified as substrate reactions.
            - The second element is a tuple of reaction IDs identified as product reactions.
    """
    substrate_reac_ids: list[str] = []
    product_reac_ids: list[str] = []
    for reac_id, reaction in cobrak_model.reactions.items():
        if optimization_dict != {} and reac_id not in optimization_dict:
            continue
        stoichiometries = list(reaction.stoichiometries.values())
        if min(stoichiometries) > 0 and max(stoichiometries) > 0:
            substrate_reac_ids.append(reac_id)
        elif min(stoichiometries) < 0 and max(stoichiometries) < 0:
            product_reac_ids.append(reac_id)
    return tuple(substrate_reac_ids), tuple(product_reac_ids)
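A plain-Python sketch of the sign-based classification (reaction names and stoichiometries below are hypothetical):

```python
# Hypothetical exchange and internal reactions.
stoichiometries = {
    "EX_glc_in": {"M_glc_e": 1.0},            # all positive -> substrate exchange
    "EX_ac_out": {"M_ac_e": -1.0},            # all negative -> product exchange
    "R_internal": {"M_a": -1.0, "M_b": 1.0},  # mixed signs -> neither
}

substrate_reac_ids, product_reac_ids = [], []
for reac_id, stoich in stoichiometries.items():
    values = list(stoich.values())
    if min(values) > 0 and max(values) > 0:
        substrate_reac_ids.append(reac_id)
    elif min(values) < 0 and max(values) < 0:
        product_reac_ids.append(reac_id)

print(substrate_reac_ids, product_reac_ids)  # ['EX_glc_in'] ['EX_ac_out']
```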

get_termination_condition_from_pyomo_results(pyomo_results)

Returns the termination condition from the pyomo results as a float code.

This function interprets the termination condition from a SolverResults object and returns a corresponding float code. The mapping is as follows:

  • 0.1 for TerminationCondition.globallyOptimal
  • 0.2 for TerminationCondition.optimal
  • 0.3 for TerminationCondition.locallyOptimal
  • 1 for TerminationCondition.maxTimeLimit
  • 2 for TerminationCondition.maxIterations
  • 3 for TerminationCondition.minFunctionValue
  • 4 for TerminationCondition.minStepLength
  • 5 for TerminationCondition.maxEvaluations
  • 6 for TerminationCondition.other
  • 7 for TerminationCondition.unbounded
  • 8 for TerminationCondition.infeasible
  • 9 for TerminationCondition.invalidProblem
  • 10 for TerminationCondition.solverFailure
  • 11 for TerminationCondition.internalSolverError
  • 12 for TerminationCondition.error
  • 13 for TerminationCondition.userInterrupt
  • 14 for TerminationCondition.resourceInterrupt
  • 15 for TerminationCondition.licensingProblems
  • 16 for TerminationCondition.intermediateNonInteger

Parameters:

Name Type Description Default
pyomo_results SolverResults

The results object from a Pyomo solver containing the termination condition.

required

Raises:

Type Description
ValueError

If the termination condition is not recognized.

Returns:

Name Type Description
float NonNegativeFloat

A float code representing the termination condition.

Source code in cobrak/utilities.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True), validate_return=True)
def get_termination_condition_from_pyomo_results(
    pyomo_results: SolverResults,
) -> NonNegativeFloat:
    """Returns the termination condition from the pyomo results as a float code.

    This function interprets the termination condition from a `SolverResults` object and returns a corresponding float code.
    The mapping is as follows:
    - 0.1 for `TerminationCondition.globallyOptimal`
    - 0.2 for `TerminationCondition.optimal`
    - 0.3 for `TerminationCondition.locallyOptimal`
    - 1 for `TerminationCondition.maxTimeLimit`
    - 2 for `TerminationCondition.maxIterations`
    - 3 for `TerminationCondition.minFunctionValue`
    - 4 for `TerminationCondition.minStepLength`
    - 5 for `TerminationCondition.maxEvaluations`
    - 6 for `TerminationCondition.other`
    - 7 for `TerminationCondition.unbounded`
    - 8 for `TerminationCondition.infeasible`
    - 9 for `TerminationCondition.invalidProblem`
    - 10 for `TerminationCondition.solverFailure`
    - 11 for `TerminationCondition.internalSolverError`
    - 12 for `TerminationCondition.error`
    - 13 for `TerminationCondition.userInterrupt`
    - 14 for `TerminationCondition.resourceInterrupt`
    - 15 for `TerminationCondition.licensingProblems`
    - 16 for `TerminationCondition.intermediateNonInteger`

    Args:
        pyomo_results (SolverResults): The results object from a Pyomo solver containing the termination condition.

    Raises:
        ValueError: If the termination condition is not recognized.

    Returns:
        float: A float code representing the termination condition.
    """
    match pyomo_results.solver.termination_condition:
        case TerminationCondition.globallyOptimal:
            return 0.1
        case TerminationCondition.optimal:
            return 0.2
        case TerminationCondition.locallyOptimal:
            return 0.3
        case TerminationCondition.maxTimeLimit:
            return 1
        case TerminationCondition.maxIterations:
            return 2
        case TerminationCondition.minFunctionValue:
            return 3
        case TerminationCondition.minStepLength:
            return 4
        case TerminationCondition.maxEvaluations:
            return 5
        case TerminationCondition.other:
            return 6
        case TerminationCondition.unbounded:
            return 7
        case TerminationCondition.infeasible:
            return 8
        case TerminationCondition.invalidProblem:
            return 9
        case TerminationCondition.solverFailure:
            return 10
        case TerminationCondition.internalSolverError:
            return 11
        case TerminationCondition.error:
            return 12
        case TerminationCondition.userInterrupt:
            return 13
        case TerminationCondition.resourceInterrupt:
            return 14
        case TerminationCondition.licensingProblems:
            return 15
        case TerminationCondition.intermediateNonInteger:
            return 16
        case _:
            raise ValueError
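A hedged convenience sketch (not part of the COBRAk API): since the three optimality outcomes are encoded as the fractional codes 0.1, 0.2, and 0.3, while every non-optimal outcome is an integer code of 1 or greater, a code below 1 signals that some optimum was reached.

```python
def reached_optimum(code: float) -> bool:
    """Hypothetical helper: True for codes 0.1/0.2/0.3 (some optimum)."""
    return 0.0 < code < 1.0

print(reached_optimum(0.2), reached_optimum(8))  # True False
```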

get_unoptimized_reactions_in_nlp_solution(cobrak_model, solution, verbose=False, regard_iota=False, regard_alpha=False)

Identify unoptimized reactions in the NLP (Non-Linear Programming) solution.

This function checks each reaction in the COBRAk model to determine if the flux values in the provided NLP solution match the expected values based on enzyme kinetics and thermodynamics. Reactions with discrepancies are considered unoptimized and are returned in a dictionary.

Discrepancies occur because, in COBRAk, the saturation term and the thermodynamic restriction are only bounded from above (<=); they are not fixed.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRAk model containing reactions and enzyme data.

required
solution dict[str, float]

A dictionary mapping variable names to their values from an NLP solution.

required
verbose bool

Whether or not to print the discrepancies for each reaction. Defaults to False.

False
regard_iota bool

Whether to include the inhibition term ι (based on k_I values) in the check. Defaults to False.

False
regard_alpha bool

Whether to include the activation term α (based on k_A values) in the check. Defaults to False.

False

Returns:

Type Description
dict[str, tuple[float, float]]

dict[str, tuple[float, float]]: Dictionary where the keys are reaction IDs and the values are tuples containing the NLP solution flux and the real flux for unoptimized reactions.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def get_unoptimized_reactions_in_nlp_solution(
    cobrak_model: Model,
    solution: dict[str, float],
    verbose: bool = False,
    regard_iota: bool = False,
    regard_alpha: bool = False,
) -> dict[str, tuple[float, float]]:
    """Identify unoptimized reactions in the NLP (Non-Linear Programming) solution.

    This function checks each reaction in the COBRAk model to determine if the flux values in the provided NLP solution match
    the expected values based on enzyme kinetics and thermodynamics.
    Reactions with discrepancies are considered unoptimized and are returned in a dictionary.

    Discrepancies occur because, in COBRAk, the saturation term and the thermodynamic restriction are only bounded
    from above (<=); they are not fixed.

    Args:
        cobrak_model (Model): The COBRAk model containing reactions and enzyme data.
        solution (dict[str, float]): A dictionary mapping variable names to their values from an NLP solution.
        verbose (bool): Whether or not to print the discrepancies for each reaction. Defaults to False.
        regard_iota (bool): Whether to include the inhibition term ι (based on k_I values) in the check. Defaults to False.
        regard_alpha (bool): Whether to include the activation term α (based on k_A values) in the check. Defaults to False.

    Returns:
        dict[str, tuple[float, float]]: Dictionary where the keys are reaction IDs and the values are
                                        tuples containing the NLP solution flux and the real flux for unoptimized reactions.
    """
    unoptimized_reactions: dict[str, tuple[float, float]] = {}
    RT = cobrak_model.R * cobrak_model.T

    for reac_id, reaction in cobrak_model.reactions.items():
        if reac_id not in solution:
            continue
        if reaction.enzyme_reaction_data is None:
            continue
        if reaction.enzyme_reaction_data.identifiers == [""]:
            continue

        nlp_flux = solution[reac_id]
        has_problem = False

        # Kappa check
        if have_all_unignored_km(reaction, cobrak_model.kinetic_ignored_metabolites):
            kappa_substrates = 1.0
            kappa_products = 1.0
            for met_id, raw_stoichiometry in reaction.stoichiometries.items():
                if met_id in cobrak_model.kinetic_ignored_metabolites:
                    continue

                stoichiometry = (
                    raw_stoichiometry
                    * reaction.enzyme_reaction_data.hill_coefficients.kappa.get(
                        met_id, 1.0
                    )
                )
                expconc = exp(solution[f"{LNCONC_VAR_PREFIX}{met_id}"])
                multiplier = (
                    expconc / reaction.enzyme_reaction_data.k_ms[met_id]
                ) ** abs(stoichiometry)
                if stoichiometry < 0.0:
                    kappa_substrates *= multiplier
                else:
                    kappa_products *= multiplier

            real_kappa = kappa_substrates / (1 + kappa_substrates + kappa_products)
            nlp_kappa = solution[f"{KAPPA_VAR_PREFIX}{reac_id}"]

            if abs(nlp_kappa - real_kappa) > 0.001:
                has_problem = True
                if verbose:
                    print(
                        f"κ problem in {reac_id}: Real is {real_kappa}, NLP value is {nlp_kappa}"
                    )
        else:
            real_kappa = 1.0
            nlp_kappa = 1.0

        # Iota and alpha check
        real_iota = 1.0
        real_alpha = 1.0
        if (
            reac_id in solution
            and reaction.enzyme_reaction_data is not None
            and reaction.enzyme_reaction_data.identifiers != [""]
        ):
            alpha_and_iota_mets = set(
                list(reaction.enzyme_reaction_data.k_is.keys())
                + list(reaction.enzyme_reaction_data.k_as.keys())
            )
            for met_id in alpha_and_iota_mets:
                met_var_id = f"{LNCONC_VAR_PREFIX}{met_id}"
                if met_var_id not in solution:
                    continue
                expconc = exp(solution[met_var_id])
                stoichiometry_iota = abs(
                    reaction.stoichiometries.get(met_id, 1.0)
                ) * reaction.enzyme_reaction_data.hill_coefficients.iota.get(
                    met_id, 1.0
                )
                stoichiometry_alpha = abs(
                    reaction.stoichiometries.get(met_id, 1.0)
                ) * reaction.enzyme_reaction_data.hill_coefficients.alpha.get(
                    met_id, 1.0
                )

                if met_id in reaction.enzyme_reaction_data.k_is and regard_iota:
                    real_iota *= 1 / (
                        1
                        + (expconc / reaction.enzyme_reaction_data.k_is[met_id])
                        ** stoichiometry_iota
                    )
                if met_id in reaction.enzyme_reaction_data.k_as and regard_alpha:
                    real_alpha *= 1 / (
                        1
                        + (reaction.enzyme_reaction_data.k_as[met_id] / expconc)
                        ** stoichiometry_alpha
                    )

        nlp_iota = (
            solution.get(f"{IOTA_VAR_PREFIX}{reac_id}", 1.0) if regard_iota else 1.0
        )
        if abs(nlp_iota - real_iota) > 0.001:
            has_problem = True
            if verbose:
                print(
                    f"ι problem in {reac_id}: Real is {real_iota}, NLP value is {nlp_iota}"
                )
        nlp_alpha = (
            solution.get(f"{ALPHA_VAR_PREFIX}{reac_id}", 1.0) if regard_alpha else 1.0
        )
        if abs(nlp_alpha - real_alpha) > 0.001:
            has_problem = True
            if verbose:
                print(
                    f"α problem in {reac_id}: Real is {real_alpha}, NLP value is {nlp_alpha}"
                )

        # Gamma check
        if reaction.dG0 is not None:
            gamma_substrates = 1.0
            gamma_products = 1.0

            for met_id, stoichiometry in reaction.stoichiometries.items():
                multiplier = exp(solution[f"{LNCONC_VAR_PREFIX}{met_id}"]) ** abs(
                    stoichiometry
                )
                if stoichiometry < 0.0:
                    gamma_substrates *= multiplier
                else:
                    gamma_products *= multiplier

            dg = -(reaction.dG0 + RT * log(gamma_products) - RT * log(gamma_substrates))
            real_gamma = 1 - exp(-dg / RT)
            nlp_gamma = solution[f"{GAMMA_VAR_PREFIX}{reac_id}"]

            if abs(nlp_gamma - real_gamma) > 0.001:
                has_problem = True
                if verbose:
                    print(
                        f"γ problem in {reac_id}: Real is {real_gamma}, NLP value is {nlp_gamma}"
                    )
                    print(
                        f"ΔG': Real is {dg}, NLP value is {solution[DF_VAR_PREFIX + reac_id]}"
                    )
                    print("E", solution[get_reaction_enzyme_var_id(reac_id, reaction)])
                    print(
                        "E_use",
                        solution[get_reaction_enzyme_var_id(reac_id, reaction)]
                        * get_full_enzyme_mw(cobrak_model, reaction),
                    )
        else:
            real_gamma = 1.0
            nlp_gamma = 1.0

        # V plus
        enzyme_conc = solution[get_reaction_enzyme_var_id(reac_id, reaction)]
        v_plus = enzyme_conc * reaction.enzyme_reaction_data.k_cat

        nlp_flux = v_plus * nlp_gamma * nlp_kappa * nlp_alpha * nlp_iota
        real_flux = v_plus * real_gamma * real_kappa * real_alpha * real_iota
        if has_problem and verbose:
            print(nlp_flux, real_flux, solution[reac_id])

        if abs(real_flux - solution[reac_id]) > 1e-6:  # tolerate float round-off
            unoptimized_reactions[reac_id] = (solution[reac_id], real_flux)

    return unoptimized_reactions
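The γ (thermodynamic restriction) check above can be replayed with hypothetical numbers; R, T, ΔG0, and the concentration products below are made up, only the formula mirrors the source:

```python
from math import exp, log

R, T = 8.314e-3, 298.15  # kJ/(mol*K) and K (hypothetical model settings)
RT = R * T
dG0 = -10.0              # kJ/mol, hypothetical standard reaction energy
gamma_substrates = 1e-3  # product of substrate concentration terms
gamma_products = 1e-4    # product of product concentration terms

# Same formula as in the source: driving force dg and the resulting gamma.
dg = -(dG0 + RT * log(gamma_products) - RT * log(gamma_substrates))
real_gamma = 1 - exp(-dg / RT)
print(real_gamma > 0.99)  # a strongly forward-driven reaction here
```

With a positive driving force, γ approaches 1 and barely restricts the flux; near equilibrium (dg → 0), γ → 0 and the flux is forced toward zero.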

have_all_unignored_km(reaction, kinetic_ignored_metabolites)

Check if all non-ignored metabolites in a reaction have associated Michaelis-Menten constants (k_m).

This function checks whether all substrates and products of a reaction, excluding those specified in the kinetically ignored metabolites list, have associated k_M values. It also ensures that there is at least one substrate and one product with a k_M value.

Parameters:

Name Type Description Default
reaction Reaction

The reaction to be checked.

required
kinetic_ignored_metabolites list[str]

A list of metabolite IDs to be ignored in the k_m check.

required

Returns:

Name Type Description
bool bool

True if all non-ignored metabolites have Km values and there is at least one substrate and one product with Km values, False otherwise.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def have_all_unignored_km(
    reaction: Reaction, kinetic_ignored_metabolites: list[str]
) -> bool:
    """Check if all non-ignored metabolites in a reaction have associated Michaelis-Menten constants (k_m).

    This function checks whether all substrates and products of a reaction, excluding those specified in the kinetically ignored metabolites list,
    have associated k_M values. It also ensures that there is at least one substrate and one product with a k_M value.

    Args:
        reaction (Reaction): The reaction to be checked.
        kinetic_ignored_metabolites (list[str]): A list of metabolite IDs to be ignored in the k_m check.

    Returns:
        bool: True if all non-ignored metabolites have Km values and there is at least one substrate and one product with Km values, False otherwise.
    """
    if reaction.enzyme_reaction_data is None:
        return False

    eligible_mets = [
        met_id
        for met_id, stoichiometry in reaction.stoichiometries.items()
        if met_id not in kinetic_ignored_metabolites
    ]
    for eligible_met in eligible_mets:
        if eligible_met not in reaction.enzyme_reaction_data.k_ms:
            return False

    substrates_with_km = [
        met_id
        for met_id in eligible_mets
        if (met_id in reaction.enzyme_reaction_data.k_ms)
        and (reaction.stoichiometries[met_id] < 0)
    ]
    products_with_km = [
        met_id
        for met_id in eligible_mets
        if (met_id in reaction.enzyme_reaction_data.k_ms)
        and (reaction.stoichiometries[met_id] > 0)
    ]
    return not (len(substrates_with_km) == 0 or len(products_with_km) == 0)
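The same check, sketched on plain dictionaries (metabolite IDs and k_M values below are hypothetical): water sits on the ignored list, so only the remaining metabolites need k_M values, and both a substrate and a product must carry one.

```python
stoichiometries = {"M_s": -1.0, "M_p": 1.0, "M_h2o": 1.0}  # hypothetical reaction
k_ms = {"M_s": 5e-4, "M_p": 2e-3}                          # hypothetical k_M values
kinetic_ignored_metabolites = ["M_h2o"]

eligible_mets = [m for m in stoichiometries if m not in kinetic_ignored_metabolites]
all_have_km = all(m in k_ms for m in eligible_mets)
has_substrate = any(stoichiometries[m] < 0 for m in eligible_mets if m in k_ms)
has_product = any(stoichiometries[m] > 0 for m in eligible_mets if m in k_ms)
result = all_have_km and has_substrate and has_product
print(result)  # True: M_h2o is ignored, and M_s and M_p both have a k_M
```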

in_out_fluxes(cobrak_model, opt_dict, met_id)

Return consumption and production fluxes for a metabolite.

Parameters

cobrak_model : Model COBRA-k model instance. opt_dict : dict[str, float] Reaction‑id → optimal flux (e.g., FBA solution). met_id : str Metabolite identifier to analyse.

Returns

tuple[dict[str, float], dict[str, float]] (cons_dict, prod_dict), where each maps reaction IDs to the absolute flux contributed to consumption (negative stoichiometry) or production (positive stoichiometry) of met_id. All fluxes are returned as absolute values.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def in_out_fluxes(
    cobrak_model: Model, opt_dict: dict[str, float], met_id: str
) -> tuple[dict[str, float], dict[str, float]]:
    """Return consumption and production fluxes for a metabolite.

    Parameters
    ----------
    cobrak_model : Model
        COBRA-k model instance.
    opt_dict : dict[str, float]
        Reaction‑id → optimal flux (e.g., FBA solution).
    met_id : str
        Metabolite identifier to analyse.

    Returns
    -------
    tuple[dict[str, float], dict[str, float]]
        (cons_dict, prod_dict) where each maps reaction ids to the absolute
        flux contributed to consumption (negative stoichiometry) or production
        (positive stoichiometry) of ``met_id``. All fluxes are returned as absolute values.
    """
    cons_dict: dict[str, float] = {}
    prod_dict: dict[str, float] = {}
    for reac_id, reaction in cobrak_model.reactions.items():
        if reac_id not in opt_dict:
            continue
        if met_id not in reaction.stoichiometries:
            continue
        stoichiometry = reaction.stoichiometries[met_id]
        reac_flux = opt_dict[reac_id]
        if stoichiometry < 0:
            cons_dict[reac_id] = abs(stoichiometry * reac_flux)
        else:
            prod_dict[reac_id] = abs(stoichiometry * reac_flux)
    return cons_dict, prod_dict
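The splitting logic above can be reproduced without a COBRAk `Model`, using plain dictionaries for stoichiometries and fluxes (all names in this sketch are illustrative, not part of the COBRAk API):

```python
def in_out_fluxes_sketch(
    stoichiometries: dict[str, dict[str, float]],
    fluxes: dict[str, float],
    met_id: str,
) -> tuple[dict[str, float], dict[str, float]]:
    """Split a metabolite's fluxes into consumption and production dicts."""
    cons_dict: dict[str, float] = {}
    prod_dict: dict[str, float] = {}
    for reac_id, stoichs in stoichiometries.items():
        if reac_id not in fluxes or met_id not in stoichs:
            continue
        # Absolute contribution of this reaction to the metabolite's turnover
        contribution = abs(stoichs[met_id] * fluxes[reac_id])
        if stoichs[met_id] < 0:  # negative stoichiometry: consumption
            cons_dict[reac_id] = contribution
        else:  # positive stoichiometry: production
            prod_dict[reac_id] = contribution
    return cons_dict, prod_dict


# Toy network: R1 consumes 1 atp per flux unit, R2 produces 2 atp
stoichs = {"R1": {"atp": -1.0}, "R2": {"atp": 2.0}}
fluxes = {"R1": 3.0, "R2": 1.5}
cons, prod = in_out_fluxes_sketch(stoichs, fluxes, "atp")
# cons == {"R1": 3.0}, prod == {"R2": 3.0}
```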

is_any_error_term_active(correction_config)

Checks if any error term is active in the correction configuration.

This function determines whether any of the error terms specified in the CorrectionConfig object are enabled. It sums the boolean values of the flags indicating whether each error term is active. If the sum is greater than zero, it means at least one error term is active.

Parameters:

Name Type Description Default
correction_config CorrectionConfig

The CorrectionConfig object to check.

required

Returns:

Type Description
bool

True if at least one error term is active, False otherwise.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def is_any_error_term_active(correction_config: CorrectionConfig) -> bool:
    """Checks if any error term is active in the correction configuration.

    This function determines whether any of the error terms specified in the
    `CorrectionConfig` object are enabled.  It sums the boolean values of the
    flags indicating whether each error term is active.  If the sum is greater
    than zero, it means at least one error term is active.

    Args:
        correction_config: The CorrectionConfig object to check.

    Returns:
        True if at least one error term is active, False otherwise.
    """
    return bool(
        sum(
            [
                correction_config.add_flux_error_term,
                correction_config.add_met_logconc_error_term,
                correction_config.add_enzyme_conc_error_term,
                correction_config.add_kcat_times_e_error_term,
                correction_config.add_dG0_error_term,
                correction_config.add_km_error_term,
            ]
        )
    )

is_objsense_maximization(objsense)

Checks if the objective sense is maximization.

Parameters:

Name Type Description Default
objsense int

The objective sense; in this function's definition, >0 means maximization and ≤0 means minimization.

required

Returns:

Type Description
bool

True if the objective sense is maximization, False otherwise.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def is_objsense_maximization(objsense: int) -> bool:
    """Checks if the objective sense is maximization.

    Args:
        objsense (int): The objective sense, where in this function's definition:
            - >0: Maximization
            - ≤0: Minimization

    Returns:
        bool: True if the objective sense is maximization, False otherwise.
    """
    return objsense > 0

last_n_elements_equal(lst, n)

Check if the last n elements of a list are equal.

Parameters:

Name Type Description Default
lst list[Any]

The list to check.

required
n int

The number of elements from the end of the list to compare.

required

Returns:

Name Type Description
bool bool

True if the last n elements are equal, False otherwise.

Example

>>> last_n_elements_equal([1, 2, 3, 4, 4, 4], 3)
True
>>> last_n_elements_equal([1, 2, 3, 4, 5, 6], 3)
False

Source code in cobrak/utilities.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True), validate_return=True)
def last_n_elements_equal(lst: list[Any], n: int | float) -> bool:
    """Check if the last n elements of a list are equal.

    Args:
        lst (list[Any]): The list to check.
        n (int): The number of elements from the end of the list to compare.

    Returns:
        bool: True if the last n elements are equal, False otherwise.

    Example:
        >>> last_n_elements_equal([1, 2, 3, 4, 4, 4], 3)
        True
        >>> last_n_elements_equal([1, 2, 3, 4, 5, 6], 3)
        False
    """
    return (n == 0) or (len(lst) >= n and all(x == lst[-n] for x in lst[-n:]))

make_kms_better_by_factor(cobrak_model, reac_id, factor)

Adjusts the Michaelis constants (Km) for substrates and products of a specified reaction in the metabolic model.

  • Substrates' Michaelis constants are divided by 'factor'.
  • Products' Michaelis constants are multiplied by 'factor'.
  • Only affects metabolites with existing enzyme reaction data.

Parameters:

Name Type Description Default
cobrak_model Model

The metabolic model containing enzymatic constraints.

required
reac_id str

The ID of the reaction to adjust the Km values for.

required
factor float

The multiplication/division factor used to modify the Michaelis constants.

required

Returns:

Name Type Description
None None

This function modifies the input Model object in place and does not return any value.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def make_kms_better_by_factor(
    cobrak_model: Model, reac_id: str, factor: NonNegativeFloat
) -> None:
    """Adjusts the Michaelis constants (Km) for substrates and products of a specified reaction in the metabolic model.

    - Substrates' Michaelis constants are divided by 'factor'.
    - Products' Michaelis constants are multiplied by 'factor'.
    - Only affects metabolites with existing enzyme reaction data.

    Parameters:
        cobrak_model (Model): The metabolic model containing enzymatic constraints.
        reac_id (str): The ID of the reaction to adjust the Km values for.
        factor (float): The multiplication/division factor used to modify the Michaelis constants.

    Returns:
        None: This function modifies the input Model object in place and does not return any value.
    """
    reaction = cobrak_model.reactions[reac_id]
    if reaction.enzyme_reaction_data is None:
        return  # No enzyme reaction data at all: nothing to adjust

    substrate_ids = [
        met_id
        for met_id in reaction.stoichiometries
        if reaction.stoichiometries[met_id] < 0
    ]
    for substrate_id in substrate_ids:
        if substrate_id not in reaction.enzyme_reaction_data.k_ms:
            continue
        reaction.enzyme_reaction_data.k_ms[substrate_id] /= factor

    product_ids = [
        met_id
        for met_id in reaction.stoichiometries
        if reaction.stoichiometries[met_id] > 0
    ]
    for product_id in product_ids:
        if product_id not in reaction.enzyme_reaction_data.k_ms:
            continue
        reaction.enzyme_reaction_data.k_ms[product_id] *= factor
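The adjustment itself is plain arithmetic. A minimal sketch on a bare Km dictionary (illustrative names, no COBRAk objects) shows the direction of each change:

```python
def adjust_kms(
    k_ms: dict[str, float], stoichiometries: dict[str, float], factor: float
) -> dict[str, float]:
    """Return a copy of k_ms with substrate Kms divided and product Kms multiplied by factor."""
    adjusted: dict[str, float] = {}
    for met_id, k_m in k_ms.items():
        if stoichiometries.get(met_id, 0.0) < 0:
            adjusted[met_id] = k_m / factor  # substrate: lower Km means higher affinity
        else:
            adjusted[met_id] = k_m * factor  # product: higher Km
    return adjusted


k_ms = {"glc": 2.0, "pyr": 0.5}
stoichs = {"glc": -1.0, "pyr": 2.0}
print(adjust_kms(k_ms, stoichs, 10.0))  # {'glc': 0.2, 'pyr': 5.0}
```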

parse_external_resources(path, brenda_version, parse_brenda=True)

Parse and verify the presence of external resource files required for a COBRAk model.

This function checks if the necessary external resource files are present in the specified directory. If any required files are missing, it provides instructions on where to download them. Additionally, it processes certain files if their parsed versions are not found.

The particular files looked for are the NCBI Taxonomy taxdump file and the BRENDA JSON tar.gz, as well as the bigg_models_metabolites.txt file.

Parameters:

Name Type Description Default
path str

The directory path where the external resource files are located.

required
brenda_version str

The version of the BRENDA database to be used.

required
parse_brenda bool

Whether the BRENDA file is required and parsed. Defaults to True.

True

Raises:

Type Description
ValueError

If the specified path is not a directory.

FileNotFoundError

If any required files are missing from the specified directory.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def parse_external_resources(
    path: str, brenda_version: str, parse_brenda: bool = True
) -> None:
    """Parse and verify the presence of external resource files required for a COBRAk model.

    This function checks if the necessary external resource files are present in the specified directory.
    If any required files are missing, it provides instructions on where to download them. Additionally,
    it processes certain files if their parsed versions are not found.

    The particular files looked for are the NCBI Taxonomy taxdump file and
    the BRENDA JSON tar.gz, as well as the bigg_models_metabolites.txt file.

    Args:
        path (str): The directory path where the external resource files are located.
        brenda_version (str): The version of the BRENDA database to be used.
        parse_brenda (bool, optional): Whether the BRENDA file is required and parsed. Defaults to True.

    Raises:
        ValueError: If the specified path is not a directory.
        FileNotFoundError: If any required files are missing from the specified directory.
    """
    path = standardize_folder(path)
    if not os.path.isdir(path):
        print(
            f"ERROR: Given external resources path {path} does not seem to be a folder!"
        )
        raise ValueError
    filenames = get_files(path)

    needed_filename_data = [
        ("taxdmp.zip", "https://ftp.ncbi.nih.gov/pub/taxonomy/"),
        ("bigg_models_metabolites.txt", "http://bigg.ucsd.edu/data_access"),
    ]
    if parse_brenda:
        needed_filename_data.append(
            (
                f"brenda_{brenda_version}.json.tar.gz",
                "https://www.brenda-enzymes.org/download.php",
            )
        )
    for needed_filename, link in needed_filename_data:
        if needed_filename not in filenames:
            print(
                f"ERROR: File {needed_filename} not found in given external resources path {path}!"
            )
            print(
                "Solution: Either change the path if it is wrong, or download the file from:"
            )
            print(link)
            raise FileNotFoundError
    if "parsed_taxdmp.json.zip" not in filenames:
        parse_ncbi_taxonomy(f"{path}taxdmp.zip", f"{path}parsed_taxdmp.json")
    if "bigg_models_metabolites.json" not in filenames:
        bigg_parse_metabolites_file(
            f"{path}bigg_models_metabolites.txt", f"{path}bigg_models_metabolites.json"
        )
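Before calling the function it can be useful to check which of the expected downloads are missing. This sketch mirrors only the presence checks from the source (file names as in the function; the helper name is illustrative):

```python
import os


def missing_external_resources(
    path: str, brenda_version: str, parse_brenda: bool = True
) -> list[str]:
    """Return the required external resource files absent from the folder."""
    needed = ["taxdmp.zip", "bigg_models_metabolites.txt"]
    if parse_brenda:
        # BRENDA archive name depends on the requested database version
        needed.append(f"brenda_{brenda_version}.json.tar.gz")
    present = set(os.listdir(path)) if os.path.isdir(path) else set()
    return [name for name in needed if name not in present]
```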

print_model_parameter_statistics(cobrak_model)

Prints statistics about reaction parameters (kcats and kms) in a COBRA-k model.

This function calculates and prints statistics about the kcat and Km values associated with reactions in a COBRA-k model. It groups these values by their taxonomic distance (as indicated by references) and prints the counts for each distance group. It also prints the median kcat and the median Km values for substrates and products separately.

Parameters:

Name Type Description Default
cobrak_model Model

The COBRA Model object.

required

Returns:

Type Description
None

None. Prints statistics to the console.

Source code in cobrak/utilities.py
@validate_call(validate_return=True)
def print_model_parameter_statistics(cobrak_model: Model) -> None:
    """Prints statistics about reaction parameters (kcats and kms) in a COBRA-k model.

    This function calculates and prints statistics about the kcat and Km values
    associated with reactions in a COBRA-k model. It groups these values by their
    taxonomic distance (as indicated by references) and prints the counts for each
    distance group.  It also prints the median kcat and the median Km values for
    substrates and products separately.

    Args:
        cobrak_model: The COBRA Model object.

    Returns:
        None.  Prints statistics to the console.
    """
    substrate_kms, product_kms = get_model_kms_by_usage(cobrak_model)
    all_kms = substrate_kms + product_kms
    all_kcats = get_model_kcats(cobrak_model)

    kcats_by_taxonomy_score: dict[int, int] = {}
    kms_by_taxonomy_score: dict[int, int] = {}
    for reaction in cobrak_model.reactions.values():
        if reaction.enzyme_reaction_data is None:
            continue

        enzdata = reaction.enzyme_reaction_data
        if len(enzdata.k_cat_references) > 0:
            tax_distance = enzdata.k_cat_references[0].tax_distance
        else:
            tax_distance = -2
        if tax_distance not in kcats_by_taxonomy_score:
            kcats_by_taxonomy_score[tax_distance] = 0
        kcats_by_taxonomy_score[tax_distance] += 1

        for met_id in enzdata.k_ms:
            if (met_id in enzdata.k_m_references) and len(
                enzdata.k_m_references[met_id]
            ) > 0:
                tax_distance = enzdata.k_m_references[met_id][0].tax_distance
            else:
                tax_distance = -2
            if tax_distance not in kms_by_taxonomy_score:
                kms_by_taxonomy_score[tax_distance] = 0
            kms_by_taxonomy_score[tax_distance] += 1

    print(
        "kcats:",
        sort_dict_keys(kcats_by_taxonomy_score),
        sum(kcats_by_taxonomy_score.values()),
        len(all_kcats),
    )
    print(" ->median:", median(all_kcats))
    print(
        "kms:",
        sort_dict_keys(kms_by_taxonomy_score),
        sum(kms_by_taxonomy_score.values()),
        len(all_kms),
    )
    print(
        " ->median substrates:",
        median(substrate_kms),
        "->median products:",
        median(product_kms),
    )
    print(len(cobrak_model.reactions))

sort_dict_keys(dictionary, reverse=False)

Sorts all keys in a dictionary alphabetically.

Parameters:

Name Type Description Default
dictionary dict

The dictionary to sort.

required
reverse bool

Whether to sort keys in descending order. Defaults to False.

False

Returns:

Name Type Description
dict dict[T, U]

A new dictionary with the keys sorted alphabetically.

Source code in cobrak/utilities.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True), validate_return=True)
def sort_dict_keys(dictionary: dict[T, U], reverse: bool = False) -> dict[T, U]:
    """Sorts all keys in a dictionary alphabetically.

    Args:
        dictionary (dict): The dictionary to sort.
        reverse (bool, optional): Whether to sort keys in descending order. Defaults to False.
    Returns:
        dict: A new dictionary with the keys sorted alphabetically.
    """
    return dict(sorted(dictionary.items(), reverse=reverse))
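Usage is straightforward; `reverse=True` yields descending key order. The one-liner is restated here without the pydantic decorator so the demo is self-contained:

```python
def sort_dict_keys(dictionary: dict, reverse: bool = False) -> dict:
    # Same logic as the source: sort the items by key, then rebuild the dict
    return dict(sorted(dictionary.items(), reverse=reverse))


print(sort_dict_keys({"b": 2, "a": 1, "c": 3}))                # {'a': 1, 'b': 2, 'c': 3}
print(sort_dict_keys({"b": 2, "a": 1, "c": 3}, reverse=True))  # {'c': 3, 'b': 2, 'a': 1}
```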

split_list(lst, n)

Split a list into n nearly equal parts.

This function divides a given list into n sublists, distributing the elements as evenly as possible.

Parameters:
- lst (list[Any]): The list to be split.
- n (int): The number of sublists to create.

Returns:
- list[list[Any]]: A list of n sublists, each containing a portion of the original list's elements.

Example:

result = split_list([1, 2, 3, 4, 5], 3)
# result: [[1, 2], [3, 4], [5]]

Raises: - ValueError: If n is less than or equal to 0.

Source code in cobrak/utilities.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True), validate_return=True)
def split_list(lst: list[Any], n: PositiveInt) -> list[list[Any]]:
    """Split a list into `n` nearly equal parts.

    This function divides a given list into `n` sublists, distributing the elements as evenly as possible.

    Parameters:
    - lst (list[Any]): The list to be split.
    - n (int): The number of sublists to create.

    Returns:
    - list[list[Any]]: A list of `n` sublists, each containing a portion of the original list's elements.

    Example:
    ```
    result = split_list([1, 2, 3, 4, 5], 3)
    # result: [[1, 2], [3, 4], [5]]
    ```

    Raises:
    - ValueError: If `n` is less than or equal to 0.
    """
    k, m = divmod(len(lst), n)
    return [lst[i * k + min(i, m) : (i + 1) * k + min(i + 1, m)] for i in range(n)]
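The `divmod` arithmetic distributes any remainder over the leading chunks: with `len(lst) = k*n + m`, every chunk gets `k` elements and the first `m` chunks get one extra. Restated without the pydantic decorator for a self-contained demo:

```python
def split_list(lst: list, n: int) -> list[list]:
    # k: base chunk size; m: number of chunks that receive one extra element
    k, m = divmod(len(lst), n)
    return [lst[i * k + min(i, m) : (i + 1) * k + min(i + 1, m)] for i in range(n)]


print(split_list([1, 2, 3, 4, 5], 3))  # [[1, 2], [3, 4], [5]]
print(split_list(list(range(7)), 4))   # [[0, 1], [2, 3], [4, 5], [6]]
```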