BioCyc is a Python interface to the BioCyc database. Acting as a wrapper, it queries the database and presents the returned XML through a Pythonic, object-based interface. Support for IPython views is included, offering nice summary tables of object attributes.
BioCyc are approaching the renewal period for their NIH grant. If you find the tools useful, please consider writing a letter of support. If you use EcoCyc there is a separate call. It's incredibly important to keep public databases like these available, for both their research and educational value. I know they've been indispensable in my PhD.
The BioCyc interface provides access to most attributes, with inter-object links presented as lazy-loading lists. These links are followed and auto-queried on access, allowing navigation through the entire database tree by simply accessing object attributes and slices.
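To make the lazy-loading idea concrete, here is a minimal sketch of a list that knows its identifiers up front but only fetches each object the first time it is accessed. It is not the module's actual implementation: `LazyList`, `fake_fetch` and the `RXN-A`-style identifiers are hypothetical names made up for illustration, and the fetch function stands in for a real database query.

```python
from collections.abc import Sequence


class LazyList(Sequence):
    """Hypothetical list of identifiers whose objects are fetched on access."""

    def __init__(self, identifiers, fetch):
        self._ids = list(identifiers)  # known up front, no request needed
        self._fetch = fetch            # callable: identifier -> object
        self._loaded = {}              # identifier -> already-fetched object

    def __len__(self):
        return len(self._ids)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing returns another lazy list; nothing is fetched yet
            return LazyList(self._ids[index], self._fetch)
        identifier = self._ids[index]
        if identifier not in self._loaded:
            self._loaded[identifier] = self._fetch(identifier)
        return self._loaded[identifier]


requests_made = []


def fake_fetch(identifier):
    # Stand-in for a real database query (assumption for this sketch)
    requests_made.append(identifier)
    return {"id": identifier}


reactions = LazyList(["RXN-A", "RXN-B", "RXN-C"], fake_fetch)
first = reactions[0]    # triggers exactly one fetch
subset = reactions[1:]  # slicing alone fetches nothing
```

Iterating over `subset` would then fetch the remaining two objects one at a time, which is why long lists can take a while on first access.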
The interface is throttled to one request per second (by request of BioCyc). However,
the module comes with a built-in cache that stores retrieved objects for future use,
so subsequent requests are much quicker.
Multiple and configurable caches may be used, and it's possible to share caches across multiple machines.
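How throttling and caching fit together can be sketched roughly as follows. This is only an illustration under stated assumptions, not the module's own code: `ThrottledCache`, `FakeClock` and `fake_fetch` are hypothetical names, the cache is a plain in-memory dictionary, and a fake clock is injected so the example runs without actually sleeping.

```python
import time


class ThrottledCache:
    """Hypothetical client: at most one fetch per interval, cached results."""

    def __init__(self, fetch, min_interval=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_request = None  # time of the most recent real request
        self._cache = {}

    def get(self, key):
        if key in self._cache:
            return self._cache[key]  # cache hit: no request, no wait
        if self._last_request is not None:
            wait = self._min_interval - (self._clock() - self._last_request)
            if wait > 0:
                self._sleep(wait)    # wait out the rest of the interval
        self._last_request = self._clock()
        value = self._fetch(key)
        self._cache[key] = value
        return value


class FakeClock:
    """Test double so the example does not really sleep."""

    def __init__(self):
        self.now = 0.0
        self.slept = []

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.slept.append(seconds)
        self.now += seconds


clock = FakeClock()
calls = []


def fake_fetch(key):
    # Stand-in for a real web request (assumption for this sketch)
    calls.append(key)
    return key.lower()


client = ThrottledCache(fake_fetch, min_interval=1.0,
                        clock=clock.time, sleep=clock.sleep)
client.get("L-LACTATE")  # first request goes straight out
client.get("PYRUVATE")   # second request waits out the interval
client.get("L-LACTATE")  # served from the cache, no request made
```

Only cache misses pay the throttling delay, which is why repeated access to the same objects is much quicker than the first.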
To install, run the following on the command line:

    pip install BioCyc

or download from PyPI or GitHub.
A demo IPython notebook (available here) is walked through below.
Import the `biocyc` object from the `biocyc` module. This object
provides the base access to the database for the initial get. You can
set the organism using `set_organism` and one of the standard BioCyc
database identifiers. Note that this only affects the organism database
used for direct requests on the `biocyc` object. Sub-requests on existing
objects will use the same database as that object (otherwise things
would be very confusing indeed).

    import os
    from biocyc import biocyc

    os.environ['http_proxy'] = ''  # Set your proxy if necessary
    biocyc.set_organism('meta')

Making a request
To get a database object (of any type), simply request it using its
unique BioCyc identifier. Here we request L-Lactate. Note that if you do
this from within an IPython Notebook you get a nice table output of all
associated attributes for the object. This includes direct links to the
BioCyc database and other database annotations.
|Reactions||TRANS-RXN-104, RXN-12165, RXN-12096, LACTALDDEHYDROG-RXN, RXN0-5269, D-LACTATE-2-SULFATASE-RXN, TRANS-RXN-104, L-LACTDEHYDROGFMN-RXN, LACTATE-MALATE-TRANSHYDROGENASE-RXN, LACTATE-2-MONOOXYGENASE-RXN, L-LACTATE-DEHYDROGENASE-CYTOCHROME-RXN, L-LACTATE-DEHYDROGENASE-RXN, RXN-9067, RXN-8076, PROPIONLACT-RXN, LACTATE-RACEMASE-RXN, LACTATE-ALDOLASE-RXN|
|Database links||CAS: 79-33-4, PUBCHEM: 5460161, LIGAND-CPD: C00186, CHEMSPIDER: 4573803, CHEBI: 16651, BIGG: 34179|
Now that we have an object, we can perform sub-queries by accessing its
fields. If you access the `o.reactions` field you will trigger a dynamic
request for all entities in that list. Connections to the BioCyc server are
throttled at 1/second, so this may take a little while on long lists.
However, retrieved data is cached, so subsequent requests will be much
quicker. By default the cache is set to expire objects after ~6 months,
and the cache folder can be shared between machines.
Note: If you just want access to the identifiers, you can use the
`o._reactions` field to access these without triggering a request.

    r = o.reactions
    r

|Name||NADP+ L-lactaldehyde dehydrogenase|
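The cache expiry described above can be pictured as a simple age check on each stored file. The sketch below is an assumption about how such a cache could work, not a description of the module's internals: the function names, the JSON file layout, the exact value of `SIX_MONTHS` and the placeholder record are all made up for illustration.

```python
import json
import os
import tempfile
import time

# Roughly six months in seconds (assumed value for this sketch)
SIX_MONTHS = 60 * 60 * 24 * 182


def cache_path(cache_dir, key):
    return os.path.join(cache_dir, key + ".json")


def store(cache_dir, key, value):
    """Write one object to its own file in the cache folder."""
    with open(cache_path(cache_dir, key), "w") as f:
        json.dump(value, f)


def load(cache_dir, key, max_age=SIX_MONTHS, now=None):
    """Return the cached value, or None if it is missing or too old."""
    path = cache_path(cache_dir, key)
    if not os.path.exists(path):
        return None
    now = time.time() if now is None else now
    if now - os.path.getmtime(path) > max_age:
        return None  # expired: the caller should fetch it again
    with open(path) as f:
        return json.load(f)


cache_dir = tempfile.mkdtemp()  # throwaway folder for the example
store(cache_dir, "L-LACTATE", {"name": "placeholder record"})
fresh = load(cache_dir, "L-LACTATE")
stale = load(cache_dir, "L-LACTATE", now=time.time() + SIX_MONTHS + 1)
```

Because each object lives in its own file, a folder like this could in principle sit on shared storage and be read by several machines.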
You can access sub-entities and manipulate objects using standard Python list processing.

    ps = [r.pathways for r in o.reactions]
    p = [p for sl in ps for p in sl]
    p

    [L-rhamnose degradation II, L-rhamnose degradation III, L-rhamnose degradation II, methylglyoxal degradation V, lactate biosynthesis (archaea), L-lactaldehyde degradation (aerobic), L-lactaldehyde degradation (aerobic), methylglyoxal degradation V, pyruvate fermentation to lactate, glucose and xylose degradation, Bifidobacterium shunt, heterolactic fermentation, factor 420 biosynthesis]

|Name||L-rhamnose degradation II|
|Species||TAX-5580, ORG-6176, TAX-95486, TAX-284592, TAX-322104|
|Taxonomic range||TAX-2, TAX-4751|
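The flattened pathway list above contains repeats, because several reactions share a pathway. As a small sketch of one way to flatten and de-duplicate such a list while keeping the original order, the example below uses plain strings in place of real pathway objects. The grouping into sub-lists is invented for illustration, and the `dict.fromkeys` trick assumes the items are hashable, which may or may not hold for the objects returned by the module.

```python
from itertools import chain

# One sub-list per reaction; strings stand in for pathway objects
nested = [
    ["L-rhamnose degradation II", "L-rhamnose degradation III"],
    ["L-rhamnose degradation II", "methylglyoxal degradation V"],
    [],  # a reaction with no pathways
    ["pyruvate fermentation to lactate"],
]

# Flatten one level, equivalent to the nested list comprehension
flat = list(chain.from_iterable(nested))

# Drop repeats while keeping first-seen order (dicts preserve insertion order)
unique = list(dict.fromkeys(flat))
```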
That's all for now! Hopefully this shows how Python (and IPython notebook) access to the BioCyc Web API may be useful. Support for additional attributes, API calls etc. is planned for the future. If you have specific requests, get in touch!