Background
Gene expression in bacteria begins with an RNA polymerase transcribing portions of the genome from DNA into RNA. For messenger RNAs (mRNAs) that encode proteins, a ribosome then binds to a start codon and translates its nucleotide sequence into an amino acid sequence until it encounters a stop codon and terminates. Because the pool of ribosomes in a cell is limited, start codons on different mRNAs compete for ribosome binding. This makes the rate of translation initiation at a gene's start codon an important determinant of how much protein it will produce relative to other genes.
OSTIR uses a thermodynamic model developed by Salis et al. (2009) to predict the relative rate of translation initiation from different start codons in bacterial mRNAs. This model assumes that the rate of translation initiation is determined by the free energy change when the ribosome binds to an mRNA at a given start codon. This energy is broken down into several components that can be calculated from mRNA folding and interaction energies or derived from experiments. OSTIR uses the open source ViennaRNA package for the necessary RNA secondary structure energy calculations (Lorenz et al. 2011).
The five energy components are:
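The $\Delta G_{\text{mRNA:rRNA}}$ term is the energy released when the 3′ end of the 16S rRNA hybridizes to the mRNA at the ribosome binding site.
The $\Delta G_{\text{start}}$ term is the energy released when the start codon pairs with the anticodon of the initiator tRNA.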
There is an optimal spacing in the mRNA between the start codon and the ribosome binding site that supports efficient translation initiation. Deviation from this optimal spacing incurs the energy penalty $\Delta G_{\text{spacing}}$.
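The $\Delta G_{\text{standby}}$ term is the energy of mRNA secondary structure that occupies the ribosome standby site upstream of the ribosome binding site.
The $\Delta G_{\text{mRNA}}$ term is the folding energy of the mRNA secondary structure around the start codon before the ribosome binds.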
These five energies are combined to calculate the total free energy change of ribosome binding to a given start codon. The standby and mRNA structure energies are subtracted because these structures must be disrupted for translation initiation.
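Written out, this combination can be sketched as the sum below. The symbol names are assumptions that follow the notation of Salis et al. (2009): the first two terms are the mRNA:rRNA hybridization and start codon pairing energies, the third is the spacing penalty, and the last two are the standby site and mRNA structure energies that are subtracted.

$$
\Delta G_{\text{total}} = \Delta G_{\text{mRNA:rRNA}} + \Delta G_{\text{start}} + \Delta G_{\text{spacing}} - \Delta G_{\text{standby}} - \Delta G_{\text{mRNA}}
$$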
Finally, we can predict translation initiation rates using an equation derived from the standard relationship between the standard Gibbs free energy change of the overall ribosome binding reaction and the equilibrium constant for binding of ribosomes to this particular start codon.
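As a sketch of that derivation: the standard thermodynamic relationship is $\Delta G^{\circ} = -RT \ln K_{\text{eq}}$, so the equilibrium constant grows exponentially as the binding free energy becomes more negative. The symbols below are assumptions borrowed from Salis et al. (2009), where $r$ is the predicted translation initiation rate, $K$ is a proportionality constant, and $\beta$ is an apparent Boltzmann factor that takes the place of $1/RT$.

$$
K_{\text{eq}} = \exp\left(-\frac{\Delta G^{\circ}}{RT}\right)
\quad\Longrightarrow\quad
r = K \exp\left(-\beta \, \Delta G_{\text{total}}\right)
$$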
The proportionality constant and the other constants in this equation are fit to experimental data. The input data and a script to fit these constants and the additional parameters in the spacing penalty model are provided in the calibration directory.
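The calculation described above can be illustrated with a minimal Python sketch. It is not OSTIR's code: the function names are hypothetical, and the values of `K` and `BETA` are placeholders chosen for illustration rather than the constants fitted by the calibration script.

```python
import math

# Placeholder constants for illustration only (assumed, NOT OSTIR's fitted values).
BETA = 0.45  # apparent Boltzmann factor, in mol/kcal
K = 2500.0   # proportionality constant, in arbitrary rate units


def total_free_energy(dg_mrna_rrna, dg_start, dg_spacing, dg_standby, dg_mrna):
    """Combine the five energy components (kcal/mol).

    The standby and mRNA structure energies are subtracted because those
    structures must be disrupted before translation can initiate.
    """
    return dg_mrna_rrna + dg_start + dg_spacing - dg_standby - dg_mrna


def initiation_rate(dg_total, k=K, beta=BETA):
    """Relative translation initiation rate for a total binding free energy."""
    return k * math.exp(-beta * dg_total)


if __name__ == "__main__":
    # Made-up component energies, used only to exercise the functions.
    dg = total_free_energy(-8.0, -1.0, 0.5, -0.5, -4.0)
    print(dg, initiation_rate(dg))
```

Because the rate depends exponentially on the total free energy, a more negative total (stronger ribosome binding) gives a higher predicted rate, and small changes in any one component can shift the prediction several-fold.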
A full description of the thermodynamic model and some assumptions of this approach can be found in Salis (2011).
Although the general thermodynamic model used by OSTIR should apply for any bacterial species, it is specifically parameterized based on experiments in Escherichia coli.
Because OSTIR predictions are relative, it can be helpful to have a frame of reference for what value indicates a start codon with a strong ribosome binding site that is likely to drive significant gene expression in a bacterial cell versus a weak start codon that does not have appreciable activity.
The graph below shows the results of using OSTIR on the entire E. coli MG1655 genome sequence (GenBank:NC_000913.3). For the 4,249 annotated protein coding genes, there is a peak in the predicted relative initiation rate around 1,000, and most values are in the range 100 to 10,000. The distribution for other start codons peaks around a value of 10, and most predicted rates range from roughly 1 to 100. Note that there are still many potential start codons that do not begin annotated genes but are predicted to have relatively high translation initiation rates. However, they will not lead to translation if they are not in a region of the genome that is transcribed into mRNA.
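As a rough frame of reference, the ranges quoted above can be turned into a small lookup. This is only a sketch: the function and constant names are hypothetical, the boundaries are the approximate ranges from the E. coli MG1655 analysis, and they are not hard thresholds, since the two distributions overlap.

```python
# Approximate ranges taken from the E. coli MG1655 distributions described above.
ANNOTATED_GENE_RANGE = (100.0, 10_000.0)
OTHER_START_CODON_RANGE = (1.0, 100.0)


def typical_ranges(rate):
    """Return labels for the typical ranges that a predicted rate falls into."""
    labels = []
    low, high = ANNOTATED_GENE_RANGE
    if low <= rate <= high:
        labels.append("annotated gene range")
    low, high = OTHER_START_CODON_RANGE
    if low <= rate <= high:
        labels.append("other start codon range")
    return labels


if __name__ == "__main__":
    for rate in (0.1, 10.0, 100.0, 1000.0):
        print(rate, typical_ranges(rate))
```

A value that matches neither range, or both, is a reminder that the prediction should be read alongside other evidence, such as whether the region is transcribed at all.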
Lorenz R, Bernhart SH, Höner zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL. (2011) ViennaRNA Package 2.0. Algorithms Mol. Biol. 6:26. https://doi.org/10.1186/1748-7188-6-26.
Salis HM, Mirsky EA, Voigt CA. (2009) Automated Design of Synthetic Ribosome Binding Sites to Control Protein Expression. Nat. Biotechnol. 27:946–950. https://doi.org/10.1038/nbt.1568.
Salis HM. (2011) The Ribosome Binding Site Calculator. Methods Enzymol. 498:19–42. https://doi.org/10.1016/B978-0-12-385120-8.00002-4.