GeneticCode

class GeneticCode(transl_table=1)

A container for NCBI translation tables.

See the NCBI translation tables, which this week are at http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c

(If they move, poke around the ‘taxonomy browser’ area.)

This week we have

  • 1 standard
  • 2 vertebrate mito
  • 3 yeast mito
  • 4 Mold, Protozoan
  • 5 invertebrate mito
  • 6 The Ciliate, Dasycladacean and Hexamita Nuclear Code
  • 9 echinoderm and flatworm mito
  • 10 Euplotid Nuclear Code
  • 11 Bacterial and Plant Plastid Code
  • 12 Alternative Yeast Nuclear Code
  • 13 Ascidian Mitochondrial Code
  • 14 Alternative Flatworm Mitochondrial Code
  • 21 Trematode Mitochondrial Code
  • 24 Pterobranchia mito

If more transl_tables are needed, you should be able to just drop them in, with a little tweaking.

This provides

  • code A dictionary. So you can ask for eg myGC.code['ggg']
  • codonsForAA Another dictionary, where you can ask for eg myGC.codonsForAA['v']
  • startList A list of start codons
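
As a rough illustration of what the two dictionaries hold, here is a minimal standalone sketch for the standard code (transl_table 1). It does not use this class: the names AAS, code and codons_for_aa are hypothetical, and the lowercase keys and list values are assumptions based on the examples above.

```python
from itertools import product

# Standard code (transl_table 1) in NCBI's compact layout: bases in the
# order t, c, a, g, with the third codon position varying fastest.
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Codon -> amino acid, analogous to the 'code' attribute (assumed lowercase).
code = {
    "".join(bases): aa
    for bases, aa in zip(product("tcag", repeat=3), AAS.lower())
}

# Amino acid -> list of codons, analogous to 'codonsForAA' (assumed layout).
codons_for_aa = {}
for codon, aa in code.items():
    codons_for_aa.setdefault(aa, []).append(codon)

print(code["ggg"])
print(sorted(codons_for_aa["v"]))
```

The second dictionary is just the first one inverted, so the two always agree with each other.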

Methods

  • GeneticCode.translate(theCodon[, verbose]) Translate a codon, handling ambiguities.
  • GeneticCode.wise2Table() Output in a form suitable to replace codon.table in genewise in Wise2.

Wikipedia says: The joint nomenclature committee of the IUPAC/IUBMB has officially recommended the three-letter symbol Sec and the one-letter symbol U for selenocysteine. The UGA codon is made to encode selenocysteine by the presence of a SECIS element (SElenoCysteine Insertion Sequence) in the mRNA.

translate(theCodon, verbose=1)

Translate a codon, handling ambiguities.

This method will translate a codon, depending of course on the transl_table (which is specific to self), correctly handling ambiguities appropriate to the transl_table. It does not give info about whether the codon is potentially a start codon.

This method is used by the methods Alignment.Alignment.translate() and Alignment.Alignment.checkTranslation().

A translation like that from codon ggg to amino acid g is direct and easy. However, this method will also translate ambiguous codons ggy or ggs (and so on) to g, unequivocally, because the four codons that start with gg all code for g (which appears to be true for all translation tables, but that is not a requirement here).

If all possible disambiguations of an ambiguous codon code for a particular amino acid then this method will return that amino acid; otherwise the translation is ambiguous and this method returns x.

So for example the codon tgr will translate to x using transl_table = 1 (standard) because tga is a stop codon and tgg codes for w. However, using transl_table = 2 (vertebrate mito), codon tgr will translate to w because both tga and tgg code for w:

>>> gc = GeneticCode(transl_table=1)
>>> gc.translate('tgr')
    #   codon 'tgr' translates to ['*', 'w'] -- ambiguous -- returning 'x'
>>> gc = GeneticCode(transl_table=2)
>>> gc.translate('tgr')
    #   codon 'tgr' translates to 'w'

Exceptions to the latter rule are codons that ambiguously code for either d or n, which return ambiguous amino acid b, and codons that ambiguously code for either q or e, which return ambiguous amino acid z. See the example below.

If arg verbose is 0, it does not speak (except for errors, of course). If it is 1, it speaks for ambiguous translations. If it is 2, it speaks for all translations. The default is 1:

gc = GeneticCode(transl_table=1)

for cdn in 'gga ggy ray ggn aam sar ccm'.split():
    gc.translate(cdn, verbose=2)

prints:

codon 'gga' translates to 'g'
codon 'ggy' translates to 'g'
codon 'ray' translates to ['d', 'n'] -- ambiguous aa 'b'
codon 'ggn' translates to 'g'
codon 'aam' translates to ['k', 'n'] -- ambiguous -- returning 'x'
codon 'sar' translates to ['q', 'e'] -- ambiguous aa 'z'
codon 'ccm' translates to 'p'
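
The rule described in this section (expand the ambiguous codon, translate every possibility, and collapse the result) can be sketched in standalone Python. This is a minimal illustration under stated assumptions, not this method's actual implementation: the names AAS, IUPAC, make_code and translate_sketch are hypothetical, only transl_tables 1 and 2 are included, and there is no verbose reporting.

```python
from itertools import product

# Amino acid strings in NCBI's compact layout (base order t, c, a, g,
# third codon position varying fastest). Only two tables, for illustration.
AAS = {
    1: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    2: "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG",
}

# IUPAC nucleotide ambiguity codes, each mapped to the bases it stands for.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}


def make_code(transl_table):
    """Build a codon -> amino acid dictionary (hypothetical helper)."""
    codons = ("".join(bases) for bases in product("tcag", repeat=3))
    return dict(zip(codons, AAS[transl_table].lower()))


def translate_sketch(codon, code):
    """Translate a possibly ambiguous codon using the rule in the text."""
    # Every unambiguous codon that the ambiguous codon could stand for.
    expansions = product(*(IUPAC[base] for base in codon.lower()))
    possible = {code["".join(bases)] for bases in expansions}
    if len(possible) == 1:
        return possible.pop()   # all disambiguations agree
    if possible == {"d", "n"}:
        return "b"              # d or n
    if possible == {"e", "q"}:
        return "z"              # q or e
    return "x"                  # anything else is ambiguous
```

With these definitions, translate_sketch('tgr', make_code(1)) gives x and translate_sketch('tgr', make_code(2)) gives w, in line with the tgr example earlier in this section.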

wise2Table()

Output in a form suitable to replace codon.table in genewise in Wise2.

See the Wise2 web site.

By default, genewise from the Wise2 package uses the standard genetic code, defined in the file codon.table. However, you can supply your own table, and you can use this method to make a suitable file. Due to lazy programming, this method prints to stdout, so you will need to put the output in a file yourself. Put that file, suitably named (eg codon.table5 or whatever) in the wisecfg directory (where the original codon.table resides), which might be /usr/local/src/wise2.2.0/wisecfg or some such location.
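
One way to get the printed table into a file from within Python is to redirect stdout temporarily. The sketch below assumes the method prints through Python's sys.stdout. The helper capture_stdout and the stand-in fake_wise2_table are hypothetical names used only so that the example runs on its own.

```python
import contextlib
import io


def capture_stdout(func, *args, **kwargs):
    """Call func and return whatever it printed to stdout, as a string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def fake_wise2_table():
    """Stand-in for a method that prints a table (not the real output)."""
    print("placeholder codon table line")


table_text = capture_stdout(fake_wise2_table)

# With a GeneticCode instance gc, the same idea might look like this
# (an untested assumption, shown as a comment only):
#     with open("codon.table5", "w") as f:
#         f.write(capture_stdout(gc.wise2Table))
```

Redirecting the output of the script in the shell (with >) would do the same job. The version above is only useful when the rest of the script also prints things you do not want in the table file.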

Then, when you call genewise you can use the -codon option to set the codon table file that you want to use, eg:

genewise -genes -cdna -trans -pep -pretty -silent -codon codon.table5 guideFileName dnaFileName