Sequence
- class Sequence
A container for a single molecular sequence.
Data attributes
sequence: a string, the molecular sequence
name: a string, the name
dataType: either ‘dna’ or ‘protein’, or None, meaning ‘standard’
- checkTranslation(theProteinSequence, transl_table=1, checkStarts=False)
Check that self translates to theProteinSequence.
Self should be a DNA sequence. It is translated using p4.geneticcode.GeneticCode.translate() (so it should handle ambiguities) and compared against theProteinSequence. The name and gap pattern of theProteinSequence should be the same as in the DNA sequence. The default transl_table is the standard (or so-called universal) genetic code.
Other translation tables currently available:
    if transl_table == 1:   # standard
    elif transl_table == 2: # vertebrate mito
    elif transl_table == 4: # Mold, Protozoan,
                            # and Coelenterate Mitochondrial Code
                            # and the Mycoplasma/Spiroplasma Code
    elif transl_table == 5: # invertebrate mito
    elif transl_table == 9: # echinoderm mito
    # and now 6, 10, 11, 12, 13, 14, 21.
(These are found in GeneticCode.)
See also p4.alignment.Alignment.translate() and p4.alignment.Alignment.checkTranslation().
If the checkStarts argument is True (the default is False), this method also checks whether the first codon is a start codon.
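The requirement that the protein and DNA share a gap pattern can be illustrated with a minimal, standalone sketch. It does not use p4. The helper name `same_gap_pattern` is hypothetical, and the rule it applies (a protein gap `-` must line up with a fully gapped codon `---`, and nothing else) is an assumption about what "same gap pattern" means here. The actual check in checkTranslation() may differ, for example in how it treats partially gapped codons.

```python
def same_gap_pattern(dna, protein):
    """Hypothetical helper: True if gaps in protein line up with '---' codons in dna.

    Assumes frame 123 (the first DNA position is the first codon position)
    and that a gap is written as '-'.
    """
    # Each protein position must correspond to exactly one codon.
    if len(dna) != 3 * len(protein):
        return False
    for i, aa in enumerate(protein):
        codon = dna[3 * i:3 * i + 3]
        # A gap on one side must be matched by a gap on the other side.
        if (aa == "-") != (codon == "---"):
            return False
    return True


print(same_gap_pattern("atg---gcc", "M-A"))  # gaps line up
print(same_gap_pattern("atg---gcc", "MA-"))  # gaps do not line up
```

A check like this is cheap to run before comparing residues, since a length or gap mismatch usually points to a misaligned pair rather than a wrong genetic code.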
- property nChar
Return the length of the sequence, or zero.
- reverseComplement()
Convert self.sequence, a DNA sequence, to its reverse complement.
Ambiguity codes are handled correctly. I think.
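As a rough illustration of what reverse-complementing with ambiguity codes involves, here is a standalone sketch that does not use p4. The function name `reverse_complement` and the lookup table are hypothetical, and lowercasing the input is an assumption made for this example. The table follows the standard IUPAC nucleotide codes, where each ambiguity code maps to the code for the complementary set of bases.

```python
# Complement of each IUPAC nucleotide code, including ambiguity codes.
IUPAC_COMPLEMENT = {
    "a": "t", "c": "g", "g": "c", "t": "a",
    "r": "y", "y": "r",  # r = a or g, y = c or t
    "k": "m", "m": "k",  # k = g or t, m = a or c
    "b": "v", "v": "b",  # b = not a, v = not t
    "d": "h", "h": "d",  # d = not c, h = not g
    "s": "s", "w": "w",  # s = c or g, w = a or t; both are self-complementary
    "n": "n", "-": "-", "?": "?",
}


def reverse_complement(dna):
    """Hypothetical helper: return the reverse complement of a DNA string."""
    # Walk the sequence backwards and complement each character.
    return "".join(IUPAC_COMPLEMENT[ch] for ch in reversed(dna.lower()))


print(reverse_complement("aacr"))  # ygtt
```

Note that this sketch returns a new string, whereas the method described above converts self.sequence.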
- translate(transl_table=1, checkStarts=False, nnn_is_gap=False)
Returns a protein Sequence from self, a DNA sequence.
Self is translated using p4.geneticcode.GeneticCode.translate(), so it handles ambiguities. At the moment, only translations in frame 123 are supported, i.e. the first sequence position must be the first position of a codon. The default transl_table is the standard (or so-called universal) genetic code, but you can change it.
Other translation tables currently available:
    if transl_table == 1:   # standard
    elif transl_table == 2: # vertebrate mito
    elif transl_table == 4: # Mold, Protozoan,
                            # and Coelenterate Mitochondrial Code
                            # and the Mycoplasma/Spiroplasma Code
    elif transl_table == 5: # invertebrate mito
    elif transl_table == 9: # echinoderm mito
    # and now 6, 10, 11, 12, 13, 14, 21.
(These are found in p4.geneticcode.GeneticCode.)
See also p4.alignment.Alignment.checkTranslation() and p4.alignment.Alignment.translate().
If the checkStarts argument is True (the default is False), this method checks whether the first codon is a start codon, and if it is, uses it.
The nnn_is_gap argument is for odd sequences with long stretches of ‘nnn’ codons that should probably be gaps. It is probably best to correct those some other way.
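The frame-123 restriction and the effect of nnn_is_gap can be sketched without p4 as follows. This is an illustration under stated assumptions, not the implementation in p4.geneticcode.GeneticCode: the names `STANDARD_CODE` and `translate_frame_123` are hypothetical, only the standard code (transl_table 1) is built, any codon that is not a plain a/c/g/t triplet is written as `x` rather than resolved as an ambiguity, and the uppercase one-letter output is a choice made for this example.

```python
from itertools import product

# Standard genetic code (NCBI table 1), with codons ordered t, c, a, g
# at each position and the third position varying fastest.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame_123(dna, nnn_is_gap=False):
    """Hypothetical helper: translate dna starting at its first position."""
    dna = dna.lower()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon == "---" or (nnn_is_gap and codon == "nnn"):
            # Fully gapped codons, and optionally 'nnn' codons, become gaps.
            protein.append("-")
        else:
            # Codons not in the table (ambiguous or partly gapped) become 'x'
            # in this sketch; real ambiguity handling is more careful.
            protein.append(STANDARD_CODE.get(codon, "x"))
    return "".join(protein)


print(translate_frame_123("atggcctaa"))                   # MA*
print(translate_frame_123("atg---nnn"))                   # M-x
print(translate_frame_123("atg---nnn", nnn_is_gap=True))  # M--
```

The last two calls show the difference the flag makes: without it an ‘nnn’ codon is treated as an unknown residue, and with it the codon is treated as a gap.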