Sequence

class Sequence

A container for a single molecular sequence.

Data attributes

  • sequence: a string, the molecular sequence

  • name: a string, the name of the sequence

  • dataType: either ‘dna’ or ‘protein’, or None, meaning ‘standard’

checkTranslation(theProteinSequence, transl_table=1, checkStarts=False)

Check that self translates to theProteinSequence.

Self should be a DNA sequence. It is translated using p4.geneticcode.GeneticCode.translate() (so it should handle ambiguities) and compared against theProteinSequence. The name and gap pattern of theProteinSequence should be the same as in the DNA sequence. The default transl_table is 1, the standard (or so-called universal) genetic code.

Translation tables currently available:

  • 1: standard

  • 2: vertebrate mitochondrial

  • 4: mold, protozoan, and coelenterate mitochondrial code, and the Mycoplasma/Spiroplasma code

  • 5: invertebrate mitochondrial

  • 9: echinoderm mitochondrial

  • 6, 10, 11, 12, 13, 14, and 21 are also available

(These are found in p4.geneticcode.GeneticCode.)

See also p4.alignment.Alignment.translate() and p4.alignment.Alignment.checkTranslation().

If the arg checkStarts is turned on (it is off by default), this method also checks whether the first codon is a start codon.
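The comparison described above can be sketched in plain Python. This is only an illustration under stated assumptions, not p4's implementation: the function name find_translation_mismatches, the TINY_TABLE codon dictionary, the lowercase symbols, and the choice to return a list of mismatches are all hypothetical, and ambiguity codes and checkStarts are not modelled.

```python
def find_translation_mismatches(dna, protein, codon_to_aa):
    """Compare each codon of dna with the matching residue of protein.

    Returns a list of (protein_position, expected, found) tuples; an
    empty list means the DNA translates to the protein.
    """
    if len(dna) != 3 * len(protein):
        raise ValueError("DNA length must be three times the protein length")
    mismatches = []
    for pos, found in enumerate(protein):
        codon = dna[3 * pos:3 * pos + 3]
        # An all-gap codon must line up with a gap in the protein;
        # codons missing from the table become 'x' (unknown).
        expected = "-" if codon == "---" else codon_to_aa.get(codon, "x")
        if expected != found:
            mismatches.append((pos, expected, found))
    return mismatches


# A deliberately tiny, hypothetical codon table for the demonstration.
TINY_TABLE = {"atg": "m", "gcc": "a", "aaa": "k"}

# Same gap pattern in both sequences, so no mismatches are reported.
print(find_translation_mismatches("atg---gcc", "m-a", TINY_TABLE))
```

Returning the mismatches rather than stopping at the first one makes it easier to see whether a problem is a single wrong residue or a shifted gap pattern.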

dump()

Print diagnostic information about self.

dupe()

Return a duplicate of self.

property nChar

Return the length of the sequence, or zero if there is no sequence.
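As a rough sketch of how the data attributes and nChar fit together, here is a hypothetical stand-in class. MiniSequence is not part of p4, and returning zero for a missing or empty sequence is an assumption about what "or zero" means.

```python
class MiniSequence:
    """Hypothetical stand-in for a single-sequence container (not p4's class)."""

    def __init__(self, name=None, sequence=None, dataType=None):
        self.name = name            # a string, the name
        self.sequence = sequence    # a string, the molecular sequence
        self.dataType = dataType    # 'dna', 'protein', or None

    @property
    def nChar(self):
        # Assumption: a missing or empty sequence has length zero.
        return len(self.sequence) if self.sequence else 0


s = MiniSequence(name="seqA", sequence="acgt", dataType="dna")
print(s.nChar)
```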

reverseComplement()

Convert self.sequence, a DNA sequence, to its reverse complement.

Ambiguity codes are believed to be handled correctly.
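To make the treatment of ambiguity codes concrete, here is a minimal sketch of reverse complementing with the IUPAC nucleotide codes. It is not p4's implementation: the function name reverse_complement is hypothetical, lowercase symbols are assumed, and the sketch returns a new string rather than modifying self.sequence.

```python
# Illustrative only. IUPAC nucleotide codes mapped to their complements.
_COMPLEMENT = {
    "a": "t", "c": "g", "g": "c", "t": "a",
    "r": "y", "y": "r",   # purine <-> pyrimidine
    "k": "m", "m": "k",   # keto <-> amino
    "b": "v", "v": "b",   # not-a <-> not-t
    "d": "h", "h": "d",   # not-c <-> not-g
    "s": "s", "w": "w",   # strong and weak are self-complementary
    "n": "n", "-": "-", "?": "?",
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string as lowercase."""
    return "".join(_COMPLEMENT[ch] for ch in reversed(dna.lower()))


print(reverse_complement("aacg"))   # cgtt
print(reverse_complement("arn-"))   # -nyt
```

Each ambiguity code maps to the code for the complementary base set, so applying the function twice gives back the original sequence.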

translate(transl_table=1, checkStarts=False, nnn_is_gap=False)

Returns a protein Sequence from self, a DNA sequence.

Self is translated using p4.geneticcode.GeneticCode.translate(), so it handles ambiguities. At the moment, only translation in frame 1 is supported, i.e. the first sequence position must be the first position of a codon. The default transl_table is 1, the standard (or so-called universal) genetic code, but you can change it.

Translation tables currently available:

  • 1: standard

  • 2: vertebrate mitochondrial

  • 4: mold, protozoan, and coelenterate mitochondrial code, and the Mycoplasma/Spiroplasma code

  • 5: invertebrate mitochondrial

  • 9: echinoderm mitochondrial

  • 6, 10, 11, 12, 13, 14, and 21 are also available

(These are found in p4.geneticcode.GeneticCode.)

See also p4.alignment.Alignment.checkTranslation() and p4.alignment.Alignment.translate().

If the arg checkStarts is turned on (it is off by default), this method checks whether the first codon is a start codon and, if it is, uses it.

Arg nnn_is_gap is for odd sequences that contain long stretches of ‘nnn’ codons, which probably should be gaps. It is probably better to correct such sequences by other means.
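The frame-1 translation and the effect of nnn_is_gap can be sketched as follows. This is a simplified illustration, not p4's code: the names translate_frame1 and STANDARD_CODE are hypothetical, only the standard code (table 1) is built, lowercase symbols are assumed, and any codon not in the table, including ambiguous ones, is simply written as 'x' here, whereas the real method handles ambiguities.

```python
from itertools import product

# Standard genetic code (table 1) in the usual compact layout:
# codons are enumerated with bases in the order t, c, a, g.
_BASES = "tcag"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa.lower()
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate_frame1(dna, nnn_is_gap=False):
    """Translate a DNA string codon by codon, starting at position 1."""
    dna = dna.lower()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon == "---" or (nnn_is_gap and codon == "nnn"):
            protein.append("-")    # gap codon, or 'nnn' treated as a gap
        else:
            # Simplification: unknown or ambiguous codons become 'x'.
            protein.append(STANDARD_CODE.get(codon, "x"))
    return "".join(protein)


print(translate_frame1("atggcctaa"))                    # ma*
print(translate_frame1("atg---nnn"))                    # m-x
print(translate_frame1("atg---nnn", nnn_is_gap=True))   # m--
```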

write()
writeFasta(fName=None, width=60, doComment=True, writeExtraNewline=True)
writeFastaToOpenFile(flob, width=60, doComment=True, writeExtraNewline=True)
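The role of the width argument can be illustrated with a short sketch of writing one FASTA record to an open file-like object. This is an assumption-laden illustration, not p4's implementation: write_fasta_record is a hypothetical function, width is assumed to be the maximum number of sequence characters per line, and the doComment and writeExtraNewline options are not modelled because their behaviour is not described here.

```python
import io


def write_fasta_record(flob, name, sequence, width=60):
    """Write one FASTA record to flob, wrapping the sequence at width."""
    flob.write(">%s\n" % name)
    for i in range(0, len(sequence), width):
        flob.write(sequence[i:i + width] + "\n")


# Write an 80-character sequence to an in-memory file and show the result.
buf = io.StringIO()
write_fasta_record(buf, "seqA", "acgt" * 20, width=60)
print(buf.getvalue())
```

With width=60, the 80-character sequence is written as one line of 60 characters followed by one line of 20.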