gde_linux/CORE/xylem/ribosome.doc

      ribosome                                                update  3 Feb 94

      NAME
           ribosome - translates nucleic acid into protein

      SYNOPSIS
           ribosome [-g gcfile] < input > output

      DESCRIPTION
           ribosome reads a file of one or more nucleic acid sequences
           and writes the corresponding amino acid sequence, in the standard
           one letter code, to output. Ribosome begins translating at the
           first nucleotide in each input sequence and continues to the end.
           If the length of the translated sequence is not divisible by 3,
           ribosome pads the final codon with N's and attempts to use ambi-
           guity rules to translate the final codon. Based on the genetic
           code used, ribosome derives a set of rules to resolve all ambi-
           guities that can possibly be resolved.

           -g    read in an alternative genetic code from gcfile.  If this
                 option is not specified, ribosome uses the universal
                 genetic code.

           gcfile - This file specifies an alternative genetic code. An
           example is shown below.  ribosome reads the first 64 legal
           capital letters as amino acids.  Consequently, lowercase letters
           can be used for annotation purposes, as shown in the example.
           All non-amino acid characters are ignored.

                    sgc2 - yeast mitochondrial genetic code

                                  second position
           first position  -------------------------------    third position
           (5' end)        u         c         a         g    (3' end)
           -----------------------------------------------------------------
           u               F         S         Y         C    u
                           F         S         Y         C    c
                           L         S         *         W    a
                           L         S         *         W    g
           -----------------------------------------------------------------
           c               T         P         H         R    u
                           T         P         H         R    c
                           T         P         Q         R    a
                           T         P         Q         R    g
           -----------------------------------------------------------------
           a               I         T         N         S    u
                           I         T         N         S    c
                           M         T         K         R    a
                           M         T         K         R    g
           -----------------------------------------------------------------
           g               V         A         D         G    u
                           V         A         D         G    c
                           V         A         E         G    a
                           V         A         E         G    g


           input - If the first line of the file begins with '>' or ';',
           input will be read as the standard .wrp (Pearson) format,
           such as that produced by getob:

           >name
           ; one or more comment lines (optional)
           sequence lines


           Otherwise, it will be assumed that the file ONLY contains
           sequence, and all legal IUPAC/IUB DNA characters will be
           read as sequence.

      SEE ALSO
            getob

     AUTHOR
       Dr. Brian Fristensky
       Dept. of Plant Science
       University of Manitoba
       Winnipeg, MB  Canada  R3T 2N2
       Phone: 204-474-6085
       FAX: 204-261-5732
       frist@cc.umanitoba.ca

     REFERENCE
       Fristensky, B. (1993) Feature expressions: creating and manipulating
       sequence datasets. Nucleic Acids Research 21:5997-6003.