Introduction to the Staden sequence analysis package and its
user interface

The package contains the following programs:

GIP     Gel input program
SAP     Sequence assembly program
NIP     Nucleotide interpretation program
PIP     Protein interpretation program
SIP     Similarity investigation program
MEP     Motif exploration program
NIPL    Nucleotide interpretation program (library)
PIPL    Protein interpretation program (library)
SIPL    Similarity investigation program (library)

GIP uses a digitiser for entry of DNA sequences from
autoradiographs.

SAP handles everything relating to assembling gel readings in order
to produce a consensus sequence. It can also deal with families of
protein sequences.

NIP provides functions for analysing and interpreting individual
nucleotide sequences.

PIP provides functions for analysing and interpreting individual
protein sequences.

MEP analyses families of nucleotide sequences to help discover new
motifs.

NIPL performs pattern searches on nucleotide sequence libraries.

PIPL performs pattern searches on protein sequence libraries.

SIP provides functions for comparing and aligning pairs of protein
or nucleotide sequences.

SIPL searches nucleotide and protein sequence libraries for entries
similar to probe sequences.

Documentation

As is explained below, the programs SAP, NIP, PIP, SIP and MEP
have online help, and the help files have the names HELPSAP,
HELPNIP, HELPPIP, HELPSIP and HELPMEP. These files can be displayed on
the screen or printed using the appropriate commands. Currently the
help for the other programs is also contained in these files; for
example, help for NIPL is in HELPNIP. The file you are reading is
called HELPSTADEN.

Sequence formats

The shotgun sequencing program SAP deals only with simple text
files for gel readings, and is a self-contained system. However, as
there is still no single agreed format for finished sequences or for
libraries of sequences, the other programs in the package can read
data that is stored in several ways.

The analytical programs can read individual sequences stored
in the following formats: Staden, EMBL, GenBank, PIR (also known as
NBRF), and GCG, but for storing whole libraries we use only PIR
format. In addition these programs can perform a number of simple
operations using libraries stored in this format: they can extract
entries by entry name, search titles for keywords, search
the whole of the annotation files for keywords, and extract
annotations for any named entry. We reformat all sequence libraries
into PIR format. Currently we have the NBRF, EMBL, SWISSPROT and
VECBASE libraries in PIR format.

The library searching programs operate only on sequences
stored in PIR format.

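The two simplest library operations described above (extracting an entry by name and searching titles for a keyword) can be illustrated with a short Python sketch. It is not the package's own code. It assumes the commonly used NBRF/PIR text layout, in which a header line such as ">P1;NAME" is followed by a title line and then sequence lines ending in "*"; the way the package actually stores its libraries and annotation files on disk may differ. The names PirEntry, parse_pir and search_titles are hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class PirEntry(NamedTuple):
    name: str
    title: str
    sequence: str


def parse_pir(lines: Iterable[str]) -> Dict[str, PirEntry]:
    """Parse NBRF/PIR-style text into a dict keyed by entry name.

    Assumed layout: '>XX;NAME', then one title line, then sequence
    lines terminated by '*'.
    """
    entries: Dict[str, PirEntry] = {}
    name: Optional[str] = None
    title: Optional[str] = None
    seq_parts: List[str] = []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            # Header line: the entry name follows the first ';'.
            name = line.split(";", 1)[1] if ";" in line else line[1:]
            title = None
            seq_parts = []
        elif name is not None and title is None:
            # The line directly after the header is the title.
            title = line
        elif name is not None:
            # Sequence lines may contain blanks; '*' ends the entry.
            finished = "*" in line
            seq_parts.append(line.split("*", 1)[0].replace(" ", ""))
            if finished:
                entries[name] = PirEntry(name, title, "".join(seq_parts))
                name = None
    return entries


def search_titles(entries: Dict[str, PirEntry], keyword: str) -> List[str]:
    """Return the names of entries whose title contains the keyword."""
    key = keyword.lower()
    return [e.name for e in entries.values() if key in e.title.lower()]


# Made-up library text, for illustration only.
sample = """>P1;DEMO1
Hypothetical kinase example
MKT AYI AKQ
RQI*
>DL;DEMO2
Hypothetical cloning vector fragment
ACGT ACGT*
"""

library = parse_pir(sample.splitlines())
print(library["DEMO1"].sequence)          # extract an entry by name
print(search_titles(library, "vector"))   # keyword search of titles
```

Keying the dictionary on the entry name makes extraction by name a single lookup, while the title search is a plain linear scan.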
The analytical programs will operate with uppercase or
lowercase sequence characters. In addition, T and U are equivalent.
SAP uses uppercase letters for original gel readings and lowercase
letters for characters that are corrected by the automatic editor.
Programs NIP and PIP use IUB symbols for redundancy in back
translations and for sequence searches. The symbols are shown
below.

NC-IUB SYMBOLS

A,C,G,T
R       (A,G)       'puRine'
Y       (T,C)       'pYrimidine'
W       (A,T)       'Weak'
S       (C,G)       'Strong'
M       (A,C)       'aMino'
K       (G,T)       'Keto'
H       (A,T,C)     'not G'
B       (G,C,T)     'not A'
V       (G,A,C)     'not T'
D       (G,A,T)     'not C'
N       (G,A,C,T)   'aNy'

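As an illustration of how these symbols can drive a sequence search, the following Python sketch expands each symbol into its set of bases and scans a sequence for a redundant pattern. It is only a sketch, not the algorithm used by NIP or PIP. The function names are hypothetical, and the sequence being searched is assumed to contain only A, C, G and T (or U). Case is ignored and U is treated as T, following the conventions stated above.

```python
from typing import List

# Bases represented by each NC-IUB symbol (taken from the table above).
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "TC", "W": "AT", "S": "CG", "M": "AC", "K": "GT",
    "H": "ATC", "B": "GCT", "V": "GAC", "D": "GAT", "N": "GACT",
}


def normalise(seq: str) -> str:
    """Fold case and treat U as T."""
    return seq.upper().replace("U", "T")


def find_pattern(pattern: str, sequence: str) -> List[int]:
    """Return 1-based start positions where the IUB pattern matches."""
    pat = normalise(pattern)
    seq = normalise(sequence)
    hits = []
    for start in range(len(seq) - len(pat) + 1):
        # Every pattern symbol must allow the base found at that offset.
        if all(seq[start + i] in IUB[sym] for i, sym in enumerate(pat)):
            hits.append(start + 1)
    return hits


print(find_pattern("GANTC", "ttgaatcgagtc"))
```

In this example the pattern GANTC matches both GAATC and GAGTC, because N stands for any base.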
The user interface

The user interface is common to all programs. It consists of a
set of menus and a uniform way of presenting choices and obtaining
input from the user. This section describes the menu system; how
options are selected and other choices made; how values are
supplied to the program; how help is obtained; and how to escape
from any part of a program. In addition it gives information about
saving results in files and the use of graphics for presenting
results.

Menus

Each program has several menus and numerous options. Each menu
or option has a unique number that is used to identify it. Menu
numbers are distinguished from option numbers by being preceded by
the letter m (or M; the programs make no distinction between upper
and lower case letters). With the exception of some parts of program
SAP, the menus are not hierarchical; rather, each menu is simply a
list of related functions and their identifying numbers. Options can
therefore be selected independently of the menu that is currently
being shown on the screen, and the menus are simply memory aids. All
options and menus are selected by typing their number when the
programs present the prompt

"? Menu or option number =".

To select a menu, type its number preceded by the letter M. To
select an option, type its number. If you type only "return" you
will get menu m0, which is simply a list of menus. If you select an
option you will return to the current menu after the function is
completed.

When you select an option, in many cases the program will
immediately perform the operation selected without further dialogue.
If you precede an option number by the letter d (e.g. D17), you will
force the program to offer dialogue about the selected option before
the function operates, allowing you to change the value of any
of its parameters. If you precede an option number by the symbol ?
(e.g. ?17), you will be given help on the option (here 17).

Where possible, equivalent or identical options have been
given the same numbers in all programs, and so users quickly learn
the numbers for the functions they employ most often.

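The selection rules above can be summarised in a small Python sketch that classifies a reply typed at the "? Menu or option number =" prompt. It illustrates the conventions only and is not the package's own input routine; the names Selection and parse_selection are hypothetical, and rejecting anything else with an error is an assumption.

```python
import re
from typing import NamedTuple


class Selection(NamedTuple):
    kind: str     # "menu", "option", "dialogue" or "help"
    number: int


# Optional prefix (m, d or ?, in either case) followed by a number.
_REPLY = re.compile(r"^([mMdD?]?)(\d+)$")

_KINDS = {"": "option", "m": "menu", "d": "dialogue", "?": "help"}


def parse_selection(reply: str) -> Selection:
    """Classify a reply to the menu/option prompt."""
    text = reply.strip()
    if text == "":
        # Typing only "return" gives menu m0, the list of menus.
        return Selection("menu", 0)
    match = _REPLY.match(text)
    if match is None:
        raise ValueError(f"not a menu or option number: {reply!r}")
    prefix, digits = match.groups()
    return Selection(_KINDS[prefix.lower()], int(digits))


for reply in ["", "M2", "17", "D17", "?17"]:
    print(repr(reply), "->", parse_selection(reply))
```

Because menus are not hierarchical, the parser needs no knowledge of which menu is currently displayed: the reply alone determines what is selected.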
Help

As mentioned above, help about each option can be obtained by
preceding the option number by the symbol ? when you are presented
with the prompt "? Menu or option number =", but there are two
further ways of obtaining help. Whenever the program asks a question
you can respond by typing the symbol ? and you will receive
information about the current option. In addition, option number 1 in
all the programs will give help on all of a program's functions.

Quitting

To exit from any point in a program, type ! for quit. If a
menu is on the screen this will stop the program; otherwise you will
be returned to the last menu.

Other interactions

Questions are presented in a few restricted ways. In all
cases typing only "return" in response to a question means yes, and
typing N or n means no.

Obvious opposites such as "clear screen" and "keep picture"
are presented with only the default shown. For example, in this case
the default is generally "keep picture", so the program will display:

"(y/n) (y) Keep picture"

and the picture will be retained if the user types anything
other than N or n (in which case the screen will be cleared).

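Taken together with the ? and ! conventions described earlier, the handling of a reply to a yes/no question can be sketched as follows. This is a minimal Python illustration under those stated rules, not the package's actual code, and the function name interpret_reply is hypothetical.

```python
def interpret_reply(reply: str) -> str:
    """Classify a reply to a (y/n) question as help, quit, no or yes."""
    text = reply.strip()
    if text == "?":
        return "help"    # ? asks for help on the current option
    if text == "!":
        return "quit"    # ! escapes from the current point
    if text in ("N", "n"):
        return "no"
    # Anything else, including a bare "return", counts as yes.
    return "yes"


for reply in ["", "y", "n", "N", "?", "!"]:
    print(repr(reply), "->", interpret_reply(reply))
```

Treating every unrecognised reply as yes mirrors the rule that the default is taken unless the user explicitly types N or n.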
Where there are choices that are not obvious opposites, or
there are more than two choices, two further conventions are used:
"radio buttons" and "check boxes".

Radio buttons are used when only one of a number of choices
can be made at any one time. The choices are presented arranged one
above the other, each choice with a number for its selection, and
the default choice marked with an X. For example, in the restriction
enzyme search routine the following choices are offered:

Select output mode
  1 order results enzyme by enzyme
  2 order results by position
X 3 show only infrequent cutters
  4 show names above the sequence