Introduction to the Staden sequence analysis package and its
user interface
The package contains the following programs:
GIP Gel input program
SAP Sequence assembly program
NIP Nucleotide interpretation program
PIP Protein interpretation program
SIP Similarity investigation program
MEP Motif exploration program
NIPL Nucleotide interpretation program (library)
PIPL Protein interpretation program (library)
SIPL Similarity investigation program (library)
GIP uses a digitiser for entry of DNA sequences from
autoradiographs.
SAP handles everything relating to assembling gel readings in order
to produce a consensus sequence. It can also deal with families of
protein sequences.
NIP provides functions for analysing and interpreting individual
nucleotide sequences.
PIP provides functions for analysing and interpreting individual
protein sequences.
MEP analyses families of nucleotide sequences to help discover new
motifs.
NIPL performs pattern searches on nucleotide sequence libraries.
PIPL performs pattern searches on protein sequence libraries.
SIP provides functions for comparing and aligning pairs of protein
or nucleotide sequences.
SIPL searches nucleotide and protein sequence libraries for entries
similar to probe sequences.
Documentation
As is explained below, the programs SAP, NIP, PIP, SIP and MEP
have online help, and the help files have the names: HELPSAP,
HELPNIP, HELPPIP, HELPSIP, HELPMEP. These files can be displayed on
the screen or printed using the appropriate commands. Currently the
help for the other programs is also contained in these files. For
example, help for NIPL is in HELPNIP. This file is called HELPSTADEN.
Sequence formats
The shotgun sequencing program SAP deals only with simple text
files for gel readings, and is a self-contained system. However, as
there is still no single agreed format for finished sequences or for
libraries of sequences, the other programs in the package can read
data that is stored in several ways.
The analytical programs can read individual sequences stored
in the following formats: Staden, EMBL, Genbank, PIR (also known as
NBRF), and GCG, but for storing whole libraries we use only PIR
format. In addition these programs can perform a number of simple
operations using libraries stored in this format. They can extract
entries by entry name, can search titles for keywords, can search
the whole of the annotation files for keywords, and can extract
annotations for any named entry. We reformat all sequence libraries
into PIR format. Currently we have NBRF, EMBL, SWISSPROT and VECBASE
libraries in PIR format.
The library searching programs operate only on sequences
stored in PIR format.
The analytical programs will operate with uppercase or
lowercase sequence characters. In addition T and U are equivalent.
SAP uses uppercase letters for original gel readings and lowercase
letters for characters that are corrected by the automatic editor.
Programs NIP and PIP use IUB symbols for redundancy in back
translations and for sequence searches. The symbols are shown
below.
NC-IUB SYMBOLS
A,C,G,T
R (A,G) 'puRine'
Y (T,C) 'pYrimidine'
W (A,T) 'Weak'
S (C,G) 'Strong'
M (A,C) 'aMino'
K (G,T) 'Keto'
H (A,T,C) 'not G'
B (G,C,T) 'not A'
V (G,A,C) 'not T'
D (G,A,T) 'not C'
N (G,A,C,T) 'aNy'
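The table above can be read as a lookup from each symbol to the set
of bases it stands for. The following Python sketch is an
illustration only and is not taken from NIP or PIP. The names
IUB_CODES, normalise and find_matches are hypothetical. It assumes
that a pattern symbol matches a sequence base when the base is one of
the bases the symbol stands for, and that the sequence being searched
contains only A, C, G, T or U. Upper and lower case are treated alike
and U is treated as T, as described above.

```python
# Hypothetical sketch: NC-IUB symbols as sets of bases.
# This is not code from the Staden package.
IUB_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "TC", "W": "AT", "S": "CG",
    "M": "AC", "K": "GT",
    "H": "ATC", "B": "GCT", "V": "GAC", "D": "GAT",
    "N": "GACT",
}


def normalise(seq):
    """Fold to uppercase and treat U as equivalent to T."""
    return seq.upper().replace("U", "T")


def find_matches(pattern, seq):
    """Return the 1-based start positions where pattern matches seq.

    Raises KeyError if the pattern contains a symbol not in the table.
    """
    pattern, seq = normalise(pattern), normalise(seq)
    if not pattern:
        return []
    hits = []
    for start in range(len(seq) - len(pattern) + 1):
        window = seq[start:start + len(pattern)]
        # Every pattern symbol must allow the base found at its position.
        if all(base in IUB_CODES[sym] for sym, base in zip(pattern, window)):
            hits.append(start + 1)
    return hits


print(find_matches("GANTC", "ttgaatcgagtc"))
```

Here the pattern GANTC is searched for in a lowercase sequence; the N
in the pattern accepts any base at the third position.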
The user interface
The user interface is common to all programs. It consists of a
set of menus and a uniform way of presenting choices and obtaining
input from the user. This section describes: the menu system; how
options are selected and other choices made; how values are
supplied to the program; how help is obtained, and how to escape
from any part of a program. In addition it gives information about
saving results in files and the use of graphics for presenting
results.
Menus
Each program has several menus and numerous options. Each menu
or option has a unique number that is used to identify it. Menu
numbers are distinguished from option numbers by being preceded by
the letter m (or M; the programs make no distinction between upper
and lower case letters). With the exception of some parts of program
SAP, the menus are not hierarchical; rather, the options they each
contain are simply lists of related functions and their identifying
numbers. Therefore options can be selected independently of the menu
that is currently being shown on the screen, and the menus are
simply memory aids. All options and menus are selected by typing
their numbers when the programs present the prompt
"? Menu or option number =".
To select a menu, type its number preceded by the letter M. To
select an option, type its number. If you type only "return" you
will get menu m0 which is simply a list of menus. If you select an
option you will return to the current menu after the function is
completed.
When you select an option, in many cases the program will
immediately perform the operation selected without further dialogue.
If you precede an option number by the letter d (e.g. D17), you will
force the program to offer dialogue about the selected option before
the function operates, hence allowing you to change the value of any
of its parameters. If you precede an option number by the symbol ?
(e.g. ?17), you will be given help on the option (here 17).
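The rules for replies to the "? Menu or option number =" prompt can
be summarised in a short Python sketch. This is an illustration of
the conventions described in this file, not the programs' own code.
The function name parse_command and the names it returns for each
kind of reply are hypothetical, and treating anything else as
"invalid" is an assumption.

```python
def parse_command(text):
    """Classify a reply to the '? Menu or option number =' prompt.

    Returns (kind, number), where kind is 'menu', 'option', 'dialogue',
    'help', 'quit' or 'invalid' (hypothetical names for illustration).
    """
    # The programs make no distinction between upper and lower case.
    reply = text.strip().upper()
    if reply == "":
        return ("menu", 0)        # bare return shows menu m0, the list of menus
    if reply == "!":
        return ("quit", None)     # ! is the quit symbol
    prefixes = {"M": "menu", "D": "dialogue", "?": "help"}
    if reply[0] in prefixes:
        kind, digits = prefixes[reply[0]], reply[1:]
    else:
        kind, digits = "option", reply
    if not digits.isdecimal():
        return ("invalid", None)  # assumption: anything else is rejected
    return (kind, int(digits))


for example in ["m3", "17", "D17", "?17", "", "!"]:
    print(repr(example), "->", parse_command(example))
```

The loop at the end classifies one reply of each kind described in
this section.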
Where possible, equivalent or identical options have been
given the same numbers in all programs, and so users quickly learn
the numbers for the functions they employ most often.
Help
As mentioned above, help about each option can be obtained by
preceding the option number by the symbol ? when you are presented
with the prompt "? Menu or option number", but there are two further
ways of obtaining help. Whenever the program asks a question you can
respond by typing the symbol ? and you will receive information
about the current option. In addition, option number 1 in all the
programs will give help on all of a program's functions.
Quitting
To exit from any point in a program you type ! for quit. If a
menu is on the screen this will stop the program, otherwise you will
be returned to the last menu.
Other interactions
Questions are presented in a few restricted ways. In all
cases typing only "return" in response to a question means yes, and
typing N or n means no.
Obvious opposites such as "clear screen" and "keep picture"
are presented with only the default shown. For example in this case
the default is generally "keep picture" so the program will display:
"(y/n) (y) Keep picture"
and the picture will be retained if the user types anything
other than N or n (in which case the screen will be cleared).
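The yes/no convention amounts to a single test, sketched here in
Python. The function name answer_is_yes is hypothetical, and the
sketch ignores the ? and ! replies, which request help and quit
rather than answer the question.

```python
def answer_is_yes(reply):
    """Interpret a reply to a '(y/n) (y)' question.

    Anything other than N or n, including a bare return, means yes.
    """
    return reply.strip().upper() != "N"


# A bare return keeps the picture; only N or n clears the screen.
for reply in ["", "y", "n", "N"]:
    print(repr(reply), "->", "keep picture" if answer_is_yes(reply) else "clear screen")
```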
Where there are choices that are not obvious opposites, or
there are more than two choices, two further conventions are used:
"radio buttons" and "check boxes".
Radio buttons are used when only one of a number of choices
can be made at any one time. The choices are presented arranged one
above the other, each choice with a number for its selection, and
the default choice marked with an X. For example in the restriction
enzyme search routine the following choices are offered:
Select output mode
1 order results enzyme by enzyme
2 order results by position
X 3 show only infrequent cutters
4 show names above the sequence