shuffle.doc                                        update 3 Feb 94

SYNOPSIS
     shuffle -sn [-wn -on]

DESCRIPTION
     Shuffles sequences locally. See Lipman DJ, Wilbur WJ, Smith TF and
     Waterman MS (1984) On the statistical significance of nucleic acid
     similarities. Nucl. Acids Res. 12:215-226.

     -sn  n is a random integer between 0 and 32767, used as the random
          seed. This number must be provided for each run.

     -wn  n is an integer, indicating the width of the window within
          which shuffling is localized. If w exceeds the length of a
          sequence, or is negative, the entire sequence is scrambled as
          a single window. This is also the case if w is not specified.

     -on  n is an integer, indicating the number of nucleotides by which
          adjacent windows overlap. It should never exceed the window
          size. o defaults to 0 if not specified.

     If w and o are specified, overlapping windows of w nucleotides are
     shuffled, thus preserving the local characteristic base composition.
     Windows overlap by o nucleotides. If w and o are not specified, each
     sequence is shuffled globally, thus preserving the overall base
     composition, but not the local variations in composition.

     Any number of sequences may be processed from a single input file.
     In Pearson-format files, each new sequence begins with a '>' comment
     line, indicating the name and a short description of the sequence.

     No distinction is made between protein and nucleic acid sequences.
     That is, shuffle will read any of the following characters as
     sequence:

          T,U,C,A,G,N,R,Y,M,W,S,K,D,H,V,B,L,Z,F,P,E,I,Q,X,*,-

     where '*' is the result of translating a stop codon, and '-' is a
     gap generated during sequence alignment. Lowercase is also accepted.

EXAMPLE
     A sample output file is shown below. Note that the first two lines
     of output are comment lines, listing the version of the program and
     the parameters used in the run.

     >SHUFFLE VERSION 11/ 8/93
     >RANDOM SEED: 9873 WINDOW: 12 OVERLAP: 3
     >BAZFAZ - Borborigmus azerbi F-actin-zeta gene
     ctgagtagctagtcctaaatagttagtccatagtactagtacgggtcgtt
     cacccttgggcagtg.....(etc.)

AUTHOR
     Dr. Brian Fristensky
     Dept. of Plant Science
     University of Manitoba
     Winnipeg, MB Canada R3T 2N2
     Phone: 204-474-6085
     FAX: 204-261-5732
     frist@cc.umanitoba.ca

REFERENCE
     Fristensky, B. (1993) Feature expressions: creating and manipulating
     sequence datasets. Nucleic Acids Research 21:5997-6003.
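
NOTES
     The windowed shuffling controlled by -w and -o can be illustrated
     with a short Python sketch. This is not the source code of shuffle,
     and it is not guaranteed to match the program's exact procedure. The
     function name shuffle_windows, the choice to advance each window by
     w - o positions, the treatment of w = 0 as a global shuffle, and the
     requirement that o be strictly smaller than w are all assumptions
     made for illustration. Python's random number generator also differs
     from the one used by shuffle, so the same seed will not reproduce
     the program's output.

```python
import random


def shuffle_windows(seq, seed, window=None, overlap=0):
    """Shuffle seq within successive windows (illustrative sketch only).

    Assumptions, not taken from the shuffle program itself:
      - each window starts (window - overlap) positions after the last;
      - window = 0 is treated like an unspecified window;
      - overlap must be strictly smaller than window.
    """
    rng = random.Random(seed)
    chars = list(seq)
    n = len(chars)

    # No window, a negative window, or a window longer than the sequence:
    # the entire sequence is scrambled as a single window.
    if window is None or window <= 0 or window > n:
        rng.shuffle(chars)
        return "".join(chars)

    if not 0 <= overlap < window:
        raise ValueError("overlap must be >= 0 and smaller than the window")

    step = window - overlap
    start = 0
    while start < n:
        end = min(start + window, n)
        block = chars[start:end]
        rng.shuffle(block)          # permute only the residues in this window
        chars[start:end] = block
        if end == n:                # last window reached the end of the sequence
            break
        start += step
    return "".join(chars)


if __name__ == "__main__":
    seq = "ctgagtagctagtcctaaatagttagtccatagtactagtacgggtcgtt"
    print(shuffle_windows(seq, seed=9873, window=12, overlap=3))
```

     Because every step only permutes residues that are already present,
     the overall composition of the sequence is always preserved. With
     an overlap of 0, the composition of each window is preserved as
     well. With a non-zero overlap, residues can drift from one window
     into the next through the shared positions.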