
     shuffle.doc                                           update 3 Feb 94

     SYNOPSIS                                                  
           shuffle -sn [-wn -on]                           
                                                             
     DESCRIPTION                                               
          Shuffles sequences locally. See Lipman DJ, Wilbur WJ, Smith TF
          and Waterman MS (1984) On the statistical significance of nucleic
          acid similarities. Nucl. Acids Res. 12:215-226.

          -sn    n is the random number seed, an integer between 0 and
                 32767. A seed must be provided for each run.
                                                             
          -wn    n is an integer giving the width of the window used for
                 local shuffling. If w exceeds the length of a sequence,
                 is negative, or is not specified, the entire sequence is
                 shuffled as a single window.
                                                                       
          -on    n is an integer giving the number of nucleotides by which
                 adjacent windows overlap. It should never exceed the
                 window size. o defaults to 0 if not specified.
                                                             
          If w and o are specified, overlapping windows of w nucleotides
          are shuffled, thus preserving the characteristic local base
          composition. Windows overlap by o nucleotides.

          If w and o are not specified, each sequence is shuffled globally,
          thus preserving the overall base composition, but not the local
          variations in composition.
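
          The difference between windowed and global shuffling can be
          illustrated with a short Python sketch. This is not the source
          of shuffle itself, only one plausible reading of the description
          above. The function name shuffle_windows, the left-to-right
          stepping by (w - o), the treatment of w = 0 as a global shuffle,
          and the clamping of o to w - 1 are all assumptions made for
          illustration; Python's random generator will also not reproduce
          the program's own output for a given seed.

```python
import random


def shuffle_windows(seq, seed, window=None, overlap=0):
    """Shuffle seq within successive windows (illustrative sketch only)."""
    rng = random.Random(seed)          # seeded, so a run can be repeated
    chars = list(seq)
    n = len(chars)

    # No window, a non-positive window, or a window longer than the
    # sequence: shuffle the whole sequence as a single window.
    # (Treating window == 0 this way is an assumption.)
    if window is None or window <= 0 or window > n:
        rng.shuffle(chars)
        return "".join(chars)

    # Assumption: clamp the overlap so each step advances at least one
    # position; the documentation only says o should not exceed w.
    overlap = max(0, min(overlap, window - 1))
    step = window - overlap

    start = 0
    while start < n:
        end = min(start + window, n)
        block = chars[start:end]
        rng.shuffle(block)             # permute residues inside this window
        chars[start:end] = block
        if end == n:
            break
        start += step
    return "".join(chars)
```

          With non-overlapping windows, each window keeps exactly its own
          composition; with overlap, residues can drift between adjacent
          windows, but only a short distance per window. In every case the
          overall composition of the sequence is unchanged.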

          Any number of sequences may be processed from a single input 
          file.  In Pearson-format files, each new sequence begins with a
          '>' comment line, indicating the name and a short description of 
          the sequence.
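
          As an illustration of the Pearson-format layout described above,
          the following Python sketch splits a list of lines into
          (comment, sequence) pairs. The function name read_pearson is
          hypothetical, and the handling of lines that appear before the
          first '>' line (they are skipped) is an assumption; this is not
          how shuffle itself is implemented.

```python
def read_pearson(lines):
    """Return a list of (comment, sequence) pairs from Pearson-format lines."""
    records = []
    name, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A '>' line starts a new sequence; store the previous one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            # Sequence lines are concatenated until the next '>' line.
            chunks.append(line.strip())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records
```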

          No distinction is made between protein and nucleic acid sequences.
          That is, shuffle will read any of the following characters as
          sequence:

          T,U,C,A,G,N,R,Y,M,W,S,K,D,H,V,B,L,Z,F,P,E,I,Q,X,*,-

          where '*' is the result of translating a stop codon, and '-'
          is a gap generated during sequence alignment. Lowercase is
          also accepted.
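
          The accepted character set can be written down directly. The
          Python sketch below checks characters against the list above,
          ignoring case. What shuffle does with characters outside this
          list is not stated here, so the filtering shown is an assumption,
          and the names ALLOWED, is_sequence_char and sequence_chars are
          hypothetical.

```python
# Characters listed in the documentation as valid sequence symbols.
ALLOWED = set("TUCAGNRYMWSKDHVBLZFPEIQX*-")


def is_sequence_char(ch):
    """True if ch is one of the listed symbols, in either case."""
    return ch.upper() in ALLOWED


def sequence_chars(line):
    """Keep only the listed symbols (assumed behaviour: others are dropped)."""
    return "".join(c for c in line if is_sequence_char(c))
```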

     EXAMPLE
          A sample output file is shown below. Note that the first two 
          lines of output are comment lines, listing the version of the 
          program and the parameters used in the run. 

          >SHUFFLE                   VERSION 11/ 8/93
          >RANDOM SEED:     9873          WINDOW:   12          OVERLAP:   3
          >BAZFAZ - Borborigmus azerbi F-actin-zeta gene
          ctgagtagctagtcctaaatagttagtccatagtactagtacgggtcgtt
          cacccttgggcagtg.....(etc.)
                 
     AUTHOR
       Dr. Brian Fristensky
       Dept. of Plant Science
       University of Manitoba
       Winnipeg, MB  Canada  R3T 2N2
       Phone: 204-474-6085
       FAX: 204-261-5732
       frist@cc.umanitoba.ca

     REFERENCE
       Fristensky, B. (1993) Feature expressions: creating and manipulating
       sequence datasets. Nucleic Acids Research 21:5997-6003.