gde_linux/CORE/xylem/reform.doc

 reform                                               update  3 Feb 94 
 
 NAME
   reform - reformats multiply-aligned sequences for printing.        
 
 SYNOPSIS                                               
   reform [-gpcnm] [-fx] [-sn] [-ln]  [file {ralign only}]                             
                            or                            
   ralign file parameters | reform [-gpcn] [-sn] [-ln] file
                                                          
 DESCRIPTION                                            
                                                          
       g    Gaps are to be represented by dashes (-).     
       p    Bases which agree with the consensus are      
            represented by periods (.).                   
       c    Positions at which all sequences agree are    
            capitalized in the consensus.                 
       n    Sequence data is nucleic acid. The default is protein.
       fx   Specify input file format, where x is
            r:RALIGN (default) p:PEARSON i:MBCRR-MASE (Intelligenetics)
       m    Input file contains multiline-format sequences that are already
            aligned, as opposed to ralign output. This option is obsolete,
            and is equivalent to -fp.
       ln   The output line length is set to n.
            Default is 70.
       sn   Numbering starts with n. Default is 0.
                                                          
     file   Sequence file as described in the ralign documentation.
            reform needs to re-read the sequence file read by ralign
            to get the names of the sequences, which ralign ignores.
            This filename is only included for ralign output.
            If -m is set, file is ignored, and sequence names
            must be read from the input.

     Note that positions in the consensus at which no single residue is in the
     majority are represented by n's (for nucleic acids) or x's (for proteins),
     rather than by periods as they are in ralign.
     
     Gaps in the input sequences may be represented by either blanks or dashes.
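
     The consensus rules above can be illustrated with a short Python sketch.
     This is not reform's own code. The function names are hypothetical, and
     "majority" is assumed here to mean strictly more than half of the non-gap
     residues in a column, which the text above does not specify.

```python
from collections import Counter

GAP_CHARS = {"-", " "}  # gaps may be written as dashes or blanks


def consensus(rows, nucleic=False, capitalize=False):
    """Build a consensus string from equal-length aligned rows."""
    unknown = "n" if nucleic else "x"
    out = []
    for column in zip(*rows):
        residues = [c.lower() for c in column if c not in GAP_CHARS]
        if not residues:
            # Assumption: an all-gap column has no majority residue.
            out.append(unknown)
            continue
        symbol, count = Counter(residues).most_common(1)[0]
        if count * 2 <= len(residues):
            # Assumption: a majority is strictly more than half.
            out.append(unknown)
        elif capitalize and count == len(column):
            # Like -c: every sequence agrees at this position.
            out.append(symbol.upper())
        else:
            out.append(symbol)
    return "".join(out)


def mask_agreement(row, cons):
    """Like -p: show residues that match the consensus as periods."""
    return "".join(
        "." if c not in GAP_CHARS and c.lower() == k.lower() else c
        for c, k in zip(row, cons)
    )
```

     Gap characters are never masked in this sketch, so a gap in a row stays
     visible even when the rest of the row agrees with the consensus.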
    
  INPUT FILE FORMATS

     (a) ralign (default, -fr)
     As described in the ralign documentation, the input file (which is
     assumed to be ralign output) must have each sequence on a single long
     line.  All characters on a given line will be included in the alignment.
     All lines must be exactly the same length. For example, if ralign had
     read sequences from a file called 'allcab.seq' and written output to
     'allcab.ralign', the following command might be used:

     reform allcab.seq <allcab.ralign >allcab.ref
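
     The constraints on this format (one sequence per line, every character
     counted, all lines the same length) can be checked with a sketch like the
     one below. It is an illustration only: the function name is hypothetical,
     and the names are passed in as a plain list because the layout of the
     ralign sequence file is not described here.

```python
def read_single_line_alignment(lines, names):
    """Pair each single-line sequence with a name and check line lengths."""
    # Strip only the newline: blanks are gaps, so they must be preserved.
    rows = [line.rstrip("\n") for line in lines]
    if len(rows) != len(names):
        raise ValueError(
            f"{len(rows)} alignment lines but {len(names)} sequence names"
        )
    lengths = {len(row) for row in rows}
    if len(lengths) > 1:
        raise ValueError(f"lines differ in length: {sorted(lengths)}")
    return list(zip(names, rows))
```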

     (b) Pearson (-fp, -m)
     Compatible with sequence files used by Pearson's fasta programs as shown:
     >name1
     sequence1
     >name2
     sequence2
     ...
     >namen
     sequencen

     Sequences may run over many lines and line length does not have to be
     uniform. Note, however, that both dashes ('-') and blanks (' ') will be
     read in as gaps in the alignment. A greater-than sign ('>') at the
     beginning of a line marks the name line that starts a new sequence.

     Any line beginning with a semicolon (';') will be considered a comment, 
     and will be ignored.  
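
     A minimal Python sketch of a reader for this format follows. It is not
     taken from reform itself, and the function name is hypothetical. Blanks
     are converted to dashes so that both gap characters end up looking the
     same, which is a choice made for this example.

```python
def parse_pearson(lines):
    """Return a list of (name, aligned_sequence) pairs."""
    records = []
    name, chunks = None, []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(";"):
            continue  # comment line, ignored
        if line.startswith(">"):
            # A name line closes the previous sequence, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            # Sequence data may span many lines of any length.
            chunks.append(line.replace(" ", "-"))
    if name is not None:
        records.append((name, "".join(chunks)))
    return records
```

     Because blanks count as gaps, trailing blanks on a sequence line become
     gap positions in this sketch rather than being trimmed.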

     (c) MBCRR-MASE (Intelligenetics) (-fi)
     Compatible with .mase files produced by MBCRR's mase and pima programs, 
     which use the Intelligenetics format as shown:

     ;one or more comment lines
     name1
     sequence1
     ;one or more comment lines
     name2
     sequence2
     ...
     ;one or more comment lines
     namen
     sequencen

     Sequences may run over many lines and line length does not have to be
     uniform. Note, however, that both dashes ('-') and blanks (' ') will be
     read in as gaps in the alignment. Each sequence MUST begin with at least
     one comment line, because a comment line signals the beginning of a new
     sequence. The first line after the comment lines is read as the name,
     and the sequence begins on the line after that.
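
     The reading rules for this format (comment lines, then a name line, then
     sequence lines) can be sketched in Python as shown below. This is an
     illustration rather than reform's implementation, and the function name
     is hypothetical. As an assumption of this sketch, lines that appear
     before the first comment line are skipped.

```python
def parse_mase(lines):
    """Return a list of (name, aligned_sequence) pairs."""
    records = []
    name, chunks = None, []
    expecting_name = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(";"):
            # A comment line starts a new sequence; close the previous one.
            if name is not None:
                records.append((name, "".join(chunks)))
                name, chunks = None, []
            expecting_name = True
        elif expecting_name:
            # The first line after the comment lines is the name.
            name = line.strip()
            expecting_name = False
        elif name is not None:
            # Everything up to the next comment line is sequence data.
            chunks.append(line.replace(" ", "-"))
    if name is not None:
        records.append((name, "".join(chunks)))
    return records
```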

  SEE ALSO  ralign, mase

  AUTHOR
       Dr. Brian Fristensky
       Dept. of Plant Science
       University of Manitoba
       Winnipeg, MB  Canada  R3T 2N2
       Phone: 204-474-6085
       FAX: 204-261-5732
       frist@cc.umanitoba.ca

  REFERENCE
       Fristensky, B. (1993) Feature expressions: creating and manipulating
       sequence datasets. Nucleic Acids Research 21:5997-6003.