Featurama Online Help - Usage

This page describes the intended usage of the Featurama 0.7 software. It explains how to download and compile the latest version of Featurama, what input the program needs, and what output it provides.



Downloading

You can download the latest version of Featurama from SourceForge.



Compiling and Running

After you've downloaded the tarred and gzipped source code, you will need to decompress and compile it on your own Linux or Sun machine.

  1. First you will need to decompress the source code. To do this run the following command:

    tar -xzvf [filename]

  2. Now you will need to compile the code. For this you will need to run the following two commands while in the Featurama src directory:


    make depend
    make

  3. If the gods have smiled upon you today, you should be ready to run Featurama. To do so, run the following command while in the Featurama src directory:

    ./featurama [option]... [config file]

    Options:
      -? -h print this information
      -v output version information before gathering input
      -V output version information and exit
      -m output featurama run info to file [output directory]/featurama.ml

    Config File (optional):
      To make input easier featurama provides the ability to read in a configuration file containing featurama input parameters.


User Input

If you run Featurama without specifying a config file, it will ask you a series of questions. The following should serve as a guideline for setting each parameter:

  1. FASTA File Path:
    This parameter is the path to an input file of gene sequences in FASTA format.


    Constraints:
    You will need to provide Featurama with a valid path of not more than 255 characters to a file in FASTA format, with each line in the file being not more than 50,0000 characters long.


    Effect:
    Featurama will use the file specified to pick features for all sequences in that file.

  2. Melting Temperature:
    This parameter specifies the melting temperature of the features that Featurama should pick.


    Constraints:
    You will need to provide Featurama with a valid number greater than 0 and no more than 100.


    Effect:
    Featurama will pick features around this melting temperature.

  3. Melting Temperature Range:
    This parameter specifies the range around the previously specified melting temperature in which to pick features.


    Constraints:
    You will need to provide Featurama with a valid number no less than 0 and no more than 50.


    Effect:
    Featurama will pick features only within the specified melting temperature range around the previously specified melting temperature.

  4. Minimum Feature Length:
    This parameter specifies the minimum sequence length of the features that Featurama will pick.


    Constraints:
    You will need to provide Featurama with a valid integer no less than 2 and no more than 100.


    Effect:
    Featurama will pick features with a sequence length no less than the length specified.

  5. Maximum Feature Length:
    This parameter specifies the maximum sequence length of the features that Featurama will pick.


    Constraints:
    You will need to provide Featurama with a valid integer no less than 2, no more than 100, and not less than the minimum feature length specified earlier.


    Effect:
    Featurama will pick features with a sequence length no greater than the length specified.

  6. Maximum Self-Complementarity Score:
    This parameter specifies the maximum score for self-complementarity of a feature.


    Constraints:
    You will need to provide Featurama with a valid integer no less than 0 and no more than the minimum feature length requested.


    Effect:
    Featurama will pick features with a self-complementarity score no greater than the one specified.

  7. Step Size:
    This parameter specifies the step size to use when windowing through the gene sequence while looking for features.


    Constraints:
    You will need to provide Featurama with a valid integer no less than 1.


    Effect:
    Featurama will window through the gene sequences using the step size specified. A larger step saves time but decreases the number of potential features found. A smaller step finds more features.

  8. Maximum Distance to 3' End:
    This parameter describes a threshold distance from the 3' end of the gene, past which Featurama will not pick features.


    Constraints:
    You will need to provide Featurama with a valid integer no less than 2 and no more than 100,000.


    Effect:
    Featurama will not pick features past the distance specified. To pick features from the entire gene, set this number to a large value.

  9. Initial 3' Offset:
    This parameter describes how far from the 3' end to start each search.


    Constraints:
    You will need to provide Featurama with a valid integer no less than 0 and no more than 100,000.


    Effect:
    Running a new search with a different offset changes which potential features are considered.

  10. Maximum Features to Pick:
    This parameter describes the maximum number of features that Featurama is allowed to pick for each gene.


    Constraints:
    You will need to provide Featurama with a valid integer no less than 2 and no more than 100,000.


    Effect:
    During the scanning phase Featurama will go on to the next gene when it has picked this number of features for the current gene, given that it hasn't passed the 3' end distance threshold. Additionally, some of the features picked during the scanning phase may get deleted during the duplicate elimination step. Thus, this number is only an upper bound.

  11. Maximum Poly A or Poly T in Each Feature Picked:
    This parameter describes the maximum number of A's or T's in a row that will be present in features that Featurama picks.


    Constraints:
    You will need to provide Featurama with a valid integer no less than the minimum allowed feature length and no greater than the maximum requested feature length.


    Effect:
    Features containing runs of A's or T's greater than the number specified will not be considered.

  12. Maximum Poly C or Poly G in Each Feature Picked:
    This parameter describes the maximum number of C's or G's in a row that will be present in features that Featurama picks.


    Constraints:
    You will need to provide Featurama with a valid integer no less than the minimum allowed feature length and no greater than the maximum requested feature length.


    Effect:
    Features containing runs of C's or G's greater than the number specified will not be considered.

  13. Window Size:
    This parameter describes the window size used to scan the gene when checking the next two parameters.


    Constraints:
    You will need to provide Featurama with a valid integer no less than the minimum allowed feature length and no greater than the minimum requested feature length.


    Effect:
    A window of this size will be used to check the next two parameters.

  14. Maximum A and T Content in Window:
    This parameter describes the maximum number of A's and T's that may be present, in each feature, in any window of the size specified by the Window Size parameter.


    Constraints:
    You will need to provide Featurama with a valid integer no less than 1 and no greater than the requested window size.


    Effect:
    Features containing more A's and T's than the number specified, in any window of the size specified above, will not be considered.

  15. Maximum G and C Content in Window:
    This parameter describes the maximum number of G's and C's that may be present, in each feature, in any window of the size specified by the Window Size parameter.


    Constraints:
    You will need to provide Featurama with a valid integer no less than 1 and no greater than the requested window size.


    Effect:
    Features containing more G's and C's than the number specified, in any window of the size specified above, will not be considered.

  16. Oligo Concentration (mMol):
    This parameter describes the oligo concentration that will be used to calculate the Tm of each feature.


    Constraints:
    You will need to provide Featurama with a valid real greater than 0 and less than 100,000.


    Effect:
    The number specified will be used to calculate Tm using SantaLucia's method (TODO: cite).

  17. Salt Concentration (mMol):
    This parameter describes the salt concentration that will be used to calculate the Tm of each feature.


    Constraints:
    You will need to provide Featurama with a valid real greater than 0 and less than 100,000.


    Effect:
    The number specified will be used to calculate Tm using SantaLucia's method (TODO: cite).
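
Several of the constraints above depend on one another (for example, the window size is bounded by the minimum feature length), so it can help to check a full parameter set before starting a run. The following is a minimal Python sketch and is not part of Featurama. The class and field names are hypothetical, and it assumes that "minimum allowed feature length" means the lower bound of 2 given for parameter 4.

```python
from dataclasses import dataclass
from typing import List

MIN_ALLOWED_LEN = 2  # assumption: the lower bound stated for parameter 4


@dataclass
class FeaturamaParams:
    """Hypothetical container for the 17 inputs; names are illustrative only."""
    fasta_path: str
    melting_temp: float
    melting_temp_range: float
    min_feature_len: int
    max_feature_len: int
    max_self_comp: int
    step_size: int
    max_dist_3prime: int
    initial_3prime_offset: int
    max_features: int
    max_poly_at: int
    max_poly_cg: int
    window_size: int
    max_at_in_window: int
    max_gc_in_window: int
    oligo_conc: float
    salt_conc: float


def check_params(p: FeaturamaParams) -> List[str]:
    """Return one message per violated constraint (empty list if all pass)."""
    rules = [
        (len(p.fasta_path) <= 255, "1: path longer than 255 characters"),
        (0 < p.melting_temp <= 100, "2: melting temperature not in (0, 100]"),
        (0 <= p.melting_temp_range <= 50, "3: range not in [0, 50]"),
        (2 <= p.min_feature_len <= 100, "4: minimum length not in [2, 100]"),
        (2 <= p.max_feature_len <= 100
         and p.max_feature_len >= p.min_feature_len,
         "5: maximum length not in [2, 100] or below the minimum length"),
        (0 <= p.max_self_comp <= p.min_feature_len,
         "6: self-complementarity not in [0, minimum length]"),
        (p.step_size >= 1, "7: step size below 1"),
        (2 <= p.max_dist_3prime <= 100_000, "8: distance not in [2, 100000]"),
        (0 <= p.initial_3prime_offset <= 100_000,
         "9: offset not in [0, 100000]"),
        (2 <= p.max_features <= 100_000, "10: count not in [2, 100000]"),
        (MIN_ALLOWED_LEN <= p.max_poly_at <= p.max_feature_len,
         "11: poly A/T limit out of bounds"),
        (MIN_ALLOWED_LEN <= p.max_poly_cg <= p.max_feature_len,
         "12: poly C/G limit out of bounds"),
        (MIN_ALLOWED_LEN <= p.window_size <= p.min_feature_len,
         "13: window size out of bounds"),
        (1 <= p.max_at_in_window <= p.window_size,
         "14: A/T content not in [1, window size]"),
        (1 <= p.max_gc_in_window <= p.window_size,
         "15: G/C content not in [1, window size]"),
        (0 < p.oligo_conc < 100_000, "16: oligo concentration out of bounds"),
        (0 < p.salt_conc < 100_000, "17: salt concentration out of bounds"),
    ]
    return [message for ok, message in rules if not ok]
```

Each message starts with the number of the parameter it refers to, so a failed check can be traced back to the matching entry in the list above.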


Output

Currently the output of Featurama is rather simple. All features that are found are written to a file named "results.fasta". All genes for which no features could be found are written to a file named "featureless.genes". Both files are created in the directory from which Featurama was run, and any existing file with the same name in that directory will be overwritten. The statistics output by Featurama are written to the console at the end of the run.
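
Because existing output files are overwritten, you may want to move the results of an earlier run out of the way first. The following Python sketch shows one way to do this. It is not part of Featurama: the function name and the backup naming scheme are assumptions, and only the two output file names come from this page.

```python
import time
from pathlib import Path

# Output file names as given on this help page.
OUTPUT_NAMES = ("results.fasta", "featureless.genes")


def backup_outputs(run_dir, suffix=None):
    """Rename existing output files in run_dir so a new run cannot overwrite them.

    Returns the list of backup paths that were created.
    """
    # Default to a timestamp so repeated backups get distinct names.
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    backups = []
    for name in OUTPUT_NAMES:
        src = Path(run_dir) / name
        if src.exists():
            dst = src.with_name(f"{name}.{suffix}.bak")
            src.rename(dst)
            backups.append(dst)
    return backups
```

Calling backup_outputs(".") from the directory in which you run Featurama, before each run, keeps the earlier results under a timestamped name.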

The results file is a file in FASTA format. The description string of each sequence contains as a substring the name of the gene for which the feature was found, along with other information. The sequence contains the actual sequence that Featurama found.
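
Since the gene name appears as a substring of each description string, the results file can be grouped by gene with a simple FASTA reader. The following Python sketch illustrates this. The function names are hypothetical, and the exact layout of the description string is not documented here, so the code relies only on the substring match.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (description, sequence) pairs from an iterable of FASTA lines."""
    description = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if description is not None:
                yield description, "".join(chunks)
            description = line[1:]
            chunks = []
        elif description is not None:
            chunks.append(line)  # sequences may span several lines
    if description is not None:
        yield description, "".join(chunks)


def features_for_gene(records, gene_name: str) -> List[Tuple[str, str]]:
    """Keep the records whose description contains gene_name as a substring."""
    return [(desc, seq) for desc, seq in records if gene_name in desc]
```

To use it on real output, open the results file and pass the file object in, for example features_for_gene(read_fasta(open("results.fasta")), name) with name set to a gene name from your input file. Note that a substring match also returns records for any gene whose name contains the one you searched for.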

Please note that the sequences present in this file are not necessarily unique to the gene they came from. It is intended that the user runs BLASTn to weed out the closely related sequences.

