User Input
When you run Featurama without specifying a config file, it will ask you a
series of questions. The following should serve as a guideline for how to set each parameter:
- FASTA File Path:
This parameter is the
path to an input file of gene sequences in FASTA format.
Constraints:
You will need to provide Featurama with a valid path of not more than
255 characters to a file in FASTA format, with each line in the file being
not more than 50,0000 characters long.
Effect:
Featurama will use the file specified to pick features for all sequences in
that file.
- Melting Temperature:
This parameter specifies the melting temperature of the features
that Featurama should pick.
Constraints:
You will need to provide Featurama with a valid number greater than 0 and
no more than 100.
Effect:
Featurama will pick features around this melting temperature.
- Melting Temperature Range:
This parameter specifies the range around the previously specified
melting temperature in which to pick features.
Constraints:
You will need to provide Featurama with a valid number no less than 0 and
no more than 50.
Effect:
Featurama will pick features only within the specified melting temperature
range around the previously specified melting temperature.
- Minimum Feature Length:
This parameter specifies the minimum sequence length of the
features that Featurama will pick.
Constraints:
You will need to provide Featurama with a valid integer no less than 2 and
no more than 100.
Effect:
Featurama will pick features with a sequence length no less than the length
specified.
- Maximum Feature Length:
This parameter specifies the maximum sequence length of the
features that Featurama will pick.
Constraints:
You will need to provide Featurama with a valid integer no less than 2,
no more than 100, and not less than the minimum feature length specified
earlier.
Effect:
Featurama will pick features with a sequence length no greater than the length specified.
- Maximum Self-Complementarity Score:
This parameter specifies the maximum score for self-complementarity of a
feature.
Constraints:
You will need to provide Featurama with a valid integer no less than 0 and no more than the minimum feature length requested.
Effect:
Featurama will pick features with a self-complementarity score no greater than the one specified.
- Step Size:
This parameter specifies the step size to use when windowing
through the gene sequence while looking for features.
Constraints:
You will need to provide Featurama with a valid integer no less than 1.
Effect:
Featurama will window through the gene sequences using the step size specified.
To save time at the cost of finding fewer potential features, pick a larger
step. To find more features, pick a smaller step.
- Maximum Distance to 3' End:
This parameter describes a threshold distance from the 3' end of the gene,
past which Featurama will not pick features.
Constraints:
You will need to provide Featurama with a valid integer no less than 2 and
no more than 100,000.
Effect:
Featurama will not pick features past the distance specified. To pick features from the entire gene, set this number to a value larger than the longest gene.
- Initial 3' Offset:
This parameter describes how far from the 3' end to start each search.
Constraints:
You will need to provide Featurama with a valid integer no less than 0 and
no more than 100,000.
Effect:
Running a new search with a different offset changes which potential features are considered.
- Maximum Features to Pick:
This parameter describes the maximum number of features that Featurama is allowed to pick for each gene.
Constraints:
You will need to provide Featurama with a valid integer no less than 2 and
no more than 100,000.
Effect:
During the scanning phase Featurama will go on to the next gene when it has
picked this number of features for the current gene, given that it hasn't
passed the 3' end distance threshold. Additionally, some of the
features picked during the scanning phase may get deleted during the
duplicate elimination step. Thus, this number is only an upper bound.
- Maximum Poly A or Poly T in Each Feature Picked:
This parameter describes the maximum number of A's or T's in a row that will be present in features
that Featurama picks.
Constraints:
You will need to provide Featurama with a valid integer no less than the minimum allowed feature length
and no greater than the maximum requested feature length.
Effect:
Features containing runs of A's or T's greater than the number specified will not be considered.
- Maximum Poly C or Poly G in Each Feature Picked:
This parameter describes the maximum number of C's or G's in a row that will be present in features
that Featurama picks.
Constraints:
You will need to provide Featurama with a valid integer no less than the minimum allowed feature length
and no greater than the maximum requested feature length.
Effect:
Features containing runs of C's or G's greater than the number specified will not be considered.
- Window Size:
This parameter describes the size of the window used to scan each feature when checking the next two parameters.
Constraints:
You will need to provide Featurama with a valid integer no less than the minimum allowed feature length
and no greater than the minimum requested feature length.
Effect:
A window of this size will be used to check the next two parameters.
- Maximum A and T Content in Window:
This parameter describes the maximum number of A's and T's that will be
present in any window of size specified by the above parameter, in each
feature.
Constraints:
You will need to provide Featurama with a valid integer no less than 1 and no greater than the
requested window size.
Effect:
Features containing more A's and T's than the number specified, in any window of the size specified
above, will not be considered.
- Maximum G and C Content in Window:
This parameter describes the maximum number of G's and C's that will be present
in any window of size specified by the above parameter, in each feature.
Constraints:
You will need to provide Featurama with a valid integer no less than 1 and no greater than the
requested window size.
Effect:
Features containing more G's and C's than the number specified, in any window of the size specified
above, will not be considered.
- Oligo Concentration (mMol):
This parameter describes the oligo concentration that will be used to calculate the Tm of
each feature.
Constraints:
You will need to provide Featurama with a valid real greater than 0 and less than 100,000.
Effect:
The number specified will be used to calculate Tm using SantaLucia's method (TODO: cite).
- Salt Concentration (mMol):
This parameter describes the salt concentration that will be used to calculate the Tm of each feature.
Constraints:
You will need to provide Featurama with a valid real greater than 0 and less than 100,000.
Effect:
The number specified will be used to calculate Tm using SantaLucia's method (TODO: cite).
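
The step size, poly-base, and window content parameters described above can be
illustrated with a short Python sketch. This is not Featurama's own code: the
function names are hypothetical, and it assumes that "Poly A or Poly T" means a
run of one repeated base (for example AAAA or TTTT) and that the window content
limits are checked by sliding the window one base at a time along each feature.

```python
def candidate_windows(sequence, length, step):
    """Yield (start, subsequence) for windows of `length`, advancing by `step`."""
    for start in range(0, len(sequence) - length + 1, step):
        yield start, sequence[start:start + length]


def longest_run(feature, bases):
    """Length of the longest run of a single repeated base taken from `bases`.

    Assumption: a run is one base repeated (AAAA), not a mix (ATAT).
    """
    best = current = 0
    previous = None
    for base in feature:
        if base in bases and base == previous:
            current += 1
        elif base in bases:
            current = 1
        else:
            current = 0
        best = max(best, current)
        previous = base
    return best


def max_window_count(feature, bases, window_size):
    """Largest number of `bases` found in any window of `window_size`."""
    # If the feature is shorter than the window, treat it as a single window.
    positions = range(max(1, len(feature) - window_size + 1))
    return max(
        sum(base in bases for base in feature[i:i + window_size])
        for i in positions
    )


def passes_filters(feature, max_poly_at, max_poly_cg,
                   window_size, max_at, max_gc):
    """Return True if `feature` respects the run and window content limits."""
    return (
        longest_run(feature, "AT") <= max_poly_at
        and longest_run(feature, "CG") <= max_poly_cg
        and max_window_count(feature, "AT", window_size) <= max_at
        and max_window_count(feature, "GC", window_size) <= max_gc
    )


if __name__ == "__main__":
    gene = "ACGTACGTTTTTGCA"
    for start, feature in candidate_windows(gene, length=8, step=2):
        ok = passes_filters(feature, max_poly_at=3, max_poly_cg=3,
                            window_size=4, max_at=3, max_gc=3)
        print(start, feature, "kept" if ok else "rejected")
```

A larger step visits fewer start positions, which is why it trades the number
of candidate features for speed.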
|
Output
Currently the output of Featurama is rather simple. All features that are
found are written to a file named "results.fasta". All genes for which
features couldn't be found are written to a file named "featureless.genes".
Both files are created in the directory from which Featurama was run, and
any existing file with the same name in that directory will be
overwritten. The statistics output by Featurama are written to the console
at the end of the run.
The results file is a file in FASTA format. The description string of each
sequence contains as a substring the name of the gene for which the feature
was found, along with other information. The sequence contains the actual
sequence that Featurama found.
Please note that the sequences present in this file are not necessarily unique to the gene they came from. The user is expected to run
BLASTn on them to weed out closely related sequences.
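
As a rough illustration of working with the results file, the following Python
sketch reads FASTA records and collects the features whose description
contains a given gene name. The function names, the sample description
strings, and the gene names are hypothetical; the exact layout of Featurama's
description string is not specified here, so the sketch relies only on the
gene name appearing as a substring.

```python
def read_fasta(lines):
    """Yield (description, sequence) pairs from an iterable of FASTA lines."""
    description, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if description is not None:
                yield description, "".join(chunks)
            description, chunks = line[1:], []
        elif description is not None:
            chunks.append(line)
    if description is not None:
        yield description, "".join(chunks)


def features_for_gene(records, gene_name):
    """Return the sequences whose description contains `gene_name`."""
    return [seq for desc, seq in records if gene_name in desc]


if __name__ == "__main__":
    # Hypothetical sample data; a real run would instead use
    #     with open("results.fasta") as handle:
    #         records = list(read_fasta(handle))
    sample = [
        ">geneA example description",
        "ACGTACGT",
        "TTGA",
        ">geneB example description",
        "GGCCAATT",
    ]
    records = list(read_fasta(sample))
    print(features_for_gene(records, "geneA"))
```

Because the match is a plain substring test, a gene name that is a prefix of
another name would match both, so a stricter comparison may be needed once the
description layout is known.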
|