Usage:

momo <algorithm> [options] <PSM file>

Description

Input

Algorithm

The algorithm used to search for motifs. Available algorithms include "simple", "motifx", and "modl".

PSM file

A tab-delimited peptide-spectrum match (PSM) file in which each row corresponds to a mass spectrum annotated with its corresponding peptide sequence. Currently, only PSM files generated by the Tide search engine are supported.
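
As a rough illustration of the input layout described above, the following Python sketch reads a tab-delimited file with a header line into one dictionary per PSM. The column names ("scan", "sequence", "score") and the bracketed mass notation are assumptions made for this example only; they are not taken from the Tide output format.

```python
import csv
import io


def read_psms(handle):
    """Yield one dict per PSM row from a tab-delimited file with a header line."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        yield row


# Hypothetical two-row input; the column names are illustrative assumptions.
example = (
    "scan\tsequence\tscore\n"
    "1\tAAS[79.97]PLK\t2.5\n"
    "2\tGGT[79.97]EER\t1.1\n"
)
psms = list(read_psms(io.StringIO(example)))
```

All values are read as strings, so numeric columns need an explicit conversion before they are compared.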

Output

MoMo will create a directory, named momo_out by default. Any existing output files in the directory will be overwritten. The directory will contain:

Options

Option Parameter Description Default Behaviour
General Options
--bg-filetype fasta|prealigned: This option indicates the format of the protein database. Default: the protein database is specified as a FASTA file.
--count-threshold num: This option indicates the minimum number of sequences in the phosphorylation data set that must match the residue/position pair in each recursive iteration of motif-x. Default: the count threshold is 20 occurrences.
--eliminate-repeats num: This option removes duplicate copies of modifications with identical flanking sequences. The integer parameter specifies the width of the region used to determine identity. Because the window is symmetric around the central, modified amino acid, the width parameter must be odd. To turn this option off, specify a width of 0. Default: all modifications that are identical across a window 7 amino acids long (3 on each side of the modification) are removed.
--fg-filetype fasta|prealigned|psm: This option indicates the format of the input file containing the phosphorylated sequences. Default: the input is specified as a PSM file.
--filter [field],lt|le|eq|ge|gt,[threshold]: Only PSMs with scores better than the specified threshold are accepted for analysis. The "[field]" component of the parameter specifies the name of the column from which the score is drawn. The next component specifies whether PSMs with scores less than, less than or equal to, etc. are retained. The third component is the threshold itself. Default: no filter.
--hash-fasta num: If a protein database is provided, finding the location of each peptide within its protein can be sped up with an O(1) lookup table that maps each unique kmer to a list of its locations. The number specified is used as the kmer length. If the number specified is 0, the program uses linear search instead of creating a lookup table. Default: create an O(1) lookup table with kmer length 6.
--max-iterations num: The maximum number of iterations for MoDL before it stops. Default: MoDL stops after 50 iterations.
--max-motifs num: The maximum number of motifs MoDL is allowed to find. Default: MoDL finds at most 100 motifs.
--no-stop-decrease-iteration num: MoDL will stop if there is no decrease in MDL after the specified number of iterations. Default: MoDL stops after 10 iterations in which the MDL stays equal or does not decrease.
--min-occurences num: A motif is constructed only if the specified number of occurrences is reached. This threshold is applied after eliminating repeats. Default: a motif is printed only if its pattern occurs at least 5 times.
--protein-database protein database file: The protein database used to generate the PSM file. If provided, this file is used to find the amino acids flanking each modification and also to generate the background frequencies. Default: flanking sequences are derived from the given PSM file, substituting "don't care" symbols for missing entries. Amino acid background frequencies are derived from the non-redundant protein database.
--remove-unknowns T|F: This option indicates whether to remove all sequences that contain an 'X'. Default: do not remove unknown sequences.
--score-threshold num: This option indicates the largest binomial probability at which a residue/position pair is counted as significant during each recursive iteration of motif-x. Default: the binomial probability must be smaller than 0.000001 to be considered significant.
--single-motif-per-mass T|F: Provides the option of generating a single motif for each distinct modification mass. For example, phosphorylation is typically specified as a mass of 79.97 added to the amino acids S, T or Y. If this parameter is set to false, then three separate phosphorylation motifs are generated, each with a perfectly conserved central amino acid. If true, then all the phosphorylation events are combined into a single motif, with a mixture of S, T and Y in the central position. Default: false.
--version: Display the version and exit. Default: run as normal.
--width num: An integer specifying the width of the motif. Because the motif is symmetric around the central, modified amino acid, the width parameter should be odd. Default: motifs of width 7 are generated.
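
The three-part --filter parameter (field, comparison, threshold) can be illustrated with a short Python sketch. This is not MoMo's own parser; the function names and the "score" column are hypothetical, and rows are assumed to be dictionaries keyed by column name.

```python
import operator

# The five comparison keywords listed for --filter.
OPS = {
    "lt": operator.lt,
    "le": operator.le,
    "eq": operator.eq,
    "ge": operator.ge,
    "gt": operator.gt,
}


def parse_filter(spec):
    """Split 'field,op,threshold' into (field, comparison function, float)."""
    # Split from the right so a field name may itself contain commas.
    field, op_name, threshold = spec.rsplit(",", 2)
    if op_name not in OPS:
        raise ValueError(f"unknown comparison: {op_name}")
    return field, OPS[op_name], float(threshold)


def apply_filter(rows, spec):
    """Keep only rows whose value in the chosen column passes the comparison."""
    field, compare, threshold = parse_filter(spec)
    return [row for row in rows if compare(float(row[field]), threshold)]


# Hypothetical rows with an assumed "score" column.
rows = [{"score": "0.5"}, {"score": "2.0"}]
kept = apply_filter(rows, "score,ge,1.0")
```

Here only the second row is retained, because its score is greater than or equal to the threshold.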
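
To make the symmetric window behind --width and --eliminate-repeats concrete, here is a minimal Python sketch. It assumes 0-based positions and uses 'X' as the padding ("don't care") symbol where the window runs past the end of the protein; both choices, and the function names, are assumptions for illustration rather than a description of MoMo's internals.

```python
def flanking_window(protein, position, width=7, pad="X"):
    """Return the window of the given odd width centered on protein[position]."""
    if width % 2 == 0:
        raise ValueError("width must be odd")
    half = width // 2
    chars = []
    for i in range(position - half, position + half + 1):
        # Pad positions that fall outside the protein sequence.
        chars.append(protein[i] if 0 <= i < len(protein) else pad)
    return "".join(chars)


def eliminate_repeats(windows, width=7):
    """Keep the first window for each distinct central region; width 0 disables."""
    if width == 0:
        return list(windows)
    if width % 2 == 0:
        raise ValueError("width must be odd")
    half = width // 2
    seen = set()
    kept = []
    for window in windows:
        center = len(window) // 2
        key = window[max(0, center - half):center + half + 1]
        if key not in seen:
            seen.add(key)
            kept.append(window)
    return kept


middle = flanking_window("MKTAYSPLK", 5)   # modified S in the interior
edge = flanking_window("MKTAYSPLK", 1)     # modified K near the N-terminus
unique = eliminate_repeats(["TAYSPLK", "TAYSPLK", "GGYSPAA"])
```

With a width of 3, all three example windows share the central region "YSP", so only the first would be kept.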
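
The lookup table described for --hash-fasta can be sketched as a dictionary from each kmer to the list of places it occurs. The code below is an illustrative assumption about how such a table might work, not MoMo's implementation; the protein names and sequences are made up.

```python
from collections import defaultdict


def build_kmer_index(proteins, k=6):
    """Map every kmer of length k to a list of (protein name, start) locations."""
    index = defaultdict(list)
    for name, seq in proteins.items():
        for start in range(len(seq) - k + 1):
            index[seq[start:start + k]].append((name, start))
    return index


def find_peptide(peptide, proteins, index=None, k=6):
    """Return (protein name, start) for every occurrence of the peptide."""
    if index is None or k == 0 or len(peptide) < k:
        # Linear search fallback, as when the kmer length is set to 0.
        hits = []
        for name, seq in proteins.items():
            start = seq.find(peptide)
            while start != -1:
                hits.append((name, start))
                start = seq.find(peptide, start + 1)
        return hits
    # Look up the first kmer, then verify the full peptide at each candidate.
    candidates = index.get(peptide[:k], [])
    return [
        (name, start)
        for name, start in candidates
        if proteins[name].startswith(peptide, start)
    ]


# Hypothetical database.
proteins = {"P1": "MKTAYSPLKAAGTAYSPLR", "P2": "GGGTAYSPLKGG"}
index = build_kmer_index(proteins, k=6)
indexed_hits = find_peptide("TAYSPLK", proteins, index, k=6)
linear_hits = find_peptide("TAYSPLK", proteins, k=0)
```

The kmer "TAYSPL" occurs twice in P1, but only the first occurrence extends to the full peptide, which is why each candidate is verified after the lookup.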
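
The interaction between --count-threshold and --score-threshold can be illustrated with a small Python sketch. It assumes the binomial probability is the upper tail, that is, the chance of seeing at least k matching sequences out of n when the background frequency of the residue at that position is p. That interpretation and the function names are assumptions for this example.

```python
from math import comb


def binomial_tail(n, k, p):
    """Probability of at least k successes in n trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_significant(n, k, p, score_threshold=0.000001, count_threshold=20):
    """Apply both thresholds to one residue/position pair."""
    if k < count_threshold:
        return False
    return binomial_tail(n, k, p) < score_threshold


# Hypothetical counts: 60 of 200 sequences carry a residue whose
# background frequency is 5%.
enriched = is_significant(n=200, k=60, p=0.05)
# 12 of 200 is close to the expected count of 10 and below the count threshold.
not_enriched = is_significant(n=200, k=12, p=0.05)
```

A pair must pass both tests: it needs enough supporting sequences and a probability below the score threshold.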

Citing

If you use MoMo in your research, please cite the following paper: