doOrthologMapping.pl

Script to map proteins to eggNOG orthologous groups (OGs) using BLASTP results against eggNOG.

Synopsis

        doOrthologMapping.pl [options]

Options

--genepred

Name of the gene prediction whose proteins should be mapped.

--external

External file containing BLAST output against eggNOG database.

--eggnog

Version of the eggNOG database to be used.

--flavor

BLAST flavor used to generate the alignments: NCBI or WU-BLAST.

--help

Prints this manual.

Description

doOrthologMapping.pl is a wrapper script that maps predicted proteins in a metagenome to eggNOG orthologous groups. If your BLASTP results were generated by WU-BLAST, it can be run as:

        doOrthologMapping.pl --flavor=WU-BLAST --genepred=MC20.MG10.AS2.GP1 --eggnog=eggnog2

It can also provide orthologous group mappings for proteins outside of SMASH, provided you have BLAST output from searching them against the eggNOG protein database. If your NCBI BLASTP output file is mysample.eggnog.blastp, you can run:

        doOrthologMapping.pl --flavor=NCBI --external=mysample.eggnog.blastp --eggnog=eggnog2

This will create a file called mysample.eggnog.blast.eggnogmapping.txt that contains the eggNOG orthologous group mapping information per protein sequence in your dataset.

Required files

doOrthologMapping.pl requires the following files (assuming eggNOG version 2; replace 2 with the correct version):

1. BLAST results

File containing results from BLASTing the predicted proteins against eggNOG proteins.

Note: Currently SMASH supports tabular BLAST outputs from WU-BLAST (run using "-mformat=2") and NCBI BLAST (run using "-m 8").
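Before handing a file to doOrthologMapping.pl, it can be worth checking that it really is tabular output. The following is a minimal sketch for the NCBI case only, assuming the standard 12 tab-separated columns that "-m 8" produces; the function name check_ncbi_tabular is hypothetical and is not part of SMASH. WU-BLAST "-mformat=2" output uses a different column layout and is not covered here.

```shell
# Hypothetical helper, not part of SMASH.
# Counts lines that do not have exactly 12 tab-separated fields, which is
# the layout of NCBI BLAST "-m 8" output. Blank lines count as bad lines.
check_ncbi_tabular() {
    file=$1
    bad=$(awk -F '\t' 'NF != 12 { n++ } END { print n + 0 }' "$file")
    if [ "$bad" -ne 0 ]; then
        echo "$file: $bad line(s) do not have 12 tab-separated fields" >&2
        return 1
    fi
    echo "$file: looks like 12-column tabular BLAST output"
}
```

For example, check_ncbi_tabular mysample.eggnog.blastp returns a non-zero status if the file was written in the default pairwise format instead of the tabular one.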

This file is expected to be in the gene prediction directory. Once you have the BLAST results, move the file to that directory. To see the location of this directory for a given gene prediction (e.g. MC10.MG23.AS1.GP2), run:

        perl showLocations.pl --item=MC10.MG23.AS1.GP2

You should place MC10.MG23.AS1.GP2.eggnog2.blastp in that directory.
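The placement step can be sketched as a small shell function. This is an illustration under stated assumptions: the naming pattern <genepred>.<eggnog>.blastp is inferred from the single example above, the function name stage_blast_results is hypothetical, and the destination directory must be supplied by hand after reading the output of showLocations.pl (that output is not parsed here). The sketch copies rather than moves, so the original file is kept.

```shell
# Hypothetical helper, not part of SMASH.
# Copies a BLAST result file into the gene prediction directory under the
# name <genepred>.<eggnog>.blastp (pattern inferred from the example
# MC10.MG23.AS1.GP2.eggnog2.blastp).
stage_blast_results() {
    genepred=$1      # e.g. MC10.MG23.AS1.GP2
    eggnog=$2        # e.g. eggnog2
    src=$3           # BLAST output file to stage
    dest_dir=$4      # directory reported by showLocations.pl
    target="${dest_dir}/${genepred}.${eggnog}.blastp"
    if [ ! -d "$dest_dir" ]; then
        echo "no such directory: $dest_dir" >&2
        return 1
    fi
    # Print the final path only if the copy succeeded.
    cp -- "$src" "$target" && echo "$target"
}
```

A call would take the form stage_blast_results MC10.MG23.AS1.GP2 eggnog2 my_blast_output.txt /path/to/genepred/dir, where both the input file name and the directory are placeholders.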

2. eggNOG orthologous group information and protein length files

These files are automatically downloaded from the SMASH website when you install SMASH. They should be installed in your repository under the data directory. To see where they are, run:

        perl showLocations.pl --external

You should find the following files there: eggnog2_final_orthgroups.txt and eggnog2_protein_lengths.txt.
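A quick way to confirm that both files are present is sketched below. It assumes the file names follow the pattern <eggnog>_final_orthgroups.txt and <eggnog>_protein_lengths.txt, which is inferred from the version 2 names above; the function name check_eggnog_files is hypothetical, and the data directory is passed in by hand rather than read from showLocations.pl.

```shell
# Hypothetical helper, not part of SMASH.
# Reports whether the two eggNOG support files exist and are non-empty
# in the given data directory. Returns 1 if either one is missing.
check_eggnog_files() {
    data_dir=$1
    eggnog=${2:-eggnog2}    # defaults to the version used in this manual
    missing=0
    for suffix in final_orthgroups.txt protein_lengths.txt; do
        f="${data_dir}/${eggnog}_${suffix}"
        if [ -s "$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_eggnog_files /path/to/data eggnog2 (with the real data directory substituted) lists each file and exits with a non-zero status if the installation step did not fetch them.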
