
doKeggMapping.pl

Script to map proteins to KEGG orthologous groups (KOs) using BLASTP results against the KEGG protein database.

Synopsis

        doKeggMapping.pl [options]

Options

--genepred

Name of the gene prediction whose proteins should be mapped.

--external

External file containing BLAST output against KEGG database.

--kegg

Version of the KEGG database to be used.

--flavor

BLAST flavor used to generate the alignments: NCBI or WU-BLAST.

--help

Prints this manual.

Description

doKeggMapping.pl is a wrapper script that maps predicted proteins in a metagenome to KEGG orthologous groups. It is run as:

        doKeggMapping.pl --genepred=MC20.MG10.AS2.GP1 --kegg=kegg57

It can also map proteins from outside Smash to orthologous groups, provided you have BLAST output from a search against the KEGG protein database. If your BLAST output file is mysample.kegg.blastp, you can run:

        doKeggMapping.pl --external=mysample.kegg.blastp --kegg=kegg57

This will create a file called mysample.kegg.blastp.keggmapping.txt that contains the KEGG orthologous group mapping for each protein sequence in your dataset.

Required files

doKeggMapping.pl requires the following files:

1. BLAST results

File containing results from BLASTing the predicted proteins against KEGG proteins.

Note: Smash currently supports tabular BLAST output from WU-BLAST (run using "-mformat=2") and NCBI BLAST (run using "-m 8").

This file is expected to be in the gene prediction directory. Once you have the BLAST results, move the file to that directory. To see the location of this directory for a given gene prediction (e.g. MC10.MG23.AS1.GP2), run:

        perl showLocations.pl --item=MC10.MG23.AS1.GP2

You should place MC10.MG23.AS1.GP2.kegg.blastp in that directory.
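
Before moving a BLAST result file into place, it can be worth confirming that it really is tabular. The sketch below is not part of Smash: check_m8 is a hypothetical helper name, and it assumes NCBI BLAST "-m 8" output, which has 12 tab-separated fields per hit. WU-BLAST "-mformat=2" output uses a different column layout, so this check does not apply to it.

```shell
# Sketch only: check_m8 is a hypothetical helper, not shipped with Smash.
# It assumes NCBI BLAST "-m 8" output: 12 tab-separated fields per hit line.
check_m8() {
    awk -F '\t' '
        /^#/     { next }    # skip comment lines, if present
        NF == 0  { next }    # skip empty lines
        NF != 12 { bad++; if (bad == 1) first = NR }
        END {
            if (bad) {
                printf "%d malformed line(s), first at line %d\n", bad, first
                exit 1
            }
            print "ok"
        }
    ' "$1"
}
```

Running check_m8 MC10.MG23.AS1.GP2.kegg.blastp prints "ok" when every hit line has 12 fields. Otherwise it reports the number of malformed lines and the position of the first one, and returns a non-zero status.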
