Name

fetchMicrobialSequences.pl - Script to search the NCBI Genome database with a given set of queries and download all matching records in GenBank format.

Synopsis

        fetchMicrobialSequences.pl [options]

Options

query

A single search term, or the name of a file containing multiple search terms (one per line).

Description

This script downloads the search results directly, without any further processing. If a result is only a WGS record that carries annotations but no sequences, the script does not process it. Such files should be processed by fetchWGSRecords.pl, which downloads the records corresponding to those WGS contig entries.
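To decide which downloaded files still need fetchWGSRecords.pl, it can help to list the ones that look like WGS records without sequence data. The following is only a minimal sketch under several assumptions that do not come from this page: the downloads are stored one record per file with a hypothetical .gbk extension, a WGS record contains a line starting with WGS, and a record with sequence data contains an ORIGIN line. The function name list_wgs_only is also made up for illustration.

```shell
#!/bin/sh
# list_wgs_only DIR
# Print the GenBank files in DIR that have a WGS line but no ORIGIN
# (sequence) section. Assumptions: files end in .gbk and hold one record each.
list_wgs_only() {
    dir=${1:-.}
    for f in "$dir"/*.gbk; do
        [ -e "$f" ] || continue    # skip when the glob matches nothing
        if grep -q '^WGS' "$f" && ! grep -q '^ORIGIN' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling list_wgs_only on the download directory prints one candidate file per line. How those files are then handed to fetchWGSRecords.pl depends on that script's own options, which are not described here.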

We typically use the following set of queries to build our local database:

        bacteria[organism] AND complete[PROP]
        archaea[organism] AND complete[PROP]
        bacteria[organism] AND WGS
        archaea[organism] AND WGS

To do the same, put these four lines in a file called queries.txt and run:

        fetchMicrobialSequences.pl --query=queries.txt
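The two steps above can be combined in a small shell sketch. It writes the four queries to queries.txt and then starts the download. The check that fetchMicrobialSequences.pl is on the PATH is an assumption about the local installation, added so that the sketch fails gracefully when the script is not installed.

```shell
#!/bin/sh
# Write the four queries listed above into queries.txt, one per line.
# The quoted 'EOF' keeps the shell from interpreting the square brackets.
cat > queries.txt <<'EOF'
bacteria[organism] AND complete[PROP]
archaea[organism] AND complete[PROP]
bacteria[organism] AND WGS
archaea[organism] AND WGS
EOF

# Start the download only if the script can be found on PATH
# (assumption: it is installed as an executable under this name).
if command -v fetchMicrobialSequences.pl >/dev/null 2>&1; then
    fetchMicrobialSequences.pl --query=queries.txt
else
    echo "fetchMicrobialSequences.pl not found on PATH; wrote queries.txt only" >&2
fi
```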

You could also run, for example:

        fetchMicrobialSequences.pl --query="Bacteroides fragilis"

to download all genome records that contain Bacteroides fragilis in any field.
