Name

fetchWGSRecords.pl - Parses GenBank/EMBL files that contain WGS records and recursively resolves them to download the full records.

Synopsis

        fetchWGSRecords.pl [options]

Options

--input (required)

GenBank file to be parsed, or the directory containing the GenBank files to be parsed.

--directory

Treat the input as a directory rather than a single GenBank file. The script then looks in that directory for files with the extension .gbff (or the extension given by --extension).

--format

Format of the input file (supported: genbank, embl).

--extension

File name extension to look for in the input directory when --directory is given (default: gbff).

--help

Prints this manual.
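
As a rough illustration of how the options above combine, the following shell sketch shows two invocations. The file and directory names (genome.gbff, genomes/) and the gbk extension are hypothetical, and the "--option value" spelling is an assumption about how the script reads its arguments. The small run helper is not part of the tool; it only prints each command and executes it if fetchWGSRecords.pl is actually on the PATH.

```shell
# Hypothetical helper: echo the command, then run it only if it is installed.
run() {
    echo "+ $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    fi
}

# Single GenBank file (genome.gbff is a made-up name).
run fetchWGSRecords.pl --input genome.gbff --format genbank

# Whole directory: --directory marks --input as a directory,
# --extension overrides the default gbff (gbk is only an example).
run fetchWGSRecords.pl --input genomes/ --directory --extension gbk --format genbank
```

Whether --format can be left out is not stated in the manual, so both calls pass it explicitly.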

Description

fetchWGSRecords.pl is meant to be used together with fetchMicrobialSequences.pl, before loading the reference genomes with addRefGenomeSequences.pl. Some downloaded GenBank files contain no sequence and instead carry WGS annotations that point to WGS contig records. Running addRefGenomeSequences.pl on such files is useless, since there is no sequence in them. fetchWGSRecords.pl parses the WGS annotation and fetches the corresponding records in GenBank format. Its arguments are virtually the same as those of addRefGenomeSequences.pl, so you can run it on the same GenBank files you would later load using addRefGenomeSequences.pl and download the records corresponding to all of them.
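
To make the idea of a WGS annotation more concrete, here is a minimal shell sketch of what reading one could look like. It assumes the GenBank flat-file convention in which a master record carries a line such as "WGS         ABCD01000001-ABCD01000003" naming a range of contig accessions. The function name wgs_accessions and the sample record are made up for illustration, and this is not how fetchWGSRecords.pl itself is implemented. The sketch only lists the accessions and does not download anything.

```shell
# Hypothetical helper: expand the accession range(s) found on WGS lines.
# Reads the files given as arguments, or standard input if there are none.
wgs_accessions() {
    awk '
    $1 == "WGS" {
        n = split($2, ends, "-")
        first = ends[1]
        last = (n > 1) ? ends[2] : ends[1]
        match(first, /[0-9]+$/)               # locate the trailing digits
        prefix = substr(first, 1, RSTART - 1)
        fmt = "%s%0" RLENGTH "d\n"            # keep the zero padding
        stop = substr(last, RSTART) + 0
        for (i = substr(first, RSTART) + 0; i <= stop; i++)
            printf(fmt, prefix, i)
    }' "$@"
}

# Made-up, heavily shortened master record used as sample input.
wgs_accessions <<'EOF'
LOCUS       ABCD01000000
WGS         ABCD01000001-ABCD01000003
//
EOF
```

On a real file the call would look like "wgs_accessions genome.gbff". Each accession printed this way identifies one contig record of the kind the description says fetchWGSRecords.pl downloads.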
