21.
eggNOG v7: phylogeny-based orthology predictions and functional annotations.
The eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups) database is a phylogenomic resource for orthology inference, evolutionary analysis, and functional annotation across eukaryotes, bacteria, and archaea. Previous versions relied on best reciprocal hit triangulation and clustering approaches, which, although effective, faced challenges with the computational demands of large datasets, inconsistent hierarchical orthologous group (OG) reconstruction, and inaccurate classification of multidomain proteins. Here, we present eggNOG v7, the first release implementing a fully phylogenetic, domain-centric workflow. In this pipeline, sequences are first pre-clustered by Pfam domains or de novo clustering, followed by large-scale multiple sequence alignment and phylogenetic tree inference. Speciation and duplication events are then detected using a noise-tolerant algorithm to generate hierarchically consistent, evolutionarily dated OGs. Applied to 59.3 million proteins from 12 535 species, eggNOG v7 produced 3.18 million OGs, reducing singletons, fragmentation, and oversized groups compared to prior versions. Benchmarking against manually curated KEGG functional OGs demonstrated higher functional consistency. Additionally, eggNOG v7 provides updated protein functional annotations and a fully redesigned web interface with protein-centric searches, interactive phylogenies, and functional profiling tools. eggNOG v7 is available at https://eggnogdb.org.
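The step in which speciation and duplication events are read off a gene tree can be illustrated with a generic species-overlap heuristic. The sketch below is not the eggNOG v7 algorithm, which the abstract does not specify; it assumes binary trees and leaf labels of the form `species|gene`, and the names `Node`, `label_events`, `split_into_groups`, and the `max_overlap` tolerance parameter are hypothetical. A node is called a duplication when the species sets of its two children overlap by more than the tolerance, and a speciation otherwise.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Minimal gene-tree node (hypothetical type, for illustration only)."""
    name: str = ""                      # leaf label, assumed "species|gene"
    children: list = field(default_factory=list)
    event: str = ""                     # "S" (speciation) or "D" (duplication)

    def is_leaf(self):
        return not self.children


def leaves(node):
    """Return leaf labels under a node, in left-to-right order."""
    if node.is_leaf():
        return [node.name]
    out = []
    for child in node.children:
        out.extend(leaves(child))
    return out


def species(node):
    """Set of species codes found under a node."""
    return {label.split("|")[0] for label in leaves(node)}


def label_events(node, max_overlap=0.0):
    """Label internal nodes by species overlap (binary trees assumed).

    max_overlap is an assumed tolerance: an overlap fraction at or below
    it is treated as noise and the node is still called a speciation.
    """
    if node.is_leaf():
        return
    for child in node.children:
        label_events(child, max_overlap)
    left, right = (species(child) for child in node.children)
    overlap = len(left & right) / min(len(left), len(right))
    node.event = "D" if overlap > max_overlap else "S"


def split_into_groups(node):
    """Cut the tree at duplication nodes; each remaining subtree is one group."""
    if node.is_leaf() or node.event == "S":
        return [leaves(node)]
    groups = []
    for child in node.children:
        groups.extend(split_into_groups(child))
    return groups


# Toy tree: a duplication in the human/mouse ancestor, after the split from yeast.
dup = Node(children=[
    Node(children=[Node("human|A1"), Node("mouse|A1")]),
    Node(children=[Node("human|A2"), Node("mouse|A2")]),
])
root = Node(children=[Node("yeast|A"), dup])
label_events(root)

broad_groups = split_into_groups(root)   # one group at the deepest level
narrow_groups = split_into_groups(dup)   # two groups below the duplication
```

In this toy tree the root is a speciation, so all five sequences form one group at the deepest level, while the duplication beneath it yields two narrower groups. This mirrors, in a very simplified way, how groups defined at different depths of the same tree can remain hierarchically consistent.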