3.
Metalog: curated and harmonised contextual data for global metagenomics samples.
Kuhn M, Schmidt TSB, Ferretti P, Glazek AM, Robbani SM, Akanni W, Fullam A, Schudoma C, Cetin E, Hassan M, Noack K, Schwarz A, Thielemann R, Thomas L, von Stetten M, Alves RJ, Iyappan A, Kartal E, Kel I, Keller MI, Maistrenko OM, Mankowski A, Nishijima S, Podlesny D, Schiller J, Schulz S, Van Rossum T, Bork P
2025 Oct 31; [Epub ahead of print] PubMed: 41171125
Abstract
Metagenomic sequencing enables the in-depth study of microbes and their functions in humans, animals, and the environment. While sequencing data is deposited in public databases, the associated contextual data is often not complete and needs to be retrieved from primary publications. This lack of access to sample-level metadata like clinical data or in situ observations impedes cross-study comparisons and meta-analyses. We therefore created the Metalog database, a repository of manually curated metadata for metagenomics samples across the globe. It contains 80 423 samples from humans (including 66 527 of the gut microbiome), 10 744 animal samples, 5547 ocean water samples, and 23 455 samples from other environmental habitats such as soil, sediment, or fresh water. Samples have been consistently annotated for a set of habitat-specific core features, such as demographics, disease status, and medication for humans; host species and captivity status for animals; and filter sizes and salinity for marine samples. Additionally, all original metadata is provided in tabular form, simplifying focused studies e.g. into nutrient concentrations. Pre-computed taxonomic profiles facilitate rapid data exploration, while links to the SPIRE database enable genome-based analyses. The database is freely available for browsing and download at https://metalog.embl.de/.
2.
Fecal microbial load is a major determinant of gut microbiome variation and a confounder for disease associations.
Nishijima S, Stankevic E, Aasmets O, Schmidt TSB, Nagata N, Keller MI, Ferretti P, Juel HB, Fullam A, Robbani SM, Schudoma C, Hansen JK, Holm LA, Israelsen M, Schierwagen R, Torp N, Telzerow A, Hercog R, Kandels S, Hazenbrink DHM, Arumugam M, Bendtsen F, Brøns C, Fonvig CE, Holm JC, Nielsen T, Pedersen JS, Thiele MS, Trebicka J, Org E, Krag A, Hansen T, Kuhn M, Bork P, GALAXY and MicrobLiver Consortia
2025 Jan 9; 188(1): 222-236.e15. PubMed: 39541968
Abstract
The microbiota in individual habitats differ in both relative composition and absolute abundance. While sequencing approaches determine the relative abundances of taxa and genes, they do not provide information on their absolute abundances. Here, we developed a machine-learning approach to predict fecal microbial loads (microbial cells per gram) solely from relative abundance data. Applying our prediction model to a large-scale metagenomic dataset (n = 34,539), we demonstrated that microbial load is the major determinant of gut microbiome variation and is associated with numerous host factors, including age, diet, and medication. We further found that for several diseases, changes in microbial load, rather than the disease condition itself, more strongly explained alterations in patients' gut microbiome. Adjusting for this effect substantially reduced the statistical significance of the majority of disease-associated species. Our analysis reveals that the fecal microbial load is a major confounder in microbiome studies, highlighting its importance for understanding microbiome variation in health and disease.