Curated Tara Oceans Single-Cell and Metagenome Assembled Genomes (the "SMAGs")
Curated database of SMAGs (contig-level FASTA files) (concatenated 6.9 Go, md5sum : 3e264c11dd3aa5d7dd2215ce61571cab / individual 6.9 Go, md5sum : f56892398915a8ea455e7cdab50d0b98)
10 million protein-coding genes (V.1) - beta
Nucleotides (concatenated 3.0 Go, md5sum : 92d68b5a6d20d2e6a60301881d2159b7 / individual 3.0 Go, md5sum : 243b0fa51b961461547a9a8508b4f9f0)
Peptides (concatenated 2 Go, md5sum : 0f0a635c00b98dfc4ec7e849a2670712 / individual 2.0 Go, md5sum : dd0d028f368f4454aaa1958be15be338)
GFF (concatenated 546 Mo, md5sum : e1511dc15a843002e7dfd6fc5457386a / individual 547 Mo, md5sum : 3b7d474185bb0cfef83cde094ceddc8e)
EggNog functional annotations (concatenated tab file 807 Mo, md5sum : 7dbd24c9b42a21076cbf2151698c80fc / individual tab file 807 Mo, md5sum : 7562a0bcd8288b56508b7ba4b2ee4a75)
Metagenomic Co-assemblies
Contigs >1,000nt for the 11 co-assemblies (fasta 42 Go) md5sum : 12717625b60a752d7c25ad9fb7ed1076
Contigs >1,500nt for a targeted co-assembly of the Southern Ocean (One Giga-scale MAG) (fasta 507 Mo) md5sum : f108c1b96193563b4d39014c869eb5d6
Contigs >2,500nt for the 11 co-assemblies (used for binning) (fasta 16 Go) md5sum : 894b7bd6700079afccf9aa038eb6c9a9
Curated DNA-dependent RNA polymerase
RNA polymerase genes in SMAGs (fasta 948 Ko) md5sum : 8a08d6a61e75bc7fb5f56dad7f17d9a8
RNA polymerase genes in METdb (reference transcriptomes) (fasta 844 Ko) md5sum : 96d8aba7b261b56e0bc6f1191206eda1
Concatenated RNA polymerase sequences (SMAGs + METdb) (fasta 7.1 Mo) md5sum : 65ef5b41f3b86fffade28d9753abb9f4
Other data
Raw peptide sequences of BUSCO genes (fasta 51 Mo) md5sum : 867c4e2f5ae8afbfeaddcbb367b02fba
Supplementary Material (pdf 4.9 Mo)
Supplementary Table (table 68 Mo)
Trees and associated metadata for Figures 1, 2 and S1 (anvi'o config files 688 Ko)
World-map projections (external link)
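The md5sum values listed on this page can be used to check that a download completed without corruption. Below is a minimal Python sketch of such a check, not an official tool of this site: the function names `md5_of_file` and `verify_download` are illustrative assumptions. The file is read in chunks so that multi-gigabyte archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 digest matches the published value."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For instance, `verify_download("smags_rna_polymerase.fasta", "8a08d6a61e75bc7fb5f56dad7f17d9a8")` would check the SMAG RNA polymerase file against its listed checksum; the local file name here is a placeholder, since the actual download names are not given on this page.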
MetaGenomic Transcriptomes (MGT) v1
Transcriptome reconstruction and functional analysis of eukaryotic marine plankton communities via high-throughput metagenomics and metatranscriptomics
Alexey Vorobev et al. - Genome Research vol. 30 pp. 647-659 (2020)
Supp. Figure A1 : Taxonomic distribution of unigenes per MGT treemaps
Supp. Figure A2 : Geographical distribution of MGTs among Tara Oceans samples (Biogeography maps)
MGT nucleic sequences (fasta file archive) [1.3 Go, md5sum : a4ce146401ef06484bec3c4e4355c076]
MGT abundance profile (tsv file)
CAP3-based MGT post assemblies [1.2 Go, md5sum : 6f7cd1554a907b882097b101915ba2a8]
Canopy clustering source code (version d55ed1a8b825 from the original bitbucket repository)
MetaGenomic Transcriptomes (MGT) v1.5
Changes in gene expression in eukaryotic phytoplankton at the Atlantic-Arctic polar front
Paul Frémont et al. - bioRxiv (2025)
Tara Oceans Eukaryote Gene Catalog (the "MATOU") - version 1
A global ocean atlas of eukaryotic genes
Quentin Carradec, Eric Pelletier, et al. - Nature Communications vol.9, 373 (2018)
The Marine Atlas of Tara Oceans Unigenes (MATOU) Version 1 (20171115)
Unigene sequences (116 M) (fasta 24 Go) md5sum : 7784bc7d257455c4dd779de70a52cd2e
Taxonomic affiliations (tab separated file) md5sum : 3d463085244df64367ad74a928814a0b
Protein domain identification (tab separated file) md5sum : f22f11f2a71c824a65dd3dac8bf2d576
Metagenomic occurrences (tab separated file 14 Go) md5sum : 64650bf61d283178bd6772f86e503fc4
Metatranscriptomic occurrences (tab separated file 32 Go) md5sum : 7d2f702d1f02c7114da3ca7ad9df7574
Unigene cluster composition (tab separated file 352 Mo) md5sum : f9c1a32a74c7d20f3516129d06cbb394
Unigene cluster properties (tab separated file 24 Mo) md5sum : 555c1a4ddc0ac10f339eec47b89c4570
Reference protein database used for taxonomic affiliation (fasta file 8.1 Go) md5sum : 3adc1f95e9ef129621b1ddfd217386e7
Tara Oceans Eukaryote Gene Catalog (the "MATOU") - version 1.5
Patterns and drivers of diatom diversity and abundance in the global ocean
Pierella Karlusich, J.J., Cosnier, K., Zinger, L. et al. Patterns and drivers of diatom diversity and abundance in the global ocean. Nat Commun 16, 3452 (2025).
The Marine Atlas of Tara Oceans Unigenes (MATOU) Version 1.5 (20171115)
Unigene sequences (154 M) (fasta 24 Go) md5sum : 045fd2cda0e99e3b6ee52b78ea572da2
Taxonomic affiliations (tab separated file) md5sum : 222f0e6fb9715c534e5676a432b0b444
Protein domain identification (tab separated file) md5sum : 596cb13f6d196b5742babeb689a166ce
Metagenomic occurrences (tab separated file 9 Go) md5sum : 960dd8d864be3d661c22ec360715fc2f
Metatranscriptomic occurrences (tab separated file 24 Go) md5sum : 5a592d4a76eadb114a83deae7de6f71e
Single-cell genomes
Single-cell genomics of multiple uncultured stramenopiles reveals underestimated functional diversity across oceans
Yoann Seeleuthner, Samuel Mondy, et al. - Nature Communications vol. 9, 310 (2018)
MAST-3 clade A (Stramenopiles sp. TOSAG41-2)
MAST-3 clade F (Stramenopiles sp. TOSAG23-6)
MAST-4 clade A 1 (Stramenopiles sp. TOSAG23-1)
MAST-4 clade A 2 (Stramenopiles sp. TOSAG23-2)
MAST-4 clade C (Stramenopiles sp. TOSAG41-1)
MAST-4 clade E (Stramenopiles sp. TOSAG23-3)
Chrysophyte clade H 1 (Chrysophyceae sp. TOSAG23-4)
Chrysophyte clade H 2 (Chrysophyceae sp. TOSAG23-5)