A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | AA | AB | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | ||||||||||||||||||||||||||||
2 | blank if normal; ? if confusing / unsure; ?? if very confusing / unsure / not from reliable source (e.g. random news article) | name of project. for multiple datapoints (e.g. release phase 1, 2), add a new row. | year data collected (best guess; year announced / released ok) | cumulative within same project + same type. always LOWER bound, i.e. x means >x, though some rounding allowed | type of sequencing/genotyping (WGS or SNP array) | parameter (read depth if WGS; SNP count if array) | total cost of the project, if easily findable | major funding source, if easily available. e.g. "finnish gvt" | link(s) to paper, blog post, etc., stating # genomes (and type etc.) | anything confusing or abnormal, uncertainty, etc. | ||||||||||||||||||
3 | ||||||||||||||||||||||||||||
4 | confidence | name | year | # genomes | type | parameter | $ cost | funding | evidence | notes | ||||||||||||||||||
5 | China Kadoorie Biobank | 2017 | 100k | SNP | 700-800k | https://www.ckbiobank.org/about-us/ckb-timeline, https://publications.ersnet.org/content/erj/58/4/2100199.full | very confusing. from 2021 https://publications.ersnet.org/content/erj/58/4/2100199.full : "This 700K single nucleotide polymorphism (SNP) array was used to genotype ∼32 000 CKB participants in the first phase. A revised and updated version of the original array which covers ∼803K SNPs was used to genotype ∼69 000 participants in the second and third phases." but from https://www.ckbiobank.org/about-us/ckb-timeline 2024 "400,000 samples are whole genome sequenced." maybe by WGS they mean SNP arrays? or they really did that? https://www.cell.com/cell-genomics/pdfExtended/S2666-979X(23)00144-1 | |||||||||||||||||||||
6 | China Kadoorie Biobank | 2020 | 10k | WGS | ? | https://www.ckbiobank.org/about-us/ckb-timeline | maybe more info in recent studies methods section, but didn't check | |||||||||||||||||||||
7 | China Kadoorie Biobank | 2024 | 400k | WGS | ? | https://www.ckbiobank.org/about-us/ckb-timeline | not totally sure about this; but there are multiple sources saying this was planned, and then "400,000 samples are whole genome sequenced." on the website. | |||||||||||||||||||||
8 | HapMap 3 | 2011 | 1184 | SNP | 1.6m | https://pmc.ncbi.nlm.nih.gov/articles/PMC3173859/ | ||||||||||||||||||||||
9 | Taiwan Biobank | 2021 | 103K | SNP | 660K | https://www.nature.com/articles/s41525-021-00178-9 | ||||||||||||||||||||||
10 | Estonian Biobank | 2021 | 200K | SNP | 700K | ? | https://genomics.ut.ee/en/content/estonian-biobank | |||||||||||||||||||||
11 | 23andMe | 2015 | 1m | SNP | 560-960K | https://blog.23andme.com/articles/one-in-a-million | ||||||||||||||||||||||
12 | 23andMe | 2018 | 3m | SNP | 560-960K | https://www.technologyreview.com/2018/02/12/145676/2017-was-the-year-consumer-dna-testing-blew-up/ | snp count data from https://isogg.org/wiki/23andMe | |||||||||||||||||||||
13 | 23andMe | 2019 | 10m | SNP | 560-960K | https://thednageek.com/23andme-has-more-than-10-million-customers/ | ||||||||||||||||||||||
14 | 23andMe | 2021 | 11m | SNP | 560-960K | https://www.bloomberg.com/news/features/2021-11-04/23andme-to-use-dna-tests-to-make-cancer-drugs | ||||||||||||||||||||||
15 | 23andMe | 2023 | 14m | SNP | 560-960K | https://investors.23andme.com/news-releases/news-release-details/23andme-reports-fy2023-fourth-quarter-and-full-year-financial | are these really all genome-wide? plausibly yes, but not totally clear that there weren't other products with much smaller SNP arrays included | |||||||||||||||||||||
16 | 23andMe | 2024 | 15m | SNP | 560-960K | https://blog.23andme.com/articles/happy-dna-day-2024 | ||||||||||||||||||||||
17 | Million Veteran Program | 2020 | 875K | SNP | 670K | ? | https://pubmed.ncbi.nlm.nih.gov/32243820/ | |||||||||||||||||||||
18 | Million Veteran Program | 2022 | 150K | WGS | ? | ? | https://med.stanford.edu/gbsc/projects/vapahcs.html | |||||||||||||||||||||
19 | UK Biobank | 2023 | 491K | WGS | $200M | https://www.ukbiobank.ac.uk/learn-more-about-uk-biobank/news/world-s-largest-genetic-project-opens-the-door-to-new-era-for-treatments-and-cures-uk-biobank-s-major-milestone | ||||||||||||||||||||||
20 | UK Biobank | 2022 | 150K | WGS | 23.5 | https://www.nature.com/articles/s41586-022-04965-x | ||||||||||||||||||||||
21 | All of Us | ? 2024 | 245K | WGS | $3.6B | https://databrowser.researchallofus.org/ | https://allofus.nih.gov/about/program-overview mentions the funding, but not how much has been spent exactly. | |||||||||||||||||||||
22 | Human Genome Diversity Project (HGDP) | 2020 | 929 | WGS | 35 | https://www.science.org/doi/10.1126/science.aay5012 | Insights into human genetic variation and population history from 929 diverse genomes. We sequenced 929 genomes from 54 geographically, linguistically, and culturally diverse human populations to an average of 35× coverage and analyzed the variation among them. | |||||||||||||||||||||
23 | 1000 genomes project | 2010 | 185 | WGS | 3 | https://pmc.ncbi.nlm.nih.gov/articles/PMC3042601/ | ||||||||||||||||||||||
24 | 1000 genomes project | 2012 | 1092 | WGS | 5 | https://pmc.ncbi.nlm.nih.gov/articles/PMC3498066/ | Primary data generated for each sample consist of low-coverage (average 5x) whole-genome and high-coverage exome (average 80x across a consensus target of 24 Mb spanning over 15,000 genes) sequence data and high density SNP array information. | |||||||||||||||||||||
25 | 1000 genomes project | 2015 | 2504 | WGS | 30 | https://www.nature.com/articles/nature15393 | ||||||||||||||||||||||
26 | 1000 genomes project | 2022 | 3202 | WGS | 30 | https://www.cell.com/cell/fulltext/S0092-8674(22)00991-6?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0092867422009916%3Fshowall%3Dtrue | ||||||||||||||||||||||
27 | NHLBI TOPMed (Trans-Omics for Precision Medicine) | 2017 | 56K | WGS | 38.2 | https://topmed.nhlbi.nih.gov/topmed-data-access-scientific-community | read depth: https://www.nature.com/articles/s41586-021-03205-y | |||||||||||||||||||||
28 | NHLBI TOPMed (Trans-Omics for Precision Medicine) | 2019 | 138K | WGS | 38.2 | https://topmed.nhlbi.nih.gov/topmed-data-access-scientific-community | ||||||||||||||||||||||
29 | FINNGEN | 2023 | 500K | SNP | 657-664K | $155M | U Helsinki, consortium | https://www.finngen.fi/en/access_results | https://www.finngen.fi/en/node/1996 mentions 664K genetic markers. https://www.finngen.fi/en/access_results also lists how much data they had collected at various previous points in time. https://www.finngen.fi/en/node/2638 describes the "consortium" | |||||||||||||||||||
30 | FINNGEN | 2020 | 220K | SNP | 657-664K | $155M | U Helsinki, consortium | https://www.finngen.fi/en/access_results | Data released to the partners: Q2 2020 | |||||||||||||||||||
31 | Tohoku University Tohoku Medical Megabank (TMM) | 2015 | 1,070 | WGS | 30 | https://www.nature.com/articles/ncomms9018 | ||||||||||||||||||||||
32 | Tohoku University Tohoku Medical Megabank (TMM) | 2024 | 100K | WGS | ? | |||||||||||||||||||||||
33 | deCODE (Iceland) | 2015 | 2.5K | WGS | 20 | https://pubmed.ncbi.nlm.nih.gov/25807286/ | ||||||||||||||||||||||
34 | deCODE (Iceland) | 2024? | 160K | SNP | ? | https://www.decode.com/research/ | more than half of Iceland's population | |||||||||||||||||||||
35 | Qatar Biobank | 2022 | 20K | WGS | ? | https://thepeninsulaqatar.com/article/23/10/2022/qatar-biobank-makes-significant-progress-over-ten-years | ||||||||||||||||||||||
36 | CanPath | 2020? | 55K | SNP | ? | https://canpath.ca/about/ | ||||||||||||||||||||||
37 | BioBank Japan | 2023 | 267k | SNP | 900k | U Tokyo, national | https://biobankjp.org/en/initiatives/2108#gsc.tab=0 | snp count: https://biobankjp.org/researchers/730#gsc.tab=0 . org: https://biobankjp.org/en/about/1990#gsc.tab=0 | ||||||||||||||||||||
38 | BioBank Japan | 2023 | 11k | WGS | ? | U Tokyo, national | https://biobankjp.org/en/initiatives/2108#gsc.tab=0 | |||||||||||||||||||||
39 | Ancestry DNA | 2024 | 25M | SNP | 637K | https://thednageek.com/dna-tests/ | ||||||||||||||||||||||
40 | Autism Genetic Resource Exchange | 2024 | 14k | WGS | https://www.autismspeaks.org/science-news/worlds-largest-autism-genome-database-expands | lots of people with autism | ||||||||||||||||||||||
41 | MyHeritage | 2024 | 7M | SNP | 576K | https://thednageek.com/dna-tests/ | ||||||||||||||||||||||
42 | GEDmatch | 2023 | 1M | SNP | https://thednageek.com/dna-tests/ | |||||||||||||||||||||||
43 | Korea Biobank | 2023 | 4K | WGS | 20 (median) | ~1/3 low coverage (<10) | ||||||||||||||||||||||
44 | Genomics England, 100,000 Genomes | 2019 | 85k | WGS | ? | $380M | UK gov | https://www.genomicsengland.co.uk/news/100000-genomes-for-approved-researchers | https://www.genomicsengland.co.uk/initiatives/100000-genomes-project not clear what is actually sequenced. sounds like a lot of it is cancer cells. but giving participant count, as that's basically a whole human genome. source for budget: https://mitochondrialdisease.nhs.uk/media/documents/jude_craft_jen_whitfield.pdf | |||||||||||||||||||
45 | Saudi Human Genome Project | 2022 | 65K | mix of WGS, WES and SNP? | https://www.nature.com/articles/s41598-022-05296-7 | One outcome of the program is that, more than 65,000 samples have been sequenced and 7,500 variants have been identified to date. | ||||||||||||||||||||||
46 | Nebula Genomics | ? | WGS | 10-30 | ||||||||||||||||||||||||
47 | HUNT Biobank | 2024 | 210k | SNP | 360k | NTNU (norwegian) | https://academic.oup.com/ije/article/53/3/dyae073/7687883 | unsure of budget or gvt funding. SNPs exome-enriched? "Illumina-Human Core Exome" https://www.ntnu.edu/about | ||||||||||||||||||||
48 | The Malaysian Cohort | 2023 | 12K | SNP | https://pmc.ncbi.nlm.nih.gov/articles/PMC11402075/ | paper mentions that 10% of samples were genotyped | ||||||||||||||||||||||
49 | Kaiser Permanente Research Bank | 2024 | 395k | SNP | 800k | https://researchbank.kaiserpermanente.org/for-researchers/data-resource/ | uses thermofisher array https://www.thermofisher.com/us/en/home/life-science/microarray-analysis/applications/predictive-genomics/population-genomics/arrays/axiom-pangenomix.html | |||||||||||||||||||||
50 | Swedish Twin Registry | 2015 | 22k | SNP | 560k | https://www.cambridge.org/core/journals/twin-research-and-human-genetics/article/swedish-twin-registry-content-and-management-as-a-research-infrastructure/C22D50E7162D4A9F63706B2B4B84C75B | ||||||||||||||||||||||
51 | GenomeIndia | 2023 | 10K | WGS | https://indianexpress.com/article/india/10000-human-genomes-sequenced-in-india-govt-9184930/ | |||||||||||||||||||||||
52 | Chinese Millionome Database | 2022 | 141K | WGS | 0.06-0.1 | https://academic.oup.com/nar/article/51/D1/D890/6649379 | ||||||||||||||||||||||
53 | Swedish Childhood Tumor Biobank | 2022 | 117 | WGS (tumor) | https://genomicmedicine.se/en/2023/07/10/whole-genome-sequencing-provides-important-information-for-treating-children-with-cancer/ | tumors of children with cancer (unclear if the children themselves were also sequenced) | ||||||||||||||||||||||
54 | Gabriella Miller Kids First Pediatric Research Program | 2015 | 21k | WGS | NIH | https://commonfund.nih.gov/KidsFirst | ||||||||||||||||||||||
55 | Our Future Health | 2023 | 650k | SNP | 700k | https://ourfuturehealth.gitbook.io/our-future-health/data/genotype-array-data, https://research.ourfuturehealth.org.uk/data-and-cohort/ | ||||||||||||||||||||||
56 | ? | Korean Genome and Epidemiology Study | 2016 | 10k?? | SNP | https://academic.oup.com/ije/article/46/2/e20/2622834 | ||||||||||||||||||||||
57 | Genes & Health | 2021 | 45k | SNP | uk | https://www.illumina.com/content/dam/illumina-marketing/documents/products/datasheets/infinium-global-screening-array-data-sheet-370-2016-016.pdf | https://www.medrxiv.org/content/10.1101/2024.11.19.24317559v1.full.pdf | |||||||||||||||||||||
58 | H3Africa | 2024 | 50k | SNP | 1-2 MM? | https://www.fic.nih.gov/News/GlobalHealthMatters/march-april-2022/Pages/human-heredity-health-h3-africa-achievements-endure.aspx | https://catalog.h3africa.org/ | |||||||||||||||||||||
59 | H3Africa | 2020 | 426 | WGS | 10-30 | https://www.nature.com/articles/s41586-020-2859-7 | "High-depth African genomes inform human migration and health" | |||||||||||||||||||||
60 | Westlake BioBank for Chinese | 2022 | 4480 | WGS | 14 | https://www.nature.com/articles/s41467-022-30526-x?fromPaywallRec=false | ||||||||||||||||||||||
61 | Westlake BioBank for Chinese | 2022 | 6025 | SNP | 660k | https://www.nature.com/articles/s41467-022-30526-x?fromPaywallRec=false | ||||||||||||||||||||||
62 | J. Craig Venter | 2007 | 1 | WGS | 7.5 | $100,000,000 | Funding was provided from the J. Craig Venter Institute, Genome Canada/Ontario Genomics Institute, The Hospital for Sick Children Foundation, the McLaughlin Centre for Molecular Medicine, and the Canada Foundation for Innovation | https://pmc.ncbi.nlm.nih.gov/articles/PMC1964779/ | claim for cost from here: https://www.nature.com/articles/nature06884 | |||||||||||||||||||
63 | "Identification of individuals by trait prediction using whole-genome sequencing data" | 2017 | 1,061 | WGS | 30 | https://www.pnas.org/doi/pdf/10.1073/pnas.1711125114 | ||||||||||||||||||||||
64 | "Comprehensive Characterization of Human Genome Variation by High Coverage Whole-Genome Sequencing of Forty Four Caucasians" | 2013 | 44 | WGS | 65 | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0059494&type=printable | ||||||||||||||||||||||
65 | James Watson | 2008 | 1 | WGS | 7.4 | $1,000,000 | https://www.nature.com/articles/nature06884 | |||||||||||||||||||||
66 | "The diploid genome sequence of an Asian individual" | 2008 | 1 | WGS | 36 | $500,000 | https://pmc.ncbi.nlm.nih.gov/articles/PMC2716080/ | |||||||||||||||||||||
67 | "Accurate Whole Human Genome Sequencing using Reversible Terminator Chemistry" | 2008 | 1 | WGS | 30 | https://pmc.ncbi.nlm.nih.gov/articles/PMC2581791/ | ||||||||||||||||||||||
68 | "DNA sequencing of a cytogenetically normal acute myeloid leukemia genome" | 2009 | 1 | WGS | 13.9 | Alvin J. Siteman | https://pmc.ncbi.nlm.nih.gov/articles/PMC2603574/ | |||||||||||||||||||||
69 | Alzheimer’s Disease Sequencing Project | 2014 | 583 | WGS | ? | NHGRI | https://www.nia.nih.gov/research/dn/alzheimers-disease-sequencing-project-consortia#discovery | https://grants.nih.gov/grants/guide/rfa-files/RFA-AG-16-002.html. discovery phase. National Institute on Aging Alzheimer’s Disease Family Based Study | ||||||||||||||||||||
70 | Alzheimer’s Disease Sequencing Project | 2016 | 4752 | WGS | ? | NHGRI | https://www.nia.nih.gov/research/dn/alzheimers-disease-sequencing-project-consortia#extension | date not fully clear. https://grants.nih.gov/grants/guide/pa-files/PAR-16-406.html is already in past tense though. | ||||||||||||||||||||
71 | ?? | Alzheimer’s Disease Sequencing Project | ? | x+16745 | WGS | ? | National Institute on Aging | https://www.nia.nih.gov/research/dn/alzheimers-disease-sequencing-project-consortia#follow | ||||||||||||||||||||
72 | Alzheimer’s Disease Sequencing Project | 2022 | 35,681 | WGS | ? | National Institute on Aging | https://adsp.niagads.org/adsp-and-affiliates-whole-genome-sequencing-report/ | genomes that the ADSP has sequenced as of November 2022, https://www.nia.nih.gov/research/dn/alzheimers-disease-sequencing-project-consortia#follow | ||||||||||||||||||||
73 | ?? | Alzheimer’s Disease Sequencing Project | 2023 | 94k | WGS | ? | National Institute on Aging | https://adsp.niagads.org/adsp-and-affiliates-whole-genome-sequencing-report/ | not sure i'm understanding what the numbers mean... NOT YET SEQUENCED necessarily | |||||||||||||||||||
74 | Steve Jobs | 2011 | 1 | WGS | ? | $100,000 | Steve Jobs | https://www.nytimes.com/2011/10/21/technology/book-offers-new-details-of-jobs-cancer-fight.html | ||||||||||||||||||||
75 | Complete Genomics | 2011 | 69 | WGS | ? | https://web.archive.org/web/20120610192353/http://www.completegenomics.com/news-events/press-releases/archive/Complete-Genomics-Adds-29-High-Coverage-Complete-Human-Genome-Sequencing-Datasets-to-its-Public-Genomic-Repository--119298369.html | Original page with the data isn't up anymore: https://web.archive.org/web/20120416073043/http://www.completegenomics.com/public-data/69-genomes/ | |||||||||||||||||||||
76 | Complete Genomics | 2009 | 3 | WGS | 45 | https://www.science.org/doi/10.1126/science.1181498 | ||||||||||||||||||||||
77 | "A Complete Public Domain Family Genomics Dataset" | 2013 | 4 | Whole exome sequencing | 30 | https://www.biorxiv.org/content/10.1101/000216v1.full | ||||||||||||||||||||||
78 | ? | ClinSeq | ||||||||||||||||||||||||||
79 | "A continuum of admixture in the Western Hemisphere revealed by the African Diaspora genome" | 2016 | 642 | WGS | ? | https://www.nature.com/articles/ncomms12522.pdf | ||||||||||||||||||||||
80 | "The first Korean genome sequence and analysis: full genome sequencing for a socio-ethnic group" | 2009 | 1 | WGS | 28.9 | This work was supported by a grant from the KRIBB Research Initiative Program of Korea, by the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korean government (MOST), by a KOSEF grant (no. R11-2008-044-03004-0, S.M.A.), by a grant from the Innovative Research Institute for Cell Therapy (A062260, J.Y.C.), by a grant from the Ministry of Knowledge Economy (Standard Reference Data Program), and by generous funding from Gachon University of Medicine and Science and Gachon University Gil Hospital. | https://pubmed.ncbi.nlm.nih.gov/19470904/ | |||||||||||||||||||||
81 | "A highly annotated whole-genome sequence of a Korean individual" | 2009 | 1 | WGS | 27.8 | https://www.nature.com/articles/nature08211 | ||||||||||||||||||||||
82 | "Single-molecule sequencing of an individual human genome" | 2009 | 1 | WGS | 28 | https://www.nature.com/articles/nbt.1561 | ||||||||||||||||||||||
83 | "In depth comparison of an individual’s DNA and its lymphoblastoid cell line using whole genome sequencing" | 2012 | 1 | WGS | ? | https://link.springer.com/content/pdf/10.1186/1471-2164-13-477.pdf | ||||||||||||||||||||||
84 | "Complete Khoisan and Bantu genomes from southern Africa" | 2010 | 2 | WGS | 23, 7 | https://www.nature.com/articles/nature08795.pdf | ||||||||||||||||||||||
85 | ? | https://www.personalgenomes.org/ | ||||||||||||||||||||||||||
86 | "Deep sequencing of 10,000 human genomes" | 2016 | 10545 | WGS | 30 | https://www.pnas.org/doi/full/10.1073/pnas.1613365113#sec-3 | ||||||||||||||||||||||
87 | "Whole-genome sequencing in a patient with Charcot-Marie-Tooth neuropathy" | 2010 | 1 | WGS | ? | https://pubmed.ncbi.nlm.nih.gov/20220177/ | ||||||||||||||||||||||
88 | "Genome-wide detection of single-nucleotide and copy-number variations of a single human cell" | 2012 | 1 | WGS | 25 | https://pubmed.ncbi.nlm.nih.gov/23258894/ | ||||||||||||||||||||||
89 | "A comprehensively molecular haplotype-resolved genome of a European individual" | 2011 | 1 | WGS | ? | https://pubmed.ncbi.nlm.nih.gov/21813624/ | ||||||||||||||||||||||
90 | Korean Genome Project | 2020 | 1094 | WGS | 31 | https://pubmed.ncbi.nlm.nih.gov/32766443/ | ||||||||||||||||||||||
91 | Korean Genome Project | 2021 | 10k | WGS | ? | https://www.news-medical.net/news/20210510/Koreas-first-large-scale-project-reaches-milestone-of-sequencing-10000-whole-genomes.aspx | ||||||||||||||||||||||
92 | Korea4K | 2024 | 4157 | WGS | 20 | https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/giae014/7646332 | ||||||||||||||||||||||
93 | Korean Personal Genomes Project | 2014 | 35 | WGS | 5 | https://pmc.ncbi.nlm.nih.gov/articles/PMC4251052/ | ||||||||||||||||||||||
94 | Personal Genome Project | 2016 | 184 | WGS | 100 | https://pmc.ncbi.nlm.nih.gov/articles/PMC5057367/ | ||||||||||||||||||||||
95 | Human Genome Project | 2000 | 0.5 | WGS | 5 | http://news.bbc.co.uk/1/hi/sci/tech/2940601.stm | ||||||||||||||||||||||
96 | Human Genome Project | 2003 | 1 | WGS | 5 | http://news.bbc.co.uk/1/hi/sci/tech/2940601.stm | 92% of the genome | |||||||||||||||||||||
97 | ||||||||||||||||||||||||||||
98 | ||||||||||||||||||||||||||||
99 | ||||||||||||||||||||||||||||
100 |
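
The "# genomes" column mixes formats ("100k", "25M", "1,070", "0.5", "10k??", "x+16745"). Below is a minimal sketch, assuming Python, of how such cells might be normalized to numbers before sorting or plotting. `parse_count` and its suffix table are hypothetical helpers for illustration, not part of the sheet. Per the column description, every parsed value should still be read as a lower bound.

```python
import re

# Multipliers for the k / m / b suffixes used in the "# genomes" column
# (case-insensitive). This table is an assumption about how the sheet
# abbreviates thousands, millions, and billions.
_SUFFIX = {"k": 1_000, "m": 1_000_000, "b": 1_000_000_000}


def parse_count(text):
    """Convert a '# genomes' cell such as '100k', '25M', or '1,070' to a number.

    Returns None for cells that are blank, '?', or otherwise not a plain
    count (for example '10k??' or 'x+16745'), so uncertain entries are
    left for manual review instead of being guessed.
    """
    cleaned = text.strip().replace(",", "").lower()
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([kmb]?)", cleaned)
    if match is None:
        return None
    value = float(match.group(1)) * _SUFFIX.get(match.group(2), 1)
    # Keep whole numbers as int; fractional counts (e.g. '0.5') stay float.
    return int(value) if value.is_integer() else value


if __name__ == "__main__":
    for cell in ["100k", "25M", "1,070", "2.5K", "0.5", "10k??", "x+16745"]:
        print(cell, "->", parse_count(cell))
```

Cells the parser rejects come back as `None` rather than as a best guess, which keeps the "?" and "??" confidence flags meaningful downstream.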