- confidence: blank if normal; ? if confusing / unsure; ?? if very confusing / unsure / not from reliable source (e.g. random news article)
- name: name of project. for multiple datapoints (e.g. release phase 1, 2), add a new row.
- year: year data collected (best guess; year announced / released ok)
- # genomes: cumulative within same project + same type. always LOWER bound, i.e. x means >x, though some rounding allowed
- type: type of sequencing/genotyping (WGS or SNP array)
- parameter: read depth if WGS; SNP count if array
- $ cost: total cost of the project, if easily findable
- funding: major funding source, if easily available. e.g. "finnish gvt"
- evidence: link(s) to paper, blog post, etc., stating # genomes (and type etc.)
- notes: anything confusing or abnormal, uncertainty, etc.

| confidence | name | year | # genomes | type | parameter | $ cost | funding | evidence | notes |
|---|---|---|---|---|---|---|---|---|---|
| | China Kadoorie Biobank | 2017 | 100k | SNP | 700-800k | | | https://www.ckbiobank.org/about-us/ckb-timeline, https://publications.ersnet.org/content/erj/58/4/2100199.full | very confusing. from 2021 https://publications.ersnet.org/content/erj/58/4/2100199.full : "This 700K single nucleotide polymorphism (SNP) array was used to genotype ∼32 000 CKB participants in the first phase. A revised and updated version of the original array which covers ∼803K SNPs was used to genotype ∼69 000 participants in the second and third phases." but from https://www.ckbiobank.org/about-us/ckb-timeline 2024 "400,000 samples are whole genome sequenced." maybe by WGS they mean SNP arrays? or they really did that? https://www.cell.com/cell-genomics/pdfExtended/S2666-979X(23)00144-1 |
| | China Kadoorie Biobank | 2020 | 10k | WGS | ? | | | https://www.ckbiobank.org/about-us/ckb-timeline | maybe more info in recent studies methods section, but didn't check |
| | China Kadoorie Biobank | 2024 | 400k | WGS | ? | | | https://www.ckbiobank.org/about-us/ckb-timeline | not totally sure about this; but there are multiple sources saying this was planned, and then "400,000 samples are whole genome sequenced." on the website. |
| | HapMap 3 | 2011 | 1184 | SNP | 1.6m | | | https://pmc.ncbi.nlm.nih.gov/articles/PMC3173859/ | |
| | Taiwan Biobank | 2021 | 103K | SNP | 660K | | | https://www.nature.com/articles/s41525-021-00178-9 | |
| | Estonian Biobank | 2021 | 200K | SNP | 700K? | | | https://genomics.ut.ee/en/content/estonian-biobank | |
| | 23andMe | 2015 | 1m | SNP | 560-960K | | | https://blog.23andme.com/articles/one-in-a-million | |
| | 23andMe | 2018 | 3m | SNP | 560-960K | | | https://www.technologyreview.com/2018/02/12/145676/2017-was-the-year-consumer-dna-testing-blew-up/ | snp count data from https://isogg.org/wiki/23andMe |
| | 23andMe | 2019 | 10m | SNP | 560-960K | | | https://thednageek.com/23andme-has-more-than-10-million-customers/ | |
| | 23andMe | 2021 | 11m | SNP | 560-960K | | | https://www.bloomberg.com/news/features/2021-11-04/23andme-to-use-dna-tests-to-make-cancer-drugs | |
| | 23andMe | 2023 | 14m | SNP | 560-960K | | | https://investors.23andme.com/news-releases/news-release-details/23andme-reports-fy2023-fourth-quarter-and-full-year-financial | are these really all genome-wide? plausibly yes, but not totally clear that there weren't other products with much smaller SNP arrays included |
| | 23andMe | 2024 | 15m | SNP | 560-960K | | | https://blog.23andme.com/articles/happy-dna-day-2024 | |
| | Million Veteran Program | 2020 | 875K | SNP | 670K? | | | https://pubmed.ncbi.nlm.nih.gov/32243820/ | |
| | Million Veteran Program | 2022 | 150K | WGS | ?? | | | https://med.stanford.edu/gbsc/projects/vapahcs.html | |
| | UK Biobank | 2023 | 491K | WGS | | $200M | | https://www.ukbiobank.ac.uk/learn-more-about-uk-biobank/news/world-s-largest-genetic-project-opens-the-door-to-new-era-for-treatments-and-cures-uk-biobank-s-major-milestone | |
| | UK Biobank | 2022 | 150K | WGS | 23.5 | | | https://www.nature.com/articles/s41586-022-04965-x | |
| | All of Us | ? 2024 | 245K | WGS | | $3.6B | | https://databrowser.researchallofus.org/ | https://allofus.nih.gov/about/program-overview mentions the funding, but not how much has been spent exactly. |
| | Human Genome Diversity Project (HGDP) | 2020 | 929 | WGS | 35 | | | https://www.science.org/doi/10.1126/science.aay5012 | Insights into human genetic variation and population history from 929 diverse genomes. We sequenced 929 genomes from 54 geographically, linguistically, and culturally diverse human populations to an average of 35× coverage and analyzed the variation among them. |
| | 1000 Genomes Project | 2010 | 185 | WGS | 3 | | | https://pmc.ncbi.nlm.nih.gov/articles/PMC3042601/ | |
| | 1000 Genomes Project | 2012 | 1092 | WGS | 5 | | | https://pmc.ncbi.nlm.nih.gov/articles/PMC3498066/ | Primary data generated for each sample consist of low-coverage (average 5x) whole-genome and high-coverage exome (average 80x across a consensus target of 24 Mb spanning over 15,000 genes) sequence data and high density SNP array information. |
| | 1000 Genomes Project | 2015 | 2504 | WGS | 30 | | | https://www.nature.com/articles/nature15393 | |
| | 1000 Genomes Project | 2022 | 3202 | WGS | 30 | | | https://www.cell.com/cell/fulltext/S0092-8674(22)00991-6?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0092867422009916%3Fshowall%3Dtrue | |
| | NHLBI TOPMed (Trans-Omics for Precision Medicine) | 2017 | 56K | WGS | 38.2 | | | https://topmed.nhlbi.nih.gov/topmed-data-access-scientific-community | read depth: https://www.nature.com/articles/s41586-021-03205-y |
| | NHLBI TOPMed (Trans-Omics for Precision Medicine) | 2019 | 138K | WGS | 38.2 | | | https://topmed.nhlbi.nih.gov/topmed-data-access-scientific-community | |
| | FinnGen | 2023 | 500K | SNP | 657-664K | $155M | U Helsinki, consortium | https://www.finngen.fi/en/access_results | https://www.finngen.fi/en/node/1996 mentions 664K genetic markers. https://www.finngen.fi/en/access_results also lists how much data they had collected at various previous points in time. https://www.finngen.fi/en/node/2638 describes the "consortium" |
| | FinnGen | 2020 | 220K | SNP | 657-664K | $155M | U Helsinki, consortium | https://www.finngen.fi/en/access_results | Data released to the partners: Q2 2020 |
| | Tohoku University Tohoku Medical Megabank (TMM) | 2015 | 1,070 | WGS | 30 | | | https://www.nature.com/articles/ncomms9018 | |
| | Tohoku University Tohoku Medical Megabank (TMM) | 2024 | 100K | WGS | ? | | | | |
| | deCODE (Iceland) | 2015 | 2.5K | WGS | 20 | | | https://pubmed.ncbi.nlm.nih.gov/25807286/ | |
| | deCODE (Iceland) | 2024? | 160K | SNP | ? | | | https://www.decode.com/research/ | more than half of Iceland |
| | Qatar Biobank | 2022 | 20K | WGS | ? | | | https://thepeninsulaqatar.com/article/23/10/2022/qatar-biobank-makes-significant-progress-over-ten-years | |
| | CanPath | 2020? | 55K | SNP | ? | | | https://canpath.ca/about/ | |
| | BioBank Japan | 2023 | 267k | SNP | 900k | | U Tokyo, national | https://biobankjp.org/en/initiatives/2108#gsc.tab=0 | snp count: https://biobankjp.org/researchers/730#gsc.tab=0 . org: https://biobankjp.org/en/about/1990#gsc.tab=0 |
| | BioBank Japan | 2023 | 11k | WGS | ? | | U Tokyo, national | https://biobankjp.org/en/initiatives/2108#gsc.tab=0 | |
| | Ancestry DNA | 2024 | 25M | SNP | 637K | | | https://thednageek.com/dna-tests/ | |
| | Autism Genetic Resource Exchange | 2024 | 14k | WGS | | | | https://www.autismspeaks.org/science-news/worlds-largest-autism-genome-database-expands | lots of people with autism |
| | MyHeritage | 2024 | 7M | SNP | 576K | | | https://thednageek.com/dna-tests/ | |
| | GEDmatch | 2023 | 1M | SNP | | | | https://thednageek.com/dna-tests/ | |
| | Korea Biobank | 2023 | 4K | WGS | 20 (median) | | | | ~1/3 low coverage (<10) |
| | Genomics England, 100,000 Genomes | 2019 | 85k | WGS | ? | $380M | UK gov | https://www.genomicsengland.co.uk/news/100000-genomes-for-approved-researchers https://www.genomicsengland.co.uk/initiatives/100000-genomes-project | not clear what is actually sequenced. sounds like a lot of it is cancer cells. but giving participant count, as that's basically a whole human genome. source for budget: https://mitochondrialdisease.nhs.uk/media/documents/jude_craft_jen_whitfield.pdf |
| | Saudi Human Genome Project | 2022 | 65K | mix of WGS, WES and SNP | ? | | | https://www.nature.com/articles/s41598-022-05296-7 | "One outcome of the program is that, more than 65,000 samples have been sequenced and 7,500 variants have been identified to date." |
| | Nebula Genomics | ? | | WGS | 10-30 | | | | |
| | HUNT Biobank | 2024 | 210k | SNP | 360k | | NTNU (Norwegian) | https://academic.oup.com/ije/article/53/3/dyae073/7687883 | unsure of budget or government funding. SNPs exome-enriched? "Illumina-Human Core Exome" https://www.ntnu.edu/about |
| | The Malaysian Cohort | 2023 | 12K | SNP | | | | https://pmc.ncbi.nlm.nih.gov/articles/PMC11402075/ | paper mentions that 10% of samples were genotyped |
| | Kaiser Permanente Research Bank | 2024 | 395k | SNP | 800k | | | https://researchbank.kaiserpermanente.org/for-researchers/data-resource/ | uses Thermo Fisher array https://www.thermofisher.com/us/en/home/life-science/microarray-analysis/applications/predictive-genomics/population-genomics/arrays/axiom-pangenomix.html |
| | Swedish Twin Registry | 2015 | 22k | SNP | 560k | | | https://www.cambridge.org/core/journals/twin-research-and-human-genetics/article/swedish-twin-registry-content-and-management-as-a-research-infrastructure/C22D50E7162D4A9F63706B2B4B84C75B | |
| | GenomeIndia | 2023 | 10K | WGS | | | | https://indianexpress.com/article/india/10000-human-genomes-sequenced-in-india-govt-9184930/ | |
| | Chinese Millionome Database | 2022 | 141K | WGS | 0.06-0.1 | | | https://academic.oup.com/nar/article/51/D1/D890/6649379 | |
| | Swedish Childhood Tumor Biobank | 2022 | 117 | WGS (tumor) | | | | https://genomicmedicine.se/en/2023/07/10/whole-genome-sequencing-provides-important-information-for-treating-children-with-cancer/ | cancer children's tumors (unclear if also the children) |
| | Gabriella Miller Kids First Pediatric Research Program | 2015 | 21k | WGS | | | NIH | https://commonfund.nih.gov/KidsFirst | |
| | Our Future Health | 2023 | 650k | SNP | 700k | | | https://ourfuturehealth.gitbook.io/our-future-health/data/genotype-array-data, https://research.ourfuturehealth.org.uk/data-and-cohort/ | |
| ? | Korean Genome and Epidemiology Study | 2016 | 10k?? | SNP | | | | https://academic.oup.com/ije/article/46/2/e20/2622834 | |
| | Genes & Health | 2021 | 45k | SNP | | | UK | https://www.illumina.com/content/dam/illumina-marketing/documents/products/datasheets/infinium-global-screening-array-data-sheet-370-2016-016.pdf | https://www.medrxiv.org/content/10.1101/2024.11.19.24317559v1.full.pdf |
| | H3Africa | 2024 | 50k | SNP | 1-2 MM? | | | https://www.fic.nih.gov/News/GlobalHealthMatters/march-april-2022/Pages/human-heredity-health-h3-africa-achievements-endure.aspx | https://catalog.h3africa.org/ |
| | H3Africa | 2020 | 426 | WGS | 10-30 | | | https://www.nature.com/articles/s41586-020-2859-7 | "High-depth African genomes inform human migration and health" |
| | Westlake BioBank for Chinese | 2022 | 4480 | WGS | 14 | | | https://www.nature.com/articles/s41467-022-30526-x?fromPaywallRec=false | |
| | Westlake BioBank for Chinese | 2022 | 6025 | SNP | 660k | | | https://www.nature.com/articles/s41467-022-30526-x?fromPaywallRec=false | |
| | J. Craig Venter | 2007 | 1 | WGS | 7.5 | $100,000,000 | Funding was provided from the J. Craig Venter Institute, Genome Canada/Ontario Genomics Institute, The Hospital for Sick Children Foundation, the McLaughlin Centre for Molecular Medicine, and the Canada Foundation for Innovation | https://pmc.ncbi.nlm.nih.gov/articles/PMC1964779/ | claim for cost from here: https://www.nature.com/articles/nature06884 |
| | "Identification of individuals by trait prediction using whole-genome sequencing data" | 2017 | 1,061 | WGS | 30 | | | https://www.pnas.org/doi/pdf/10.1073/pnas.1711125114 | |
| | "Comprehensive Characterization of Human Genome Variation by High Coverage Whole-Genome Sequencing of Forty Four Caucasians" | 2013 | 44 | WGS | 65 | | | https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0059494&type=printable | |
| | James Watson | 2008 | 1 | WGS | 7.4 | $1,000,000 | | https://www.nature.com/articles/nature06884 | |
| | "The diploid genome sequence of an Asian individual" | 2008 | 1 | WGS | 36 | $500,000 | | https://pmc.ncbi.nlm.nih.gov/articles/PMC2716080/ | |
| | "Accurate Whole Human Genome Sequencing using Reversible Terminator Chemistry" | 2008 | 1 | WGS | 30 | | | https://pmc.ncbi.nlm.nih.gov/articles/PMC2581791/ | |
| | "DNA sequencing of a cytogenetically normal acute myeloid leukemia genome" | 2009 | 1 | WGS | 13.9 | | Alvin J. Siteman | https://pmc.ncbi.nlm.nih.gov/articles/PMC2603574/ | |
| | Alzheimer’s Disease Sequencing Project | 2014 | 583 | WGS | ? | | NHGRI | https://www.nia.nih.gov/research/dn/alzheimers-disease-sequencing-project-consortia#discovery | https://grants.nih.gov/grants/guide/rfa-files/RFA-AG-16-002.html . discovery phase. National Institute on Aging Alzheimer’s Disease Family Based Study |
| | Alzheimer’s Disease Sequencing Project | 2016 | 4752 | WGS | ? | | NHGRI | https://www.nia.nih.gov/research/dn/alzheimers-disease-sequencing-project-consortia#extension | date not fully clear. https://grants.nih.gov/grants/guide/pa-files/PAR-16-406.html is already in past tense though. |
| ?? | Alzheimer’s Disease Sequencing Project | ? | x+16745 | WGS | ? | | National Institute on Aging | https://www.nia.nih.gov/research/dn/alzheimers-disease-sequencing-project-consortia#follow | |
| | Alzheimer’s Disease Sequencing Project | 2022 | 35,681 | WGS | ? | | National Institute on Aging | https://adsp.niagads.org/adsp-and-affiliates-whole-genome-sequencing-report/ | genomes that the ADSP has sequenced as of November 2022, https://www.nia.nih.gov/research/dn/alzheimers-disease-sequencing-project-consortia#follow |
| ?? | Alzheimer’s Disease Sequencing Project | 2023 | 94k | WGS | ? | | National Institute on Aging | https://adsp.niagads.org/adsp-and-affiliates-whole-genome-sequencing-report/ | not sure i'm understanding what the numbers mean... NOT YET SEQUENCED necessarily |
| | Steve Jobs | 2011 | 1 | WGS | ? | $100,000 | Steve Jobs | https://www.nytimes.com/2011/10/21/technology/book-offers-new-details-of-jobs-cancer-fight.html | |
| | Complete Genomics | 2011 | 69 | WGS | ? | | | https://web.archive.org/web/20120610192353/http://www.completegenomics.com/news-events/press-releases/archive/Complete-Genomics-Adds-29-High-Coverage-Complete-Human-Genome-Sequencing-Datasets-to-its-Public-Genomic-Repository--119298369.html | Original page with the data isn't up anymore: https://web.archive.org/web/20120416073043/http://www.completegenomics.com/public-data/69-genomes/ |
| | Complete Genomics | 2009 | 3 | WGS | 45 | | | https://www.science.org/doi/10.1126/science.1181498 | |
| | "A Complete Public Domain Family Genomics Dataset" | 2013 | 4 | Whole exome sequencing | 30 | | | https://www.biorxiv.org/content/10.1101/000216v1.full | |
| ? | ClinSeq | | | | | | | | |
| | "A continuum of admixture in the Western Hemisphere revealed by the African Diaspora genome" | 2016 | 642 | WGS | ? | | | https://www.nature.com/articles/ncomms12522.pdf | |
| | "The first Korean genome sequence and analysis: full genome sequencing for a socio-ethnic group" | 2009 | 1 | WGS | 28.9 | | This work was supported by a grant from the KRIBB Research Ini- | https://pubmed.ncbi.nlm.nih.gov/19470904/ | |
| | "A highly annotated whole-genome sequence of a Korean individual" | 2009 | 1 | WGS | 27.8 | | Initiative Program of Korea, by the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korean government (MOST), by a KOSEF grant (no. R11-2008-044-03004-0, S.M.A.), by a grant from the Innovative Research Institute for Cell Therapy (A062260, J.Y.C.), by a grant from the Ministry of Knowledge Economy (Standard Reference Data Program), and by generous funding from Gachon University of Medicine and Science and Gachon University Gil Hospital. | https://www.nature.com/articles/nature08211 | |
| | "Single-molecule sequencing of an individual human genome" | 2009 | 1 | WGS | 28 | | | https://www.nature.com/articles/nbt.1561 | |
| | "In depth comparison of an individual’s DNA and its lymphoblastoid cell line using whole genome sequencing" | 2012 | 1 | WGS | ? | | | https://link.springer.com/content/pdf/10.1186/1471-2164-13-477.pdf | |
| | "Complete Khoisan and Bantu genomes from southern Africa" | 2010 | 2 | WGS | 23, 7 | | | https://www.nature.com/articles/nature08795.pdf | |
| ? | https://www.personalgenomes.org/ | | | | | | | | |
| | "Deep sequencing of 10,000 human genomes" | 2016 | 10545 | WGS | 30 | | | https://www.pnas.org/doi/full/10.1073/pnas.1613365113#sec-3 | |
| | "Whole-genome sequencing in a patient with Charcot-Marie-Tooth neuropathy" | 2010 | 1 | WGS | ? | | | https://pubmed.ncbi.nlm.nih.gov/20220177/ | |
| | "Genome-wide detection of single-nucleotide and copy-number variations of a single human cell" | 2012 | 1 | WGS | 25 | | | https://pubmed.ncbi.nlm.nih.gov/23258894/ | |
| | "A comprehensively molecular haplotype-resolved genome of a European individual" | 2011 | 1 | WGS | ? | | | https://pubmed.ncbi.nlm.nih.gov/21813624/ | |
| | Korean Genome Project | 2020 | 1094 | WGS | 31 | | | https://pubmed.ncbi.nlm.nih.gov/32766443/ | |
| | Korean Genome Project | 2021 | 10k | WGS | ? | | | https://www.news-medical.net/news/20210510/Koreas-first-large-scale-project-reaches-milestone-of-sequencing-10000-whole-genomes.aspx | |
| | Korea4K | 2024 | 4157 | WGS | 20 | | | https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/giae014/7646332 | |
| | Korean Personal Genomes Project | 2014 | 35 | WGS | 5 | | | https://pmc.ncbi.nlm.nih.gov/articles/PMC4251052/ | |
| | Personal Genome Project | 2016 | 184 | WGS | 100 | | | https://pmc.ncbi.nlm.nih.gov/articles/PMC5057367/ | |
| | Human Genome Project | 2000 | 0.5 | WGS | 5 | | | http://news.bbc.co.uk/1/hi/sci/tech/2940601.stm | |
| | Human Genome Project | 2003 | 1 | WGS | 5 | | | http://news.bbc.co.uk/1/hi/sci/tech/2940601.stm | 92% of the genome |
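
The "# genomes" cells mix suffixes and separators (100k, 2.5K, 1,070, 15m), and the column notes say counts are cumulative lower bounds within the same project and type. Below is a minimal Python sketch of how such cells could be normalised and reduced to one best lower bound per (project, type). The function names `parse_count` and `best_lower_bounds`, the three-field row layout, and the choice to skip cells without a leading number (such as ? or x+16745) are illustrative assumptions, not part of the spreadsheet.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Multipliers for the suffixes used in the sheet (case-insensitive): k = thousand, m = million.
_SUFFIX = {"": 1, "k": 1_000, "m": 1_000_000}


def parse_count(cell: str) -> Optional[float]:
    """Turn a '# genomes' cell such as '100k', '2.5K', '1,070' or '15m' into a number.

    Trailing uncertainty markers ('10k??') are ignored. Cells that do not start
    with a number ('?', 'x+16745', '') return None (assumed behaviour).
    """
    text = cell.strip().replace(",", "").lower()
    match = re.match(r"(\d+(?:\.\d+)?)\s*([km]?)", text)
    if match is None:
        return None
    return float(match.group(1)) * _SUFFIX[match.group(2)]


def best_lower_bounds(
    rows: Iterable[Tuple[str, str, str]]
) -> Dict[Tuple[str, str], float]:
    """Keep the largest parsed count per (project name, type).

    Counts are cumulative lower bounds within a project and type, so the
    largest value seen is the tightest lower bound.
    """
    best: Dict[Tuple[str, str], float] = {}
    for name, seq_type, count_cell in rows:
        count = parse_count(count_cell)
        if count is None:
            continue  # unparseable cell: leave it out rather than guess
        key = (name, seq_type)
        best[key] = max(best.get(key, 0.0), count)
    return best


# Example rows copied from the table: (name, type, '# genomes' cell).
rows = [
    ("UK Biobank", "WGS", "491K"),
    ("UK Biobank", "WGS", "150K"),
    ("deCODE (Iceland)", "WGS", "2.5K"),
    ("deCODE (Iceland)", "SNP", "160K"),
    ("Alzheimer's Disease Sequencing Project", "WGS", "x+16745"),
]
print(best_lower_bounds(rows))
```

Taking the maximum rather than the latest year is a design choice: it stays correct when rows are listed out of chronological order, as the UK Biobank rows are.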