We have sequenced the CHM13hTERT human cell line on the Oxford Nanopore platform to approximately 120x coverage. We have also generated approximately 50x coverage of 10X Genomics data, as well as BioNano DLS and Arima Genomics Hi-C data. PacBio (both CLR and HiFi) data for this cell line has been previously generated by the Washington University School of Medicine and the University of Washington, and is available from NCBI SRA.
Human genomic DNA was extracted from the cultured cell line. As the DNA is native, modified bases will be preserved. We followed Josh Quick's ultra-long read (UL) protocol for library preparation and sequencing.
All data is released to the public domain (CC0) and we encourage its reuse. While not required, we would appreciate it if you acknowledged the "telomere-to-telomere" (T2T) consortium for the creation of this data, and we encourage you to join us if you would like to help finish the human reference genome. More information about our consortium can be found on the T2T homepage.
Miga KH, Koren S, et al. Telomere-to-telomere assembly of a complete human X chromosome. bioRxiv, 2019.
The current assembly draft (v0.7) was generated with Canu v1.7.1, including rel1 data up to 2018/11/15 and incorporating the previously released PacBio data. Two gaps on the X plus the centromere were manually resolved. Contigs with low coverage support were split and the assembly was scaffolded with BioNano. The assembly was polished with two rounds of nanopolish and two rounds of arrow. The X polishing was done using unique markers matched between the assembly and the raw read data; the rest of the genome used traditional polishing. Finally, the assembly was polished with 10X Genomics data. We validated the assembly using independent BACs. The overall QV is Q37 (Q42 in unique regions) and the assembly resolves over 80% of the BACs (280/341).
The assembly is 2.94 Gbp in size with 359 scaffolds (448 contigs) and an NG50 of 83 Mbp (70 Mbp).
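To put the quality figures above on a more familiar scale, the sketch below converts the quoted QVs into per-base error rates. It assumes the QVs are Phred-scaled (QV = -10 log10 of the error probability), which is the usual convention but is not stated explicitly here; the function names are illustrative.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Convert a per-base error probability back to a Phred-scaled quality value."""
    return -10 * math.log10(error_rate)


# QVs quoted for the v0.7 draft (assumed to be Phred-scaled).
for label, qv in [("overall", 37), ("unique regions", 42)]:
    rate = qv_to_error_rate(qv)
    print(f"{label}: Q{qv} is about 1 error per {1 / rate:,.0f} bases")

# Fraction of validation BACs resolved, from the counts quoted above.
bacs_resolved, bacs_total = 280, 341
print(f"BACs resolved: {bacs_resolved / bacs_total:.1%}")
```

Under that assumption, Q37 corresponds to roughly one error per 5,000 bases and Q42 to roughly one per 16,000.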
Outside of the X, this should be considered a draft and likely has mis-assemblies. We will continue to update releases as we validate/fix the assembly. Unpolished assemblies are available below for each data release and may be a more suitable basis for the structural analysis of other chromosomes, but will have a lower consensus accuracy.
- Chromosome X v0.7 (md5: 89b3dd61db66177dd830527b920956fa)
- Chromosome X v0.7 unique k-mer anchored mappings (md5: ada12a00d4781f6b0101a09be19abe93)
- Chromosome 8 v3 (md5: 7194793c7fc0296749f226d2cd6a9c76)
- Chromosome 8 v3 unique k-mer anchored mappings (md5: 6b3be07cbef7a9b04bde83b91cfe764d)
- Assembly draft v0.7 (md5: b9777540aaa0251c7dbb4974fb0a69d6)
- Assembly draft v0.6 (md5: c3e3318e82ba5dc64b74f458f4989b85)
- Assembly draft v0.4 (md5: 7e3c2fff9479ba45f7916fa1eee1310b)
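The md5 values listed with each file can be used to check a download for corruption. The following is a minimal sketch, assuming the checksums were computed over the files exactly as served; the local filename is hypothetical and should be replaced with wherever the download was saved.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's md5 matches the published value (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Published checksum for "Assembly draft v0.7", copied from the list above.
ASSEMBLY_V07_MD5 = "b9777540aaa0251c7dbb4974fb0a69d6"

# Hypothetical local filename; nothing is checked if the file is absent.
local_copy = Path("chm13.draft_v0.7.fasta.gz")
if local_copy.exists():
    status = "OK" if verify_download(local_copy, ASSEMBLY_V07_MD5) else "MISMATCH"
    print(f"{local_copy}: {status}")
```

Reading in chunks matters here because many of the files in this release are far larger than typical memory.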
We sequenced a total of 367 Gbp of data (118x coverage). The read N50 is 53 kbp and there are 193 Gbp in reads >50 kbp (62x). The longest full-length mapping read is 1.3 Mbp.
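For readers unfamiliar with the read N50 statistic quoted above, here is a small sketch of the usual definition: the length L such that reads of length L or longer contain at least half of all sequenced bases. The toy read lengths are made up for illustration and are not taken from this dataset.

```python
def n50(lengths):
    """Return the N50 of a collection of lengths.

    Lengths are sorted from longest to shortest and accumulated; the N50 is
    the length at which the running total first reaches half of all bases.
    """
    if not lengths:
        raise ValueError("n50() requires at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (illustrative values only): total is 20, so half is 10.
# Accumulating 6 then 5 reaches 11, so the N50 is 5.
toy_reads = [2, 3, 4, 5, 6]
print(n50(toy_reads))
```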
Sequencing data was generated from three lines of CHM13 (NHGRI, UW, UCD), which all originate from the original line established by Urvashi Surti. Only the NHGRI line was karyotyped and confirmed to be stable prior to sequencing. For the NHGRI line, NHGRI (PI: Phillippy) and University of Nottingham (PI: Loose) contributed approximately 140 flowcells of UL data using Quick's ultra-long protocol; 199 Gbp (64x, 1.4 Gbp/flowcell). The read N50 is 71 kbp and there are 128 Gbp of data in reads >50 kbp (41x). For the UW line, University of Washington (PI: Eichler) contributed 80 flowcells of UL data using a new UL protocol developed by Glennis Logsdon; 38 Gbp (12x, 0.5 Gbp/flowcell). The read N50 is 130 kbp and there are 30 Gbp of data in reads >50 kbp (10x). For the UCD line, UC Davis (PI: Dennis) contributed two PromethION cells using a ligation prep; 114 Gbp (37x, 57 Gbp/flowcell). The read N50 is 36 kbp and there are 25 Gbp of data in reads >50 kbp (8x).
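The coverage and per-flowcell figures above follow from simple division, sketched below. The genome size of 3.1 Gbp is an assumption chosen because it reproduces the quoted coverage values; it is not stated in this document. The flowcell count for the NHGRI line is the approximate figure given above.

```python
# Assumed haploid genome size in Gbp (an assumption, not a value from this README).
ASSUMED_GENOME_SIZE_GBP = 3.1

# Yield and flowcell counts per cell line, as quoted in the paragraph above.
LINES = {
    "NHGRI line (NHGRI + Nottingham)": {"gbp": 199, "flowcells": 140},
    "UW line": {"gbp": 38, "flowcells": 80},
    "UCD line": {"gbp": 114, "flowcells": 2},
}


def coverage(gbp, genome_size_gbp=ASSUMED_GENOME_SIZE_GBP):
    """Fold coverage = sequenced bases / genome size (both in Gbp)."""
    return gbp / genome_size_gbp


def yield_per_flowcell(gbp, flowcells):
    """Average yield per flowcell in Gbp."""
    return gbp / flowcells


for name, stats in LINES.items():
    fold = coverage(stats["gbp"])
    per_cell = yield_per_flowcell(stats["gbp"], stats["flowcells"])
    print(f"{name}: {fold:.0f}x, {per_cell:.1f} Gbp/flowcell")
```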
rel4 is the full dataset as of 2019/09/01; all data was re-called using Guppy 3.4.5 with the HAC model.
- Guppy flip-flop 3.4.5 (md5: dad0b6caa4a2b03f57387c1bd8107b2f)
rel3 is the full dataset as of 2019/09/01; all data was re-called using Guppy 3.1.5 with the HAC model. We have provided minimap2 mappings, in CRAM format, both to our current draft assembly and to GRCh38 with decoys. Read ids broken out by sequencing location are available for NHGRI, U of Nottingham, UW, and UCD.
- Guppy flip-flop 3.1.5 (md5: 92026d97a898c2f5b65074048a1caabf)
- Canu v1.9 rel3 assembly (no curation or polishing, resolves 314 BACs at Q24) (md5: a05a864eb90578f0fe36e0d774395075)
- Flye v2.5 rel3 assembly (no curation or polishing, resolves 253 BACs at Q22) (md5: 80428824ecc3ec41cde9301aa3a986d0)
- Shasta rel3 assembly (no curation or polishing, resolves 176 BACs at Q28) (md5: 4da86a6b4af5fa5c35407d7cf39c1bac)
- Guppy flip-flop mapped to asm v0.7 with minimap2 (md5: 02b8966c447f2cc9dc1ae211930fd4e3)
- Guppy flip-flop mapped to GRCh38 with decoys with minimap2 (md5: a18c3c9e9f3fa638ff348ebba0f883da)
rel2 is the same data as rel1 but re-called with the latest-generation caller (Guppy flip-flop 2.3.1). We have provided minimap2 mappings, in CRAM format, both to our current draft assembly and to GRCh38 with decoys.
- Guppy flip-flop 2.3.1 (md5: 7e3f4ded02d500a3db0c76c84cdc42b9)
- Canu v1.8 rel2 assembly (no curation or polishing, resolves 287 BACs at Q20) (md5: 778ec406528e153e9b0cb74b4a4caade)
- Guppy flip-flop mapped to asm v0.6 with minimap2 (md5: 20afc508915207c5082e6f3c427739d2)
- Guppy flip-flop mapped to GRCh38 with decoys with minimap2 (md5: 1a4888cafbc935a21c17f449b4802438)
The full dataset as of 2019/01/09. These basecalls were generated on-instrument and use older versions of Guppy (depending on when the flowcell ran on the instrument).
- Guppy on-instrument (md5: c2cb74601eb657df21b7d25980908288)
The raw fast5 data, without basecalls, is available for completeness. The data is grouped into 226 sets.
- Partitions 1-94 were sequenced at NHGRI
- Partitions 95-98 were sequenced at University of Nottingham
- Partitions 99-144 were sequenced at NHGRI
- Partitions 145-224 were sequenced at University of Washington
- Partitions 225-226 were sequenced at UC Davis
- Partition 001 (md5: c837460c50a4446fc8320c95dc88f204)
- Partition 002 (md5: 05ceccf4256d248aaec2a4c61e58c26c)
- Partition 003 (md5: 879e3a6391e5da5f943fa46b92decd47)
- Partition 004 (md5: 600bfa46c741eeff0064b1d8040b9349)
- Partition 005 (md5: 1a72beff4b2e4556c5033176ed1cd109)
- Partition 006 (md5: fcd6f8ceeac2034eddaa33cedf6d0010)
- Partition 007 (md5: 0d44cb41a4888b55bce2cba7e70107ba)
- Partition 008 (md5: 52242770505ac9aca1070e0b926c4769)
- Partition 009 (md5: 4e85e63a4ebf8efb2f97fdcee46e5737)
- Partition 010 (md5: e495530dd8a68b7bc9864ab89a4ef52f)
- Partition 011 (md5: 3b57e6256d0162d83a281e74157134e0)
- Partition 012 (md5: 735a0a03c6bec1e0ed417baa0c2d7db2)
- Partition 013 (md5: 90c51a9ab06266b2a980bcc16d3d3960)
- Partition 014 (md5: 645ea0b4edc2bfc71c708a53d5b0d92b)
- Partition 015 (md5: 24f456adb4c1c6579fe34f07c82179e7)
- Partition 016 (md5: 6b72ddda5a7a1c10b50f3026914519ec)
- Partition 017 (md5: 14e7b918b28ecc784b68569454fa27d9)
- Partition 018 (md5: d5f7c9b1d88cf48298f6cbbb2a2a45a9)
- Partition 019 (md5: cefa121a627dfcf9a1dfb117065a7264)
- Partition 020 (md5: ca0729b28cd4cccc81eba670c6e86689)
- Partition 021 (md5: 51a873a2019f2b091ab035cc3f074bb8)
- Partition 022 (md5: e9235f052d651b4ba1fdaaa06ad134d0)
- Partition 023 (md5: 75ebfdb40745d667962a19a0aa838837)
- Partition 024 (md5: e1e05425f9823e50650bd2cf1efa41c6)
- Partition 025 (md5: f8efb23a5e77b12f46bce73b2ddba36a)
- Partition 026 (md5: 829f32786514b092da9e4fb8701da037)
- Partition 027 (md5: 15ebb086d975583386c1d0e49fbca932)
- Partition 028 (md5: 3dd39dee6efea9b1b50d282d1d2aae19)
- Partition 029 (md5: 3c5b3522dd741214554f84d8645cdf20)
- Partition 030 (md5: 1ef7fe24c315085d8dcfe4e6ba9b4de2)
- Partition 031 (md5: e9501d4d0fd38d64c2ad1c81f8d1a0e3)
- Partition 032 (md5: 1f3ff51da0e87c2009bef8256b930f0b)
- Partition 033 (md5: 76a518084b021db82fd5dab7540e88bb)
- Partition 034 (md5: fd9f4dcfaeb89134a4f700a5346c16fa)
- Partition 035 (md5: dbdd53ba61d67a7f61405ae39d2b931b)
- Partition 036 (md5: c243b8f64bde0051fe104e8baaecf09b)
- Partition 037 (md5: aafa1d558881b2b4856fde3af0cbb9b2)
- Partition 038 (md5: d2e39e42eaf6a0a63d0542435590dd88)
- Partition 039 (md5: ef48d5c46f19de02fb6f6646726c95de)
- Partition 040 (md5: 17d7d34b45e14b2a79fc30e5c5084315)
- Partition 041 (md5: eb6a16d0b37d538bdbf90c3bfcc0f098)
- Partition 042 (md5: 7dbf87d75c901463b2e4e4afdc4adb52)
- Partition 043 (md5: 97c071a1d0a170e9f4809f6cdc459a6b)
- Partition 044 (md5: 27dc707435a2c98fc7201ccefec68c9d)
- Partition 045 (md5: 54ce28e1e1b54ab9fd8dd072711acd30)
- Partition 046 (md5: b174c7826fc399312fad331660745e55)
- Partition 047 (md5: 2b6ce400051fce5d2de09fd8fd461fc8)
- Partition 048 (md5: 81415b29f2b6a605473af6d3529758b1)
- Partition 049 (md5: ffc9182d8a9ad9752b6571d3d2f2b69d)
- Partition 050 (md5: 790281fcf0512a798b6f0e75b14620be)
- Partition 051 (md5: 4fc5dc17819a3727e5cedaa89550ef9f)
- Partition 052 (md5: d33a70e926dee0e67cf1a75d50ee1249)
- Partition 053 (md5: 9d66e1372866dd454173f486d57ae322)
- Partition 054 (md5: 958b62e07349258d93ee3e089c6f91ff)
- Partition 055 (md5: 0e605a04d9bbeb0573aefddbfae12bd6)
- Partition 056 (md5: 29b205c649f66e3d44ea9f598b492bc2)
- Partition 057 (md5: 7336b91e333ae912b4cfc6e366570c54)
- Partition 058 (md5: 2d992482005a2523f710487f2c0a0a31)
- Partition 059 (md5: 3b45c205982796a90aa0f40955c4937b)
- Partition 060 (md5: f085ae6a4818c44d03a6f5adfc445699)
- Partition 061 (md5: 1c5a3a0ed8b53a930535b9d34e6a0667)
- Partition 062 (md5: fbfd4ffb7cf8fca4d613d0ec67d3104c)
- Partition 063 (md5: 9ddf7a9fe7e9cf8ceb02b8debed41fcc)
- Partition 064 (md5: ee3ac8080a19d4a6ab3af84074d03d7a)
- Partition 065 (md5: d94a12692d399c44612cab8b2aea8164)
- Partition 066 (md5: a9f3bfa69bbc248b33f99f42827331eb)
- Partition 067 (md5: 6c9d4b38edc6f78521f3cfdd8edc571c)
- Partition 068 (md5: 76a29683bfad7c4a0b8a0bdbbbd6fd49)
- Partition 069 (md5: f924667636c528d56e46aa92db0a182d)
- Partition 070 (md5: f813b0a4b2a4a2353c7deb539f16f286)
- Partition 071 (md5: fa56e2524ea2cc57e79f692466375b83)
- Partition 072 (md5: 23b1df220d55ab9df2735c74849a53c9)
- Partition 073 (md5: 70839cbc61d3d8af7fafcb7ba8f96461)
- Partition 074 (md5: 109b91ceda32ab0f8b9edb24cb35fb23)
- Partition 075 (md5: 53c466af09a3a119df3255189091bcda)
- Partition 076 (md5: 22ad2327db64767e34378508afe60706)
- Partition 077 (md5: 64c7c1702e3476137c54ebc0c07d970e)
- Partition 078 (md5: 6e2048a8a2ceb36bb679455e0af81230)
- Partition 079 (md5: 45717c24fe844f2605be81bd8e15d856)
- Partition 080 (md5: 1ac20637828f0f3115f1c0f289e006aa)
- Partition 081 (md5: e7b5e584de5f2cbda1d53ec2f6e2668e)
- Partition 082 (md5: aad214d168ad3a59488dfac71fcedc22)
- Partition 083 (md5: d557dee3b08c61d540fd6a00689341fa)
- Partition 084 (md5: cc2b4676515b988dd4f64724e49c3304)
- Partition 085 (md5: 34e6154991e5d5c641e22a529c5f06e1)
- Partition 086 (md5: 2f9ff4371f32c3a33ea081ad8825437e)
- Partition 087 (md5: 945504e89ba54cdab032eac63985d216)
- Partition 088 (md5: 46a8ba05cb12b268c7f7ce04575d24da)
- Partition 089 (md5: 5fd0219c9c99aa08ce07bb35e647144c)
- Partition 090 (md5: da0e3f19f81c99a89bcff7e8f74dc6cb)
- Partition 091 (md5: c11b11f3386d47dd33acc3cba7f44fb2)
- Partition 092 (md5: 87dfa60ae9308214b43aa7075ddd9f44)
- Partition 093 (md5: 6eced035881d3e804bea7103d26c042e)
- Partition 094 (md5: 59ebbc64994779244e5f7431c54b819e)
- Partition 095 (md5: 4de3c1f5163357a256847c1082379df3)
- Partition 096 (md5: cf16e88c803b82b052651171490d6d5a)
- Partition 097 (md5: bcf0e6944fb937bdda07a68530e63f01)
- Partition 098 (md5: 3d98181fa2e5526d30bab2a6dffb1f9d)
- Partition 099 (md5: 9fa25adca355abe3161060393b40de45)
- Partition 100 (md5: 12c7eabb92e0c9fe7ac4fcfa6f4a2795)
- Partition 101 (md5: 020bae3d98d9c5b2df8faca3f8e46ead)
- Partition 102 (md5: dd4d7a7c6d682271bb9d76cc8cd2f284)
- Partition 103 (md5: 8ea12825fced78d35d6e427c02db33db)
- Partition 104 (md5: 2137c06f010f11aa150a3e431fb502b3)
- Partition 105 (md5: 1c0e69e080eb86fc4f46bf91780f7dbe)
- Partition 106 (md5: c5acaa0cf6786fa2420fe938c564f743)
- Partition 107 (md5: dadc81bebb317516b57329cf8a79dd8d)
- Partition 108 (md5: 020d60709c8892d8abc24f2cb3abadd1)
- Partition 109 (md5: 1b3463481d8203bf617705d1becb86d5)
- Partition 110 (md5: e04fdfdd42d9e6b3f1cc10d54a0ea738)
- Partition 111 (md5: ee6b441916a8170fc3c59958180c9af0)
- Partition 112 (md5: 5eeef61c820be9c7826226d0b5eaadfb)
- Partition 113 (md5: 3b8d107886b0b0f2c7a046e96bcf6693)
- Partition 114 (md5: cbfbe53039a2196d8c043de6af850e2a)
- Partition 115 (md5: 69448ff84cac02071991de26dd60e9e6)
- Partition 116 (md5: 128875eca40e2ac2ef52653724afd579)
- Partition 117 (md5: 8c9b722f6f5cf25b26573a1f1d8807f1)
- Partition 118 (md5: d0ecc4997ed5b2e9db2d7418b55bf017)
- Partition 119 (md5: 899dec0634e97a2a1bdc73ee375b7c84)
- Partition 120 (md5: b17d8467e54d28f3a0748f8e0d86305b)
- Partition 121 (md5: 3d25e5440f17ad324d3b9176c31443dd)
- Partition 122 (md5: 3d89e66d0558babc5b42488e9d7e9b09)
- Partition 123 (md5: 5a48a1424b00933956a21582a66b4ef9)
- Partition 124 (md5: 02ca9ed9b6570a8e5fd8f862adc1ae9d)
- Partition 125 (md5: 7ed0458200f8499ee0529ae691460de0)
- Partition 126 (md5: 9f641a474a8eed64b658e48d0004cde0)
- Partition 127 (md5: a28c15458c2df4455d35c3d1b6f9d0f8)
- Partition 128 (md5: cd1f20b2f3dc7a6d293e2dcc30f3d70d)
- Partition 129 (md5: b269d0ad7ee3879ce92d10aa0b817f6a)
- Partition 130 (md5: 8179021a84f457f545265d80b061640d)
- Partition 131 (md5: 1104dd5a0c900d9862017b1196d9109f)
- Partition 132 (md5: 936403b29d7022f19e988caa5e4885c9)
- Partition 133 (md5: 97b539127ed11106ef75df3759391e92)
- Partition 134 (md5: a7409210b3f3ac08b14b5833b8dc97f4)
- Partition 135 (md5: c266e484108dd9beb2ae23bc6f17cedb)
- Partition 136 (md5: 3047a843da59b3020669116228961b3d)
- Partition 137 (md5: 83a41de61e307a4d184ec94a6d3dc5c7)
- Partition 138 (md5: 8c0ae1c83c4a968218a7193836613b48)
- Partition 139 (md5: 5a69a9b10a0b509ad18a4531901cb128)
- Partition 140 (md5: a6eda1f81e1d528b9bffd07c9e701ff6)
- Partition 141 (md5: 871020ffbc86e0abacc92174d00f97d6)
- Partition 142 (md5: 0fa6572159ec40ebda3400316bceb036)
- Partition 143 (md5: 1275a1587f52129a2eb31b1c4c0ca10c)
- Partition 144 (md5: cd983796c2c94d4dd0b19af8b134ef1f)
- Partition 145 (md5: d07b77482f5ae167fa8806f51ac0db3c)
- Partition 146 (md5: 4c39784af62b54a9596c8f9f2b71a2ba)
- Partition 147 (md5: bf2762af2fd81bacf30aed0b4c3141d5)
- Partition 148 (md5: c035ffce30821e292089cb0e0effc7f8)
- Partition 149 (md5: 0d045ac58956492f98b1937faae88d06)
- Partition 150 (md5: 4f9226fb735fc74916acab02c8fcb40a)
- Partition 151 (md5: 54fdf8a38733d0fe01add1c4695a7d89)
- Partition 152 (md5: 882f3c664b0ca06424f8007f80d64d58)
- Partition 153 (md5: 4b8b18a9a1f12047e0309365aecc4832)
- Partition 154 (md5: db8d6643f695cebca5076b51f5fa0c71)
- Partition 155 (md5: dec98ff1d34ab863b5bf2a0356001089)
- Partition 156 (md5: df65d570a34b03fe8b8944fcdba238b7)
- Partition 157 (md5: 0e7a3df6e9d466699fc96990502f96c4)
- Partition 158 (md5: 0d9df61f2eaa0b723fca237034ce74b6)
- Partition 159 (md5: 5acd09c60776e515831d1c2f547da1e1)
- Partition 160 (md5: 46142bd5341d2c5fae08a479dda540a8)
- Partition 161 (md5: 15fe73bc61b67762a6d8d99ac695bae7)
- Partition 162 (md5: b20122665bcf330c9305b000a759c0e3)
- Partition 163 (md5: 98f1933ee44988d0d60edd065aea745f)
- Partition 164 (md5: 27330b662904fe3921ec1fbe9a5e0a39)
- Partition 165 (md5: 64f500808fac28e412e53fae2560f287)
- Partition 166 (md5: e262612e2104dce1310b0aa85f0886b9)
- Partition 167 (md5: 5517cdbaf3d851fa5787711eaca8192b)
- Partition 168 (md5: 3f44a98c5d234498284e02702c8f3cd9)
- Partition 169 (md5: 14524b5fa02d41bd2c983322a4321099)
- Partition 170 (md5: d47d9bd68384dd0d4377cabac565f7e4)
- Partition 171 (md5: 63282d621013b2b41571e7d0bf517262)
- Partition 172 (md5: 2a2c642990854cc005ebbde51dbec56a)
- Partition 173 (md5: 32adcc1d1e9d122d2cffab3648cbd1b7)
- Partition 174 (md5: 52905ca54875717a3e3d4cdb5955df46)
- Partition 175 (md5: 11c847f4e695cab036315ae2428cf80b)
- Partition 176 (md5: dcb226eef40a0bef2fd2d5d26f13b88c)
- Partition 177 (md5: 0192f5f9618f119f7cba8f58f2f1fe68)
- Partition 178 (md5: bf9fc0582a6f1f5e4b419f6dd3ecc949)
- Partition 179 (md5: dd6a42f110d1be4041d1fa23403070c2)
- Partition 180 (md5: 817ea0768e2275a6240aefba5e9402b8)
- Partition 181 (md5: fdb66d1d5ec39338673e857c2aa69a87)
- Partition 182 (md5: e5bf39ba1d99e5337b294ae428a2c72d)
- Partition 183 (md5: d8de690d6db614c887c1799c9abfba89)
- Partition 184 (md5: adf710282e35b7cd1f0ea77fcdc32c5a)
- Partition 185 (md5: 4ee185a2889430f63f1960219a68ed78)
- Partition 186 (md5: e51dd1c3db3348ce6fa1d089bfbaaa26)
- Partition 187 (md5: c3b6b5476e3982bcc19b3026aa9786b4)
- Partition 188 (md5: 2eae48d373c6cb85e6b652a7db224f7f)
- Partition 189 (md5: 80719a5ce718ac1a7935ce5ca90e1ac7)
- Partition 190 (md5: f8ae9a7954e6ade74bfbbc10772b5b77)
- Partition 191 (md5: cf16a4d22b1f1e7c86b68bb51789e473)
- Partition 192 (md5: a797cbb62e49135fbc152ec8497d5370)
- Partition 193 (md5: 33e754109a7882fd7c068416859a0695)
- Partition 194 (md5: 7ff95e524daa1d937c3e65213ba901b7)
- Partition 195 (md5: 00c8789114fbaed7fb5f884fdca96346)
- Partition 196 (md5: e307840ffdbddaf7237a65efa8c85188)
- Partition 197 (md5: 9dcee9a1d3d576599d31dda1c8b38ff8)
- Partition 198 (md5: b57b2927e330af13858cfb6ff8ed13bd)
- Partition 199 (md5: 435452a0b0c8dd82aaa1ad9001e6e86e)
- Partition 200 (md5: e4aa7c85f2c1513a0669bf85bab92832)
- Partition 201 (md5: b3a40dbdd504d8e20e9b00924848975c)
- Partition 202 (md5: c48d830ee1710f23d125301172954e35)
- Partition 203 (md5: d60a220607d465dc3f6cf2779efb1262)
- Partition 204 (md5: 6fc8204a147898ba1d93945a78b33be8)
- Partition 205 (md5: 95b0fa324b0341668ba55675bed664d1)
- Partition 206 (md5: d75561496fd34938c4d6ec4c74f0543d)
- Partition 207 (md5: 37d370034bcd5503baf1fda12b184def)
- Partition 208 (md5: 1922135013379a366119367e780915ff)
- Partition 209 (md5: b60425e6503e5a3618d0019189aa6a17)
- Partition 210 (md5: 4b41c25ee6c34006d25e04ca66ee3bd4)
- Partition 211 (md5: 316efb5752cbedca3593a704f83178ce)
- Partition 212 (md5: a6279b0d13a200df4ec5f3aec5c54785)
- Partition 213 (md5: 102ce625ade4851c6ef77350c1d66bba)
- Partition 214 (md5: b56af7f1bb450d861d9eecfc681e3f09)
- Partition 215 (md5: 6524dbc2d7f6b8bc65c30ff68d400e00)
- Partition 216 (md5: c24fdefe4aa9a563f06d33367264e57d)
- Partition 217 (md5: 14e79f6d50ce4d2b91364ac176bb9170)
- Partition 218 (md5: 5b71e4c287589290c699d78597eb0fe0)
- Partition 219 (md5: 06220779ab619d8f6ec927b5a53f5bce)
- Partition 220 (md5: c7a09dc722e123506b801db090923317)
- Partition 221 (md5: d07c41c9dbf5f6fe7c0745c4323d4a36)
- Partition 222 (md5: 5fb46b69c2192b8c77a6505ccf0a3499)
- Partition 223 (md5: 036b920192151a51a561792cb3257ecf)
- Partition 224 (md5: a0a16ac031a6bafdba6c299282f5275a)
- Partition 225 (md5: 378aebc21351b13ba643bb83645ae860)
- Partition 226 (md5: d99c0ef473cec223269f1c91a6d99bc7)
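The partition-to-site ranges listed above can be captured in a small lookup, which is handy when only the data from one sequencing site is wanted. This is a sketch built directly from those ranges; the function and constant names are illustrative.

```python
from collections import Counter

# (first partition, last partition, sequencing site), as listed above.
PARTITION_SITES = [
    (1, 94, "NHGRI"),
    (95, 98, "University of Nottingham"),
    (99, 144, "NHGRI"),
    (145, 224, "University of Washington"),
    (225, 226, "UC Davis"),
]


def site_for_partition(number):
    """Return the sequencing site for a fast5 partition number (1-226)."""
    for first, last, site in PARTITION_SITES:
        if first <= number <= last:
            return site
    raise ValueError(f"partition {number} is outside the range 1-226")


# Count how many partitions each site contributed.
counts = Counter(site_for_partition(n) for n in range(1, 227))
for site, count in counts.items():
    print(f"{site}: {count} partitions")
```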
Approximately 50x of data was generated on a NovaSeq instrument. Based on the summary output of Supernova, there are 1.2 billion reads with 41x effective coverage. The mean molecule length is 130 kbp, with an N50 of 864 reads per barcode.
- CHM13_prep5_S13_L002_I1_001 (md5: 84af4586ca9f78060d5802b36cdd9e8a)
- CHM13_prep5_S13_L002_R1_001 (md5: 231633e0cf2fbdeba732dc7ad6233fa0)
- CHM13_prep5_S13_L002_R2_001 (md5: 386febfc3fc760e11e315e69310ed3d8)
- CHM13_prep5_S14_L002_I1_001 (md5: f0b7628e90dfaf2f702ec613c7b61ca7)
- CHM13_prep5_S14_L002_R1_001 (md5: 86afbc7a41ea1c81657bf1ca64d1178c)
- CHM13_prep5_S14_L002_R2_001 (md5: 3dfbe58b5ae715213e20614837dcf3b7)
- CHM13_prep5_S15_L002_I1_001 (md5: ee34f03c765787ea069050d8eaac1de4)
- CHM13_prep5_S15_L002_R1_001 (md5: 73edcb56dd18d7b7b2705b4db7b4efc5)
- CHM13_prep5_S15_L002_R2_001 (md5: a0de8e5bc127203129e4e1437b3e6aaa)
- CHM13_prep5_S16_L002_I1_001 (md5: 42db246f7e5725a7b6ff3f5f5aedfd6e)
- CHM13_prep5_S16_L002_R1_001 (md5: 3d3db7eccaf388fbcd901cbc6ad47630)
- CHM13_prep5_S16_L002_R2_001 (md5: 9dfcc17398a7acd906212a09ab4c8903)
Approximately 430x of data was generated using the Saphyr instrument and the DLE-1 enzyme. There are 15.2 M molecules with an N50 molecule length of 115.9 kbp and a max of 2.3 Mbp (2 M molecules > 150 kbp, N50 218 kbp). The assembly of the molecules is 2.97 Gbp in size with 255 contigs and an NG50 of 59.6 Mbp.
A library was generated using an Arima Genomics kit and sequenced to approximately 40x on an Illumina HiSeq X.
- CHM13.rep1_lane1_R1.fastq.gz (md5: 41d2f26eb1f958723e28e32ca471b680)
- CHM13.rep1_lane1_R2.fastq.gz (md5: 2747aaf1d128182bcaa151098e0abe74)
- CHM13.rep2_lane1_R1.fastq.gz (md5: 26ce58141bb25b4931512ec4cf176f64)
- CHM13.rep2_lane1_R2.fastq.gz (md5: 77b71bd1067c6e4e908a9aaa05f4bd73)
The PacBio data (both CLR and HiFi) was previously generated and is available from the SRA. The list of P6-C4 cells used for arrow polishing is available here.
Files are generously hosted by Amazon Web Services. Although available as straightforward HTTP links, download performance is improved by using the Amazon Web Services command-line interface. References should be amended to use the s3:// addressing scheme, i.e. replace https://s3.amazon.com/nanopore-human-wgs/ with s3://nanopore-human-wgs to download. For example, to download CHM13_prep5_S13_L002_I1_001.fastq.gz to the current working directory use the following command.
aws s3 --no-sign-request cp s3://nanopore-human-wgs/chm13/10x/CHM13_prep5_S13_L002_I1_001.fastq.gz .
To download the full dataset, use the following command.
aws s3 --no-sign-request sync s3://nanopore-human-wgs/chm13/ .
The s3 command can also be used to get information on the dataset, for example reporting the size of every file in human-readable format.
aws s3 --no-sign-request ls --recursive --human-readable --summarize s3://nanopore-human-wgs/chm13/
To obtain technology-specific sizes, list the corresponding prefix.
aws s3 --no-sign-request ls --recursive --human-readable --summarize s3://nanopore-human-wgs/chm13/nanopore/fast5
aws s3 --no-sign-request ls --recursive --human-readable --summarize s3://nanopore-human-wgs/chm13/nanopore/rel2
aws s3 --no-sign-request ls --recursive --human-readable --summarize s3://nanopore-human-wgs/chm13/assemblies
Amending the max_concurrent_requests etc. settings as per this guide will improve download performance further.
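When many links need converting, the HTTPS-to-s3:// rewrite described above can be scripted. The sketch below assumes the links are path-style, with the bucket name as the first path component, as in the example prefix given above. The helper names are hypothetical, and the script only builds the command line rather than running it.

```python
from urllib.parse import urlparse


def to_s3_uri(https_url):
    """Rewrite a path-style HTTPS link (host/bucket/key) as an s3://bucket/key URI."""
    path = urlparse(https_url).path.lstrip("/")
    if not path:
        raise ValueError(f"no bucket/key found in {https_url!r}")
    return "s3://" + path


def aws_cp_command(https_url, destination="."):
    """Build the argument list for an unsigned `aws s3 cp` of the given link."""
    return ["aws", "s3", "--no-sign-request", "cp", to_s3_uri(https_url), destination]


# Example link, using the prefix and file name quoted in the text above.
link = (
    "https://s3.amazon.com/nanopore-human-wgs/"
    "chm13/10x/CHM13_prep5_S13_L002_I1_001.fastq.gz"
)
print(" ".join(aws_cp_command(link)))
```

The returned list can be passed to `subprocess.run` if the AWS command-line interface is installed.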
Please raise issues concerning this dataset on this GitHub repository.
* rel1 and rel2: 2nd March 2019. Initial release.
* asm v0.6 and canu rel2 assembly: 28th May 2019. Assembly update.
* Hi-C data added: 25th July 2019. Data update.
* asm v0.6 alignments of rel2 added: 30th Aug 2019. Data update.
* rel3: 16th Sept 2019. Data update.
* chrX v0.7, canu 1.9 and flye 2.5 rel3 assembly: 24th Oct 2019. Assembly update.
* shasta rel3 assembly: 20th Dec 2019. Assembly update.
* chr8 v3, rel4 data: 21st Feb 2020. Data and assembly update.