Authors
William H Press, John A Hawkins, Stephen K Jones, Jeffrey M Schaub, Ilya J Finkelstein
Publication date
2020/8/4
Journal
Proceedings of the National Academy of Sciences
Volume
117
Issue
31
Pages
18489-18496
Publisher
National Academy of Sciences
Description
Synthetic DNA is rapidly emerging as a durable, high-density information storage platform. A major challenge for DNA-based information encoding strategies is the high rate of errors that arise during DNA synthesis and sequencing. Here, we describe the HEDGES (Hash Encoded, Decoded by Greedy Exhaustive Search) error-correcting code that repairs all three basic types of DNA errors: insertions, deletions, and substitutions. HEDGES also converts unresolved or compound errors into substitutions, restoring synchronization for correction via a standard Reed–Solomon outer code that is interleaved across strands. Moreover, HEDGES can incorporate a broad class of user-defined sequence constraints, such as avoiding excess repeats, or too high or too low windowed guanine–cytosine (GC) content. We test our code both via in silico simulations and with synthesized DNA. From its measured performance, we …
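The sequence constraints named in the description (avoiding excess repeats, and keeping windowed GC content from going too high or too low) can be illustrated with a minimal Python sketch. This is not the HEDGES encoder or decoder, and it is not taken from the paper: the function names, the window length, and the thresholds below are hypothetical values chosen only for illustration.

```python
def max_homopolymer_run(seq: str) -> int:
    """Length of the longest run of one repeated base (0 for an empty strand)."""
    longest = 0
    run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        prev = base
        longest = max(longest, run)
    return longest


def windowed_gc_fractions(seq: str, window: int) -> list[float]:
    """GC fraction of every sliding window; a short strand counts as one window."""
    if not seq:
        return []
    if len(seq) <= window:
        return [sum(b in "GC" for b in seq) / len(seq)]
    return [
        sum(b in "GC" for b in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def satisfies_constraints(
    seq: str,
    max_run: int = 4,      # hypothetical limit on repeated bases
    window: int = 12,      # hypothetical window length
    gc_min: float = 0.35,  # hypothetical lower GC bound
    gc_max: float = 0.65,  # hypothetical upper GC bound
) -> bool:
    """True if the strand has no over-long repeat and every window is GC-balanced."""
    if max_homopolymer_run(seq) > max_run:
        return False
    return all(gc_min <= f <= gc_max for f in windowed_gc_fractions(seq, window))
```

A checker of this kind only tests a finished strand. A code such as the one described would instead need to respect the constraints while choosing each base during encoding, which this sketch does not attempt.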
Total citations
Citations per year, 2019 to 2024 (bar chart; per-year counts run together in extraction: 1617243633)
Scholar articles
WH Press, JA Hawkins, SK Jones Jr, JM Schaub… - Proceedings of the National Academy of Sciences, 2020