Permalink
https://hdl.handle.net/2142/115298
Description
Title
Next generation DNA-based data recorders
Author(s)
Tabatabaei, Seyed Kasra
Issue Date
2022-04-18
Director of Research (if dissertation) or Advisor (if thesis)
Milenkovic, Olgica
Doctoral Committee Chair(s)
Schroeder, Charles
Committee Member(s)
Aksimentiev, Oleksii
Lu, Yi
Department of Study
School of Molecular & Cell Bio
Discipline
Biophysics & Quant Biology
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
DNA
Data Storage
Nicking
DNA Sequencing
Metadata
Modified Nucleotides
Neural Networks
Nanopores
Abstract
DNA-based data storage systems have received significant attention in the synthetic biology, computer science, and information theory communities due to their promise of ultrahigh storage density, recording durability, energy efficiency, environmental friendliness, and potential for integration with in-memory computing platforms. In such systems, user content is stored in synthetic DNA oligos composed of the natural DNA nucleotides (A, T, C, and G) and retrieved via next-generation (e.g., Illumina) or third-generation (e.g., Oxford Nanopore) sequencing technologies. Despite recent advances in DNA synthesis and sequencing methods, all known DNA-based data storage platforms suffer from high cost, read-write latency, and significant error rates that render them noncompetitive with modern electronic storage devices.
Here, we introduce new approaches for encoding and reading information in DNA molecules. We first demonstrate that readily available native DNA extracted from living cells, rather than synthetic DNA, can be used to store information. We further show that information can be stored in the topology of the sugar-phosphate backbone in the form of single-bond breaks known as ‘nicks’, rather than only in the sequence content. Information written in nicks can be retrieved via a commonly used sequencing platform such as Illumina MiSeq, as in synthetic DNA-based data storage systems. We also demonstrate that nick-based and sequence content-based recording approaches can be combined to generate a two-dimensional data storage system, in which the sequence is reserved for archival data and metadata is written in the backbone of the molecule. In a third project, we introduce a fundamentally new concept for a prototype of a DNA-based recorder that uses an extended DNA alphabet comprising the four canonical DNA nucleotides and seven chemically modified nucleotides. A DNA data storage platform with this extended alphabet holds the potential for a ~2-fold increase in information storage density. We demonstrate that combinatorial patterns generated from these additional nucleobases and the natural nucleotides can be accurately discriminated using MspA and Oxford nanopores, making them suitable candidates for carrying digital information. Overall, the work presented in this thesis fundamentally advances the field of macromolecular data storage.
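As a rough illustration of the density claim for the extended alphabet, the following sketch computes the idealized information content per nucleotide position for 4 versus 11 symbols. It is not taken from the thesis: it assumes every symbol is equally likely and read without error, and the function name `bits_per_symbol` is hypothetical. Under those assumptions the gain is about 1.73-fold, the same order as the approximate 2-fold figure quoted in the abstract; the thesis may define density differently.

```python
import math


def bits_per_symbol(alphabet_size: int) -> float:
    """Idealized bits carried by one position, assuming uniform, error-free symbols."""
    return math.log2(alphabet_size)


# Four canonical nucleotides (A, T, C, G).
natural_bits = bits_per_symbol(4)

# Four canonical plus seven chemically modified nucleotides (11 symbols total).
extended_bits = bits_per_symbol(4 + 7)

# Ratio of per-position capacity: an upper bound under the stated assumptions.
density_gain = extended_bits / natural_bits

print(f"natural alphabet:  {natural_bits:.2f} bits per nucleotide")
print(f"extended alphabet: {extended_bits:.2f} bits per nucleotide")
print(f"idealized gain:    {density_gain:.2f}x")
```

In practice, any constraints on which symbols may follow one another, plus error-correction overhead, would lower the achievable figure below this bound.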