Scaling short read de novo DNA sequence assembly to gigabase genomes
Cook, Jeffrey J.
Permalink
https://hdl.handle.net/2142/24291
Description
- Title
- Scaling short read de novo DNA sequence assembly to gigabase genomes
- Author(s)
- Cook, Jeffrey J.
- Issue Date
- 2011-05-25T15:03:13Z
- Director of Research (if dissertation) or Advisor (if thesis)
- Zilles, Craig
- Doctoral Committee Chair(s)
- Zilles, Craig
- Committee Member(s)
- Hudson, Matthew E.
- Lumetta, Steven S.
- Patel, Sanjay J.
- Wong, Martin D.F.
- Department of Study
- Electrical and Computer Engineering
- Discipline
- Electrical and Computer Engineering
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Date of Ingest
- 2011-05-25T15:03:13Z
- Keyword(s)
- de novo sequence assembly
- de Bruijn graph
- Eulerian assembly
- gigabase genome assembly
- Deoxyribonucleic Acid (DNA)
- short reads
- massively parallel sequencing
- Abstract
- The recent advent of massively parallel sequencing technologies has drastically reduced the cost of sequencing, sparking a revolution in whole genome de novo sequencing. However, these new technologies sample much shorter segments of DNA, called short reads, than conventional but more costly long read sequencing technologies, and suffer from higher and more varied error rates. Modern genome assembly tools compensate for these shortcomings by using de Bruijn graph based assembly techniques; however, for large genomes, the physical memory required to efficiently build and manipulate the de Bruijn graph generally far exceeds that which is available on modern commodity workstations. This dissertation develops novel out-of-core algorithms that permit conservative assembly of the de Bruijn graph using one to three orders of magnitude less memory than is required by the naïve approach. These algorithms are implemented in an open source genome assembly tool that replaces the front-end assembly process and can connect to existing back-end tools. This design attempts to decouple the phases that have performance concerns but simple heuristics from those that have complex heuristics but relatively straightforward implementations, so that each can be developed by domain experts.
- Graduation Semester
- 2011-05
- Permalink
- http://hdl.handle.net/2142/24291
- Copyright and License Information
- Copyright 2011 Jeffrey J. Cook
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Electrical and Computer Engineering
Dissertations and Theses in Electrical and Computer Engineering