Title: Novel methods in transcriptome analysis using RNA-seq
Author(s): Li, Yang
Advisor(s): Ma, Jian
Department / Program: Bioengineering
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): splice junction; alternative splicing; gene fusion
Abstract: RNA-seq has proven to be a powerful technique for transcriptome profiling based on next-generation sequencing (NGS) technologies. Using RNA-seq, we address two critical challenges: identifying splice junctions (SJs) and annotating gene fusion transcripts.
Due to the limited read length of NGS data, it is extremely challenging to accurately map RNA-seq reads to SJs, a step that is important for the analysis of alternative splicing and isoform construction. In this thesis, we describe a novel method, called TrueSight, that combines information from (i) RNA-seq read mapping quality and (ii) coding potential from the reference genome sequences into a unified model that uses semi-supervised self-training to precisely identify SJs. Evaluations on both simulated and real data showed that TrueSight achieved higher sensitivity and specificity than existing tools. We also applied TrueSight to discover novel splice forms in honey bee transcriptomes that cannot be detected by other methods, and found that 94.6% of honey bee multi-exon genes are alternatively spliced. Using high-coverage transcriptome profiling data and a gene model enhanced by TrueSight, our quantitative analysis revealed that the expression ratio of the splice variants from a single gene is significantly correlated with the gene's exon-intron structure, splice site strength, and methylation pattern. We believe this new tool will be highly useful for comprehensively studying splice variants based on RNA-seq.
Fusion transcripts can be created as a result of genome rearrangement in cancer. Some of them play important roles in carcinogenesis and can serve as diagnostic and therapeutic targets. With more and more cancer genomes being sequenced by next-generation sequencing technologies, we believe an efficient tool for reliably identifying fusion transcripts will be desirable for many groups. With the alignment tool we developed, we designed and implemented an open-source software tool, called FusionHunter, which reliably identifies fusion transcripts from paired-end RNA-seq data. We show that FusionHunter can accurately detect fusions that were previously confirmed by RT-PCR in a publicly available dataset. The purpose of FusionHunter is to identify potential fusions with high sensitivity and specificity and to guide further functional validation in the laboratory.
Issue Date: 2012-05-22
Rights Information: Copyright 2012 Yang Li
Date Available in IDEALS: 2012-05-22
Date Deposited: 2012-05
