Files in this item

Files                Description                 Format
Hua_Zhou.pdf (4MB)   (no description provided)   PDF (application/pdf)

Description

Title:Study of correlation analysis for protein-protein interactions
Author(s):Zhou, Hua
Director of Research:Jakobsson, Eric
Doctoral Committee Chair(s):Jakobsson, Eric
Doctoral Committee Member(s):Wraight, Colin A.; Chen, Lin-Feng; Ha, Taekjip; Gennis, Robert B.
Department / Program:Biochemistry
Discipline:Biochemistry
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:Ph.D.
Genre:Dissertation
Subject(s):correlation analysis; coevolution; mirrortree; protein interaction; Pearson's correlation
Abstract:Protein-protein interactions take place when two or more proteins bind together, usually to carry out a biological function. Many biological processes are carried out by interaction networks that consist of more than one protein-protein interaction. To understand how proteins function, we first need to understand their interaction networks. The availability of complete genome and protein sequences creates a need for computational methods that predict protein-protein interactions. Although many computational methods have been developed to study protein-protein interactions, most are based on the correlation of either the protein pairs' sequence information or their expression data. In this study, we first examined the whole-sequence approach to correlation analysis, also called the mirrortree method. Applied to datasets of interacting and non-interacting protein pairs among human and yeast, it clearly separated the interacting from the non-interacting pairs, with an AUC score above 0.70. Application to other datasets, such as interacting and non-interacting protein pairs among human or yeast and among human and mouse, did not give a clear separation (AUC score below or close to 0.60). We concluded that the mirrortree method should be applied to datasets that have a relatively large evolutionary coverage, and that this evolutionary coverage should be normalized. We then examined other factors that might affect the approach and found that the evolutionary span can significantly change the correlation scores, whereas the number of common species and the method used to calculate the distance matrices do not change them much.
We then studied motif/residue-based correlation analysis. We first verified it successfully on a known interacting protein pair, Kv1.2 and the β2 subunit. We then applied this method to epithelial sodium and chloride channels, including the pairs CFTR and ENaC, Clc2 and ENaC, and CACC and CFTR, to identify possible networks of sodium and chloride channels in the airway epithelia. This suggested possible interactions between Clc2 and the ENaC γ subunit and between CFTR and CACC3. We then applied the same motif/residue-based correlation analysis to the Ach receptor complex, studying subunit interactions in muscle-type, neuronal-type, and mixed-type Ach receptor complexes. We found significant interactions among muscle-type Ach receptor subunits and significant non-interactions between muscle-type and neuronal-type Ach receptor subunits, but, contrary to expectation, we did not find significant interactions among neuronal-type Ach receptor subunits. A possible reason is the variety of neuronal-type Ach receptor subunits: each may be co-evolving with many possible partner subunits, which leads to a lower correlation for any single pair.
We then studied site-specific (amino acid level) correlation analysis. We applied the approach to the specificity of toxins, i.e. conotoxins and scorpion toxins, for Kv channels. We successfully predicted possible hot spots in their interaction interfaces and mapped them onto 3D structures to get a general view of the interaction mechanism. We determined the connections between the variability of the turret region and the specificity of different toxins for their target channels. We also found a negative linear relationship between the correlation scores of residue pairs (from the interfaces of interacting protein pairs) and their distances in the 3D complex structures, which further verified and strengthened the site-specific approach to correlation analysis. The hot-spot information obtained from this approach was also used successfully to direct the docking of one toxin onto a voltage-gated potassium channel.
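Illustrative note (not taken from the dissertation): the whole-sequence correlation named in the abstract can be sketched, under simplifying assumptions, as Pearson's correlation between the off-diagonal entries of two proteins' evolutionary distance matrices, both built over the same species in the same order. The AUC is then the probability that a randomly chosen interacting pair scores higher than a randomly chosen non-interacting pair. All function names and the toy matrices below are hypothetical, and how the author computed distance matrices or normalized evolutionary coverage is not specified here.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Return the entries above the diagonal of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def mirrortree_score(dist_a, dist_b):
    """Correlate two distance matrices defined over the same ordered species."""
    if len(dist_a) != len(dist_b):
        raise ValueError("matrices must cover the same species")
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


def auc(positive_scores, negative_scores):
    """Fraction of (positive, negative) score pairs ranked correctly; ties count 0.5."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positive_scores
        for q in negative_scores
    )
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Made-up distances for two proteins across four shared species.
    protein_a = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
    protein_b = [[0, 2, 4, 6], [2, 0, 8, 10], [4, 8, 0, 12], [6, 10, 12, 0]]
    print("mirrortree score:", mirrortree_score(protein_a, protein_b))
    # Made-up scores for interacting and non-interacting pairs.
    print("AUC:", auc([0.9, 0.8, 0.4], [0.5, 0.2, 0.1]))
```

A score near 1 means the two proteins' distance matrices vary together across species, which this kind of analysis treats as evidence of co-evolution. An AUC near 0.5 means the scores do not separate the two classes.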
Issue Date:2014-05-30
URI:http://hdl.handle.net/2142/49533
Rights Information:Copyright 2014 Hua Zhou
Date Available in IDEALS:2014-05-30
Date Deposited:2014-05

