Files in this item:Malik Nadeem_Akhtar.pdf (application/pdf, 2MB), Restricted Access

Title:Evaluation and assignment of significance levels to peptide identifications from the database search programs using resampling approach
Author(s):Akhtar, Malik Nadeem
Director of Research:Rodriguez-Zas, Sandra L.
Doctoral Committee Chair(s):Rodriguez-Zas, Sandra L.
Doctoral Committee Member(s):Sweedler, Jonathan V.; Villamil, Maria B.; Caetano-Anollés, Gustavo
Department / Program:Animal Sciences
Discipline:Animal Sciences
Degree Granting Institution:University of Illinois at Urbana-Champaign
Subject(s):Database search programs; Tandem mass spectrometry (MS); Significance levels; k-permuted Decoy database
Abstract:A novel application of Monte Carlo permutation testing that improves the calculation of peptide match significance levels and the detection rate in database search programs is demonstrated. Novel k-permuted decoy databases (where k denotes the type and number of permutations) were evaluated for accurate computation of match significance levels. The k-permuted decoy databases were generated by: (a) complete permutation of peptide sequences (Whole), (b) permutation of terminal positions of peptide sequences (End), and (c) permuted peptides that fall within a certain mass tolerance of the tandem mass spectra (Mass-based). The ‘Whole’ and ‘End’ permutation tests were performed using various indicators of peptide match quality in OMSSA, Crux, and X! Tandem on manually annotated neuropeptide tandem mass spectra. Permutation p-values were calculated as the fraction of the permutations in the k-permuted databases with a match indicator score at least as extreme as that of the original spectrum match in the target database. The ‘Whole’ k-permuted decoy databases identified most (up to 100%) neuropeptides, while the ‘End’ k-permuted decoy databases discriminated better between the performance of the match indicators. The permutation test based p-values using the hyperscore (X! Tandem), E-value (OMSSA) and Sp score (Crux) match indicators outperformed the other match indicators in the database search programs. The simple match indicator “number of matched ions” performed comparably to the best match indicators in OMSSA, X! Tandem, and Crux. Databases of at least 10^5 k-permuted decoy peptides per spectrum provided accurate p-values. Overall, the ‘Whole’ and ‘End’ k-permuted decoy databases improved the consensus among the database search programs.
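The permutation p-value described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the example peptide, and especially `toy_score` (a hypothetical stand-in for a real match indicator such as the hyperscore or the number of matched ions) are assumptions made for illustration, and only the ‘Whole’ permutation scheme is shown.

```python
import random


def permute_whole(peptide, rng):
    """Return one 'Whole'-style decoy: every residue of the peptide is shuffled."""
    residues = list(peptide)
    rng.shuffle(residues)
    return "".join(residues)


def permutation_p_value(target_score, decoy_scores):
    """Fraction of decoy scores at least as extreme (here: as high) as the target score."""
    if not decoy_scores:
        raise ValueError("need at least one decoy score")
    hits = sum(1 for s in decoy_scores if s >= target_score)
    return hits / len(decoy_scores)


def toy_score(candidate, reference):
    """Hypothetical match indicator: number of positions agreeing with a reference sequence."""
    return sum(a == b for a, b in zip(candidate, reference))


rng = random.Random(0)      # fixed seed so the sketch is reproducible
peptide = "YGGFMRF"         # illustrative sequence, not taken from the thesis data
k = 1000                    # the abstract reports that at least 10^5 decoys per spectrum were needed

decoys = [permute_whole(peptide, rng) for _ in range(k)]
decoy_scores = [toy_score(d, peptide) for d in decoys]
p_value = permutation_p_value(toy_score(peptide, peptide), decoy_scores)
print(p_value)
```

With k decoys the smallest nonzero p-value this estimator can return is 1/k, which is one reason a large number of permutations per spectrum matters.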
The ability of the k-permuted decoy databases to improve the classification of correct and incorrect peptide matches was evaluated with the ‘Mass-based’ k-permuted decoy database using the best match indicator in OMSSA (i.e., the E-value). The evaluation was performed by searching 5806 tryptic tandem mass spectra (671 with annotated peptide entries) against the standard target and combined target-decoy databases. False discovery rate estimates based on the target-decoy approach and known identities of the annotated spectra were used to filter the peptide-spectrum matches. The k-permuted decoy database approach detected up to 89% and 87% of the annotated peptides in the target database and target-decoy database, respectively, compared with 82% and 84% for OMSSA’s E-value. The improvement in performance was due to better performance of the k-permuted decoy database on small and large peptides with fewer than 13 matched fragment ions and large (insignificant) OMSSA E-values.
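Target-decoy filtering of peptide-spectrum matches can be sketched as follows. The abstract does not state which false discovery rate formula was used, so this example assumes one common convention (accepted decoy matches divided by accepted target matches); the function name, the input layout, and the example scores are hypothetical. Scores are taken as higher-is-better, so an E-value would first be transformed, for example to -log10(E-value).

```python
def target_decoy_fdr(psms, threshold):
    """Estimate the FDR among matches scoring at or above the threshold.

    psms: list of (score, is_decoy) pairs, where a higher score is a better match.
    Assumed convention: FDR = accepted decoys / accepted targets, capped at 1.
    """
    accepted = [is_decoy for score, is_decoy in psms if score >= threshold]
    n_decoy = sum(accepted)
    n_target = len(accepted) - n_decoy
    if n_target == 0:
        return 0.0  # nothing accepted from the target database
    return min(1.0, n_decoy / n_target)


# Made-up scores for illustration only
psms = [(9.0, False), (8.0, False), (7.0, True), (6.0, False), (5.0, True)]
print(target_decoy_fdr(psms, threshold=6.0))
print(target_decoy_fdr(psms, threshold=8.0))
```

Raising the threshold lowers the estimated false discovery rate at the cost of accepting fewer matches.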
Issue Date:2015-01-21
Rights Information:Copyright 2014 Malik Nadeem Akhtar
Date Available in IDEALS:2015-01-21
Date Deposited:2014-12
