Files in this item

Files:pasternack_jeffrey.pdf (848Kb)
Description:(no description provided)
Format:application/pdf (PDF)

Description

Title:Knowing Who to Trust and What to Believe in the Presence of Conflicting Information
Author(s):Pasternack, Jeffrey
Advisor(s):Roth, Dan
Contributor(s):Gil, Yolanda; Han, Jiawei; Zhai, ChengXiang
Department / Program:Computer Science
Discipline:Computer Science
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:Ph.D.
Genre:Doctoral
Subject(s):trust
trustworthiness
information trustworthiness
information trust
comprehensive trust metrics
subjective truth
fact-finders
fact-finding
factfinders
factfinding
generalized fact-finding
generalized fact-finders
constrained fact-finding
constrained fact-finders
generalized constrained models
GCMs
latent trust analysis
latent trustworthiness analysis
Latent Trust Analysis (LTA)
belief
information filtering
structured learning
constrained structured learning
constrained learning
Abstract:The Information Age has created an increasing abundance of data and, thanks to the rise of the Internet, made that knowledge instantly available to humans and computers alike. This is not without caveats, however: although we may read a document, ask an expert, or locate a fact nearly effortlessly, we lack a ready means of determining whether we should actually believe them. We seek to address this problem with a computational trust system capable of substituting for the user's informed, subjective judgment, with the understanding that truth is not objective but instead depends upon one's prior knowledge and beliefs, a philosophical point with deep practical implications. First, however, we must consider the even more basic question of how the trustworthiness of an information source can be expressed: measuring the trustworthiness of a person, document, or publisher as the mere percentage of true claims it makes can be extraordinarily misleading at worst and uninformative at best. Rather than simple accuracy, we provide a comprehensive set of trust metrics that calculate the source's truthfulness, completeness, and bias, presenting the user with our trust judgment in a way that is both understandable and actionable. We then consider the trust algorithm itself. Starting from the baseline of determining the truth by a simple vote that assumes all information sources are equally trustworthy, we move on to fact-finders, iterative algorithms capable of estimating the trustworthiness of sources in addition to the believability of claims, and then incorporate increasing amounts of information and declarative prior knowledge into the fact-finder's trust decisions via the Generalized and Constrained Fact-Finding frameworks, while still maintaining the relative simplicity and tractability of standard fact-finders. Ultimately, we introduce Latent Trust Analysis, a new type of probabilistic trust model that provides the first strongly principled view of information trust and a wide array of advantages over preceding methods, with a semantically crisp generative story that explains how sources "generate" their assertions in claims. Such explanations can be used to justify trust decisions to the user, and, moreover, the transparent mechanics make the models highly flexible, e.g., by applying regularization via Bayesian prior probabilities. Furthermore, as probabilistic models they naturally support semi-supervised and supervised learning when the truth of some claims or the trustworthiness of some sources is already known, unlike fact-finders, which perform only unsupervised learning. Finally, with Generalized Constrained Models, a new structured learning technique, we can apply declarative prior knowledge to Latent Trust Analysis models just as we can with Constrained Fact-Finding. Together, these trust algorithms create a spectrum of approaches that trade increasing complexity for greater information utilization, performance, and flexibility, although even the most sophisticated Latent Trust Analysis model remains tractable on a web-scale dataset. As our trust algorithms improve our ability to separate the wheat from the chaff, the curse of modern "information overload" may become a blessing after all.
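
To make the iterative fact-finding step concrete, the following is a minimal Python sketch of a simple Sums-style fact-finder of the general kind the abstract describes: each claim's belief is re-estimated as the summed trust of the sources asserting it, and each source's trust as the summed belief of the claims it asserts, alternating for a fixed number of rounds. The function name and toy data are illustrative assumptions, not drawn from the thesis.

    # Minimal sketch of a "Sums"-style iterative fact-finder, assuming a
    # bipartite source-claim structure; names and data are hypothetical.

    def fact_finder(assertions, iterations=20):
        """assertions: dict mapping each source to the set of claims it asserts."""
        sources = list(assertions)
        claims = {c for cs in assertions.values() for c in cs}

        trust = {s: 1.0 for s in sources}   # start all sources equally trusted
        belief = {c: 0.0 for c in claims}

        for _ in range(iterations):
            # Belief in a claim is the summed trust of the sources asserting it.
            for c in claims:
                belief[c] = sum(trust[s] for s in sources if c in assertions[s])
            # Trust in a source is the summed belief of the claims it asserts.
            for s in sources:
                trust[s] = sum(belief[c] for c in assertions[s])
            # Normalize both vectors so repeated summing does not overflow.
            bmax, tmax = max(belief.values()), max(trust.values())
            belief = {c: b / bmax for c, b in belief.items()}
            trust = {s: t / tmax for s, t in trust.items()}
        return trust, belief

    # Toy example: three sources disagree on the capital of Australia.
    votes = {
        "source_A": {"capital(Australia)=Canberra", "capital(US)=Washington"},
        "source_B": {"capital(Australia)=Canberra", "capital(US)=Washington"},
        "source_C": {"capital(Australia)=Sydney",   "capital(US)=Washington"},
    }
    trust, belief = fact_finder(votes)

With the toy data above, Canberra accrues more belief than Sydney, and sources A and B earn more trust than C, because the two estimates reinforce each other across iterations. The thesis's Generalized and Constrained frameworks and Latent Trust Analysis operate over this same source-claim structure while adding further information, declarative prior knowledge, and a probabilistic footing.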
Issue Date:2012-02-01
Genre:thesis
URI:http://hdl.handle.net/2142/29516
Rights Information:Copyright 2011 Jeffrey Pasternack
Date Available in IDEALS:2012-02-01; 2014-02-01
Date Deposited:2011-12

