Files in this item

File: MAO-THESIS-2021.pdf (1MB)
Description: (no description provided)
Format: PDF (application/pdf)

Description

Title: Spatial vs. graphical representation of distributional semantic knowledge
Author(s): Mao, Shufan
Advisor(s): Willits, Jon Anthony
Department / Program: Psychology
Discipline: Psychology
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: M.S.
Genre: Thesis
Subject(s): Semantic memory; distributional models; spreading activation
Abstract: Distributional semantic models represent words in a vector space and perform well on a variety of semantic tasks. They are nonetheless limited in certain respects, such as representing different types of relations (e.g., syntagmatic vs. paradigmatic) in the same space and forming indirect semantic relations, which makes advanced tasks such as inferring the meaning of unseen phrases and solving analogies difficult. In this article, we propose a hybrid semantic model that encodes distributional data (word co-occurrence) in a graphical structure and measures lexical semantic relatedness with a spreading activation algorithm, which addresses these limitations of spatial models. We systematically investigated the modeling parameters that contribute to representational capability by manipulating and controlling the hyperparameters and testing the models on a selectional preference task. The models are trained on an artificial corpus, generated to describe ordered events that happen in a toy world simulation, which embeds verb-noun selectional preferences. The task requires the models to recover the verb-noun co-occurrence information and to make inferences about the selectional preference of verb-noun pairs absent from the corpus. We showed that both the graphical data structure with the spreading activation measure and the co-occurrence (information) encoding type contribute to better performance and to the capability to encode both syntagmatic and paradigmatic relations, form indirect semantic relationships, and make inferences about unseen word pairs. Because the hybrid graphical model is trained on corpus data, it is a semantic network derived from linguistic distributional statistics and offers a new way of learning and representing semantic knowledge.
Issue Date: 2021-04-27
Type: Thesis
URI: http://hdl.handle.net/2142/110573
Rights Information: Copyright 2021 Shufan Mao
Date Available in IDEALS: 2021-09-17
Date Deposited: 2021-05
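
Illustrative sketch (not taken from the thesis): the abstract describes encoding word co-occurrence in a graph and measuring relatedness by spreading activation. The Python below is a minimal toy version of that general idea, intended only to show how an unseen verb-noun pair can receive a nonzero score through indirect paths. The function names, the window size, the decay factor, the number of steps, and the four-sentence corpus are all assumptions made for this example and may differ substantially from the model and parameters used in the thesis.

```python
from collections import defaultdict


def build_cooccurrence_graph(sentences, window=1):
    """Build an undirected weighted graph from tokenized sentences.

    Edge weight = number of times two distinct words occur within
    `window` positions of each other (hypothetical encoding choice).
    """
    graph = defaultdict(lambda: defaultdict(float))
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                other = tokens[j]
                if other != word:
                    graph[word][other] += 1.0
                    graph[other][word] += 1.0
    return graph


def spreading_activation(graph, source, steps=3, decay=0.5):
    """Spread activation from `source` for a fixed number of steps.

    At each step, every active node passes `decay` times its activation
    to its neighbors, split in proportion to edge weight. Returns the
    activation accumulated at every node other than the source.
    """
    current = {source: 1.0}
    total = defaultdict(float)
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, activation in current.items():
            neighbors = graph.get(node, {})
            out_weight = sum(neighbors.values())
            if out_weight == 0:
                continue
            for neighbor, weight in neighbors.items():
                nxt[neighbor] += decay * activation * weight / out_weight
        for node, activation in nxt.items():
            total[node] += activation
        current = nxt
    total.pop(source, None)
    return dict(total)


# Toy corpus: "sip juice" never occurs, but "sip" and "juice" are linked
# indirectly via sip - water - drink - juice.
corpus = [["drink", "water"], ["drink", "juice"],
          ["sip", "water"], ["eat", "apple"]]
graph = build_cooccurrence_graph(corpus)
scores = spreading_activation(graph, "sip")
print(sorted(scores.items(), key=lambda item: -item[1]))
```

In this toy setup, the observed pair (sip, water) scores highest, the unseen but indirectly supported pair (sip, juice) receives a small positive score, and (sip, apple) receives none because no path connects the two words.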

