Improving reasoning capabilities of large language models
Dixit, Tanay
Permalink
https://hdl.handle.net/2142/129521
Description
Title
Improving reasoning capabilities of large language models
Author(s)
Dixit, Tanay
Issue Date
2025-04-11
Director of Research (if dissertation) or Advisor (if thesis)
Han, Jiawei
Department of Study
Siebel School of Computing and Data Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Natural Language Processing
Large Language Models
Abstract
Large Language Models (LLMs) have succeeded at solving a wide range of tasks, such as mathematical problem solving, code generation, and common-sense reasoning. The recent success of these models is largely attributed to scaling in both model size and training data, which often includes massive amounts of web-scale and synthetic data, raising important questions about their true generalization capabilities. Several studies have highlighted critical failure cases in the reasoning abilities of LLMs, such as token biases in logical problem solving and sensitivity to the order of premises, indicating a reliance on surface-level cues rather than true logical understanding. Additionally, these reasoning abilities are shown to emerge only when models are trained on extremely large datasets. This way of learning to reason deviates from how humans learn to reason and think. Humans learn to solve problems by first understanding and acquiring the fundamental principles involved in reasoning, and then learning to apply these principles to new tasks, rather than directly learning to solve hundreds of complex problems. Inspired by this, we aim to train LLMs to learn to reason with the help of axioms, the fundamental principles of reasoning, in particular causal axioms. Causal axioms form the crux of causal inference, which humans use to make decisions and inferences in many scenarios. The influence of causal axioms on the reasoning abilities of LLMs remains underexplored; in this work, we demonstrate that causal axiomatic training can enhance LLM performance across a broad range of reasoning tasks, even those not directly related to causality. Our extensive evaluation across 16 benchmarks shows that LLMs fine-tuned on our axiomatic data achieve stronger gains than baseline approaches on most tasks.
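To make "causal axiomatic training" concrete, below is a minimal sketch of how such fine-tuning data could be generated. It assumes, purely as an illustration since the abstract does not specify the data format, that each example instantiates the transitivity axiom of causality (if A causes B and B causes C, then A causes C); the variable names, the input/target fields, and the choice of axiom are all hypothetical rather than taken from the thesis.

    import random

    def make_transitivity_example(num_nodes=4, rng=random):
        """Build one premise/question/answer example from a random causal chain.

        Illustrative sketch: premises state the direct causal edges of a chain,
        and the question probes a transitive (or reversed, hence invalid) pair.
        """
        nodes = [f"X{i}" for i in range(num_nodes)]
        # Shuffle names so the answer cannot be read off surface order,
        # the kind of token bias the abstract warns about.
        rng.shuffle(nodes)
        premises = [f"{a} causes {b}." for a, b in zip(nodes, nodes[1:])]
        if rng.random() < 0.5:
            # Ancestor -> descendant along the chain: transitivity says Yes.
            i, j = sorted(rng.sample(range(num_nodes), 2))
            answer = "Yes"
        else:
            # Reversed pair: causation flows forward only, so the answer is No.
            j, i = sorted(rng.sample(range(num_nodes), 2))
            answer = "No"
        question = f"Does {nodes[i]} cause {nodes[j]}?"
        return {"input": " ".join(premises) + " " + question, "target": answer}

    if __name__ == "__main__":
        random.seed(0)
        for _ in range(3):
            print(make_transitivity_example())

Examples like these could then be used as supervised fine-tuning pairs; shuffling the node names and balancing Yes/No queries are design choices meant to force the model to use the stated premises rather than positional cues.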