Can LLMs implicitly learn numeric parameter constraints in data science APIs?
Deng, Yinlin
This item's files can only be accessed by the System Administrators group.
Permalink
https://hdl.handle.net/2142/127478
Description
Title
Can LLMs implicitly learn numeric parameter constraints in data science APIs?
Author(s)
Deng, Yinlin
Issue Date
2024-12-05
Director of Research (if dissertation) or Advisor (if thesis)
Zhang, Lingming
Department of Study
Siebel School Comp & Data Sci
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Program Synthesis
Evaluation
Dataset
Data Science
LLM4Code
Abstract
Data science (DS) programs, typically built on popular DS libraries (such as PyTorch and NumPy) with thousands of APIs, serve as the cornerstone for various mission-critical domains such as financial systems, autonomous driving software, and coding assistants. Recently, large language models (LLMs) have been widely applied to generate DS programs across diverse scenarios, such as assisting users for DS programming or detecting critical vulnerabilities in DS frameworks. Such applications have all operated under the assumption, that LLMs can implicitly model the numerical parameter constraints in DS library APIs and produce valid code. However, this assumption has not been rigorously studied in the literature. In this paper, we empirically investigate the proficiency of LLMs to handle these implicit numerical constraints when generating DS programs. We studied 28 widely used APIs from PyTorch and NumPy, and scrutinized the LLMs’ generation performance in different levels of granularity: full programs and all parameters completion. The results show that LLMs are great at generating simple DS programs, particularly those that follow common patterns seen in training data. However, as we increase the difficulty by providing more complex/unusual inputs, the performance of LLMs drops significantly.
Use this login method if you
don't
have an
@illinois.edu
email address.
(Oops, I do have one)
IDEALS migrated to a new platform on June 23, 2022. If you created
your account prior to this date, you will have to reset your password
using the forgot-password link below.