Crafting unusual programs for fuzzing deep learning libraries
Yang, Shujing
Permalink
https://hdl.handle.net/2142/120389
Description
- Title
- Crafting unusual programs for fuzzing deep learning libraries
- Author(s)
- Yang, Shujing
- Issue Date
- 2023-04-20
- Director of Research (if dissertation) or Advisor (if thesis)
- Zhang, Lingming
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Software Engineering
- Language Models
- Language
- eng
- Abstract
- Deep Learning (DL) applications play a vital role in modern society. Bugs in DL libraries can significantly impact a wide range of downstream DL applications, making it crucial to develop effective testing techniques for these libraries. Generating valid input programs for fuzzing DL libraries is challenging, as they must adhere to both the syntax/semantics of the supported languages (e.g., Python) and the tensor/operator constraints required for constructing valid computational graphs. The recent TitanFuzz work has shown, for the first time, that modern Large Language Models (LLMs) can be directly employed to implicitly learn all language and DL computation constraints to create valid programs for fuzzing DL libraries. However, LLMs tend to generate ordinary programs that follow patterns/tokens similar to typical programs found in their vast training corpora (e.g., GitHub), whereas fuzzing favors unusual inputs that cover edge cases or are less likely to be manually produced. To address this challenge, we propose AtlasFuzz, the first technique to prime LLMs for synthesizing unusual programs to enhance fuzzing effectiveness. AtlasFuzz is based on the well-established hypothesis that historically bug-triggering programs may contain rare and valuable code elements crucial for bug discovery. While traditional techniques leveraging such historical information demand extensive human effort to design dedicated generators and ensure the syntactic/semantic validity of the generated programs, AtlasFuzz demonstrates that this process can be fully automated through the intrinsic capabilities of LLMs (including fine-tuning and in-context learning) and is generalizable and applicable to challenging domains. Furthermore, AtlasFuzz also highlights the potential of directly utilizing the instruction-following capability of the recent ChatGPT for effective fuzzing.
Our experimental study on two popular DL libraries (PyTorch and TensorFlow) reveals that AtlasFuzz is an effective fuzzer for DL libraries, detecting 18 bugs, including 10 already confirmed as previously unknown bugs.
- Graduation Semester
- 2023-05
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/120389
- Copyright and License Information
- Copyright 2023 Shujing Yang
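The in-context learning idea mentioned in the abstract (priming an LLM with historical bug-triggering programs so that it produces unusual ones) can be illustrated with a minimal prompt-building sketch. This is an assumption-laden illustration, not the thesis's implementation: the `BugExample` schema, the `build_few_shot_prompt` helper, the prompt wording, and the placeholder example data are all hypothetical, and no LLM is called.

```python
from dataclasses import dataclass


@dataclass
class BugExample:
    """One historical bug report, reduced to the fields the prompt uses (hypothetical schema)."""
    api: str          # library API named in the report
    description: str  # one-line summary of the bug
    code: str         # the bug-triggering snippet


def build_few_shot_prompt(examples, target_api, library="PyTorch"):
    """Assemble an in-context prompt: complete examples first, then an open slot for target_api."""
    parts = [f"The following are programs that triggered bugs in {library}.\n"]
    for ex in examples:
        parts.append(
            f"# API: {ex.api}\n"
            f"# Bug description: {ex.description}\n"
            f"{ex.code.strip()}\n"
        )
    # The last entry is left unfinished so the model continues it with new code.
    parts.append(f"# API: {target_api}\n# Bug description:")
    return "\n".join(parts)


# Placeholder data for illustration only; these are not real bug reports.
examples = [
    BugExample(
        api="torch.add",
        description="placeholder description of an edge-case failure",
        code="import torch\nx = torch.zeros(0)\ny = torch.add(x, x)",
    ),
]
prompt = build_few_shot_prompt(examples, target_api="torch.mul")
print(prompt)
```

In a full pipeline the returned string would be sent to a code-generating model and each completion executed against the library under test; that part is omitted here because the record does not describe it.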
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Siebel School of Computer Science