Files in this item



OSEI-OWUSU-THESIS-2020.pdf (application/pdf, 399kB)


Title:CodeSimilarity: an approach for clustering introductory programming assignments
Author(s):Osei-Owusu, Jonathan
Advisor(s):Xie, Tao
Department / Program:Computer Science
Discipline:Computer Science
Degree Granting Institution:University of Illinois at Urbana-Champaign
Subject(s):programming education
Abstract:Enrollment in introductory programming (CS1) courses continues to surge, and hundreds of CS1 students can produce thousands of submissions for a single problem, all requiring timely feedback and accurate grading. Although the challenge is not exclusive to CS1 courses, instructors of such courses must provide feedback at scale (e.g., to hundreds of students). Because these students have a diverse range of skills and backgrounds, it is essential to differentiate the common strategies and shortcomings of student submissions to a given problem. There is a strong need to cluster submissions by the similarity of their strategies so that instructors can provide customized feedback to students. To fill this need, in this thesis we present the CodeSimilarity approach, which first automatically generates test data for correct student submissions and then uses semantic program features (i.e., path conditions) to cluster correct student submissions by their strategies. We define the strategy employed by a student submission as the way that the problem space is partitioned into sub-spaces and how the problem is uniquely addressed within each sub-space. In particular, CodeSimilarity leverages automated test generation based on symbolic execution to determine the path conditions for a given submission; comparing the path conditions of different submissions makes it possible to establish behavioral equivalence relationships with respect to the strategies employed by these submissions. We evaluate CodeSimilarity on four datasets to assess the effectiveness of our approach. The evaluation results show that, by using semantic program features (i.e., path conditions), CodeSimilarity can effectively cluster submissions that employ the same strategy.
Issue Date:2020-07-24
Rights Information:Copyright 2020 Jonathan Osei-Owusu
Date Available in IDEALS:2020-10-07
Date Deposited:2020-08
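
Illustrative sketch (not taken from the thesis): the abstract describes clustering correct submissions by how their path conditions partition the input space. The Python fragment below is a minimal sketch of that grouping step only. It assumes path conditions have already been extracted and normalized to strings, and it uses plain set equality as a stand-in for the behavioral equivalence check; the function name `cluster_by_path_conditions`, the student names, and the conditions are all hypothetical.

```python
from collections import defaultdict


def cluster_by_path_conditions(submissions):
    """Group submission names whose sets of path conditions are identical.

    Assumption: each path condition is a pre-normalized string. A real
    tool would obtain these from symbolic execution and compare them
    semantically rather than by string equality.
    """
    clusters = defaultdict(list)
    for name, conditions in submissions.items():
        # The order in which paths were explored should not matter,
        # so an unordered, hashable set is used as the cluster key.
        clusters[frozenset(conditions)].append(name)
    # Sort for a deterministic result.
    return sorted(sorted(group) for group in clusters.values())


# Hypothetical path conditions for three correct "absolute value" solutions.
submissions = {
    "student_a": ["x < 0", "x >= 0"],
    "student_b": ["x >= 0", "x < 0"],
    "student_c": ["x < 0", "x == 0", "x > 0"],
}

clusters = cluster_by_path_conditions(submissions)
print(clusters)
```

In this toy input, student_a and student_b split the input space the same way and land in one cluster, while student_c treats zero as a separate case and forms its own cluster.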
