Files in this item
Files | Description | Format |
---|---|---|
application/pdf | (no description provided) |
Description
Title: | Remix and reuse of source code in software production |
Author(s): | Jones, M. Cameron |
Director of Research: | Twidale, Michael B. |
Doctoral Committee Chair(s): | Downie, J. Stephen |
Doctoral Committee Member(s): | Twidale, Michael B.; Smith, Linda C.; Karahalios, Karrie G. |
Department / Program: | Library & Information Science |
Discipline: | Library & Information Science |
Degree Granting Institution: | University of Illinois at Urbana-Champaign |
Degree: | Ph.D. |
Genre: | Dissertation |
Subject(s): | Software informatics; Remix programming; Clone analysis; Informetrics; Source code; Programming by Google |
Abstract: | The means of producing information and the infrastructure for disseminating it are constantly changing. The web mobilizes information in electronic formats, making it easier to copy, modify, remix, and redistribute. This has changed how information is produced, distributed, and used. People are not just consuming information; they are actively producing, remixing, and sharing information, using the web as a platform for creativity and production. This is true of software development as well. Programmers, and researchers who study software development, frequently remark that programmers copy and paste code. Although this practice is widely acknowledged, it is rarely studied directly or explicitly accounted for in models of software development. However, this attitude is changing as software becomes more ubiquitous and software development practice shifts away from the formal models of software engineering towards a post-modernist perspective. This study explores how source code snippets in programming books and on the web are changing software development practice. By examining program source code using clone detection algorithms, this study provides a comprehensive view of code copying across 6,190 PHP-language applications. These data are used to explore the concept of a "remix" method of software production, where software and systems are built out of copied and pasted snippets of code. These findings are contrasted with both traditional models of information production from informetrics (e.g., authorship, citation analysis) and models from software engineering (e.g., the Lego Hypothesis). Explanations for the observed phenomena are discussed using metaphors borrowed from linguistics, which provide a richer explanation of copy-paste programming than the Lego Hypothesis offers. The focus and findings of this study ultimately point to a pressing demand for further research centered on the notion of software as information.
Software and software repositories hold a large amount of information about how software is produced, and how it is used, adapted, and maintained. Software informatics is proposed as an organizing label for the study of information, practice, and communication around software. It studies the individual, collaborative, and social aspects of software production and use, spanning multiple representations of software from design, to source code, to application. |
Issue Date: | 2011-01-14 |
URI: | http://hdl.handle.net/2142/18255 |
Rights Information: | Copyright 2010 M. Cameron Jones. Some Rights Reserved. This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA. |
Date Available in IDEALS: | 2011-01-14 |
Date Deposited: | 2010-12 |
This item appears in the following Collection(s)
- Graduate Dissertations and Theses at Illinois: Graduate Theses and Dissertations at Illinois
- Dissertations and Theses - Information Sciences: Dissertations and theses from the School of Information Sciences