Files in this item



WC1_iconf08.pdf (application/pdf, 26kB)


Title: Is There a Cloud in your Future? Applications of "Cloud Computing" to Web-scale Problems
Author(s): Lin, Jimmy
Subject(s): cloud computing
Abstract: Proposal for a “wildcard” session, iConference 2008. Organizer: Jimmy Lin, Assistant Professor, College of Information Studies, University of Maryland, College Park.

1. Background

IBM and Google recently committed a total of $30 million over two years to an initiative on “cloud computing”, in collaboration with six universities across the country (see references): Berkeley, Carnegie Mellon, MIT, Stanford, the University of Maryland, and the University of Washington. I am the leader of this initiative at the University of Maryland and, to my knowledge, the only participant from an iSchool (the rest are led by faculty in computer science departments).

“Cloud computing” refers to technology for exploiting large computer clusters to tackle “Web-scale” information processing problems, where immense quantities of data make traditional sequential processing impractical. This initiative focuses on Google’s MapReduce programming paradigm, which was designed specifically for processing extremely large data sets (and is indeed used by Google itself for much of its production operations). Programs written in the MapReduce functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program’s execution across a set of machines, handling machine failures, and managing the required inter-machine communication. Hadoop is an open-source implementation of the MapReduce framework. As part of this initiative, IBM and Google are making Hadoop clusters available to the university collaborators, with the dual goal of advancing research and education.
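To make the MapReduce style described above concrete, here is a minimal single-process sketch in Python of the classic word-count example. It is an illustration only, not Hadoop's or Google's API: the names `map_word_count`, `reduce_word_count`, and `run_mapreduce` are hypothetical, and the tiny driver merely stands in for the run-time system, which on a real cluster would partition the input, schedule tasks across machines, and handle failures.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator


def map_word_count(document: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def reduce_word_count(word: str, counts: Iterable[int]) -> tuple[str, int]:
    """Reduce step: sum all the counts emitted for a single word."""
    return word, sum(counts)


def run_mapreduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterable[tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], tuple[str, int]],
) -> dict[str, int]:
    """Hypothetical in-memory driver standing in for the cluster run-time.

    A real framework would run mappers and reducers in parallel on many
    machines; here everything runs sequentially in one process.
    """
    # Shuffle phase: group every intermediate value under its key.
    groups: dict[str, list[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce phase: one reducer call per distinct key.
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


if __name__ == "__main__":
    docs = ["the cloud", "the web the cloud"]
    print(run_mapreduce(docs, map_word_count, reduce_word_count))
```

The point of the style is that the programmer writes only the two small functions; because each mapper call sees one record and each reducer call sees one key, the framework is free to distribute those calls across machines.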
For the past two months, the Computational Linguistics and Information Processing (CLIP) Lab at the University of Maryland has been actively exploiting this resource for research in natural language processing and information retrieval. The explosive growth of information on the Web and in easily accessible digital formats forces us to think “outside the box” when tackling data-intensive “Web-scale” problems. Researchers must think about and analyze data at a massively parallel scale or face the prospect of being relegated to working on “toy” problems. “Cloud computing” could potentially provide the infrastructure that allows researchers to tackle “Web-scale” challenges at a reasonable cost. From an educational point of view, the ability to think about problems in terms of parallel processing algorithms will become a critical skill in tomorrow’s work force. “Cloud computing” is an emerging technology that iSchools cannot afford to ignore.

2. Goals

• To introduce the iSchool community to “cloud computing” and the MapReduce framework
• To provide the iSchool community with an overview of research and education efforts currently underway
• To begin a discussion on the implications of “cloud computing” for research and education in iSchools
Issue Date: 2008-02-28
Genre: Conference Paper / Presentation
Date Available in IDEALS: 2010-03-11
