Files in this item



KHANEJA-THESIS-2015.pdf (PDF, 828kB)


Title: An experimental study of monolithic scheduler architecture in cloud computing systems
Author(s): Khaneja, Gourav
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Cluster Scheduling, Cloud Computing, Performance Evaluation
Abstract: Scheduling in large-scale computing clusters is critical to job performance and resource utilization. As cluster sizes grow to thousands of machines and scheduling needs become complex and varied, scheduling in cloud-scale clusters presents unique challenges. To encourage the development of innovative schedulers, there is a need for an experimental framework that can analyze scheduling performance over large clusters using relatively modest resources. In this thesis, we present an experimental scheduler testbed for studying job scheduling in emulated cloud-scale clusters. We show that the performance of the scheduler in an emulated cluster closely models its performance in a real cluster of the same size. We use the testbed to evaluate the monolithic scheduler architecture, a popular scheduling architecture, in a 6000-node emulated cluster under a realistic workload. We conclude that scheduling algorithms should embrace randomness in order to beat resource contention. We infer that scheduling in the monolithic architecture is a network I/O-intensive process. We calculate the optimal values of the design parameters of the monolithic architecture for a Google workload. Hadoop YARN is a popular open-source cluster management framework that can be seen as an implementation of the monolithic scheduler architecture. We evaluate the three default scheduling policies in Hadoop YARN (Capacity, Fair, and Fifo) under a realistic workload. Based on our experiments, we observe that Fifo scheduling results in unbalanced load across cluster machines and is not suitable for enterprise clusters. We study the trade-offs exploited by the Capacity and Fair schedulers: while the Fair scheduler offers lower scheduling delay by avoiding the head-of-line blocking problem, it may drop applications when the load increases. The Capacity scheduler, on the other hand, does not drop any application but errs on the side of higher scheduling delay.
Issue Date: 2015-04-24
Rights Information: Copyright 2015 Gourav Khaneja
Date Available in IDEALS: 2015-07-22
Date Deposited: May 2015
