|Title:||Measurement-based performance analysis and modeling of parallel systems|
|Doctoral Committee Chair(s):||Iyer, Ravishankar K.|
|Department / Program:||Electrical Engineering|
|Degree Granting Institution:||University of Illinois at Urbana-Champaign|
|Subject(s):||Engineering, Electronics and Electrical|
|Abstract:||The CPUs, memory, interconnection network, operating system, runtime system, I/O subsystem, and application characteristics all play an important role in determining the overall performance obtained from a parallel system. However, previous studies have mostly looked at the memory or the OS or the network performance in isolation. A global view of the overheads from different system perspectives has been lacking. In this dissertation, we characterize the overheads for large application benchmarks executing on the Cedar shared-memory parallel system from operating system, runtime system parallelization, and global memory and interconnection network contention perspectives.
Parallel systems are often used in multiprogrammed environments. However, the issue of scalability in multiprogrammed shared-memory parallel systems has not been studied before. We investigate the scalability of the Cedar system in multiprogrammed environments and show that scaling yields no performance improvement for fine-grained loop-parallel applications executing in multiprogrammed workloads. We also demonstrate that the overhead due to multiprogramming drops exponentially as the loop granularity increases. We then propose and implement a self-preemption technique to improve the performance of fine-grained applications in multiprogrammed environments.
To balance the processor performance of parallel systems with sufficient I/O performance, several parallel I/O systems have been developed in recent years. However, very little is understood about their performance. We characterize the performance of the PIOUS parallel I/O system on the DEC Alphacluster, via real system measurements, and show that the message-passing processing overheads at the compute and I/O nodes limit the throughput that they can sustain. We also use these measurements to provide realistic input parameters to PioSim, a parallel I/O simulation environment we have developed.
PioSim offers a number of unique features: (1) two architecture models--remote and local disk architecture models, (2) two usage models--simple and intelligent parallel I/O models, and (3) an application-oriented synthetic parallel I/O workload generator, PioSyn, capable of modeling a wide variety of temporal and spatial application file access patterns. We illustrate the potential of PioSim and PioSyn through experiments on the Alphacluster model, for scientific, database, and videoserver workloads.|
|Rights Information:||Copyright 1996 Natarajan, Chitra|
|Date Available in IDEALS:||2011-05-07|
|Identifier in Online Catalog:||AAI9712385|
This item appears in the following Collection(s)
Dissertations and Theses - Electrical and Computer Engineering
Dissertations and Theses in Electrical and Computer Engineering
Graduate Dissertations and Theses at Illinois
Graduate Theses and Dissertations at Illinois