# Browse College of Engineering by Contributor "Padua, David A."

• (2001)
The last analysis targets container objects that are provided by standard libraries. Object-oriented design plays an increasing role in performance-critical codes. When dealing with general-purpose programs, arrays share ...

• (2013-08-22)
Parallelization is one of the major challenges for programmers. But parallelizing existing code is a hard task that can lead to less than optimal solutions since sequential programs can suffer from impediments to ...

• (1995)
Memory-related anti- and output dependences are false dependences because they do not represent the flow of data but rather only the collisions caused by memory location reuse. Privatization is a technique to eliminate ...

• (2001)
After presenting the translation and optimization techniques utilized by the SPL compiler, empirical data is presented showing the efficiency of the resulting C/FORTRAN code. Timings are compared, using fast Fourier transform ...

• (1992)
Prolog has a number of advantages for use in rapid prototyping. The exploitation of parallelism holds the promise of making these prototypes directly executable. This dissertation addresses the parallel execution of Prolog ...

• (2013-08-22)
As the demand increases for high performance and power efficiency in modern computer runtime systems and architectures, programmers are left with the daunting challenge of fully exploiting these systems for efficiency, ...

• (1999)
We introduce two intermediate representations: the concurrent control flow graph, and the concurrent static single assignment form. Based on these representations, we develop an analysis technique, called concurrent global ...

• (2000)
We use stack distances to quantify locality and we show that the average locality computed using stack distances is a very reliable metric. A new algorithm for stack processing that is 30% faster than the best known algorithm ...

• (2000)
We have studied five different possible parallelization methods for irregular reduction loops, all of which can be applied automatically by a compiler. We compared their ease of use, applicability, supporting compiler ...

• (2015-04-20)
This thesis studies compilation and runtime techniques to improve the performance of dynamic scripting languages, using the R programming language as a test case. The R programming language is a convenient system for ...

• (2012-09-18)
Historically, the creators of parallel programming models have employed two different approaches to make their models available to developers: either by providing a library with hooks for common programming languages, by ...

• (1996)
This thesis addresses the issues of translating an interactive array language, such as MATLAB, into a traditional compiled language, such as Fortran, in order to achieve better performance. It describes the main ...

• (1997)
This dissertation explores the applicability of fully automatic parallelizing techniques for distributed memory multiprocessors. In the research, an ordinary Fortran 77 program is assumed as input, and no information is ...

• (2011-05-25)
Exploiting parallelism in modern machines increases the difficulty of developing applications. Thus, new abstractions are needed that facilitate parallel programming and at the same time allow the programmer to control ...

• (1996)
Despite rapid increases in CPU performance, the primary obstacles to achieving higher performance in contemporary processor organizations remain control and data hazards. Primary data cache misses are responsible for the ...

• (1992)
The optimization of programs with explicit (i.e., user-specified) parallelism requires the computation of the data dependence relation if optimizations performed by the compiler are to preserve sequential consistency. Shasha ...

• (2011-01-14)
This thesis presents a new, Java-based object-oriented parallel language called Deterministic Parallel Java (DPJ). DPJ uses a novel effect system to guarantee determinism by default. That means that parallel programs ...

• (2013-02-03)
Accelerator devices like the General Purpose Graphics Computing Units (GPGPUs) play an important role in enhancing the performance of many contemporary scientific applications. However, programming GPUs using languages ...

• (2011-08-26)
With the emergence of highly multithreaded architectures, an effective performance monitoring system must reflect the interaction between a large number of concurrent events, and associate the overall effect of individual ...

• (1996)
Reducing memory latency is critical to the performance of large-scale parallel systems. Due to the temporal and spatial locality of memory reference patterns, private caches can eliminate redundant memory accesses and ...
