Files in this item

FilesDescriptionFormat

application/pdf

application/pdfSteven_Dalton.pdf (6MB)
(no description provided)PDF

Description

Title:Data parallel algebraic multigrid
Author(s):Dalton, Steven
Director of Research:Olson, Luke
Doctoral Committee Chair(s):Olson, Luke
Doctoral Committee Member(s):Gropp, William; Heath, Michael; Brown, Jed
Department / Program:Computer Science
Discipline:Computer Science
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:Ph.D.
Genre:Dissertation
Subject(s):graphics processing unit (GPU)
algebraic multigrid (AMG)
Sparse matrix
Abstract:Algebraic multigrid methods for large, sparse linear systems are central to many computational simulations. Parallel algorithms for such solvers are generally decomposed into coarse-grain tasks suitable for distributed computers with traditional processing cores. Accelerating multigrid methods on massively parallel throughput-oriented processors, on the other hand, demands algorithms with abundant fine-grained parallelism. In this dissertation we analyze and decompose the smoothed aggregation algebraic multigrid method into parallel primitives effectively mapped to emerging architectures. Our formulation is performed in two phases, the first building on high-level generic abstractions to construct our solver in a architecture agnostic manner. Though effective we show that performance is severely limited by irregular sparse matrix operations, most notably sparse matrix-matrix multiplication. In the second phase, we address this performance bottleneck using novel techniques to optimize irregular sparse matrix operations. This dissertation is also concerned with irregular graph operations necessary to partition sparse matrices into disjoint sets for parallel processing. We apply our solver to accelerate eigenvector computations necessary during spectral partitioning methods and find that performance is limited by multigrid construction. We propose and analyze a novel strategy to accelerate graph Laplacian eigenvector computations by combining iterative methods, namely blocked Lanczos, with breadth first search operations on graphs. By basing our partitioner on primitives with substantial parallelism we demonstrate notable performance improvement compared with generic eigensolvers.
Issue Date:2015-01-21
URI:http://hdl.handle.net/2142/72802
Rights Information:Copyright 2014 Steven Dalton
Date Available in IDEALS:2015-01-21
Date Deposited:2014-12


This item appears in the following Collection(s)

Item Statistics