Files in this item

FilesDescriptionFormat

application/pdf

application/pdfGregory_Ross.pdf (4MB)
(no description provided)PDF

Description

Title:High performance histogramming on massively parallel processors
Author(s):Ross, Gregory
Advisor(s):Hwu, Wen-Mei W.
Department / Program:Electrical & Computer Eng
Discipline:Electrical & Computer Engr
Degree Granting Institution:University of Illinois at Urbana-Champaign
Degree:M.S.
Genre:Thesis
Subject(s):General Purpose Computing on Graphics Processing Units (GPGPU)
Compute Unified Device Architecture (CUDA)
Histogram
Image Processing
Parallelism
Abstract:Histogramming is a technique by which input datasets are mined to extract features and patterns. Histograms have wide range of uses in computer vision, machine learning, database processing, quality control for manufacturing, and many applications benefit from advance knowledge about the distribution of data. Computing a histogram is, essentially, the antithesis of parallel processing. Without the use of slow atomic operations or serial execution when contributing data to a histogram bin in an input-driven method, there would likely be inaccuracies in the resulting output. An output-driven method would eliminate the need for atomic operations but would amplify read bandwidth requirements, reduce overall throughput, and result in a zero or negative gain in performance. We introduce a method to pack multiple bins into a memory word with the goal of better utilizing GPU resources. This method improves GPU occupancy relative to earlier histogram kernel implementations, increases the number of working threads to better hide the latency of atomic operations and collisions while maintaining reasonable throughput. This technique will be demonstrated to improve performance of histogram functions of various sizes with a variety of inputs, including a study on a particular application. While the results are heavily driven by dependancies on input data patterns, the conclusions gathered in this thesis will outline that the packed atomics histogramming kernel can and usually does outperform other implementations in all but a select number of exceptions.
Issue Date:2014-09-16
URI:http://hdl.handle.net/2142/50625
Rights Information:Copyright 2014 Gregory Ross
Date Available in IDEALS:2014-09-16
Date Deposited:2014-08


This item appears in the following Collection(s)

Item Statistics