Optimal round and sample-size complexity for partitioning in parallel sorting
Yang, Wentao
This item's files can only be accessed by the System Administrators group.
Permalink
https://hdl.handle.net/2142/115736
Description
Title
Optimal round and sample-size complexity for partitioning in parallel sorting
Author(s)
Yang, Wentao
Issue Date
2022-04-28
Director of Research (if dissertation) or Advisor (if thesis)
Solomonik, Edgar
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
parallel sorting
data partitioning
Abstract
State-of-the-art parallel sorting algorithms for distributed-memory architectures are based on computing a balanced partitioning via sampling and histogramming. By finding samples that partition the sorted keys into evenly-sized chunks, these algorithms minimize the number of communication rounds required. Histogramming (computing positions of samples) guides sampling, enabling a decrease in the overall number of samples collected. We derive lower and upper bounds on the number of sampling/histogramming rounds required to compute a balanced partitioning. We improve on prior results to demonstrate that when using p processors/parts, O(log∗ p) rounds with O(p/ log∗ p) samples per round suffice. We match that with a lower bound that shows any algorithm requires at least Ω(log∗ p) rounds with O(p) samples per round. Additionally, we prove the Ω(p log p) samples lower bound for one round, showing the optimality of sample sort in this case. To derive the lower bound, we propose a hard randomized input distribution and apply classical results from the distribution theory of runs.
Use this login method if you
don't
have an
@illinois.edu
email address.
(Oops, I do have one)
IDEALS migrated to a new platform on June 23, 2022. If you created
your account prior to this date, you will have to reset your password
using the forgot-password link below.