Files in this item



application/pdf: Evaluating_subl ... ors_for_big_data__UAP_.pdf (216kB)
(no description provided)


Title: Evaluating sublinear estimators for big data
Author(s): Brando Miranda
Subject(s): big data; sublinear algorithms; data bases
Abstract: Databases store ever more data, making it costly to scan every record they hold. Users nevertheless want to query their data to gain some understanding of it. In this paper we propose three sampling-based methods to estimate the total value of one attribute over a particular group of records in a data set. We first approximate the number of records belonging to the group and then estimate the mean value of the attribute within it. Multiplying these two estimates (the group size and the group mean) gives an estimate of the total amount the group contributes. We also argue the correctness of the proposed algorithms and evaluate each of them in practice by comparing them on real data.
Issue Date: 2014-05-16
Genre: Technical Report
Date Available in IDEALS: 2020-12-23
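
The estimate described in the abstract (group size times group mean) can be illustrated with a minimal sketch. It assumes plain uniform sampling with replacement over records stored as (group, value) pairs. This is not one of the report's three methods, which the abstract does not specify. The function name `estimate_group_total` and its parameters are hypothetical and chosen only for illustration.

```python
import random


def estimate_group_total(records, group, sample_size, seed=0):
    """Estimate the total of `value` over records whose label equals `group`.

    `records` is a sequence of (group_label, value) pairs (an assumed layout).
    Only `sample_size` records are read, so the cost does not grow with
    len(records) as long as the sequence supports constant-time indexing.
    """
    rng = random.Random(seed)
    n = len(records)
    # Draw a uniform sample with replacement.
    sample = [records[rng.randrange(n)] for _ in range(sample_size)]
    hits = [value for label, value in sample if label == group]
    if not hits:
        # The group never appeared in the sample; report zero.
        return 0.0
    est_count = n * len(hits) / sample_size  # estimated group size
    est_mean = sum(hits) / len(hits)         # estimated group mean
    return est_count * est_mean              # estimated group total


if __name__ == "__main__":
    data = [("a", 1.0) if i % 2 == 0 else ("b", 3.0) for i in range(1000)]
    print(estimate_group_total(data, "a", sample_size=200))
```

With this layout the product simplifies to n times the sample average of the group's values (counting non-members as zero), so the accuracy depends on how many sampled records fall in the group. Rare groups need larger samples.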
