|Title:||Benford’s Law in arbitrary datasets|
Benford’s law describes the frequency distribution of leading digits in sets of numbers that span multiple orders of magnitude. Specifically, it concerns the most significant digit of each value. For example, the leading digit of 392 is 3, that of 1042 is 1, and so on. In the following datasets, I’m going to plot the leading digits of certain values to demonstrate how this works. To extract the leading digit of a datapoint x, I use the formula FLOOR(x/POWER(10, FLOOR(LOG(x,10)))).
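The leading-digit formula can be sketched in Python as follows. This is a minimal illustration, not the code used for the datasets: the function name `leading_digit` is hypothetical, and the sketch assumes values of at least 1 (such as counts or populations), since fractional values are more exposed to floating-point rounding.

```python
import math


def leading_digit(x):
    """Return the most significant decimal digit of x (assumes x >= 1)."""
    if x < 1:
        raise ValueError("this sketch assumes x >= 1")
    # FLOOR(LOG(x, 10)): the order of magnitude of x.
    exponent = math.floor(math.log10(x))
    # FLOOR(x / POWER(10, exponent)): strip everything after the first digit.
    digit = int(x // 10 ** exponent)
    # Guard against log10 rounding right next to a power of ten.
    if digit == 0:
        return 9
    if digit == 10:
        return 1
    return digit


print(leading_digit(392))   # 3
print(leading_digit(1042))  # 1
```

`math.log10` is used rather than `math.log(x, 10)` because the two-argument form can return values such as 2.9999999999999996 for 1000, which would put the floor one order of magnitude too low.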
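Though you might expect a bunch of random numbers to have an equal chance of getting any given leading digit (1 in 9, or about 11.1%), Benford’s law states that this is far from the case. There’s around a 30% chance of getting 1 and around a 5% chance of getting 9. In general, the probability of leading digit d is log10(1 + 1/d), so the graph falls off logarithmically. The logarithmic scale is also the reason for this law: values that span many orders of magnitude tend to be spread roughly evenly on a logarithmic scale, where the interval from 1 to 2 is much wider than the interval from 9 to 10.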
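As a rough check of these percentages, the sketch below computes the expected Benford probabilities, log10(1 + 1/d), and compares them with the leading digits of the first 1000 powers of two. The powers of two are only an illustrative stand-in chosen for this example, not one of the three datasets, and the helper names are hypothetical.

```python
import math
from collections import Counter


def benford_expected():
    """Expected probability of each leading digit d: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def observed_frequencies(values):
    """Share of each leading digit; assumes positive integers."""
    # For a positive integer, the first character of its decimal
    # string is its leading digit.
    counts = Counter(int(str(v)[0]) for v in values)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


expected = benford_expected()
# Illustrative data only: 2**0 through 2**999 span about 300 orders of magnitude.
powers_of_two = [2 ** n for n in range(1000)]
observed = observed_frequencies(powers_of_two)

for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

The expected column starts near 0.301 for digit 1 and ends near 0.046 for digit 9, and the observed shares for the powers of two should track it closely.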
However, you shouldn’t take my word for it that this is the case. I’ve chosen three datasets that are completely unrelated to each other. I’m going to look for values in these datasets that span multiple orders of magnitude, and plot their leading digits to find out how credible Benford’s law really is.
(Acknowledgement) Thanks to Professor John Hart for his excellent course in Data Visualization (CS 498) over the summer of 2018.
|Rights Information:||Copyright 2018 Karan Abrol|
|Date Available in IDEALS:||2019-02-07|