Kössler, W, Lenz, H-J and Wang, XD (2021)

Is the Benford Law Useful for Data Quality Assessment?

In: Knoth, S., Schmid, W. (eds) Frontiers in Statistical Quality Control 13. ISQC 2019, Springer, Cham, pp. 391-406.

ISSN/ISBN: 978-3-030-67856-2 DOI: 10.1007/978-3-030-67856-2_22

Abstract: Data quality and data fraud are of increasing concern in the digital world. Benford’s Law is used worldwide for detecting non-conformance or data fraud of numerical data. It says that the first non-zero digit D1, of a data item from a universe, is not uniformly distributed. The shape is roughly logarithmically decaying starting with P(D1 = 1) ≈ 0.3. It is self-evident that Benford’s Law should not be applied for detecting manipulated or faked data before having examined the goodness of fit of the probability model while the business process is free of manipulations, i.e. ‘under control’. In this paper, we are concerned with the goodness-of-fit phase, not with fraud detection itself. We selected five empirical numerical data sets of various sample sizes being publicly accessible as a kind of benchmark, and evaluated the performance of three statistical tests. The tests include the chi-square goodness-of-fit test, which is used in businesses as a standard test, the Kolmogorov–Smirnov test, and the MAD test as originated by Nigrini (1992). We are analyzing further whether the invariance properties of Benford’s Law might improve the tests or not.
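Illustration (not taken from the paper): the first-digit law mentioned in the abstract is usually stated as P(D1 = d) = log10(1 + 1/d) for d = 1, ..., 9, which gives P(D1 = 1) ≈ 0.301. The Python sketch below computes these probabilities and textbook forms of the three statistics named in the abstract. The function names (`benford_probs`, `first_digit`, `fit_statistics`) are hypothetical, the Kolmogorov–Smirnov statistic is the simple discrete analogue over the nine digit classes, and the exact test variants and critical values used by the authors may differ.

```python
import math
from collections import Counter

DIGITS = range(1, 10)

def benford_probs():
    """Benford first-digit probabilities: P(D1 = d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in DIGITS}

def first_digit(x):
    """Leading significant digit of a non-zero number (hypothetical helper)."""
    if x == 0:
        raise ValueError("zero has no leading significant digit")
    # Scientific notation puts the leading significant digit first.
    return int(f"{abs(x):.16e}"[0])

def fit_statistics(data):
    """Chi-square, MAD and a discrete KS statistic against Benford's Law.

    Assumption: these are generic textbook definitions, not necessarily
    the variants evaluated in the paper.
    """
    probs = benford_probs()
    counts = Counter(first_digit(x) for x in data)
    n = sum(counts.values())

    # Pearson chi-square: sum of (observed - expected)^2 / expected.
    chi2 = sum((counts[d] - n * probs[d]) ** 2 / (n * probs[d]) for d in DIGITS)

    # Mean absolute deviation between observed and expected proportions.
    mad = sum(abs(counts[d] / n - probs[d]) for d in DIGITS) / 9

    # Largest gap between the cumulative observed and Benford distributions.
    ks = 0.0
    cum_obs = cum_exp = 0.0
    for d in DIGITS:
        cum_obs += counts[d] / n
        cum_exp += probs[d]
        ks = max(ks, abs(cum_obs - cum_exp))

    return {"n": n, "chi2": chi2, "mad": mad, "ks": ks}

if __name__ == "__main__":
    powers_of_two = [2 ** k for k in range(100)]   # known to be close to Benford
    uniform_digits = list(range(1, 10)) * 100      # deliberately non-Benford
    print(fit_statistics(powers_of_two))
    print(fit_statistics(uniform_digits))
```

With 9 digit classes the chi-square statistic has 8 degrees of freedom, so a value above about 15.51 would reject conformance at the 5% level. The powers of two fall well below that value, while the uniform digits exceed it by a wide margin.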

@InProceedings{,
  author="K{\"o}ssler, Wolfgang and Lenz, Hans-J. and Wang, Xing D.",
  editor="Knoth, Sven and Schmid, Wolfgang",
  title="Is the Benford Law Useful for Data Quality Assessment?",
  booktitle="Frontiers in Statistical Quality Control 13",
  year="2021",
  publisher="Springer International Publishing",
  address="Cham",
  pages="391--406",
  isbn="978-3-030-67856-2"
}

Reference Type: Conference Paper

Subject Area(s): Statistics