Schultz, DR, Dinçkol, Ö, Valentini, S, De Felip, E, Calamandrei, G, Karakitsios, S, Sarigiannis, DA, Pino, A and Fuentes, B (2025)

Application of the Newcomb-Benford Law in biological exposure studies: Data manipulation versus physiological pattern identification [version 2; peer review: 1 approved]

Open Research Europe 5(166).

ISSN/ISBN: Not available at this time. DOI: 10.12688/openreseurope.19212.2



Abstract:
Background: The Newcomb-Benford Law (NBL) describes the distribution frequency of the first significant digit in large data sets and suggests that the probability for the significant digits is not uniform. To date, it is used as an evaluation tool to identify fraudulent or manipulated data across many fields, although this practice may be flawed. Uninformed applications may lead to inappropriate labelling of data as falsified when, in actuality, underlying mechanisms may drive accordance to the distribution.
Objectives: To investigate the properties of the NBL in the natural sciences, we applied the concept to five toxicological datasets to investigate how it can be applied in biological systems and if intrinsic characteristics of the data may drive accordance or violation. We hypothesize that physiologically advantageous (essential and beneficial) elements will result in a violation of the NBL due to active and preferential uptake.
Methods: Observational and experimental studies, different age groups, and biosample types (blood, urine, hair, bone) were assessed, with each measuring a range of essential, beneficial, and non-essential elements. Eight statistical tests were trialled, including the often-used Pearson’s Chi-squared. The Judge-Schecter Mean Deviation test was determined to be the most appropriate statistical test and was applied to each elemental grouping (Essential, Beneficial, Non-Essential) and a combination of all.
Results: Results indicate that physiological mechanisms, like preferential elemental uptake, lead to violations with distinctive patterns that also depend on the anatomic function of the biosample.
Discussion: Visual analysis based on the NB distribution could be useful in distinguishing the difference between violations resulting from data tampering versus those resulting from innate physiological mechanisms. Overall, the use of the NBL for detecting data manipulation in exposure studies necessitates reevaluation, urging a deeper understanding of the intrinsic nature of the data.
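
Note: The first-digit distribution that the abstract refers to can be illustrated with a minimal Python sketch. It uses the standard NBL formula P(d) = log10(1 + 1/d) for d = 1..9 and tabulates observed first-digit frequencies for comparison. The function names are hypothetical and chosen for illustration only; this is not the authors' code and it does not implement any of the statistical tests named in the abstract.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under the NBL: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Return the first significant digit of a nonzero finite number."""
    # Scientific notation puts the leading digit first, e.g. '3.2...e-02'.
    return int(f"{abs(x):.15e}"[0])


def observed_frequencies(values):
    """Relative frequency of each first digit (1-9), ignoring zeros."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    n = len(digits)
    return {d: counts.get(d, 0) / n for d in range(1, 10)}
```

Comparing the output of `observed_frequencies` on a set of measured concentrations against `benford_expected()` gives the kind of visual check the abstract describes; any formal test statistic would need to be taken from the paper itself.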


Bibtex:
@article{, author = {Dayna R. Schultz and Öykü Dinçkol and Silvia Valentini and Elena De Felip and Gemma Calamandrei and Spyros Karakitsios and Dimosthenis A. Sarigiannis and Anna Pino and Byron Fuentes}, title = {Application of the Newcomb-Benford Law in biological exposure studies: Data manipulation versus physiological pattern identification {[version 2; peer review: 1 approved]}}, year = {2025}, journal = {Open Research Europe}, volume = {5}, number = {166}, url = {https://open-research-europe.ec.europa.eu/articles/5-166}, doi = {10.12688/openreseurope.19212.2}, }


Reference Type: Journal Article

Subject Area(s): Biology