Preprint, Authorea.
ISSN/ISBN: Not available at this time. DOI: 10.22541/au.173901291.14107238/v1
Abstract:
Context: Benford’s Law describes the frequency distribution of leading digits in many sets of naturally occurring numbers. It divides these numbers into nine groups by first digit, with the largest group comprising numbers beginning with 1, followed by those starting with 2, and so on.
Objective: In a Neural Network (NN), each neuron is associated with numerical values called weights, which are updated during training according to specific functions. This research examines the Degree of Benford’s Law Existence (DBLE) in two language-model architectures: (1) Recurrent Neural Networks (RNNs) and (2) Long Short-Term Memory (LSTM) networks. It also investigates whether higher-performing models exhibit a stronger DBLE.
Methods: Two neural network language models, (1) a simple RNN and (2) an LSTM, were selected as the subject models for the experiment. Each model was tested with five different optimizers and four different datasets (textual corpora selected from Wikipedia), yielding 20 configurations per model. The neuron weights of each configuration were extracted at every epoch, and five metrics were measured per epoch: (1) DBLE, (2) training-set accuracy, (3) training-set error, (4) test-set accuracy, and (5) test-set error.
Results: The weights of both models, across all optimizers, follow Benford’s Law. The findings also indicate a strong correlation between DBLE and training-set performance in both language models: models that perform better on the training set exhibit a stronger DBLE.
Keywords: Benford’s Law, Recurrent Neural Network, Machine Learning, Language Model, Text Generation, Natural Language Processing, LSTM.
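For orientation, the sketch below computes a Benford-style leading-digit statistic over an array of weights, in the spirit of the DBLE measurements the abstract describes. The expected frequencies P(d) = log10(1 + 1/d) are the standard Benford distribution; the total-variation score, function names, and example data are illustrative assumptions, since the abstract does not give the paper's exact DBLE formula.

```python
import numpy as np

# Expected leading-digit frequencies under Benford's Law: P(d) = log10(1 + 1/d), d = 1..9
BENFORD = np.log10(1.0 + 1.0 / np.arange(1, 10))

def leading_digits(x):
    """Return the leading (most significant) decimal digit of each nonzero value."""
    x = np.abs(x[x != 0])
    # Scale each value into [1, 10) by dividing out its power of ten,
    # then truncate; clip guards against floating-point edge cases.
    return np.clip((x / 10.0 ** np.floor(np.log10(x))).astype(int), 1, 9)

def benford_distance(weights):
    """Total-variation distance between the observed leading-digit distribution
    of the weights and the Benford expectation (an assumed stand-in for DBLE).
    Smaller values indicate closer agreement with Benford's Law."""
    digits = leading_digits(np.asarray(weights, dtype=float).ravel())
    observed = np.bincount(digits, minlength=10)[1:10] / digits.size
    return 0.5 * np.abs(observed - BENFORD).sum()

# Example: log-uniform samples span several orders of magnitude and are
# known to follow Benford's Law, so the distance should be close to 0.
rng = np.random.default_rng(0)
w = 10.0 ** rng.uniform(-4, 0, size=100_000)
print(benford_distance(w))
```

In an experiment like the one described, such a score would be computed on the extracted weight tensors at every epoch and tracked alongside the accuracy and error metrics.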
Bibtex:
@misc{toosi2025benford,
  author = {Farshad Ghassemi Toosi},
  title  = {Benford’s Law in Basic RNN and Long Short-Term Memory (LSTM) and their Associations},
  year   = {2025},
  doi    = {10.22541/au.173901291.14107238/v1},
  url    = {https://d197for5662m48.cloudfront.net/documents/publicationstatus/244597/preprint_pdf/999ef62cf2d65485e31c05658fcc324d.pdf},
}
Reference Type: Preprint
Subject Area(s): Computer Science