BXC - Benford Analysis Tool
A forensic digital analysis tool for detecting data manipulation using Benford's Law.
Overview
BXC (Benford X-C) is a command-line tool that analyzes numerical data to determine if it follows Benford's Law, a mathematical principle that describes the expected frequency distribution of leading digits in naturally occurring datasets. Deviations from this distribution can indicate data manipulation, fraud, or synthetic data generation.
Features
- First digit or all digits analysis - Choose between analyzing only the first digit or all digits in your dataset
- Real-time animated visualization - Watch digit frequencies update as data is processed with cumulative display
- Animated GIF export - Generate animated GIFs from analysis results for presentations and reports
- Chi-squared statistical testing - Automatic calculation with 95% confidence level interpretation
- Multiple data sources - Analyze local files or download from URLs (HTTP/FTP)
- CSV column extraction - Analyze specific columns from multi-column datasets
- Comprehensive reporting - Generates value logs, percentage logs, and ASCII charts
- Interactive and batch modes - Use command-line flags or interactive prompts
- Custom metadata - Add titles, descriptions, and source information to analyses
Installation
From DEB Package
```bash
sudo dpkg -i bxc_2.0.0_amd64.deb
```
From Source
Requires the FreeBASIC Compiler (fbc) to compile from source:
```bash
fbc bxc.bas
```
Usage
Basic Syntax
```bash
bxc -f [file] -d [1|all] -l [length] -c [column] [options]
```
Required Flags
-f [file] - Data file to analyze (local file or URL)
-d [1|all] - Analyze first digit (1) or all digits (all)
-l [number] - Sample pool length (typically 10000 for statistical significance)
-c [number] - Column number (0 for single column data)
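As a rough illustration of what the -c flag selects, the Python sketch below pulls one column out of delimited text. It is not BXC's implementation (BXC is written in FreeBASIC); the function name `column_values`, the comma delimiter, and the 1-based column numbering are assumptions made for this example, and header rows are not handled.

```python
import csv
import io


def column_values(text, column, delimiter=","):
    """Yield the raw field found in `column` for each row of `text`.

    Assumptions for this sketch: column 0 means each line is a single
    value, and any other number is a 1-based column index.
    """
    if column == 0:
        for line in text.splitlines():
            if line.strip():
                yield line.strip()
        return
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        # Skip rows that are too short to contain the requested column.
        if len(row) >= column:
            yield row[column - 1].strip()


# Hypothetical sample data, not taken from any real dataset.
sample = "1,2024-01-03,1520.00\n2,2024-01-04,87.10\n"
print(list(column_values(sample, 3)))
```

The extracted strings can then be passed to whatever digit analysis follows; malformed or non-numeric fields would still need to be filtered out.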
Optional Flags
-a [interval] - Enable animated graph display (updates every N records, default: 100)
-g - Generate animated GIF from animation (requires -a flag)
-t [text] - Title for the analysis
-s [text] - Data source description
-i [text] - Additional information/description
-h, --help - Display help message
Examples
Basic first digit analysis:
```bash
bxc -f financial_data.dat -d 1 -l 10000 -c 0
```
Multi-column CSV with animation:
```bash
bxc -f transactions.csv -d 1 -l 10000 -c 2 -a 50
```
All digits analysis:
```bash
bxc -f dataset.dat -d all -l 5000 -c 0
```
Analyze data from URL:
```bash
bxc -f http://example.com/data.csv -d 1 -l 10000 -c 1
```
Interactive mode:
```bash
bxc
# The program will prompt for all parameters
```
Generate animated GIF with custom metadata:
```bash
bxc -f sales_data.csv -d 1 -l 10000 -c 2 -a 50 -g -t "Q4 Sales Analysis" -s "Company XYZ"
```
Understanding the Output
Animated Display
When using the -a flag, you'll see a live, updating display of the cumulative analysis:
========================================================================
Benford X-C Live Analysis - Animated View (Cumulative)
========================================================================
Records processed: 15342 | Total digits analyzed: 8450
------------------------------------------------------------------------
Digit Actual Expected Deviation Chart
----- ------- -------- --------- ---------------------------------
1 30.12% 30.10% +0.02% ███████████████
2 17.58% 17.60% -0.02% ████████
3 12.51% 12.50% +0.01% ██████
...
The cumulative display shows how the data progressively converges toward (or diverges from) Benford's Law as more records are processed, which makes suspicious patterns easier to spot.
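The idea behind the cumulative view can be sketched in Python as follows. This is an illustration only, not BXC's code: `leading_digit` and `cumulative_snapshots` are hypothetical names, and the demo feeds in powers of two, a sequence known to follow Benford's Law closely, in place of real data.

```python
def leading_digit(value):
    """Return the first non-zero digit of a numeric string, or None."""
    for ch in value:
        if ch in "123456789":
            return int(ch)
    return None


def cumulative_snapshots(values, interval=100):
    """Yield (records_seen, {digit: percent}) every `interval` records.

    Percentages are cumulative: each snapshot covers every record seen
    so far, not just the most recent batch.
    """
    counts = {d: 0 for d in range(1, 10)}
    seen = 0
    counted = 0
    for seen, value in enumerate(values, start=1):
        digit = leading_digit(value)
        if digit is not None:
            counts[digit] += 1
            counted += 1
        if seen % interval == 0 and counted:
            yield seen, {d: 100 * c / counted for d, c in counts.items()}
    # Emit a final snapshot if the input did not end on an interval boundary.
    if seen % interval != 0 and counted:
        yield seen, {d: 100 * c / counted for d, c in counts.items()}


# Demo input: powers of two stand in for a real dataset.
demo = [str(2 ** n) for n in range(1, 301)]
for records, percents in cumulative_snapshots(demo, interval=100):
    print(records, f"digit 1: {percents[1]:.2f}%")
```

Because every snapshot includes all earlier records, the percentages settle down as the sample grows, which is what makes a persistent gap from the expected values stand out.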
Final Report
At the end of analysis, you'll receive:
- Final Statistics - Average percentages across all samples
- Chi-Squared Test Result - Statistical significance test
- Interpretation - Whether data fits Benford's Law
Chi-Squared Statistic: 8.2347
Result: Data FITS Benford's Law (95% confidence)
No significant deviation detected.
Output Files
- Values Log: [filename]_[mode]-[sample]-.log - Raw digit counts
- Percentage Log: [filename][mode]-[sample].log - Percentage distributions
- ASCII Chart: chart[filename][mode]-[sample]_.log.txt - Visual representation
- Animated GIF: benfordanimation[timestamp].gif - Animated visualization (when using -g flag)
Benford's Law Reference
Expected first digit frequencies:
| Digit | Expected % |
|-------|------------|
| 1 | 30.1% |
| 2 | 17.6% |
| 3 | 12.5% |
| 4 | 9.7% |
| 5 | 7.9% |
| 6 | 6.7% |
| 7 | 5.8% |
| 8 | 5.1% |
| 9 | 4.6% |
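
The percentages in this table follow from the standard Benford formula P(d) = log10(1 + 1/d). A minimal Python sketch that reproduces them (the function name `benford_expected` is chosen for this example and is not part of BXC):

```python
import math


def benford_expected(digit):
    """Expected share of leading digit `digit` (1-9) under Benford's Law."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


# Percentages rounded to one decimal place, as in the table above.
expected_pct = {d: round(100 * benford_expected(d), 1) for d in range(1, 10)}
for d, pct in expected_pct.items():
    print(f"{d}: {pct}%")
```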
Chi-Squared Interpretation
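- χ² < 15.51 - Data fits Benford's Law (no manipulation detected). 15.51 is the 95% critical value for 8 degrees of freedom, which corresponds to the nine first-digit categories.
- 15.51 ≤ χ² < 20 - Moderate concern, investigate further
- χ² ≥ 20 - High concern, likely fraud or synthetic data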
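
For readers who want to check a result by hand, here is a sketch of a Pearson chi-squared calculation on first-digit counts, combined with the thresholds above. It assumes the statistic is computed on raw counts; how BXC computes its statistic internally is not stated here, so its reported value may differ. The function names and the observed counts are hypothetical.

```python
import math

CRITICAL_95 = 15.51  # threshold quoted above (8 degrees of freedom, 5% level)


def chi_squared_first_digit(counts):
    """Pearson chi-squared statistic for first-digit counts.

    `counts` maps each digit 1-9 to the number of times it was
    observed as a leading digit.
    """
    total = sum(counts.get(d, 0) for d in range(1, 10))
    if total == 0:
        raise ValueError("no observations")
    stat = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat


def interpret(stat):
    """Map a statistic onto the three bands listed above."""
    if stat < CRITICAL_95:
        return "fits"
    if stat < 20:
        return "moderate concern"
    return "high concern"


# Made-up counts for illustration (10,000 records close to Benford's Law).
observed = {1: 3010, 2: 1761, 3: 1249, 4: 969, 5: 792, 6: 669, 7: 580, 8: 512, 9: 458}
stat = chi_squared_first_digit(observed)
print(f"chi-squared = {stat:.4f} -> {interpret(stat)}")
```

A perfectly uniform set of leading digits, by contrast, lands far above the upper threshold.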
Use Cases
- Financial Fraud Detection - Analyze accounting records, invoices, expenses
- Election Data Verification - Detect potential vote manipulation
- Scientific Data Validation - Verify experimental or survey data authenticity
- Tax Compliance - Audit financial statements and tax returns
- Insurance Claims - Identify potentially fraudulent claim patterns
- Audit and Compliance - General purpose data integrity verification
Technical Details
Requirements
- Linux/Unix operating system (x86-64)
- Standard GNU utilities (cut, wget, head, tail)
- Terminal with ANSI escape code support (for animation)
- ImageMagick (for GIF generation, optional)
Sample Size Recommendations
Benford's Law analysis requires adequate sample sizes:
- Minimum: 1,000 records
- Recommended: 10,000+ records
- Optimal: 50,000+ records
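
One way to see why larger samples help is to look at the rarest leading digit, 9, which is expected in only about 4.6% of records. The sketch below uses the ordinary binomial standard deviation to estimate how noisy that count is at each of the sizes listed above; the function names are illustrative, and this is a back-of-the-envelope check rather than anything BXC computes.

```python
import math


def expected_count(sample_size, digit):
    """Expected number of records with leading digit `digit` under Benford's Law."""
    return sample_size * math.log10(1 + 1 / digit)


def relative_noise(sample_size, digit):
    """Binomial standard deviation of the count, as a fraction of its expectation."""
    p = math.log10(1 + 1 / digit)
    return math.sqrt((1 - p) / (sample_size * p))


for n in (1_000, 10_000, 50_000):
    print(n, f"expected 9s: {expected_count(n, 9):.1f}",
          f"relative noise: {100 * relative_noise(n, 9):.1f}%")
```

The relative noise shrinks with the square root of the sample size, so small samples can deviate noticeably from the expected percentages through chance alone.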
Performance
- Processes ~125,000 records per second
- Minimal overhead with animation enabled (~5%)
- Efficient memory usage for large datasets
About Benford Bench Project
BXC is part of the Benford Bench project (benfordbench.org), operational since 2016. The project was created to crowdsource fraud identification and reporting in big data through Benford's Law analysis.
Project Contributors
- Jason Page (Original Author)
- Morris Chukhman
- Padraig O'Hara
- Kevin Perez
- Michael Fiedler
License
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
Support
For issues, questions, or contributions:
- Visit: https://benfordbench.org
- Report issues on the project repository
Version
Current version: 2.0.0
See CHANGELOG.md for version history and updates.
What's New in 2.0.0
- Animated GIF Export - Generate shareable animated GIFs of your analysis
- Cumulative Animation - Watch data converge to Benford's Law in real-time
- Enhanced Display - 2 decimal precision and fixed-width columns for cleaner output
- Custom Metadata - Add titles, descriptions, and source information
- Improved Labels - Clearer terminology throughout the interface
For complete details, see CHANGELOG.md.