arcetri / sts
Improved version of the NIST Statistical Test Suite (STS)
Does this project support the poker test?
While running the suite, I decided to test a run of 2^20 zero bits, to confirm that the library would fail all tests. To my surprise, it still passed the approximate entropy test.
After some digging in the code, I found that the compute_phi function in approximateEntropy.c calls log on every member of the array state->apen_C[thread_state->thread_id]. If any of these members is zero, log returns -inf and the product 0 * log(0) evaluates to nan, which contaminates the later calculations until the test reports a false pass.
Bug:
/*
 * Step 3 and 4a: compute the terms of the phi formula
 */
sum = 0.0;
for (i = 0; i < powLen; i++) {
    sum += (double) state->apen_C[thread_state->thread_id][i] *
        log(state->apen_C[thread_state->thread_id][i] / (double) n);
}
My fix:
/*
 * Step 3 and 4a: compute the terms of the phi formula
 */
sum = 0.0;
for (i = 0; i < powLen; i++) {
    /* skip empty bins: 0 * log(0) would evaluate to nan */
    if (state->apen_C[thread_state->thread_id][i]) {
        sum += (double) state->apen_C[thread_state->thread_id][i] *
            log(state->apen_C[thread_state->thread_id][i] / (double) n);
    }
}
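To see why the guard matters, here is a minimal Python sketch of the approximate entropy statistic as defined in SP 800-22. It is not the sts implementation: the helper names (phi, approximate_entropy) and the block length m = 2 are assumptions for illustration. Python's math.log(0) raises an exception instead of returning -inf as C's log does, so the C behaviour is emulated with float("-inf").

```python
import math

# In C, log(0.0) returns -inf and 0 * -inf is nan. Python's math.log(0)
# raises ValueError instead, so the C result is emulated here.
c_style_term = 0.0 * float("-inf")   # nan, which then poisons the whole sum

def phi(bits, m):
    """Sum of (C_i/n) * ln(C_i/n) over all m-bit patterns, skipping C_i == 0."""
    n = len(bits)
    ext = bits + bits[:m - 1]            # wrap around the end of the sequence
    counts = [0] * (1 << m)
    for i in range(n):
        idx = 0
        for b in ext[i:i + m]:
            idx = (idx << 1) | b
        counts[idx] += 1
    total = 0.0
    for c in counts:
        if c > 0:                        # same guard as in the fix above
            total += (c / n) * math.log(c / n)
    return total

def approximate_entropy(bits, m=2):
    """Return (chi2, p_value) for block length m (hypothetical helper)."""
    n = len(bits)
    apen = phi(bits, m) - phi(bits, m + 1)
    chi2 = 2.0 * n * (math.log(2) - apen)
    # igamc(a, x) for integer a = 2^(m-1): exp(-x) * sum_{k<a} x^k / k!
    a, x = 1 << (m - 1), chi2 / 2.0
    p_value = math.exp(-x) * sum(x**k / math.factorial(k) for k in range(a))
    return chi2, p_value

chi2, p_value = approximate_entropy([0] * 64)
```

For an all-zero input the only occupied pattern has C_i = n, so both phi terms are 0, the statistic is 2n ln 2, and the p-value is effectively zero. In other words, with the guard in place the test fails as it should.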
The bit runs count test (runs.c) could be optimized. Loading the bits into a byte array (BitSequence **epsilon) is, IMHO, also not a good idea...
I was cross-checking some of the statistics of my randomness tests but could not get the p-values reported in finalAnalysisReport.txt to agree with my own calculation. The difference seems to be that sts truncates the sample size divided by 10 to an integer, which gives a different result from the exact quotient whenever the sample size is not divisible by 10.
For example, one result from
% ./tools/generators 9 64 > sha1.bin
% ./sts -O -F r -i 64 sha1.bin
% head -8 finalAnalysisReport.txt | tail -3
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 P-VALUE PROPORTION STATISTICAL TEST
------------------------------------------------------------------------------
6 8 6 6 16 4 3 5 5 5 0.017912 61/64 Frequency
% python3
>>> from scipy.special import gammaincc
>>> import numpy as np
>>> obs = np.array([6, 8, 6, 6, 16, 4, 3, 5, 5, 5])
>>> chi_float = sum(((obs - 6.4)**2) / 6.4)
>>> gammaincc(9/2, chi_float/2)
0.029796344939787778
>>> chi_int = sum(((obs - 6)**2) / 6)
>>> gammaincc(9/2, chi_int/2)
0.01791240452984323
shows that sts reports the chi_int-based p-value, due to the integer division at [1].
Statistics is not my strong point ;-) but I assume it should be something like
expCount = (double)sampleCount / state->tp.uniformity_bins;
to get the correct p-value of 0.029796.
[1] https://github.com/arcetri/sts/blob/master/src/tests/frequency.c#L673
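The two p-values above can also be reproduced with the standard library alone. This is a sketch, assuming 10 uniformity bins (9 degrees of freedom) and using the closed form of the regularized upper incomplete gamma function for half-integer arguments. The function names are hypothetical and not part of sts.

```python
import math

def igamc_half_integer(a, x):
    """Regularized upper incomplete gamma Q(a, x) for a = 1/2, 3/2, 5/2, ...

    Uses Q(1/2, x) = erfc(sqrt(x)) and the recurrence
    Q(s + 1, x) = Q(s, x) + x**s * exp(-x) / gamma(s + 1).
    """
    q = math.erfc(math.sqrt(x))
    s = 0.5
    while s < a:
        q += x**s * math.exp(-x) / math.gamma(s + 1.0)
        s += 1.0
    return q

def uniformity_p_value(observed, expected):
    """Chi-square uniformity p-value; valid here for an even number of bins."""
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    return igamc_half_integer((len(observed) - 1) / 2.0, chi2 / 2.0)

obs = [6, 8, 6, 6, 16, 4, 3, 5, 5, 5]
sample_count = sum(obs)                                        # 64

# Floating-point expected count (6.4) versus integer division (6).
p_float = uniformity_p_value(obs, sample_count / len(obs))
p_int = uniformity_p_value(obs, sample_count // len(obs))
```

Here p_int reproduces the 0.017912 printed in the report, while p_float gives the 0.029796 obtained with the floating-point expected count.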
Testing the official reference data.pi, versions 3.2.5 and 3.2.6 give the same results, but they do not match NIST sts 2.1.2.
% gcc --version
gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
. . .
% uname -a
Linux <hostname> 4.15.0-38-generic #41-Ubuntu SMP Wed Oct 10 10:59:38 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
test | 2.1.2 | 3.2.5 |
---|---|---|
BlockFrequency | 0.380615 | 0.584952 |
LongestRun | 0.024390 | 0.027295 |
Universal | 0.669012 | 0.687852 |
This is confusing us.
Command line and user input for 2.1.2
% cat test.pi.txt
0
data/data.pi
1
0
1
0
% ./assess 1000000 < test.pi.txt
Command line for 3.2.x
% ./sts -i 1 -w ../326pi -F a -S 1000000 -s ../sts-2.1.2/data/data.pi
I can see that the difference in the "Block Frequency test" is due to the different block length and number of substrings:
< Block Frequency test
---
> BLOCK FREQUENCY TEST
3,6c3
< (a) Chi^2 = 58.010254
< (b) # of substrings = 61
< (c) block length = 16384
< (d) bits discarded = 576
---
> COMPUTATIONAL INFORMATION:
8c5,10
< SUCCESS p_value = 0.584952
---
> (a) Chi^2 = 7849.375000
> (b) # of substrings = 7812
> (c) block length = 128
> (d) Note: 64 bits were discarded.
> ---------------------------------------------
> SUCCESS p_value = 0.380615
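The substring counts and discarded bits in the two reports follow directly from the block length. A small sketch, assuming the statistic is defined as in SP 800-22 (chi^2 = 4M * sum (pi_i - 1/2)^2); the function names are illustrative only and do not come from either code base.

```python
def block_layout(n, block_len):
    """Return (number of full blocks, leftover bits) for n input bits."""
    n_blocks = n // block_len
    return n_blocks, n - n_blocks * block_len

def block_frequency_chi2(bits, block_len):
    """Chi-square statistic over the full blocks; leftover bits are ignored."""
    n_blocks, _ = block_layout(len(bits), block_len)
    acc = 0.0
    for i in range(n_blocks):
        block = bits[i * block_len:(i + 1) * block_len]
        pi = sum(block) / block_len          # proportion of ones in the block
        acc += (pi - 0.5) ** 2
    return 4.0 * block_len * acc

layout_2_1_2 = block_layout(1_000_000, 128)      # (7812, 64)
layout_3_2_5 = block_layout(1_000_000, 16384)    # (61, 576)
```

Because the two runs partition the same bits into different blocks, their Chi^2 values and p-values are not directly comparable. The block length would have to match in both runs before the results could be expected to agree.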
So let's talk about the other two.
Hi,
I'm having an issue with current master. When passing a -P option, it doesn't recognize integers as integers.
./sts -P 9=8192 ../../linux/1uMB
FATAL: parse_args: -P num=value[,num=value].. failed to parse a num=value, expecting integer=integer: 9=8192
For command line usage help, try: ./sts -h
Version: 3.2.3
Thanks!
So I have a file of 108196287 lines of 8-byte strings converted to ASCII 0/1.
How do I run the tests on this file?
I was never able to get sts-2.1.2 to run without alloc errors.
"Added some comments and fixes to the NIST's paper: The improved SP300-22 Rev 1a is available."
SP300-22 -> SP800-22
Not super relevant, but I might as well let you know.
Also, if I were to reference this GitHub repository in an article, to whom should I attribute the authorship? The three main contributors mentioned in the README were my first thought, but if another way is intended I will do as expected.