pbstark / shangrla

Sets of Half-Average Nulls Generate Risk-Limiting Audits: tools for assertion-based risk-limiting election audits

License: GNU Affero General Public License v3.0

Python 100.00%

shangrla's People

Contributors: bsheehan-sf-rla, dvukcevic, pbstark, spertus, wannabesmith

shangrla's Issues

Where is ballot_comparison module?

The tests and notebooks related to SUITE and hybrid audits fail for lack of a ballot_comparison module. I presume that a dependency is missing.

# python Code/suite_tools.py
Traceback (most recent call last):
  File "Code/suite_tools.py", line 11, in <module>
    from ballot_comparison import ballot_comparison_pvalue
ModuleNotFoundError: No module named 'ballot_comparison'

Note also that there are some extraneous files like
Code/.ipynb_checkpoints/hybrid-audit-example-3-checkpoint.ipynb
that should be removed and avoided via an addition to .gitignore.

kaplan_martingale returns nan

There are some cases where kaplan_martingale returns nan.

Steps to reproduce

s = [1, 0, 1, 1, 0, 0, 1] 
TestNonnegMean.kaplan_martingale(s, N=7, t=3/7, random_order=True)

returns

array([1.66666667, 0.72222222, 1.13888889, 2.64722222,        nan,
              nan,        nan])

I think that this was due to c values of infinity being sent to integral_from_roots.

In addition, changing the last observed data point results in the entire martingale changing.

Steps to reproduce

s = [1, 0, 1, 1, 0, 0, 0]                                                   
TestNonnegMean.kaplan_martingale(s, N=7, t=3/7, random_order=True)

returns

(1.0, array([1., 1., 1., 1., 1., 1., 1.]))

ONEAudit implementation questions

  • For Dominion as used by San Francisco, treat tabulator scan batches as the ONEAudit batches for pooling assorters.
    • Treat any ballot in counting group 1 (cards cast in-person) as a pooled CVR, pooled over its (tabulator, batch) combination.
  • For other jurisdictions, we need the flexibility to use ONEAudit reporting batches that do not correspond to scan batches, for instance the votes in a contest that were cast in person.
    • Add a function in the class CVR that assigns tabulation groups based on the contests the CVR contains?
  • The implementation uses a Contest-level flag for whether to use ONEAudit (via the Audit.AUDIT_TYPE.ONEAUDIT class constant), but it might make sense to specify the audit strategy at the Audit level or the Stratum level instead.
  • Need a way to estimate sample sizes. Can use the actual CVRs as MVRs together with the ONEAudit CVRs in a simulation, perhaps with additional hypothetical errors in the actual CVRs.
  • Need to work through the sampling when use_style == True.
    • Sample from CVRs that contain the contest. If a selected CVR is in a pool batch, select a random card from the pool batch that contains it.
    • Every card in a batch for which at least one CVR contains the contest should be treated as if it contains the contest: if it does not, the MVR should record a non-vote (rather than treating the card as an error for not containing the contest, as use_style without ONEAudit would). This could be done by adding the contest (with an empty vote dict) to every CVR in the pool batch to which the selected CVR belongs that did not already contain the contest.** Existing functions automatically take care of the rest.

**A "scoped" defaultdict could do that, if such a construct exists, but it's also straightforward to code.
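The padding step described above could be sketched roughly as follows. This is a hypothetical dict-based stand-in, not the SHANGRLA CVR class: each CVR is represented as a dict with a "votes" mapping from contest id to a vote dict.

```python
def pad_pool_with_contest(pool_cvrs, contest_id):
    # If any CVR in the pool batch contains the contest, add the contest
    # (with an empty vote dict, i.e., a non-vote) to every CVR in the pool
    # that did not already contain it.
    if any(contest_id in cvr["votes"] for cvr in pool_cvrs):
        for cvr in pool_cvrs:
            cvr["votes"].setdefault(contest_id, {})
    return pool_cvrs
```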

kaplan_markov function is broken

The kaplan_markov function currently doesn't work: find_p_values evaluates
any(x < 0) while x is a plain Python list, which raises a TypeError.

Wrapping the definition of d with np.array(...) in find_p_values on line 1442 of assertion_audit_utils fixes this issue.
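The failure and the proposed fix can be sketched in isolation (d here is a stand-in for the list built in find_p_values):

```python
import numpy as np

d = [0.1, -0.2, 0.3]   # a plain Python list
# any(d < 0)           # TypeError: '<' not supported between list and int
d = np.array(d)        # the proposed fix: wrap the list in an ndarray
print(any(d < 0))      # elementwise comparison now works -> True
```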

Welford's Algorithm in NonnegMean

I compared the implementation of Welford's algorithm for calculating the running mean in NonnegMean.shrink_trunc to an existing implementation online, and I found some discrepancies in the results.

The specific lines I'm referring to can be found here:

# Welford's algorithm for running mean and running sd
mj = [x[0]]
sdj = [0]
for i, xj in enumerate(x[1:]):
    mj.append(mj[-1] + (xj - mj[-1]) / (i + 1))
    sdj.append(sdj[-1] + (xj - mj[-2]) * (xj - mj[-1]))
sdj = np.sqrt(sdj / j)
# end of Welford's algorithm.

Extracting this code into a short function, we have:

import numpy as np

def shrink_trunc_welford(x: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    j = np.arange(1, len(x) + 1)
    mj = [x[0]]
    sdj = [0]
    for i, xj in enumerate(x[1:]):
        mj.append(mj[-1] + (xj - mj[-1]) / (i + 1))
        sdj.append(sdj[-1] + (xj - mj[-2]) * (xj - mj[-1]))
    sdj = np.sqrt(sdj / j)
    return mj, sdj

Then, a quick test:

x = np.array([1.5,2.5,1.7,1.0,1.2,2.1,1.4,1.8])
m, sd = shrink_trunc_welford(x)
print(m)
# [1.5, 2.5, 2.1, 1.7333333333333334, 1.6, 1.7000000000000002, 1.6500000000000001, 1.6714285714285715]
print(np.mean(x))
# 1.65

It seems the currently implemented running mean overshoots the true mean of the sample.

I'm not sure whether this is expected or not, but I compared this to an existing implementation in the welford package on pypi:

from welford import Welford

welford = Welford()
m = []
for xi in x:
    welford.add(xi)
    m.append(float(welford.mean))
print(m)
# [1.5, 2.0, 1.9, 1.6749999999999998, 1.5799999999999998, 1.6666666666666665, 1.6285714285714283, 1.65]

This implementation does not overshoot the sample mean. If this is not intended, let me know and I can easily tweak the implementation accordingly.
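For reference, here is a sketch of a possible fix, assuming the discrepancy comes from an off-by-one in the divisor: since x[0] already seeds the running mean and enumerate starts i at 0, the update for the (i+2)-th observation should divide by i + 2, not i + 1.

```python
import numpy as np

def shrink_trunc_welford_fixed(x):
    # Welford running mean and sd, with the divisor corrected to i + 2
    mj = [x[0]]
    sdj = [0.0]
    for i, xj in enumerate(x[1:]):
        mj.append(mj[-1] + (xj - mj[-1]) / (i + 2))
        sdj.append(sdj[-1] + (xj - mj[-2]) * (xj - mj[-1]))
    j = np.arange(1, len(x) + 1)
    return np.array(mj), np.sqrt(np.array(sdj) / j)

x = np.array([1.5, 2.5, 1.7, 1.0, 1.2, 2.1, 1.4, 1.8])
m, sd = shrink_trunc_welford_fixed(x)
# m[-1] now agrees with np.mean(x) up to floating-point error
```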

Also, it appears the algorithm is implemented again in agrapa, although the code is slightly different (it calculates the running variance rather than the sd):

# Welford's algorithm for running mean and running sd
mj = [x[0]]
sdj2 = [0]
for i, xj in enumerate(x[1:]):
    mj.append(mj[-1] + (xj - mj[-1]) / (i + 1))
    sdj2.append(sdj2[-1] + (xj - mj[-2]) * (xj - mj[-1]))
sdj2 = sdj2 / j
# end of Welford's algorithm.

Impact of RuntimeWarning: overflow encountered in Notebook and tests?

In the published notebook assertion-RLA.ipynb there are some warnings displayed:

Find initial sample size

/opt/tljh/user/lib/python3.6/site-packages/numpy/core/fromnumeric.py:90: RuntimeWarning: overflow encountered in reduce
  return ufunc.reduce(obj, axis, dtype, out, **passkwargs)

Find measured risks for all assertions

/opt/shared/Code/assertion_audit_utils.py:984: RuntimeWarning: overflow encountered in double_scalars
  a[k+1,j] += 0 if j==0 else (1-c[k])*(j/(k+1))*a[k,j-1]
/opt/shared/Code/assertion_audit_utils.py:983: RuntimeWarning: overflow encountered in double_scalars
  a[k+1,j] = -c[k]*((k+1-j)/(k+1))*a[k,j]
/opt/shared/Code/assertion_audit_utils.py:1048: RuntimeWarning: invalid value encountered in multiply
  mart_vec = np.multiply(Y_norm,integrals)

I also see these among the warnings displayed when running pytest assertion_audit_utils.py.

Should we worry about these? Fix them? Suppress them somehow?

Break loop in new_sample_size

The while loop in new_sample_size needs to break at some point. The obvious choice is to break when the population size is reached; this is certainly correct when sampling without replacement. If the population size is reached without confirming the outcome, raise a message? When sampling with replacement, we could add a parameter to break once a certain fraction (potentially greater than 1) of the population has been sampled; for now, take that fraction to be 1.
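The proposed termination logic could look something like this hypothetical sketch (new_sample_size_bounded, confirmed, and max_fraction are illustrative names, not the SHANGRLA API):

```python
def new_sample_size_bounded(confirmed, population_size, max_fraction=1.0):
    # Grow the sample until the outcome is confirmed or the cap is reached.
    # max_fraction may exceed 1 when sampling with replacement.
    cap = int(max_fraction * population_size)
    n = 0
    while not confirmed(n):
        n += 1
        if n >= cap:
            print(f"reached cap of {cap} cards without confirming the outcome")
            n = cap
            break
    return n
```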

initial sample size returns 1

Somehow the calculation of the initial sample size broke when ballot polling was introduced. Presumably, the incremental sample size calculation is also wrong. The sample size calculations should also terminate when the size reaches the appropriate population size (max_cards or contest['cards']).

Looking for oc_cvrs.zip

Hi All:

I'm working with Vanessa Teague et al. to understand SHANGRLA and evaluate porting it.

I'm trying to run examples/OC_example.ipynb in the SHANGRLA Python repo. It references RLA files that are not in the repo, apparently from the Orange County 2022 election:

'cvr_file': '/Users/amanda/Downloads/oc_cvrs.zip',
#'cvr_file': '/Users/Jake/Desktop/oc_cvrs.zip',

Are these available somewhere that I can download?

If not, is there another example data set I can use to run a complete SHANGRLA workflow?

Thanks,
John (https://github.com/JohnLCaron)

incremental sample size: condition on the observed sample

1. Condition on the observed sample.
2. Make an assumption about the error rate for the future sample (default: the observed rate in the sample so far).
3. Find the quantile or expected incremental sample size for each assertion; sum the maximum sampling probabilities across as-yet-unsampled cards.
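The summing step above could be sketched as follows (a hypothetical helper, not existing SHANGRLA code; p_by_assertion[a][i] is assumed to be the probability that as-yet-unsampled card i is drawn for assertion a):

```python
import numpy as np

def expected_incremental_size(p_by_assertion):
    # A card is audited if *any* assertion selects it, so an upper bound on
    # the expected incremental sample size sums the per-card maxima of the
    # sampling probabilities across assertions.
    p = np.asarray(list(p_by_assertion.values()))
    return float(np.max(p, axis=0).sum())
```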

Bug in find_p_values()?

Line 1609 of assertion_audit_utils.py calls overstatement_assorter() with a "cards" argument (which was inserted in commit #37). However, overstatement_assorter() has no cards parameter, so a TypeError is thrown.
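The failure mode can be reproduced in miniature (the function body below is a hypothetical stand-in; only the cards-free signature matters):

```python
def overstatement_assorter(mvr, cvr):
    # stand-in: the real function takes no `cards` parameter
    return 0.5

try:
    overstatement_assorter({}, {}, cards=100)   # as find_p_values calls it
except TypeError as err:
    print(err)   # ... got an unexpected keyword argument 'cards'
```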
