
Code for paper "Accuracy First: Selecting a Differentially Private
                Level for Accuracy-Constrained ERM"
by Ligett, Neel, Roth, Waggoner, Wu

https://arxiv.org/abs/1705.10829


---------------------------------------------------------------
-- Disclaimer

This code simulates the performance of differentially private
algorithms. It should not be used in practice to protect actual
sensitive data! The theorems are proved for true randomness and real
numbers, while the code uses Python's internal random generators and
floating-point numbers.


---------------------------------------------------------------
-- Requirements
1.  python3 with the numpy, matplotlib, and scikit-learn libraries.

2.  (optional) Linux, bash, and the GNU parallel utility,
    available from most repositories.
    These are only needed for the bash script that runs
    a whole set of experiments at once, in parallel.


---------------------------------------------------------------
-- Usage

Navigate into the code/ directory to run the code.

You will need a dataset file in plain text.
Each row of the file is a data point.
It should contain d+1 space-separated numbers
(for some d) where the first d are "x" and the last is "y".
It is assumed that the L1-norm of each x is at most 1,
and each |y| <= 1.
For logistic regression, each y should be plus or minus 1.
See data/ directories for downloading the datasets used in the
paper and processing them into this format.
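
The format described above can be checked before running anything.
The following is a minimal sketch, not part of the repository: the
function name check_dataset and the tolerance value are assumptions
made for illustration. It reads rows of d+1 whitespace-separated
numbers and verifies the stated bounds on x and y.

```python
def check_dataset(lines, logistic=False, tol=1e-9):
    """Check rows of d+1 numbers: L1-norm of x <= 1 and |y| <= 1.

    `lines` is any iterable of strings (an open file works).
    Returns (number_of_rows, d). Hypothetical helper, not from the repo.
    """
    d = None
    count = 0
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        values = [float(f) for f in fields]
        if len(values) < 2:
            raise ValueError(f"line {lineno}: need at least one x and one y")
        if d is None:
            d = len(values) - 1
        elif len(values) - 1 != d:
            raise ValueError(f"line {lineno}: expected {d + 1} numbers")
        x, y = values[:-1], values[-1]
        if sum(abs(v) for v in x) > 1 + tol:
            raise ValueError(f"line {lineno}: L1-norm of x exceeds 1")
        if abs(y) > 1 + tol:
            raise ValueError(f"line {lineno}: |y| exceeds 1")
        if logistic and y not in (-1.0, 1.0):
            raise ValueError(f"line {lineno}: y must be +1 or -1")
        count += 1
    return count, d


# Small in-memory example with d = 2.
sample = ["0.5 -0.25 1", "0.1 0.2 -1"]
rows, d = check_dataset(sample, logistic=True)
print(rows, d)
```

To check a file on disk, pass the open file object instead of a list
of strings.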

You can run a single experiment at a time and print the output,
or run a set of experiments and save the outputs into folders.


---------------------------------------------------------------
-- To run a single experiment for a given data set and parameters:
     $ python3 run_ridge.py [args]
   OR
     $ python3 run_logist.py [args]
Run them with no arguments for help on the args.


---------------------------------------------------------------
-- To run a set of experiments:

1. You should have a dataset file and also a file with a list of
   the alpha parameters to try, called alphalist.txt.
   E.g. you can edit 'gen_alphalist.py' to your liking and then run

     $ python3 gen_alphalist.py > alphalist.txt

2. Edit the file 'run_sims.sh' to set all the parameters to your
   liking. Also edit the top of the file 'prep_simulations.py'
   to set the variable 'run_file_name'.
   It should be "run_many_logist.py" if you want logistic regression
   or "run_many_ridge.py" if you want ridge regression.

3. Execute the following (full explanation of what it does below):

     $ ./run_sims.sh

4. Execute the following to read the results, print some output about
   them, and produce some plots.

     $ python3 collect_results_ridge.py sims-results/


----------------
About run_sims.sh:
This will create a folder sims-results/ and run a bunch of simulations
writing the results into that folder, along with an 'about.py'
file that specifies what all the parameters were.

It does the following:

   a. Runs python3 prep_simulations.py [args]
      which creates the folder sims-results/
      and writes about.py into it.
      It also writes a list of commands to a temporary file.

   b. Invokes GNU parallel to run the commands in parallel.
      WARNING: for large datasets, may use up all your RAM and
      crash your computer! Use --max-procs to limit the number
      of commands to run simultaneously.
      Each command that is run has the form shown in step c.

   c. python3 run_many_ridge.py [args]
      Runs num_trials experiments for the given parameters,
      writing the outputs into sims-results/param-i/ for the
      given i.
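
The idea behind step b, running a list of commands with a cap on how
many run at once, can also be sketched in Python for systems without
GNU parallel. This is an illustration only and not what run_sims.sh
does: the function name run_commands and the assumption that the
command list holds one shell command per line are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_commands(commands, max_procs=2):
    """Run shell commands with at most max_procs running simultaneously.

    Returns the exit codes in the same order as the non-empty commands.
    Hypothetical helper; the repository itself uses GNU parallel.
    """
    # Drop blank lines and surrounding whitespace.
    commands = [c.strip() for c in commands if c.strip()]

    def run_one(cmd):
        # Each worker thread blocks on one child process at a time,
        # so the thread count bounds the number of live processes.
        return subprocess.run(cmd, shell=True).returncode

    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        return list(pool.map(run_one, commands))
```

A small max_procs plays the same role as GNU parallel's --max-procs:
it trades total running time for a lower peak memory footprint.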


