itu-square / privugger
Privugger (/prɪvʌɡə(r)/) is a privacy risk analysis library for Python programs. Docs and tutorials: https://itu-square.github.io/privugger/
License: Apache License 2.0
When running infer multiple times with the pymc3 backend, we get an error. It seems to be related to re-defining the pymc3 model more than once.
So far we perform inference using pymc3. However, in some situations (e.g., in the absence of conditions) it might be more efficient to sample directly using libraries like scipy.stats. Given that we have specification datatypes for defining distributions (#5), support for different backends can be added. The infer method should use the appropriate backend depending on the analysis to perform.
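A minimal sketch of the idea, assuming a program with no conditions: the output distribution can be obtained by drawing inputs from their priors and running the program on them, without MCMC. The standard-library random module stands in for scipy.stats here, and forward_sample and choose_backend are hypothetical names, not part of privugger.

```python
import random
import statistics


def forward_sample(program, input_samplers, draws=1000, seed=0):
    """Run `program` on inputs drawn from their priors and collect the outputs.

    With no conditions to satisfy, every draw is an independent sample
    from the output distribution, so no MCMC is needed.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(draws):
        inputs = [sampler(rng) for sampler in input_samplers]
        outputs.append(program(*inputs))
    return outputs


def choose_backend(has_conditions):
    # Hypothetical dispatch rule: use MCMC only when conditioning is required.
    return "pymc3" if has_conditions else "direct"


def average(a, b):
    # Example program to analyze: the average of two ages.
    return (a + b) / 2


def uniform_age(rng):
    # Assumed prior for illustration: ages uniform on [0, 100].
    return rng.uniform(0, 100)


samples = forward_sample(average, [uniform_age, uniform_age], draws=5000)
print(choose_backend(has_conditions=False), round(statistics.mean(samples), 1))
```

Whether direct sampling is actually faster than the pymc3 backend for a given program would need to be measured.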
Since the distribution resulting from pv.concatenate has no name, the method add_observation crashes.
We will add a module with functions to compute several privacy measures.
This survey contains a wide variety of metrics we can consider supporting.
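As an illustration of what such a module could contain, here is a sketch of two common information-theoretic measures estimated from discrete samples. The function names are hypothetical, and which metrics the library should support is not decided by this example.

```python
import math
from collections import Counter


def shannon_entropy(samples):
    """Empirical Shannon entropy (in bits) of a list of discrete samples."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    joint = list(zip(xs, ys))
    return shannon_entropy(xs) + shannon_entropy(ys) - shannon_entropy(joint)


secret = [0, 0, 1, 1]
leaky_output = [0, 0, 1, 1]      # output reveals the secret completely
constant_output = [7, 7, 7, 7]   # output is independent of the secret

print(mutual_information(secret, leaky_output))
print(mutual_information(secret, constant_output))
```

These plug-in estimators are biased for small sample sizes and do not apply directly to continuous variables, which would need binning or a different estimator.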
Since we are developing a library, it seems more appropriate to call it privug. What do you think?
Perhaps separating the function for building the model and running the inference.
Simply running pip install -r requirements.txt in the develop branch does not properly install all required dependencies. Here are a few details we should add (possibly others as well):
This commit 126c7a0 adds support for 1d (integer) vector observations. This is useful for analyzing programs whose output is a 1d vector.
We must extend the enhancement to vectors of other types, e.g., float.
Additionally, we could consider extending observations to vectors of higher dimensions.
We should add support for conditions on random variables (modulo the selected backend).
Alter /docs/index.rst so that the front page and nav bar look more like num.pyro.ai.
We should add more examples of using privugger that serve as a guide for users of the library.
We can easily allow the user to rename the output distribution by adding a name parameter to the Program constructor.
It should be possible to concatenate and stack distributions over different axes.
This is a child issue of #4.
We should have a pip-based installation guide in the documentation. Then, we should remove the line sys.path.append(os.path.join("../../..")) from the notebooks.
We should support the definition of the input parameters of the program to analyze using a specification format. Dictionary formats like JSON are a good candidate.
## Spec (step 1)
names = {
    'dist': Uniform,
    'range': (0, 50),
    # ...
}
ages = {
    'dist': Normal,
    'mu': 0,
    'sigma': 1,
    # ...
}
# ...
These specifications will be used to create input datatypes as follows:
import privug as pv
pv_ds=pv.Dataset(input_spec=[names,ages], var_names=['names','ages',...])
# or
p1=pv.float(ages)
# or
p2=pv.int(names)
These will be finally used by the inference method together with the target program, e.g.:
pv.infer(target_program,pv_ds)
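If the specification were stored as JSON, as suggested above, loading it might look like the following sketch. The schema (distribution names as strings, a list for range) and the helpers load_specs and draw are assumptions made for illustration; they are not the format or API adopted by the library.

```python
import json
import random

# Hypothetical mapping from the 'dist' field to a sampler.
SAMPLERS = {
    "Uniform": lambda rng, spec: rng.uniform(*spec["range"]),
    "Normal": lambda rng, spec: rng.gauss(spec["mu"], spec["sigma"]),
}


def load_specs(text):
    """Parse a JSON object mapping variable names to distribution specs."""
    specs = json.loads(text)
    for name, spec in specs.items():
        if spec.get("dist") not in SAMPLERS:
            raise ValueError(
                f"unknown distribution for {name!r}: {spec.get('dist')!r}"
            )
    return specs


def draw(specs, rng):
    """Draw one value per variable according to its spec."""
    return {
        name: SAMPLERS[spec["dist"]](rng, spec) for name, spec in specs.items()
    }


spec_text = """
{
  "names": {"dist": "Uniform", "range": [0, 50]},
  "ages":  {"dist": "Normal", "mu": 0, "sigma": 1}
}
"""

specs = load_specs(spec_text)
sample = draw(specs, random.Random(0))
print(sample)
```

Validating the spec at load time means a misspelled distribution name fails early, before any inference is started.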
Currently, we need way too many imports to access the different packages in the library, e.g., data structures, distributions, etc. It would be better to have just one import with everything.
The opendp library has updated its API. We should update our notebook using opendp.
We should add example cases where using privugger is useful, along with links to the documentation page.
This was triggered due to pymc3 not supporting the new ARM64 architecture in (some) modern laptops
We should have a module for attacker synthesis based on the work by @Pluttodk.
We should create and make publicly available a documentation page with the documentation of the API as well as tutorials and examples. Ideally, we should use Read the Docs.
When we run the analysis more than once on two different programs, the analysis doesn't seem to work well.
Related to the program transformation work that @RasmusCarl did, we should have an infer method that takes as input a Python program and a specification (as defined in #5) and returns the inferred trace. Something like:
import privug as pv
pv.infer(target_program, input_spec)
Hi, I was running the examples from the tutorial.
I got to this point:
trace = pv.infer(program,
                 cores=4,
                 draws=10_000,
                 method='pymc3')
This runs for a while but then fails with:
/python3.8/site-packages/privugger/data_structures/program.py", line 54, in add_observation
vals = re.search(cons, constraints)
NameError: name 're' is not defined
I guessed that the regex module was probably missing, so I tried adding the import myself.
Then I ran it again, to which I got:
python3.8/site-packages/privugger/data_structures/program.py", line 121, in inner
pm.Normal(f"cons_{i}", distribution, precision, observed=value)
NameError: name 'pm' is not defined
Could some imports be missing? Or am I maybe doing something wrong?
@CorentinPhilippe-Taylor noticed that when concatenating a pv.Constant random variable which is supposed to be continuous, pymc tries to use NUTS for that variable. Then it crashes as the gradient is 0.
We should add support for analyzing a function that is not provided via a .py file.
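One possible shape for this, sketched under the assumption that the program could be passed either as a function object or as a source string: load_program is a hypothetical helper, not an existing privugger function.

```python
def load_program(program, name=None):
    """Return a callable from either a function object or a source string.

    `name` selects which function to take from the source string; it may be
    omitted when the source defines exactly one function.
    """
    if callable(program):
        return program
    namespace = {}
    # exec runs arbitrary code, so this should only be used on trusted source.
    exec(compile(program, "<privugger-input>", "exec"), namespace)
    if name is not None:
        return namespace[name]
    funcs = [
        value
        for key, value in namespace.items()
        if callable(value) and not key.startswith("__")
    ]
    if len(funcs) != 1:
        raise ValueError("pass `name` when the source defines several functions")
    return funcs[0]


source = "def average(a, b):\n    return (a + b) / 2\n"
avg_from_source = load_program(source)
avg_from_callable = load_program(lambda a, b: (a + b) / 2)
print(avg_from_source(2, 4), avg_from_callable(2, 4))
```

If the analysis relies on transforming the program's source code, a function object would additionally need its source to be recoverable, which this sketch does not address.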
It might be useful to be able to sample from the prior to compare leakage between prior/posterior
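The comparison could look like the following sketch, which uses plain rejection sampling instead of the pymc3 backend and a toy program that reveals the parity of a secret. The prior, the program, and the guess_probability measure are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def sample_prior(rng):
    # Assumed prior: the secret is a uniform integer in 0..9.
    return rng.randrange(10)


def program(secret):
    # Toy program that reveals only the parity of the secret.
    return secret % 2


def guess_probability(samples):
    """Relative frequency of the most common value: the success rate of an
    attacker who guesses the most likely secret in one try."""
    return Counter(samples).most_common(1)[0][1] / len(samples)


rng = random.Random(0)
prior = [sample_prior(rng) for _ in range(20000)]
# Rejection sampling: keep only prior draws consistent with observed output 0.
posterior = [s for s in prior if program(s) == 0]

print(guess_probability(prior), guess_probability(posterior))
```

The gap between the two numbers is one way to quantify how much observing the output helps an attacker; other measures could be applied to the same two sample sets.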