
Craft Image Adversarial Samples with Tensorflow

THE CODE IS PROVIDED AS-IS. I MAY NOT UPDATE IT ANYMORE, BUT HOPEFULLY IT IS STILL HELPFUL.

DOI

Table of Contents

  1. API
  2. Dependencies
  3. The model
  4. How to Use
  5. Results
  6. More Attacks (outdated)
  7. Related Work
  8. Citation

This repo contains adversarial image crafting algorithms implemented in pure TensorFlow. The algorithms can be found in the attacks folder. The implementation follows the tensor-in, tensor-out principle: each attack returns a TensorFlow operation that can be run via sess.run(...).

API

  • Fast Gradient Method (FGM) basic/iterative

    fgm(model, x, eps=0.01, epochs=1, sign=True, clip_min=0.0, clip_max=1.0)

    If sign=True, use gradient sign as noise, otherwise use gradient values directly. Empirically gradient sign works better.

  • Fast Gradient Method with Target (FGMT)

    fgmt(model, x, y=None, eps=0.01, epochs=1, sign=True, clip_min=0.0, clip_max=1.0)

    The only difference from FGM is that this is a targeted attack, i.e., a desired target can be provided. If y=None, this implements the least-likely class method.

  • Jacobian-based Saliency Map Approach (JSMA)

    jsma(model, x, y, epochs=1, eps=1, clip_min=0, clip_max=1, score_fn=lambda t, o: t * tf.abs(o))

    y is the target label; it can be an integer or a list. When epochs is a floating-point number in the range [0, 1], it denotes the maximum percentage of distortion allowed, and the actual number of epochs is deduced automatically. k denotes the number of pixels to change at a time and should be either 1 or 2. score_fn is the function used to compute the saliency score; it defaults to dt/dx * (-do/dx), and dt/dx - do/dx can also be used.

  • DeepFool

    deepfool(model, x, noise=False, eta=0.01, epochs=3, clip_min=0.0, clip_max=1.0, min_prob=0.0)

    If noise is True, the return value is the noise; otherwise only xadv is returned. Note that in my implementation the noise is calculated as f/||w|| * w instead of f/||w|| * w/||w||, where ||w|| is the L2 norm. It seems that ||w|| is so small that the noise would explode when added. In the original authors' implementation, they add a small value 1e-4 for numerical stability; I guess we might have a similar issue here. In any case, this factor does not change the direction of the noise, and in practice the adversarial noise is still subtle and hard to notice.

  • CW

    cw(model, x, y=None, eps=1.0, ord_=2, T=2,
       optimizer=tf.train.AdamOptimizer(learning_rate=0.1), alpha=0.9,
       min_prob=0, clip=(0.0, 1.0))

    Note that CW differs from the gradient-based methods above in that it is an optimization-based attack. Thus, it returns a tuple, (train_op, xadv, noise). After running train_op for the desired number of epochs, run xadv to get the adversarial images. Please note that it is an OPTIMIZATION-BASED method, which means it is tricky: you will probably need to search for the best parameter configuration per image, otherwise you will NOT get the amazingly good results reported in the paper. It took me a couple of days to realize that the reason for my crappy adversarial images was not that my implementation was wrong, but rather that my learning rate was too small! A usage sketch of this pattern is shown right after this list.
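
As a rough illustration of the CW usage pattern, here is a minimal sketch. It is only a sketch: the module name, placeholder shape, feeding details, and number of optimization steps are assumptions, and model and X_batch are assumed to be defined by you; see the examples folder for complete scripts.

import tensorflow as tf
from cw import cw   # placeholder module name; import from wherever you copied the attack file

# `model` is a wrapper function as described in "The model" section below,
# and X_batch is a numpy array of clean input images in [0, 1].
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
train_op, xadv, noise = cw(model, x, eps=1.0, ord_=2)

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  # ... restore the trained weights of the target model here ...

  # Run the optimization for the desired number of epochs, then fetch xadv.
  for _ in range(100):   # number of steps is illustrative
    sess.run(train_op, feed_dict={x: X_batch})
  x_adv = sess.run(xadv, feed_dict={x: X_batch})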

Dependencies

  1. Python 3; the sample code uses many Python 3 features.
  2. NumPy; only needed by the sample code.
  3. TensorFlow; tested with TensorFlow 1.4.

The model

Notice that every method takes model as its first parameter. model is a wrapper function that creates the target model's computation graph. Its first parameter has to be the input x; other parameters may be added when necessary, but they need to have default values.

def model(x, logits=False):
  # x is the input to the network, usually a tensorflow placeholder
  ybar = ...                    # get the prediction
  logits_ = ...                 # get the logits before softmax
  if logits:
    return ybar, logits_
  return ybar
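
For concreteness, a hypothetical wrapper around a small convolutional network might look like the following. The architecture and shapes here are purely illustrative, not the ones used in the examples.

import tensorflow as tf

def model(x, logits=False):
  # Reuse variables so the wrapper can be called more than once,
  # e.g. once for the clean prediction graph and once inside an attack.
  with tf.variable_scope('model', reuse=tf.AUTO_REUSE):
    z = tf.layers.conv2d(x, filters=32, kernel_size=3, activation=tf.nn.relu)
    z = tf.layers.max_pooling2d(z, pool_size=2, strides=2)
    z = tf.reshape(z, [-1, 13 * 13 * 32])    # assumes 28x28x1 input
    logits_ = tf.layers.dense(z, units=10)   # raw logits before softmax
    ybar = tf.nn.softmax(logits_)            # predicted probabilities
  if logits:
    return ybar, logits_
  return ybar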

How to Use

The implementation of each attack is self-contained and depends only on TensorFlow. Copy the attack's file into the same folder as your source code and import it.

The implementation should work with any framework that is compatible with TensorFlow. Examples are provided in the examples folder; each example is self-contained.
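
For example, a minimal end-to-end sketch with the basic fast gradient method might look like this, assuming the attack file has been copied next to your script (the module name, placeholder shape, and parameter values are assumptions, and model and X_test are assumed to be defined by you):

import tensorflow as tf
from fgm import fgm   # placeholder module name for the copied attack file

# `model` is a trained wrapper function as described above,
# and X_test is a numpy array of clean images in [0, 1].
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
x_adv = fgm(model, x, eps=0.02, epochs=12)   # iterative FGM; values are illustrative

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  # ... restore the trained weights of the target model here ...
  X_adv = sess.run(x_adv, feed_dict={x: X_test})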

Results

  • Comparison of all implemented algorithms.

    img

  • Fast gradient sign method adversarial images on MNIST.

    img

  • Fast gradient value method adversarial images on MNIST.

    img

  • Adversarial images generated by DeepFool.

    img

  • CW L2 targeted attack on a randomly selected image, with binary search for the best eps value.

    img

  • JSMA generates cross-label adversarial images on MNIST. Labels on the left are the true labels; labels on the bottom are the labels predicted by the model.

    img

  • JSMA generates cross-label adversarial images on MNIST, with the difference, i.e., dt/dx - do/dx, as the saliency function.

    img

  • JSMA generates adversarial images from blank images.

    img

More Attacks

The list is outdated.

Related Work

Citation

You are encouraged to cite this code if you use it in your work. See the Zenodo DOI link above.
