gongzhitaao / tensorflow-adversarial Goto Github PK

Crafting adversarial images

License: MIT License

Python 100.00%

tensorflow deep-learning adversarial adversarial-images adversarial-texts

tensorflow-adversarial's Introduction

Craft Image Adversarial Samples with Tensorflow

THE CODE IS PROVIDED AS-IS AND MAY NOT BE UPDATED ANYMORE. HOPEFULLY IT IS STILL HELPFUL.

API
Dependencies
The model
How to Use
Results
More Attacks (outdated)
Related Work
Citation

This repo contains adversarial image crafting algorithms implemented in pure TensorFlow. The algorithms can be found in the attacks folder. The implementation adheres to the tensor-in, tensor-out principle: every attack returns a TensorFlow operation that can be run through sess.run(...).

API

Fast Gradient Method (FGM) basic/iterative
```
fgm(model, x, eps=0.01, epochs=1, sign=True, clip_min=0.0, clip_max=1.0)
```
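If sign=True, the gradient sign is used as noise; otherwise the gradient values are used directly. Empirically, the gradient sign works better. With epochs=1 this is the basic single-step method; with epochs greater than 1 the step is applied iteratively.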
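To make the sign and epochs options concrete, here is a minimal NumPy sketch of the update on a toy linear softmax classifier. It is not the repo's TensorFlow code: the names softmax, loss_grad, and fgm_sketch, the weight matrix W, and the choice of the model's own prediction as the label are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def loss_grad(W, x, y):
    """Gradient of the cross-entropy loss w.r.t. x for logits = x @ W."""
    p = softmax(x @ W)
    p[np.arange(len(y)), y] -= 1.0           # dL/dlogits = p - onehot(y)
    return p @ W.T

def fgm_sketch(W, x, eps=0.01, epochs=1, sign=True, clip_min=0.0, clip_max=1.0):
    # Assumption: the label is the model's prediction on the clean input.
    y = softmax(x @ W).argmax(axis=1)
    x_adv = x.copy()
    for _ in range(epochs):                  # epochs=1 is the single-step method
        g = loss_grad(W, x_adv, y)
        noise = np.sign(g) if sign else g
        x_adv = np.clip(x_adv + eps * noise, clip_min, clip_max)
    return x_adv

W = np.eye(3)                                # toy model: logits equal the input
x = np.array([[0.6, 0.3, 0.1]])              # predicted class 0
x_adv = fgm_sketch(W, x, eps=0.05, epochs=1)
print(x_adv)
```

With sign=True every pixel moves by exactly eps per epoch (before clipping), so the maximum per-pixel change after n epochs is bounded by n * eps.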
Fast Gradient Method with Target (FGMT)
```
fgmt(model, x, y=None, eps=0.01, epochs=1, sign=True, clip_min=0.0, clip_max=1.0)
```
The only difference from FGM is that this is a targeted attack, i.e., a desired target class can be provided. If y=None, this implements the least-likely class method.
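The targeted variant can be sketched in the same toy setting. This is an illustration under assumptions, not the repo's implementation: fgmt_sketch, W, and the linear softmax model are hypothetical, and the step simply descends the cross-entropy loss toward the target class. When y is None, the target is the class with the lowest predicted probability.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fgmt_sketch(W, x, y=None, eps=0.01, epochs=1, clip_min=0.0, clip_max=1.0):
    n = len(x)
    if y is None:
        # Least-likely class method: aim at the class the model finds least probable.
        target = softmax(x @ W).argmin(axis=1)
    else:
        # y may be a single integer or one label per sample.
        target = np.broadcast_to(np.asarray(y), (n,))
    x_adv = x.copy()
    for _ in range(epochs):
        p = softmax(x_adv @ W)
        p[np.arange(n), target] -= 1.0       # dL/dlogits for the target label
        g = p @ W.T
        # Descend (minus sign) so the loss toward the target decreases.
        x_adv = np.clip(x_adv - eps * np.sign(g), clip_min, clip_max)
    return x_adv, target

W = np.eye(3)                                # toy model: logits equal the input
x = np.array([[0.6, 0.3, 0.1]])
x_adv, target = fgmt_sketch(W, x, eps=0.05)
print(target, x_adv)
```

Compared with the untargeted step, the only changes are the choice of label and the sign of the update.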
Jacobian-based Saliency Map Approach (JSMA)
```
jsma(model, x, y, epochs=1, eps=1, clip_min=0, clip_max=1, score_fn=lambda t, o: t * tf.abs(o))
```
y is the target label, either an integer or a list. When epochs is a floating-point number in the range [0, 1], it denotes the maximum percentage of distortion allowed, and the number of epochs is deduced automatically. k denotes the number of pixels to change at a time and should only be 1 or 2. score_fn is the function used to calculate the saliency score; it defaults to dt/dx * (-do/dx) and could also be dt/dx - do/dx.
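The role of score_fn can be illustrated with a small NumPy sketch of picking one pixel from precomputed gradients. The function pick_pixel and the masking rule (a pixel qualifies only when dt/dx >= 0 and do/dx <= 0) are assumptions based on the usual saliency-map formulation, not a copy of the repo's code; dt_dx and do_dx stand for the gradients of the target output and of the other outputs with respect to the input.

```python
import numpy as np

def pick_pixel(dt_dx, do_dx, score_fn=lambda t, o: t * np.abs(o)):
    """Return the index of the pixel with the highest saliency score."""
    # Assumed rule: increasing the pixel must help the target class
    # and must not help the other classes.
    mask = (dt_dx >= 0) & (do_dx <= 0)
    score = np.where(mask, score_fn(dt_dx, do_dx), -np.inf)
    return int(np.argmax(score))

dt_dx = np.array([0.5, 0.3, -0.1, 0.4])
do_dx = np.array([-0.1, -0.2, -0.5, 0.3])

by_product = pick_pixel(dt_dx, do_dx)                                   # t * |o|
by_difference = pick_pixel(dt_dx, do_dx, score_fn=lambda t, o: t - o)   # t - o
print(by_product, by_difference)
```

The two score functions can rank the same candidate pixels differently, which is why the choice is exposed as a parameter.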
DeepFool
```
deepfool(model, x, noise=False, eta=0.01, epochs=3, clip_min=0.0, clip_max=1.0, min_prob=0.0)
```
If noise is True, the return value is the noise; otherwise only xadv is returned. Note that in my implementation, the noise is calculated as f/||w|| * w instead of f/||w|| * w/||w||, where ||w|| is the L2 norm. It seems that ||w|| is so small that the noise explodes when it is added. The original authors' implementation adds a small value, 1e-4, for numeric stability, so I guess we might have a similar issue here. In any case, this factor does not change the direction of the noise, and in practice the adversarial noise is still subtle and hard to notice.
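The two noise formulas differ only by a factor of ||w||, which a few lines of NumPy make visible. This is a sketch of the formulas as written above, with hypothetical names noise_paper and noise_repo and made-up values for f and w; it is not extracted from deepfool.py.

```python
import numpy as np

def noise_paper(f, w):
    """f/||w|| * w/||w||: the step described for the original method."""
    norm = np.linalg.norm(w)
    return f / norm * (w / norm)

def noise_repo(f, w):
    """f/||w|| * w: the variant described in this README."""
    norm = np.linalg.norm(w)
    return f / norm * w

f = 2.0
w = np.array([3.0, 4.0])          # ||w|| = 5

r_paper = noise_paper(f, w)
r_repo = noise_repo(f, w)
print(r_paper, r_repo)
```

Both vectors point along w; r_repo equals r_paper multiplied by ||w||. When ||w|| is much smaller than 1, the second division in noise_paper is what makes the magnitude blow up.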
CW
```
cw(model, x, y=None, eps=1.0, ord_=2, T=2,
   optimizer=tf.train.AdamOptimizer(learning_rate=0.1), alpha=0.9,
   min_prob=0, clip=(0.0, 1.0))
```
Note that CW differs from the gradient-based methods above in that it is an optimization-based attack. Thus, it returns a tuple, (train_op, xadv, noise). After running train_op for the desired number of epochs, run xadv to get the adversarial images. Please note that because it is an OPTIMIZATION-BASED method, it is tricky: you probably need to search for the best parameter configuration per image. Otherwise, you will NOT get the amazingly good results reported in the paper. It took me a couple of days to realize that the reason for my crappy adversarial images was not that my implementation was wrong, but rather that my learning rate was too small!
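The learning-rate sensitivity is easy to reproduce on a toy problem. The sketch below is not the repo's cw and not the full CW formulation (no tanh change of variables, no Adam): it runs plain gradient descent on a squared-L2 penalty plus a hinge on the logits, with an identity "model". The function optimize_noise and the parameters c and kappa are hypothetical names chosen for illustration.

```python
import numpy as np

def optimize_noise(x, target, lr, steps=100, c=1.0, kappa=0.1):
    """Minimize ||d||^2 + c * max(0, max_other_logit - target_logit + kappa).

    Toy setting: the model is the identity, so logits = x + d.
    """
    d = np.zeros_like(x)
    for _ in range(steps):
        z = x + d
        others = np.delete(np.arange(len(z)), target)
        j = others[np.argmax(z[others])]     # strongest competing class
        grad = 2.0 * d                       # gradient of the L2 penalty
        if z[j] - z[target] + kappa > 0:     # hinge term is still active
            grad[j] += c
            grad[target] -= c
        d = d - lr * grad
    return x + d

x = np.array([0.6, 0.0, 0.4])                # classified as class 0
x_small_lr = optimize_noise(x, target=2, lr=1e-4)
x_large_lr = optimize_noise(x, target=2, lr=1e-2)
print(np.argmax(x_small_lr), np.argmax(x_large_lr))
```

With the same number of steps, the small learning rate barely moves the input and the prediction stays at class 0, while the larger one reaches the target class. The same budget-versus-step-size trade-off is why per-image tuning matters for optimization-based attacks.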

Dependencies

Python 3; the sample code uses many Python 3 features.
NumPy, only needed in the sample code.
TensorFlow, tested with TensorFlow 1.4.

The `model`

Notice that every method takes model as its first parameter. The model is a wrapper function that creates the computation graph of the target model. Its first parameter has to be the input x; other parameters may be added when necessary, but they need to have default values.

def model(x, logits=False):
  # x is the input to the network, usually a tensorflow placeholder
  ybar = ...                    # get the prediction (after softmax)
  logits_ = ...                 # get the logits before softmax
  if logits:
    # return both the prediction and the pre-softmax logits
    return ybar, logits_
  return ybar

How to Use

The implementation of each attack is self-contained and depends only on TensorFlow. Copy the attack's file to the same folder as your source code and import it.

The implementation should work with any framework that is compatible with TensorFlow. Examples are provided in the examples folder; each example is self-contained.

Results

Comparison of all implemented algorithms.
Fast gradient sign method adversarial images on MNIST.
Fast gradient value method adversarial images on MNIST.
Adversarial images generated by DeepFool.
CW L2 targeted attack on a randomly selected image, with binary search for the best eps value.
JSMA cross-label adversarial images on MNIST. Labels on the left are the true labels; labels on the bottom are the labels predicted by the model.
JSMA cross-label adversarial images on MNIST, with the difference as the saliency function, i.e., dt/dx - do/dx.
JSMA adversarial images generated from blank images.

More Attacks

The list is outdated.

Momentum iterative attack https://arxiv.org/abs/1710.06081
Virtual adversarial https://arxiv.org/abs/1507.00677
CarliniWagner (CW) https://arxiv.org/abs/1608.04644
Elastic net https://arxiv.org/abs/1709.04114
MadryEtAl https://arxiv.org/abs/1706.06083
Fast feature https://arxiv.org/abs/1511.05122
Houdini https://arxiv.org/abs/1707.05373

Related Work

tensorflow/cleverhans Well-maintained adversarial implementation in TensorFlow.
LTS4/DeepFool Authors' code for DeepFool in PyTorch and Matlab.

Citation

You are encouraged to cite this code if you use it for your work. See the above Zenodo DOI link.

tensorflow-adversarial's Issues

Any adversarial attack that sustains after resize attack

Hi,

This is Bala. I have a query regarding adversarial attack.

Is there any adversarial attack whose added noise survives a resize attack? (adversarial image -> convert to a high/low resolution image -> resize back to the original adversarial image size)

Thanks,
Bala

I

Hello, I want to test the jsma function, but I can't find ex_01.py in the repository. Could you give me your ex_01.py? Thank you very much!

what is adversarial data?

Recently I have been working on model compression and on whether it influences model robustness against adversarial samples. I tried ex06.py on CIFAR-10; after generating the adversarial data, it tests against the adversarial test data:
print('Testing against adversarial test data')
score = model.evaluate(X_adv, y_test)
print('\nloss: {0:.4f} acc: {1:.4f}'.format(score[0], score[1]))
I am a little confused: does higher accuracy mean better or worse robustness, or is it unrelated?
Could you please explain a little about adversarial data? Thanks!

ex_00.py for different model

Hi,

I tried to change the model to one similar to this model (https://github.com/radioML/examples/blob/master/modulation_recognition/RML2016.10a_VTCNN2_example.ipynb), which has 2,830,427 total parameters (far more than your simple model). While building the adversarial crafting graph, I received the following error:
Traceback (most recent call last):
File "RML_TOS.py", line 121, in
x_adv = fgsm(_model_fn, x, epochs=9, eps=0.0001)
File "/home/pduraisamy/Adversarial/adversarial/fgsm.py", line 23, in fgsm
ybar = model(x_adv)
File "RML_TOS.py", line 117, in model_fn
logits, = ybar.op.inputs
ValueError: too many values to unpack
Exception AttributeError: "'NoneType' object has no attribute 'path'" in <function remove at 0x7fb07c21f140> ignored

Does this mean it can't handle deep networks? I appreciate your help.

Some thoughts ... most adversarial examples that look OK to humans do so because...

For a typical example

Humans may read it as a "4" only because we know it is handwriting, and handwriting is done with a pen, in strokes.

If I tell you this was not written by hand but printed by a printer,
you would probably tell me it is definitely a "9", not a "4".
(And you might use your common sense: a printer might be running low on ink.)

If I just tell myself that these are not handwriting but prints, ink sprayed on water or on paper made of rubber, many examples don't look strange anymore.

So the difference is probably in the training data.

Error in exp02.py

TypeError: call() got an unexpected keyword argument 'logits'

Dependencies missing

I looked for a dependencies section in the README but didn't see one. Which version of TensorFlow does this code work with?

problem of fgmt: type of label y

I found that you wrote the "fgmt" algorithm in "attacks/fast_gradient.py" but there is no example that uses it. So I changed the file "example/fgsm_mnist.py" to use it, but I ran into the following problem:
ValueError: Cannot convert a partially known TensorShape to a Tensor: (?,)

The changed code is as follows:

with tf.variable_scope('model', reuse=True):
    env.target = tf.placeholder(tf.int32, (), name='target')
    env.x_fgsm = fgmt(model, env.x, env.target, epochs=env.fgsm_epochs, eps=env.fgsm_eps)
...

def make_fgsm(sess, env, X_data, epochs=1, eps=0.01, batch_size=128):
...
    for batch in range(n_batch):
        start = batch * batch_size
        end = min(n_sample, start + batch_size)
        adv = sess.run(env.x_fgsm, feed_dict={
            env.x: X_data[start:end],
            env.target: np.random.choice(n_classes),
            env.fgsm_eps: eps,
            env.fgsm_epochs: epochs})
        X_adv[start:end] = adv
    return X_adv

I guess the type of target is wrong... Hoping for your reply~

One question about deepfool when attack a keras model

I am trying to use the code to attack models implemented in Keras. There is no problem when using fgsm. However, when I use deepfool, I get an error:
Traceback (most recent call last):
File "attack_mnist_keras.py", line 136, in
main()
File "attack_mnist_keras.py", line 127, in main
x_deepfool = deepfool(classifier_attack, x, epochs=3)
File "/home/wangxiaosen/attack_gan/attack_keras/deepfool.py", line 50, in deepfool
name='deepfool')
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/functional_ops.py", line 423, in map_fn
swap_memory=swap_memory)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3224, in while_loop
result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2956, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2893, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/functional_ops.py", line 413, in compute
packed_fn_values = fn(packed_values)
File "/home/wangxiaosen/attack_gan/attack_keras/deepfool.py", line 46, in _f
clip_max=clip_max, min_prob=min_prob)
File "/home/wangxiaosen/attack_gan/attack_keras/deepfool.py", line 165, in _deepfoolx
name='_deepfoolx', back_prop=False)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3224, in while_loop
result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2956, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2893, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/home/wangxiaosen/attack_gan/attack_keras/deepfool.py", line 144, in _body
for i in range(ydim)]
TypeError: 'NoneType' object cannot be interpreted as an integer

I found that line 120 of deepfool.py gets y0 with shape (?, 10) rather than (1, 10). I tried to ignore this and set ydim = 1, but then I get another error:
Traceback (most recent call last):
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: slice index 1 of dimension 0 out of bounds.
[[Node: deepfool/while/_deepfoolx/strided_slice_4 = StridedSlice[Index=DT_INT64, T=DT_FLOAT, begin_mask=0, ellipsis_mask=0, end_mask=0, new_axis_mask=0, shrink_axis_mask=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](deepfool/while/_deepfoolx/stack, deepfool/while/_deepfoolx/strided_slice_4/stack/_147, deepfool/while/_deepfoolx/strided_slice_4/stack_1/_149, deepfool/while/_deepfoolx/strided_slice_4/Cast/_151)]]
[[Node: clip_by_value/_165 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_781_clip_by_value", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "attack_mnist_keras.py", line 136, in
main()
File "attack_mnist_keras.py", line 129, in main
x_adv = make_deepfool(sess, x_deepfool, x, epochs, X_test, 3)
File "attack_mnist_keras.py", line 61, in make_deepfool
K.learning_phase(): 0})
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: slice index 1 of dimension 0 out of bounds.
[[Node: deepfool/while/_deepfoolx/strided_slice_4 = StridedSlice[Index=DT_INT64, T=DT_FLOAT, begin_mask=0, ellipsis_mask=0, end_mask=0, new_axis_mask=0, shrink_axis_mask=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](deepfool/while/_deepfoolx/stack, deepfool/while/_deepfoolx/strided_slice_4/stack/_147, deepfool/while/_deepfoolx/strided_slice_4/stack_1/_149, deepfool/while/_deepfoolx/strided_slice_4/Cast/_151)]]
[[Node: clip_by_value/_165 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_781_clip_by_value", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op 'deepfool/while/_deepfoolx/strided_slice_4', defined at:
File "attack_mnist_keras.py", line 136, in
main()
File "attack_mnist_keras.py", line 127, in main
x_deepfool = deepfool(classifier_attack, x, epochs=3)
File "/home/wangxiaosen/attack_gan/attack_keras/deepfool.py", line 50, in deepfool
name='deepfool')
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/functional_ops.py", line 423, in map_fn
swap_memory=swap_memory)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3224, in while_loop
result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2956, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2893, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/functional_ops.py", line 413, in compute
packed_fn_values = fn(packed_values)
File "/home/wangxiaosen/attack_gan/attack_keras/deepfool.py", line 46, in _f
clip_max=clip_max, min_prob=min_prob)
File "/home/wangxiaosen/attack_gan/attack_keras/deepfool.py", line 166, in _deepfoolx
name='_deepfoolx', back_prop=False)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3224, in while_loop
result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2956, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2893, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/home/wangxiaosen/attack_gan/attack_keras/deepfool.py", line 149, in _body
gk, go = g[k0], tf.concat((g[:k0], g[(k0+1):]), axis=0)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 597, in _slice_helper
name=name)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 763, in strided_slice
shrink_axis_mask=shrink_axis_mask)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 8148, in strided_slice
name=name)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
op_def=op_def)
File "/opt/anaconda/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): slice index 1 of dimension 0 out of bounds.
[[Node: deepfool/while/_deepfoolx/strided_slice_4 = StridedSlice[Index=DT_INT64, T=DT_FLOAT, begin_mask=0, ellipsis_mask=0, end_mask=0, new_axis_mask=0, shrink_axis_mask=1, _device="/job:localhost/replica:0/task:0/device:GPU:0"](deepfool/while/_deepfoolx/stack, deepfool/while/_deepfoolx/strided_slice_4/stack/_147, deepfool/while/_deepfoolx/strided_slice_4/stack_1/_149, deepfool/while/_deepfoolx/strided_slice_4/Cast/_151)]]
[[Node: clip_by_value/_165 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_781_clip_by_value", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

I wonder if you could help me solve this problem. Thanks very much.

Does the fgmt attack work correctly? Do you have a few sample outputs for fgmt?

Is there loop in your fast gradient method?

As in the title.
I noticed that you have "tf.while_loop" in your code.
As far as I know, the fast gradient method is called fast because it doesn't have a loop.
See section 2.1 of "ADVERSARIAL EXAMPLES IN THE PHYSICAL WORLD", which is linked in a comment in your code:
In this paper we refer to this method as “fast” because it does not require an iterative procedure to
compute adversarial examples, and thus is much faster than other considered methods.
I would appreciate it if you could explain where I made a mistake or misunderstood.

Example can't run

When I run the example fgsm_mnist, an error occurs: ImportError: No module named 'attacks'