
nimarb / pytorch_influence_functions


This is a PyTorch reimplementation of influence functions from the ICML 2017 best paper: "Understanding Black-box Predictions via Influence Functions" by Pang Wei Koh and Percy Liang.

License: Other

Python 100.00%
deep-learning influence-functions pytorch pytorch-implementation

pytorch_influence_functions's People

Contributors: expectopatronum, ml-illustrated, nimarb, zaydh

pytorch_influence_functions's Issues

run time

How long does it take to run on CIFAR with nearly 100 test points, the full 50k training points, recursion_depth = 6000, and r = 10? Also, when I run this on a single GPU I always get an OOM error. Are there any workarounds for this?

Influences are all 0?

Hi,

I trained CIFAR-10 via /examples, with the simple model from train_influence_functions.py. Then, after running test_influence_functions.py, all the influence values are -0.0 for all 50,000 images (and -0.0 is effectively 0).

Did you encounter this problem?

Improve Numerical Stability of Function calc_loss

Summary: We can improve the numerical stability/accuracy of the calc_loss method.

The current implementation uses the following:

def calc_loss(y, t):
    y = torch.nn.functional.log_softmax(y)
    loss = torch.nn.functional.nll_loss(
        y, t, weight=None, reduction='mean')
    return loss

PyTorch includes a single functional, cross_entropy, that is numerically more stable. It would also simplify the above code to:

def calc_loss(y, t):
    loss = torch.nn.functional.cross_entropy(y, t, weight=None, reduction="mean")
    return loss
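
As a quick sanity check that the two formulations agree (an illustrative snippet, not part of the repo; random logits and targets):

import torch
import torch.nn.functional as F

y = torch.randn(8, 10)             # random logits
t = torch.randint(0, 10, (8,))     # random class targets
old = F.nll_loss(F.log_softmax(y, dim=1), t, reduction='mean')
new = F.cross_entropy(y, t, reduction='mean')
assert torch.allclose(old, new)    # identical up to floating-point error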

some question about s_test

In s_test, params, names = make_functional(model) deletes the parameters of the model, but I find that load_weights(model, names, new_params) cannot load the params, whereas load_weights(model, names, params, as_params=True) can load the model weights. But when I use the latter, hvp reports the error: RuntimeError: The output of the user-provided function is independent of input 0. This is not allowed in strict mode.
Although load_weights(model, names, new_params) runs, the parameters are not loaded. Is this reasonable?

Issue with continuous-value-predicting LSTM results

As a first step in using these tools, I am trying to get training set influence for a small LSTM (~1000 weights) and a toy-sized data set (train_n and test_n = 100).

After making very few adjustments (i.e., changing the nll_loss to mse_loss; sketched below the snippet), I can get results by running the following:

ptif.calc_influence_function.calc_influence_single(
    model,
    train_loader,
    test_loader,
    np.argmax(y_test).item(), # trying to get influence w.r.t. the most extreme test set instance
    gpu = -1,
    recursion_depth = 1,  # setting this up to 5 also works but takes longer, anything above 5 seems memory-prohibitive
    r = 1 # setting this up to 5 also works but takes longer, anything above 5 seems memory-prohibitive
)
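
For reference, the nll_loss-to-mse_loss adjustment mentioned above presumably amounts to something like this in calc_loss (a hedged sketch, not the exact code from the fork):

import torch

def calc_loss(y, t):
    # regression variant: mean squared error instead of NLL/cross-entropy
    return torch.nn.functional.mse_loss(y, t, reduction='mean')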

The results are consistent between multiple calls to the function, but they correlate very poorly with leave-one-out training results.

[image: ptif_influence_results]

Do you know if I am using this incorrectly, or if there is some fixable reason why the implementation may perform poorly for an LSTM predicting continuous values? (Most examples that I see use CNN architectures for image classification.)

If useful, my fork containing those minimal changes can be found here - https://github.com/jdiaz4302/pytorch_influence_functions

Usage of loss which is not CrossEntropy

Hi,

Thank you very much for this implementation!

I noticed that the only supported loss is CrossEntropyLoss. Since I'd need your module to compute influence functions on other kinds of losses, how about adding the possibility to use a different loss, perhaps passed as a param in the config dict?

I was thinking about the following changes (to avoid disruptive changes to the API):

def s_test(z_test, t_test, model, z_loader, gpu=-1, damp=0.01, scale=25.0,
           recursion_depth=5000):

to

def s_test(z_test, t_test, model, z_loader, gpu=-1, damp=0.01, scale=25.0,
           recursion_depth=5000, loss_fn=compute_loss):

And then change the loss computation

loss = calc_loss(y, t)

to

loss = loss_fn(y, t)

with something analogous for grad_z.

It would obviously be necessary to propagate the loss_fn param to the functions that call grad_z and s_test.
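
For illustration, a minimal sketch of how loss_fn could thread through grad_z as well (the body paraphrases the repo's grad_z; the default uses the repo's calc_loss, standing in for compute_loss above):

import torch

def grad_z(z, t, model, gpu=-1, loss_fn=calc_loss):
    # gradient of the loss at one training point w.r.t. all model parameters
    model.eval()
    if gpu >= 0:
        z, t = z.cuda(gpu), t.cuda(gpu)
    y = model(z)
    loss = loss_fn(y, t)
    params = [p for p in model.parameters() if p.requires_grad]
    return list(torch.autograd.grad(loss, params, create_graph=True))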

I'd be glad to create a new branch and open a pull request with this change!

Thanks :)

why large influence is harmful

Is there any way to evaluate the influence function values? I found that the label of the most helpful training image in CIFAR-10 is 2, while the test image's label is 4.

Overwriting h_estimate rather than updating it

Referring to this line in this repo:

_v + (1 - damp) * h_estimate - _hv / scale

The implementation of this line in the TF repo (https://github.com/kohpangwei/influence-release/blob/578bc458b4d7cc39ed7343b9b271a04b60c782b1/influence/genericNeuralNet.py#L500) is cur_estimate = [a + (1-damping) * b - c/scale for (a,b,c) in zip(v, cur_estimate, hessian_vector_val)]; compared to that, the line in this repo seems to be incorrect.

Rather than using the current estimate in conjunction with the new vector product, the implementation in this repo uses the original estimate over and over.

I believe the fix would be to simply set h_estimate = v.copy() higher up in the function, remove h_init_estimates entirely, and use h_estimate in the for loop, like this:

h_estimate = [_v + (1 - damp) * h_e - _hv / scale
              for _v, h_e, _hv in zip(v, h_estimate, hv)]
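
A self-contained toy sketch of the corrected recursion (hvp here is a generic double-backprop helper, not necessarily the repo's exact implementation):

import torch

def hvp(loss, params, vec):
    # Hessian-vector product via double backprop
    grads = torch.autograd.grad(loss, params, create_graph=True)
    dot = sum((g * v).sum() for g, v in zip(grads, vec))
    return torch.autograd.grad(dot, params)

model = torch.nn.Linear(4, 2)
x, t = torch.randn(8, 4), torch.randint(0, 2, (8,))
params = list(model.parameters())
loss = torch.nn.functional.cross_entropy(model(x), t)
v = [g.detach() for g in torch.autograd.grad(loss, params)]

damp, scale = 0.01, 25.0
h_estimate = [_v.clone() for _v in v]   # start from v; no h_init_estimates needed
for _ in range(100):
    loss = torch.nn.functional.cross_entropy(model(x), t)
    hv = hvp(loss, params, h_estimate)
    h_estimate = [_v + (1 - damp) * h_e - _hv / scale
                  for _v, h_e, _hv in zip(v, h_estimate, hv)]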

Question on influence computation

Hi, thanks for your amazing contribution of this PyTorch implementation!

Question on influence computation: as mentioned in the function calc_s_test_single, s_test = invHessian * nabla(Loss(test_img, model_params)). However, as in Eq. (2) of the paper Understanding Black-box Predictions via Influence Functions, we should compute invHessian * nabla(Loss(train_img, model_params)). It seems that the positions of the train and test image in Eq. (2) are switched.

Did I get anything wrong? Could you please provide more clues on that? Thanks so much for your information!
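
For reference, Eq. (2) of the paper reads as follows (reconstructed; note that H is symmetric, so applying H^{-1} to the test gradient first, as calc_s_test_single does, yields the same scalar):

I_{\mathrm{up,loss}}(z, z_{\mathrm{test}})
  = -\nabla_\theta L(z_{\mathrm{test}}, \hat\theta)^{\top} H_{\hat\theta}^{-1} \nabla_\theta L(z, \hat\theta)
  = -\big( H_{\hat\theta}^{-1} \nabla_\theta L(z_{\mathrm{test}}, \hat\theta) \big)^{\top} \nabla_\theta L(z, \hat\theta)
  = -\, s_{\mathrm{test}}^{\top}\, \nabla_\theta L(z, \hat\theta)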

Wrong behaviour when test_start_index=0

When test_start_index=0, the first 10 test samples are used rather than the first one per class. This is caused by checking if test_false_index is True, which fails for 0. It can be fixed by using None when no index should be used. I fixed it here and can make a pull request if you agree that this is the intended behaviour.
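
For illustration, the truthiness pitfall and the None-based fix (a hypothetical helper, not the repo's code):

def pick_test_samples(test_start_index):
    # a buggy variant would be `if test_start_index:`, which is False for 0,
    # silently treating index 0 as "no index given"
    if test_start_index is not None:   # fixed: only None means "no index"
        return f"start at test sample {test_start_index}"
    return "use the first sample of each class"

print(pick_test_samples(0))     # start at test sample 0
print(pick_test_samples(None))  # use the first sample of each class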

Best regards
Verena

cuda error

/home/zliu106/.local/lib/python3.9/site-packages/pytorch_influence_functions/influence_function.py:69: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  y = torch.nn.functional.log_softmax(y)
Traceback (most recent call last):
  File "/gpfs/fs1/home/zliu106/Documents/influence_function/influence.py", line 109, in <module>
    influences, harmful, helpful = ptif.calc_img_wise(config, model, trainloader, testloader)
  File "/home/zliu106/.local/lib/python3.9/site-packages/pytorch_influence_functions/calc_influence_function.py", line 460, in calc_img_wise
    influence, harmful, helpful, _ = calc_influence_single(
  File "/home/zliu106/.local/lib/python3.9/site-packages/pytorch_influence_functions/calc_influence_function.py", line 315, in calc_influence_single
    s_test_vec = calc_s_test_single(model, z_test, t_test, train_loader,
  File "/home/zliu106/.local/lib/python3.9/site-packages/pytorch_influence_functions/calc_influence_function.py", line 93, in calc_s_test_single
    s_test_vec_list.append(s_test(z_test, t_test, model, train_loader,
  File "/home/zliu106/.local/lib/python3.9/site-packages/pytorch_influence_functions/influence_function.py", line 43, in s_test
    x, t = x.cuda(), t.cuda()
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Clarification about parameters recursion_depth and r

Hi,
thanks for this great package!
I am wondering how to set some of the parameters. In the README you mention:

Combined, the original paper suggests that recursion_depth * r should equal the training dataset size, thus the above values of r = 10 and recursion_depth = 5000 are valid for CIFAR-10 with a training dataset size of 50000 items.

Can you point me to the paragraph in the paper?

I could only find

We pick t to be large enough such that H̃_t^{-1} stabilizes, and to reduce variance we repeat this procedure r times and average results. Empirically, we found this significantly faster than CG.

which does not tell me that recursion_depth * r should equal the training dataset size.
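
For concreteness, the README's rule of thumb would correspond to a config like the following (a hedged illustration using get_default_config and the r_averaging key referenced elsewhere in this repo, not an official recommendation):

import pytorch_influence_functions as ptif

config = ptif.get_default_config()
config['recursion_depth'] = 5000   # t in the paper's notation
config['r_averaging'] = 10         # r repetitions, averaged
# 5000 * 10 = 50000, the CIFAR-10 training set size per the README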

Thanks and best regards,
Verena

Tensor shape

Thank you for your interest in this issue.

I tried to re-implement pytorch_influence_functions in TensorFlow 2.0 code.

I wonder what the tensor shape of the return value from grad_z() is.

I think the return value is a list that includes tensors shaped (feature, batch); is that right?

This is because the return value from grad_z should be multiplied by the inverse Hessian (shape: n_feature x n_feature) when the network is F: R^{feature} -> R^{1} (the Hessian is a symmetric matrix).

Considering the upweighting influence function (loss) = grad_z(z_test, theta)^T * invHessian * grad_z(z, theta), the matrix shapes must be as follows: (batch, feature) * (feature, feature) * (feature, 1).
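
For what it's worth, a toy check of the actual return shape (grad_z in this repo returns one gradient tensor per parameter, shaped like that parameter, rather than a (feature, batch) tensor; flattening yields the single vector used in the influence product):

import torch

model = torch.nn.Linear(4, 2)
y = model(torch.randn(8, 4))
loss = torch.nn.functional.cross_entropy(y, torch.randint(0, 2, (8,)))
grads = list(torch.autograd.grad(loss, model.parameters()))
print([g.shape for g in grads])                   # one tensor per parameter
flat = torch.cat([g.reshape(-1) for g in grads])  # single (n_params,) vector
print(flat.shape)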

A potential typo/bug in averaging s_test?

In calc_s_test_single, it seems like the intended operation is to add all the sampled s_test_i together and divide the sum by r. However, it actually extends the list. Because of this, when I set r > 1, calc_influence_single just completely skips s_test_vec[len(grad_z_vec):]. Am I misunderstanding what this is doing, or is this a typo? I'm quite unsure, because it looks intentional, as the result is explicitly converted to a list of tensors.
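
If the intent really is averaging, a hedged sketch of what calc_s_test_single might do instead of extending the list (assuming s_test_vec_list holds the r estimates, each a list of per-parameter tensors):

def average_s_test(s_test_vec_list):
    # element-wise average of r s_test estimates
    r = len(s_test_vec_list)
    avg = [t.clone() for t in s_test_vec_list[0]]
    for vec in s_test_vec_list[1:]:
        avg = [a + b for a, b in zip(avg, vec)]
    return [a / r for a in avg]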

Missing elements & clarification when calculating s_test

Hello and thank you for the neat implementation!
In influence_function.py there are some things that aren't very clear to me in comparison to the paper.
First of all, the computation for s_test here seems to be missing a multiplication, imo. Shouldn't it be like this?

h_estimate = [ _v + ((1 - damp) * _h_e - _hv / scale)*_v
                      for _v, _h_e, _hv in zip(v, h_estimate, hv)]

according to this:
[image: the paper's recursive estimate, H̃_j^{-1} v = v + (I − ∇_θ² L(z_{s_j}, θ̂)) H̃_{j−1}^{-1} v]

Furthermore, it is not obvious to me why, in this function, you chose to apply grad() a second time after calculating the element-wise product. Is there a difference compared to running it twice beforehand?

Thank you and cheers!

How to prevent NAN influence values?

I am consistently getting NaN values in my output on CIFAR-10 data for recursion depths (r-depth) as low as 15. I noticed that the influence values for r-depth = 1 are big negative values (e.g. -2000), and they explode to -1e36 by r-depth = 10.
I increased the scale value to 500 as suggested in #6, but to no avail. I have also used the default r-depth = 5000. I am using the function calc_influence_single. The averaging value (r) is 1 for all of these, as suggested by #2.

Does anyone have advice or solution to this? Thank you.

How to configure "damp" and "scale"?

Hi, I was wondering what "damp" and "scale" mean and how to configure them.
The README says that their default values are 0.01 and 25, but I don't know how to choose them or what is affected by different values.
Thanks and best regards

The gradient of h_estimate for compute hv

Thanks for the great code.

I changed the loss function to binary cross-entropy, but it runs really slowly as recursion_depth rises. Is it reasonable to use h_estimate.detach() before we pass h_estimate into hv = hvp(loss, list(model.parameters()), h_estimate)? We shouldn't incorporate the gradient of h_estimate with respect to w inside hvp.

I am not sure about it and would appreciate it if you could take a look. Thank you.
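
For concreteness, a hedged sketch of the suggested change inside the recursion (hvp, loss, and h_estimate as in the repo's s_test; whether this is semantically safe is exactly the question above):

# detach so the HVP does not backpropagate through previous iterations' graphs
h_estimate = [h.detach() for h in h_estimate]
hv = hvp(loss, list(model.parameters()), h_estimate)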

Random train sample in s_test

Hi, I was wondering why you take a random training sample in s_test. According to the comment "TODO: do x, t really have to be chosen RANDOMLY from the train set?" you are not certain about that either. Is there some hint in the paper, or why did you implement it like this? Did you have any new insights?
Thanks and best regards
Verena

The sign

tmp_influence = -sum(
    [
        ###################################
        # TODO: verify if computation really needs to be done
        # on the CPU or if GPU would work, too
        ###################################
        torch.sum(k * j).data.cpu().numpy()
        for k, j in zip(grad_z_vecs[i], e_s_test)
        ###################################
        # Originally with [i] because each grad_z contained
        # a list of tensors as long as e_s_test list
        # There is one grad_z per training data sample
        ###################################
    ]) / train_dataset_size

In the final step, the code here actually calculates -1/n · I_up,loss(z, z_test). However, in Eq. (2) of the original paper (https://arxiv.org/abs/1703.04730), the term I_up,loss(z, z_test) does carry a minus sign. So two negatives make a positive, and what is calculated here should be 1/n · I_up,loss(z, z_test). Or did I just misunderstand some part of the code or the paper?

Gap between influence and real loss difference

Hi, I am trying to perform leave-one-out retraining to compare the influence computed by calc_influence_single here with the actual loss difference, computed as the loss before and after retraining. I use calc_loss here to compute the loss. Also, I scale the influence by len(trainset). Surprisingly, I found a big difference between the computed influence and the real loss difference after retraining. For a random pick, test_idx=10 and train_idx_to_remove = 609, I get the following result:

actual_loss_diffs:  tensor(0.7848, device='cuda:0', grad_fn=<SubBackward0>)
predict_loss_diffs:  -0.003923072204589844

These two values don't seem related to me at all.

Thanks in advance for any kind suggestions!

Using another loss function?

In the case I want to use a loss function other than NLL, for example for a regression problem, I would simply change the calc_loss function implementation. Correct?

Default Configuration GPU Setting in the README

In the README, it states:

gpu: Default = -1, -1 for calculation on the CPU otherwise GPU id

If I am understanding the meaning of "default" here, you are referring to the function get_default_config. Looking at that function's definition, the default value appears to be 0.

Assuming my understanding is correct and there is an issue, I am happy to submit a pull request to either update get_default_config to match the README or vice versa, whichever is correct. If I misunderstood the meaning of "default", please feel free to close.

Pytorch TypeError when running examples

Hi,
I am trying to run your example files.
I first ran train_influence_functions.py, which ran fine, and then test_influence_functions.py, where I get the following output and error:

Files already downloaded and verified
Files already downloaded and verified
2022-06-24 10:58:49,474: Running on: 1 images per class.
2022-06-24 10:58:49,475: Starting at img number: 0 per class.

/usr/local/lib/python3.7/dist-packages/pytorch_influence_functions/influence_function.py:70: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
  y = torch.nn.functional.log_softmax(y)

Calc. s_test recursions: [=================================================] 1 / 1
Averaging r-times: [=======================================================] 1 / 1
Calc. influence function: [========================================] 49999 / 50000
Calc. influence function: [========================================] 50000 / 50000

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-17-f738cfd2c21e> in <module>()
      3 trainloader, testloader = load_data()
      4 ptif.init_logging()
----> 5 inf = ptif.calc_img_wise(config, model, trainloader, testloader)

5 frames

/usr/local/lib/python3.7/dist-packages/pytorch_influence_functions/calc_influence_function.py in calc_img_wise(config, model, train_loader, test_loader)
    469         influence, harmful, helpful, _ = calc_influence_single(
    470             model, train_loader, test_loader, test_id_num=i, gpu=config['gpu'],
--> 471             recursion_depth=config['recursion_depth'], r=config['r_averaging'])
    472         end_time = time.time()
    473 

/usr/local/lib/python3.7/dist-packages/pytorch_influence_functions/calc_influence_function.py in calc_influence_single(model, train_loader, test_loader, test_id_num, gpu, recursion_depth, r, s_test_vec, time_logging)
    344         display_progress("Calc. influence function: ", i, train_dataset_size)
    345 
--> 346     harmful = np.argsort(influences)
    347     helpful = harmful[::-1]
    348 

<__array_function__ internals> in argsort(*args, **kwargs)

/usr/local/lib/python3.7/dist-packages/numpy/core/fromnumeric.py in argsort(a, axis, kind, order)
   1112 
   1113     """
-> 1114     return _wrapfunc(a, 'argsort', axis=axis, kind=kind, order=order)
   1115 
   1116 

/usr/local/lib/python3.7/dist-packages/numpy/core/fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
     52     bound = getattr(obj, method, None)
     53     if bound is None:
---> 54         return _wrapit(obj, method, *args, **kwds)
     55 
     56     try:

/usr/local/lib/python3.7/dist-packages/numpy/core/fromnumeric.py in _wrapit(obj, method, *args, **kwds)
     41     except AttributeError:
     42         wrap = None
---> 43     result = getattr(asarray(obj), method)(*args, **kwds)
     44     if wrap:
     45         if not isinstance(result, mu.ndarray):

/usr/local/lib/python3.7/dist-packages/torch/_tensor.py in __array__(self, dtype)
    731         if dtype is None:
--> 732             return self.numpy()
    733         else:
    734             return self.numpy().astype(dtype, copy=False)
    735 

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

I can't figure out why it is not working, as I assume it worked for you and I haven't changed anything.
Does anyone else have this issue, or can anyone tell me how to solve it?
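
A hedged workaround based on the error message: convert the influence values to host-side floats before they reach np.argsort (illustrative; influences as built in calc_influence_single):

import numpy as np
import torch

influences = [inf.cpu().item() if torch.is_tensor(inf) else float(inf)
              for inf in influences]
harmful = np.argsort(influences)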

Don't understand Readme

If the influence function is calculated for multiple test images, the helpfulness is ordered by average helpfulness to the prediction outcome of the processed test samples.

I don't understand this. The PyTorch test dataloader I use has many examples, but the helpful list is always ordered from most positive for that specific test image. There doesn't appear to be any "averaging" happening?

Am I missing something?

GPU Requirements

Hi, thanks for this work! I'm running the CIFAR-10 examples and wonder what the GPU memory requirement is to replicate these experiments. I used a 32 GB Tesla V100 and it failed due to CUDA out of memory, which seems weird to me. Thanks in advance!

GPU Memory Usage Issues

Computations inside s_test seem to run out of memory when tensors are on a GPU device (within 3-4 iterations). Does it have to do with some memory not being freed properly in one of these functions? I've experimented with ridiculously small models and batch sizes.
