Coder Social home page Coder Social logo

antixk / pytorch-model-compare Goto Github PK

View Code? Open in Web Editor NEW
317.0 4.0 37.0 2.72 MB

Compare neural networks by their feature similarity

License: MIT License

Python 100.00%
pytorch deep-learning neural-networks cka transformers imagenet feature-extraction pip torch-cka

pytorch-model-compare's Issues

Works fine with the whole model but raise "NANs" on selected layers.

When I was trying to compare the same model trained on different datasets, I encountered a weird problem:

It works fine when I compare all layers:
cka = CKA(model1, model2, device='cuda', model1_name='model1', model2_name='model2')

But, when I try to compare a selected subset of layers:
cka = CKA(model1, model2, device='cuda', model1_name='model1', model2_name='model2', model1_layers=list(model1.state_dict().keys())[:5], model2_layers=list(model2.state_dict().keys())[:5])
It raises:

HSIC computation resulted in NANs

Do you have any idea how to fix this? Thank you very much.

Better way to implement

I am trying to implement this code to check CKA similarity of ResNet50, MobileNetV3, EfficientNet, and many others. How can we implement it without running through the error?

ValueError: Input image size (32*32) doesn't match model (224*224).

I tried to compare Resnet50 vs ViT but got this error ValueError: Input image size (3232) doesn't match model (224224).
The code I used is

model1 = resnet50(pretrained=False)
layers_Res1 = list(model1.children())
model1 = nn.Sequential(*layers_Res1[:-1])

#model2 = resnet34(pretrained=True)
config = ViTConfig()
model2 = ViTModel(config)

Create a new model with the last two layers removed

layers_trans = list(model2.children())
model2 = nn.Sequential(*layers_trans[:-2])

cka = CKA(model1, model2,
model1_name="ResNet50", model2_name="ViT-B")
#,device='cuda')

cka.compare(dataloader)

cka.plot_results(save_path="/content/drive/MyDrive/resnet50vit-b.png")

getting spurious "HSIC computation resulted in NANs"

Thanks for this great little module! I was able to adapt the code to deal with models suitable for speech recognition (mostly transformers and conformers) and I'm learning a lot from the CKA outputs.

One problem I face is that for some models, some layers hit this assert after a certain number of batches. Basically, if I try to pass 300 batches of 32 through the model, I end up with NaN exception around 150 or so. It doesn't seem related to the data because I shuffle the data and get the same exception after the same number of batches.

I guess this is a numerical stability problem perhaps. Is there some assumptions about the range of the layer features and outputs?

The X means what in the formulation of HSIC

I notice that in the formulation of HSIC , the K are calculated with X . And X denote a matrix of activateions of p1 neurons for n examples . But in the code , it seems like the feature map after this layer . I don't know if i am wrong , could you help me ?
Take an example , use CIFAR-10 as dataset , which images are 3232 . And the first layer is conv2(in_channels=3,out_channels=6,kernel_size=5) , the feature map after this layer could be 62828 , and the neurons could be 65*5 . Which should i take for calculate CKA?
I read the paper and i think it should be the representation , do you think what i said is right?
But it some other project of CKA on githubs , they use the weight of neurons which confuses me . If i want to use traditional CKA to calculate the CKA similarities , where can i find the needed feature map?

Comparision between ResNet50 and ViT gives error

!pip install transformers
import torch
from torchvision import transforms
from torch.utils.data import Dataset, DataLoader
import torch.nn as nn
import torch.nn.functional as F
from transformers import ViTFeatureExtractor, ViTModel, ViTConfig, AutoConfig

Modify the model - ResNet

model_Res = torch.hub.load('pytorch/vision:v0.10.0', 'resnet50', pretrained=True)

Remove the last layer of the model Res

layers_Res = list(model_Res.children())
model_Res = nn.Sequential(*layers_Res[:-1])

Set the top layers to be not trainable

count = 0
for child in model_Res.children():
count += 1
if count < 8:
for param in child.parameters():
param.requires_grad = False

Modify the model - ViT model

model_trans = ViTModel.from_pretrained('google/vit-base-patch16-224-in21k')
count = 0
for child in model_trans.children():
count += 1
if count >= 4:
for param in child.parameters():
param.requires_grad = False

layers_trans = list(model_trans.children())
model_trans_top = nn.Sequential(*layers_trans[:-2])

model1 = model_Res
model2 = model_trans_top

cka = CKA(model1, model2,
model1_name="ResNet50", model2_name="ViT",
device='cuda')

cka.compare(dataloader)

cka.plot_results(save_path="/content/drive/MyDrive/resnet-ViTcompare.png")

i got this error ValueError: Input image size (3232) doesn't match model (224224).

AssertionError: HSIC computation resulted in NANs

I tried comparing many EfficientNet to other models (and its variants), but all I got is this error: AssertionError: HSIC computation resulted in NANs.
One example:

python3 eff_b0b2_compare.py

eff_b0b2_compare.py:

import torch
from torchvision.models import efficientnet_b0, efficientnet_b2 # edit
from torchvision.datasets import CIFAR10
from torch.utils.data import DataLoader
import torchvision.transforms as transforms
import numpy as np
import random
from torch_cka import CKA

def seed_worker(worker_id):
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)

g = torch.Generator()
g.manual_seed(0)
np.random.seed(0)
random.seed(0)

model1_name, model2_name = 'efficientnet_b0', 'efficientnet_b2' # edit
model1 = efficientnet_b0(pretrained=True) # edit
model2 = efficientnet_b2(pretrained=True) # edit

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))])

batch_size = 16 # 256

dataset = CIFAR10(root='../data/',
                  train=False,
                  download=True,
                  transform=transform)

dataloader = DataLoader(dataset,
                        batch_size=batch_size,
                        shuffle=False,
                        worker_init_fn=seed_worker,
                        generator=g,)

cka = CKA(model1, model2,
        model1_name=model1_name, model2_name=model2_name,
        device='cuda')

cka.compare(dataloader)

cka.plot_results(save_path="../exps/{}.jpg".format(model1_name, model2_name))
/home/brcao/.local/lib/python3.8/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/home/brcao/.local/lib/python3.8/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=EfficientNet_B0_Weights.IMAGENET1K_V1`. You can also use `weights=EfficientNet_B0_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
/home/brcao/.local/lib/python3.8/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=EfficientNet_B2_Weights.IMAGENET1K_V1`. You can also use `weights=EfficientNet_B2_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Files already downloaded and verified
/home/brcao/.local/lib/python3.8/site-packages/torch_cka/cka.py:62: UserWarning: Model 1 seems to have a lot of layers. Consider giving a list of layers whose features you are concerned with through the 'model1_layers' parameter. Your CPU/GPU will thank you :)
  warn("Model 1 seems to have a lot of layers. " \
/home/brcao/.local/lib/python3.8/site-packages/torch_cka/cka.py:69: UserWarning: Model 2 seems to have a lot of layers. Consider giving a list of layers whose features you are concerned with through the 'model2_layers' parameter. Your CPU/GPU will thank you :)
  warn("Model 2 seems to have a lot of layers. " \
/home/brcao/.local/lib/python3.8/site-packages/torch_cka/cka.py:145: UserWarning: Dataloader for Model 2 is not given. Using the same dataloader for both models.
  warn("Dataloader for Model 2 is not given. Using the same dataloader for both models.")
| Comparing features |:  28                                                                                    | Comparing features |:  32%|▎| 13                 | Comparing features |:  35%|▎| 14                                                                                                     | Comparing features |:  38%|▍| 15                                  | Comparing features |: 100%|██| 40/40 [3:43:19<00:00, 335.00s/it]^[[B^[[A^[[B^[[A^[[B
Traceback (most recent call last):
  File "eff_b0b2_compare.py", line 45, in <module>
    cka.compare(dataloader)
  File "/home/brcao/.local/lib/python3.8/site-packages/torch_cka/cka.py", line 183, in compare
    assert not torch.isnan(self.hsic_matrix).any(), "HSIC computation resulted in NANs"
AssertionError: HSIC computation resulted in NANs

Any help would be great. Thanks!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.