juliendenize / eztorch
Library to perform image and video self-supervised learning.
Home Page: https://juliendenize.github.io/eztorch/
License: Other
Hello! First of all, thanks for providing this amazing library and research! I ran into an issue during inference with the provided checkpoint.
I was unable to use the ViViT Tiny checkpoint to run inference: several keys expected in a Lightning checkpoint are missing, such as pytorch-lightning_version, global_step, epoch, and state_dict. So I artificially created the missing keys and moved the whole checkpoint dict under state_dict. After that, I was greeted with another set of missing keys, this time from the model itself:
RuntimeError: Error(s) in loading state_dict for SoccerNetSpottingModel:
Missing key(s) in state_dict: "train_transform.0._transform.0.transforms.2.brightness", "train_transform.0._transform.0.transforms.2.contrast", "train_transform.0._transform.0.transforms.2.saturation", "val_transform.1._transform.0.transforms.2.mean", "val_transform.1._transform.0.transforms.2.std", "test_transform.1._transform.0.transforms.2.mean", "test_transform.1._transform.0.transforms.2.std".
Unexpected key(s) in state_dict: "trunk.transformer.temporal_mask_token", "val_transform.1._transform.2.mean", "val_transform.1._transform.2.std", "test_transform.1._transform.2.mean", "test_transform.1._transform.2.std".
size mismatch for train_transform.0._transform.0.transforms.5.mean: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([3, 1, 1]).
size mismatch for train_transform.0._transform.0.transforms.5.std: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([3, 1, 1]).
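For reference, the wrapping step I described above can be sketched like this (the Lightning version string and the bookkeeping values are placeholders I made up, not values from the repo):

```python
def wrap_as_lightning_ckpt(raw_state_dict, pl_version="1.9.0"):
    """Wrap a bare state_dict in the minimal PyTorch Lightning checkpoint
    layout, using the missing-key names from the error above.
    The version string and the zeroed counters are placeholders."""
    return {
        "state_dict": raw_state_dict,
        "epoch": 0,
        "global_step": 0,
        "pytorch-lightning_version": pl_version,
    }

# Usage sketch: wrapped = wrap_as_lightning_ckpt(torch.load(ckpt_path, map_location="cpu"))
```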
I'm using the inference step arguments from the docs, with the following inference config:
config_path="../eztorch/configs/run/finetuning/vivit"
config_name="vivit_tiny_soccernet_uniform"
...
Do you have any idea what I missed?
Thanks again!
I am interested in running COMEDIAN on my own videos to detect soccer events.
Specifically, I would like to use the already pretrained model for a number of different events.
I have tried to load the model from a Python script:
from hydra import compose, initialize
import hydra

from eztorch.utils.utils import compile_model

# Initialize Hydra and compose the fine-tuning config
with initialize(
    config_path="./eztorch/configs/run/finetuning/vivit",
    version_base="1.1",
):
    config = compose(config_name="vivit_tiny_soccernet_uniform")

# Instantiate the model and apply the repo's compile helper
model = hydra.utils.instantiate(config.model)
model = compile_model(model, config)
Load .pth:
import torch

ckpt_path = "./comedian_vivit_tiny_seed203.pth"
state_dict = torch.load(ckpt_path, map_location="cpu")

# Reshape the normalization stats to the (3, 1, 1) layout the current model expects
state_dict["train_transform.0._transform.0.transforms.5.mean"] = state_dict["train_transform.0._transform.0.transforms.5.mean"].view(3, 1, 1)
state_dict["train_transform.0._transform.0.transforms.5.std"] = state_dict["train_transform.0._transform.0.transforms.5.std"].view(3, 1, 1)

model._orig_mod.load_state_dict(state_dict, strict=False)
The result:
_IncompatibleKeys(missing_keys=['train_transform.0._transform.0.transforms.2.brightness', 'train_transform.0._transform.0.transforms.2.contrast', 'train_transform.0._transform.0.transforms.2.saturation', 'val_transform.1._transform.0.transforms.2.mean', 'val_transform.1._transform.0.transforms.2.std', 'test_transform.1._transform.0.transforms.2.mean', 'test_transform.1._transform.0.transforms.2.std'], unexpected_keys=['trunk.transformer.temporal_mask_token', 'trunk.transformer.temporal_transformer.blocks.4.norm1.weight', 'trunk.transformer.temporal_transformer.blocks.4.norm1.bias', 'trunk.transformer.temporal_transformer.blocks.4.attn.qkv.weight', 'trunk.transformer.temporal_transformer.blocks.4.attn.qkv.bias', 'trunk.transformer.temporal_transformer.blocks.4.attn.proj.weight', 'trunk.transformer.temporal_transformer.blocks.4.attn.proj.bias', 'trunk.transformer.temporal_transformer.blocks.4.norm2.weight', 'trunk.transformer.temporal_transformer.blocks.4.norm2.bias', 'trunk.transformer.temporal_transformer.blocks.4.mlp.fc1.weight', 'trunk.transformer.temporal_transformer.blocks.4.mlp.fc1.bias', 'trunk.transformer.temporal_transformer.blocks.4.mlp.fc2.weight', 'trunk.transformer.temporal_transformer.blocks.4.mlp.fc2.bias', 'trunk.transformer.temporal_transformer.blocks.5.norm1.weight', 'trunk.transformer.temporal_transformer.blocks.5.norm1.bias', 'trunk.transformer.temporal_transformer.blocks.5.attn.qkv.weight', 'trunk.transformer.temporal_transformer.blocks.5.attn.qkv.bias', 'trunk.transformer.temporal_transformer.blocks.5.attn.proj.weight', 'trunk.transformer.temporal_transformer.blocks.5.attn.proj.bias', 'trunk.transformer.temporal_transformer.blocks.5.norm2.weight', 'trunk.transformer.temporal_transformer.blocks.5.norm2.bias', 'trunk.transformer.temporal_transformer.blocks.5.mlp.fc1.weight', 'trunk.transformer.temporal_transformer.blocks.5.mlp.fc1.bias', 'trunk.transformer.temporal_transformer.blocks.5.mlp.fc2.weight', 
'trunk.transformer.temporal_transformer.blocks.5.mlp.fc2.bias', 'val_transform.1._transform.2.mean', 'val_transform.1._transform.2.std', 'test_transform.1._transform.2.mean', 'test_transform.1._transform.2.std'])
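One way I could quiet the unexpected-key report, assuming those entries really are unused extras, is to filter the checkpoint down to keys the model actually has before loading (a sketch; `model` and `state_dict` refer to the snippets above):

```python
def filter_to_model_keys(state_dict, model_keys):
    """Keep only checkpoint entries whose key exists in the target model,
    so load_state_dict(strict=False) reports only genuinely missing keys."""
    model_keys = set(model_keys)
    return {k: v for k, v in state_dict.items() if k in model_keys}

# Usage sketch:
# filtered = filter_to_model_keys(state_dict, model._orig_mod.state_dict().keys())
# model._orig_mod.load_state_dict(filtered, strict=False)
```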
I think the .pth has loaded well apart from a few transforms. I'm not sure whether I'm loading the model in the best way: I want to use it directly from Python, because once trained I want to run it inside a larger process.
If I am right, the model._orig_mod.trunk layer directly extracts the features every 2 frames:
x = torch.randn(1, 3, 128, 224, 224).cuda()
model.eval()
model = model.cuda()

with torch.no_grad():
    feats = model._orig_mod.trunk(x)
    y = model(x)
    assert torch.allclose(feats, y['h'], atol=1e-5)
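For videos longer than a single clip, my plan is to slide a fixed-length window over the frame axis and run the trunk per window; the clip length and stride below are illustrative guesses on my part, not values taken from the repo:

```python
def window_starts(num_frames, clip_len=128, stride=64):
    """Start indices of overlapping fixed-length clips covering a video.
    clip_len/stride are illustrative; COMEDIAN's actual values may differ."""
    if num_frames < clip_len:
        return [0]
    return list(range(0, num_frames - clip_len + 1, stride))

# e.g. a 320-frame video -> clips starting at frames 0, 64, 128, 192
```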
Could you please confirm these points and whether I am doing this right?
Hi!
Thanks for providing this cool code base!
I'm trying to reproduce your results from Table 5 of the paper, where you got 48.1% t-AmAP without step 1 and step 2, and I am struggling with this.
As a starting point, I was able to reproduce good results from your checkpoint, so it does not appear to be a data issue.
Can you share the hyperparameters that led to these results without any pretraining?
I see that the learning rate in the paper was 5·10^-4 while in the yaml it is 0.001, but this did not help the network converge.
Am I missing something?
Did you train on 2×80 GB A100s?
Thanks in advance!
Daniel