dqshuai / metaformer

A PyTorch implementation of "MetaFormer: A Unified Meta Framework for Fine-Grained Recognition". A reference PyTorch implementation of “CoAtNet: Marrying Convolution and Attention for All Data Sizes”

License: MIT License

Python 100.00%
fine-grained-classification pytorch

metaformer's People

Contributors

bakingbrains, dqshuai, ifighting

metaformer's Issues

bert_embedding_cub

Hello author, could you please provide the bert_embedding_cub file? I got an error when training with meta information. Thank you very much!

model weights for MetaFormer-2 fine tuned on iNat 2018

Are the weights for MetaFormer-2 fine-tuned on iNat 2018 available? I'm doing research with this model, and having them would save a lot of training time and compute compared with retraining. Thanks!

CUDA Version?

Which CUDA version is needed to run this project?
Would conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.2 -c pytorch work?
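
If PyTorch is already installed, one way to see which CUDA version it was built against is to ask PyTorch directly. The helper name `describe_cuda` below is made up for illustration; `torch.version.cuda` and `torch.cuda.is_available()` are standard PyTorch attributes. This is only a quick local check, not a statement of which versions the repository requires.

```python
def describe_cuda():
    """Return a one-line summary of the local PyTorch/CUDA setup."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "PyTorch is not installed"
    # torch.version.cuda is None for CPU-only builds.
    return (
        f"torch {torch.__version__}, "
        f"built for CUDA {torch.version.cuda}, "
        f"GPU available: {torch.cuda.is_available()}"
    )


print(describe_cuda())
```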

model zoo

Hi, could you please share the model zoo via Baidu cloud disk? Thanks! @dqshuai

I have a question about "linear embedding" and "non-linear embedding".

Thanks for all your great work!

I have two questions about the paper.

  1. Do Figure 2 on page 4 and Figure 1 on page 10 of the paper refer to the same architecture?

  2. The terms "non-linear embedding" and "linear embedding" are both used to describe how the meta-information is embedded. If the figures refer to the same architecture, what is the intention behind the different designations?
    A neural network alternates layers that perform linear transformations with activation functions that perform non-linear transformations. Is it correct to say that you used the term "non-linear embedding" because the embedding uses the ReLU activation function, which performs a non-linear transformation?
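
For readers following the second question, the distinction can be illustrated with a toy example: a linear embedding is a single matrix multiplication, while a non-linear embedding passes the result through an activation such as ReLU. The weights and function names below are invented for illustration and are not taken from the paper or the repository.

```python
def linear_embed(x, weight):
    """Linear embedding: one matrix-vector product, no activation."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def relu(values):
    """Element-wise ReLU: negative entries become zero."""
    return [max(0.0, v) for v in values]


def nonlinear_embed(x, weight):
    """Non-linear embedding: the same linear map followed by ReLU."""
    return relu(linear_embed(x, weight))


def add(a, b):
    """Element-wise sum of two vectors."""
    return [p + q for p, q in zip(a, b)]


# Hypothetical 2x2 weights and two toy inputs.
W = [[1.0, -1.0], [0.5, 2.0]]
a, b = [1.0, 0.0], [0.0, 1.0]

# A linear map is additive: f(a + b) == f(a) + f(b).
print(linear_embed(add(a, b), W) == add(linear_embed(a, W), linear_embed(b, W)))

# With ReLU the additivity breaks, which is what "non-linear" means here.
print(nonlinear_embed(add(a, b), W) == add(nonlinear_embed(a, W), nonlinear_embed(b, W)))
```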

RuntimeError in

Hi, thanks for your patience.
I'm new here, and when I try to run training on CUB-200 I get the error below.
Could you please help? Thanks.

Here is my run command:
python3 -m torch.distributed.launch --nproc_per_node 6 --master_port 12345 main.py --cfg ./configs/MetaFG_meta_1_224.yaml --batch-size 12 --tag cub-200_v1 --lr 5e-5 --min-lr 5e-7 --warmup-lr 5e-8 --epochs 2500 --warmup-epochs 20 --dataset cub-200 --accumulation-steps 2 --opts DATA.IMG_SIZE 224

And the error

Traceback (most recent call last):
  File "main.py", line 403, in <module>
    main(config)
  File "main.py", line 163, in main
    train_one_epoch_local_data(config, model, criterion, data_loader_train, optimizer, epoch, mixup_fn, lr_scheduler)
  File "main.py", line 210, in train_one_epoch_local_data
    outputs = model(samples,meta)
  File "/home/miniconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/miniconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/miniconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/MetaFormer/models/MetaFG_meta.py", line 231, in forward
    x = self.forward_features(x,meta)
  File "/home/MetaFormer/models/MetaFG_meta.py", line 171, in forward_features
    metas = torch.split(meta,self.meta_dims,dim=1)
  File "/home/miniconda3/envs/pytorch/lib/python3.8/site-packages/torch/functional.py", line 156, in split
    return tensor.split(split_size_or_sections, dim)
  File "/home/miniconda3/envs/pytorch/lib/python3.8/site-packages/torch/tensor.py", line 499, in split
    return super(Tensor, self).split_with_sizes(split_size, dim)
RuntimeError: split_with_sizes expects split_sizes to sum exactly to 32 (input tensor's size at dimension 1), but got split_sizes=[4, 3]
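
The final line of the traceback states the constraint that failed: when `torch.split` is given a list of sizes, the sizes must add up to the length of the tensor along the chosen dimension. Here the metadata tensor has 32 columns while the configured sizes are `[4, 3]`, which sum to 7. A minimal pure-Python sketch of that check follows; `check_split_sizes` is a hypothetical helper, not part of PyTorch or the repository, and it does not say which side (the data or the config) is wrong.

```python
def check_split_sizes(dim_size, split_sizes):
    """Mimic the precondition of splitting a dimension into fixed-size parts.

    Returns the (start, end) column ranges, or raises ValueError when the
    sizes do not sum exactly to dim_size.
    """
    total = sum(split_sizes)
    if total != dim_size:
        raise ValueError(
            f"split sizes {split_sizes} sum to {total}, "
            f"but the dimension has size {dim_size}"
        )
    ranges, start = [], 0
    for size in split_sizes:
        ranges.append((start, start + size))
        start += size
    return ranges


# Sizes that match a 7-column metadata vector split cleanly.
print(check_split_sizes(7, [4, 3]))

# The combination from the traceback fails the same precondition.
try:
    check_split_sizes(32, [4, 3])
except ValueError as err:
    print(err)
```

Running a check like this on the metadata width before the forward pass would surface the mismatch with a clearer message than the error raised deep inside the model.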

about pre-trained checkpoint

Very nice job!

Could you please provide some pre-trained checkpoints? For example, MetaFormer-1 with 92.3% CUB accuracy and MetaFormer-2 with 92.9% CUB accuracy?

I appreciate your generosity!

Checkpoint on iNaturalist 2018

Thanks for your wonderful work. If it is possible, could you please share your checkpoints on iNaturalist 2018? Thank you very much!

metadata-generation-failed

Hello, I ran into a problem while setting up the environment:
cwd: /home/gpu/PycharmProjects/MetaFormer-master/apex/
Preparing metadata (setup.py) ... error
error: metadata-generation-failed
× Encountered error while generating package metadata.

Could you please take a look at what went wrong? I followed your README file at every step. Thank you.

bert_embedding_cub

Hello, could you please provide the bert_embedding_cub file? Thank you very much!

Regarding Inferencing.

I trained the model for 28 epochs on the CUB-200 dataset and wrote a small inference script that accepts a single image along with its meta information. However, its predictions are poor. Do I need to train for longer?

Any suggestions here?

Thank you.

Regarding embedding files for cub.

Could you please tell me whether this is the right way to generate the embeddings?

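from transformers import AutoTokenizer, AutoModel
import pickle

# Read the whole text file as a single string.
text_file = "file.txt"

with open(text_file, 'r') as rr:
    data = rr.read()

# Load the pretrained BERT tokenizer and encoder.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Tokenize the text and run it through BERT.
inputs = tokenizer(data, return_tensors="pt")
outputs = model(**inputs)
# outputs[0] is the last hidden state, with one vector per token.
print(outputs[0])

embeddings = {'embedding_words': outputs[0].detach().numpy()}

# Save the embeddings to a pickle file.
with open('filepickle', 'wb') as pkl:
    pickle.dump(embeddings, pkl)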

Thank You

NAbirds dataset

Hello, could you provide the NABirds dataset? I cannot download it from the official website. Thank you very much.

About running on one GPU

I have only one GPU. I set local_rank=-1, but the code failed to run.
What do I need to change to run successfully on a single GPU?

Danish Fungi 2020 - Performance Evaluation

Dear All,

This is more a feature request than a bug. Could you test your method on the DF20 dataset? The dataset includes much more metadata, so a performance evaluation with a regular ViT architecture might be interesting.

The link to the paper and web follows:

Best,
Lukas

Some errata found in the code

Hi,
Thanks for sharing this wonderful work :)

While running the code, I found a minor erratum when building the data augmentation module.

from timm.data.transforms import _pil_interp

I guess this line should be replaced with
from timm.data.transforms import str_to_pil_interp
since there is no _pil_interp in timm.data.transforms. https://github.com/rwightman/pytorch-image-models/blob/master/timm/data/transforms.py

If there is something I missed, please let me know :)
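
One common way to cope with a helper that was renamed between library versions is to try each known name in turn. The sketch below does this generically with `importlib`; `first_available` is a hypothetical helper, and the two timm names are the ones quoted in this issue. Which name exists depends on the installed timm version.

```python
import importlib


def first_available(module_name, candidate_names):
    """Return the first attribute from candidate_names found in the module."""
    module = importlib.import_module(module_name)
    for name in candidate_names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"none of {candidate_names} found in {module_name}")


# Demonstration with the standard library: "no_such_name" is skipped.
sqrt = first_available("math", ["no_such_name", "sqrt"])
print(sqrt(9.0))

# With timm installed, the same pattern would read (names from the issue):
# _pil_interp = first_available(
#     "timm.data.transforms", ["str_to_pil_interp", "_pil_interp"]
# )
```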

Is it fair to use a larger pretrained model?

Hi,
First of all, congratulations on your great work!

I have always worried about the effect of pretraining on FGVC. There is a high risk that the pretraining dataset overlaps with the fine-tuning dataset. Take CUB as an example: it has already been found that the CUB-200-2011 test set shares images with the ImageNet-1k training set (see here). It is therefore quite possible that CUB overlaps even more with ImageNet-21k and iNaturalist. So there seem to be two possible explanations for the obvious improvement when using a model pretrained on a larger dataset:

  1. The pretrained model may have learned generally useful structures that improve performance on the CUB task. This is good.
  2. The model pretrained on a larger dataset has simply seen more of the CUB test images, so it performs well. This is bad.

So what is your opinion about this risk?
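
A rough first check for the second scenario is to look for byte-identical images shared by the pretraining set and the test set. The sketch below hashes file contents with SHA-256; the function names and the in-memory toy data are assumptions for illustration. It only catches exact copies, so resized or re-encoded duplicates would need a perceptual comparison instead.

```python
import hashlib


def content_hashes(images):
    """Map SHA-256 digest -> name for a dict of {name: raw image bytes}."""
    return {hashlib.sha256(data).hexdigest(): name for name, data in images.items()}


def exact_overlap(pretrain_images, test_images):
    """Return sorted (pretrain_name, test_name) pairs with identical bytes."""
    pretrain = content_hashes(pretrain_images)
    pairs = []
    for name, data in test_images.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in pretrain:
            pairs.append((pretrain[digest], name))
    return sorted(pairs)


# Toy stand-ins for file contents; real use would read bytes from disk.
pretrain_set = {"train_001.jpg": b"bird-a", "train_002.jpg": b"bird-b"}
test_set = {"test_001.jpg": b"bird-b", "test_002.jpg": b"bird-c"}

print(exact_overlap(pretrain_set, test_set))
```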

About how to get meta data?

Hi, I downloaded the NABirds data from the link you provided, but I can't find the meta data. Could you tell me how to get it? Thanks!

Regarding object detection.

Does it work well for detection and localization tasks? (Judging from the paper, it should work well for classifying objects with meta information.)

Any suggestions here?

meta data

Hi, may I ask where the meta data came from? Time, latitude and longitude are not provided in the dataset.

About how to get meta data?

Hi, I downloaded the CUB-200 data from the link you provided, but I can't find the meta data. Could you tell me how to get it? Thanks!
