mlpc-ucsd / coat
(ICCV 2021 Oral) CoaT: Co-Scale Conv-Attentional Image Transformers
License: Apache License 2.0
Hi,
I'm very impressed by your excellent work! Thanks for sharing your code.
I have questions about the training protocol.
In your paper,
"We train all models with a global batch size of 2048 with NVIDIA Automatic Mixed Precision (AMP) enabled."
but the training script specifies a batch size of 256, not 2048.
I have two questions:
Can I reproduce the reported accuracy in this repo using this command (batch size 256 instead of 2048)?
Does this repo contain AMP support?
Thanks in advance :)
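For reference, enabling AMP in a PyTorch training step typically looks like the sketch below. This is illustrative only and is not the repo's actual training code; the model, optimizer, and data here are placeholders.

```python
import torch

# Minimal AMP training-step sketch (placeholders, not the repo's code).
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(16, 4).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# GradScaler is a no-op when AMP is disabled (e.g. on CPU).
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(8, 16, device=device)
target = torch.randint(0, 4, (8,), device=device)

optimizer.zero_grad()
# Forward pass runs in mixed precision inside the autocast context.
with torch.autocast(device_type=device, enabled=(device == "cuda")):
    loss = torch.nn.functional.cross_entropy(model(x), target)
scaler.scale(loss).backward()  # scale loss to avoid fp16 gradient underflow
scaler.step(optimizer)         # unscale gradients, then take optimizer step
scaler.update()
print(float(loss))
```

Note that the effective global batch size also depends on the number of GPUs and any gradient accumulation, which may explain a 256-per-script vs. 2048-global discrepancy.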
Hi, Author.
I want to know: are EV and EV(hat) in the paper equivalent or approximate?
Are the EV(hat)^l terms in the second half of Eq. 7 and in Eq. 8 equivalent or approximate?
Thank you; I look forward to your answer.
Hello, this is nice work. I see you have released the pre-trained checkpoint for CoaT-Lite. Do you have plans to release the pre-trained checkpoint for CoaT? Thanks!
Hi,
To check reproducibility, I trained the coat_lite_mini model (reported 79.1/94.5) and got 78.85/94.42 using this command:
bash scripts/train.sh coat_lite_mini coat_lite_mini
with the default settings, i.e., a batch size of 256 on 8 GPUs (TITAN RTX).
Is such a small difference (79.1 vs. 78.9) negligible?
My environment :
sys.platform linux
Python 3.7.9 (default, Aug 31 2020, 12:42:55) [GCC 7.3.0]
numpy 1.19.2
Compiler GCC 7.5
CUDA compiler CUDA 10.1
detectron2 arch flags 7.5
DETECTRON2_ENV_MODULE
PyTorch 1.7.0
PyTorch debug build True
GPU available True
GPU 0,1,2,3,4,5,6,7 TITAN RTX (arch=7.5)
CUDA_HOME /usr/local/cuda-10.1
Pillow 8.0.1
torchvision 0.8.0
torchvision arch flags 3.5, 5.0, 6.0, 7.0, 7.5
fvcore 0.1.2.post20201218
cv2 Not found
PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 10.2
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
- CuDNN 7.6.5
- Magma 2.5.2
- Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
Hi,
Could you share the script for fine-tuning the 384x384 model?
Hi, thanks for your great work. Would you share the tools you used for calculating FLOPs and parameters for detection?
I am very interested in your project. When will your team release the code with the pre-trained model?
Hi,
First of all, congratulations on your acceptance at ICCV 👍.
I have read your updated paper (arXiv v2) and have some questions.
In Table 3 which shows Mask R-CNN results under MMDetection framework,
(1)
Is the FPN 1x result trained with multi-scale training or single-scale training?
(2)
Do you have a plan to release the new implementation of MMDetection?
Thanks in advance :)
Dear Weijian,
I recently read your paper on CoaT, it's really excellent work!
I wish to do some further research based on CoaT Small. However, it is not mentioned in your paper or repo, so I wonder whether you have implemented CoaT Small, and if so, will the model be made available?
Thanks in advance!
@yix081 @xwjabc Thanks for sharing the code base. I have a few queries about the problem I am working on: gender/age classification of a person, i.e., a multi-label recognition problem.
@yix081 @xwjabc Thanks for your work; it has helped me a lot, but I have a few queries.