Comments (6)

PCJohn commented on August 15, 2024

All other models are trained by finetuning the baseline model (starting from the bdd_peds.pth checkpoint).
The baseline model is trained "from scratch", but it does use the pretrained ResNet initialization. You'll have to download this model.

See the section "Download Pretrained Backbone Model" in INSTALL.md: https://github.com/AruniRC/detectron-self-train/blob/master/INSTALL.md

from detectron-self-train.

PCJohn commented on August 15, 2024

This is the path to the baseline model.
You can download it from this location: http://maxwell.cs.umass.edu/self-train/models/bdd_ped_models/bdd_baseline/bdd_peds.pth

The link is also in the table in the "Models" section of the README.


liyunsheng13 commented on August 15, 2024

Do you mean that the baseline model is used as initialization to train the other models? For the baseline model itself, though, I can't find any initialization model in the train script. Is this an issue, or is it done somewhere else in the source code? I think that when training the baseline model, you need to at least use the pretrained ResNet as initialization, but I can't find where you do that.
I trained the baseline model with your code for 70000 iterations and only got 10 mIoU, which is worse than the reported result (~15). Do you think this is caused by random initialization?


AruniRC commented on August 15, 2024

@liyunsheng13 it may be a good first step to make sure you have installed everything correctly and the inference demo is working: https://github.com/AruniRC/detectron-self-train#inference-demo

If the demo is working and giving you the expected detection output, then the training scripts should work properly. If there is any further confusion please let us know.

BTW, the line that loads the ImageNet-pretrained ResNet weights for training the BDD baseline is in the config YAML: https://github.com/AruniRC/detectron-self-train/blob/master/configs/baselines/bdd100k.yaml#L7


liyunsheng13 commented on August 15, 2024

Hi, when I use the train script "bdd_source_and_HP18k.sh", I find that NUM_GPU is 1. Is there a typo here? I thought it would be 4 or 8. If I use 1, I get an assertion error. Could you let me know how many GPUs you used and the batch size per GPU for both "bdd_source_and_HP18k.sh" and the baseline results? It seems that you used 8 GPUs with batch size = 1 for the baseline results, which left me a little confused.


AruniRC commented on August 15, 2024

Hi @liyunsheng13 ,

The Detectron train_net_step script scales the learning rate and other settings based on (a) the number of GPUs available and (b) the NUM_GPU value specified in the training config YAML.
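As a rough illustration of this kind of scaling, the sketch below applies the linear scaling rule: the learning rate is multiplied by the ratio of the run-time batch size to the batch size the config was written for, and the iteration count is scaled inversely. The function name `scale_schedule`, its parameters, and the numbers in the example are hypothetical; this is not a copy of the repository's train_net_step code, which may scale additional settings.

```python
def scale_schedule(base_lr, max_iter, cfg_num_gpus, cfg_ims_per_gpu,
                   run_num_gpus, run_ims_per_gpu):
    """Hypothetical helper illustrating the linear scaling rule.

    cfg_* describe the batch size the config YAML was written for;
    run_* describe the batch size actually available at run time.
    """
    cfg_batch = cfg_num_gpus * cfg_ims_per_gpu
    run_batch = run_num_gpus * run_ims_per_gpu
    if cfg_batch <= 0 or run_batch <= 0:
        raise ValueError("batch sizes must be positive")

    # Smaller run-time batch -> smaller learning rate, more iterations.
    ratio = run_batch / cfg_batch
    scaled_lr = base_lr * ratio
    scaled_max_iter = int(round(max_iter / ratio))
    return scaled_lr, scaled_max_iter


# Made-up numbers: a config written for 8 GPUs, run on 1 visible GPU.
lr, iters = scale_schedule(base_lr=0.01, max_iter=70000,
                           cfg_num_gpus=8, cfg_ims_per_gpu=2,
                           run_num_gpus=1, run_ims_per_gpu=2)
print(lr, iters)
```

The point of the sketch is that the YAML can stay untouched: only the run-time GPU count changes, and the schedule follows from the ratio.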

When we trained, we kept the YAML unchanged and set the number of GPUs at run time (this ensures that the learning rate scaling is handled correctly inside the code). On a cluster, this is set with the Slurm option --gres=gpu:1, which makes 1 GPU visible. Similarly, on a local machine, we used CUDA_VISIBLE_DEVICES to expose a single GPU. If this solves your assertion error, let us know, and we will update the README accordingly.
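For a local machine, one way to limit the visible GPUs from inside Python is sketched below. The helper name `restrict_visible_gpus` is hypothetical and not part of the repository. Note that CUDA_VISIBLE_DEVICES takes device indices rather than a count, so the value "0" exposes only the first GPU, and the variable has to be set before any CUDA library is initialized in the process.

```python
import os


def restrict_visible_gpus(device_ids):
    """Hypothetical helper: expose only the given GPU indices to CUDA.

    Call this before importing or initializing any CUDA-using library,
    otherwise the setting has no effect on the current process.
    """
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Expose a single GPU (device index 0) to the training process.
visible = restrict_visible_gpus([0])
print(visible)
```

Setting the same variable in the shell before launching the training script has the same effect and avoids any ordering concerns inside Python.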

Also, the baseline BDD detector used a standard training pipeline, and we used 4 to 8 GPUs. For all other models (HP, HP-cons, etc.) we used a single GPU. Note: the config YAMLs are unchanged; only the run-time settings are changed.

I'll tag @PCJohn for any additional comments, and confirming that 1 GPU was used when calling the training script on BDD.
