Can we train the CNN detector using multiple GPUs? The available arguments cause the p

Multi-GPU training for detector (os2d issue, 6 comments, closed)

saswat0 commented on September 28, 2024

Multi-GPU training for detector

from os2d.

Comments (6)

aosokin commented on September 28, 2024

Hi, this can probably be done, but it will likely require adjusting the code. We've never tried training on multiple GPUs.

saswat0 commented on September 28, 2024

Okay. In my case, I have a 48GB GPU, but training occupies only 8GB of it and takes three days to train the detector. How were you able to do this in a shorter time?

aosokin commented on September 28, 2024

But what is the processing load of your GPU? If it is low, you can try increasing the batch size.
Another idea: it might be fine to train for significantly fewer iterations. You just need to monitor the behaviour of the validation loss so that you can stop the process early.
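
For illustration, the "monitor the validation loss and stop early" idea could look roughly like the sketch below. This is not code from os2d. The function name `early_stop_iteration` and the `patience` and `min_delta` parameters are hypothetical, and the sketch assumes you already collect one validation loss per evaluation step (for example, from the training logs).

```python
from typing import Iterable, Optional


def early_stop_iteration(
    val_losses: Iterable[float],
    patience: int = 3,
    min_delta: float = 0.0,
) -> Optional[int]:
    """Return the index of the evaluation at which training would stop.

    Training stops once the validation loss has failed to improve on the
    best value seen so far by more than `min_delta` for `patience`
    consecutive evaluations. Returns None if that never happens.
    (Hypothetical helper, not part of os2d.)
    """
    best = float("inf")
    stale = 0  # consecutive evaluations without improvement
    for i, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return i
    return None
```

In a real run you would check this condition after each evaluation and keep the checkpoint with the best validation loss, rather than the last one.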

saswat0 commented on September 28, 2024

The GPU runs at 100% capacity, but most of the memory is left idle. Increasing the batch size in the config file isn't reflected in the final parameters. Did you face this issue while experimenting?
For the second approach, the current code doesn't have TensorBoard support. Should I monitor the logs instead?

aosokin commented on September 28, 2024

> The GPU runs at 100% capacity, but most of the memory is left idle. Increasing the batch size in the config file isn't reflected in the final parameters. Did you face this issue while experimenting?

That sounds odd: changing train.batch_size and train.class_batch_size should definitely change the training process; at the very least, GPU memory usage should go up.
However, if the GPU is already at 100% utilization, simply changing the batch size is unlikely to increase training speed.
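
To check compute utilization and memory usage separately while training runs, one option is to query `nvidia-smi` and parse its CSV output. The sketch below is only an illustration and is not part of os2d; `GpuStat`, `parse_gpu_stats`, and `query_gpu_stats` are hypothetical names, and it assumes every queried field comes back as an integer (fields reported as `[N/A]` would need extra handling).

```python
import subprocess
from typing import List, NamedTuple

# nvidia-smi query: one CSV line per GPU, without header or units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


class GpuStat(NamedTuple):
    utilization_pct: int   # compute utilization in percent
    memory_used_mib: int   # memory currently in use, MiB
    memory_total_mib: int  # total memory on the device, MiB


def parse_gpu_stats(csv_text: str) -> List[GpuStat]:
    """Parse lines such as '100, 8192, 49152' into GpuStat records."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, used, total = (int(field.strip()) for field in line.split(","))
        stats.append(GpuStat(util, used, total))
    return stats


def query_gpu_stats() -> List[GpuStat]:
    """Run nvidia-smi (must be on PATH) and return one record per GPU."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_stats(out)
```

A reading like `GpuStat(100, 8192, 49152)` (illustrative numbers, not measured) would mean the GPU is compute-bound with most of its memory free, which is the situation where a larger batch size raises memory usage without necessarily shortening training.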

> For the second approach, the current code doesn't have TensorBoard support. Should I monitor the logs instead?

Yes, the code does not have TensorBoard support, but it includes another visualization tool based on Visdom: os2d/utils/plot_visdom.py

saswat0 commented on September 28, 2024

Got it, thanks!
