I have let's say a set of 1,000 heavily annotated panoptic domain specific images that


Resources required to fine-tune this model with swin-l (oneformer issue, CLOSED)

th0mas-codes commented on June 12, 2024

Resources required to fine-tune this model with swin-l

from oneformer.

Comments (13)

alder-french-leviton commented on June 12, 2024

Hey @th0mas-codes, I'm in a similar boat, trying to figure out how to fine-tune OneFormer for panoptic segmentation, and I'm curious why you closed this issue. Were you able to figure out how to fine-tune it?

th0mas-codes commented on June 12, 2024

I have managed to train on my custom data, and the initial tests show good results. Getting the data into the correct format took a fairly extensive setup on my side, but I managed.

I was able to turn the image batch size down from 16 to 4 and train on a single A100 40GB GPU. I was expecting to have to start from a checkpoint pretrained on, let's say, the ADE20K dataset and then fine-tune on top of that to get decent results. But for my data, a heavily annotated dataset of 1,000 images with 2 classes, I got decent results just by training from "scratch".

I ended up not moving forward with the HuggingFace pretrained method. Instead, I followed their guide on setting up training with detectron2 and the train_net.py file, with some tweaks, since detectron2 has a lot of neat things that I can use or that are already set up. I'm sure that if you know more about what you are doing, fine-tuning through HuggingFace could be easier, but I'm fairly new to all this.
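For anyone following the same detectron2 route, here is a rough sketch of how the smaller batch size could be passed to train_net.py. It assumes the script uses detectron2's standard argument parser, where trailing `KEY VALUE` pairs override the config file, and that the setting that was lowered is `SOLVER.IMS_PER_BATCH`, detectron2's key for total images per batch. The config path and output directory are hypothetical placeholders, not the ones used in this thread.

```python
import shlex


def build_train_command(config_file, ims_per_batch=4, num_gpus=1, output_dir="output"):
    """Assemble a detectron2-style train_net.py invocation as an argv list."""
    return [
        "python", "train_net.py",
        "--config-file", config_file,
        "--num-gpus", str(num_gpus),
        # Trailing KEY VALUE pairs override entries from the config file.
        "SOLVER.IMS_PER_BATCH", str(ims_per_batch),
        "OUTPUT_DIR", output_dir,
    ]


# Hypothetical config path; substitute the config for your own dataset.
cmd = build_train_command("configs/my_dataset/oneformer_swin_large.yaml")
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd)` directly, which avoids shell quoting issues.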

alder-french-leviton commented on June 12, 2024

@th0mas-codes Thanks for getting back to me! I was hoping to use one of the pretrained models myself, but it makes sense that doing so is more difficult. It's good to know that training the model from scratch can still work well if I end up going that way. If I figure out how to fine-tune the pretrained model, I'll let you know.

I have one question: what is your dataset format? I am a bit unclear on what exactly the OneFormer model expects. At first I thought it was COCO, but now I'm thinking it's the Detectron2 format. Maybe even the same one used in this DigitalSreeni YouTube video? If so, that would be nice haha.

EDIT: Actually, it seems like Detectron2 uses the COCO format? I was able to figure out what the COCO JSON annotations look like, including their 'segmentation' polygons, by creating an example in Makesense. But there's still one thing I'm unsure of: whether .PNG instance segmentation masks for each image are required for training OneFormer. I would think that the segmentation JSON has all the necessary data, since it describes both the individual segment instances and their classes, yet the COCO train2017 panoptic dataset includes .PNG segmentation masks for each image.
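On the PNG question, one thing that may help: the COCO panoptic format is separate from the COCO instance JSON with polygons. In the panoptic format the JSON holds only per-segment metadata in `segments_info`, while the pixel membership lives in the PNG, where each pixel's color encodes a segment id as `R + 256*G + 256^2*B`. That would explain why the panoptic download ships PNGs. Whether a custom OneFormer dataset strictly needs them is not confirmed in this thread. Below is a minimal sketch of the encoding, using a made-up 2x2 "image" as nested RGB tuples instead of a real PNG file; the annotation values are hypothetical.

```python
from collections import Counter


def rgb_to_segment_id(r, g, b):
    """Decode a COCO panoptic PNG color into its segment id."""
    return r + 256 * g + 256 * 256 * b


def segment_id_to_rgb(segment_id):
    """Encode a segment id as the (R, G, B) color stored in the PNG."""
    return (segment_id % 256, (segment_id // 256) % 256, (segment_id // 256 ** 2) % 256)


# Hypothetical 2x2 panoptic mask: three pixels of segment 1, one pixel of segment 2.
mask = [
    [segment_id_to_rgb(1), segment_id_to_rgb(1)],
    [segment_id_to_rgb(1), segment_id_to_rgb(2)],
]

# Hypothetical JSON entry: the classes come from here, the pixels from the mask.
annotation = {
    "image_id": 1,
    "file_name": "000001.png",
    "segments_info": [
        {"id": 1, "category_id": 1, "iscrowd": 0},
        {"id": 2, "category_id": 2, "iscrowd": 0},
    ],
}

# Count pixels per segment id, then look up each segment's class in the JSON.
pixel_counts = Counter(rgb_to_segment_id(*px) for row in mask for px in row)
for seg in annotation["segments_info"]:
    print(seg["id"], seg["category_id"], pixel_counts[seg["id"]])
```

So the JSON alone says which segments exist and what class each one has, but not where they are in the image.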

Pari-singh commented on June 12, 2024

Hi @alder-french-leviton, @th0mas-codes, and @praeclarumjj3. I trained the model with the DiNAT backbone on my custom images and got decent results. Now I want to fine-tune those trained weights for some internal tasks, where I will receive 500 new images on a regular basis. Combining all the data and retraining from the start each time would be too expensive, so I am looking for a way to fine-tune the weights on each incoming batch of 500 images. However, I couldn't find a way to freeze layers for DiNAT. Unlike the ResNet config, the config file does not have a FREEZE option under MODEL.BACKBONE. Can any of you give more info on how to approach this problem?

Thanks
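One possible approach when the config has no freeze option is to turn off gradients by parameter name after the model is built. This is only a sketch, not something confirmed in this thread: it relies on PyTorch's `named_parameters()` and `requires_grad`, and it assumes the backbone's parameters are named with a `backbone.` prefix, which should be checked by printing the names on the real model. The helper names are made up for illustration.

```python
def freeze_by_prefix(model, prefix="backbone."):
    """Disable gradients for every parameter whose name starts with `prefix`.

    `model` is expected to behave like a torch.nn.Module. The default
    prefix is an assumption; inspect model.named_parameters() to confirm.
    """
    frozen = []
    for name, param in model.named_parameters():
        if name.startswith(prefix):
            param.requires_grad = False
            frozen.append(name)
    return frozen


def trainable_parameters(model):
    """Return only the parameters that still require gradients."""
    return [param for _, param in model.named_parameters() if param.requires_grad]


# Usage with a real model (not run here):
#   frozen = freeze_by_prefix(model)
#   optimizer = torch.optim.AdamW(trainable_parameters(model), lr=1e-5)
```

It is also worth checking that the optimizer built by the training script skips parameters with `requires_grad` set to False, so the frozen weights are left out of the updates entirely.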

alder-french-leviton commented on June 12, 2024

@Pari-singh I ended up giving up on getting OneFormer working with the Detectron framework because I ran into so many errors while trying to set up its environment (it seems to require old libraries and an old CUDA Toolkit). Instead, I tried HuggingFace's OneFormer, and after a lot of elbow grease I got it working. I have been training it for panoptic tasks, with batching, using the SWIN Tiny backbone. If you want to do this yourself, a good starting point is this notebook. I might make a tutorial based on my notebook so others can see how to use the OneFormer model for panoptic training with batching, but I'm busy, so it will be a while before I get it out.
