msminhas93 / deeplabv3finetuning

Tutorial on fine tuning DeepLabv3 segmentation network for your own segmentation task in PyTorch.

License: MIT License

Jupyter Notebook 97.75% Python 2.25%

deeplabv3finetuning's Issues

Issue with inference using fine-trained model

After retraining the model with the code in this repository, I tried to use it to segment one of the images in the CrackForest dataset. The code is taken from the PyTorch page:

import torch
model = torch.load('output/weights.pt')
model.eval()

import urllib.request
url, filename = ("file:///home/rhobincu/gitroot/DeepLabv3FineTuning/CrackForest/Images/092.jpg", "092.jpg")
urllib.request.urlretrieve(url, filename)

# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
print(input_image)
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model

# move the input and model to GPU for speed if available
if torch.cuda.is_available():
    print('Using GPU!')
    input_batch = input_batch.to('cuda')
    model.to('cuda')

import time

# time.clock() was removed in Python 3.8; perf_counter() is the replacement
start = time.perf_counter()
with torch.no_grad():
    output = model(input_batch)['out'][0]
end = time.perf_counter()
print('Inference duration (s): ', end - start)
output_predictions = output.argmax(0)

# create a color palette, selecting a color for each class
palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2])
colors = torch.as_tensor([i for i in range(2)])[:, None] * palette
colors = (colors % 255).numpy().astype("uint8")
print(colors)

# plot the semantic segmentation predictions, one color per class
img_size = input_image.size
data = output_predictions.byte().cpu().numpy()
print(data)
print(data.sum())
r = Image.fromarray(data).resize(img_size)
r.putpalette(colors)

#cv2.imshow('image',input_image)

import matplotlib.pyplot as plt
plt.imshow(r)
plt.show()

The problem is that the network output (data) is all zeros. Any ideas?
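One possible cause, stated as an assumption rather than a confirmed diagnosis: if the fine-tuned head produces a single output channel, then `output` has shape (1, H, W), and an argmax over a dimension of size 1 can only return index 0. The sketch below uses NumPy with made-up values to show the effect and a thresholding alternative; the same reasoning applies to `torch.argmax`. The 0.5 threshold is hypothetical and would need tuning.

```python
import numpy as np

# Stand-in for model(input_batch)['out'][0]; the (1, H, W) shape is an assumption.
output = np.linspace(-0.2, 1.2, 24, dtype=np.float32).reshape(1, 4, 6)

# argmax over an axis of size 1 can only ever return index 0.
argmax_mask = output.argmax(axis=0)

# Thresholding the single channel yields a usable binary mask instead.
threshold = 0.5  # hypothetical cut-off; tune it on validation data
binary_mask = (output[0] > threshold).astype(np.uint8)

print(argmax_mask.sum(), binary_mask.sum())
```

If the model really has two or more output channels, this explanation does not apply and the argmax in the original snippet is the right operation.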

Producing a sample

Thank you for your code and this awesome project; I found it really useful and informative for a project I'm currently working on. I was able to train the DeepLabv3 model on a welding-joint semantic segmentation problem within 8 epochs (the code is here, with credit to you!). I didn't initially apply any transforms. Instead, the masks were produced in MATLAB and used as the ground truth. The white in the masks registered as 255 instead of 1, so I had to make sure to divide by 255. I also made one modification while producing a sample, but otherwise I used your approach. My images were 480 x 640 as well. You could probably use cv2 here too, but I wanted to load the image the same way I set up my input data.

image2 = np.reshape(np.swapaxes(np.transpose(Image.open(IMAGES + 'tjoint_123105000329.png')), 1, 2), (1, 3, 480, 640))
mask2 = cv2.imread(MASKS + 'tjoint_123105000329_output.png')

with torch.no_grad():
  b = baseline(torch.from_numpy(image2).type(torch.cuda.FloatTensor))
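The two preprocessing steps described in this post (rescaling a 0/255 mask and reordering an image into batch-first, channel-first layout) can be sketched in NumPy as follows. The arrays are hypothetical stand-ins for the loaded files, and the variable names are not from the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for a loaded 480 x 640 RGB image and its MATLAB-produced mask.
image_hwc = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
mask = np.zeros((480, 640), dtype=np.uint8)
mask[100:200, 300:400] = 255  # white foreground stored as 255

# Scale the mask so that foreground is 1.0 rather than 255.
mask01 = mask.astype(np.float32) / 255.0

# (H, W, C) -> (C, H, W), then add a leading batch axis: (1, 3, 480, 640).
image_nchw = np.transpose(image_hwc, (2, 0, 1))[np.newaxis, ...]

# The transpose + swapaxes + reshape chain from the post produces the same layout.
same = np.reshape(np.swapaxes(np.transpose(image_hwc), 1, 2), (1, 3, 480, 640))
print(image_nchw.shape, np.array_equal(image_nchw, same))
```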

MSE vs CrossEntropy as the loss function.

First of all, thanks for sharing the code.

The segmentation output is an image of class IDs, so I would expect BCELoss (for binary segmentation) or CrossEntropyLoss to be used, even though MSE can work too.

For example, this trainer uses cross-entropy: https://github.com/wkentaro/pytorch-fcn/blob/master/torchfcn/trainer.py

Any special reason for MSE? I am just curious.

Thanks.
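For comparison, here is a minimal sketch of the two objectives on a single-channel output. The tensors are random stand-ins, not data from this repository, and `nn.BCEWithLogitsLoss` is used in place of `BCELoss` because it accepts raw outputs and applies the sigmoid internally.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical batch: raw single-channel outputs and a binary ground-truth mask.
logits = torch.randn(2, 1, 8, 8)
target = (torch.rand(2, 1, 8, 8) > 0.5).float()

# Regression-style objective: compares raw outputs directly with the 0/1 mask.
mse = nn.MSELoss()(logits, target)

# Classification-style objective for binary masks; the sigmoid is applied internally.
bce = nn.BCEWithLogitsLoss()(logits, target)

print(float(mse), float(bce))
```

Both calls accept the same pair of tensors, so switching between them in a training loop is mostly a one-line change; which one trains better on a given dataset is an empirical question.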

Image Data Normalization

As described at https://pytorch.org/hub/pytorch_vision_deeplabv3_resnet101/, input images should be normalized, presumably because the backbone (ResNet) was trained with that normalization.

All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (N, 3, H, W), where N is the number of images, H and W are expected to be at least 224 pixels. The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
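The normalization quoted above can be written out by hand. This is a NumPy sketch using the mean and std values from the quote; `normalize_batch` is a hypothetical helper name, and the input is assumed to be uint8 RGB in (N, H, W, 3) layout.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize_batch(images_uint8):
    """Convert (N, H, W, 3) uint8 RGB images to normalized (N, 3, H, W) float32."""
    x = images_uint8.astype(np.float32) / 255.0   # load into the [0, 1] range
    x = (x - IMAGENET_MEAN) / IMAGENET_STD        # per-channel normalization
    return np.transpose(x, (0, 3, 1, 2))          # channels-last -> channels-first

batch = normalize_batch(np.full((2, 224, 224, 3), 255, dtype=np.uint8))
print(batch.shape)
```

In a torchvision pipeline, `transforms.ToTensor()` followed by `transforms.Normalize(mean, std)` performs the same two steps.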

Regarding output values

Hi, I've trained the model successfully. On performing inference as per code in "analysis.ipynb", I see that the histogram of the flattened output array has a few negative values as well. How is this possible? Isn't the final output of the network obtained after a sigmoid layer so as to represent probabilities?
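A possible explanation, offered as an assumption about the setup rather than a statement about this repository's code: torchvision's `DeepLabHead` ends in a plain 1x1 convolution with no sigmoid, so unless an activation is added afterwards the outputs are unbounded and can dip below 0 or rise above 1. The NumPy sketch below uses made-up values to show how one might check for that and bound the values before plotting.

```python
import numpy as np

# Hypothetical flattened raw outputs from a head with no final activation.
raw = np.array([-0.12, 0.03, 0.48, 0.97, 1.08], dtype=np.float32)

has_negatives = bool((raw < 0).any())
clipped = np.clip(raw, 0.0, 1.0)  # bound the values for display or histograms

print(has_negatives, clipped.min(), clipped.max())
```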

image segmentation with multiple classes

Thanks for the amazing project! My dataset is for multi-class segmentation. Each mask image has shape (H, W), where H is the height and W is the width, and each pixel is an integer representing its class. For example, tree: 0, ... car: 8, sky: 9. A mask looks like [[0,3,9],[3,4,5]].
I have 10 classes.

I'm wondering how to train on this dataset. Should it look like this?

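from torchvision import models
from torchvision.models.segmentation.deeplabv3 import DeepLabHead

def get_model(num_classes=10):
    model = models.segmentation.deeplabv3_resnet101(pretrained=True, progress=True)
    # Replace the classifier head so it outputs one channel per class
    model.classifier = DeepLabHead(2048, num_classes)
    model.train()
    return model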

However, the prediction size seems to be wrong.
y_pred is torch.Size([8, 38, 256, 456]) but y_truth is torch.Size([8, 256, 456]), where 8 is the batch size, 256 is H, and 456 is W.
y_pred = model(inputs)['out']

The sizes don't match, so I can't feed them into the loss function. Moreover, every element of y_pred is a float, whereas I expected integers representing classes, such as 0, 1, 2, 3.

May I ask how to deal with it? Thanks a lot for helping!
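For what it's worth, the shape difference described here is what `nn.CrossEntropyLoss` expects: float logits of shape (N, C, H, W) and integer targets of shape (N, H, W). The sketch below uses random stand-in tensors (smaller than the ones in the question) to show the loss call and how an argmax over the channel dimension turns the float scores into class IDs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_classes = 10

# Stand-ins for model(inputs)['out'] and the integer mask batch.
y_pred = torch.randn(8, num_classes, 16, 24)          # (N, C, H, W) float logits
y_truth = torch.randint(0, num_classes, (8, 16, 24))  # (N, H, W) int64 class IDs

loss = nn.CrossEntropyLoss()(y_pred, y_truth)  # accepts these two shapes directly
class_map = y_pred.argmax(dim=1)               # (N, H, W), one class ID per pixel

print(float(loss), class_map.shape)
```

The channel count of `y_pred` should equal the number of classes; a value such as 38 suggests the replaced head was built with a different class count than intended.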

ValueError: num_samples should be a positive integer value, but got num_samples=0

I tried to apply it to my own dataset (4 images), but the model does not work. What's wrong?
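This error is raised by the DataLoader's random sampler when the dataset it receives has length 0. Two guesses at the cause, neither confirmed for this report: the image folder name or file extension does not match what the dataset class looks for, or a train/test split of only 4 images leaves one subset empty. A quick sanity check, with a hypothetical helper `count_images` and a temporary directory standing in for the dataset root:

```python
from pathlib import Path
import tempfile

def count_images(folder, extensions=(".jpg", ".jpeg", ".png")):
    """Return how many image files sit directly inside `folder` (0 if it is missing)."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    return sum(1 for p in folder.iterdir() if p.suffix.lower() in extensions)

# Demonstration on a temporary directory standing in for the dataset root.
with tempfile.TemporaryDirectory() as root:
    images_dir = Path(root) / "Images"
    images_dir.mkdir()
    for i in range(4):
        (images_dir / f"{i:03d}.jpg").touch()
    found = count_images(images_dir)
    missing = count_images(Path(root) / "Image")  # misspelled folder name

print(found, missing)
```

If the count for the folder passed to the training script is 0, the dataset is empty before any splitting happens.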

Analysis.ipynb cv2.imread BGR2RGB conversion

In Analysis.ipynb, cell In [6], the line
img = cv2.imread(f'./CrackForest/Images/{ino:03d}.jpg').transpose(2,0,1).reshape(1,3,320,480)
feeds the image to the model with the wrong channel order.
(cv2.imread returns channels in BGR order, whereas PIL returns RGB.)
(cv2.cvtColor is needed to convert cv2's BGR into RGB.)
img = cv2.imread(f'./CrackForest/Images/{ino:03d}.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).transpose(2,0,1).reshape(1,3,320,480)
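For a 3-channel image, the BGR-to-RGB conversion amounts to reversing the last axis. The NumPy sketch below illustrates this on a made-up 2 x 2 image without requiring OpenCV.

```python
import numpy as np

# Hypothetical 2 x 2 image in BGR order: pure blue everywhere.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255

# Reverse the channel axis (BGR -> RGB) and make the result contiguous in memory.
rgb = np.ascontiguousarray(bgr[..., ::-1])

print(rgb[0, 0])
```

The `np.ascontiguousarray` call matters when the array is later passed to `torch.from_numpy`, which does not accept negative strides.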

Where is the sklearn.metrics module?

from sklearn.metrics import roc_auc_score, f1_score

ImportError: No module named sklearn.metrics
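The `sklearn` module is provided by the scikit-learn package, so this error usually means that package is not installed in the active environment. A small check, using a hypothetical helper `has_module`:

```python
import importlib.util

def has_module(name):
    """Return True if a top-level module called `name` can be imported."""
    return importlib.util.find_spec(name) is not None

if not has_module("sklearn"):
    print("sklearn not found; install it with: pip install scikit-learn")
```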

Frozen layers

Did you freeze any layers in your implementation, or is the whole model retrained starting from the pretrained DeepLabv3 weights?
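For readers who do want frozen layers, here is a sketch of how that is commonly done in PyTorch. It does not describe what this repository does. `TinySegModel` is a hypothetical stand-in that only borrows the `backbone` and `classifier` attribute names from torchvision's DeepLabV3.

```python
import torch.nn as nn

# Tiny stand-in with the same attribute names as torchvision's DeepLabV3;
# the layers themselves are hypothetical.
class TinySegModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.classifier = nn.Conv2d(8, 1, kernel_size=1)

    def forward(self, x):
        return self.classifier(self.backbone(x))

model = TinySegModel()

# Freeze the backbone: its parameters are excluded from gradient computation.
for param in model.backbone.parameters():
    param.requires_grad = False

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

Passing only the trainable parameters to the optimizer, for example `filter(lambda p: p.requires_grad, model.parameters())`, keeps the optimizer state small as well.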

Expected more than 1 value per channel when training

Hi, @msminhas93

I got this error when running with my own data.

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

What am I doing wrong?

Thanks
Best,
@bemoregt.
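A common cause of this message, suggested here as a guess: a batch containing a single sample reaches a BatchNorm layer in training mode. The [1, 256, 1, 1] shape is consistent with a globally pooled feature map for one image, where BatchNorm has only one value per channel. This often happens on the last, incomplete batch of an epoch. The pure-Python sketch below (with a hypothetical helper `batch_sizes`) shows how to spot it.

```python
def batch_sizes(num_samples, batch_size, drop_last=False):
    """List the batch sizes a DataLoader-style loader would yield in one epoch."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)  # the final, smaller batch
    return sizes

sizes = batch_sizes(num_samples=9, batch_size=4)                       # ends with a batch of 1
safe_sizes = batch_sizes(num_samples=9, batch_size=4, drop_last=True)  # leftover sample dropped

print(sizes, safe_sizes)
```

In PyTorch, passing `drop_last=True` to the training `DataLoader`, or choosing a batch size that does not leave a remainder of 1, avoids the single-sample batch.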

Batch summary metrics are averaged with the initial zero

I tried to open a PR but got an error.

Anyway, trainer.py should be changed as follows. Otherwise the metrics are averaged together with the initial zero and are therefore inaccurate.

-        batchsummary = {a: [0] for a in fieldnames}
+        batchsummary = {a: [] for a in fieldnames}
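The effect of the initial zero is easy to see with a few made-up per-batch scores:

```python
from statistics import mean

scores = [0.8, 0.9, 0.7]       # hypothetical per-batch metric values

seeded = [0] + scores          # list initialised as [0], then appended to
clean = list(scores)           # list initialised as []

biased_mean = mean(seeded)     # sum divided by 4 entries
true_mean = mean(clean)        # sum divided by 3 entries

print(biased_mean, true_mean)
```

The bias shrinks as the number of batches grows, but it never disappears and always pulls the reported metric toward zero.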

RuntimeError: CUDA out of memory. Tried to allocate 42.00 MiB (GPU 0; 3.81 GiB total capacity; 2.79 GiB already allocated; 25.44 MiB free; 2.92 GiB reserved in total by PyTorch)

RuntimeError Traceback (most recent call last)
Cell In[27], line 44
42 # Specify the optimizer with a lower learning rate
43 optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
---> 44 _ = train_model(model,
45 criterion,
46 dataloaders,
47 optimizer,
48 bpath=exp_directory,
49 metrics=metrics,
50 num_epochs=epochs)
52 # Save the trained model
53 torch.save(model, exp_directory / 'weights.pt')

File ~/Documents/5G/DeepLabv3FineTuning/trainer.py:49, in train_model(model, criterion, dataloaders, optimizer, metrics, bpath, num_epochs)
47 # track history if only in train
48 with torch.set_grad_enabled(phase == 'Train'):
---> 49 outputs = model(inputs)
50 loss = criterion(outputs['out'], masks)
51 y_pred = outputs['out'].data.cpu().numpy().ravel()

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:722, in Module._call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),
725 self._forward_hooks.values()):
726 hook_result = hook(self, input, result)

File ~/.local/lib/python3.8/site-packages/torchvision/models/segmentation/_utils.py:20, in _SimpleSegmentationModel.forward(self, x)
18 input_shape = x.shape[-2:]
19 # contract: features is a dict of tensors
---> 20 features = self.backbone(x)
22 result = OrderedDict()
23 x = features["out"]

File ~/.local/lib/python3.8/site-packages/torchvision/models/_utils.py:63, in IntermediateLayerGetter.forward(self, x)
61 out = OrderedDict()
62 for name, module in self.items():
---> 63 x = module(x)
64 if name in self.return_layers:
65 out_name = self.return_layers[name]

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/container.py:117, in Sequential.forward(self, input)
115 def forward(self, input):
116 for module in self:
--> 117 input = module(input)
118 return input

File ~/.local/lib/python3.8/site-packages/torchvision/models/resnet.py:112, in Bottleneck.forward(self, x)
109 out = self.bn2(out)
110 out = self.relu(out)
--> 112 out = self.conv3(out)
113 out = self.bn3(out)
115 if self.downsample is not None:

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py:419, in Conv2d.forward(self, input)
418 def forward(self, input: Tensor) -> Tensor:
--> 419 return self._conv_forward(input, self.weight)

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py:415, in Conv2d._conv_forward(self, input, weight)
411 if self.padding_mode != 'zeros':
412 return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
413 weight, self.bias, self.stride,
414 _pair(0), self.dilation, self.groups)
--> 415 return F.conv2d(input, weight, self.bias, self.stride,
416 self.padding, self.dilation, self.groups)

RuntimeError: CUDA out of memory. Tried to allocate 42.00 MiB (GPU 0; 3.81 GiB total capacity; 2.79 GiB already allocated; 25.44 MiB free; 2.92 GiB reserved in total by PyTorch)

CUDA memory issue

Hello,

First off: Great work with this project, it is exceptionally cleanly written and structured imo!
It is nice to see that there is a tutorial, the full code on github with clean and working instructions. (It is working using the CPU for me) :)

Is there a way to make PyTorch reserve less GPU memory?
In another post, someone mentioned an active variable in the session that seemed to use too much memory:
https://discuss.pytorch.org/t/how-does-reserved-in-total-by-pytorch-work/70172/2

Could this be the case? (I am not a Python expert.)

Here's the issue:

\DeepLabv3FineTuning-master> python main.py --data-directory CrackForest --exp_directory CFExp --epochs 1
Epoch 1/1

  0%|                                                                                           | 0/24 [00:05<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 66, in <module>
    main()
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 53, in main
    _ = train_model(model,
  File "C:\Users\gdurm\Downloads\DeepLabv3FineTuning-master\DeepLabv3FineTuning-master\trainer.py", line 49, in train_model
    outputs = model(inputs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torchvision\models\segmentation\_utils.py", line 19, in forward
    features = self.backbone(x)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torchvision\models\_utils.py", line 63, in forward
    x = module(x)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\container.py", line 119, in forward
    input = module(input)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torchvision\models\resnet.py", line 133, in forward
    out = self.bn3(out)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\batchnorm.py", line 135, in forward
    return F.batch_norm(
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\functional.py", line 2146, in batch_norm
    return torch.batch_norm(
RuntimeError: CUDA out of memory. Tried to allocate 38.00 MiB (GPU 0; 2.00 GiB total capacity; 1.05 GiB already allocated; 11.44 MiB free; 1.10 GiB reserved in total by PyTorch)
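On a GPU with 2 to 4 GiB of memory, the usual levers are a smaller batch size or smaller input images. If a small batch hurts training, gradient accumulation is one general PyTorch technique for keeping the effective batch size while lowering peak memory. The sketch below is not taken from this repository's trainer; it uses a tiny stand-in model on the CPU, and the names and sizes are hypothetical. Micro-batches of 2 are used because BatchNorm layers in training mode reject single-sample batches.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Conv2d(3, 1, kernel_size=3, padding=1)  # stand-in for the segmentation model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()

micro_batch = 2          # what fits in memory at once
accumulation_steps = 4   # 4 micro-batches of 2 approximate one batch of 8
inputs = torch.randn(8, 3, 32, 32)
masks = torch.rand(8, 1, 32, 32)

weights_before = model.weight.detach().clone()

optimizer.zero_grad()
for step in range(accumulation_steps):
    lo, hi = step * micro_batch, (step + 1) * micro_batch
    loss = criterion(model(inputs[lo:hi]), masks[lo:hi]) / accumulation_steps
    loss.backward()      # gradients add up across micro-batches
optimizer.step()         # one weight update for the whole effective batch
```

With real BatchNorm layers the batch statistics are still computed per micro-batch, so the result is close to, but not identical to, training with the full batch.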

Zeros in the prediction

In the variable data, I get all zeros for some reason. Can you please provide assistance regarding that?

import torch
model = torch.load('output/weights.pt')
model.eval()

import urllib.request
url, filename = ("file:///home/rhobincu/gitroot/DeepLabv3FineTuning/CrackForest/Images/092.jpg", "092.jpg")
urllib.request.urlretrieve(url, filename)

# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
print(input_image)
preprocess = transforms.Compose([
    transforms.Resize((512, 512)),
    transforms.ToTensor(),
])

input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model

# move the input and model to GPU for speed if available
if torch.cuda.is_available():
    print('Using GPU!')
    input_batch = input_batch.to('cuda')
    model.to('cuda')

import time

# time.clock() was removed in Python 3.8; perf_counter() is the replacement
start = time.perf_counter()
with torch.no_grad():
    output = model(input_batch)['out'][0]
end = time.perf_counter()
print('Inference duration (s): ', end - start)
data = output.argmax(0).detach().byte().cpu().numpy()

# create a color palette, selecting a color for each class
palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2])
colors = torch.as_tensor([i for i in range(2)])[:, None] * palette
colors = (colors % 255).numpy().astype("uint8")
print(colors)

# plot the semantic segmentation predictions
img_size = input_image.size
r = Image.fromarray(data).resize(img_size)
r.putpalette(colors)

import matplotlib.pyplot as plt
plt.imshow(r)
plt.show()

How to convert path file to TensorRT?

Hi, I want to deploy the model in TensorRT format. Do you have any experience converting the model?

Thanks!