
deeplabv3finetuning's People

Contributors

anacosmina, msminhas93

deeplabv3finetuning's Issues

Issue with inference using fine-tuned model

After retraining the model using the code in this repository, I have attempted to use it in order to segment one of the images in the CrackForest db. The code is taken from the pytorch page:

import torch
model = torch.load('output/weights.pt')
model.eval()

import urllib.request
url, filename = ("file:///home/rhobincu/gitroot/DeepLabv3FineTuning/CrackForest/Images/092.jpg", "092.jpg")
urllib.request.urlretrieve(url, filename)

# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
print(input_image)
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model

# move the input and model to GPU for speed if available
if torch.cuda.is_available():
    print('Using GPU!')
    input_batch = input_batch.to('cuda')
    model.to('cuda')

import time

start = time.perf_counter()  # time.clock() was removed in Python 3.8
with torch.no_grad():
    output = model(input_batch)['out'][0]
end = time.perf_counter()
print('Inference duration (s): ', end - start)
output_predictions = output.argmax(0)

# create a color palette, selecting a color for each class
palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2])
colors = torch.as_tensor([i for i in range(2)])[:, None] * palette
colors = (colors % 255).numpy().astype("uint8")
print(colors)

# plot the semantic segmentation predictions, one color per class
img_size = input_image.size
data = output_predictions.byte().cpu().numpy()
print(data)
print(data.sum())
r = Image.fromarray(data).resize(img_size)
r.putpalette(colors)

#cv2.imshow('image',input_image)

import matplotlib.pyplot as plt
plt.imshow(r)
plt.show()

The problem is that the network output (data) is all zeros. Any ideas?
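
One possible cause, assuming the fine-tuned head has a single output channel (an assumption; the snippet above does not show the model definition): argmax over a channel axis of length 1 returns index 0 at every pixel, whatever the scores are. torch.argmax behaves the same way as NumPy here. The sketch below reproduces the effect and shows thresholding as an alternative; the random stand-in for the network output and the 0.5 threshold are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for model(input_batch)['out'][0] with ONE output channel (assumed shape).
output = rng.random((1, 4, 6)).astype(np.float32)

# argmax over an axis of length 1 can only ever return index 0.
argmax_mask = output.argmax(axis=0)

# Thresholding the single channel gives a binary mask instead.
threshold = 0.5  # illustrative value, not taken from the repository
binary_mask = (output[0] > threshold).astype(np.uint8)

print(argmax_mask.sum())   # always 0
print(binary_mask.shape)
```

If the model really has two or more output channels, this explanation does not apply and the zeros would have another cause.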

Producing a sample

Thank you for your code and awesome project; I found it really useful and informative for a project I'm currently working on. I was able to train the DeepLabv3 model on a welding joint semantic segmentation problem within 8 epochs (the code is here, with credit to you!). I didn't initially apply any transforms. Instead, the masks were produced in MATLAB and used as the ground truth. The white in the masks registered as 255 instead of 1, so I had to make sure to divide by 255. I also made one modification while producing a sample, but otherwise I used your approach. My images were also 480 x 640. You could probably use cv2 here as well, but I wanted to load the image the same way I set up my input data.

import cv2
import numpy as np
import torch
from PIL import Image

image2 = np.reshape(np.swapaxes(np.transpose(Image.open(IMAGES + 'tjoint_123105000329.png')), 1, 2), (1, 3, 480, 640))  # HWC -> (1, 3, H, W)
mask2 = cv2.imread(MASKS + 'tjoint_123105000329_output.png')

with torch.no_grad():
  b = baseline(torch.from_numpy(image2).type(torch.cuda.FloatTensor))
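
For readers following along, here is a minimal NumPy sketch of the two data-preparation steps described above: moving the channel axis to the front and rescaling a 0/255 mask to 0/1. The arrays are synthetic stand-ins (no image files are read), and the variable names are hypothetical.

```python
import numpy as np

# Stand-ins for a 480 x 640 RGB image and its mask (white = 255), as described above.
image = np.zeros((480, 640, 3), dtype=np.uint8)
mask = np.zeros((480, 640), dtype=np.uint8)
mask[100:200, 300:400] = 255

# HWC -> CHW, then add a leading batch axis: (1, 3, 480, 640).
batch = np.transpose(image, (2, 0, 1))[np.newaxis].astype(np.float32)

# Map mask values {0, 255} to {0.0, 1.0}.
target = mask.astype(np.float32) / 255.0

print(batch.shape, target.max())
```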

Image Data Normalization

As presented in https://pytorch.org/hub/pytorch_vision_deeplabv3_resnet101/, input image data must be normalized, presumably because the backbone (ResNet) was pretrained on images normalized this way:

All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (N, 3, H, W), where N is the number of images, H and W are expected to be at least 224 pixels. The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
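
The normalization quoted above can be sketched in plain NumPy as follows. This only illustrates the arithmetic; in torchvision, transforms.ToTensor followed by transforms.Normalize performs the same steps on channel-first tensors. The function name and the gray test image are hypothetical.

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize(image_uint8):
    """Scale an (H, W, 3) uint8 RGB image to [0, 1], then standardize per channel."""
    scaled = image_uint8.astype(np.float32) / 255.0
    return (scaled - MEAN) / STD  # broadcasts over the last (channel) axis

# Hypothetical 224 x 224 mid-gray image.
gray = np.full((224, 224, 3), 128, dtype=np.uint8)
normalized = normalize(gray)
print(normalized.shape)
```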

Regarding output values

Hi, I've trained the model successfully. On performing inference as per code in "analysis.ipynb", I see that the histogram of the flattened output array has a few negative values as well. How is this possible? Isn't the final output of the network obtained after a sigmoid layer so as to represent probabilities?
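
For what it's worth, torchvision's DeepLabHead ends in a plain convolution, so unless the model or the loss adds an activation, the raw outputs are unbounded and can be negative. If the network was trained against a logit-based loss (an assumption; the training setup is not shown in this issue), a sigmoid maps the raw values into (0, 1). A NumPy sketch with made-up values:

```python
import numpy as np

def sigmoid(x):
    """Map unbounded scores to the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Made-up raw outputs, including negative values like those in the histogram.
raw = np.array([-2.0, -0.1, 0.0, 0.7, 3.0], dtype=np.float32)
probs = sigmoid(raw)
print(probs.min() > 0.0, probs.max() < 1.0)
```

If the model was instead trained with a regression-style loss on the raw outputs, the values are not logits and a sigmoid would not turn them into meaningful probabilities.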

image segmentation with multiple classes

Thanks for the amazing project! My dataset is for multi-class segmentation. Each mask image has shape (H, W), where H is the height and W is the width, and each pixel is an integer representing the class. For example, tree: 0, ... car: 8, sky: 9. A mask looks like [[0,3,9],[3,4,5]].
There are 10 classes.

I'm wondering how to train on this dataset. Should it be like

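from torchvision import models
from torchvision.models.segmentation.deeplabv3 import DeepLabHead

def get_model(num_classes=10):
    model = models.segmentation.deeplabv3_resnet101(pretrained=True, progress=True)
    # replace the classifier head so it outputs one channel per class
    model.classifier = DeepLabHead(2048, num_classes)
    model.train()
    return model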

However, the prediction size seems to be wrong.
y_pred is torch.Size([8, 38, 256, 456]) but y_truth is torch.Size([8, 256, 456]), where 8 is the batch size, 256 is H, and 456 is W.
y_pred = model(inputs)['out']

The sizes don't match, so I can't feed them into the loss function. Moreover, y_pred holds a float for every element, but I expected integers representing classes, such as 0, 1, 2, 3.

May I ask how to deal with it? Thanks a lot for helping!
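
A sketch of how these shapes usually fit together, assuming the head is built with 10 output channels and the loss is a per-pixel cross-entropy. torch.nn.CrossEntropyLoss accepts logits of shape (N, C, H, W) together with integer targets of shape (N, H, W), so the extra class axis in y_pred is expected. The floats in y_pred are per-class scores, and taking argmax over the class axis yields the integer label map. The NumPy code below spells out that computation on small random stand-ins; all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, H, W = 2, 10, 4, 6                     # small stand-ins: batch, classes, height, width
logits = rng.normal(size=(N, C, H, W))       # plays the role of model(inputs)['out']
target = rng.integers(0, C, size=(N, H, W))  # integer class id per pixel

# Log-softmax over the class axis (axis=1), computed stably.
shifted = logits - logits.max(axis=1, keepdims=True)
log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

# Per-pixel cross-entropy: pick the log-probability of the true class.
picked = np.take_along_axis(log_probs, target[:, None, :, :], axis=1)
loss = -picked.mean()

# Predicted label map: index of the highest score per pixel.
pred = logits.argmax(axis=1)
print(pred.shape, loss > 0)
```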

Analysis.ipynb cv2.imread BGR2RGB conversion

In Analysis.ipynb, cell In [6],
img = cv2.imread(f'./CrackForest/Images/{ino:03d}.jpg').transpose(2,0,1).reshape(1,3,320,480)
passes the image on with the wrong channel order: cv2.imread returns the channels as BGR, whereas PIL's Image.open returns RGB. Use cv2.cvtColor to convert cv2's BGR to RGB:
img = cv2.imread(f'./CrackForest/Images/{ino:03d}.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).transpose(2,0,1).reshape(1,3,320,480)
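
To illustrate the channel-order point without needing an image file: reversing the last axis of an (H, W, 3) array swaps BGR and RGB, which for 3-channel uint8 images gives the same result as cv2.cvtColor(img, cv2.COLOR_BGR2RGB). The tiny array below is a made-up stand-in for the output of cv2.imread.

```python
import numpy as np

# Made-up 1 x 2 image in BGR order: a pure blue pixel and a pure red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reverse the channel axis to get RGB order.
rgb = bgr[..., ::-1]

# HWC -> CHW plus a batch axis, as the model expects.
batch = rgb.transpose(2, 0, 1)[np.newaxis]
print(rgb[0, 0], batch.shape)
```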

Frozen layers

Did you freeze any layers in your implementation, or is the whole model retrained starting from the pretrained DeepLabv3 weights?

Batch summary metrics are averaged with the initial zero

I tried to open a PR but got an error.

In trainer.py, batchsummary should be initialized as shown below. Otherwise the metrics are averaged together with the initial zero and are therefore inaccurate.

-        batchsummary = {a: [0] for a in fieldnames}
+        batchsummary = {a: [] for a in fieldnames}
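
A small illustration of the effect, using made-up per-batch scores: starting each list with a 0 drags the mean down, while starting with an empty list does not.

```python
from statistics import mean

batch_scores = [0.80, 0.90, 0.85]  # made-up per-batch metric values

with_initial_zero = [0] + batch_scores   # list started as [0]
without_zero = [] + batch_scores         # list started as []

print(round(mean(with_initial_zero), 4))  # 0.6375
print(round(mean(without_zero), 4))       # 0.85
```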

RuntimeError: CUDA out of memory. Tried to allocate 42.00 MiB (GPU 0; 3.81 GiB total capacity; 2.79 GiB already allocated; 25.44 MiB free; 2.92 GiB reserved in total by PyTorch)

RuntimeError Traceback (most recent call last)
Cell In[27], line 44
42 # Specify the optimizer with a lower learning rate
43 optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
---> 44 _ = train_model(model,
45 criterion,
46 dataloaders,
47 optimizer,
48 bpath=exp_directory,
49 metrics=metrics,
50 num_epochs=epochs)
52 # Save the trained model
53 torch.save(model, exp_directory / 'weights.pt')

File ~/Documents/5G/DeepLabv3FineTuning/trainer.py:49, in train_model(model, criterion, dataloaders, optimizer, metrics, bpath, num_epochs)
47 # track history if only in train
48 with torch.set_grad_enabled(phase == 'Train'):
---> 49 outputs = model(inputs)
50 loss = criterion(outputs['out'], masks)
51 y_pred = outputs['out'].data.cpu().numpy().ravel()

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:722, in Module._call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),
725 self._forward_hooks.values()):
726 hook_result = hook(self, input, result)

File ~/.local/lib/python3.8/site-packages/torchvision/models/segmentation/_utils.py:20, in _SimpleSegmentationModel.forward(self, x)
18 input_shape = x.shape[-2:]
19 # contract: features is a dict of tensors
---> 20 features = self.backbone(x)
22 result = OrderedDict()
23 x = features["out"]

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:722, in Module._call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),
725 self._forward_hooks.values()):
726 hook_result = hook(self, input, result)

File ~/.local/lib/python3.8/site-packages/torchvision/models/_utils.py:63, in IntermediateLayerGetter.forward(self, x)
61 out = OrderedDict()
62 for name, module in self.items():
---> 63 x = module(x)
64 if name in self.return_layers:
65 out_name = self.return_layers[name]

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:722, in Module._call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),
725 self._forward_hooks.values()):
726 hook_result = hook(self, input, result)

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/container.py:117, in Sequential.forward(self, input)
115 def forward(self, input):
116 for module in self:
--> 117 input = module(input)
118 return input

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:722, in Module._call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),
725 self._forward_hooks.values()):
726 hook_result = hook(self, input, result)

File ~/.local/lib/python3.8/site-packages/torchvision/models/resnet.py:112, in Bottleneck.forward(self, x)
109 out = self.bn2(out)
110 out = self.relu(out)
--> 112 out = self.conv3(out)
113 out = self.bn3(out)
115 if self.downsample is not None:

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:722, in Module._call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),
725 self._forward_hooks.values()):
726 hook_result = hook(self, input, result)

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py:419, in Conv2d.forward(self, input)
418 def forward(self, input: Tensor) -> Tensor:
--> 419 return self._conv_forward(input, self.weight)

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py:415, in Conv2d._conv_forward(self, input, weight)
411 if self.padding_mode != 'zeros':
412 return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
413 weight, self.bias, self.stride,
414 _pair(0), self.dilation, self.groups)
--> 415 return F.conv2d(input, weight, self.bias, self.stride,
416 self.padding, self.dilation, self.groups)

RuntimeError: CUDA out of memory. Tried to allocate 42.00 MiB (GPU 0; 3.81 GiB total capacity; 2.79 GiB already allocated; 25.44 MiB free; 2.92 GiB reserved in total by PyTorch)

CUDA memory issue

Hello,

First off: Great work with this project, it is exceptionally cleanly written and structured imo!
It is nice to see that there is a tutorial, the full code on github with clean and working instructions. (It is working using the CPU for me) :)

Is there a way to make PyTorch reserve less memory?
In another post, someone mentioned that an active variable in the session can use too much memory:
https://discuss.pytorch.org/t/how-does-reserved-in-total-by-pytorch-work/70172/2

Could this be the case? (I am not a Python expert.)

Here's the issue:

\DeepLabv3FineTuning-master> python main.py --data-directory CrackForest --exp_directory CFExp --epochs 1
Epoch 1/1

  0%|                                                                                           | 0/24 [00:05<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 66, in <module>
    main()
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 53, in main
    _ = train_model(model,
  File "C:\Users\gdurm\Downloads\DeepLabv3FineTuning-master\DeepLabv3FineTuning-master\trainer.py", line 49, in train_model
    outputs = model(inputs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torchvision\models\segmentation\_utils.py", line 19, in forward
    features = self.backbone(x)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torchvision\models\_utils.py", line 63, in forward
    x = module(x)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\container.py", line 119, in forward
    input = module(input)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torchvision\models\resnet.py", line 133, in forward
    out = self.bn3(out)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\batchnorm.py", line 135, in forward
    return F.batch_norm(
  File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\functional.py", line 2146, in batch_norm
    return torch.batch_norm(
RuntimeError: CUDA out of memory. Tried to allocate 38.00 MiB (GPU 0; 2.00 GiB total capacity; 1.05 GiB already allocated; 11.44 MiB free; 1.10 GiB reserved in total by PyTorch)
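
As a rough way to reason about the numbers in an error like this, the sketch below estimates the size of a single float32 activation tensor. The batch size of 4, the 320 x 480 input, and the 256-channel feature map at one quarter resolution are assumptions chosen for illustration, not values read from this run. Training keeps many such tensors alive for the backward pass, and each one scales linearly with batch size and image area, so lowering either is the usual first thing to try on a small GPU.

```python
def tensor_mib(shape, bytes_per_element=4):
    """Return the size in MiB of a dense tensor with the given shape (float32 by default)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element / 2**20

# Assumed: batch of 4, 256 channels, 320 x 480 input downsampled by 4 -> 80 x 120.
print(tensor_mib((4, 256, 80, 120)))  # 37.5
# Halving the batch size halves the footprint of every activation.
print(tensor_mib((2, 256, 80, 120)))  # 18.75
```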

Zeros in the prediction

In the variable data, I get all zeros for some reason. Can you please provide assistance regarding that?

import torch
model = torch.load('output/weights.pt')
model.eval()

import urllib.request
url, filename = ("file:///home/rhobincu/gitroot/DeepLabv3FineTuning/CrackForest/Images/092.jpg", "092.jpg")
urllib.request.urlretrieve(url, filename)

# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
print(input_image)
preprocess = transforms.Compose([
    transforms.Resize((512, 512)),
    transforms.ToTensor(),
])

input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model

# move the input and model to GPU for speed if available
if torch.cuda.is_available():
    print('Using GPU!')
    input_batch = input_batch.to('cuda')
    model.to('cuda')

import time

start = time.perf_counter()  # time.clock() was removed in Python 3.8
with torch.no_grad():
    output = model(input_batch)['out'][0]
end = time.perf_counter()
print('Inference duration (s): ', end - start)
data = output.argmax(0).detach().byte().cpu().numpy()

# create a color palette, selecting a color for each class
palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2])
colors = torch.as_tensor([i for i in range(2)])[:, None] * palette
colors = (colors % 255).numpy().astype("uint8")
print(colors)

# plot the semantic segmentation predictions
img_size = input_image.size
r = Image.fromarray(data).resize(img_size)
r.putpalette(colors)

import matplotlib.pyplot as plt
plt.imshow(r)
plt.show()
