msminhas93 / deeplabv3finetuning
Tutorial on fine tuning DeepLabv3 segmentation network for your own segmentation task in PyTorch.
License: MIT License
After retraining the model using the code in this repository, I tried to use it to segment one of the images in the CrackForest dataset. The code is adapted from the PyTorch Hub page:
import torch
model = torch.load('output/weights.pt')
model.eval()
import urllib.request
url, filename = ("file:///home/rhobincu/gitroot/DeepLabv3FineTuning/CrackForest/Images/092.jpg", "092.jpg")
# Python 3 only: urllib.URLopener does not exist there, so call urlretrieve directly
urllib.request.urlretrieve(url, filename)
# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
print(input_image)
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model
# move the input and model to GPU for speed if available
if torch.cuda.is_available():
    print('Using GPU!')
    input_batch = input_batch.to('cuda')
    model.to('cuda')
import time
start = time.perf_counter()  # time.clock() was removed in Python 3.8
with torch.no_grad():
    output = model(input_batch)['out'][0]
end = time.perf_counter()
print('Inference duration (s): ', end - start)
output_predictions = output.argmax(0)
# create a color palette, selecting a color for each class
palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2])
colors = torch.as_tensor([i for i in range(2)])[:, None] * palette
colors = (colors % 255).numpy().astype("uint8")
print(colors)
# plot the semantic segmentation predictions, one color per class
img_size = input_image.size
data = output_predictions.byte().cpu().numpy()
print(data)
print(data.sum())
r = Image.fromarray(data).resize(img_size)
r.putpalette(colors)
#cv2.imshow('image',input_image)
import matplotlib.pyplot as plt
plt.imshow(r)
plt.show()
The problem is that the network output (data) is all zeros. Any ideas?
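One possible explanation, assuming the fine-tuned head has a single output channel (an assumption; print `output.shape` to check): an argmax over a channel axis of length 1 can only return index 0, whatever the network predicted. The sketch below uses NumPy with made-up values to show the effect and a thresholding alternative. The 0.5 threshold is illustrative, not a value taken from the repository.

```python
import numpy as np

# Hypothetical single-channel output of shape (C=1, H=2, W=3); values are made up.
output = np.array([[[0.9, 0.1, 0.7],
                    [0.2, 0.8, 0.3]]], dtype=np.float32)

# argmax over an axis of length 1 can only ever return index 0.
argmax_mask = output.argmax(axis=0)

# Thresholding the single channel yields a usable binary mask instead.
threshold = 0.5  # assumed cut-off, tune for your data
binary_mask = (output[0] > threshold).astype(np.uint8)

print(argmax_mask)
print(binary_mask)
```

If the model really has one output channel, replacing `output.argmax(0)` with a threshold on `output[0]` is the analogous change in the script above.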
Thank you for your code and awesome project. I found it really useful and informative for a project I'm currently working on. I was able to train the DeepLabv3 model on a welding joint semantic segmentation problem within 8 epochs (code is here, with credit to you!). I didn't initially apply any transforms. Rather, the masks were produced in MATLAB and used as the ground truth. The white on the masks registered as 255 instead of 1, so I had to make sure to divide by 255. I made one modification while trying to produce a sample as well, but otherwise I used your approach. My images were also 480 x 640. You could probably use cv2 here as well, but I wanted to keep it the same as how I was setting up my input data.
image2 = np.reshape(np.swapaxes(np.transpose(Image.open(IMAGES + 'tjoint_123105000329.png')), 1, 2), (1, 3, 480, 640))
mask2 = cv2.imread(MASKS + 'tjoint_123105000329_output.png')
with torch.no_grad():
    b = baseline(torch.from_numpy(image2).type(torch.cuda.FloatTensor))
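For masks exported with white stored as 255, the rescaling to {0, 1} can be sketched as follows. The helper name `to_binary_mask` and the tiny example array are hypothetical and only illustrate the arithmetic.

```python
import numpy as np

def to_binary_mask(mask: np.ndarray) -> np.ndarray:
    """Map an 8-bit mask with values {0, 255} to float32 values {0.0, 1.0}."""
    return mask.astype(np.float32) / 255.0

# Made-up 2x2 mask standing in for an image loaded from disk.
mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)
scaled = to_binary_mask(mask)
print(scaled)
```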
Thanks for sharing the code, first of all.
The segmentation output is an image of class ids, so I would guess that BCELoss (for binary segmentation) or CrossEntropyLoss should be used, even though MSE may also work.
E.g., this uses crossentropy: https://github.com/wkentaro/pytorch-fcn/blob/master/torchfcn/trainer.py
Any special reason for MSE? I am just curious.
Thanks.
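To make the comparison concrete, here is a minimal sketch of both losses on a single-channel binary segmentation output. The tensors are random stand-ins, and the shapes are assumptions rather than values from the repository. `nn.BCEWithLogitsLoss` applies the sigmoid internally, so it takes raw scores.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(2, 1, 4, 4)                     # raw head output: (N, 1, H, W)
target = torch.randint(0, 2, (2, 1, 4, 4)).float()   # binary mask, same shape

# Binary cross-entropy on raw scores (sigmoid is applied inside the loss).
bce = nn.BCEWithLogitsLoss()(logits, target)

# MSE against the same mask, here computed on sigmoid probabilities.
mse = nn.MSELoss()(torch.sigmoid(logits), target)

print(bce.item(), mse.item())
```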
As presented in https://pytorch.org/hub/pytorch_vision_deeplabv3_resnet101/, I guess any input image must be normalized, because the backbone (ResNet) was trained on normalized inputs.
All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (N, 3, H, W), where N is the number of images, H and W are expected to be at least 224 pixels. The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
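The quoted requirement can be sketched in plain NumPy as below. The function name `normalize` and the all-white test image are illustrative assumptions. In a torchvision pipeline, `transforms.ToTensor()` followed by `transforms.Normalize(...)` performs the same steps.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize(image_hwc_uint8: np.ndarray) -> np.ndarray:
    """uint8 RGB image (H, W, 3) -> float32 batch (1, 3, H, W), ImageNet-normalized."""
    x = image_hwc_uint8.astype(np.float32) / 255.0   # load into [0, 1]
    x = (x - IMAGENET_MEAN) / IMAGENET_STD           # per-channel normalization
    return x.transpose(2, 0, 1)[np.newaxis]          # HWC -> NCHW with N = 1

# Made-up all-white 224x224 image standing in for a real photo.
image = np.full((224, 224, 3), 255, dtype=np.uint8)
batch = normalize(image)
print(batch.shape)
```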
Hi, I've trained the model successfully. On performing inference as per code in "analysis.ipynb", I see that the histogram of the flattened output array has a few negative values as well. How is this possible? Isn't the final output of the network obtained after a sigmoid layer so as to represent probabilities?
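A likely reason, stated as an assumption about this particular model: torchvision's `DeepLabHead` ends in a plain convolution with no sigmoid, so unless one is added elsewhere, the values in `['out']` are raw scores that may be negative. A minimal NumPy sketch of mapping such scores to probabilities, using made-up values:

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    """Logistic function, maps any real score into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

raw = np.array([-2.0, 0.0, 3.0])   # made-up raw scores, including a negative one
probs = sigmoid(raw)
print(probs)
```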
Thanks for the amazing project! My dataset is for multi-class segmentation. Each mask image is (H, W), where H is the height and W is the width, and each pixel is an integer representing the class. For example, tree: 0, ... car: 8, sky: 9. The mask looks like [[0,3,9],[3,4,5]].
The number of my classes is 10.
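I'm wondering how to train on this dataset. Should it be like this?
from torchvision import models
from torchvision.models.segmentation.deeplabv3 import DeepLabHead

def get_model(num_classes=10):
    model = models.segmentation.deeplabv3_resnet101(pretrained=True, progress=True)
    # replace the classifier head so it emits num_classes channels
    model.classifier = DeepLabHead(2048, num_classes)
    model.train()
    return model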
However, the prediction size seems to be wrong. y_pred is torch.Size([8, 38, 256, 456]) but y_truth is torch.Size([8, 256, 456]), where 8 is the batch size, 256 is H, and 456 is W.
y_pred = model(inputs)['out']
The sizes don't match, so I can't feed them into the loss function. Moreover, y_pred holds a float for every element, but I expected integers representing the class, like 0, 1, 2, 3.
How should I deal with this? Thanks a lot for helping!
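As a hedged sketch of one common setup: `nn.CrossEntropyLoss` accepts raw scores of shape (N, C, H, W) together with integer class ids of shape (N, H, W), so the two tensors are not expected to have the same shape. The channel dimension C should equal the number of classes. Class ids are recovered with an argmax over the channel dimension. The tensors below are random stand-ins with a reduced spatial size.

```python
import torch
import torch.nn as nn

num_classes = 10
torch.manual_seed(0)

y_pred = torch.randn(8, num_classes, 16, 16)           # (N, C, H, W) raw scores
y_truth = torch.randint(0, num_classes, (8, 16, 16))   # (N, H, W) int64 class ids

# Cross-entropy consumes the float scores and the integer mask directly.
loss = nn.CrossEntropyLoss()(y_pred, y_truth)

# Per-pixel class ids: index of the highest score along the channel axis.
class_map = y_pred.argmax(dim=1)                       # (N, H, W)

print(loss.item(), class_map.shape)
```

The float values in `y_pred` are per-class scores, not class ids, which is why they only become integers after the argmax.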
I tried to apply it to my own dataset (4 images), but the model does not work. What's wrong?
In Analysis.ipynb, cell In [6]:
img = cv2.imread(f'./CrackForest/Images/{ino:03d}.jpg').transpose(2,0,1).reshape(1,3,320,480)
loads the color channels in the wrong order: cv2.imread returns BGR, whereas PIL's Image.open returns RGB. Use cv2.cvtColor to convert BGR to RGB:
img = cv2.imread(f'./CrackForest/Images/{ino:03d}.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).transpose(2,0,1).reshape(1,3,320,480)
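The channel-order issue can be illustrated without OpenCV. In the sketch below, a small made-up array stands in for the output of `cv2.imread`. Reversing the last axis has the same effect as `cv2.COLOR_BGR2RGB` on a 3-channel image.

```python
import numpy as np

# Stand-in for cv2.imread output: (H, W, 3) uint8 in BGR order, pure blue.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255

# Reverse the channel axis: BGR -> RGB.
rgb = bgr[..., ::-1]

# HWC -> CHW, then add the batch axis to get (1, 3, H, W).
chw = np.ascontiguousarray(rgb.transpose(2, 0, 1))[np.newaxis]

print(chw.shape)
```

After the conversion, blue sits in channel index 2, which is where a model trained on RGB input expects it.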
from sklearn.metrics import roc_auc_score, f1_score
ImportError: No module named sklearn.metrics
Did you freeze any layers in your implementation, or is the whole model retrained starting from the pretrained DeepLabv3 weights?
Hi, @msminhas93
I hit this error when running with my data.
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])
What am I doing wrong?
Thanks
Best,
@bemoregt.
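This error is raised by batch normalization in training mode when a channel receives a single value, which typically happens when a batch contains only one sample (for example, the final partial batch of an epoch). A small sketch for checking whether a dataset size and batch size lead to that case follows. The helper name `last_batch_size` is hypothetical.

```python
def last_batch_size(num_samples: int, batch_size: int) -> int:
    """Size of the final batch when the dataset is split without dropping samples."""
    remainder = num_samples % batch_size
    return remainder or batch_size

# 9 samples with batch size 4 -> batches of 4, 4 and 1: the last one breaks BatchNorm.
print(last_batch_size(9, 4))
# 8 samples with batch size 4 -> two full batches.
print(last_batch_size(8, 4))
```

If the last batch has size 1, passing `drop_last=True` to `torch.utils.data.DataLoader` or choosing a different batch size avoids the single-sample batch.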
I tried to open a PR but got an error.
Anyway, in trainer.py it should look like this. Otherwise the metrics are averaged together with a zero and are therefore inaccurate:
- batchsummary = {a: [0] for a in fieldnames}
+ batchsummary = {a: [] for a in fieldnames}
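The effect of the leading zero can be seen with a few made-up metric values:

```python
from statistics import mean

# Made-up per-batch values of some metric.
values = [0.8, 0.6, 0.4]

seeded = [0] + values   # list initialised as [0], then appended to
clean = [] + values     # list initialised as [], then appended to

biased = mean(seeded)     # the extra zero drags the average down
unbiased = mean(clean)

print(biased, unbiased)
```

With three batches the average drops from about 0.6 to about 0.45, and the bias is larger the fewer batches there are.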
RuntimeError Traceback (most recent call last)
Cell In[27], line 44
42 # Specify the optimizer with a lower learning rate
43 optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
---> 44 _ = train_model(model,
45 criterion,
46 dataloaders,
47 optimizer,
48 bpath=exp_directory,
49 metrics=metrics,
50 num_epochs=epochs)
52 # Save the trained model
53 torch.save(model, exp_directory / 'weights.pt')
File ~/Documents/5G/DeepLabv3FineTuning/trainer.py:49, in train_model(model, criterion, dataloaders, optimizer, metrics, bpath, num_epochs)
47 # track history if only in train
48 with torch.set_grad_enabled(phase == 'Train'):
---> 49 outputs = model(inputs)
50 loss = criterion(outputs['out'], masks)
51 y_pred = outputs['out'].data.cpu().numpy().ravel()
File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:722, in Module._call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),
725 self._forward_hooks.values()):
726 hook_result = hook(self, input, result)
File ~/.local/lib/python3.8/site-packages/torchvision/models/segmentation/_utils.py:20, in _SimpleSegmentationModel.forward(self, x)
18 input_shape = x.shape[-2:]
19 # contract: features is a dict of tensors
---> 20 features = self.backbone(x)
22 result = OrderedDict()
23 x = features["out"]
File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:722, in Module._call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),
725 self._forward_hooks.values()):
726 hook_result = hook(self, input, result)
File ~/.local/lib/python3.8/site-packages/torchvision/models/_utils.py:63, in IntermediateLayerGetter.forward(self, x)
61 out = OrderedDict()
62 for name, module in self.items():
---> 63 x = module(x)
64 if name in self.return_layers:
65 out_name = self.return_layers[name]
File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:722, in Module._call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),
725 self._forward_hooks.values()):
726 hook_result = hook(self, input, result)
File ~/.local/lib/python3.8/site-packages/torch/nn/modules/container.py:117, in Sequential.forward(self, input)
115 def forward(self, input):
116 for module in self:
--> 117 input = module(input)
118 return input
File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:722, in Module._call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),
725 self._forward_hooks.values()):
726 hook_result = hook(self, input, result)
File ~/.local/lib/python3.8/site-packages/torchvision/models/resnet.py:112, in Bottleneck.forward(self, x)
109 out = self.bn2(out)
110 out = self.relu(out)
--> 112 out = self.conv3(out)
113 out = self.bn3(out)
115 if self.downsample is not None:
File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:722, in Module._call_impl(self, *input, **kwargs)
720 result = self._slow_forward(*input, **kwargs)
721 else:
--> 722 result = self.forward(*input, **kwargs)
723 for hook in itertools.chain(
724 _global_forward_hooks.values(),
725 self._forward_hooks.values()):
726 hook_result = hook(self, input, result)
File ~/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py:419, in Conv2d.forward(self, input)
418 def forward(self, input: Tensor) -> Tensor:
--> 419 return self._conv_forward(input, self.weight)
File ~/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py:415, in Conv2d._conv_forward(self, input, weight)
411 if self.padding_mode != 'zeros':
412 return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
413 weight, self.bias, self.stride,
414 _pair(0), self.dilation, self.groups)
--> 415 return F.conv2d(input, weight, self.bias, self.stride,
416 self.padding, self.dilation, self.groups)
RuntimeError: CUDA out of memory. Tried to allocate 42.00 MiB (GPU 0; 3.81 GiB total capacity; 2.79 GiB already allocated; 25.44 MiB free; 2.92 GiB reserved in total by PyTorch)
Hello,
First off: Great work with this project, it is exceptionally cleanly written and structured imo!
It is nice to see that there is a tutorial, the full code on github with clean and working instructions. (It is working using the CPU for me) :)
Is there a way to make pytorch reserve less RAM?
In another post someone mentioned an active variable in the session that seems to use too much memory:
https://discuss.pytorch.org/t/how-does-reserved-in-total-by-pytorch-work/70172/2
Could this be the case? (I am not a Python expert.)
Here's the issue:
\DeepLabv3FineTuning-master> python main.py --data-directory CrackForest --exp_directory CFExp --epochs 1
Epoch 1/1
0%| | 0/24 [00:05<?, ?it/s]
Traceback (most recent call last):
File "main.py", line 66, in <module>
main()
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\click\core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\click\core.py", line 782, in main
rv = self.invoke(ctx)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\click\core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\click\core.py", line 610, in invoke
return callback(*args, **kwargs)
File "main.py", line 53, in main
_ = train_model(model,
File "C:\Users\gdurm\Downloads\DeepLabv3FineTuning-master\DeepLabv3FineTuning-master\trainer.py", line 49, in train_model
outputs = model(inputs)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torchvision\models\segmentation\_utils.py", line 19, in forward
features = self.backbone(x)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torchvision\models\_utils.py", line 63, in forward
x = module(x)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\container.py", line 119, in forward
input = module(input)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torchvision\models\resnet.py", line 133, in forward
out = self.bn3(out)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\modules\batchnorm.py", line 135, in forward
return F.batch_norm(
File "C:\tools\Anaconda3\envs\deeplabfinetuning\lib\site-packages\torch\nn\functional.py", line 2146, in batch_norm
return torch.batch_norm(
RuntimeError: CUDA out of memory. Tried to allocate 38.00 MiB (GPU 0; 2.00 GiB total capacity; 1.05 GiB already allocated; 11.44 MiB free; 1.10 GiB reserved in total by PyTorch)
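On a GPU with 2 to 4 GiB, the usual first step is a smaller batch size or smaller input images. If a larger effective batch is still wanted, gradient accumulation is one option. It is not something the repository's trainer is known to do, so the sketch below is an illustration only, with a tiny linear layer standing in for the segmentation model and random data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)                       # tiny stand-in for the segmentation model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()

data = torch.randn(8, 4)
target = torch.randn(8, 1)
micro_batch, accum_steps = 2, 4               # 4 micro-batches of 2 instead of one batch of 8

before = model.weight.detach().clone()
optimizer.zero_grad()
for i in range(accum_steps):
    sl = slice(i * micro_batch, (i + 1) * micro_batch)
    # Scale each loss so the accumulated gradient matches the full-batch average.
    loss = criterion(model(data[sl]), target[sl]) / accum_steps
    loss.backward()                           # gradients add up in .grad
optimizer.step()                              # one update after all micro-batches

print((model.weight - before).abs().max().item())
```

Only one micro-batch of activations is held in memory at a time. Note that batch-norm statistics are still computed per micro-batch, and a micro-batch of a single sample would fail in training mode for a model with batch normalization.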
In the variable data, I get all zeros for some reason. Can you please provide assistance regarding that?
import torch
model = torch.load('output/weights.pt')
model.eval()
import urllib.request
url, filename = ("file:///home/rhobincu/gitroot/DeepLabv3FineTuning/CrackForest/Images/092.jpg", "092.jpg")
# Python 3 only: urllib.URLopener does not exist there, so call urlretrieve directly
urllib.request.urlretrieve(url, filename)
# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
print(input_image)
preprocess = transforms.Compose(
    [transforms.Resize((512, 512)), transforms.ToTensor()]
)
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model
# move the input and model to GPU for speed if available
if torch.cuda.is_available():
    print('Using GPU!')
    input_batch = input_batch.to('cuda')
    model.to('cuda')
import time
start = time.perf_counter()  # time.clock() was removed in Python 3.8
with torch.no_grad():
    output = model(input_batch)['out'][0]
end = time.perf_counter()
print('Inference duration (s): ', end - start)
data = output.argmax(0).detach().byte().cpu().numpy()
# create a color palette, selecting a color for each class
palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2])
colors = torch.as_tensor([i for i in range(2)])[:, None] * palette
colors = (colors % 255).numpy().astype("uint8")
print(colors)
# plot the semantic segmentation predictions
img_size = input_image.size
r = Image.fromarray(data).resize(img_size)
r.putpalette(colors)
import matplotlib.pyplot as plt
plt.imshow(r)
plt.show()
Hi, I want to deploy the model in TensorRT format. Do you have any experience converting the model?
Thanks!