reachsumit / deep-unet-for-satellite-image-segmentation

Satellite Imagery Feature Detection with SpaceNet dataset using deep UNet

deep-learning deep-neural-networks image-segmentation unet unet-image-segmentation keras keras-tensorflow python3

deep-unet-for-satellite-image-segmentation's Introduction

Deep UNet for satellite image segmentation

[banner image]

About this project

This is a Keras-based implementation of a deep UNet that performs satellite image segmentation.

Dataset

  • The dataset consists of 8-band commercial-grade satellite imagery taken from the SpaceNet dataset.
  • The train collection contains a few TIFF files for each of the 24 locations.
  • Every location has an 8-channel image containing spectral information from several wavelength channels (red, red edge, coastal, blue, green, yellow, near-IR1 and near-IR2). These files are located in the data/mband/ directory.
  • Also available are correctly segmented images of each training location, called masks. These files contain information about 5 different classes: buildings, roads, trees, crops and water (note that the original Kaggle contest had 10 classes).
  • The satellite images are 16-bit, while the mask files are 8-bit.

Implementation

  • A deep UNet architecture is employed to perform the segmentation.
  • Image augmentation is applied to the input images to significantly increase the amount of training data (a sketch of the idea follows the note below).
  • Image augmentation is also applied at test time, and the averaged results are exported to the result.tif image.

Note: Training for this model was done on a Tesla P100-PCIE-16GB GPU.
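
A minimal sketch of that augmentation idea, assuming numpy arrays shaped (H, W, C); this is illustrative, not the repository's exact code:

    import numpy as np

    def random_augment(patch, mask):
        """Apply the same random flip/rotation to an (H, W, C) patch and its mask."""
        if np.random.rand() < 0.5:                      # random horizontal flip
            patch, mask = patch[:, ::-1], mask[:, ::-1]
        if np.random.rand() < 0.5:                      # random vertical flip
            patch, mask = patch[::-1, :], mask[::-1, :]
        k = np.random.randint(4)                        # rotate by k * 90 degrees
        return np.rot90(patch, k), np.rot90(mask, k)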

Prediction Example

[prediction example image]

Network architecture

[deep UNet architecture diagram]


deep-unet-for-satellite-image-segmentation's Issues

How to convert the project to PyTorch?

Hi,
I see that the deep learning in your project is implemented with Keras. However, if I wanted to rewrite the model and training in PyTorch, how would I go about the conversion?

Could you please give me some suggestions? Thanks!

how to get confusion matrix

I am a little bit confused here: what are the y_test and y_pred here that are required for building a confusion matrix?
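
One common way to obtain y_test and y_pred at the pixel level is to flatten the argmax of the predicted and ground-truth masks. A sketch, assuming a trained model and validation arrays like those produced in train_unet.py; scikit-learn is an extra dependency, not part of this repository:

    import numpy as np
    from sklearn.metrics import confusion_matrix

    # x_val: (N, H, W, n_channels) patches, y_val: (N, H, W, n_classes) binary masks
    probs = model.predict(x_val)                  # (N, H, W, n_classes) probabilities
    y_pred = np.argmax(probs, axis=-1).ravel()    # predicted class index per pixel
    y_test = np.argmax(y_val, axis=-1).ravel()    # ground-truth class index per pixel
    print(confusion_matrix(y_test, y_pred))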

Train Run Error

Hello all, thanks in advance for any help.
When I run the training part I run into this problem, and I don't know how to solve it:

start train net
Generated 400 patches
Generated 100 patches


InvalidArgumentError Traceback (most recent call last)
~\anaconda3\envs\tf3\lib\site-packages\tensorflow_core\python\framework\ops.py in _create_c_op(graph, node_def, inputs, control_inputs)
1618 try:
-> 1619 c_op = c_api.TF_FinishOperation(op_desc)
1620 except errors.InvalidArgumentError as e:

InvalidArgumentError: Negative dimension size caused by subtracting 2 from 1 for 'max_pooling2d_60/MaxPool' (op: 'MaxPool') with input shapes: [?,1,1,512].

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
in
18 return model
19
---> 20 train_net()

in train_net()
3 x_train, y_train = get_patches(X_DICT_TRAIN, Y_DICT_TRAIN, n_patches=TRAIN_SZ, sz=PATCH_SZ)
4 x_val, y_val = get_patches(X_DICT_VALIDATION, Y_DICT_VALIDATION, n_patches=VAL_SZ, sz=PATCH_SZ)
----> 5 model = get_model()
6 if os.path.isfile(weights_path):
7 model.load_weights(weights_path)

in get_model()
17
18 def get_model():
---> 19 return unet_model(N_CLASSES, PATCH_SZ, n_channels=N_BANDS, upconv=UPCONV, class_weights=CLASS_WEIGHTS)
20
21

~\Desktop\Unet\deep-unet-for-satellite-image-segmentation-master\unet_model.py in unet_model(n_classes, im_sz, n_channels, n_filters_start, growth_factor, upconv, class_weights)
44 conv4_1 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(pool4_1)
45 conv4_1 = Conv2D(n_filters, (3, 3), activation='relu', padding='same')(conv4_1)
---> 46 pool4_2 = MaxPooling2D(pool_size=(2, 2))(conv4_1)
47 pool4_2 = Dropout(droprate)(pool4_2)
48

~\anaconda3\envs\tf3\lib\site-packages\keras\backend\tensorflow_backend.py in symbolic_fn_wrapper(*args, **kwargs)
73 if _SYMBOLIC_SCOPE.value:
74 with get_graph().as_default():
---> 75 return func(*args, **kwargs)
76 else:
77 return func(*args, **kwargs)

~\anaconda3\envs\tf3\lib\site-packages\keras\engine\base_layer.py in call(self, inputs, **kwargs)
487 # Actually call the layer,
488 # collecting output(s), mask(s), and shape(s).
--> 489 output = self.call(inputs, **kwargs)
490 output_mask = self.compute_mask(inputs, previous_mask)
491

~\anaconda3\envs\tf3\lib\site-packages\keras\layers\pooling.py in call(self, inputs)
203 strides=self.strides,
204 padding=self.padding,
--> 205 data_format=self.data_format)
206 return output
207

~\anaconda3\envs\tf3\lib\site-packages\keras\layers\pooling.py in _pooling_function(self, inputs, pool_size, strides, padding, data_format)
266 output = K.pool2d(inputs, pool_size, strides,
267 padding, data_format,
--> 268 pool_mode='max')
269 return output
270

~\anaconda3\envs\tf3\lib\site-packages\keras\backend\tensorflow_backend.py in pool2d(x, pool_size, strides, padding, data_format, pool_mode)
4070 x = tf.nn.max_pool(x, pool_size, strides,
4071 padding=padding,
-> 4072 data_format=tf_data_format)
4073 elif pool_mode == 'avg':
4074 x = tf.nn.avg_pool(x, pool_size, strides,

~\anaconda3\envs\tf3\lib\site-packages\tensorflow_core\python\ops\nn_ops.py in max_pool_v2(input, ksize, strides, padding, data_format, name)
3824 padding=padding,
3825 data_format=data_format,
-> 3826 name=name)
3827 # pylint: enable=redefined-builtin
3828

~\anaconda3\envs\tf3\lib\site-packages\tensorflow_core\python\ops\gen_nn_ops.py in max_pool(input, ksize, strides, padding, data_format, name)
5198 _, _, _op, _outputs = _op_def_library._apply_op_helper(
5199 "MaxPool", input=input, ksize=ksize, strides=strides, padding=padding,
-> 5200 data_format=data_format, name=name)
5201 _result = _outputs[:]
5202 if _execute.must_record_gradient():

~\anaconda3\envs\tf3\lib\site-packages\tensorflow_core\python\framework\op_def_library.py in _apply_op_helper(op_type_name, name, **keywords)
740 op = g._create_op_internal(op_type_name, inputs, dtypes=None,
741 name=scope, input_types=input_types,
--> 742 attrs=attr_protos, op_def=op_def)
743
744 # outputs is returned as a separate return value so that the output

~\anaconda3\envs\tf3\lib\site-packages\tensorflow_core\python\framework\func_graph.py in _create_op_internal(self, op_type, inputs, dtypes, input_types, name, attrs, op_def, compute_device)
593 return super(FuncGraph, self)._create_op_internal( # pylint: disable=protected-access
594 op_type, inputs, dtypes, input_types, name, attrs, op_def,
--> 595 compute_device)
596
597 def capture(self, tensor, name=None, shape=None):

~\anaconda3\envs\tf3\lib\site-packages\tensorflow_core\python\framework\ops.py in _create_op_internal(self, op_type, inputs, dtypes, input_types, name, attrs, op_def, compute_device)
3320 input_types=input_types,
3321 original_op=self._default_original_op,
-> 3322 op_def=op_def)
3323 self._create_op_helper(ret, compute_device=compute_device)
3324 return ret

~\anaconda3\envs\tf3\lib\site-packages\tensorflow_core\python\framework\ops.py in init(self, node_def, g, inputs, output_types, control_inputs, input_types, original_op, op_def)
1784 op_def, inputs, node_def.attr)
1785 self._c_op = _create_c_op(self._graph, node_def, grouped_inputs,
-> 1786 control_input_ops)
1787 name = compat.as_str(node_def.name)
1788 # pylint: enable=protected-access

~\anaconda3\envs\tf3\lib\site-packages\tensorflow_core\python\framework\ops.py in _create_c_op(graph, node_def, inputs, control_inputs)
1620 except errors.InvalidArgumentError as e:
1621 # Convert to ValueError for backwards compatibility.
-> 1622 raise ValueError(str(e))
1623
1624 return c_op

ValueError: Negative dimension size caused by subtracting 2 from 1 for 'max_pooling2d_60/MaxPool' (op: 'MaxPool') with input shapes: [?,1,1,512].
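
For reference, this error means the feature map has already been downsampled to 1x1 before the failing 2x2 max-pool, i.e. the input patch is too small for the network's depth. A quick sanity check (a sketch; the pool count of 5 is inferred from the traceback reaching pool4_2):

    # Each 2x2 max-pool halves the spatial size, so a UNet with n_pools
    # pooling layers needs PATCH_SZ to be divisible by 2**n_pools,
    # otherwise some pooling layer eventually sees a 1x1 input.
    n_pools = 5        # inferred from the traceback (pool4_2 is the fifth pool)
    PATCH_SZ = 160     # this repository's default; adjust to your own setting
    assert PATCH_SZ % 2 ** n_pools == 0, "PATCH_SZ must be divisible by 2**n_pools"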

Why is it possible to do multi-class classification?

I find that the gt_mband datasets only have one label; why is it possible to do multi-class classification?
In addition, how do you do multi-class classification using the binary cross-entropy loss function?
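
For what it's worth, binary cross-entropy works for multi-class segmentation when every class gets its own sigmoid output channel, so each pixel is scored independently per class (multi-label rather than softmax). A minimal Keras sketch of that output head, with illustrative layer sizes:

    from keras.models import Model
    from keras.layers import Input, Conv2D

    # One sigmoid channel per class: each pixel gets an independent probability
    # for each class, so binary cross-entropy applies per channel.
    n_classes, n_channels = 5, 8
    inp = Input(shape=(None, None, n_channels))
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(inp)
    out = Conv2D(n_classes, (1, 1), activation='sigmoid')(x)
    model = Model(inp, out)
    model.compile(optimizer='adam', loss='binary_crossentropy')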

Getting Core dumped error

Epoch 1/150
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

Where is the dataset taken from?

The satellite imagery is taken from the SpaceNet dataset: you have written this in your code, but I believe the dataset is actually taken from the DSTL Kaggle competition. Please clear up this confusion.

Assertion error while running train_unet.py

assert len(img.shape) == 3 and img.shape[0] > sz and img.shape[1] > sz and img.shape[0:2] == mask.shape[0:2]
This line raises an AssertionError while running train_unet.py.

How do I resolve this?
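
That assertion requires each image to be 3-D (H, W, C), both spatial dimensions to be larger than the patch size, and the mask to match the image spatially. A small pre-flight check along those lines (a sketch using the dictionary names visible in train_unet.py's traceback above):

    # Verify every (image, mask) pair before patch extraction.
    for img_id, img in X_DICT_TRAIN.items():
        mask = Y_DICT_TRAIN[img_id]
        assert img.ndim == 3, "%s: expected (H, W, C), got %s" % (img_id, img.shape)
        assert img.shape[0] > PATCH_SZ and img.shape[1] > PATCH_SZ, \
            "%s: image %s is not larger than PATCH_SZ=%d" % (img_id, img.shape[:2], PATCH_SZ)
        assert img.shape[:2] == mask.shape[:2], \
            "%s: image %s vs mask %s shape mismatch" % (img_id, img.shape[:2], mask.shape[:2])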

datasets

I see two datasets, mband and gt_mband. Kindly let us know the difference between them and how each is used. Thanks.

Using pre-trained weights on own data

Hi @reachsumit

I would like to use the pre-trained weights to run image segmentation on my own dataset; could you provide some information on how to go about this?

I have downloaded the pre-trained weights and cloned the repository, but what files do I need to run and how do I implement the pre-trained weights?

Many thanks,
Chloe
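
Not the author, but the general pattern would be something like the following: a sketch based on the unet_model signature visible in the traceback above; the weights path is illustrative, and the hyperparameters must match the ones the published weights were trained with.

    import numpy as np
    from unet_model import unet_model

    # Rebuild the same architecture the weights were trained on
    # (8 input bands, 5 classes here; adjust to match the pre-trained setup).
    model = unet_model(n_classes=5, im_sz=160, n_channels=8)
    model.load_weights('weights/unet_weights.hdf5')   # illustrative path

    # img: your own (160, 160, 8) patch, normalized like the training data
    probs = model.predict(np.expand_dims(img.astype(np.float32), axis=0))[0]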

dataset

Could you tell me which satellite was used, or could you give me a link to the dataset?

Training with the given U_net model gives very small probability values

Hello there. I took your UNet model and applied a similar idea to train my own model, which takes a 4-channel input (RGB + NIR) and outputs 8 classes.
When I trained my model, it gave me very inaccurate results: it predicted classes only at the boundaries of the images, and even then with very small values, the maximum probability being on the order of 1e-15.

Can you give me some of your insights or suggestions regarding this issue? Is the problem with the model, or perhaps with an inaccurate way of preprocessing?

Questions

Hello,
I tested your training and prediction code on my PC to study image segmentation with deep learning, but I have some questions (both about "predict.py") that I would appreciate you answering:
1. Why was it necessary to "fill extended image with mirrors"? (line 19)
2. Why did you have to apply several image reversals (after "for i in range(7)...") to get the image segmentation? (line 76)

Thanks for your attention.
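
Regarding question 1: "filling the extended image with mirrors" reflect-pads the image so it divides evenly into patches without introducing artificial black borders at the edges. Roughly (a sketch, not the repository's exact code):

    import numpy as np

    def mirror_extend(img, patch_sz):
        """Reflect-pad an (H, W, C) image so H and W become multiples of patch_sz."""
        h, w = img.shape[:2]
        pad_h = (patch_sz - h % patch_sz) % patch_sz
        pad_w = (patch_sz - w % patch_sz) % patch_sz
        # 'reflect' mirrors the border pixels instead of padding with zeros,
        # so the network does not see an artificial hard edge.
        return np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode='reflect')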

Runtime error

[error screenshot]
Hello, this error occurs when I run the train_unet.py file, but I don't know how to fix it. Can you help me? Thank you very much.

training data

I want to know how labeled training data is given to the neural network. I don't see any labels here.
How are the X and Y data given? I see some 'mask' files here. Please explain.

Thanks for the article.
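
For context: the 'mask' files are the labels. X is the 8-band image from data/mband/ and Y is the matching ground-truth mask from data/gt_mband/, and training patches are cut from both at the same positions. Conceptually (a sketch; the exact file names and band ordering are illustrative):

    import tifffile as tiff

    # X: multispectral input tile, Y: per-class ground-truth mask of the same tile.
    x = tiff.imread('data/mband/01.tif')      # e.g. 8 spectral bands
    y = tiff.imread('data/gt_mband/01.tif')   # e.g. 5 binary class masks
    # get_patches() then samples windows at identical positions from x and y,
    # so every input patch comes paired with its pixel-level label patch.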

How are CLASS_WEIGHTS assigned?

There is no description of how CLASS_WEIGHTS are created. I am attempting to use this on my own data for segmentation, but I believe I am running into loss-function issues because of incorrect class weights.
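
For anyone else stuck here: one common heuristic (not necessarily how this repository's values were chosen) is normalized inverse pixel frequency, so rare classes get larger weights:

    import numpy as np

    def inverse_frequency_weights(masks):
        """masks: (N, H, W, n_classes) binary ground truth; returns one weight per class."""
        # Fraction of all pixels belonging to each class.
        freq = masks.reshape(-1, masks.shape[-1]).mean(axis=0)
        # Rare classes get proportionally larger weights so the loss
        # is not dominated by the most frequent class.
        weights = 1.0 / np.maximum(freq, 1e-6)
        return weights / weights.sum()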

Mathematical correctness of prediction and the mean

Firstly, there are only 7 cases in your predict.py, where I believe you are trying to augment the input image (4 rotation states and 2 flip states). Shouldn't it be 8 cases?

Next, when taking the average, you recursively take the average of the current result with the previous average. In doing so, doesn't it give more weight to the latest prediction?

For example, say we have to take the average of three results, a, b and c (in your case you should have 8 results, though you have only 7).

The average result should be :
(a+b+c)/3

whereas you are doing:
[ (a+b)/2 + c ] / 2,
i.e. (a+b+2c)/4.

Which is not the same as (a+b+c)/3.

What am I missing here?
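
The arithmetic is easy to confirm numerically: a running pairwise mean, as described in this issue, over-weights later results, while summing once and dividing gives the uniform mean. A minimal sketch:

    import numpy as np

    results = [1.0, 2.0, 9.0]   # stand-ins for predictions a, b, c

    # Running pairwise mean, as reported above: ((a + b) / 2 + c) / 2
    running = results[0]
    for r in results[1:]:
        running = (running + r) / 2.0
    print(running)              # 5.25 -> equals (a + b + 2c) / 4, over-weighting c

    # Uniform mean: (a + b + c) / 3
    print(np.mean(results))     # 4.0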

predict part

Great work!
I would like to be enlightened more about this part, if you allow, because I couldn't understand its role in the prediction.
Thanks in advance:
    for i in range(7):
        if i == 0:  # reverse first dimension
            mymat = predict(img[::-1,:,:], model, patch_sz=PATCH_SZ, n_classes=N_CLASSES).transpose([2,0,1])
            #print(mymat[0][0][0], mymat[3][12][13])
            print("Case 1", img.shape, mymat.shape)
        elif i == 1:  # reverse second dimension
            temp = predict(img[:,::-1,:], model, patch_sz=PATCH_SZ, n_classes=N_CLASSES).transpose([2,0,1])
            #print(temp[0][0][0], temp[3][12][13])
            print("Case 2", temp.shape, mymat.shape)
            mymat = np.mean( np.array([ temp[:,::-1,:], mymat ]), axis=0 )
        elif i == 2:  # transpose (interchange) first and second dimensions
            temp = predict(img.transpose([1,0,2]), model, patch_sz=PATCH_SZ, n_classes=N_CLASSES).transpose([2,0,1])
            #print(temp[0][0][0], temp[3][12][13])
            print("Case 3", temp.shape, mymat.shape)
            mymat = np.mean( np.array([ temp.transpose(0,2,1), mymat ]), axis=0 )
        elif i == 3:
            temp = predict(np.rot90(img, 1), model, patch_sz=PATCH_SZ, n_classes=N_CLASSES)
            #print(temp.transpose([2,0,1])[0][0][0], temp.transpose([2,0,1])[3][12][13])
            print("Case 4", temp.shape, mymat.shape)
            mymat = np.mean( np.array([ np.rot90(temp, -1).transpose([2,0,1]), mymat ]), axis=0 )
        elif i == 4:
            temp = predict(np.rot90(img, 2), model, patch_sz=PATCH_SZ, n_classes=N_CLASSES)
            #print(temp.transpose([2,0,1])[0][0][0], temp.transpose([2,0,1])[3][12][13])
            print("Case 5", temp.shape, mymat.shape)
            mymat = np.mean( np.array([ np.rot90(temp, -2).transpose([2,0,1]), mymat ]), axis=0 )
        elif i == 5:
            temp = predict(np.rot90(img, 3), model, patch_sz=PATCH_SZ, n_classes=N_CLASSES)
            #print(temp.transpose([2,0,1])[0][0][0], temp.transpose([2,0,1])[3][12][13])
            print("Case 6", temp.shape, mymat.shape)
            mymat = np.mean( np.array([ np.rot90(temp, -3).transpose(2,0,1), mymat ]), axis=0 )
        else:
            temp = predict(img, model, patch_sz=PATCH_SZ, n_classes=N_CLASSES).transpose([2,0,1])
            #print(temp[0][0][0], temp[3][12][13])
            print("Case 7", temp.shape, mymat.shape)
            mymat = np.mean( np.array([ temp, mymat ]), axis=0 )
