georgeseif / semantic-segmentation-suite

Semantic Segmentation Suite in TensorFlow. Implement, train, and test new Semantic Segmentation models easily!

Python 100.00%
deep-learning tensorflow segmentation computer-vision python semantic-segmentation densenet refinenet encoder-decoder semantic-segmentation-models

semantic-segmentation-suite's Introduction

Semantic Segmentation Suite in TensorFlow


News

  • This repo has been deprecated and will no longer be handling issues. Feel free to use it as is :)

Description

This repository serves as a Semantic Segmentation Suite. The goal is to easily be able to implement, train, and test new Semantic Segmentation models! Complete with the following:

  • Training and testing modes
  • Data augmentation
  • Several state-of-the-art models. Easily plug and play with different models
  • Able to use any dataset
  • Evaluation including precision, recall, F1 score, average accuracy, per-class accuracy, and mean IoU (see the sketch just after this list)
  • Plotting of loss function and accuracy over epochs
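
For reference, here is a minimal sketch of how the mean IoU metric can be computed from a confusion matrix (an illustration of the technique, not the repo's exact utils.py implementation):

import numpy as np

def mean_iou(pred, gt, num_classes):
    # Build a confusion matrix from flattened prediction and ground-truth label maps
    mask = (gt >= 0) & (gt < num_classes)
    hist = np.bincount(num_classes * gt[mask].astype(int) + pred[mask].astype(int),
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    # Per-class IoU = true positives / (predicted pixels + ground-truth pixels - true positives)
    intersection = np.diag(hist)
    union = hist.sum(axis=0) + hist.sum(axis=1) - intersection
    iou = intersection / np.maximum(union, 1)
    return iou.mean()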

Any suggestions to improve this repository, including requests for new segmentation models you would like to see, are welcome!

You can also check out my Transfer Learning Suite.

Citing

If you find this repository useful, please consider citing it using a link to the repo :)

Frontends

The feature extraction frontends currently made available are listed in frontend_builder.py.

Models

The segmentation models currently made available are listed in model_builder.py.

Files and Directories

  • train.py: Training on the dataset of your choice. Default is CamVid

  • test.py: Testing on the dataset of your choice. Default is CamVid

  • predict.py: Use your newly trained model to run a prediction on a single image

  • helper.py: Quick helper functions for data preparation and visualization

  • utils.py: Utilities for printing, debugging, testing, and evaluation

  • models: Folder containing all model files. Use this to build your models, or use a pre-built one

  • CamVid: The CamVid dataset for Semantic Segmentation as a test bed. This is the 32 class version

  • checkpoints: Checkpoint files for each epoch during training

  • Test: Test results including images, per-class accuracies, precision, recall, and F1 score

Installation

This project has the following dependencies:

  • Numpy sudo pip install numpy

  • OpenCV Python sudo apt-get install python-opencv

  • TensorFlow sudo pip install --upgrade tensorflow-gpu

Usage

The only thing you have to do to get started is set up the folders in the following structure:

├── "dataset_name"                   
|   ├── train
|   ├── train_labels
|   ├── val
|   ├── val_labels
|   ├── test
|   ├── test_labels
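
Given that layout, gathering matching image/label paths could look like the following (an illustrative sketch only; the suite's own data loading lives in utils.py):

import os
import glob

def gather_split(dataset_dir, split):
    # Sort both lists so each image lines up with its label by filename
    images = sorted(glob.glob(os.path.join(dataset_dir, split, '*')))
    labels = sorted(glob.glob(os.path.join(dataset_dir, split + '_labels', '*')))
    assert len(images) == len(labels), "every image needs a matching label"
    return images, labels

train_images, train_labels = gather_split('CamVid', 'train')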

Put a text file under the dataset directory called "class_dict.csv" which contains the list of classes along with the R, G, B colour labels to visualize the segmentation results. This kind of dictionary is usually supplied with the dataset. Here is an example for the CamVid dataset:

name,r,g,b
Animal,64,128,64
Archway,192,0,128
Bicyclist,0,128,192
Bridge,0,128,64
Building,128,0,0
Car,64,0,128
CartLuggagePram,64,0,192
Child,192,128,64
Column_Pole,192,192,128
Fence,64,64,128
LaneMkgsDriv,128,0,192
LaneMkgsNonDriv,192,0,64
Misc_Text,128,128,64
MotorcycleScooter,192,0,192
OtherMoving,128,64,64
ParkingBlock,64,192,128
Pedestrian,64,64,0
Road,128,64,128
RoadShoulder,128,128,192
Sidewalk,0,0,192
SignSymbol,192,128,128
Sky,128,128,128
SUVPickupTruck,64,128,192
TrafficCone,0,0,64
TrafficLight,0,64,64
Train,192,64,128
Tree,128,128,0
Truck_Bus,192,128,192
Tunnel,64,0,64
VegetationMisc,192,192,0
Void,0,0,0
Wall,64,192,0

Note: If you are using any of the networks that rely on a pre-trained ResNet, then you will need to download the pre-trained weights using the provided script. These are currently: PSPNet, RefineNet, DeepLabV3, DeepLabV3+, GCN.

Then you can simply run train.py! Check out the optional command line arguments:

usage: train.py [-h] [--num_epochs NUM_EPOCHS]
                [--checkpoint_step CHECKPOINT_STEP]
                [--validation_step VALIDATION_STEP] [--image IMAGE]
                [--continue_training CONTINUE_TRAINING] [--dataset DATASET]
                [--crop_height CROP_HEIGHT] [--crop_width CROP_WIDTH]
                [--batch_size BATCH_SIZE] [--num_val_images NUM_VAL_IMAGES]
                [--h_flip H_FLIP] [--v_flip V_FLIP] [--brightness BRIGHTNESS]
                [--rotation ROTATION] [--model MODEL] [--frontend FRONTEND]

optional arguments:
  -h, --help            show this help message and exit
  --num_epochs NUM_EPOCHS
                        Number of epochs to train for
  --checkpoint_step CHECKPOINT_STEP
                        How often to save checkpoints (epochs)
  --validation_step VALIDATION_STEP
                        How often to perform validation (epochs)
  --image IMAGE         The image you want to predict on. Only valid in
                        "predict" mode.
  --continue_training CONTINUE_TRAINING
                        Whether to continue training from a checkpoint
  --dataset DATASET     Dataset you are using.
  --crop_height CROP_HEIGHT
                        Height of cropped input image to network
  --crop_width CROP_WIDTH
                        Width of cropped input image to network
  --batch_size BATCH_SIZE
                        Number of images in each batch
  --num_val_images NUM_VAL_IMAGES
                        The number of images to use for validation
  --h_flip H_FLIP       Whether to randomly flip the image horizontally for
                        data augmentation
  --v_flip V_FLIP       Whether to randomly flip the image vertically for data
                        augmentation
  --brightness BRIGHTNESS
                        Whether to randomly change the image brightness for
                        data augmentation. Specifies the max brightness change
                        as a factor between 0.0 and 1.0. For example, 0.1
                        represents a max brightness change of 10% (+-).
  --rotation ROTATION   Whether to randomly rotate the image for data
                        augmentation. Specifies the max rotation angle in
                        degrees.
  --model MODEL         The model you are using. See model_builder.py for
                        supported models
  --frontend FRONTEND   The frontend you are using. See frontend_builder.py
                        for supported models
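
For example, a basic CamVid run matching the results below might look like this (a suggestion based on the flags listed above, not a prescribed configuration):

python train.py --dataset CamVid --model FC-DenseNet103 --crop_height 352 --crop_width 480 --batch_size 1 --num_epochs 300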

Results

These are some sample results for the CamVid dataset with 11 classes (previous research version).

In training, I used a batch size of 1 and an image size of 352x480. The following results are for the FC-DenseNet103 model trained for 300 epochs. I used RMSProp with a learning rate of 0.001 and decay of 0.995. Unlike the paper, I did not use any data augmentation, nor any class balancing. These are just some quick and dirty example results.

Note that the checkpoint files are not uploaded to this repository since they are too big for GitHub (greater than 100 MB)

Class Original Accuracy My Accuracy
Sky 93.0 94.1
Building 83.0 81.2
Pole 37.8 38.3
Road 94.5 97.5
Pavement 82.2 87.9
Tree 77.3 75.5
SignSymbol 43.9 49.7
Fence 37.1 69.0
Car 77.3 87.0
Pedestrian 59.6 60.3
Bicyclist 50.5 75.3
Unlabelled N/A 40.9
Global 91.5 89.6
[Plots omitted: Loss vs. Epochs, Validation Accuracy vs. Epochs]
[Images omitted: Original, Ground Truth (GT), Result]

semantic-segmentation-suite's People

Contributors

1453042287, arasharchor, georgeseif, mrshu, simondeussen, spritea

semantic-segmentation-suite's Issues

RESNET checkpoints - no mention in doc

Hello,

maybe not a big thing, as one can debug the problem rather easily. But can you update the documentation to mention that ResNet needs checkpoints, and that one should run the provided script to get them? Or, even better, incorporate the script into the main.py file.

Thanks
Peter

Suggestion: ICNet and AdapNet

Hi.

These two segmentation networks have recently been very interesting.

The first already has an implementation (https://github.com/hellochick/ICNet-tensorflow), but maybe you also want to add it here. The second one (http://deepscene.cs.uni-freiburg.de/) has really cool fusion techniques, although the model inside it is just ResNet50.

So if you find these interesting and want to experiment with these then I would be very happy :)

I might even try adding icnet myself.

Model Checkpoints

Since the model checkpoints are too big for GitHub, would it be possible to upload them to Google Drive and provide a link? Or, if it's easier, we can do it through email. Thank you!

python main.py --mode predict

Traceback (most recent call last):
File "main.py", line 522, in
sys.stdout.write("Testing image " + args.image)
TypeError: cannot concatenate 'str' and 'NoneType' objects

Still can't predict?

Reproducibility of CamVid results

George,

I'm having problems reproducing the results for training CamVid. I am trying the follow with no luck. I attempt to predict after training and confirmed the prediction is incorrect also.

TRAINING RESULTS:
Validation precision = 0.49989
Validation recall = 0.512134
Validation F1 = 0.50587
Validation IoU = 0.01776

TRAIN:
python main.py --mode train --dataset CamVid --model PSPNet-Res50 --batch_size 100000 --num_epoch 300

PREDICT:
python main.py --mode predict --dataset CamVid --model PSPNet-Res50 --image trash/in.png

RefineNet Implementation

Hi GeorgeSeif,

Thanks for your implementation of so many SOTA segmentation networks.

I am interested in the RefineNet which works quite well on many datasets.

(1) I notice that the original paper of RefineNet states that we should first apply the softmax function to the 1/4 scale output of the RefineNet blocks to get a dense score map, and then upscale the score map to the original size.

In your implementation, you first upscale and then calculate the softmax cross entropy loss.
Have you tested the effect of these two implementations?

(2) Similarly, the authors applied another RCU block to the final output of the RefineNet blocks before upscaling and softmax. However, your implementation seems to miss these RCU units.

(3) Did you try to measure the performance of RefineNet on popular datasets such as NYUv2, CamVid or Cityscapes, compared to the original paper?

If so, would you mind telling me the performance of your implementation? How about the efficiency (inference time on a modern GPU using ResNet50/101)?

Many thanks.

Shuaifeng

MobileUnet pretrained model

Could you please share the pretrained model of MobileUNet? I didn't find a link to it in this project. Thanks a lot!

Code consulting

Please, I have 11 errors when running main.py. For example, an error at line 167 (build_custom), line 415 (test_input_names), line 420 (test_output_names), line 434 (val_input_names), etc.

Training loss begins at 0 and stuck on it

I'm trying to do segmentation with only a human class and background. I have modified main.py such that the ground truth image dimension is 256x256x1. I removed the one-hot and reverse one-hot encoding function calls because I have only one class, and because of that the output image is always one-hot encoded. But I see from the beginning that the loss is stuck at 0 and never increases.

I didn't choose the class-balancing loss function option. I tried changing the loss function to IoU. In this case the loss is not 0 anymore, but the loss (negative IoU) value becomes less than -1, which means something is still wrong. I tried increasing the learning rate with no effect. I have checked the input and ground truth images by visualizing them but didn't see any problem there.

I removed all the augmentation. Only resized the images to 256x256 dimension before providing to the model.

The prediction output is always an image having all the pixel set to 0. I have printed the predicted image values and the values are huge negative numbers. Like,

-8136019.0, -8398163.0, -5907791.0, -10888527.0,

I have edited the get_label_info() function so that the label info is not loaded from the csv. Instead I did the following,

class_names = []
label_values = []
# Single foreground class with label colour [1, 1, 1]
class_names.append('People')
label_values.append([1, 1, 1])

return class_names, label_values

The console log with batch size 6, model DeepLabV3-Res50 and no augmentation

This model has 27649921 trainable parameters
Loading the data ...

***** Begin training *****
Dataset --> /media/zayd/FUN/DL/Datasets/processed_Segmentation_dataset
Model --> DeepLabV3-Res50
Crop Height --> 256
Crop Width --> 256
Num Epochs --> 300
Batch Size --> 6
Num Classes --> 1
Data Augmentation:
	Vertical Flip --> False
	Horizontal Flip --> False
	Brightness Alteration --> None
	Rotation --> None

2018-04-23 20:33:31.648008: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.22GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-04-23 20:33:32.104951: W tensorflow/core/common_runtime/bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.21GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
[2018-04-23 20:33:33] Epoch = 0 Count = 60 Current_Loss = 0.0000 Time = 3.83
[2018-04-23 20:33:34] Epoch = 0 Count = 120 Current_Loss = 0.0000 Time = 1.42
[2018-04-23 20:33:36] Epoch = 0 Count = 180 Current_Loss = 0.0000 Time = 1.38
[2018-04-23 20:33:37] Epoch = 0 Count = 240 Current_Loss = 0.0000 Time = 1.41
[2018-04-23 20:33:39] Epoch = 0 Count = 300 Current_Loss = 0.0000 Time = 1.39
[2018-04-23 20:33:40] Epoch = 0 Count = 360 Current_Loss = 0.0000 Time = 1.39
[2018-04-23 20:33:41] Epoch = 0 Count = 420 Current_Loss = 0.0000 Time = 1.37
[2018-04-23 20:33:43] Epoch = 0 Count = 480 Current_Loss = 0.0000 Time = 1.37
[2018-04-23 20:33:44] Epoch = 0 Count = 540 Current_Loss = 0.0000 Time = 1.37
[2018-04-23 20:33:45] Epoch = 0 Count = 600 Current_Loss = 0.0000 Time = 1.36
[2018-04-23 20:33:47] Epoch = 0 Count = 660 Current_Loss = 0.0000 Time = 1.37
[2018-04-23 20:33:48] Epoch = 0 Count = 720 Current_Loss = 0.0000 Time = 1.37
[2018-04-23 20:33:50] Epoch = 0 Count = 780 Current_Loss = 0.0000 Time = 1.37
[2018-04-23 20:33:51] Epoch = 0 Count = 840 Current_Loss = 0.0000 Time = 1.46
[2018-04-23 20:33:52] Epoch = 0 Count = 900 Current_Loss = 0.0000 Time = 1.40
[2018-04-23 20:33:54] Epoch = 0 Count = 960 Current_Loss = 0.0000 Time = 1.39
[2018-04-23 20:33:55] Epoch = 0 Count = 1020 Current_Loss = 0.0000 Time = 1.37
[2018-04-23 20:33:57] Epoch = 0 Count = 1080 Current_Loss = 0.0000 Time = 1.38
[2018-04-23 20:33:58] Epoch = 0 Count = 1140 Current_Loss = 0.0000 Time = 1.41
[2018-04-23 20:33:59] Epoch = 0 Count = 1200 Current_Loss = 0.0000 Time = 1.38
[2018-04-23 20:34:01] Epoch = 0 Count = 1260 Current_Loss = 0.0000 Time = 1.42
[2018-04-23 20:34:02] Epoch = 0 Count = 1320 Current_Loss = 0.0000 Time = 1.39
[2018-04-23 20:34:04] Epoch = 0 Count = 1380 Current_Loss = 0.0000 Time = 1.40
[2018-04-23 20:34:05] Epoch = 0 Count = 1440 Current_Loss = 0.0000 Time = 1.41
[2018-04-23 20:34:06] Epoch = 0 Count = 1500 Current_Loss = 0.0000 Time = 1.44
[2018-04-23 20:34:08] Epoch = 0 Count = 1560 Current_Loss = 0.0000 Time = 1.44
[2018-04-23 20:34:09] Epoch = 0 Count = 1620 Current_Loss = 0.0000 Time = 1.38
[2018-04-23 20:34:11] Epoch = 0 Count = 1680 Current_Loss = 0.0000 Time = 1.38
[2018-04-23 20:34:12] Epoch = 0 Count = 1740 Current_Loss = 0.0000 Time = 1.38
[2018-04-23 20:34:13] Epoch = 0 Count = 1800 Current_Loss = 0.0000 Time = 1.37
[2018-04-23 20:34:15] Epoch = 0 Count = 1860 Current_Loss = 0.0000 Time = 1.38
/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1137: UndefinedMetricWarning: Recall is ill-defined and being set to 0.0 in labels with no true samples.
  'recall', 'true', average, warn_for)
/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/usr/local/lib/python3.5/dist-packages/sklearn/metrics/classification.py:1137: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no true samples.
  'recall', 'true', average, warn_for)

Average validation accuracy for epoch # 0000 = 0.000000
Average per class validation accuracies for epoch # 0000:
People = 0.000000
Validation precision =  0.0
Validation recall =  0.0
Validation F1 score =  0.0
Validation IoU score =  0.0

Here are links to my main.py, helpers.py and utils.py files. I changed utils.py only to add debug logs. I've been stuck on this for a few days now.

Pre-trained Tiramisu model

Hi,

Thanks for open-sourcing your work. I was wondering if you have a pre-trained weights for the Tiramisu network?

Thanks

RefineNet model problem

Hi George,
In the MultiResolutionFusion function of RefineNet, should we upsample the rcu_low rather than the rcu_high, according to the paper?

Fine-tuning

Hello,

How can I fine-tune a model on a custom dataset, possibly with a different number of classes, using pre-trained weights?

Second, which annotation style should we follow? The annotation for each dataset is different; for example, VOC uses white don't-care borders (= 255), which others don't.

TypeError: Expected binary or unicode string, got <built-in function input>

Hi, I'm trying to run main.py in an environment with the installed requirements, but I think the requirements need version pins. I believe this is caused by me using a different TensorFlow version.

tensorflow==1.7.0
tensorflow-gpu==1.2.1

Traceback (most recent call last):
File "/home/alansalinas/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 518, in make_tensor_proto
str_values = [compat.as_bytes(x) for x in proto_values]
File "/home/alansalinas/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 518, in
str_values = [compat.as_bytes(x) for x in proto_values]
File "/home/alansalinas/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/util/compat.py", line 68, in as_bytes
(bytes_or_text,))
TypeError: Expected binary or unicode string, got <built-in function input>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 152, in
network = build_fc_densenet(input, preset_model = args.model, num_classes=num_classes)
File "models/FC_DenseNet_Tiramisu.py", line 112, in build_fc_densenet
stack = slim.conv2d(inputs, n_filters_first_conv, [3, 3], scope='first_conv', activation_fn=None)
File "/home/alansalinas/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
return func(*args, **current_args)
File "/home/alansalinas/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1015, in convolution
inputs = ops.convert_to_tensor(inputs)
File "/home/alansalinas/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 950, in convert_to_tensor
as_ref=False)
File "/home/alansalinas/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1040, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/alansalinas/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 235, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/home/alansalinas/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 214, in constant
value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/home/alansalinas/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 522, in make_tensor_proto
"supported type." % (type(values), values))
TypeError: Failed to convert object of type <class 'builtin_function_or_method'> to Tensor. Contents: . Consider casting elements to a supported type.

training blocked by reading images

Hello GeorgeSeif, I'm training on a large amount of data using GPUs, and found the feeding process too slow: it costs me about 0.5 s per image, nearly comparable to the network learning time. I'd recommend you use a FIFO queue to avoid blocking on feeding data into TF; this would make this repo even more excellent.
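
For readers hitting the same bottleneck, a hedged sketch of one way to get that effect with the tf.data API (available from TensorFlow 1.4; the path lists and decode logic here are illustrative assumptions, not this repo's code):

import tensorflow as tf

def make_dataset(image_paths, label_paths, batch_size):
    ds = tf.data.Dataset.from_tensor_slices((image_paths, label_paths))

    def load_pair(img_path, lbl_path):
        # Decode on CPU threads while the GPU is busy training
        img = tf.image.decode_png(tf.read_file(img_path), channels=3)
        lbl = tf.image.decode_png(tf.read_file(lbl_path), channels=3)
        return img, lbl

    ds = ds.map(load_pair, num_parallel_calls=4)
    ds = ds.shuffle(256).batch(batch_size)
    return ds.prefetch(1)  # keep one batch ready ahead of each training step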

Checkpoints for the models we run

Hi George,

Thanks a lot for the great repo you created. I find it very useful.

One suggestion is to add control over the number of checkpoints that are actually kept when we run our own models.

Namely, I ran 300 epochs just to test and ended up running out of disk space, since more than 80 GB of checkpoints were created. We would probably be fine keeping one for every N = 10 or 100 epochs.
What do you think?
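
For what it's worth, TensorFlow's Saver can already cap this via its max_to_keep argument; a sketch of how that could be wired in (not the repo's current code):

import tensorflow as tf

# Keep only the five most recent checkpoints on disk
saver = tf.train.Saver(max_to_keep=5)
# ...then inside the training loop, once per checkpoint_step epochs:
# saver.save(sess, "checkpoints/model.ckpt", global_step=epoch)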

CamVid example fails with _csv.Error while opening the class_dict.csv file

The full error I have got is below:

"Semantic-Segmentation-Suite/helpers.py", line 29, in get_label_info
header = next(file_reader)
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

So I have changed the corresponding line in the helpers.py from:

with open(csv_path, 'rb') as csvfile:

to:

with open(csv_path, 'r') as csvfile:

to make it work.

Not sure if others would experience the same problem, but maybe helpful if they do.

GPU required

Hello, I ran your interesting repository on my computer, but after some minutes I get this error:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1,168960,2160] and ...

Do you think it is because my computer has only 4 GB of RAM and no GPU?

Can I change the default settings in main.py to run your code on my computer, or do I need to run it on a cloud instance (AWS) with a GPU?

Thanks a lot

Small bug report: iterator should return strings, not bytes

@GeorgeSeif
I enjoy your code. python main.py threw the following error:
File "./helpers.py", line 26, in get_class_dict
header = next(file_reader)
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

It seems like a small typo.
https://stackoverflow.com/questions/8515053/csv-error-iterator-should-return-strings-not-bytes
I changed the line 24 of helpers.py and it worked, so just FYI.
FROM with open(csv_path, 'rb') as csvfile:
TO: with open(csv_path, 'r') as csvfile:

Suggestion: data augmentation

Hi!

It's a great suite of capabilities for training segmentation models! My suggestion would be to add more data augmentation, because that always makes models better and more generic; for example, all the things that Keras has (https://keras.io/preprocessing/image/): rotation, zooming in on certain parts of the image, etc.

pre-training

Is it possible to train on data with 3 output classes as fine-tuning, after first training on data with 20 output classes? (I want to train on my own data after training on VOC.)
I do not know how to do the pre-training.

Thank you for quite useful code.

PSPNet model problem

Hi, George! I'm coming again!
In PSPNet, the paper uses a pretrained ResNet model with the dilated network strategy: with dilated convolutions, the final feature map size is 1/8 of the input image. But in your PSPNet code, you directly use "end_points['pool3']" as the input of the PyramidPoolingModule. Although that feature map's size is also 1/8, the receptive field decreases, so the result is not as good. Using dilated convolutions to obtain the 1/8 feature map is better!

Bug in DeeplabV3+ implementation

Hi,

I think there is a bug in the DeeplabV3+ implementation. On lines 110-111 the conv2d functions take encoder_features as an argument, but they should take net instead. Currently this has the effect of ignoring the AtrousSpatialPyramidPoolingModule in the output.

What do you think?

rotation & scale not randomized

It seems to me both rotation and zoom are meant to control the range of the corresponding augmentation parameters, but the current code applies them as fixed values:

if args.rotation:
    angle = args.rotation
else:
    angle = 0.0
if args.zoom:
    scale = args.zoom
else:
    scale = 1.0
if args.rotation or args.zoom:
    M = cv2.getRotationMatrix2D((input_image.shape[1]//2, input_image.shape[0]//2), angle, scale)
    input_image = cv2.warpAffine(input_image, M, (input_image.shape[1], input_image.shape[0]))
    output_image = cv2.warpAffine(output_image, M, (output_image.shape[1], output_image.shape[0]))
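
If the intent is indeed a random range, a hedged sketch of one possible fix (my reading of the intended behaviour, not code from this repo; args, input_image and output_image come from the snippet above):

import random
import cv2

if args.rotation:
    # Draw a random angle from [-max_angle, +max_angle] for each image
    angle = random.uniform(-args.rotation, args.rotation)
else:
    angle = 0.0
if args.zoom:
    # Draw a random scale from [1/max_zoom, max_zoom]
    scale = random.uniform(1.0 / args.zoom, args.zoom)
else:
    scale = 1.0
if args.rotation or args.zoom:
    M = cv2.getRotationMatrix2D((input_image.shape[1]//2, input_image.shape[0]//2), angle, scale)
    input_image = cv2.warpAffine(input_image, M, (input_image.shape[1], input_image.shape[0]))
    output_image = cv2.warpAffine(output_image, M, (output_image.shape[1], output_image.shape[0]))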

training time

How long does it take to train this model?
When I run this code, I get the following messages and the training process is extremely slow. Why?

The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-12-30 02:23:19.780078: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
This model has 9270812 trainable parameters
***** Begin training *****
[2017-12-30 02:31:59] Epoch = 0 Count = 20 Current = 2.52 Time = 25.51
[2017-12-30 02:40:30] Epoch = 0 Count = 40 Current = 2.36 Time = 25.65
[2017-12-30 02:49:02] Epoch = 0 Count = 60 Current = 2.20 Time = 25.60
[2017-12-30 02:57:29] Epoch = 0 Count = 80 Current = 2.02 Time = 25.34
[2017-12-30 03:05:56] Epoch = 0 Count = 100 Current = 1.81 Time = 25.45
[2017-12-30 03:14:23] Epoch = 0 Count = 120 Current = 1.42 Time = 25.31
[2017-12-30 03:22:50] Epoch = 0 Count = 140 Current = 1.61 Time = 25.34
[2017-12-30 03:31:18] Epoch = 0 Count = 160 Current = 1.49 Time = 25.29
[2017-12-30 03:39:44] Epoch = 0 Count = 180 Current = 1.65 Time = 25.30
[2017-12-30 03:48:13] Epoch = 0 Count = 200 Current = 1.42 Time = 25.48
[2017-12-30 03:56:43] Epoch = 0 Count = 220 Current = 1.50 Time = 25.55
[2017-12-30 04:05:14] Epoch = 0 Count = 240 Current = 1.60 Time = 25.81
[2017-12-30 04:13:48] Epoch = 0 Count = 260 Current = 1.46 Time = 25.56
[2017-12-30 04:22:18] Epoch = 0 Count = 280 Current = 1.89 Time = 25.88
[2017-12-30 04:30:49] Epoch = 0 Count = 300 Current = 1.46 Time = 25.55
[2017-12-30 04:39:22] Epoch = 0 Count = 320 Current = 1.32 Time = 25.33

train Error FRRN-A

when I try to train the FRRN-A model with the command "python main.py --mode train --dataset CamVid --crop_height 720 --crop_width 960 --batch_size 5 --num_val_images 10 --model FRRN-A" I get this error:
/home/cedriq/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Preparing the model ...
WARNING:tensorflow:From main.py:167: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.

This model has 17741264 trainable parameters
Loading the data ...

***** Begin training *****
Dataset --> CamVid
Model --> FRRN-A
Crop Height --> 720
Crop Width --> 960
Num Epochs --> 300
Batch Size --> 5
Num Classes --> 32
Data Augmentation:
Vertical Flip --> False
Horizontal Flip --> False
Brightness Alteration --> None
Rotation --> None

Traceback (most recent call last):
File "main.py", line 254, in
output_image = np.float32(helpers.one_hot_it(label=output_image, class_dict=class_dict))
File "/home/cedriq/Bureau/prediction/Semantic-Segmentation-Suite/helpers.py", line 66, in one_hot_it
for colour in unique_labels:
TypeError: iteration over a 0-d array

How can I fix it?

How do I use input data with shape (width, height, 1)?

Thanks for your code! Your code is working well on RGB datasets.

I'm trying to use my own grayscale image dataset.
I changed the code for the input placeholder and imread as below; however, I get an error.
[changed code in main.py]
input = tf.placeholder(tf.float32, shape=[None, None, None, 1])
input_image = cv2.imread(train_input_names[id], cv2.IMREAD_GRAYSCALE)
[change code of utils.py]
line 60 return image[y:y + crop_height, x:x + crop_width], label[y:y + crop_height, x:x + crop_width]
[error msg.]
File "main.py", line 277, in <module>
_, current = sess.run([opt, loss], feed_dict={input: input_image_batch, output: output_image_batch})
ValueError: Cannot feed value of shape (1, 640, 512) for Tensor 'Placeholder:0', which has shape '(?, ?, ?, 1)'

How do I use input data with shape (width, height, 1) like gray image?
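
The mismatch is that cv2.imread with IMREAD_GRAYSCALE returns a (height, width) array with no channel axis, while the placeholder expects (?, ?, ?, 1). A hedged sketch of one possible fix (my suggestion, not from the original thread):

import numpy as np
import cv2

input_image = cv2.imread(train_input_names[id], cv2.IMREAD_GRAYSCALE)
# Add the trailing channel axis so the batched input becomes (1, H, W, 1)
input_image = np.expand_dims(input_image, axis=-1)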

Poor performance of DeepLab V3+

Hi, George
I just tried the DeepLab V3+ model you added recently. However, it didn't perform as well as I expected. Specifically, I tried DeepLab V3 and V3+ on my own dataset of 5 classes with 100 epochs. DeepLab V3 performs much better than V3+ regarding validation IoU: 53.42% vs 33.87%. Weird, any idea?

Class balancing

Hi.

It seems that the networks basically ignore minority classes (e.g. road signs, lamp posts, etc.). Have you considered adding some class-balancing scheme, or have you not found this to be an issue? A sketch of one option follows.
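
For reference, a sketch of one common scheme, median frequency balancing, as it might be bolted onto the existing softmax loss (logits, labels and the per-class pixel counts are assumed inputs, not values from this repo):

import numpy as np
import tensorflow as tf

def median_frequency_weights(pixel_counts):
    # pixel_counts[c] = total number of pixels of class c in the training set
    freq = pixel_counts.astype(np.float64) / pixel_counts.sum()
    # Rare classes get weights above 1, frequent classes below 1
    return np.median(freq) / freq

# Assuming `logits` and one-hot `labels` of shape [batch, height, width, num_classes],
# and `counts` holding per-class pixel totals computed offline:
weights = tf.constant(median_frequency_weights(counts), dtype=tf.float32)
per_pixel_loss = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels)
per_pixel_weight = tf.reduce_sum(labels * weights, axis=-1)
loss = tf.reduce_mean(per_pixel_loss * per_pixel_weight)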

slow data loading

Data loading is not properly optimized, and the GPU is idle most of the time waiting for data. I found that the bottleneck is one_hot_it. It is hopelessly slow because of the pixel-level operation in Python, each step involving a table lookup.

On-the-fly color-to-class mapping is a very inefficient design. Such mapping should be done offline. At least an option should be provided to allow directly loading label images.

In my case it's a binary classification problem, and after I fixed the one-hot encoding with a hack I got a 6x speedup; my GPU is now above 50% busy when training.
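
For reference, a sketch of the vectorized approach such a hack might take (NumPy broadcasting; not the repo's actual code):

import numpy as np

def one_hot_fast(label, label_values):
    # label: (H, W, 3) RGB label image; label_values: list of [r, g, b] per class
    colours = np.array(label_values)                                  # (C, 3)
    # Compare every pixel against every class colour in one broadcast
    equality = label[:, :, None, :] == colours[None, None, :, :]     # (H, W, C, 3)
    return np.all(equality, axis=-1).astype(np.float32)              # (H, W, C)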

how to make label for myself dataset

Hello! I have a question for you: how do I make a colored label? In other words, how do I turn the original black (grayscale) labels in the dataset into colored labels? Thank you!

Should we including the unlabeled class?

In your released code, you considered the unlabeled class and set the number of outputs to 12, rather than 11 (used by the paper's authors). And you got a lower global accuracy than the paper's authors, but a higher class-average accuracy. This confused me. Should I consider the unlabeled class when conducting my experiments? Specifically, I'm using the FC-DenseNet to conduct building extraction, which means I only want to classify out the building class. Which way should I choose? Thanks for your patience.

Command line arguments for crop_height and crop_width creates exception

If I run python main.py with default parameters, everything works fine.

However if I do this
python main.py --num_epochs 3 --crop_height 150 --crop_width 150
or any other value, I get an exception. The exception is copied below. The problem occurs here: ConcatOp : Dimensions of inputs should match: shape[0] = [1,8,8,48] vs. shape[1] = [1,9,9,288]

The values in the exception for the shape mismatch are different depending on the input.

e.g. for python main.py --num_epochs 3 --crop_height 100 --crop_width 120:
ConcatOp : Dimensions of inputs should match: shape[0] = [1,6,6,48] vs. shape[1] = [1,6,7,288]

I was hoping to use it for images of different dimensions than the ones in the CamVid dataset. (A likely cause: the crop height and width must be divisible by 2 to the power of the number of pooling stages, so the upsampled feature maps line up with the skip connections; for the FC-DenseNet models that means multiples of 32.)

The exception looks like this

Traceback (most recent call last):
  File "main.py", line 278, in <module>
    _,current=sess.run([opt,loss],feed_dict={input:input_image_batch,output:output_image_batch})
  File "/home/ubuntu/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1128, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
    options, run_metadata)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [1,8,8,48] vs. shape[1] = [1,9,9,288]
	 [[Node: FC-DenseNet56/transitionup6/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FC-DenseNet56/transitionup6/Conv2d_transpose/BiasAdd, FC-DenseNet56/denseblock5/concat_3, gradients/softmax_cross_entropy_with_logits_sg_grad/ExpandDims/dim)]]
	 [[Node: Mean/_15 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_7316_Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op u'FC-DenseNet56/transitionup6/concat', defined at:
  File "main.py", line 147, in <module>
    network = build_fc_densenet(input, preset_model = args.model, num_classes=num_classes)
  File "models/FC_DenseNet_Tiramisu.py", line 150, in build_fc_densenet
    stack = TransitionUp(block_to_upsample, skip_connection_list[i], n_filters_keep, scope='transitionup%d' % (n_pool + i + 1))
  File "models/FC_DenseNet_Tiramisu.py", line 63, in TransitionUp
    l = tf.concat([l, skip_connection], axis=-1)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1130, in concat
    return gen_array_ops._concat_v2(values=values, axis=axis, name=name)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 700, in _concat_v2
    "ConcatV2", values=values, axis=axis, name=name)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,8,8,48] vs. shape[1] = [1,9,9,288]
	 [[Node: FC-DenseNet56/transitionup6/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FC-DenseNet56/transitionup6/Conv2d_transpose/BiasAdd, FC-DenseNet56/denseblock5/concat_3, gradients/softmax_cross_entropy_with_logits_sg_grad/ExpandDims/dim)]]
	 [[Node: Mean/_15 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_7316_Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]] 

IndexError: too many indices for array

@GeorgeSeif Thanks for your code! I use my own dataset and don't change any other hyperparameters. While training, I get the following error.

[2018-01-08 14:49:35] Epoch = 0 Count = 640 Current = 0.44 Time = 1.12
Traceback (most recent call last):
File "main.py", line 155, in
input_image, output_image = utils.random_crop(input_image, output_image, args.crop_height, args.crop_width)
File "/home/linxp/zhangch/fc-DenseNet/utils.py", line 61, in random_crop
return image[y:y+crop_height, x:x+crop_width, :], label[y:y+crop_height, x:x+crop_width]
IndexError: too many indices for array

I can't figure out the reason. There is no problem with the CamVid dataset, whose training set has only 360 samples. However, my training set has about 1500 samples for 2 classes. It fails at about 600 steps with the above error. Could you give some suggestions?

error: get_pretrained_checkpoints.py

@GeorgeSeif
get_pretrained_checkpoints.py returned the following error. Worked fine for vgg and inception_v2-v4.
(python 3.5.4).

Any suggestion? Thank you!!


mv: cannot stat 'inception_resnet_v2.ckpt': No such file or directory
Traceback (most recent call last):
File "get_pretrained_checkpoints.py", line 41, in
subprocess.check_output(['mv', 'inception_resnet_v2.ckpt', 'weights'])
File "/home/me/anaconda3/envs/tensorflow/lib/python3.5/subprocess.py", line 316, in check_output
**kwargs).stdout
File "/home/me/anaconda3/envs/tensorflow/lib/python3.5/subprocess.py", line 398, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['mv', 'inception_resnet_v2.ckpt', 'weights']' returned non-zero exit status 1

Training FC-DenseNet103 with my data

Hi everyone. I have labeled some JPEG images using Paint, and I would like to train the FC-DenseNet103 model with them, but I get the following error.

Traceback (most recent call last):
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1361, in _do_call
return fn(*args)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
target_list, status, run_metadata)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: All dimensions except 3 must match. Input 1 has shape [1 13 13 656] and doesn't match input 0 with shape [1 12 12 240].
[[Node: gradients/FC-DenseNet103/transitionup6/concat_grad/ConcatOffset = ConcatOffset[N=2, _device="/job:localhost/replica:0/task:0/device:CPU:0"](gradients/FC-DenseNet103/denseblock1/concat_grad/mod, gradients/FC-DenseNet103/transitionup6/concat_grad/ShapeN, gradients/FC-DenseNet103/transitionup6/concat_grad/ShapeN:1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 276, in
_,current=sess.run([opt,loss],feed_dict={input:input_image_batch,output:output_image_batch})
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 905, in run
run_metadata_ptr)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1137, in _run
feed_dict_tensor, options, run_metadata)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1355, in _do_run
options, run_metadata)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1374, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: All dimensions except 3 must match. Input 1 has shape [1 13 13 656] and doesn't match input 0 with shape [1 12 12 240].
[[Node: gradients/FC-DenseNet103/transitionup6/concat_grad/ConcatOffset = ConcatOffset[N=2, _device="/job:localhost/replica:0/task:0/device:CPU:0"](gradients/FC-DenseNet103/denseblock1/concat_grad/mod, gradients/FC-DenseNet103/transitionup6/concat_grad/ShapeN, gradients/FC-DenseNet103/transitionup6/concat_grad/ShapeN:1)]]

Caused by op 'gradients/FC-DenseNet103/transitionup6/concat_grad/ConcatOffset', defined at:
File "main.py", line 169, in
opt = tf.train.AdamOptimizer(0.0001).minimize(loss, var_list=[var for var in tf.trainable_variables()])
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 359, in minimize
grad_loss=grad_loss)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 460, in compute_gradients
colocate_gradients_with_ops=colocate_gradients_with_ops)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 611, in gradients
lambda: grad_fn(op, *out_grads))
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 377, in _MaybeCompile
return grad_fn() # Exit early
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 611, in
lambda: grad_fn(op, *out_grads))
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_grad.py", line 212, in _ConcatGradV2
op, grad, start_value_index=0, end_value_index=-1, dim_index=-1)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_grad.py", line 137, in _ConcatGradHelper
offset = gen_array_ops._concat_offset(non_neg_concat_dim, sizes)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 586, in _concat_offset
"ConcatOffset", concat_dim=concat_dim, shape=shape, name=name)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
op_def=op_def)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1650, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

...which was originally created as op 'FC-DenseNet103/transitionup6/concat', defined at:
File "main.py", line 139, in
network = build_fc_densenet(input, preset_model = args.model, num_classes=num_classes)
File "models/FC_DenseNet_Tiramisu.py", line 150, in build_fc_densenet
stack = TransitionUp(block_to_upsample, skip_connection_list[i], n_filters_keep, scope='transitionup%d' % (n_pool + i + 1))
File "models/FC_DenseNet_Tiramisu.py", line 63, in TransitionUp
l = tf.concat([l, skip_connection], axis=-1)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1175, in concat
return gen_array_ops._concat_v2(values=values, axis=axis, name=name)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 625, in _concat_v2
"ConcatV2", values=values, axis=axis, name=name)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
op_def=op_def)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1650, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): All dimensions except 3 must match. Input 1 has shape [1 13 13 656] and doesn't match input 0 with shape [1 12 12 240].
[[Node: gradients/FC-DenseNet103/transitionup6/concat_grad/ConcatOffset = ConcatOffset[N=2, _device="/job:localhost/replica:0/task:0/device:CPU:0"](gradients/FC-DenseNet103/denseblock1/concat_grad/mod, gradients/FC-DenseNet103/transitionup6/concat_grad/ShapeN, gradients/FC-DenseNet103/transitionup6/concat_grad/ShapeN:1)]]

Attached is an example of my labeled images:

img_48_sleep_space_good
img_48_sleep_space_good_l

CamVid database labels

Thanks for your code! The CamVid dataset's labels are color (RGB) images, but your train_labels, val_labels and test_labels are grayscale images. Can you tell me why you converted these annotated color images into grayscale images, and how to convert them?
This is your train_labels:
0001tp_006690
This is CamVid' label:
0001tp_006690_l

train problem

Hi, I would like to train RefineNet-Res101 using the default CamVid dataset contained in the project, but after running the command "python main.py --mode train --dataset CamVid --crop_height 720 --crop_width 960 --batch_size 5 --num_val_images 10 --model RefineNet-Res101" I get this error:

/home/cedriq/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Preparing the model ...
WARNING:tensorflow:From main.py:167: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.

This model has 83330464 trainable parameters
2018-03-28 11:45:00.846569: W tensorflow/core/framework/op_kernel.cc:1202] OP_REQUIRES failed at save_restore_tensor.cc:170 : Not found: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for models/resnet_v2_101.ckpt
Traceback (most recent call last):
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1361, in _do_call
return fn(*args)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
target_list, status, run_metadata)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for models/resnet_v2_101.ckpt
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 179, in
init_fn(sess)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 690, in callback
saver.restore(session, model_path)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1755, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 905, in run
run_metadata_ptr)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1137, in _run
feed_dict_tensor, options, run_metadata)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1355, in _do_run
options, run_metadata)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1374, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for models/resnet_v2_101.ckpt
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Caused by op 'save/RestoreV2', defined at:
File "main.py", line 142, in
network, init_fn = build_refinenet(input, preset_model = args.model, num_classes=num_classes)
File "models/RefineNet.py", line 167, in build_refinenet
init_fn = slim.assign_from_checkpoint_fn(os.path.join(pretrained_dir, 'resnet_v2_101.ckpt'), slim.get_model_variables('resnet_v2_101'))
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 688, in assign_from_checkpoint_fn
saver = tf_saver.Saver(var_list, reshape=reshape_variables)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1293, in init
self.build()
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1302, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1339, in _build
build_save=build_save, build_restore=build_restore)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 796, in _build_internal
restore_sequentially, reshape)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 449, in _AddRestoreOps
restore_sequentially)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 847, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1030, in restore_v2
shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
op_def=op_def)
File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1650, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for models/resnet_v2_101.ckpt
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

I'm new to Python and deep learning. What could the problem be? (A likely cause, given the README's note: the pre-trained ResNet weights were never downloaded; running the provided get_pretrained_checkpoints.py script first should fetch resnet_v2_101.ckpt.) Excuse my bad English; I'm a French speaker.
