pudae / tensorflow-densenet Goto Github PK

View Code? Open in Web Editor NEW

168.0 168.0 59.0 46 KB

Tensorflow-DenseNet with ImageNet Pretrained Models

License: Apache License 2.0

Python 100.00%

tensorflow-densenet's People

Contributors

Stargazers

Watchers

Forkers

zumbalamambo sweaterr mlsdd dyz-zju martin-prillard ustcpcs aaasss0636 berlingrad cg1507 alexliyang armstrongyang guojiapeng00 fendaq insikk summer-bunny manutdzou youngseok0001 jpchen2012 w1368027790 lz1313 chasonlee mj-moose zhcm hurricanedjp dmonn xiaofeiwang2018 jdegange kaaokou huadongtang davis-love-ai ddzhen metwalli 2017e8020161116 liuqiaoping7 bodhiswatta atztao jinhwanhan jaewshin chenfengldw jayceyxc illool pro-flynn kartikmehta09 yongqis amirunpri2018 dongxusong ronden berooo gbc8181 chenping5121 bhupathyap xuetuo yichaojin suyouooooo qyxqyx 3229018240 hwnkxiaolei yuedy liujuanlt

tensorflow-densenet's Issues

What is DenseNet 161?

Thanks for making this available!

I couldn't figure out the naming scheme of DenseNet in the paper, and would like to ask a quick question here. In table 1 of the paper, there are "DenseNet-121", "DenseNet-169", "DenseNet-201" and "DenseNet-264", but didn't mention "DenseNet-161", which you provided in the pre-trained model. The paper did mention "our largest model (DenseNet-161) ...", but I couldn't find more details about it. Could you provide some pointer or more details about that please? Thanks!

Cann't Fine-tune using the image_net model checkpoint.

Hi, pudae:

When I try to fine-tune in my own dataset use tf_densenet121 imagenet model, it gives the following error:

2017-11-09 11:42:15.694613: W tensorflow/core/framework/op_kernel.cc:1158] Not found: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./model/tf-densenet121/tf_densenet121.ckpt

is there problem with the pre-trained model?

Cant run/train on mnist data

I understand that MNIST is a grayscale image so how to make this code work with GrayScale images, it works fine with RGB images

$python train_image_classifier.py --train_dir=${TRAIN_DIR} --dataset_name=mnist --dataset_split_name=train --dataset_dir=${DATASET_DIR} --model_name=densenet121

normal output:
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [1,1,1024,10] rhs shape= [1,1,1024,5]
[[Node: save/Assign_1299 = Assign[T=DT_FLOAT, _class=["loc:@densenet121/logits/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](densenet121/logits/weights/RMSProp, save/RestoreV2/_7)]]

after editing preprocessing output:
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [7,7,1,64] rhs shape= [7,7,3,64]
[[Node: save/Assign_8 = Assign[T=DT_FLOAT, _class=["loc:@densenet121/conv1/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](densenet121/conv1/weights, save/RestoreV2/_2667)]]

INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, assertion failed: [Unable to decode bytes as JPEG, PNG, GIF, or BMP] [[Node: case/If_0/decode_image/cond_jpeg/cond_png/cond_gif

Hi all,
I use Python 2.7.13 and Tensorflow 1.3.0 on CPU.

I want to use DensNet( https://github.com/pudae/tensorflow-densenet ) for regression problem. My data contains 60000 jpeg images with 37 float labels for each image.
I saved my data into tfrecords files by:

`
def Read_Labels(label_path):
labels_csv = pd.read_csv(label_path)
labels = np.array(labels_csv)
return labels[:,1:]

def load_image(addr):
# read an image and resize to (224, 224)
img = cv2.imread(addr)
img = cv2.resize(img, (224, 224), interpolation=cv2.INTER_CUBIC)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = img.astype(np.float32)
return img

def Shuffle_images_with_labels(shuffle_data, photo_filenames, labels):
if shuffle_data:
c = list(zip(photo_filenames, labels))
shuffle(c)
addrs, labels = zip(*c)
return addrs, labels

def image_to_tfexample_mine(image_data, image_format, height, width, label):
return tf.train.Example(features=tf.train.Features(feature={
'image/encoded': bytes_feature(image_data),
'image/format': bytes_feature(image_format),
'image/class/label': _float_feature(label),
'image/height': int64_feature(height),
'image/width': int64_feature(width),
}))

def _convert_dataset(split_name, filenames, labels, dataset_dir):
assert split_name in ['train', 'validation']

num_per_shard = int(math.ceil(len(filenames) / float(_NUM_SHARDS)))

with tf.Graph().as_default():

    for shard_id in range(_NUM_SHARDS):
      output_filename = _get_dataset_filename(dataset_path, split_name, shard_id)
     
      with tf.python_io.TFRecordWriter(output_filename) as tfrecord_writer:
          start_ndx = shard_id * num_per_shard
          end_ndx = min((shard_id+1) * num_per_shard, len(filenames))
          for i in range(start_ndx, end_ndx):
              sys.stdout.write('\r>> Converting image %d/%d shard %d' % (
                      i+1, len(filenames), shard_id))
              sys.stdout.flush()

              img = load_image(filenames[i])
              image_data = tf.compat.as_bytes(img.tostring())
                
              label = labels[i]
                
              example = image_to_tfexample_mine(image_data, image_format, height, width, label)
                
              # Serialize to string and write on the file
              tfrecord_writer.write(example.SerializeToString())

sys.stdout.write('\n')
sys.stdout.flush()

def run(dataset_dir):

labels = Read_Labels(dataset_dir + '/training_labels.csv')

photo_filenames = _get_filenames_and_classes(dataset_dir + '/images_training')

shuffle_data = True 

photo_filenames, labels = Shuffle_images_with_labels(
        shuffle_data,photo_filenames, labels)

training_filenames = photo_filenames[_NUM_VALIDATION:]
training_labels = labels[_NUM_VALIDATION:]

validation_filenames = photo_filenames[:_NUM_VALIDATION]
validation_labels = labels[:_NUM_VALIDATION]

_convert_dataset('train',
                 training_filenames, training_labels, dataset_path)
_convert_dataset('validation',
                 validation_filenames, validation_labels, dataset_path)

print('\nFinished converting the Flowers dataset!')`

And I decode it by:

`
with tf.Session() as sess:

feature = {
  'image/encoded': tf.FixedLenFeature((), tf.string, default_value=''),
  'image/format': tf.FixedLenFeature((), tf.string, default_value='jpeg'),
  'image/class/label': tf.FixedLenFeature(
      [37,], tf.float32, default_value=tf.zeros([37,], dtype=tf.float32)),
   }

filename_queue = tf.train.string_input_producer([data_path], num_epochs=1)

reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)

features = tf.parse_single_example(serialized_example, features=feature)

image = tf.decode_raw(features['image/encoded'], tf.float32)
print(image.get_shape())

label = tf.cast(features['image/class/label'], tf.float32)

image = tf.reshape(image, [224, 224, 3])

images, labels = tf.train.shuffle_batch([image, label], batch_size=10, capacity=30, num_threads=1, min_after_dequeue=10)

init_op = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
sess.run(init_op)

coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)

for batch_index in range(6):
    img, lbl = sess.run([images, labels])
    img = img.astype(np.uint8)
    print(img.shape)
    for j in range(6):
        plt.subplot(2, 3, j+1)
        plt.imshow(img[j, ...])
    plt.show()

coord.request_stop()

coord.join(threads)`

It's all fine up to this point. But when I use the bellow commands for decoding TFRecord files:

`
reader = tf.TFRecordReader

keys_to_features = {
'image/encoded': tf.FixedLenFeature((), tf.string, default_value=''),
'image/format': tf.FixedLenFeature((), tf.string, default_value='raw'),
'image/class/label': tf.FixedLenFeature(
[37,], tf.float32, default_value=tf.zeros([37,], dtype=tf.float32)),
}

items_to_handlers = {
'image': slim.tfexample_decoder.Image('image/encoded'),
'label': slim.tfexample_decoder.Tensor('image/class/label'),
}

decoder = slim.tfexample_decoder.TFExampleDecoder(
keys_to_features, items_to_handlers)`

I get the following error.

INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, assertion failed: [Unable to decode bytes as JPEG, PNG, GIF, or BMP]
[[Node: case/If_0/decode_image/cond_jpeg/cond_png/cond_gif/Assert_1/Assert = Assert[T=[DT_STRING], summarize=3, _device="/job:localhost/replica:0/task:0/cpu:0"](case/If_0/decode_image/cond_jpeg/cond_png/cond_gif/is_bmp, case/If_0/decode_image/cond_jpeg/cond_png/cond_gif/Assert_1/Assert/data_0)]]
INFO:tensorflow:Caught OutOfRangeError. Stopping Training.
INFO:sensorflow:Finished training! Saving model to disk.

To use Densenet for my problem, I should fix this error first.
Could anybody please help me out of this problem. This code works perfectly for the datasets like flowers, MNIST and CIFAR10 available at https://github.com/pudae/tensorflow-densenet/tree/master/datasets but does not work for my data.

Dropout rate

In densenet.py Line: 48

shouldn't it be ??
if dropout_rate:
>>>>>net = tf.nn.dropout(net,keep_prob=dropout_rate)

Missing convolution biases in pre-trained checkpoints

Using the inspect_checkpoint tool on the pretrained model, I can see that the convolutional layers miss their biases.

densenet169/dense_block4/conv_block4/x1/BatchNorm/beta (DT_FLOAT) [736]
densenet169/dense_block4/conv_block4/x1/BatchNorm/gamma (DT_FLOAT) [736]
densenet169/dense_block4/conv_block4/x1/BatchNorm/moving_mean (DT_FLOAT) [736]
densenet169/dense_block4/conv_block4/x1/BatchNorm/moving_variance (DT_FLOAT) [736]
densenet169/dense_block4/conv_block4/x1/Conv/weights (DT_FLOAT) [1,1,736,128]
densenet169/dense_block4/conv_block4/x2/BatchNorm/beta (DT_FLOAT) [128]
densenet169/dense_block4/conv_block4/x2/BatchNorm/gamma (DT_FLOAT) [128]
densenet169/dense_block4/conv_block4/x2/BatchNorm/moving_mean (DT_FLOAT) [128]
densenet169/dense_block4/conv_block4/x2/BatchNorm/moving_variance (DT_FLOAT) [128]
densenet169/dense_block4/conv_block4/x2/Conv/weights (DT_FLOAT) [3,3,128,32]

How were the Keras checkpoints converted? Did something go wrong?

Preprocess Image

Hello I'm trying to train densenet as part of Faster-RCNN architecture using pre-trained checkpoint, I would like to know if there is required preprocessing image step required to successfully run the model. Thanks!

densenet-121 pre-trained model feature map size does not match

Hi, thanks for your work. I found the DenseNet-121 on ImageNet's feature map size in
each block is [56,28,14,7], however in the pre-trained model is [55,27,13,6]

densenet121/conv1/convolution                                          [-1, 112, 112, 64]
densenet121/dense_block1/conv_block1/x1/Conv/convolution               [-1, 55, 55, 128]
densenet121/dense_block1/conv_block1/x2/Conv/convolution               [-1, 55, 55, 32]
densenet121/dense_block1/conv_block2/x1/Conv/convolution               [-1, 55, 55, 128]
densenet121/dense_block1/conv_block2/x2/Conv/convolution               [-1, 55, 55, 32]
densenet121/dense_block1/conv_block3/x1/Conv/convolution               [-1, 55, 55, 128]
densenet121/dense_block1/conv_block3/x2/Conv/convolution               [-1, 55, 55, 32]
densenet121/dense_block1/conv_block4/x1/Conv/convolution               [-1, 55, 55, 128]
densenet121/dense_block1/conv_block4/x2/Conv/convolution               [-1, 55, 55, 32]
densenet121/dense_block1/conv_block5/x1/Conv/convolution               [-1, 55, 55, 128]
densenet121/dense_block1/conv_block5/x2/Conv/convolution               [-1, 55, 55, 32]
densenet121/dense_block1/conv_block6/x1/Conv/convolution               [-1, 55, 55, 128]
densenet121/dense_block1/conv_block6/x2/Conv/convolution               [-1, 55, 55, 32]
densenet121/transition_block1/blk/Conv/convolution                     [-1, 55, 55, 128]
densenet121/transition_block1/AvgPool2D/AvgPool                        [-1, 27, 27, 128]
densenet121/dense_block2/conv_block1/x1/Conv/convolution               [-1, 27, 27, 128]
densenet121/dense_block2/conv_block1/x2/Conv/convolution               [-1, 27, 27, 32]
densenet121/dense_block2/conv_block2/x1/Conv/convolution               [-1, 27, 27, 128]
densenet121/dense_block2/conv_block2/x2/Conv/convolution               [-1, 27, 27, 32]
densenet121/dense_block2/conv_block3/x1/Conv/convolution               [-1, 27, 27, 128]
densenet121/dense_block2/conv_block3/x2/Conv/convolution               [-1, 27, 27, 32]
densenet121/dense_block2/conv_block4/x1/Conv/convolution               [-1, 27, 27, 128]
densenet121/dense_block2/conv_block4/x2/Conv/convolution               [-1, 27, 27, 32]
densenet121/dense_block2/conv_block5/x1/Conv/convolution               [-1, 27, 27, 128]
densenet121/dense_block2/conv_block5/x2/Conv/convolution               [-1, 27, 27, 32]
densenet121/dense_block2/conv_block6/x1/Conv/convolution               [-1, 27, 27, 128]
densenet121/dense_block2/conv_block6/x2/Conv/convolution               [-1, 27, 27, 32]
densenet121/dense_block2/conv_block7/x1/Conv/convolution               [-1, 27, 27, 128]
densenet121/dense_block2/conv_block7/x2/Conv/convolution               [-1, 27, 27, 32]
densenet121/dense_block2/conv_block8/x1/Conv/convolution               [-1, 27, 27, 128]
densenet121/dense_block2/conv_block8/x2/Conv/convolution               [-1, 27, 27, 32]
densenet121/dense_block2/conv_block9/x1/Conv/convolution               [-1, 27, 27, 128]
densenet121/dense_block2/conv_block9/x2/Conv/convolution               [-1, 27, 27, 32]
densenet121/dense_block2/conv_block10/x1/Conv/convolution              [-1, 27, 27, 128]
densenet121/dense_block2/conv_block10/x2/Conv/convolution              [-1, 27, 27, 32]
densenet121/dense_block2/conv_block11/x1/Conv/convolution              [-1, 27, 27, 128]
densenet121/dense_block2/conv_block11/x2/Conv/convolution              [-1, 27, 27, 32]
densenet121/dense_block2/conv_block12/x1/Conv/convolution              [-1, 27, 27, 128]
densenet121/dense_block2/conv_block12/x2/Conv/convolution              [-1, 27, 27, 32]
densenet121/transition_block2/blk/Conv/convolution                     [-1, 27, 27, 256]
densenet121/transition_block2/AvgPool2D/AvgPool                        [-1, 13, 13, 256]
densenet121/dense_block3/conv_block1/x1/Conv/convolution               [-1, 13, 13, 128]
densenet121/dense_block3/conv_block1/x2/Conv/convolution               [-1, 13, 13, 32]
densenet121/dense_block3/conv_block2/x1/Conv/convolution               [-1, 13, 13, 128]
densenet121/dense_block3/conv_block2/x2/Conv/convolution               [-1, 13, 13, 32]
densenet121/dense_block3/conv_block3/x1/Conv/convolution               [-1, 13, 13, 128]
densenet121/dense_block3/conv_block3/x2/Conv/convolution               [-1, 13, 13, 32]
densenet121/dense_block3/conv_block4/x1/Conv/convolution               [-1, 13, 13, 128]
densenet121/dense_block3/conv_block4/x2/Conv/convolution               [-1, 13, 13, 32]
densenet121/dense_block3/conv_block5/x1/Conv/convolution               [-1, 13, 13, 128]
densenet121/dense_block3/conv_block5/x2/Conv/convolution               [-1, 13, 13, 32]
densenet121/dense_block3/conv_block6/x1/Conv/convolution               [-1, 13, 13, 128]
densenet121/dense_block3/conv_block6/x2/Conv/convolution               [-1, 13, 13, 32]
densenet121/dense_block3/conv_block7/x1/Conv/convolution               [-1, 13, 13, 128]
densenet121/dense_block3/conv_block7/x2/Conv/convolution               [-1, 13, 13, 32]
densenet121/dense_block3/conv_block8/x1/Conv/convolution               [-1, 13, 13, 128]
densenet121/dense_block3/conv_block8/x2/Conv/convolution               [-1, 13, 13, 32]
densenet121/dense_block3/conv_block9/x1/Conv/convolution               [-1, 13, 13, 128]
densenet121/dense_block3/conv_block9/x2/Conv/convolution               [-1, 13, 13, 32]
densenet121/dense_block3/conv_block10/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block10/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block11/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block11/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block12/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block12/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block13/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block13/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block14/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block14/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block15/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block15/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block16/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block16/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block17/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block17/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block18/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block18/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block19/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block19/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block20/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block20/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block21/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block21/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block22/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block22/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block23/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block23/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/dense_block3/conv_block24/x1/Conv/convolution              [-1, 13, 13, 128]
densenet121/dense_block3/conv_block24/x2/Conv/convolution              [-1, 13, 13, 32]
densenet121/transition_block3/blk/Conv/convolution                     [-1, 13, 13, 512]
densenet121/transition_block3/AvgPool2D/AvgPool                        [-1, 6, 6, 512]
densenet121/dense_block4/conv_block1/x1/Conv/convolution               [-1, 6, 6, 128]
densenet121/dense_block4/conv_block1/x2/Conv/convolution               [-1, 6, 6, 32]
densenet121/dense_block4/conv_block2/x1/Conv/convolution               [-1, 6, 6, 128]
densenet121/dense_block4/conv_block2/x2/Conv/convolution               [-1, 6, 6, 32]
densenet121/dense_block4/conv_block3/x1/Conv/convolution               [-1, 6, 6, 128]
densenet121/dense_block4/conv_block3/x2/Conv/convolution               [-1, 6, 6, 32]
densenet121/dense_block4/conv_block4/x1/Conv/convolution               [-1, 6, 6, 128]
densenet121/dense_block4/conv_block4/x2/Conv/convolution               [-1, 6, 6, 32]
densenet121/dense_block4/conv_block5/x1/Conv/convolution               [-1, 6, 6, 128]
densenet121/dense_block4/conv_block5/x2/Conv/convolution               [-1, 6, 6, 32]
densenet121/dense_block4/conv_block6/x1/Conv/convolution               [-1, 6, 6, 128]
densenet121/dense_block4/conv_block6/x2/Conv/convolution               [-1, 6, 6, 32]
densenet121/dense_block4/conv_block7/x1/Conv/convolution               [-1, 6, 6, 128]
densenet121/dense_block4/conv_block7/x2/Conv/convolution               [-1, 6, 6, 32]
densenet121/dense_block4/conv_block8/x1/Conv/convolution               [-1, 6, 6, 128]
densenet121/dense_block4/conv_block8/x2/Conv/convolution               [-1, 6, 6, 32]
densenet121/dense_block4/conv_block9/x1/Conv/convolution               [-1, 6, 6, 128]
densenet121/dense_block4/conv_block9/x2/Conv/convolution               [-1, 6, 6, 32]
densenet121/dense_block4/conv_block10/x1/Conv/convolution              [-1, 6, 6, 128]
densenet121/dense_block4/conv_block10/x2/Conv/convolution              [-1, 6, 6, 32]
densenet121/dense_block4/conv_block11/x1/Conv/convolution              [-1, 6, 6, 128]
densenet121/dense_block4/conv_block11/x2/Conv/convolution              [-1, 6, 6, 32]
densenet121/dense_block4/conv_block12/x1/Conv/convolution              [-1, 6, 6, 128]
densenet121/dense_block4/conv_block12/x2/Conv/convolution              [-1, 6, 6, 32]
densenet121/dense_block4/conv_block13/x1/Conv/convolution              [-1, 6, 6, 128]
densenet121/dense_block4/conv_block13/x2/Conv/convolution              [-1, 6, 6, 32]
densenet121/dense_block4/conv_block14/x1/Conv/convolution              [-1, 6, 6, 128]
densenet121/dense_block4/conv_block14/x2/Conv/convolution              [-1, 6, 6, 32]
densenet121/dense_block4/conv_block15/x1/Conv/convolution              [-1, 6, 6, 128]
densenet121/dense_block4/conv_block15/x2/Conv/convolution              [-1, 6, 6, 32]
densenet121/dense_block4/conv_block16/x1/Conv/convolution              [-1, 6, 6, 128]
densenet121/dense_block4/conv_block16/x2/Conv/convolution              [-1, 6, 6, 32]

do we need to change the this line to the following in densenet.py

net = slim.conv2d(net, num_filters, 7, stride=2, scope='conv1')

net = slim.conv2d(net, num_filters, 7, stride=2, scope='conv1',padding="VALID")

Looking forward to your reply

does pre-trained densenet121 lack some parameters?

Hello, I met a problem when using densenet121 pre-trained model in tensorflow using saver.restore(sess, FLAGS.checkpoint_path). The error said that can not find some parameters in checkpoints. part log is below:

2019-02-25 20:38:07.183012: W tensorflow/core/framework/op_kernel.cc:1158] Not found: Key densenet121/dense_block1/conv_block2/x2/Conv/biases not found in che ckpoint
2019-02-25 20:38:07.185169: W tensorflow/core/framework/op_kernel.cc:1158] Not found: Key densenet121/conv1/biases not found in checkpoint
2019-02-25 20:38:07.185359: W tensorflow/core/framework/op_kernel.cc:1158] Not found: Key densenet121/dense_block1/conv_block1/x1/Conv/biases not found in che ckpoint
2019-02-25 20:38:07.185769: W tensorflow/core/framework/op_kernel.cc:1158] Not found: Key densenet121/dense_block1/conv_block2/x2/Conv/biases not found in che ckpoint
[[Node: save_1/RestoreV2_23 = RestoreV2[dtypes=[DT_FLOAT], _device="/j ob:localhost/replica:0/task:0/cpu:0"](_arg_save_1/Const_0_0, save_1/RestoreV2_2 3/tensor_names, save_1/RestoreV2_23/shape_and_slices)]]
2019-02-25 20:38:07.186000: W tensorflow/core/framework/op_kernel.cc:1158] Not found: Key densenet121/dense_block1/conv_block2/x2/Conv/biases not found in che ckpoint
[[Node: save_1/RestoreV2_23 = RestoreV2[dtypes=[DT_FLOAT], _device="/j ob:localhost/replica:0/task:0/cpu:0"](_arg_save_1/Const_0_0, save_1/RestoreV2_2 3/tensor_names, save_1/RestoreV2_23/shape_and_slices)]]
2019-02-25 20:38:07.186009: W tensorflow/core/framework/op_kernel.cc:1158] Not found: Key densenet121/dense_block1/conv_block2/x2/Conv/biases not found in che ckpoint
[[Node: save_1/RestoreV2_23 = RestoreV2[dtypes=[DT_FLOAT], _device="/j ob:localhost/replica:0/task:0/cpu:0"](_arg_save_1/Const_0_0, save_1/RestoreV2_2 3/tensor_names, save_1/RestoreV2_23/shape_and_slices)]]
2019-02-25 20:38:07.186139: W tensorflow/core/framework/op_kernel.cc:1158] Not found: Key densenet121/dense_block1/conv_block2/x2/Conv/biases not found in che ckpoint

and I print pre-trained model parameters names, and actually I also can not find some parameters such as densenet121/dense_block1/conv_block2/x2/Conv/biases

How to use eval_image_classifier.py to predict my images?

I'm not very familiar with this code enough，can you help me？

BN+ReLU+Conv Work for you?

Hello pudae:

 I try densenet121 base on your code for FR project, but it seems that the Conv structure ( BN+ReLU+Conv) is not work for me.
 I modify the Conv structure to Conv+BN+ReLU, the training is ok but the accuracy is lower. 
 So I try to modify the Conv structure to BN+ReLU+Conv+BN+ReLU, it seems that the training is ok and the accuracy is better than above two structure.
 I am confused for that. Do you have any suggenstion for that? My part of densenet code is as below:

reduction=0.5
growth_rate=32
num_filters=64
compression = 1.0 - reduction
num_layers=[6,12,24,16]
num_dense_blocks = len(num_layers)

with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    weights_initializer=slim.xavier_initializer_conv2d(uniform=True),
                    weights_regularizer=slim.l2_regularizer(weight_decay),
                    activation_fn=None,
                    biases_initializer=None):

    with tf.variable_scope('densenet121', [images], reuse=reuse):
        with slim.arg_scope([slim.batch_norm], 
                            scale=True,
                            decay=0.99,
                            epsilon=1.1e-5), \
             slim.arg_scope([slim.batch_norm, slim.dropout], is_training=phase_train), \
             slim.arg_scope([_conv], dropout_rate=None):

            # initial convolution
            print ("input size: ", images.get_shape())
            net = slim.conv2d(images, num_filters, 7, stride=2, scope='conv1')
            print ("conv1 size: ", net.get_shape())
            net = slim.batch_norm(net)
            net = tf.nn.relu(net)
            net = slim.max_pool2d(net, 3, stride=2, padding='SAME')
            print ("max pool size: ", net.get_shape())

            # blocks
            for i in range(num_dense_blocks - 1):
                # dense blocks
                net, num_filters = _dense_block(net, num_layers[i], num_filters, growth_rate, scope='dense_block' + str(i+1))
                print ("dense block %d size: %s" % (i, net.get_shape()))

                # Add transition_block
                net, num_filters = _transition_block(net, num_filters, compression=compression, scope='transition_block' + str(i+1))
                print ("transition block %d size: %s" % (i, net.get_shape()))

            net, num_filters = _dense_block(
                    net, num_layers[-1], num_filters,
                    growth_rate,
                    scope='dense_block' + str(num_dense_blocks))
            print ("dense block %d size: %s" % (i+1, net.get_shape()))

            # final blocks
            with tf.variable_scope('final_block', [images]):
                net = slim.batch_norm(net)
                net = tf.nn.relu(net)
                net = tf.reduce_mean(net, [1,2], name='global_avg_pool', keep_dims=False)
                print ("global ave pooling size: %s" % (net.get_shape()))

            net = slim.batch_norm(net)
            net = tf.nn.relu(net)
            net = slim.fully_connected(net, bottleneck_layer_size, activation_fn=None,
                                                 scope='logits', reuse=False)
            print ("fully connection size: %s" % (net.get_shape()))
            return net, None

Some details about the pre-trained model

Hi, thanks for your code.
I have two confusions about the pre-trained model.

The order of input channels is RGB or BGR?
I see the dropout_rate is set to None in the code and did you use dropout layer in the pre-trained model?
Thanks again and look forward to your apply!

Pretrained Model

Hello, thanks for your great work. I need your pretrained model for my own experiment. I'll appreciate it if you can give me the access. My google mail address: [email protected]

How periodicaly evaluate the Performance of this model?

I am trying to use DensNet for regression problem with TF-Slim. My data contains 60000 jpeg images with 37 float labels for each image. I divided my data into three different tfrecords files of a train set (60%), a validation set (20%) and a test set (20%).

I need to evaluate validation set during training loop and make a plot like image.
In TF-Slim documentation they just explain train loop and evaluation loop separately. I can just evaluate validation or test set after training loop finished. While as I said I need to evaluate during training.

I tried to use slim.evaluation.evaluation_loop function instead of slim.evaluation.evaluate_once. But it doesn't help.

slim.evaluation.evaluation_loop(
    master=FLAGS.master,
    checkpoint_dir=checkpoint_path,
    logdir=FLAGS.eval_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()) + print_ops,
    variables_to_restore=variables_to_restore,
    summary_op = tf.summary.merge(summary_ops),
    eval_interval_secs = eval_interval_secs )

I tried evaluation.evaluate_repeatedly as well.

from tensorflow.contrib.training.python.training import evaluation

evaluation.evaluate_repeatedly(
    master=FLAGS.master,
    checkpoint_dir=checkpoint_path,
    eval_ops=list(names_to_updates.values()) + print_ops,
    eval_interval_secs = eval_interval_secs )

In both of these functions, they just read the latest available checkpoint from checkpoint_dir and apparently waiting for the next one, however when the new checkpoints are generated, they don't perform at all.

I use Python 2.7.13 and Tensorflow 1.3.0 on CPU.

Any help will be highly appreciated.

How to evaluate?

Can you give the hand-by-hand instruction for evaluating and testing?

data_format arg_scope

Pudae, apologies you mentioned this in another issue but I couldn't get it work.

If I create symbol like so:

densenet.densenet_arg_scope(data_format='NCHW')

I get

TypeError: densenet_arg_scope() got an unexpected keyword argument 'data_format'

If I do this:

        dense_args = densenet.densenet_arg_scope()
        dense_args['data_format'] = "NCHW"  # This doesn't work!
        print(dense_args)
        with slim.arg_scope(dense_args):
            base_model, _ = densenet.densenet121(in_tensor,
                                                 num_classes=out_features,
                                                 is_training=is_training)

Tensorflow just seems to ignore it and use NHWC

training issues

Hi, I want to ask how to determine the maximum number of training steps if fine-tune this pretrained densenet using my own dataset ?

It did not work when I loaded densenet161 pre-trained model using tf.train.latest_checkpoint

About training from scratch.

Hi! Thank you for your great code! I got a excellent performance by using your pre-trained model.
But when I trained 121-densenet on ImageNet from scratch by using your network code, I got a very pool evaluation result. So, did you train the model from scratch using your code? Thanks~~~

citation DOI request

Hi, thanks for your work, can you provide a DOI for citing your repository and DenseNet-121 pre-trained model on ImageNet? Thanks a lot

how to set the usage of gpu

In tensorflow, we could use these commends to set the usage of gpu
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.9 # 占用GPU90%的显存
session = tf.Session(config=config)
If I use the package of slim, how to set the usage of gpu?

usage link failed

Hi, thanks for your code.
TensorFlow-Slim Models' page is not found
do you have any other link?

How to get more predict info?

I am starting to learn tf and I am very much appreciate your work. I use your prediction code and run it successfully, however, the output is only the top-1 guess. How can I get top-5 guesses and the top-5 probabilities? I try to use tf.nn.top_k and tf.nn.in_top_k but I ran into problems that I cannot solve.

How to control the thread number on Tensorflow?

Hi @pudae ,

Do we have any options to control the number of threads in TF-Slim both in training and evaluation processes?

Specifically, I use this network for my classification problem. I changed the evaluation part in a way that runs train and evaluation in parallel like this code. I can run it on my own CPU without any problem. But I can't execute them on a supercomputer. It seems that it is related to the very large number of threads which are being created by Tensorflow. If the number of threads exceeds the maximum number of threads pre-set in SLURM (= 28) then the job will fail. Since it's unable to create new threads it will end up with error "resource temporarily unavailable".

This error provided when the code tries to restore parameters from checkpoints. If there is no limitation on the number of threads (like on my pc) it works fine:

INFO:tensorflow:Restoring parameters from ./model.ckpt-0
INFO:tensorflow:Starting evaluation at
I tensorflow/core/kernels/logging_ops.cc:79] eval/Accuracy[0]
I tensorflow/core/kernels/logging_ops.cc:79] eval/Recall_5[0]
INFO:tensorflow:Evaluation [1/60]

However, when there is a limitation on the number of threads (like SLURM job submission on supercomputers) we get:

INFO:tensorflow:Restoring parameters from ./model.ckpt-0
terminate called after throwing an instance of 'std::system_error'
  what():  Resource temporarily unavailable

I tried to limit the number of CPU threads used by Tensorflow to 1 by creating config like:

 FLAGS.num_preprocessing_threads=1
  config = tf.ConfigProto()
  config.intra_op_parallelism_threads = FLAGS.num_preprocessing_threads
  config.inter_op_parallelism_threads = FLAGS.num_preprocessing_threads

    slim.evaluation.evaluation_loop(
        master=FLAGS.master,
        checkpoint_path=each_ckpt,
        logdir=FLAGS.eval_dir,
        num_evals=num_batches,
        eval_op=list(names_to_updates.values()) + print_ops,
        variables_to_restore=variables_to_restore,
        session_config=config)

But unfortunately, that didn't help. In my opinion, the main problem we are having here is the fact that we are not able to control the number of threads here. Although we set it to 1 with various TF options you can actually see that this job is creating many more threads on the node:

slurm_script─┬─python───128*[{python}]
└─python───8*[{python}]

Training script is creating 128 threads and evaluation script is creating 8 (both numbers vary over time).

Any idea on the way to control the thread numbers will be highly appreciated because I do need to fix this issue urgently.
Ellie

P.S. I'm using Python 2.7.13 and Tensorflow 1.3.0.

Hello，Question about distribution

Training a model from scratch.
If i want to use 8 GPUs to train a model（a server），how to configure these flags's value?

flags:
num_clones
worker_replicas
num_ps_tasks

NotFoundError

hi,
when I restore your pre_trained model, there raise some mistake

NotFoundError (see above for trackback): Key densenet121/dense_block1/conv_block2/*1/Conv/biases not found in the checkpoint

here is my code

x=tf.placeholder("float",[None,224,224,3])
y=tf.placeholder("float",[None,num_classes])

logits, collection= densenet.densenet121(x,num_classes=num_classes)
predictions = collection['predictions']
loss=-tf.reduce_sum(y*tf.log(predictions))
optimizer=tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(loss)
accuracy=tf.reduce_mean(tf.cast(tf.equal(tf.argmax(predictions, 1), tf.argmax(y, 1)),"float"))

saver=tf.train.Saver()
with tf.Session() as sess:
saver.restore(sess,"tf-densenet121.ckpt")

How can I run "eval_image_classifier.py" on GPUs?

Hi @pudae,
Do you have any idea how can I run "eval_image_classifier.py" on GPUs? Should I change any functions or do any modifications? Or whether there exist any other specific functions for evaluation on GPUs?

I can already run "train_image_classifier.py" on GPUs because of having the associated flag for switching between CPU and GPU:

tf.app.flags.DEFINE_boolean('clone_on_cpu', False,
                        'Use CPUs to deploy clones.')

I did try to add the same line to eval_image_classifier.py, but it had no effect. I'm using Python 2.7.13 and Tensorflow 1.3.0 .

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import math
import tensorflow as tf
from deployment import model_deploy
from datasets import dataset_factory
from nets import nets_factory
from preprocessing import preprocessing_factory
slim = tf.contrib.slim


tf.app.flags.DEFINE_integer(
    'batch_size', 32, 'The number of samples in each batch.')

tf.app.flags.DEFINE_integer(
    'max_num_batches', None,
    'Max number of batches to evaluate by default use all.')

tf.app.flags.DEFINE_string(
    'master', '', 'The address of the TensorFlow master to use.')

tf.app.flags.DEFINE_string(
    'checkpoint_path', '...',
    'The directory where the model was written to or an absolute path to a '
    'checkpoint file.')

tf.app.flags.DEFINE_string(
    'eval_dir', '...',
    'Directory where the results are saved to.')

tf.app.flags.DEFINE_integer('num_clones', 1,
                            'Number of model clones to deploy.')

tf.app.flags.DEFINE_boolean('clone_on_cpu', False,
                            'Use CPUs to deploy clones.')

tf.app.flags.DEFINE_integer('worker_replicas', 1, 'Number of worker replicas.')

tf.app.flags.DEFINE_integer(
    'num_readers', 4,
    'The number of parallel readers that read data from the dataset.')

tf.app.flags.DEFINE_integer(
    'num_ps_tasks', 0,
    'The number of parameter servers. If the value is 0, then the parameters '
    'are handled locally by the worker.')


tf.app.flags.DEFINE_integer(
    'num_preprocessing_threads', 4,
    'The number of threads used to create the batches.')

tf.app.flags.DEFINE_string(
    'dataset_name', '...', 'The name of the dataset to load.')

tf.app.flags.DEFINE_string(
    'dataset_split_name', 'validation', 'The name of the train/test split.')

tf.app.flags.DEFINE_string(
    'dataset_dir', '...', 
    'The directory where the dataset files are stored.')

tf.app.flags.DEFINE_integer(
    'labels_offset', 0,
    'An offset for the labels in the dataset. This flag is primarily used to '
    'evaluate the VGG and ResNet architectures which do not use a background '
    'class for the ImageNet dataset.')

tf.app.flags.DEFINE_string(
    'model_name', 'densenet161', 'The name of the architecture to evaluate.')

tf.app.flags.DEFINE_string(
    'preprocessing_name', None, 'The name of the preprocessing to use. If left '
    'as `None`, then the model_name flag is used.')

tf.app.flags.DEFINE_float(
    'moving_average_decay', None,
    'The decay to use for the moving average.'
    'If left as None, then moving averages are not used.')

tf.app.flags.DEFINE_integer(
    'eval_image_size', None, 'Eval image size')

FLAGS = tf.app.flags.FLAGS


def main(_):
  if not FLAGS.dataset_dir:
    raise ValueError('You must supply the dataset directory with --dataset_dir')
    
    #######################
    # Config model_deploy #
    #######################
  tf.logging.set_verbosity(tf.logging.INFO)
  with tf.Graph().as_default():
      
    deploy_config = model_deploy.DeploymentConfig(
      num_clones=FLAGS.num_clones,
      clone_on_cpu=FLAGS.clone_on_cpu,
      #replica_id=FLAGS.task,
      num_replicas=FLAGS.worker_replicas,
      num_ps_tasks=FLAGS.num_ps_tasks)
      
    # Create global_step
    with tf.device(deploy_config.variables_device()):
      tf_global_step = slim.create_global_step()
    ######################
    # Select the dataset #
    ######################
    dataset = dataset_factory.get_dataset(
        FLAGS.dataset_name, FLAGS.dataset_split_name, FLAGS.dataset_dir)

    ####################
    # Select the model #
    ####################
    network_fn = nets_factory.get_network_fn(
        FLAGS.model_name,
        num_classes=(dataset.num_classes - FLAGS.labels_offset),
        is_training=False)

    ##############################################################
    # Create a dataset provider that loads data from the dataset #
    ##############################################################
    with tf.device(deploy_config.inputs_device()):
        provider = slim.dataset_data_provider.DatasetDataProvider(
            dataset,
            num_readers=FLAGS.num_readers,
            shuffle=False,
            common_queue_capacity=2 * FLAGS.batch_size,
            common_queue_min=FLAGS.batch_size)
        [image, label] = provider.get(['image', 'label'])
        label -= FLAGS.labels_offset

    #####################################
    # Select the preprocessing function #
    #####################################
    preprocessing_name = FLAGS.preprocessing_name or FLAGS.model_name
    image_preprocessing_fn = preprocessing_factory.get_preprocessing(
        preprocessing_name,
        is_training=False)

    eval_image_size = FLAGS.eval_image_size or network_fn.default_image_size

    image = image_preprocessing_fn(image, eval_image_size, eval_image_size)

    images, labels = tf.train.batch(
        [image, label],
        batch_size=FLAGS.batch_size,
        num_threads=FLAGS.num_preprocessing_threads,
        capacity=5 * FLAGS.batch_size)
    batch_queue = slim.prefetch_queue.prefetch_queue(
        [images, labels], capacity=2 * deploy_config.num_clones)
    ####################
    # Define the model #
    ####################
    def clone_fn(batch_queue):
      """Allows data parallelism by creating multiple clones of network_fn."""
      with tf.device(deploy_config.inputs_device()):
        images, labels = batch_queue.dequeue()
      logits, end_points = network_fn(images)
      logits = tf.squeeze(logits)

      #############################
      # Specify the loss function #
      #############################
      if 'AuxLogits' in end_points:
        tf.losses.mean_squared_error(
            predictions=end_points['AuxLogits'], labels=labels, weights=0.4, scope='aux_loss')
      tf.losses.mean_squared_error(
          predictions=logits, labels=labels, weights=1.0)
      return end_points

    #clones = model_deploy.create_clones(deploy_config, clone_fn, [batch_queue])
    #first_clone_scope = deploy_config.clone_scope(0)
    ####################
    # Define the model #
    ####################
    logits, _ = network_fn(images)

    if FLAGS.moving_average_decay:
      variable_averages = tf.train.ExponentialMovingAverage(
          FLAGS.moving_average_decay, tf_global_step)
      variables_to_restore = variable_averages.variables_to_restore(
          slim.get_model_variables())
      variables_to_restore[tf_global_step.op.name] = tf_global_step
    else:
      variables_to_restore = slim.get_variables_to_restore()

    logits = tf.squeeze(logits)

    # Define the metrics:
    predictions = logits

    names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
        'Accuracy': tf.metrics.root_mean_squared_error(predictions, labels),
        'Recall_5': slim.metrics.streaming_recall(
            logits, labels),
    })

    # Print the summaries to screen.
    print_ops = []
    summary_ops = []
    for name, value in names_to_values.items():
      summary_name = 'eval/%s' % name
      op = tf.summary.scalar(summary_name, value, collections=[])
      op = tf.Print(op, [value], summary_name)
      summary_ops.append(op)
      print_ops.append(tf.Print(value, [value], summary_name))
      tf.add_to_collection(tf.GraphKeys.SUMMARIES, op)

    # TODO(sguada) use num_epochs=1
    if FLAGS.max_num_batches:
      num_batches = FLAGS.max_num_batches
    else:
      # This ensures that we make a single pass over all of the data.
      num_batches = math.ceil(dataset.num_samples / float(FLAGS.batch_size))

    if tf.gfile.IsDirectory(FLAGS.checkpoint_path):
        if tf.train.latest_checkpoint(FLAGS.checkpoint_path):
            checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path)
        else:
            checkpoint_path = FLAGS.checkpoint_path

    eval_interval_secs = 6
    
    tf.logging.info('Evaluating %s' % checkpoint_path)

    slim.evaluation.evaluation_loop(
        master=FLAGS.master,
        checkpoint_dir=checkpoint_path,
        logdir=FLAGS.eval_dir,
        num_evals=num_batches,
        eval_op=list(names_to_updates.values()) + print_ops,
        variables_to_restore=variables_to_restore,
        eval_interval_secs = eval_interval_secs )
if __name__ == '__main__':
  tf.app.run()

I tried to use some code like Tensorflow tutorial as well:

# Creates a graph.
with tf.device('/gpu:2'):
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with allow_soft_placement and log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(
         allow_soft_placement=True, log_device_placement=True))
# Runs the op.
print(sess.run(c))

I modified the code in this way:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import math
import tensorflow as tf

from datasets import dataset_factory
from nets import nets_factory
from preprocessing import preprocessing_factory

slim = tf.contrib.slim

tf.app.flags.DEFINE_integer(
    'batch_size', 32, 'The number of samples in each batch.')

tf.app.flags.DEFINE_integer(
    'max_num_batches', None,
    'Max number of batches to evaluate by default use all.')

tf.app.flags.DEFINE_string(
    'master', '', 'The address of the TensorFlow master to use.')

tf.app.flags.DEFINE_string(
    'checkpoint_path', '...',
    'The directory where the model was written to or an absolute path to a '
    'checkpoint file.')

tf.app.flags.DEFINE_string(
    'eval_dir', '...',
    'Directory where the results are saved to.')

tf.app.flags.DEFINE_integer(
    'num_preprocessing_threads', 4,
    'The number of threads used to create the batches.')

tf.app.flags.DEFINE_string(
    'dataset_name', '...', 'The name of the dataset to load.')

tf.app.flags.DEFINE_string(
    'dataset_split_name', 'validation', 'The name of the train/test split.')

tf.app.flags.DEFINE_string(
    'dataset_dir', '...', 
    'The directory where the dataset files are stored.')

tf.app.flags.DEFINE_integer(
    'labels_offset', 0,
    'An offset for the labels in the dataset. This flag is primarily used to '
    'evaluate the VGG and ResNet architectures which do not use a background '
    'class for the ImageNet dataset.')

tf.app.flags.DEFINE_string(
    'model_name', 'densenet161', 'The name of the architecture to evaluate.')

tf.app.flags.DEFINE_string(
    'preprocessing_name', None, 'The name of the preprocessing to use. If left '
    'as `None`, then the model_name flag is used.')

tf.app.flags.DEFINE_float(
    'moving_average_decay', None,
    'The decay to use for the moving average.'
    'If left as None, then moving averages are not used.')

tf.app.flags.DEFINE_integer(
    'eval_image_size', None, 'Eval image size')

FLAGS = tf.app.flags.FLAGS

# Initialize all global and local variables
init = tf.group(tf.global_variables_initializer(),
                   tf.local_variables_initializer())

def main(_):
  if not FLAGS.dataset_dir:
    raise ValueError('You must supply the dataset directory with --dataset_dir')

  tf.logging.set_verbosity(tf.logging.INFO)
  
  sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True,
                                          log_device_placement=True))
  
  with tf.Graph().as_default(), tf.device('/gpu:0'):
      
    sess.run(init)
    tf_global_step = slim.get_or_create_global_step()

    ######################
    # Select the dataset #
    ######################
    dataset = dataset_factory.get_dataset(
        FLAGS.dataset_name, FLAGS.dataset_split_name, FLAGS.dataset_dir)

    ####################
    # Select the model #
    ####################
    network_fn = nets_factory.get_network_fn(
        FLAGS.model_name,
        num_classes=(dataset.num_classes - FLAGS.labels_offset),
        is_training=False)

    ##############################################################
    # Create a dataset provider that loads data from the dataset #
    ##############################################################
    provider = slim.dataset_data_provider.DatasetDataProvider(
        dataset,
        shuffle=False,
        common_queue_capacity=2 * FLAGS.batch_size,
        common_queue_min=FLAGS.batch_size)
    [image, label] = provider.get(['image', 'label'])
    label -= FLAGS.labels_offset

    #####################################
    # Select the preprocessing function #
    #####################################
    preprocessing_name = FLAGS.preprocessing_name or FLAGS.model_name
    image_preprocessing_fn = preprocessing_factory.get_preprocessing(
        preprocessing_name,
        is_training=False)

    eval_image_size = FLAGS.eval_image_size or network_fn.default_image_size

    image = image_preprocessing_fn(image, eval_image_size, eval_image_size)

    images, labels = tf.train.batch(
        [image, label],
        batch_size=FLAGS.batch_size,
        num_threads=FLAGS.num_preprocessing_threads,
        capacity=5 * FLAGS.batch_size)

    ####################
    # Define the model #
    ####################
    logits, _ = network_fn(images)

    if FLAGS.moving_average_decay:
      variable_averages = tf.train.ExponentialMovingAverage(
          FLAGS.moving_average_decay, tf_global_step)
      variables_to_restore = variable_averages.variables_to_restore(
          slim.get_model_variables())
      variables_to_restore[tf_global_step.op.name] = tf_global_step
    else:
      variables_to_restore = slim.get_variables_to_restore()

    logits = tf.squeeze(logits)

    # Define the metrics:
    predictions = logits

    names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
        'Accuracy': tf.metrics.root_mean_squared_error(predictions, labels),
        'Recall_5': slim.metrics.streaming_recall(
            logits, labels),
    })

    # Print the summaries to screen.
    print_ops = []
    summary_ops = []
    for name, value in names_to_values.items():
      summary_name = 'eval/%s' % name
      op = tf.summary.scalar(summary_name, value, collections=[])
      op = tf.Print(op, [value], summary_name)
      summary_ops.append(op)
      print_ops.append(tf.Print(value, [value], summary_name))
      tf.add_to_collection(tf.GraphKeys.SUMMARIES, op)
      
    # TODO(sguada) use num_epochs=1
    if FLAGS.max_num_batches:
      num_batches = FLAGS.max_num_batches
    else:
      # This ensures that we make a single pass over all of the data.
      num_batches = math.ceil(dataset.num_samples / float(FLAGS.batch_size))

    if tf.gfile.IsDirectory(FLAGS.checkpoint_path):
        if tf.train.latest_checkpoint(FLAGS.checkpoint_path):
            checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path)
        else:
            checkpoint_path = FLAGS.checkpoint_path

    #print(checkpoint_path)
    eval_interval_secs = 6
    
    tf.logging.info('Evaluating %s' % checkpoint_path)

    slim.evaluation.evaluation_loop(
        master=FLAGS.master,
        checkpoint_dir=checkpoint_path,
        logdir=FLAGS.eval_dir,
        num_evals=num_batches,
        eval_op=list(names_to_updates.values()) + print_ops,
        variables_to_restore=variables_to_restore,
        eval_interval_secs = eval_interval_secs )


if __name__ == '__main__':
  tf.app.run()

When I run this code, I face this error:

    Traceback (most recent call last):
  File "/home/zgholami/test1/GZ_Project/GZ_DenseNet_TF-slim/eval_image_classifier.py", line 210, in <module>
    tf.app.run()
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/home/zgholami/test1/GZ_Project/GZ_DenseNet_TF-slim/eval_image_classifier.py", line 206, in main
    eval_interval_secs = 60 )
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/evaluation.py", line 296, in evaluation_loo
p
    timeout=timeout)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/contrib/training/python/training/evaluation.py", line 447, in evalua
te_repeatedly
    session_creator=session_creator, hooks=hooks) as session:
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 668, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 490, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 842, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 847, in _create_session
    return self._sess_creator.create_session()
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 551, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 425, in create_session
    init_fn=self._scaffold.init_fn)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/training/session_manager.py", line 273, in prepare_session
    config=config)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/training/session_manager.py", line 189, in _restore_checkpoin
t
    saver.restore(sess, checkpoint_filename_with_path)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1560, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'eval_step': Could not satisfy explicit device s
pecification '/device:GPU:0' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and devices: 
Const: GPU CPU 
AssignAdd: CPU 
VariableV2: CPU 
Identity: GPU CPU 
Assign: CPU 
IsVariableInitialized: CPU 
	 [[Node: eval_step = VariableV2[_class=["loc:@eval_step"], container="", dtype=DT_INT64, shape=[], shared_name="", _device="/device:GPU:0"]
()]]

Caused by op u'eval_step', defined at:
  File "/home/zgholami/test1/GZ_Project/GZ_DenseNet_TF-slim/eval_image_classifier.py", line 210, in <module>
    tf.app.run()
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/home/zgholami/test1/GZ_Project/GZ_DenseNet_TF-slim/eval_image_classifier.py", line 206, in main
    eval_interval_secs = 60 )
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/evaluation.py", line 296, in evaluation_loo
p
    timeout=timeout)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/contrib/training/python/training/evaluation.py", line 410, in evalua
te_repeatedly
    eval_step = get_or_create_eval_step()
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/training/evaluation.py", line 57, in _get_or_create_eval_step
    collections=[ops.GraphKeys.LOCAL_VARIABLES, ops.GraphKeys.EVAL_STEP])
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1065, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 962, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 367, in get_variable
    validate_shape=validate_shape, use_resource=use_resource)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 352, in _true_getter
    use_resource=use_resource)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 725, in _get_single_variable
    validate_shape=validate_shape)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 199, in __init__
    expected_shape=expected_shape)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 283, in _init_from_args
    name=name)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 131, in variable_op_v2
    shared_name=shared_name)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 682, in _variable_v2
    name=name)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/group/pawsey0245/zgholami/pyml/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Cannot assign a device for operation 'eval_step': Could not satisfy explicit device specification '
/device:GPU:0' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and devices: 
Const: GPU CPU 
AssignAdd: CPU 
VariableV2: CPU 
Identity: GPU CPU 
Assign: CPU 
IsVariableInitialized: CPU 
	 [[Node: eval_step = VariableV2[_class=["loc:@eval_step"], container="", dtype=DT_INT64, shape=[], shared_name="", _device="/device:GPU:0"]
()]]

ERROR:tensorflow:==================================
Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>):
<tf.Tensor 'report_uninitialized_variables_1/boolean_mask/Gather:0' shape=(?,) dtype=string>
If you want to mark it as used call its "mark_used()" method.

is_training=True

Hi, I was curious why it's the case for this implementation (and also slim's resnet_v1) that fine-tuning a model with is_training=True and then running inference with is_training=True gives better results than inference with is_training=False?

I searched and this has been mentioned here: tensorflow/models#1288 & tensorflow/models#2138 (comment) & tensorflow/models#391

And the slim walkthrough does the same thing in the last cell (inference with is_training=True).

Model for transfer-learning

Hi Pudae, thanks a lot for this super-neat implementation. I am trying to load the model (without the pre-processing functions, etc) so that I can chop off the last fc layer and stick my own (to re-train on my data-set). However, I was having a bit of an issue loading the model without any of the functions attached:

def create_symbol(chkpt_dir=CHKPT_DIR):
    with tf.Graph().as_default():
        slim.get_or_create_global_step()
        # Not possible to have channels first?
        X = tf.placeholder(tf.float32, 
                           shape=(None, 224, 224, 3))
        logits, endpoints = densenet.densenet121(
            X,
            num_classes=1000, 
            is_training=True,
            reuse=None)
        
        print(logits)
        print(endpoints)
        
        variables_to_restore = slim.get_variables_to_restore()
        
        sess = tf.Session()
        saver = tf.train.Saver(variables_to_restore)

        init_op = tf.group(
          tf.global_variables_initializer(),
          tf.local_variables_initializer())

        checkpoint_path = os.path.join(chkpt_dir, 'tf-densenet121.ckpt')
        sess.run(init_op)
        saver.restore(sess, checkpoint_path)
        return endpoints, sess

sym, sess = create_symbol()

Seems to complain about finding biases:

--------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1322     try:
-> 1323       return fn(*args)
   1324     except errors.OpError as e:

/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
   1301                                    feed_dict, fetch_list, target_list,
-> 1302                                    status, run_metadata)
   1303 

/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    472             compat.as_text(c_api.TF_Message(self.status.status)),
--> 473             c_api.TF_GetCode(self.status.status))
    474     # Delete the underlying status object from memory otherwise it stays alive

NotFoundError: Key densenet121/dense_block2/conv_block7/x1/Conv/biases not found in checkpoint
    [[Node: save/RestoreV2_158 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_158/tensor_names, save/RestoreV2_158/shape_and_slices)]]
    [[Node: save/RestoreV2_16/_1181 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_2392_save/RestoreV2_16", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

During handling of the above exception, another exception occurred:

NotFoundError                             Traceback (most recent call last)
<ipython-input-5-9d5aca7f5637> in <module>()
----> 1 sym, sess = create_symbol()

<ipython-input-4-e283472f7a13> in create_symbol(chkpt_dir)
     25         checkpoint_path = os.path.join(chkpt_dir, 'tf-densenet121.ckpt')
     26         sess.run(init_op)
---> 27         saver.restore(sess, checkpoint_path)
     28         return network_fn, sess

/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/training/saver.py in restore(self, sess, save_path)
   1664     if context.in_graph_mode():
   1665       sess.run(self.saver_def.restore_op_name,
-> 1666                {self.saver_def.filename_tensor_name: save_path})
   1667     else:
   1668       self._build_eager(save_path, build_save=False, build_restore=True)

/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    887     try:
    888       result = self._run(None, fetches, feed_dict, options_ptr,
--> 889                          run_metadata_ptr)
    890       if run_metadata:
    891         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1118     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1119       results = self._do_run(handle, final_targets, final_fetches,
-> 1120                              feed_dict_tensor, options, run_metadata)
   1121     else:
   1122       results = []

/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1315     if handle is None:
   1316       return self._do_call(_run_fn, self._session, feeds, fetches, targets,
-> 1317                            options, run_metadata)
   1318     else:
   1319       return self._do_call(_prun_fn, self._session, handle, feeds, fetches)

/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1334         except KeyError:
   1335           pass
-> 1336       raise type(e)(node_def, op, message)
   1337 
   1338   def _extend_graph(self):

NotFoundError: Key densenet121/dense_block2/conv_block7/x1/Conv/biases not found in checkpoint
    [[Node: save/RestoreV2_158 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_158/tensor_names, save/RestoreV2_158/shape_and_slices)]]
    [[Node: save/RestoreV2_16/_1181 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_2392_save/RestoreV2_16", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

Caused by op 'save/RestoreV2_158', defined at:
  File "/anaconda/envs/py35/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/anaconda/envs/py35/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/anaconda/envs/py35/lib/python3.5/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/anaconda/envs/py35/lib/python3.5/site-packages/ipykernel/kernelapp.py", line 486, in start
    self.io_loop.start()
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tornado/platform/asyncio.py", line 112, in start
    self.asyncio_loop.run_forever()
  File "/anaconda/envs/py35/lib/python3.5/asyncio/base_events.py", line 345, in run_forever
    self._run_once()
  File "/anaconda/envs/py35/lib/python3.5/asyncio/base_events.py", line 1312, in _run_once
    handle._run()
  File "/anaconda/envs/py35/lib/python3.5/asyncio/events.py", line 125, in _run
    self._callback(*self._args)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tornado/ioloop.py", line 760, in _run_callback
    ret = callback()
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tornado/stack_context.py", line 276, in null_wrapper
    return fn(*args, **kwargs)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 536, in <lambda>
    self.io_loop.add_callback(lambda : self._handle_events(self.socket, 0))
  File "/anaconda/envs/py35/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events
    self._handle_recv()
  File "/anaconda/envs/py35/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
    self._run_callback(callback, msg)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback
    callback(*args, **kwargs)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tornado/stack_context.py", line 276, in null_wrapper
    return fn(*args, **kwargs)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 233, in dispatch_shell
    handler(stream, idents, msg)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/ipykernel/ipkernel.py", line 208, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/ipykernel/zmqshell.py", line 537, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2728, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2850, in run_ast_nodes
    if self.run_code(code, result):
  File "/anaconda/envs/py35/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-5-9d5aca7f5637>", line 1, in <module>
    sym, sess = create_symbol()
  File "<ipython-input-4-e283472f7a13>", line 19, in create_symbol
    saver = tf.train.Saver(variables_to_restore)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1218, in __init__
    self.build()
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1227, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1263, in _build
    build_save=build_save, build_restore=build_restore)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 751, in _build_internal
    restore_sequentially, reshape)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 427, in _AddRestoreOps
    tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 267, in restore_op
    [spec.tensor.dtype])[0])
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1021, in restore_v2
    shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

NotFoundError (see above for traceback): Key densenet121/dense_block2/conv_block7/x1/Conv/biases not found in checkpoint
    [[Node: save/RestoreV2_158 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_158/tensor_names, save/RestoreV2_158/shape_and_slices)]]
    [[Node: save/RestoreV2_16/_1181 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_2392_save/RestoreV2_16", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

However, the end-points seem ok:

('densenet121/dense_block4', <tf.Tensor 'densenet121/dense_block4/conv_block16/concat:0' shape=(?, 7, 7, 1024) dtype=float32>), ('densenet121/logits', <tf.Tensor 'densenet121/logits/Relu:0' shape=(?, 1, 1, 1000) dtype=float32>), ('predictions', <tf.Tensor 'densenet121/predictions/Reshape_1:0' shape=(?, 1, 1, 1000) dtype=float32>)])

So that I can just extract: 'densenet121/dense_block4/conv_block16/concat:0' and add my own:
('densenet121/logits', <tf.Tensor 'densenet121/logits/Relu:0' shape=(?, 1, 1, 16) dtype=float32>), ('predictions', <tf.Tensor 'densenet121/predictions/Reshape_1:0' shape=(?, 1, 1, 16) dtype=float32>)])

Also was curious if there was a reason that shape is channels-last, since I thought channels-first is faster for cuDNN training?

Thanks