yixuanli / densenet-tensorflow Goto Github PK



DenseNet Implementation in Tensorflow

License: GNU General Public License v3.0

Python 100.00%

tensorflow densenet

densenet-tensorflow's Issues

Weights initialization

Hi, thanks for the beautiful code.

I'm just curious about the way you initialize the conv weights:
tf.random_normal_initializer(stddev=np.sqrt(2.0/9/channel))

Could you explain this setting? The paper says they 'adopt the weight initialization introduced by [10]', which is the MSRA initialization. Thanks in advance ;)
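For reference, the quoted expression has the same form as the initialization usually called MSRA or He initialization, which draws weights from a zero-mean Gaussian with standard deviation sqrt(2 / n), where n = k * k * channels for a k x k kernel. The sketch below assumes a 3x3 kernel, so the 9 would be k * k; whether `channel` counts input or output channels in the repository is not stated in this issue. `he_stddev` is a hypothetical helper name, not part of the repository.

```python
import math

def he_stddev(kernel_size, channels):
    """Standard deviation sqrt(2 / n) with n = kernel_size**2 * channels."""
    n = kernel_size * kernel_size * channels
    return math.sqrt(2.0 / n)

# For a 3x3 kernel this matches the expression quoted in the issue.
channel = 12  # example value only
assert math.isclose(he_stddev(3, channel), math.sqrt(2.0 / 9 / channel))
```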

Question on validation loss

I ran your code on CIFAR-100.
When the learning rate is halved, the test loss suddenly decreases and then increases,
while the validation error stays unchanged.
Does tensorpack maintain a shadow value and show the EMA of the cost (and reset this value after a learning rate change)?

Any suggestions would be appreciated.
Thanks in advance
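Whether tensorpack reports a smoothed cost is not confirmed in this thread. As a rough sketch of what such smoothing would do if it were applied, the snippet below computes an exponential moving average (EMA) over made-up loss values; the decay of 0.9 and the function name `ema` are assumptions for illustration only.

```python
def ema(values, decay=0.9):
    """Return the exponential moving average of `values`, seeded with the first value."""
    smoothed = []
    avg = None
    for v in values:
        avg = v if avg is None else decay * avg + (1.0 - decay) * v
        smoothed.append(avg)
    return smoothed

raw = [1.0, 1.0, 1.0, 0.5, 0.5, 0.5]  # made-up loss values with a sudden drop
smooth = ema(raw)
# Right after the drop, the smoothed value lags between the old and new level.
assert 0.5 < smooth[3] < 1.0
```

A displayed EMA therefore reacts to a learning rate change more slowly than the raw per-step loss does.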

question about tensor concat

The line l = tf.concat([c, l], 3) seems to concatenate only the outputs of two adjacent layers.
Shouldn't it concatenate the outputs of all previous layers in a dense block?
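One way to reason about this, assuming `l` is reassigned to the concatenated tensor on every layer: `l` then already carries every earlier output, so concatenating only `c` and `l` still accumulates all previous layers. The sketch below tracks feature groups with plain lists instead of tensors; `dense_block_channels` and the channel numbers are hypothetical.

```python
def dense_block_channels(in_channels, growth_rate, num_layers):
    """Track which feature groups `l` holds after each `l = concat([c, l])` step."""
    l = [("input", in_channels)]        # stands in for the block's input tensor
    for i in range(num_layers):
        c = (f"layer{i}", growth_rate)  # new features, computed from all of `l`
        l = [c] + l                     # mirrors l = tf.concat([c, l], 3)
    return l

groups = dense_block_channels(in_channels=16, growth_rate=12, num_layers=3)
names = [name for name, _ in groups]
total = sum(ch for _, ch in groups)
```

Here `names` lists every earlier layer plus the input, and `total` grows by the growth rate at each layer.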

How much time does one forward and backward pass take for a mini-batch of 64?

My environment is CUDA 8.0.61, cuDNN 6, and a TITAN X (Pascal).
In my implementation, DenseNet-BC (l=100, k=12) takes 0.216 s per mini-batch on CIFAR-10 with batch_size 64, whereas the original takes 0.153 s in training.
I wonder whether this is due to my implementation or to TensorFlow. Could you tell me your time cost?
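To compare numbers like these, it helps to measure them the same way. A minimal timing sketch, assuming `step_fn` is any callable that runs one full training step and blocks until it finishes; the helper name and the warm-up count are assumptions, not part of the repository.

```python
import time

def seconds_per_batch(step_fn, num_batches=100, warmup=10):
    """Average wall-clock seconds per call of `step_fn`, excluding warm-up calls."""
    for _ in range(warmup):
        step_fn()                      # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(num_batches):
        step_fn()
    return (time.perf_counter() - start) / num_batches
```

Skipping the first few steps keeps one-off setup costs out of the average.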

add_transition layer

I found this in your code:
def add_transition(name, l):
    shape = l.get_shape().as_list()
    in_channel = shape[3]
    with tf.variable_scope(name) as scope:
        l = BatchNorm('bn1', l)
        l = tf.nn.relu(l)
        l = Conv2D('conv1', l, in_channel, 1, stride=1, use_bias=False, nl=tf.nn.relu)
        l = AvgPooling('pool', l, 2)
    return l

After BN and ReLU there is a 1x1 conv layer, but it is called with nl=tf.nn.relu. Does this mean a ReLU is also applied after the conv layer?
This differs from the DenseNet Caffe version.
Could you explain it to me?
Thanks.

JFYI, win7 64bit Anaconda python 3.5.3 tensorpack 0.3.0 running error

Dear.

Windows 7 64bit
Anaconda
python 3.5.3
tensorpack 0.3.0

How can I fix this error?

Error output:

c:\densenet-tensorflow>python cifar10-densenet.py

[0810 18:05:20 @logger.py:107] Use a new log directory train_log/cifar10-single-fisrt150-second225-max3000810-180520
[0810 18:05:20 @logger.py:73] Argv: cifar10-densenet.py
[0810 18:05:20 @fs.py:89] WRN Env var $TENSORPACK_DATASET not set, using C:\Users\java\tensorpack_data for datasets.
[0810 18:05:20 @cifar.py:33] Found cifar10 data in C:\Users\java\tensorpack_data\cifar10_data.

Traceback (most recent call last):
  File "cifar10-densenet.py", line 174, in <module>
    config = get_config()
  File "cifar10-densenet.py", line 143, in get_config
    dataset_train = get_data('train')
  File "cifar10-densenet.py", line 135, in get_data
    ds = PrefetchData(ds, 3, 2)
  File "c:\anaconda3\lib\site-packages\tensorpack\dataflow\prefetch.py", line 84, in __init__
    start_proc_mask_signal(self.procs)
  File "c:\anaconda3\lib\site-packages\tensorpack\utils\concurrency.py", line 212, in start_proc_mask_signal
    p.start()
  File "c:\anaconda3\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "c:\anaconda3\lib\multiprocessing\context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "c:\anaconda3\lib\multiprocessing\context.py", line 313, in _Popen
    return Popen(process_obj)
  File "c:\anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 66, in __init__
    reduction.dump(process_obj, to_child)
  File "c:\anaconda3\lib\multiprocessing\reduction.py", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'MapDataComponent.__init__.<locals>.f'

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\anaconda3\lib\multiprocessing\spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "c:\anaconda3\lib\multiprocessing\spawn.py", line 116, in _main
    self = pickle.load(from_parent)
EOFError: Ran out of input
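The first traceback ends in Python's pickle machinery: on Windows, multiprocessing starts workers with the spawn method, which pickles the process object, and a function defined inside another function cannot be pickled. Below is a minimal sketch of that failure and one generic workaround, independent of tensorpack. All helper names are hypothetical, and whether tensorpack can be made to run on Windows this way is not verified here.

```python
import functools
import operator
import pickle

def make_local_mapper(mean):
    # A function defined inside another function is a "local object";
    # pickle cannot serialize it by reference.
    def f(x):
        return x - mean
    return f

def can_pickle(obj):
    """Return True if `obj` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False

local_f = make_local_mapper(3.0)

# functools.partial over an importable function pickles fine;
# operator.add(-mean, x) computes x - mean for numeric inputs.
picklable_f = functools.partial(operator.add, -3.0)
```

Both callables compute the same value, but only `picklable_f` can be handed to a spawned worker process.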

where is the pretrained model?

[0609 15:27:39 @base.py:252] Epoch 14 (global_step 10934) finished, time:59.7 seconds.

I'm running on dual 1080 Ti GPUs and each epoch takes about 1 minute, using the same parameters you used. Is this a reasonable running time?

Also, how can I use my own data for training and testing?

Thank you in advance!!!

densenet not training when using tf.contrib.layers.recompute_grad

I want to implement a memory-efficient DenseNet, following the code in
https://github.com/joeyearsley/efficient_densenet_tensorflow/blob/master/models/densenet_creator.py, but the training process gets stuck at the first epoch.
I only changed the add_layer part:

        def add_layer(l):

            def _add_layer(l):
                shape = l.get_shape().as_list()
                in_channel = shape[3]
                with tf.variable_scope(name) as scope:
                    c = BatchNorm('bn1', l)
                    c = tf.nn.relu(c)
                    c = conv('conv1', c, self.growthRate, 1)
                    l = tf.concat([c, l], 3)
                return l
            
            if self.efficient:
                _add_layer = tf.contrib.layers.recompute_grad(_add_layer)
            
            return _add_layer(l)

I also added the keyword argument "efficient" to specify whether to use the memory-efficient version.
However, the training process got stuck.
I am using tensorflow 1.9
and tensorpack 0.9.1.
Do I need to change other parts of tensorpack?
Thanks in advance

cifar100-densenet.py missing

Is cifar100-densenet.py also available for reproducing the CIFAR-100 results?
Or did I overlook something?

error while running the code

TypeError: Can't instantiate abstract class StatPrinter with abstract methods _trigger
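This message is what Python raises whenever a class is instantiated while one of its abstract methods is still unimplemented. A plausible cause, not confirmed in this thread, is a mismatch between the script and the installed tensorpack version. The sketch below reproduces the mechanism with hypothetical classes that are not tensorpack's own.

```python
from abc import ABC, abstractmethod

class Callback(ABC):
    @abstractmethod
    def _trigger(self):
        """Subclasses must implement this hook."""

class OldStylePrinter(Callback):
    # Implements a differently named hook, so `_trigger` stays abstract.
    def trigger_epoch(self):
        return "stats"

class FixedPrinter(Callback):
    def _trigger(self):
        return "stats"

try:
    OldStylePrinter()
    error_message = None
except TypeError as exc:
    error_message = str(exc)  # names the missing abstract method
```

Instantiating `FixedPrinter` works because it overrides the abstract method.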

Need to consolidate TF function calls with newer API.

How to train on a custom dataset?

Great implementation. I want to use it to train on my own dataset, which is similar to CIFAR-10. Could you tell me how to modify it? I found that your code loads the dataset through a built-in function implemented by tensorflow. Thanks

Hi, I got the following error while running the algorithm on two GPUs (tensorflow=1.9.0, in a Docker container).

[0507 11:40:09 @training.py:50] [DataParallel] Training a model of 2 towers.
[0507 11:40:09 @interface.py:31] Automatically applying QueueInput on the DataFlow.
[0507 11:40:09 @interface.py:43] Automatically applying StagingInput on the DataFlow.
Traceback (most recent call last):

    launch_train_with_config(config, SyncMultiGPUTrainer(nr_tower))

  File "/usr/local/lib/python3.6/dist-packages/tensorpack/train/interface.py", line 90, in launch_train_with_config
    model.get_input_signature(), input,
  File "/usr/local/lib/python3.6/dist-packages/tensorpack/utils/argtools.py", line 200, in wrapper
    value = func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorpack/graph_builder/model_desc.py", line 86, in get_input_signature
    inputs = self.inputs()
  File "/usr/local/lib/python3.6/dist-packages/tensorpack/graph_builder/model_desc.py", line 116, in inputs
    raise NotImplementedError()
NotImplementedError

Tensorpack update

Hi,

I believe the tensorpack repository has been updated, so your repository needs an update too.

I ran the code and the following error appeared:

Traceback (most recent call last):
  File "cifar10-densenet.py", line 173, in <module>
    config = get_config()
  File "cifar10-densenet.py", line 142, in get_config
    dataset_train = get_data('train')
  File "cifar10-densenet.py", line 120, in get_data
    imgaug.CenterPaste((40, 40)),
  File "/home/turanshare/anaconda2/envs/tensorflow/lib/python3.6/site-packages/tensorpack/utils/develop.py", line 151, in __getattr__
    return getattr(module, item)
AttributeError: module 'tensorpack.dataflow.imgaug' has no attribute 'CenterPaste'

Apart from CenterPaste I think the following lines must be updated:

augmentors = [
    imgaug.CenterPaste((40, 40)),
    imgaug.RandomCrop((32, 32)),
    imgaug.Flip(horiz=True),
    #imgaug.Brightness(20),
    #imgaug.Contrast((0.6,1.4)),
    imgaug.MapImage(lambda x: x - pp_mean),
]

If you have already solved this please let me know. Otherwise I will be happy to generate a pull request with your help.

Thank you.

Question: per_pixel_mean_subtract on test

In cifar10-densenet.py:
Line 116: ds = dataset.Cifar10(train_or_test)
Line 117: pp_mean = ds.get_per_pixel_mean()
Is it acceptable for the validation set to use statistics computed from all the test data, such as per_pixel_mean?
Thanks in advance
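A common convention, not necessarily what this repository does, is to compute preprocessing statistics on the training split only and reuse them unchanged for validation and test data. The sketch below illustrates that convention with tiny made-up "images" stored as flat lists; both helper functions are hypothetical.

```python
def per_pixel_mean(images):
    """Mean over images at each pixel position; each image is a flat list of floats."""
    n = len(images)
    return [sum(px) / n for px in zip(*images)]

def subtract_mean(image, mean):
    """Center one image with a precomputed per-pixel mean."""
    return [p - m for p, m in zip(image, mean)]

train_images = [[0.0, 2.0], [2.0, 4.0]]  # made-up 2-pixel "images"
test_images = [[1.0, 1.0]]

train_mean = per_pixel_mean(train_images)  # statistics from training data only
test_centered = [subtract_mean(img, train_mean) for img in test_images]
```

With this convention, no information from the test split enters the preprocessing.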

Results error

How do I test the results?
And how do I plot the figure you showed?
