ikhlestov / vision_networks
Repo about neural networks for image handling
License: MIT License
Hi!
Thanks for your kind sharing! There is a problem when I run your code for CIFAR-10 classification: when I change the kernel size of the convolutional layers in each block from 3x3 to 1x1, the running time per epoch increases from about 3.05s to 4.11s on a Titan X. However, a 3x3 convolution should always consume more computational resources than a 1x1 convolution, so I am confused. Can you help analyze whether the problem is in your code or in the TensorFlow optimization?
Thanks again!
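A standalone micro-benchmark along these lines (a sketch, not the repo's code) can help separate the per-op cost from the rest of the training loop:

import time
import numpy as np
import tensorflow as tf

# Compare a single 3x3 vs 1x1 convolution on random data (TF 1.x).
inputs = tf.constant(np.random.rand(64, 32, 32, 48).astype(np.float32))
conv3 = tf.layers.conv2d(inputs, filters=12, kernel_size=3, padding='same')
conv1 = tf.layers.conv2d(inputs, filters=12, kernel_size=1, padding='same')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for name, op in [('3x3', conv3), ('1x1', conv1)]:
        sess.run(op)  # warm-up run, excludes one-time setup cost
        start = time.time()
        for _ in range(100):
            sess.run(op)
        print(name, 'avg seconds per call:', (time.time() - start) / 100)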
Hello, I wonder if you have test results on the C10+ dataset?
Thanks!
In the DenseNet paper, it is mentioned that for non-BC architectures the number of output features of the initial conv layer should be 16, while for BC architectures it is twice the growth rate. However, in your implementation the number is always twice the growth rate.
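For reference, the rule from the paper could be expressed as the following sketch (variable names follow the question, not necessarily the repo):

# Number of output features of the initial convolution, per the paper:
# 16 for the basic DenseNet, 2 * growth_rate for DenseNet-BC.
def initial_conv_features(growth_rate, bc_mode):
    return 2 * growth_rate if bc_mode else 16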
When using batch norm, we need to update the moving mean and moving variance, as the TensorFlow documentation says. But I have not found this in your code. Is it wrong?
Is bias used?
As I understand your code, bias is only used in 'trainsition_layer_to_classes'.
In the PyTorch source code, it looks like 'conv2d + bias' is used.
Other models can't run!
Your implementation can run two cases:
DenseNet (k = 12), d = 40
DenseNet-BC (k = 12), d = 100
Others can't run because of OOM.
The PyTorch version follows the "Memory Efficient Implementation of DenseNets" implementation details:
https://github.com/liuzhuang13/DenseNet/tree/master/models
The main idea is to use shared variables.
As I understand your code, you try to reuse an 'out' variable (to reduce memory?).
I think that still builds another variable in the TensorFlow graph.
I want to implement reduced memory usage. Do you have any ideas?
My main idea is to use shared variables (https://www.tensorflow.org/programmers_guide/variable_scope).
But I think there is a problem with tf.concat?
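For reference, TF 1.x variable sharing as described in the linked guide looks roughly like the sketch below (names are illustrative, not the repo's). Note that sharing deduplicates weights only; the large tensors produced by tf.concat are activations, so this alone may not reduce that memory.

import tensorflow as tf

# Minimal sketch of variable sharing via tf.variable_scope (TF 1.x).
def shared_conv(inputs, out_features):
    with tf.variable_scope("shared_conv", reuse=tf.AUTO_REUSE):
        kernel = tf.get_variable(
            "kernel",
            shape=[3, 3, inputs.get_shape()[-1].value, out_features],
            initializer=tf.variance_scaling_initializer())
    return tf.nn.conv2d(inputs, kernel, strides=[1, 1, 1, 1], padding="SAME")

x = tf.placeholder(tf.float32, [None, 32, 32, 12])
y1 = shared_conv(x, 12)   # creates shared_conv/kernel
y2 = shared_conv(y1, 12)  # reuses the same kernel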
Hi,
When trying "--growth_rate=12 --depth=100 --dataset=C100", it returned "ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[64,372,32,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc"
From the GPU usage, I found that it only used GPU[0] and hit OOM:
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 107... Off | 00000000:01:00.0 On | N/A |
| 0% 61C P0 58W / 180W | 7940MiB / 8112MiB | 3% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 107... Off | 00000000:02:00.0 Off | N/A |
| 0% 44C P2 41W / 180W | 181MiB / 8114MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
How can I resolve it?
Regards
As I read your notes, I found an interesting statement: "normalized by mean/std of all images in the dataset (train or test), not by its own only".
next I’ve implemented per channel normalization… And networks began works even worse. It was not clear for me why. .... After precise debugging, it becomes apparent that images should be normalized by mean/std of all images in the dataset(train or test), not by its own only.
Given a training dataset of 100 grayscale images of size 256x256: from your note, you mean that we find the mean and std over all 100 images and normalize a given image based on these values. Is that right? For implementation, it would look like:
mean_all = np.mean([image1, image2, ..., image100])
std_all = np.std([image1, image2, ..., image100])
image1_normalize = (image1 - mean_all) / std_all
image2_normalize = (image2 - mean_all) / std_all
...
In my opinion, using the global mean (mean over all images in the dataset) may be sensitive to images with high illumination. Instead, I think that normalization based on the mean/std of the image itself would be better, like:
image1_normalize=(image1-image1.mean())/image1.std()
image2_normalize=(image2-image2.mean())/image2.std()
...
Have you tried the case above? One more thing I want to ask: if you use the global mean, do you need to recompute it for the test set?
Hi Ikhlestov,
Thanks for your TensorFlow version of DenseNet! I would like to ask whether these models are compatible with any available pre-trained weight files out there?
Thanks!
Roy
Hello,
Thanks for this implementation. I'm trying to follow along and don't understand a fine point.
In the paper they put the bottlenecks as part of the transition layers, whereas you placed them in each internal layer of each block.
I suspect that having them in the transition layers is the correct approach, since the point of the bottleneck is to reduce the size of the accrued feature maps due to concatenation. Within each internal layer we aren't accruing much, and I suspect that having the bottlenecks there actually increases the number of parameters, since the size of each feature map is less than 4*growth_rate.
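For reference, the paper's DenseNet-B composite layer uses a 1x1 bottleneck producing 4*growth_rate feature maps before a 3x3 convolution producing growth_rate maps; a rough TF 1.x sketch (illustrative, not the repo's code):

import tensorflow as tf

def bottleneck_composite(inputs, growth_rate, is_training):
    # BN-ReLU-1x1 conv down to 4 * growth_rate maps, then BN-ReLU-3x3 conv to growth_rate maps.
    x = tf.contrib.layers.batch_norm(inputs, scale=True, is_training=is_training,
                                     updates_collections=None)
    x = tf.nn.relu(x)
    x = tf.layers.conv2d(x, filters=4 * growth_rate, kernel_size=1,
                         padding='same', use_bias=False)
    x = tf.contrib.layers.batch_norm(x, scale=True, is_training=is_training,
                                     updates_collections=None)
    x = tf.nn.relu(x)
    x = tf.layers.conv2d(x, filters=growth_rate, kernel_size=3,
                         padding='same', use_bias=False)
    return x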
I am interested in the flowers dataset, which is a small dataset of a few thousand flower images spread across 5 labels: daisy, dandelion, roses, sunflowers, tulips. Could you write a Python data provider for the flowers dataset, in addition to CIFAR and SVHN? I want to use your code to train on it. Thanks
The script to download and separate into the train and validation folder is at https://github.com/tensorflow/models/blob/master/research/inception/inception/data/download_and_preprocess_flowers.sh
Hi, is there some way to get the softmax probability values from the saved models instead of the final class predictions?
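One generic approach (a sketch, not this repo's API; `logits` is a stand-in for whatever pre-softmax tensor the restored graph exposes) is to wrap the logits in tf.nn.softmax and fetch that tensor instead of the argmax:

import numpy as np
import tensorflow as tf

# Stand-in for the model's final pre-softmax output (hypothetical name/shape).
logits = tf.placeholder(tf.float32, shape=[None, 10])
probabilities = tf.nn.softmax(logits)

with tf.Session() as sess:
    fake_logits = np.random.randn(2, 10).astype(np.float32)
    print(sess.run(probabilities, feed_dict={logits: fake_logits}))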
Hi,
Thanks for providing the code.
I ran your code with SVHN, but it only worked when I set normalization to "by_chanels".
(This is more an FYI than an issue)
Best
Armin
Hi, we found that in dense_net.py batch norm uses the tf.contrib.layers.batch_norm API rather than tf.nn.fused_batch_norm. Is there any specific reason for using the contrib.layers.batch_norm API? In the current Intel MKL-DNN backend, contrib.layers has much more overhead compared with nn.fused_batch_norm.
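For what it's worth, tf.contrib.layers.batch_norm also accepts a `fused` flag in TF 1.x, which switches to the fused kernel without changing the rest of the call site; a minimal sketch:

import tensorflow as tf

# Sketch only: enable the fused batch-norm kernel via the `fused` argument.
inputs = tf.placeholder(tf.float32, shape=[None, 32, 32, 16])
is_training = tf.placeholder(tf.bool)
output = tf.contrib.layers.batch_norm(
    inputs, scale=True, is_training=is_training,
    updates_collections=None, fused=True)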
If I want to use the most recent tensorflow-gpu (note: not tensorflow), then I would pip install tensorflow-gpu, and tensorflow-gpu requires:
enum34>=1.1.6
six>=1.10.0
tensorflow-tensorboard<0.5.0,>=0.4.0rc1
numpy>=1.12.1
wheel>=0.26
protobuf>=3.3.0
werkzeug>=0.11.10
markdown>=2.6.8
html5lib==0.9999999
bleach==1.5.0
...
So in the requirements.txt file I think it would be better to write
protobuf>=3.3.0
numpy>=1.12.1
instead of
numpy==1.12.0
protobuf==3.1.0.post1
Also, I think it would be better to use >= instead of == for all entries in requirements.txt. Otherwise, when somebody runs pip install -r requirements.txt, they may end up installing older versions of the packages.
Hello sir, I am curious how to use your code with my own dataset.
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.90 Driver Version: 384.90 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K40m On | 00000000:04:00.0 Off | Off |
| N/A 68C P0 132W / 235W | 4267MiB / 12205MiB | 86% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla K40m On | 00000000:84:00.0 Off | 0 |
| N/A 42C P0 61W / 235W | 80MiB / 11439MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 143163 C python 4254MiB |
| 1 143163 C python 69MiB |
+-----------------------------------------------------------------------------+
On my server there are two GPUs, but when I run the code it seems that only one K40 is being used.
The normalization process of SVHN images has issues. I have changed it as follows, which yields much better results:
elif normalization_type == 'mean_0':
    pixel_depth = 255.0
    # full-size array of 255s so the subtraction broadcasts over the whole batch
    train_n = np.full(images.shape, pixel_depth, dtype=np.float32)
    images = ((images - train_n) / 2) / pixel_depth
Thank you for your effort, it is really helping me in my project. Illarion, I had a small doubt in the implementation part.
When applying l2 regularization, you have applied it to all the parameters, i.e. weights and biases. Is that advisable, given that l2 is mainly/mostly applied only to weights?
l2_loss = tf.add_n( [tf.nn.l2_loss(var) for var in tf.trainable_variables()])
Shouldn't we just add the tf.nn.l2_loss for weights only ?
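A minimal sketch of restricting the penalty to weights (variable names are illustrative, not the repo's) could filter bias variables out of tf.trainable_variables():

import tensorflow as tf

# Example variables standing in for a conv layer's parameters.
w = tf.get_variable("kernel", shape=[3, 3, 16, 12])
b = tf.get_variable("bias", shape=[12], initializer=tf.zeros_initializer())

# Keep only variables whose name does not contain "bias".
weight_vars = [var for var in tf.trainable_variables() if "bias" not in var.name]
l2_loss = tf.add_n([tf.nn.l2_loss(var) for var in weight_vars])  # includes only `w`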
Hi,
thank you for the implementation!
I can't find the update-statistics operation from the documentation:
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss)
https://www.tensorflow.org/api_docs/python/tf/contrib/layers/batch_norm
Hello,
I have a question regarding the reported test (error) results. Is it the mean cross entropy, or something else? Please explain these numbers: "6.67(7.00)".
Hi,
Thanks a lot for this "vision_networks"!
I have two GPUs in my machine, exactly the same type. The first one is also used to drive the monitors, and I found that it has less available RAM than the second one.
Found device 0 with properties:
name: GeForce GTX 1070 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 5.14GiB
Found device 1 with properties:
name: GeForce GTX 1070 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:02:00.0
totalMemory: 7.92GiB freeMemory: 7.70GiB
Two questions:
First, is there a way to use the two GPUs in parallel to speed up the training process? Here, it always uses the first GPU and the second one stays idle.
Second, is there a way to choose which GPU is used, if only one GPU is supported?
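For the second question, a common approach (not specific to this repo) is to restrict device visibility before TensorFlow initializes; a minimal sketch:

import os
# Must be set before TensorFlow creates a session; "1" makes only the second GPU
# visible, so all ops are placed on it.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"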
Regards
I remember there is a 3x3 max pooling layer after the initial convolution. Why was it removed?
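For reference, the paper uses the 3x3 max pooling only in the ImageNet-style stem; the CIFAR/SVHN models start with a single 3x3 convolution, which may be why it is absent here. A rough sketch of the ImageNet stem (illustrative, not the repo's code):

import tensorflow as tf

# 7x7 conv (stride 2) followed by 3x3 max pooling (stride 2), as in the ImageNet models.
images = tf.placeholder(tf.float32, shape=[None, 224, 224, 3])
growth_rate = 32
stem = tf.layers.conv2d(images, filters=2 * growth_rate, kernel_size=7,
                        strides=2, padding='same', use_bias=False)
stem = tf.layers.max_pooling2d(stem, pool_size=3, strides=2, padding='same')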
Hi ikhlestov,
you really did a great job. Thanks for sharing.
Can you provide your trained models?
Hi, I have found the same problem with GPU memory. Is there any memory-efficient TensorFlow implementation?
Thanks very much!
Hi Illarion,
Thanks for the beautiful code.
I have a question regarding the 'by_chanels' normalization on the CIFAR-10 test set. It seems that, when preprocessing the CIFAR-10 test set, you are computing the means of the test set instead of the training set, since the CifarDataSet objects compute the individual mean of each split (train/val/test). However, shouldn't the preprocessing statistics come only from the training set?
(Reference: http://cs231n.github.io/neural-networks-2/, 'common pitfalls' paragraph)
Correct me if I misread your code.
Again, thanks for your work; it is a great implementation of DenseNet.
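A minimal sketch of the convention referenced above, with placeholder arrays rather than the repo's data (statistics computed on the training split only and reused for the test split):

import numpy as np

# Random stand-ins for the actual CIFAR-10 splits.
train_images = np.random.rand(100, 32, 32, 3).astype(np.float32)
test_images = np.random.rand(20, 32, 32, 3).astype(np.float32)

channel_mean = train_images.mean(axis=(0, 1, 2))  # per-channel mean from train only
channel_std = train_images.std(axis=(0, 1, 2))

train_norm = (train_images - channel_mean) / channel_std
test_norm = (test_images - channel_mean) / channel_std  # same training statistics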
You mentioned that "For Cifar+ datasets image normalization was performed before augmentation. This may cause a little bit lower results than reported in paper."
However, I checked the image preprocessing parts of these two implementations, and I found that both of them apply normalization before augmentation, so there should be some other reason for the difference in performance.
The code does not execute beyond the data provider for SVHN. It works well for both C10 and C100.
Log File:
Prepare training data...
Traceback (most recent call last):
File "run_nn_pruning.py", line 154, in
data_provider = get_data_provider_by_name(args.dataset, train_params)
File "/home/gkrish19/TCAD/DenseNet/data_providers/utils.py", line 17, in get_data_provider_by_name
return SVHNDataProvider(**train_params)
File "/home/gkrish19/TCAD/DenseNet/data_providers/svhn.py", line 85, in init
images, labels = self.get_images_and_labels(part, one_hot)
File "/home/gkrish19/TCAD/DenseNet/data_providers/svhn.py", line 117, in get_images_and_labels
data = scipy.io.loadmat(filename)
File "/home/gkrish19/anaconda3/lib/python3.6/site-packages/scipy/io/matlab/mio.py", line 142, in loadmat
matfile_dict = MR.get_variables(variable_names)
File "/home/gkrish19/anaconda3/lib/python3.6/site-packages/scipy/io/matlab/mio5.py", line 292, in get_variables
res = self.read_var_array(hdr, process)
File "/home/gkrish19/anaconda3/lib/python3.6/site-packages/scipy/io/matlab/mio5.py", line 252, in read_var_array
return self._matrix_reader.array_from_header(header, process)
File "mio5_utils.pyx", line 675, in scipy.io.matlab.mio5_utils.VarReader5.array_from_header
File "mio5_utils.pyx", line 705, in scipy.io.matlab.mio5_utils.VarReader5.array_from_header
File "mio5_utils.pyx", line 778, in scipy.io.matlab.mio5_utils.VarReader5.read_real_complex
File "mio5_utils.pyx", line 450, in scipy.io.matlab.mio5_utils.VarReader5.read_numeric
File "mio5_utils.pyx", line 355, in scipy.io.matlab.mio5_utils.VarReader5.read_element
File "streams.pyx", line 195, in scipy.io.matlab.streams.ZlibInputStream.read_string
File "streams.pyx", line 188, in scipy.io.matlab.streams.ZlibInputStream.read_into
OSError: could not read bytes