Comments (10)

kwotsin commented on August 16, 2024

If you look at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/training/python/training/training.py#L459, this is where the create_train_op function came from. The control flow ops there make the returned op depend on the gradient updates and then pass the loss through, so running train_op should return the loss.
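
The idea can be sketched in isolation. This is a minimal sketch, not the code from training.py: it assumes a TensorFlow install that still provides the tensorflow.compat.v1 API, and the names weight and grad_updates and the toy squared loss are made up for illustration. Here tf.control_dependencies with tf.identity stands in for the control flow op at the linked line.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # use TF1-style graphs and sessions

graph = tf.Graph()
with graph.as_default():
    # Hypothetical one-parameter model: loss = weight ** 2.
    weight = tf.Variable(5.0, name="weight")
    total_loss = tf.square(weight, name="total_loss")

    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
    # minimize() returns a tf.Operation, which has no output value.
    grad_updates = optimizer.minimize(total_loss)

    # Wrap the loss so that fetching it also forces the update to run.
    with tf.control_dependencies([grad_updates]):
        train_op = tf.identity(total_loss, name="train_op")

    init_op = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init_op)
    # Fetching the wrapped tensor runs one update and returns the loss.
    loss_value = sess.run(train_op)
    # Fetching the bare Operation also runs an update but returns None.
    update_result = sess.run(grad_updates)

print("loss from train_op:", loss_value)
print("result of fetching the bare update op:", update_result)
```

Fetching the bare update op gives None, while fetching the wrapped tensor gives the loss computed in that step, which is why sess.run([train_op, ...]) can hand back a loss value.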

from tensorflow-xception.

lc82111 commented on August 16, 2024

Thanks. L459 makes sense.

lc82111 commented on August 16, 2024

The sess.run(sv.global_step) call exists in your code; see https://github.com/kwotsin/Tensorflow-Xception/blob/master/eval_flowers.py#L93

kwotsin commented on August 16, 2024

After checking against the paper, I think batch_norm was wrongly applied twice in the code. However, after comparing the model with batch_norm applied twice against the one with it applied once, I found some strange behaviour:

  1. The 2x batch norm model has a higher loss, but the accuracy for the flowers dataset is higher even on the validation data.

  2. The correct model has a lower loss but doesn't seem to predict the new classes very well, as it has lower accuracy on the validation data. I suspect this has to do with how long I trained it.

I have changed the code to remove the duplicated batch norm. Thank you for pointing this out!

Would it be possible for you to share the performance of these two models on your dataset? I think it would be interesting to know the effect of using batch_norm twice in a layer.

lc82111 commented on August 16, 2024

I'd be glad to share, but I'm still working on getting Xception to run on my dataset.

lc82111 commented on August 16, 2024

I suspect the following code can't return total_loss because the train_op is an Operation with no output value.

total_loss, global_step_count, _ = sess.run([train_op, global_step, metrics_op])

kwotsin commented on August 16, 2024

Could you post your error log? It is hard to identify the issue without seeing the error. For the datasets I've tested so far, this has given me the total loss, so it would be strange to see otherwise.

lc82111 commented on August 16, 2024

I don't understand why sess.run(sv.global_step) is needed in the following code. Could you explain the reason?

    #Now we are ready to run in one session
    with sv.managed_session() as sess:
        for step in xrange(1):
            sess.run(sv.global_step)
            #print vital information every start of the epoch as always
            if step % num_batches_per_epoch == 0:
                logging.info('Epoch: %s/%s', step / num_batches_per_epoch + 1, num_epochs)
                logging.info('Current Streaming Accuracy: %.4f', sess.run(accuracy))

kwotsin commented on August 16, 2024

From what I know, there shouldn't be a need to run sess.run(sv.global_step). Why did you think it needed to be included? It only gives you the global step number. If you intend to increment the global step, there is no need, as the training step already increments it internally by default when you train the model.
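
A minimal sketch of that behaviour, assuming a TensorFlow install that still provides the tensorflow.compat.v1 API. The toy variable, the squared loss, and the name train_step are hypothetical and not taken from the repository. Reading the global step leaves it unchanged, while each run of an optimizer step created with global_step=... increments it.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # use TF1-style graphs and sessions

graph = tf.Graph()
with graph.as_default():
    # Hypothetical one-parameter model: loss = weight ** 2.
    weight = tf.Variable(5.0, name="weight")
    loss = tf.square(weight)

    global_step = tf.train.get_or_create_global_step()
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
    # Passing global_step makes the optimizer add 1 to it on every update.
    train_step = optimizer.minimize(loss, global_step=global_step)

    init_op = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init_op)

    # Fetching the global step only reads it; nothing is incremented.
    reads = [int(sess.run(global_step)) for _ in range(3)]

    # Each training step bumps the counter by one.
    for _ in range(2):
        sess.run(train_step)
    step_after_training = int(sess.run(global_step))

print("global step after three reads:", reads)
print("global step after two training steps:", step_after_training)
```

In a script that never runs a training step, such as an evaluation loop, nothing advances the counter unless the script increments it explicitly.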

kwotsin commented on August 16, 2024

Thanks for pointing this out! I think I left it there while debugging to see whether the global step would increment without doing the training (and realized you must increment it on your own). I have fixed this now.
