Comments (3)

dpodvyaznikov commented on August 28, 2024

Hi, @DrAnoopKulkarni !

The workaround you suggested can give an invalid validation score estimate if you use shuffling or otherwise change the composition of a batch or the order of items in it.
Here is why: on each iteration of run you pass one batch to the self.fit method, and the last 20% of items in that batch are used for validation. But if you use, e.g., run(..., shuffle=True), those items may have been at different positions in previous batches and so ended up in the train subset of self.fit.
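To make the leak concrete, here is a minimal sketch in plain Python. It does not use cardio, batchflow or Keras; the function name leaked_fraction and all its parameters are made up for illustration. It only mimics "take the tail of every batch as validation" and counts how many validation items were also seen in a train part.

```python
import random

def leaked_fraction(n_items=100, batch_size=20, n_epochs=5,
                    val_share=0.2, shuffle=True, seed=0):
    """Share of 'validation' items that also appeared in a train part.

    Hypothetical helper: it only simulates item positions in batches,
    no model is trained.
    """
    rng = random.Random(seed)
    items = list(range(n_items))
    n_val = max(1, round(batch_size * val_share))
    seen_in_train, seen_in_val = set(), set()
    for _ in range(n_epochs):
        if shuffle:
            rng.shuffle(items)  # mimics run(..., shuffle=True)
        for start in range(0, n_items, batch_size):
            batch = items[start:start + batch_size]
            # a per-batch validation split takes the tail of each batch
            seen_in_train.update(batch[:-n_val])
            seen_in_val.update(batch[-n_val:])
    return len(seen_in_train & seen_in_val) / len(seen_in_val)

print("no shuffle:", leaked_fraction(shuffle=False))
print("shuffle:   ", leaked_fraction(shuffle=True))
```

Without shuffling the batches are identical on every epoch, so the two sets never overlap. With shuffling, an item that sits in the tail of one batch can sit in the head of another batch on a later epoch, which is exactly the situation described above.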

There are two ways to track validation score without changing the library's code.

  1. Use the batchflow.research module. It lets you estimate a model's performance during training using additional pipelines, conveniently train models with different hyperparameters, and train models several times to evaluate how stable the results are. See the batchflow.research tutorial.
  2. Define train and validation pipelines and call run for each of them in a loop, e.g.:
train = ...
validation = ...
for _ in range(K):
    train.run(..., n_epochs=N)
    validation.run(...)
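The same loop can be sketched end to end without any library, assuming a toy one-parameter linear model in place of the real pipelines. Every name here (make_data, train_epoch, validate, fit_with_validation) is hypothetical and stands in for the train and validation pipelines; the point is only that the split happens once, so shuffling inside training cannot move validation items into the train set.

```python
import random

def make_data(n=200, seed=0):
    """Toy data: y = 3x plus a little Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3.0 * x + rng.gauss(0, 0.1)) for x in xs]

def train_epoch(w, data, lr, batch_size, rng):
    """One shuffled pass of mini-batch gradient descent on squared error."""
    data = data[:]
    rng.shuffle(data)  # shuffling only touches the train subset
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= lr * grad
    return w

def validate(w, data):
    """Mean squared error on the held-out subset; no weight updates."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def fit_with_validation(K=5, N=2, seed=0):
    rng = random.Random(seed)
    data = make_data(seed=seed)
    split = int(0.8 * len(data))
    train, validation = data[:split], data[split:]  # split once, up front
    w, history = 0.0, []
    for _ in range(K):
        for _ in range(N):  # plays the role of train.run(..., n_epochs=N)
            w = train_epoch(w, train, lr=0.1, batch_size=20, rng=rng)
        history.append(validate(w, validation))  # plays the role of validation.run(...)
    return w, history

w, history = fit_with_validation()
print(w, history)
```

Here history holds one validation score per outer iteration, so it can be plotted or checked for overfitting after training.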

from cardio.

AnoopRKulkarni commented on August 28, 2024

UPDATE:

One of the workarounds I used was to pass the entire "dataset" to the pipeline and define a new method, "def train_with_validate()", inside the KerasModel class along the lines of "def train()", calling "self.fit(.., validation_split=0.2)" there instead of "self.train_on_batch()".

Or, should I simply use "test_on_batch()" in the pipeline after training?

thanks
~anoop

AnoopRKulkarni commented on August 28, 2024

Thank you @dpodvyaznikov

After a few runs I realized I wasn't getting the results I was looking for with my approach; thanks for explaining why.

I will take a detailed look at the research module, both its philosophy and its usage.

However, for now, given my limited requirements, I guess your second approach will work.

Thanks again for your thoughts. Appreciate them.

Best regards,
~anoop
