Comments (3)
Hi, @DrAnoopKulkarni!
The workaround you've suggested might produce an invalid validation score estimate if you use shuffling, or if you change the composition of a batch or the order of items in it in any way.
Here is why: on each iteration of run you pass one batch to the self.fit method, and the last 20% of the items in the batch are used for validation. But if you use, e.g., run(..., shuffle=True), those items might have been at different positions in previous batches, and so may already have appeared in the training subset of self.fit.
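The leakage described above can be illustrated with a small simulation. This is only a sketch, not cardio or batchflow code: the function name and its parameters are hypothetical, and it assumes each batch is split so that its last 20% is held out, as in the workaround.

```python
import random


def leaked_validation_fraction(n_items=100, batch_size=20, n_epochs=3,
                               shuffle=True, seed=0):
    """Share of held-out items that were already trained on in an earlier batch.

    Hypothetical helper: it only mimics "last 20% of every batch is used
    for validation" and does not call any real library.
    """
    rng = random.Random(seed)
    indices = list(range(n_items))
    seen_in_train = set()
    leaked = total = 0
    for _ in range(n_epochs):
        if shuffle:
            rng.shuffle(indices)  # batch composition changes every epoch
        for start in range(0, n_items, batch_size):
            batch = indices[start:start + batch_size]
            split = int(len(batch) * 0.8)
            train_part, val_part = batch[:split], batch[split:]
            # Count validation items the model has already been fitted on
            leaked += sum(i in seen_in_train for i in val_part)
            total += len(val_part)
            seen_in_train.update(train_part)
    return leaked / total


if __name__ == "__main__":
    print("no shuffle:", leaked_validation_fraction(shuffle=False))
    print("shuffle:   ", leaked_validation_fraction(shuffle=True))
```

Without shuffling, the same items always land in the tail of their batch, so nothing leaks. With shuffling, a share of the "validation" items has already been trained on from the second epoch onward, which is why the score becomes unreliable.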
There are two ways to track the validation score without changing the library's code.
- Use the batchflow.research module. It allows you to estimate a model's performance during training using additional pipelines, to conveniently train models with different hyperparameters, and to train models several times to evaluate how stable the results are. See tutorial here.
- Define train and validation pipelines, and perform run for each of them in a loop, e.g.:
train = ...
validation = ...
for _ in range(K):
    train.run(..., n_epochs=N)
    validation.run(...)
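A runnable sketch of the same loop, under stated assumptions: StubPipeline, split_once and alternate are hypothetical stand-ins written for illustration and are not part of batchflow or cardio. The point is that the split is made once, up front, so the validation items never enter training no matter how the training batches are shuffled.

```python
import random


def split_once(n_items, val_share=0.2, seed=0):
    """Shuffle item indices once and cut them into fixed train/validation parts."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_val = int(n_items * val_share)
    return indices[n_val:], indices[:n_val]


class StubPipeline:
    """Hypothetical stand-in for a pipeline bound to a fixed set of items."""

    def __init__(self, name, indices, log):
        self.name, self.indices, self.log = name, list(indices), log

    def run(self, batch_size, n_epochs=1, shuffle=False, seed=0):
        rng = random.Random(seed)
        for _ in range(n_epochs):
            order = list(self.indices)
            if shuffle:
                rng.shuffle(order)
            for start in range(0, len(order), batch_size):
                batch = order[start:start + batch_size]
                # A real pipeline would train or evaluate the model here;
                # the stub only records which items it was given.
                self.log.append((self.name, tuple(batch)))


def alternate(n_items=50, batch_size=10, n_rounds=3, n_epochs=2):
    """Train for n_epochs, then validate, repeated n_rounds times."""
    train_idx, val_idx = split_once(n_items)
    log = []
    train = StubPipeline("train", train_idx, log)
    validation = StubPipeline("validation", val_idx, log)
    for _ in range(n_rounds):
        train.run(batch_size, n_epochs=n_epochs, shuffle=True)
        validation.run(batch_size)
    return train_idx, val_idx, log
```

Because the two pipelines are bound to disjoint item sets, shuffling inside the training pipeline is safe, and the validation score after each round reflects only unseen items.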
UPDATE:
One workaround I used was to pass the entire "dataset" to the pipeline and define a new function, "def train_with_validate()", inside the KerasModel class along the lines of "def train()", using "self.fit(.., validation_split=0.2)" there instead of "self.train_on_batch()".
Or should I simply use "test_on_batch()" in the pipeline after training?
Thanks,
~anoop
Thank you @dpodvyaznikov
After a few runs I realized I wasn't getting the results I was looking for with my approach, and thanks for explaining why that would be so.
I will take a detailed look at the research module, both its philosophy and its usage.
However, for now, given my limited requirements, I guess your second approach will work.
Thanks again for your thoughts. Appreciate them.
Best regards,
~anoop