
andreped avatar andreped commented on August 18, 2024

I am sorry if my question is trivial but I have trouble using this package with the tensorflow backend.

Hello, @bertrandchauveau! I had this issue when making this myself, so no worries :]

You can take a look at what is done in the tests here.

Basically, do this instead:

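import tensorflow as tf
import torchstain
import numpy as np

# `target` and `to_transform` are assumed to be H x W x 3 RGB NumPy arrays loaded beforehand.
# Move the channels first (C x H x W) and convert to a float32 TF tensor.
T = lambda x: tf.convert_to_tensor(np.moveaxis(x, -1, 0).astype("float32"))
t_to_transform = T(to_transform)

# Fit the normalizer on the reference image, then normalize the other image.
normalizer = torchstain.normalizers.MacenkoNormalizer(backend='tensorflow')
normalizer.fit(T(target))
result, _, _ = normalizer.normalize(I=t_to_transform, stains=True)

# Convert the result back to a NumPy array.
result = result.numpy().astype("float32")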

Could you try this first to see if it resolves your issue? I'm a bit occupied right now, but I could take another look tomorrow if you are still having issues.
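For reference, here is a small NumPy-only sketch of what the `T` conversion does to the array layout. The helper name `to_channels_first` and the toy image are made up for illustration; only `np.moveaxis` and the float32 cast come from the snippet in this comment.

```python
import numpy as np


def to_channels_first(image):
    """Move the channel axis of an H x W x C image to the front (C x H x W)."""
    return np.moveaxis(image, -1, 0).astype("float32")


# Toy 4 x 6 RGB image standing in for a real tile (hypothetical data).
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
chw = to_channels_first(rgb)

print(rgb.shape, "->", chw.shape, chw.dtype)
```

Wrapping the result in `tf.convert_to_tensor` then gives the tensor that is passed to the normalizer above.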

This will be better documented in the upcoming release, which includes some new and interesting stain normalization techniques and new backends (see here).

BTW: What is the status on the release, @carloalbertobarbano? Shall we aim to get it released by next week? I have a master student who would be interested in the new modified reinhard implementation.

from torchstain.

bertrandchauveau avatar bertrandchauveau commented on August 18, 2024

Thank you for your quick response!

Sadly the same problem occurs, i.e. crashes when running:

normalizer.fit(T(target))

The "T" conversion does the same thing as my own attempt at converting to a tf tensor.

andreped avatar andreped commented on August 18, 2024

Sadly the same problem occurs, i.e. crashes when running:

Hmm, well, what I described above is what we do in the unit test, so that should work. Could you show me the error log from the terminal?

Also, could you try downloading the test data that we used for the unit tests here and here, and running it through your code? I believe that should work. If it does, then the intensity range of your image after imread is wrong. You can inspect the intensity range by running print(np.unique(image)).
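A slightly more compact check than printing every unique value could look like the sketch below. It assumes the expected input range is 0 to 255, which is not stated explicitly in this thread; the helper name `describe_intensity_range` and the toy arrays are hypothetical.

```python
import numpy as np


def describe_intensity_range(image):
    """Summarize dtype and value range, and flag whether it looks like 0..255 data."""
    image = np.asarray(image)
    lo, hi = float(image.min()), float(image.max())
    # Heuristic only: values above 1 but within [0, 255] suggest 8-bit style data.
    looks_like_0_255 = lo >= 0.0 and 1.0 < hi <= 255.0
    return {
        "dtype": str(image.dtype),
        "min": lo,
        "max": hi,
        "looks_like_0_255": looks_like_0_255,
    }


# Toy examples: an 8-bit style image and the same data rescaled to [0, 1].
uint8_image = np.array([[0, 128, 255]], dtype=np.uint8)
unit_image = uint8_image.astype("float32") / 255.0

print(describe_intensity_range(uint8_image))
print(describe_intensity_range(unit_image))
```

If the second case matches your image, the data was probably rescaled somewhere between imread and the normalizer.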

Also, I noticed that you are a pathologist. If you just want to get a method working, I would recommend trying the command line tool fast-stain-normalization, which is based on torchstain. It enables you to normalize an entire folder without needing to code. Just provide arguments to a CLI and run it from the terminal. You can see how to use it here.

bertrandchauveau avatar bertrandchauveau commented on August 18, 2024

I had the same issue with the test images that you provided.

This is the error message from the terminal:

2023-02-26 16:20:02.217992: I tensorflow/stream_executor/cuda/cuda_blas.cc:1614] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2023-02-26 16:20:02.225717: I tensorflow/core/util/cuda_solvers.cc:179] Creating GpuSolver handles for stream 000001CE08BFD700
2023-02-26 16:20:03.039762: F tensorflow/core/util/cuda_solvers.cc:114] Check failed: cusolverDnCreate(&cusolver_dn_handle) == CUSOLVER_STATUS_SUCCESS Failed to create cuSolverDN instance.
[I 16:20:28.009 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports

I tried this kind of thing, based on what I saw on Stack Overflow, but the kernel still crashes:

gpu = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(device=gpu[0], enable=True)

As I understand it, TensorFlow tries to place the tensors on the GPU, but for whatever reason, it does not work (as you said, I'm a pathologist). For the record, I have an RTX 4090 in a Windows setup and I have not encountered similar issues when training deep learning models.

So by forcing Tensorflow to use the CPU with:

with tf.device('/CPU:0'):
    tf_normalizer.fit(T(target))
    result_tf, _, _ = tf_normalizer.normalize(I=t_to_transform, stains=True)

It works as intended.

Should it also work with the GPU?

andreped avatar andreped commented on August 18, 2024

I was unable to reproduce your issue. See gist.
As you can see from the gist, it works just fine with GPU, also for TF backend.

What you are observing is, I'm guessing, related to the TensorFloat-32 message you are seeing, which I have not seen before. This likely happens because you have a very new GPU, the 4090, which I would think might cause some issues.

First, I would try disabling TensorFloat-32 by adding this to the top of your script (after the tf import): tf.config.experimental.enable_tensor_float_32_execution(False)

If that does not fix the issue, try installing the nightly release of TF to see if this has been fixed recently:

pip uninstall tensorflow && pip install tf-nightly

bertrandchauveau avatar bertrandchauveau commented on August 18, 2024

Thank you for your response. Agree that it works nicely in colab.

On my local machine, I disabled TensorFloat-32 with:
tf.config.experimental.enable_tensor_float_32_execution(False)

But the kernel still crashes when fitting the normalizer.
Upgrading TensorFlow won’t be that simple, since I am currently running on native Windows and tf 2.10.0 was the last version that allowed this according to the tf documentation. Upgrading would require using WSL2, but I am not ready for that right now.

My initial idea (perhaps not a good one) for my project was to use torchstain to normalize images on the fly using a custom data generator, this to avoid the duplication of the dataset (normalized and non-normalized).

For now, I will duplicate my dataset, as relying on the CPU for normalization slows down batch preparation considerably. I’ll give it another try when I’m ready to upgrade TensorFlow, or will try with PyTorch, which seems less windows-phobic.

carloalbertobarbano avatar carloalbertobarbano commented on August 18, 2024

Hi @bertrandchauveau, what version of CUDA and cuDNN are you using?
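One way to collect this information from Python is sketched below. The helper name `collect_versions` is made up for illustration. Note that `tf.sysconfig.get_build_info()` reports the CUDA and cuDNN versions TensorFlow was built against, which may differ from what is actually installed in the environment, so treat the output as a starting point only.

```python
import platform


def collect_versions():
    """Gather Python, OS and (if installed) TensorFlow build information."""
    report = {"python": platform.python_version(), "os": platform.system()}
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        report["tensorflow"] = None
        return report

    report["tensorflow"] = tf.__version__
    build = tf.sysconfig.get_build_info()
    # These keys may be absent on CPU-only builds, hence .get().
    report["cuda_build"] = build.get("cuda_version")
    report["cudnn_build"] = build.get("cudnn_version")
    report["gpus"] = [d.name for d in tf.config.list_physical_devices("GPU")]
    return report


report = collect_versions()
for key, value in report.items():
    print(f"{key}: {value}")
```

With conda, the installed cudatoolkit and cudnn packages can additionally be checked with `conda list`.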

andreped avatar andreped commented on August 18, 2024

My initial idea (perhaps not a good one) for my project was to use torchstain to normalize images on the fly using a custom data generator, this to avoid the duplication of the dataset (normalized and non-normalized).

That's exactly what I do in my training frameworks and that works just fine. As long as you are using tf.data.Dataset and take advantage of multithreading, there is barely any lag :] But I guess it depends on how much lag you expect and can tolerate, how large the images are, which CPU and SSD/HDD you have, and whatnot.
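As a rough illustration of that setup, here is a minimal sketch assuming TensorFlow 2.4 or later. `normalize_fn` is a hypothetical stand-in (a plain rescale), not torchstain's normalizer, and the random images are toy data. In a real pipeline you would call your fitted normalizer there, possibly wrapped in `tf.py_function` if it needs eager execution.

```python
import numpy as np
import tensorflow as tf


def normalize_fn(image):
    """Hypothetical stand-in for stain normalization: rescale to [0, 1]."""
    return tf.cast(image, tf.float32) / 255.0


def make_dataset(images, batch_size=2):
    """Build a pipeline that normalizes images on the fly, in parallel."""
    ds = tf.data.Dataset.from_tensor_slices(images)
    # Let tf.data decide how many elements to process in parallel.
    ds = ds.map(normalize_fn, num_parallel_calls=tf.data.AUTOTUNE)
    # Prefetch so the next batch is prepared while the current one is consumed.
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)


# Four toy 8 x 8 RGB tiles (random data, for illustration only).
images = np.random.randint(0, 256, size=(4, 8, 8, 3), dtype=np.uint8)
batches = list(make_dataset(images).as_numpy_iterator())

print(len(batches), batches[0].shape, batches[0].dtype)
```

The point of the design is that the dataset on disk stays untouched, and normalization overlaps with training through parallel mapping and prefetching.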

I don't really work on Windows for training models anymore. Note that multithreading does not work as well on Windows as it does on UNIX-based systems.

Hi @bertrandchauveau, what version of CUDA and cuDNN are you using?

I guess as you seem to be using anaconda, you have installed CUDA through something like this. As I said, I don't have that much experience with conda, as I don't use it myself, but I guess @carloalbertobarbano can help you on that.

bertrandchauveau avatar bertrandchauveau commented on August 18, 2024

Hi @carloalbertobarbano,
cudatoolkit 11.2.2
cudnn 8.1.0.77
Exactly, installed via conda

andreped avatar andreped commented on August 18, 2024

@bertrandchauveau Are you still experiencing issues?

bertrandchauveau avatar bertrandchauveau commented on August 18, 2024

Hi,
Thank you for your message and sorry for my late reply. Since my last message:

  • I installed torchstain 1.3.0.
  • The kernel still crashes when using the Macenko approach, in fact now already when calling:
    torchstain.normalizers.MacenkoNormalizer(backend='tensorflow')
    Same error message as before.
  • Using the modified Reinhard method on a single image, it sometimes worked with the GPU and sometimes crashed. I did not have time to explore this further.

It works when I force torchstain to work on the CPU. With tf.data.Dataset, it is true that there is not much lag during pure training (about +10% for me as compared to no stain normalization) but the validation step after each training epoch is much longer.

  • As you suggested, I tried to install the latest tf 2.12 on WSL, but have failed so far, with seemingly endless error messages just to get tf to work and recognize the GPU...

I should have a bit more time this week to see why sometimes it seems to work with the modified Reinhard method.

andreped avatar andreped commented on August 18, 2024

As you suggested, I tried to install the latest tf 2.12 on WSL, but have failed so far, with seemingly endless error messages just to get tf to work and recognize the GPU...

AFAIK, there is not yet a precompiled binary of tf 2.12 for Windows, so I believe that might result in some issues. But if you are using WSL it should work better. You could post the error messages you are getting and I could try to debug it for you. Note that I believe you need a nightly release, as the GPU you have might be too new, as discussed above.

I should have a bit more time this week to see why sometimes it seems to work with the modified Reinhard method.

Why it sometimes works and sometimes fails does not make much sense to me. Have you tried not using Anaconda and just using regular Python virtual environments? You will need to set up CUDA yourself then.
