
About the step_loss == nan (svd_xtend issue, CLOSED)

maobenz commented on September 6, 2024
About the step_loss == nan

from svd_xtend.

Comments (7)

pixeli99 commented on September 6, 2024

Could you please provide some more details, such as your specific settings, device information, and so on?


maobenz commented on September 6, 2024

Thanks a lot!

I tried different resolutions of the BDD images, but step_loss is nan in every case. I use just one video clip from BDD, split into frames that are fed into the model. I have tried both an RTX 3090 and an A100.

With the fp32 model the step loss is not nan, but with the fp16 model the loss is still nan. In the last block of upsample_block, the result of query @ key.transpose(-1, -2) becomes too large and turns into nan.

My model id is "stabilityai/stable-video-diffusion-img2vid-xt", but other model ids do not work either.
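
A minimal sketch of the failure mode described above, assuming the nan comes from attention scores that exceed the float16 range (largest finite value 65504). It emulates float16 storage with the standard-library `struct` module instead of running the real model, so `to_fp16`, `fp16_scores`, and the toy input values are illustrative assumptions, not code from svd_xtend or diffusers.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest float16 value; out-of-range values become +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the float16 range
        return math.copysign(math.inf, x)


def fp16_scores(query, keys):
    """Emulate query @ key.T where each resulting score is stored in float16."""
    return [to_fp16(sum(q * k for q, k in zip(query, key))) for key in keys]


def softmax(scores):
    """Max-subtracted softmax; one inf score is enough to produce nan."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # inf - inf evaluates to nan
    total = sum(exps)
    return [e / total for e in exps]


dim = 64
small = fp16_scores([30.0] * dim, [[30.0] * dim, [1.0] * dim])
large = fp16_scores([300.0] * dim, [[300.0] * dim, [1.0] * dim])

print(small, softmax(small))  # finite scores, valid probabilities
print(large, softmax(large))  # first score overflows to inf, softmax is all nan
```

If this is what happens in the real model, the loss becomes nan regardless of the training data, which would be consistent with the report that fp32 trains normally.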


maobenz commented on September 6, 2024

My torch version is 1.13.1+cu116, and my diffusers version is 0.25.0. Even if I feed all zeros as input, the loss is still nan.


maobenz commented on September 6, 2024

OK, I have found the issue: the torch version should be 2.0.1 rather than 1.13.1. After I changed the PyTorch version, the problem was solved.
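
Since the fix reported here depends on the installed PyTorch version, a small environment check can help when comparing setups. This is a sketch using only the standard library; `installed_version` and `release_tuple` are hypothetical helper names, and the 2.0.1 threshold is simply the version this comment reports as working, not a documented requirement.

```python
import itertools
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def release_tuple(version: str):
    """Turn '1.13.1+cu116' into (1, 13, 1), ignoring the local build tag."""
    public = version.split("+", 1)[0]
    numbers = []
    for piece in public.split(".")[:3]:
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    return tuple(numbers)


REPORTED_WORKING = (2, 0, 1)  # assumption: taken from the comment above

for name in ("torch", "diffusers"):
    found = installed_version(name)
    if found is None:
        print(f"{name}: not installed")
    elif name == "torch" and release_tuple(found) < REPORTED_WORKING:
        print(f"{name}: {found} (older than the version reported to work)")
    else:
        print(f"{name}: {found}")
```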


pixeli99 commented on September 6, 2024

Ah, I see. To be honest, I may not be able to explain why changing the PyTorch version would cause this issue. 😢


xiliu8006 commented on September 6, 2024

I upgraded PyTorch to 2.1.2 but still have this problem; I can only train in bf16. Any solutions?
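
One plausible reason bf16 trains where fp16 does not is range: bfloat16 keeps the 8 exponent bits of float32, while float16 has only 5. The sketch below emulates both formats with the standard-library `struct` module. `to_fp16` and `to_bf16` are illustrative helpers, the bf16 emulation truncates instead of rounding, and the score value is a made-up example, so treat it as an approximation of the number formats and not as a trace of the actual model.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest float16 value; out-of-range values become +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def to_bf16(x: float) -> float:
    """Emulate bfloat16 by keeping the top 16 bits of the float32 encoding.

    This truncates the mantissa (real hardware usually rounds), which is
    enough to show the difference in range.
    """
    top_bits = struct.pack(">f", x)[:2]
    return struct.unpack(">f", top_bits + b"\x00\x00")[0]


score = 64 * 300.0 * 300.0  # hypothetical attention score of 5.76e6

print(to_fp16(score))  # inf: beyond the float16 maximum of 65504
print(to_bf16(score))  # finite, with reduced precision
```

The trade-off is precision: bf16 keeps roughly 2 to 3 significant decimal digits, so it avoids the overflow at the cost of coarser values.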


Sibo2rr commented on September 6, 2024

xiliu8006 wrote: "I upgraded PyTorch to 2.1.2 but still have this problem; I can only train in bf16. Any solutions?"

Hi, did you find a solution? I have a similar problem: my loss is also nan.

