
Comments (4)

imlixinyang commented on June 2, 2024

I don't know the actual reason for your situation, but in my view several things could be causing the problem:

  1. data_prefetcher in https://github.com/imlixinyang/HiSD/blob/main/core/utils.py, which speeds up the data loader.
  2. cudnn.benchmark in https://github.com/imlixinyang/HiSD/blob/main/core/train.py.
  3. In each iteration, HiSD chooses a different set of modules, unlike previous single-path frameworks.
  4. The latent code may not be buffered in the same memory, which can be improved with register_buffer.
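Points 2 and 4 can be illustrated with a minimal sketch. This is not code from the HiSD repository: the `LatentHolder` class, its `latent_dim` parameter, and the buffer name `latent` are hypothetical names chosen for illustration. It assumes PyTorch is installed and only shows how `register_buffer` ties a tensor to a module, and how `cudnn.benchmark` can be switched off while debugging.

```python
import torch
from torch import nn

# Point 2: turn off cuDNN autotuning while investigating memory behaviour.
# (Assumption: the training script sets this flag to True.)
torch.backends.cudnn.benchmark = False


class LatentHolder(nn.Module):
    """Hypothetical module that keeps a latent code as a registered buffer."""

    def __init__(self, latent_dim: int = 8):
        super().__init__()
        # A buffer is not a trainable parameter, but it is saved in
        # state_dict() and moved together with the module by .to()/.cuda().
        self.register_buffer("latent", torch.zeros(1, latent_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The buffer already lives on the module's device, so no per-call
        # host-to-device copy is needed here.
        return x + self.latent


model = LatentHolder()
out = model(torch.ones(4, 8))
print("latent" in model.state_dict(), len(list(model.parameters())), tuple(out.shape))
```

Whether this changes anything for the memory problem reported here is not confirmed; it only shows the mechanism that point 4 refers to.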

32GB of memory (more than 2x1080Ti) is enough for the config file (celeba-hq256.yaml), so I am surprised to hear that it raises an OUT OF MEMORY error. I hope you reproduce the results successfully soon, and you are welcome to share more information or solutions here. I will try my best to help you.

from hisd.

PrototypeNx commented on June 2, 2024

Thank you for such a quick reply. I will check the points you mentioned.
Sorry, I may not have expressed it clearly: the training failure is more likely due to virtual memory than to CUDA memory.
I use an RTX 3090 with 24GB of CUDA memory and 32GB of RAM. The config is the default celebA-HQ.yaml. Virtual memory usage keeps rising before training starts and reaches 64GB, causing an overflow, while CUDA memory and RAM usage stay very low. So I had to set batch_size to 4 for training, at which point virtual memory usage is about 40GB. If I add a few more training attributes, the virtual memory unfortunately overflows again.


PrototypeNx commented on June 2, 2024

I switched to an Ubuntu system with the same configuration, and there were no problems during training. I think the problem above comes from the different virtual memory allocation mechanisms of Linux and Windows. Thank you again for your help!
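One platform difference that may be related, though it is an assumption and not something confirmed in this thread, is how Python starts worker processes. Windows uses the `spawn` start method, where every worker is a fresh interpreter that imports its libraries again, while Linux typically uses `fork` (or `forkserver` on newer Python versions). The sketch below uses only the standard library to report the default on the current machine; `default_start_method` is a hypothetical helper name.

```python
import multiprocessing as mp
import sys


def default_start_method() -> str:
    """Return the start method multiprocessing uses by default on this OS."""
    return mp.get_start_method()


if __name__ == "__main__":
    # Typically "spawn" on Windows and "fork" or "forkserver" on Linux.
    print(sys.platform, default_start_method())
```

If data loader worker processes turn out to be involved, reducing the number of workers on Windows would be one thing to try before switching operating systems.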


imlixinyang commented on June 2, 2024

Glad to hear that, and you're always welcome to ask if there are any further problems.

