GPU Memory of A100 (minigpt-4, closed)

vision-cair commented on August 21, 2024
GPU Memory of A100

Comments (5)

TsuTikgiau commented on August 21, 2024

Thanks for your interest! We are using an 80GB A100. I'm not sure whether you mean the training stage or inference. If you mean training, a simple way to avoid OOM is to use a smaller batch size; you can set it in the training config files under the train_config/ folder. Loading Vicuna in 8-bit may also reduce memory usage, but we haven't tested that in the training stage yet. If you mean inference, the current inference runs on a single card and doesn't support model parallelism across multiple GPUs yet, so adding GPUs will not help with OOM. Some methods discussed in this issue can reduce inference memory usage dramatically. We are currently working on an official solution that runs on a 24GB GPU and will get back to you once it is finished.

XXXKAY commented on August 21, 2024

How much GPU memory do I need if I use it for inference?

Unrealluver commented on August 21, 2024

> Thanks for your interest! We are using 80G. [...] We are currently working on an official solution to make it run in a 24G memory GPU.

Thanks for your reply. I'm using 8×A100 (40GB) for training Vicuna and I'm running into a GPU OOM problem.

TsuTikgiau commented on August 21, 2024

@XXXKAY We updated the default hyperparameters for inference, and the demo now loads Vicuna in 8-bit by default when you launch it. Under this setting, the memory cost is about 23GB. You can check the updated README for more information.
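For reference, here is a minimal sketch of what 8-bit loading looks like with Hugging Face transformers plus bitsandbytes; the checkpoint path is a placeholder, and the actual MiniGPT-4 demo may wire this up differently:

```python
# Hedged sketch: load a Vicuna-style LLM in 8-bit to roughly halve inference
# memory. Requires the `bitsandbytes` and `accelerate` packages; the path
# below is a placeholder, not the official MiniGPT-4 weight location.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/path/to/vicuna-13b-weights"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_8bit=True,   # quantize linear layers to int8 at load time
    device_map="auto",   # let accelerate place layers on the available GPU
)
```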

TsuTikgiau commented on August 21, 2024

@Unrealluver In your case I think you can reduce the per-GPU batch size in minigpt4_stage1_pretrain.yaml. The default is 64, which costs about 70+ GB per GPU.
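As a rough illustration (the exact key names are assumed here and may differ in the repo version you have checked out), the change would look something like this in minigpt4_stage1_pretrain.yaml:

```yaml
# Excerpt sketch of minigpt4_stage1_pretrain.yaml; key names are assumed.
run:
  batch_size_train: 16   # default 64 costs ~70+ GB per GPU; lower it to fit 40GB cards
```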
