Comments (10)

simon-mo commented on August 25, 2024 4

Hi @sasha0552,

Thank you for bringing this up. For now, would you mind maintaining this in your fork? There are a few reasons that we are hesitant to include support for Pascal:

  • Aside from Triton, we are continuously relying on Cutlass, FlashAttention, and FlashInfer, which all seem to have dropped Pascal.
  • It is sufficiently easy to build from source in vLLM with Pascal support.
  • As we add more features and performance optimizations, we are afraid we can no longer test and maintain the support for Pascal due to added complexity.

from vllm.

sasha0552 commented on August 25, 2024 3

Hello. Can #4409 be included in any of the next releases? Or at least, can I get an explanation as to why it can't be included (maybe I can help in some way?).

Once the wheel size limit increase was approved, and #6394 was merged, wheel size should not be an issue.

I am currently waiting for PyPI staff to approve the wheel size limit increase request to publish the patched triton to PyPI (pypi/support#4295).

It would be nice to see support for Pascal GPUs in vLLM. Many people use them because they are cheap.

WoosukKwon commented on August 25, 2024 1

July 23rd is Tuesday. Do you mean July 24th?

AlphaINF commented on August 25, 2024 1

hello, when will v0.6.0 release? I'm looking forward to #5036 and MiniCPM-Llama3-V-2_5

simon-mo commented on August 25, 2024

v0.5.2 has been released: https://github.com/vllm-project/vllm/releases/tag/v0.5.2

AlphaINF commented on August 25, 2024

Can this PR be added in v0.5.3?
#5036

simon-mo commented on August 25, 2024

@AlphaINF Unlikely, given the current state of the PR (still being reviewed), but I'm very much looking forward to this PR as well!

AlphaINF commented on August 25, 2024

@simon-mo thanks!

bohr commented on August 25, 2024

@simon-mo for this "async scheduling to overlap scheduling" do we have a plan?

vrdn-23 commented on August 25, 2024

Would it be possible to get #6594 merged in before the next release is due? @joerunde @Yard1
