Comments (2)
We currently have no plans to expand our ReLU-activated model line-up beyond ReluLLaMA, ReluFalcon, ProSparse, and Bamboo, because converting a model to ReLU activation requires significant tuning effort.
That said, although we will not support sparse activation for Qwen1.5, we are developing “hot expert offloading” for the Qwen MoE model based on Qwen1.5, which needs no further fine-tuning. We plan to roll out this expert-level offloading so that PowerInfer can be used in MoE scenarios, and you may be interested in the speedup that smarter GPU offloading brings on this model.
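The comment above does not describe how “hot expert offloading” works, so the following is only a rough sketch of one plausible reading: count how often each expert is routed to, then keep the most frequently used experts on the GPU until a memory budget is exhausted. Every name here (`pick_hot_experts`, `routing_trace`, `expert_bytes`, `gpu_budget_bytes`) is hypothetical and is not part of PowerInfer's API.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def pick_hot_experts(
    routing_trace: Iterable[List[int]],
    expert_bytes: Dict[int, int],
    gpu_budget_bytes: int,
) -> Set[int]:
    """Greedily choose the most frequently routed experts that fit on the GPU.

    Assumption: activation frequency is a good proxy for "hotness".
    """
    # Count how many tokens were routed to each expert.
    counts = Counter(e for token_experts in routing_trace for e in token_experts)
    hot: Set[int] = set()
    used = 0
    # Most-activated experts first; ties broken by expert id for determinism.
    for expert_id, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        size = expert_bytes[expert_id]
        if used + size <= gpu_budget_bytes:
            hot.add(expert_id)
            used += size
    return hot


# Hypothetical trace: each inner list holds the experts chosen for one token.
trace = [[0, 3], [0, 1], [0, 3], [2, 3]]
sizes = {0: 100, 1: 100, 2: 100, 3: 100}
hot = pick_hot_experts(trace, sizes, gpu_budget_bytes=200)
```

In this sketch, experts left out of the returned set would stay in CPU memory. A real implementation would also have to decide when to refresh the placement as routing patterns change.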
from powerinfer.
Thanks for the reply!
Related Issues (20)
- Will we have instruct fine-tuned model support in the future? HOT 1
- Clarification on Output Neuron Pruning Method in "Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time" HOT 2
- Segmentation fault (core dumped) in ggml test
- two questions that i want to solve HOT 2
- How to assign the specified CUDA_VISIBLE_DEVICE?
- invalid device symbol
- Where is the definition or addition location of GGML_USE_HYBRID_THREADING? HOT 2
- convert.py: error: the following arguments are required: mlp_model HOT 4
- Unable to generate constant output HOT 2
- The code about the figures in paper HOT 1
- CUDA cannot be found on an A100-80G HOT 2
- Are there any plans to support LLama 3 70B?
- Question about abnormal results measured on an A100 GPU HOT 1
- Why AXPY? HOT 2
- Will this work with Falcon 2?
- Need quite a long time to load the model
- ReluFalcon 40B produces invalid output on llama.cpp HOT 2
- Error during inference
- ggml-cuda.cu:8949: invalid argument error HOT 2