mbzuai-oryx / mobillama
MobiLlama: Small Language Model tailored for edge devices
Home Page: https://github.com/mbzuai-oryx/MobiLlama
License: Apache License 2.0
Could you modify the home activity to load the local model instead? The network speed for downloading the model is quite slow. Thank you for sharing this amazing AI project and Android app.
This is more a question than an issue.
Can one extend the context size of the model?
I am asking because I would like to test fine-tuning it to a longer context, to see how far one can get in terms of context size with constrained resources (a single RTX 4090).
Great project! After installing the APK on my Android device, I was able to run the 0.5B model and enter questions to get responses.
But I have some questions:
When I run the test code, I get this error:
RuntimeError: FlashAttention only supports Ampere GPUs or newer.
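The error above indicates that the GPU predates NVIDIA's Ampere generation. A minimal sketch of a pre-flight check follows, assuming PyTorch is installed and that Ampere corresponds to CUDA compute capability 8.0 or higher. The helper names `supports_flash_attention` and `current_gpu_supports_flash_attention` are hypothetical and not part of MobiLlama or FlashAttention.

```python
def supports_flash_attention(capability):
    """Return True if a (major, minor) CUDA compute capability is Ampere or newer.

    Assumption: Ampere starts at compute capability 8.0.
    """
    major, _minor = capability
    return major >= 8


def current_gpu_supports_flash_attention():
    """Query the active GPU via PyTorch; return False if no CUDA device is usable."""
    try:
        import torch  # imported lazily so the pure check above works without PyTorch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    # get_device_capability() returns a (major, minor) tuple for the current device
    return supports_flash_attention(torch.cuda.get_device_capability())


if __name__ == "__main__":
    print(current_gpu_supports_flash_attention())
```

If the check returns False, one option is to load the model with a non-FlashAttention attention path, provided the model code offers one.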
Thanks for your great work!
Will you release the code and checkpoints of MobiLlama-V?
This work is very interesting and I hope to develop my work using MobiLlama-V. Thank you so much!
Hello @OmkarThawakar, I used the LLM360 Analysis360 repo to run the evaluation for the SIQA task:
python Analysis360/eval/harness/main.py --device cuda:0 --model=hf-causal-experimental --batch_size=auto:1 --model_args="pretrained=MBZUAI/MobiLlama-05B,trust_remote_code=True,dtype=bfloat16" --tasks=social_iqa --num_fewshot=0 --output_path=Analysis360-MobiLlama-05B.json
It only gives an accuracy of 0.3327, which is close to random guessing (about 0.333), since there are only three choices.
Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
---|---|---|---|---|---|---|---|
social_iqa | 0 | none | 0 | acc | 0.3327 | ± | 0.0107 |
Could you share how you ran the siqa evaluation? Thanks
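To make the "close to random" observation concrete, here is a small sketch comparing the reported accuracy with the chance level for a three-way multiple-choice task, using the standard error from the table. The function names are hypothetical, and treating the reported stderr as a normal-approximation standard error is an assumption.

```python
def chance_accuracy(num_choices):
    """Expected accuracy of uniform random guessing over num_choices options."""
    return 1.0 / num_choices


def z_score_vs_chance(acc, stderr, num_choices):
    """Number of standard errors by which acc differs from chance."""
    return (acc - chance_accuracy(num_choices)) / stderr


if __name__ == "__main__":
    # Values taken from the results table above.
    z = z_score_vs_chance(acc=0.3327, stderr=0.0107, num_choices=3)
    print(f"chance = {chance_accuracy(3):.4f}, z = {z:.2f}")
```

The reported score lies well within one standard error of chance, so the run is statistically indistinguishable from random guessing.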
Thanks for your great work! In the Multimodal MobiLlama part of the Results section, you briefly describe how you developed MobiLlama-V. The model seems to have a LLaVA-like architecture but is trained only on the visual instruction tuning data, which may explain why MobiLlama-V exhibits mediocre performance. Hence, my questions are the following:
Thanks!