
mbzuai-oryx / mobillama

538.0 12.0 38.0 3.81 MB

MobiLlama: Small Language Model tailored for edge devices

Home Page: https://github.com/mbzuai-oryx/MobiLlama

License: Apache License 2.0

Python 99.74% Shell 0.26%
efficient-llm llm slm mobile-llm tiny-llm

mobillama's People

Contributors

ashmalvayani, omkarthawakar


mobillama's Issues

Android App load local model

Could you modify the home activity to load the local model instead? The network speed for downloading the model is quite slow. Thank you for sharing this amazing AI project and Android app.

Extending context size

This is more a question than an issue.
Can the context size of the model be extended?
I am asking because I would like to try fine-tuning it on longer contexts to see how far the context size can be pushed with constrained resources (an RTX 4090).

Android running problems

Great project! After installing the APK on my Android device, I was able to run the 05B model, enter questions, and get responses.

But I have some questions:

  1. The output often seems unrelated to the question asked.
  2. Can the code for loading local models in the APK be open-sourced?
  3. Does the model support fine-tuning so that it can handle specific business scenarios?

MobiLlama-V code and ckpts

Thanks for your great work!
Will you release the code and checkpoints of MobiLlama-V?
This work is very interesting and I hope to develop my work using MobiLlama-V. Thank you so much!

cannot reproduce siqa numbers

Hello @OmkarThawakar, I used the LLM360 Analysis repo to run the eval for the siqa task:

python Analysis360/eval/harness/main.py --device cuda:0 --model=hf-causal-experimental --batch_size=auto:1 --model_args="pretrained=MBZUAI/MobiLlama-05B,trust_remote_code=True,dtype=bfloat16" --tasks=social_iqa --num_fewshot=0 --output_path=Analysis360-MobiLlama-05B.json

It only gives 0.3327, which is close to random chance, since there are only three choices.

Tasks       Version  Filter  n-shot  Metric  Value   Stderr
social_iqa  0        none    0       acc     0.3327  ± 0.0107
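
As a rough check of the "close to random" reading, the reported accuracy can be compared with the chance level for a three-way multiple-choice task. This is only a minimal sketch: the helper name `chance_z_score` is hypothetical, and the accuracy and standard error are simply the values from the table above.

```python
def chance_z_score(acc: float, stderr: float, num_choices: int) -> float:
    """Distance of an accuracy from uniform-guess chance, in units of stderr."""
    chance = 1.0 / num_choices  # uniform guessing over the answer options
    return (acc - chance) / stderr


# Values reported for social_iqa above; three answer choices per question.
reported_acc = 0.3327
reported_stderr = 0.0107
z = chance_z_score(reported_acc, reported_stderr, num_choices=3)

print(f"chance level: {1 / 3:.4f}")
print(f"z-score vs. chance: {z:.2f}")
```

The reported accuracy sits well within one standard error of 1/3, so this run cannot be distinguished from uniform guessing.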

Could you share how you ran the siqa evaluation? Thanks

Question on MobiLlama-V

Thanks for your great work! In the Multimodal MobiLlama part of the Results section, you briefly describe how you developed MobiLlama-V. The model seems to have a LLaVA-like architecture but is trained only on the visual instruction tuning data, which may be why MobiLlama-V exhibits mediocre performance. Hence, my questions are the following:

  1. Can you release more details about the architecture and training process of MobiLlama-V?
  2. Did/Will you perform two-stage training instead of only the second stage?
  3. Have you considered using ALLaVA-4V, a high-quality multimodal dataset for vision-language training? This dataset was proposed to improve the performance of small VLMs.

Thanks!
