Topic: llava (Goto Github)
Something interesting about llava
llava,RestAI is an AIaaS (AI as a Service) open-source platform built on top of LlamaIndex, Ollama, and HF Pipelines. It supports any public LLM supported by LlamaIndex and any local LLM supported by Ollama, with precise embeddings usage and tuning.
User: apocas
Home Page: https://apocas.github.io/restai/
llava,Docker image for LLaVA: Large Language and Vision Assistant
User: ashleykleynhans
llava,LLaVA: Large Language and Vision Assistant | RunPod Serverless Worker
User: ashleykleynhans
llava,From scratch implementation of a vision language model in pure PyTorch
User: avisoori1x
llava,MLX-VLM is a package for running Vision LLMs locally on your Mac using MLX.
User: blaizzy
llava,Give your computer an AI Brain
Organization: blib-la
Home Page: https://get-captain.com
llava,ChatGPT's explosive popularity marked a key step toward AGI. This project collects open-source alternatives to ChatGPT, including text LLMs and multimodal LLMs, for easy reference.
User: chenking2020
llava,A Python tool to evaluate the performance of VLMs in the medical domain.
User: corentin-ryr
llava,AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more
User: developersdigest
Home Page: https://developersdigest.tech
llava,FreeGenius AI, an advanced AI assistant that can talk and take multi-step actions. Supports numerous open-source LLMs via Llama.cpp or Ollama or Groq Cloud API, with optional integration with AutoGen agents, OpenAI API, Google Gemini Pro and unlimited plugins.
User: eliranwong
Home Page: https://letmedoit.ai
llava,SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild
User: fanghua-yu
Home Page: http://supir.xpixel.group/
llava,Deploy your very own ChatGPT-Style Web Interface for Ollama 🦙
Organization: fly-apps
Home Page: https://fly.io/docs/gpus/
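Several of the entries above serve llava through Ollama. As a minimal sketch of how a client talks to such a deployment, the helper below builds the JSON body for Ollama's `/api/generate` endpoint, which accepts base64-encoded images alongside the prompt for vision models; the model name and image bytes here are placeholder values, not taken from any specific repo above.

```python
import base64
import json

def build_llava_request(prompt: str, image_bytes: bytes, model: str = "llava") -> str:
    """Build the JSON body for Ollama's /api/generate endpoint.

    Vision models such as llava take a list of base64-encoded images
    in the "images" field alongside the text prompt.
    """
    payload = {
        "model": model,  # placeholder model tag
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }
    return json.dumps(payload)

# Example: fake bytes stand in for a real image file's contents.
body = build_llava_request("Describe this image.", b"\x89PNG fake bytes")
print(json.loads(body)["model"])  # llava
```

The body would then be POSTed to `http://localhost:11434/api/generate` on a running Ollama instance; the sketch stops at payload construction so it stays self-contained.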
llava,Chat with large languages models about the contents of an image via this native desktop client for Windows, macOS, and Linux.
User: fmxexpress
Home Page: https://www.fmxexpress.com/
llava,[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
User: fuxiaoliu
Home Page: https://fuxiaoliu.github.io/LRV/
llava,Famous Vision Language Models and Their Architectures
User: gokayfem
llava,Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation
User: gokayfem
llava,[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
User: haotian-liu
Home Page: https://llava.hliu.cc
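For context on how LLaVA-style models are prompted: LLaVA-1.5 checkpoints commonly use a single-turn conversation template in which an `<image>` placeholder marks where the vision encoder's patch embeddings are spliced into the token sequence. The sketch below assumes that template; exact formats vary between checkpoints.

```python
def build_llava_prompt(question: str) -> str:
    """Compose a single-turn prompt in the common LLaVA-1.5
    conversation format. The <image> token is replaced by image
    embeddings at inference time."""
    return f"USER: <image>\n{question} ASSISTANT:"

prompt = build_llava_prompt("What is shown in this image?")
print(prompt)
```

The model's reply is generated as a continuation after `ASSISTANT:`.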
llava,Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices.
User: herrera-luis
llava,Code for "How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation"
User: jameszhou-gl
Home Page: https://arxiv.org/pdf/2312.07424.pdf
llava,Tag manager and captioner for image datasets
User: jhc13
llava,LLaVA inference with multiple images at once for cross-image analysis.
User: mapluisch
llava,🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
Organization: mbzuai-oryx
llava,[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
Organization: mbzuai-oryx
Home Page: https://mbzuai-oryx.github.io/Video-ChatGPT
llava,A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷
Organization: modelscope
llava,ms-swift: Use PEFT or full-parameter training to fine-tune 250+ LLMs or 20+ MLLMs
Organization: modelscope
Home Page: https://github.com/modelscope/swift/blob/main/docs/source/LLM/index.md
llava,Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4V, Gemini, QwenVLPlus, 40+ HF models, and 20+ benchmarks
Organization: open-compass
Home Page: https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
llava,Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrained models and a diffusion model toolbox, with high performance and flexibility.
Organization: paddlepaddle
llava,Image Classification Testing with LLMs
User: robert-mcdermott
llava,Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥
Organization: roboflow
Home Page: https://maestro.roboflow.com
llava,Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"
Organization: salt-nlp
Home Page: https://llavar.github.io/
llava,A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
Organization: scisharp
Home Page: https://scisharp.github.io/LLamaSharp
llava,👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]
User: skalskip
llava,Embed arbitrary modalities (images, audio, documents, etc) into large language models.
User: sshh12
llava,🧘🏻♂️ KarmaVLM (相生): A family of high-efficiency, powerful visual language models.
User: thomas-yanxin
llava,[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Organization: tianyi-lab
llava,A Framework of Small-scale Large Multimodal Models
User: tinyllava
Home Page: https://arxiv.org/abs/2402.14289
llava,LLaVA server (llama.cpp).
User: trzy
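Recent llama.cpp server builds also expose an OpenAI-compatible `/v1/chat/completions` route that accepts images as base64 data URLs in `image_url` content parts. As a hedged sketch of what a client request to such a server looks like (field names follow the OpenAI chat schema; the model tag and image bytes are placeholders):

```python
import base64
import json

def build_chat_request(question: str, image_bytes: bytes, model: str = "llava") -> str:
    """Build an OpenAI-style chat completion body with an inline
    base64 data-URL image, as accepted by multimodal chat endpoints."""
    data_url = "data:image/png;base64," + base64.b64encode(image_bytes).decode("ascii")
    payload = {
        "model": model,  # placeholder model tag
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": data_url}},
                ],
            }
        ],
    }
    return json.dumps(payload)

body = build_chat_request("What is in this picture?", b"fake image bytes")
```

Unlike the Ollama-native format, images here travel inside the message content list rather than in a top-level `images` field.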
llava,This repository includes the official implementation of our paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
Organization: ucsc-vlaa
llava,Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Organization: unum-cloud
Home Page: https://unum-cloud.github.io/uform/
llava,Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".
User: victorwz
Home Page: https://mlm-filter.github.io/
llava,FreeVA: Offline MLLM as Training-Free Video Assistant
User: whwu95
llava,[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
Organization: wisconsinaivision
Home Page: https://vip-llava.github.io/