Topic: large-multimodal-models
Something interesting about large-multimodal-models
large-multimodal-models,The official implementation of "Instruction-Guided Visual Masking"
User: 2toinf
Home Page: https://2toinf.github.io/IVM/
large-multimodal-models,[ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models
Organization: aifeg
Home Page: https://arxiv.org/abs/2312.02896
large-multimodal-models,This is the official repository of the paper "Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey"
User: bowen-upenn
Home Page: https://arxiv.org/abs/2406.00252
large-multimodal-models,The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.
User: bzluan
large-multimodal-models,"Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA"
Organization: eric-ai-lab
Home Page: https://jackie-2000.github.io/probmed.github.io/
large-multimodal-models,A curated list of awesome Multimodal studies.
User: friedrichor
large-multimodal-models,Official implementation of 2AFC-LMMs
User: h4nwei
large-multimodal-models,Code for "Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning"
User: jameszhou-gl
Home Page: https://arxiv.org/abs/2405.12217
large-multimodal-models,LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
Organization: llava-vl
Home Page: https://llava-vl.github.io/llava-plus/
large-multimodal-models,This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
User: milebench
Home Page: https://milebench.github.io/
large-multimodal-models,This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Organization: mmmu-benchmark
Home Page: https://mmmu-benchmark.github.io/
large-multimodal-models,This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"
Organization: mmstar-benchmark
Home Page: https://mmstar-benchmark.github.io
large-multimodal-models,AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
Organization: openadaptai
Home Page: https://www.OpenAdapt.AI
large-multimodal-models,A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo
User: paradoxzw
large-multimodal-models,The official evaluation suite and dynamic data release for MixEval.
User: psycoy
Home Page: https://mixeval.github.io/
large-multimodal-models,A collection of resources on applications of multi-modal learning in medical imaging.
User: richard-peng-xia
large-multimodal-models,Contains code and documentation for our VANE-Bench paper.
User: rohit901
Home Page: https://hananshafi.github.io/vane-benchmark/
large-multimodal-models,ShareGPT4Omni: Towards Building Omni Large Multi-modal Models with Comprehensive Multi-modal Annotations
Organization: sharegpt4omni
Home Page: https://sharegpt4omni.github.io/
large-multimodal-models,[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
Organization: sharegpt4omni
Home Page: https://sharegpt4v.github.io/
large-multimodal-models,An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Organization: sharegpt4omni
Home Page: https://sharegpt4video.github.io/
large-multimodal-models,A Benchmark for VQA prompt sensitivity
User: shijian2001
large-multimodal-models,[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
User: shikiw
large-multimodal-models,Embed arbitrary modalities (images, audio, documents, etc) into large language models.
User: sshh12
large-multimodal-models,Open Platform for Embodied Agents
Organization: thunlp
Home Page: https://docs.legent.ai
large-multimodal-models,A Framework of Small-scale Large Multimodal Models
User: tinyllava
Home Page: https://arxiv.org/abs/2402.14289
large-multimodal-models,🔥 Official Benchmark Toolkits for "Visual Haystacks: Answering Harder Questions About Sets of Images"
Organization: visual-haystacks
Home Page: https://visual-haystacks.github.io/
large-multimodal-models,Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
Organization: visualwebbench
Home Page: https://visualwebbench.github.io/
large-multimodal-models,✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
Organization: vita-mllm
large-multimodal-models,[ICML 2024] Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models
Organization: wang-ml-lab
large-multimodal-models,An open-source implementation for training LLaVA-NeXT.
User: xiaoachen98
large-multimodal-models,FLAME: Learning to Navigate with Multimodal LLM in Urban Environments (arXiv:2408.11051)
User: xyz9911
Home Page: https://flame-sjtu.github.io
large-multimodal-models,A collection of awesome multi-modal large language model papers and projects, plus popular training strategies, e.g., PEFT, LoRA.
User: zchoi
large-multimodal-models,A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, qwen-vl, phi3-v etc.
User: zjysteven