Topic: large-multimodal-models

Something interesting about large-multimodal-models

👇 Here are 33 public repositories matching this topic...

2toinf / ivm

large-multimodal-models,The official implementation of "Instruction-Guided Visual Masking"

Home Page: https://2toinf.github.io/IVM/

computer-vision deep-learning large-language-models multimodal pytorch-implementation robotics instruction-following instruction-tuning large-multimodal-models

aifeg / benchlmm

large-multimodal-models,[ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models

Organization: aifeg

Home Page: https://arxiv.org/abs/2312.02896

benchmark cv large-language-models dataset large-multimodal-models

bowen-upenn / mmma_rationality

large-multimodal-models,This is the official repository of the paper "Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey"

User: bowen-upenn

Home Page: https://arxiv.org/abs/2406.00252

agents foundation-models large-language-models large-multimodal-models multi-agent-systems multimodal rationality survey

bzluan / textcot

large-multimodal-models,The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.

User: bzluan

chain-of-thought large-multimodal-models

eric-ai-lab / probmed

large-multimodal-models,"Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA"

Organization: eric-ai-lab

Home Page: https://jackie-2000.github.io/probmed.github.io/

evaluation large-multimodal-models llms medical-diagnosis medical-vqa vision-and-language

friedrichor / awesome-multimodal-papers

large-multimodal-models,A curated list of awesome Multimodal studies.

User: friedrichor

deep-learning large-multimodal-models multimodal multimodal-data multimodal-deep-learning multimodal-dialogue multimodal-large-language-models multimodal-learning

h4nwei / 2afc-lmms

large-multimodal-models,Official implementation of 2AFC-LMMs

User: h4nwei

image-quality-assessment large-multimodal-models

jameszhou-gl / icl-distribution-shift

large-multimodal-models,Code for "Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning"

User: jameszhou-gl

Home Page: https://arxiv.org/abs/2405.12217

distribution-shift large-multimodal-models

llava-vl / llava-plus-codebase

large-multimodal-models,LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills

Organization: llava-vl

Home Page: https://llava-vl.github.io/llava-plus/

agent large-language-models large-multimodal-models multimodal-large-language-models tool-use

milebench / milebench

large-multimodal-models,This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"

User: milebench

Home Page: https://milebench.github.io/

computer-vision deep-learning deep-neural-networks long-context-modeling long-context-transformers machine-learning multimodal natural-language-processing benchmark evaluation

mmmu-benchmark / mmmu

large-multimodal-models,This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"

Organization: mmmu-benchmark

Home Page: https://mmmu-benchmark.github.io/

computer-vision deep-learning deep-neural-networks evaluation foundation-models large-language-models large-multimodal-models llm llms machine-learning multimodal multimodal-deep-learning multimodal-learning multimodality natural-language-processing question-answering stem visual-question-answering

mmstar-benchmark / mmstar

large-multimodal-models,This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"

Organization: mmstar-benchmark

Home Page: https://mmstar-benchmark.github.io

evaluation large-language-models large-multimodal-models large-vision-language-model large-vision-language-models llm llms lvlm lvlms multimodal

openadaptai / openadapt

large-multimodal-models,AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models

Organization: openadaptai

Home Page: https://www.OpenAdapt.AI

process-automation python transformers large-language-models large-multimodal-models gpt-4 gpt4-vision huggingface huggingface-transformers segment-anything

paradoxzw / llava-uhd-better

large-multimodal-models,A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo

User: paradoxzw

large-language-models large-multimodal-models llava multimodal

psycoy / mixeval

large-multimodal-models,The official evaluation suite and dynamic data release for MixEval.

User: psycoy

Home Page: https://mixeval.github.io/

benchmark benchmark-mixture benchmarking-framework benchmarking-suite evaluation evaluation-framework foundation-models large-language-model large-language-models large-multimodal-models

richard-peng-xia / awesome-multimodal-in-medical-imaging

large-multimodal-models,A collection of resources on applications of multi-modal learning in medical imaging.

User: richard-peng-xia

medical-imaging medical-report-generation multimodal-deep-learning multimodal-learning visual-question-answering large-language-models large-multimodal-models multimodal-large-language-models

rohit901 / vane-bench

large-multimodal-models,Contains code and documentation for our VANE-Bench paper.

User: rohit901

Home Page: https://hananshafi.github.io/vane-benchmark/

benchmark-datasets large-language-models large-multimodal-models multimodal-deep-learning multimodal-large-language-models video-anomaly-detection

sharegpt4omni / sharegpt4omni

large-multimodal-models,ShareGPT4Omni: Towards Building Omni Large Multi-modal Models with Comprehensive Multi-modal Annotations

Organization: sharegpt4omni

Home Page: https://sharegpt4omni.github.io/

chatgpt gpt gpt-4o gpt-4v large-multimodal-models large-vision-language-models gpt4-omni

sharegpt4omni / sharegpt4v

large-multimodal-models,[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions

Organization: sharegpt4omni

Home Page: https://sharegpt4v.github.io/

chatgpt gpt gpt-4v gpt4v instruction-tuning language-model large-language-models large-multimodal-models large-vision-language-models vision-language-model

sharegpt4omni / sharegpt4video

large-multimodal-models,An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Organization: sharegpt4omni

Home Page: https://sharegpt4video.github.io/

chatgpt gpt gpt-4v large-language-models large-multimodal-models large-vision-language-models large-video-language-models sora text-to-video

shijian2001 / vqapromptbench

large-multimodal-models,A Benchmark for VQA prompt sensitivity

User: shijian2001

benchmark evaluation large-multimodal-models

shikiw / opera

large-multimodal-models,[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

User: shikiw

large-multimodal-models llama multimodal vision-language-learning vision-language-model chatbot chatgpt gpt-4

sshh12 / multi_token

large-multimodal-models,Embed arbitrary modalities (images, audio, documents, etc) into large language models.

User: sshh12

large-language-models llava large-multimodal-models multi-modality multimodal vision-language-model large-context llm

thunlp / legent

large-multimodal-models,Open Platform for Embodied Agents

Organization: thunlp

Home Page: https://docs.legent.ai

embodied-ai language-grounding large-multimodal-models physics-engine robot-simulator

tinyllava / tinyllava_factory

large-multimodal-models,A Framework of Small-scale Large Multimodal Models

User: tinyllava

Home Page: https://arxiv.org/abs/2402.14289

large-multimodal-models llama llava nlp tinyllama transformers vision-language

visual-haystacks / vhs_benchmark

large-multimodal-models,🔥 Official Benchmark Toolkits for "Visual Haystacks: Answering Harder Questions About Sets of Images"

Organization: visual-haystacks

Home Page: https://visual-haystacks.github.io/

large-multimodal-models long-context-modeling multi-image-understanding vision-language-model visual-question-answering

visualwebbench / visualwebbench

large-multimodal-models,Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"

Organization: visualwebbench

Home Page: https://visualwebbench.github.io/

computer-vision deep-learning evaluation foundation-models large-language-models large-multimodal-models llm llms machine-learning mllm multimodal multimodal-deep-learning multimodal-large-language-models natural-language-processing question-answering visual-question-answering

vita-mllm / vita

large-multimodal-models,✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM

Organization: vita-mllm

large-multimodal-models multimodal-large-language-models

wang-ml-lab / interpretable-foundation-models

large-multimodal-models,[ICML 2024] Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models

Organization: wang-ml-lab

bayesian-deep-learning foundation-models graphical-models interpretability large-language-models large-multimodal-models llm multimodal-large-language-models probabilistic-graphical-models vision-transformer

xiaoachen98 / open-llava-next

large-multimodal-models,An open-source implementation for training LLaVA-NeXT.

User: xiaoachen98

chatbot chatgpt gpt-4 gpt4o large-multimodal-models llama llama3 llava multi-modality multimodal

xyz9911 / flame

large-multimodal-models,FLAME: Learning to Navigate with Multimodal LLM in Urban Environments (arXiv:2408.11051)

User: xyz9911

Home Page: https://flame-sjtu.github.io

large-multimodal-models multimodal-large-language-models vision-and-language-navigation vision-language-model embodied-agent

zchoi / multi-modal-large-language-learning

large-multimodal-models,Awesome multi-modal large language paper/project, collections of popular training strategies, e.g., PEFT, LoRA.

User: zchoi

awesome large-language-models multimodal pre-training benchmark foundation-models large-multimodal-models

zjysteven / lmms-finetune

large-multimodal-models,A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, qwen-vl, phi3-v etc.

User: zjysteven

finetuning foundation-models instruction-tuning large-language-model large-multimodal-models multimodal multimodal-large-language-models vision-language visual-instruction-tuning llava llava-next qwen-vl
