Topic: multi-modal-learning (Goto Github)
Something interesting about multi-modal-learning
Multi-modal Object Re-identification
User: 924973292
[CVPR 2024] Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification
User: 924973292
Official implementation of "Cross-Modal Fusion Distillation for Fine-Grained Sketch-Based Image Retrieval", BMVC 2022.
User: abhrac
[ICLR 2024 Spotlight] Official code for the paper "SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training".
Organization: deep-symbolic-mathematics
Home Page: https://openreview.net/forum?id=KZSEgJGPxu
Code repository for the Rakuten Data Challenge: Multimodal Product Classification and Retrieval.
User: depshad
CVPR 2023 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ Support visual intelligence development!
User: dmitryryumin
Home Page: https://huggingface.co/spaces/DmitryRyumin/NewEraAI-Papers
Public repository of our IGARSS 2023 submission.
User: fmenat
Home Page: https://doi.org/10.1109/IGARSS52108.2023.10282138
Code for the paper "Weakly Supervised Segmentation with Cross-Modality Equivariant Constraints", available at https://arxiv.org/pdf/2104.02488.pdf
User: gaurav104
Source code of the NeurIPS 2022 paper "Co-Modality Graph Contrastive Learning for Imbalanced Node Classification".
User: graphprojects
Achelous: A Fast Unified Water-surface Panoptic Perception Framework Based on Fusion of Monocular Camera and 4D mmWave Radar
User: guanrunwei
Adaptive Confidence Multi-View Hashing
User: hackerhyper
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
Organization: huggingface
M3TR: Multi-modal Multi-label Recognition with Transformer. ACM MM 2021.
Organization: icvteam
Home Page: https://dl.acm.org/doi/abs/10.1145/3474085.3475191
Yi-Min Chou, Yi-Ming Chan, Jia-Hong Lee, Chih-Yi Chiu, Chu-Song Chen, "Unifying and Merging Well-trained Deep Neural Networks for Inference Stage", International Joint Conference on Artificial Intelligence (IJCAI), 2018.
Organization: ivclab
SAM-SLR-v2 is an improved version of SAM-SLR for sign language recognition.
User: jackyjsy
A curated list of Visual Question Answering (VQA) (image/video question answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.
User: jokieleung
PyTorch version of the HyperDenseNet deep neural network for multi-modal image segmentation.
User: josedolz
Open-source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters".
User: kyegomez
Home Page: https://discord.gg/qUtxnK2NMf
Open-source implementation of "NeVA: NeMo Vision and Language Assistant".
User: kyegomez
Home Page: https://discord.gg/qUtxnK2NMf
Build high-performance AI models with modular building blocks.
User: kyegomez
Home Page: https://zeta.apac.ai
A Python tool for deep learning experiments on multimodal remote sensing data.
User: likyoo
MMEA: Entity Alignment for Multi-Modal Knowledge Graphs, KSEM 2020.
User: liyichen-cly
A concise but complete implementation of CLIP, with various experimental improvements from recent papers.
User: lucidrains
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
User: lyuchenyang
An open-source implementation of CLIP.
Organization: mlfoundations
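Several entries above (lucidrains' CLIP implementation, mlfoundations' open-source CLIP) center on CLIP's contrastive image-text objective. As a rough, framework-agnostic sketch of that objective (written in NumPy for illustration, not code taken from either repository), the symmetric InfoNCE loss over a batch of paired embeddings looks like:

```python
import numpy as np

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    img_emb, txt_emb: (batch, dim) arrays; row i of each is a matched pair.
    Minimal illustrative sketch of the CLIP-style objective, not code
    from any of the repositories listed here.
    """
    # L2-normalize embeddings so dot products are cosine similarities.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)

    # Pairwise similarity logits, scaled by temperature.
    logits = img @ txt.T / temperature           # (batch, batch)
    labels = np.arange(len(logits))              # matched pairs on the diagonal

    # Cross-entropy applied in both directions (image->text and text->image).
    def xent(l):
        l = l - l.max(axis=1, keepdims=True)     # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    return 0.5 * (xent(logits) + xent(logits.T))
```

Perfectly aligned pairs drive the loss toward zero, while mismatched pairs are penalized; in practice CLIP implementations learn the temperature and compute this with GPU tensors rather than NumPy.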
GitHub repository for Visio-tactile Implicit Representations of Deformable Objects (ICRA 2022).
Organization: mmintlab
[CVPR 2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
User: moabarar
Implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
Organization: nvlabs
Home Page: https://shikun.io/projects/prismer
Chinese version of CLIP, which achieves Chinese cross-modal retrieval and representation generation.
Organization: ofa-sys
[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Organization: openrobotlab
Home Page: https://tai-wang.github.io/embodiedscan/
Multi-modal analysis of sentiment and emotion in multi-speaker conversations.
User: peymanbateni
🥂 Gracefully face hCaptcha challenges with an MoE (ONNX) embedded solution.
User: qin2dim
Home Page: https://docs.captchax.top/
[ICML 2023] Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining
User: qizekun
Home Page: https://arxiv.org/abs/2302.02318
[NeurIPS 2023] A faithful benchmark for vision-language compositionality
Organization: raivnlab
Home Page: https://arxiv.org/abs/2306.14610
Multi-modal action recognition for skeleton sequences, inertial measurements, motion-capture data, and Wi-Fi CSI fingerprints.
User: raphaelmemmesheimer
[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
User: rentainhe
HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding
User: richard-peng-xia
Home Page: https://arxiv.org/abs/2311.14064
Japanese CLIP by rinna Co., Ltd.
Organization: rinnakk
Home Page: https://huggingface.co/rinna
Official implementation of "Advancing Radiograph Representation Learning with Masked Record Modeling" (ICLR 2023).
Organization: rl4m
Resolving Semantic Confusions for Improved Zero-Shot Detection (BMVC 2022)
User: sandipan211
A detection/segmentation dataset with labels characterized by intricate and flexible expressions. "Described Object Detection: Liberating Object Detection with Flexible Expressions" (NeurIPS 2023).
Organization: shikras
Home Page: https://arxiv.org/abs/2307.12813
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
User: ttgeng233
Home Page: https://unav100.github.io
Implementation of "Pre-training Graph Transformer with Multimodal Side Information for Recommendation".
User: uoo723
[NeurIPS 2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model
User: willdreamer
Home Page: https://arxiv.org/abs/2305.08381
Official PyTorch repository for CG-DETR: "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"
User: wjun0830
Home Page: https://arxiv.org/abs/2311.08835
[ICCV 2023] Implicit Neural Representation for Cooperative Low-light Image Enhancement
User: ysz2022
Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".
User: yuangongnd
[CVPR 2024] Official PyTorch code for "PromptKD: Unsupervised Prompt Distillation for Vision-Language Models"
User: zhengli97
Home Page: https://zhengli97.github.io/PromptKD/
A curated list of vision-and-language pre-training (VLP) resources. :-)
User: zhjohnchan
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey
Organization: zjukg
Home Page: http://arxiv.org/abs/2402.05391