Topic: multi-modal-learning (Goto Github)
Something interesting about multi-modal-learning
Multi-modal Object Re-identification
User: 924973292
[CVPR 2024] Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification
User: 924973292
Official implementation of "Cross-Modal Fusion Distillation for Fine-Grained Sketch-Based Image Retrieval", BMVC 2022.
User: abhrac
[ICLR 2024 Spotlight] Official code for the paper "SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training".
Organization: deep-symbolic-mathematics
Home Page: https://openreview.net/forum?id=KZSEgJGPxu
Code repository for the Rakuten Data Challenge: Multimodal Product Classification and Retrieval.
User: depshad
CVPR 2023 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ Support visual intelligence development!
User: dmitryryumin
Home Page: https://huggingface.co/spaces/DmitryRyumin/NewEraAI-Papers
Public repository of our IGARSS 2023 submission.
User: fmenat
Home Page: https://doi.org/10.1109/IGARSS52108.2023.10282138
Code for the paper "Weakly Supervised Segmentation with Cross-Modality Equivariant Constraints", available at https://arxiv.org/pdf/2104.02488.pdf
User: gaurav104
Source code of the NeurIPS 2022 paper "Co-Modality Graph Contrastive Learning for Imbalanced Node Classification".
User: graphprojects
Achelous: A Fast Unified Water-surface Panoptic Perception Framework Based on Fusion of Monocular Camera and 4D mmWave Radar
User: guanrunwei
Adaptive Confidence Multi-View Hashing
User: hackerhyper
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
Organization: huggingface
M3TR: Multi-modal Multi-label Recognition with Transformer. ACM MM 2021.
Organization: icvteam
Home Page: https://dl.acm.org/doi/abs/10.1145/3474085.3475191
Yi-Min Chou, Yi-Ming Chan, Jia-Hong Lee, Chih-Yi Chiu, Chu-Song Chen, "Unifying and Merging Well-trained Deep Neural Networks for Inference Stage", International Joint Conference on Artificial Intelligence (IJCAI), 2018.
Organization: ivclab
SAM-SLR-v2 is an improved version of SAM-SLR for sign language recognition.
User: jackyjsy
A curated list of Visual Question Answering (VQA) (image/video question answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.
User: jokieleung
PyTorch version of the HyperDenseNet deep neural network for multi-modal image segmentation.
User: josedolz
Open-source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters".
User: kyegomez
Home Page: https://discord.gg/qUtxnK2NMf
Open-source implementation of "NeVA: NeMo Vision and Language Assistant".
User: kyegomez
Home Page: https://discord.gg/qUtxnK2NMf
Build high-performance AI models with modular building blocks.
User: kyegomez
Home Page: https://zeta.apac.ai
A Python tool for deep learning experiments on multimodal remote sensing data.
User: likyoo
MMEA: Entity Alignment for Multi-Modal Knowledge Graphs, KSEM 2020.
User: liyichen-cly
A concise but complete implementation of CLIP, with various experimental improvements from recent papers.
User: lucidrains
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
User: lyuchenyang
An open-source implementation of CLIP.
Organization: mlfoundations
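Several entries above (lucidrains' CLIP implementation, mlfoundations' open-source CLIP) center on CLIP's contrastive image-text objective. As a rough, framework-agnostic sketch of that objective (written in NumPy for illustration, not code taken from either repository), the symmetric InfoNCE loss over a batch of paired embeddings looks like:

```python
import numpy as np

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    img_emb, txt_emb: (batch, dim) arrays; row i of each is a matched pair.
    Minimal illustrative sketch of the CLIP-style objective, not code
    from any of the repositories listed here.
    """
    # L2-normalize embeddings so dot products are cosine similarities.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)

    # Pairwise similarity logits, scaled by temperature.
    logits = img @ txt.T / temperature           # (batch, batch)
    labels = np.arange(len(logits))              # matched pairs on the diagonal

    # Cross-entropy applied in both directions (image->text and text->image).
    def xent(l):
        l = l - l.max(axis=1, keepdims=True)     # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    return 0.5 * (xent(logits) + xent(logits.T))
```

Perfectly aligned pairs drive the loss toward zero, while mismatched pairs are penalized; in practice CLIP implementations learn the temperature and compute this with GPU tensors rather than NumPy.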
GitHub repository for Visio-tactile Implicit Representations of Deformable Objects (ICRA 2022).
Organization: mmintlab
[CVPR 2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
User: moabarar
Implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
Organization: nvlabs
Home Page: https://shikun.io/projects/prismer
Chinese version of CLIP, which achieves Chinese cross-modal retrieval and representation generation.
Organization: ofa-sys
[CVPR 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Organization: openrobotlab
Home Page: https://tai-wang.github.io/embodiedscan/
Multi-modal analysis of sentiment and emotion in multi-speaker conversations.
User: peymanbateni
🥂 Gracefully face hCaptcha challenges with an MoE (ONNX) embedded solution.
User: qin2dim
Home Page: https://docs.captchax.top/
[ICML 2023] Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining
User: qizekun
Home Page: https://arxiv.org/abs/2302.02318
[NeurIPS 2023] A faithful benchmark for vision-language compositionality
Organization: raivnlab
Home Page: https://arxiv.org/abs/2306.14610
Multi-modal action recognition for skeleton sequences, inertial measurements, motion-capture data, and Wi-Fi CSI fingerprints.
User: raphaelmemmesheimer
[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
User: rentainhe
HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding
User: richard-peng-xia
Home Page: https://arxiv.org/abs/2311.14064
Japanese CLIP by rinna Co., Ltd.
Organization: rinnakk
Home Page: https://huggingface.co/rinna
Official implementation of "Advancing Radiograph Representation Learning with Masked Record Modeling" (ICLR 2023).
Organization: rl4m
Resolving Semantic Confusions for Improved Zero-Shot Detection (BMVC 2022)
User: sandipan211
A detection/segmentation dataset with labels characterized by intricate and flexible expressions. "Described Object Detection: Liberating Object Detection with Flexible Expressions" (NeurIPS 2023).
Organization: shikras
Home Page: https://arxiv.org/abs/2307.12813
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
User: ttgeng233
Home Page: https://unav100.github.io
Implementation of "Pre-training Graph Transformer with Multimodal Side Information for Recommendation".
User: uoo723
[NeurIPS 2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model
User: willdreamer
Home Page: https://arxiv.org/abs/2305.08381
Official PyTorch repository for CG-DETR: "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"
User: wjun0830
Home Page: https://arxiv.org/abs/2311.08835
[ICCV 2023] Implicit Neural Representation for Cooperative Low-light Image Enhancement
User: ysz2022
Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".
User: yuangongnd
[CVPR 2024] Official PyTorch code for "PromptKD: Unsupervised Prompt Distillation for Vision-Language Models"
User: zhengli97
Home Page: https://zhengli97.github.io/PromptKD/
A curated list of vision-and-language pre-training (VLP) resources. :-)
User: zhjohnchan
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey
Organization: zjukg
Home Page: http://arxiv.org/abs/2402.05391