Awesome-Text-Video-Retrieval
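A collection of text-video retrieval papers from major top conferences in recent years, mirrored on my blog: https://blog.csdn.net/AAliuxiaolei/article/details/121433833
A summary of Video Retrieval papers from the major top conferences, with links for finding related papers
A GitHub repository with a good summary of the field: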
https://github.com/danieljf24/awesome-video-text-retrieval
2021 https://openaccess.thecvf.com/ICCV2021
TeachText: CrossModal Generalized Distillation for Text-Video Retrieval
HiT: Hierarchical Transformer With Momentum Contrast for Video-Text Retrieval
TACo: Token-Aware Cascade Contrastive Learning for Video-Text Alignment
2019 https://openaccess.thecvf.com/ICCV2019
Neighborhood Preserving Hashing for Scalable Video Retrieval
SVD: A Large-Scale Short Video Dataset for Near-Duplicate Video Retrieval
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
2021 https://2021.acmmm.org/main-track-list
Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
HANet: Hierarchical Alignment Networks for Video-Text Retrieval
Discriminative Latent Semantic Graph for Video Captioning
Fine-grained Cross-modal Alignment Network for Text-Video Retrieval
Learning Segment Similarity and Alignment in Large-Scale Content Based Video Retrieval
Progressive Semantic Matching for Video-Text Retrieval
CoCo-BERT: Improving Video-Language Pre-training with Contrastive Cross-modal Matching and Denoising
2020 https://2020.acmmm.org/main-track-list.html
Interpretable Embedding for Ad-Hoc Video Search
Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval
A W2VV++ Case Study with Automated and Interactive Text-to-Video Retrieval
2019 https://2019.acmmm.org/accepted-papers/index.html
You Only Recognize Once: Towards Fast Video Text Spotting
2021 https://openaccess.thecvf.com/CVPR2021
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
On Semantic Similarity in Video Retrieval
Thinking Fast and Slow: Efficient Text-to-Visual Retrieval With Transformers
Adaptive Cross-Modal Prototypes for Cross-Domain Visual-Language Retrieval
Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
MDMMT: Multidomain Multimodal Transformer for Video Retrieval
2020 https://openaccess.thecvf.com/CVPR2020_search
ActBERT: Learning Global-Local Video-Text Representations
Fine-Grained Video-Text Retrieval With Hierarchical Graph Reasoning
2019 https://openaccess.thecvf.com/CVPR2019_search
None
2020
Gabeur, Valentin, et al. "Multi-modal Transformer for Video Retrieval." Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part IV. Springer International Publishing, 2020.
Graph Wasserstein Correlation Analysis for Movie Retrieval