mregan1314 / multi-modal-transformer
This project is forked from junchen14/multi-modal-transformer.
This repository collects multi-modal transformer architectures, including image transformers, video transformers, image-language transformers, and video-language transformers, along with related datasets. It also collects useful tutorials and tools in these domains.