This repository contains the code for our paper. We investigate the properties of joint multimodal representations derived from both a task-specific model and a multi-task model, with respect to different training objectives and information streams. We compare MCAN and multi-task ViLBERT on the VQA task and evaluate their performance on the VQA 2.0 and GQA datasets. We extend the implementations of both MCAN and multi-task ViLBERT.
This project was forked from pkhdipraja/joint-multimodal-embeddings.