vreason's Introduction

Building Blocks for Neural-Symbolic Reasoning

This repo provides implementations of Slot Attention, Vector Quantization, Visual GPT, and Image PCFG.

Image PCFG

This is a straightforward extension of PCFGs from 1-dimensional language to 2-dimensional images. Here are algorithms and implementations. Motivations and technical details are summarized in my thesis (pp. 18-21). The key idea is to use a pre-trained vector quantization model to tokenize each image into an $n\times n$ grid of tokens and to treat that grid as a sentence in a 2-dimensional language.
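To make the tokenization step concrete, here is a minimal NumPy sketch of nearest-neighbor vector quantization. It is not this repo's API: `quantize_to_tokens`, the feature grid, and the codebook are hypothetical stand-ins, and it assumes an encoder has already produced an $n\times n$ grid of $d$-dimensional features.

```python
import numpy as np

def quantize_to_tokens(features, codebook):
    """Map an (n, n, d) feature grid to an (n, n) grid of codebook indices.

    Hypothetical helper for illustration; a pre-trained VQ model would
    supply both the encoder features and the learned codebook.
    """
    n_rows, n_cols, dim = features.shape
    flat = features.reshape(-1, dim)  # (n*n, d)
    # Squared Euclidean distance between every grid cell and every code vector.
    dists = ((flat[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    # Each cell becomes the index of its nearest code vector.
    return dists.argmin(axis=1).reshape(n_rows, n_cols)

# Toy usage: a 2x2 "image" whose features sit near known codebook entries.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, -5.0]])
features = codebook[np.array([[0, 1], [2, 1]])] + 0.01
tokens = quantize_to_tokens(features, codebook)
```

The resulting integer grid plays the role that a sentence plays for an ordinary PCFG, except that its symbols are arranged in two dimensions.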

Check out run_ipcfg.sh and run_ipcfg_eval.sh for training and evaluation.

Visual GPT

Inspired by Image GPT and DALL·E, I combined Vector Quantization and GPT to solve the abstract visual reasoning task. Below is an example of the task: what is the most likely image that follows the given sequence of images (have a guess :))? My approach consists of (1) using a pre-trained vector quantization model to tokenize the prefix images, (2) formulating the task as causal language modeling, and (3) generating the most likely image using GPT.
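The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the repo's implementation: `flatten_prefix`, `greedy_generate`, and `toy_model` are hypothetical names, the token grids are assumed to come from a VQ tokenizer, and a toy next-token function stands in for GPT. Greedy decoding is used for simplicity; it only approximates the most likely continuation.

```python
import numpy as np

def flatten_prefix(token_grids):
    """Concatenate per-image (n, n) token grids into one 1-D sequence (raster order)."""
    return np.concatenate([np.asarray(g).reshape(-1) for g in token_grids])

def greedy_generate(next_token_logits, prefix, num_new_tokens):
    """Append num_new_tokens tokens one at a time, always taking the argmax.

    next_token_logits is any callable mapping a 1-D token sequence to a
    (vocab_size,) score vector; here it stands in for a trained GPT.
    """
    seq = list(prefix)
    for _ in range(num_new_tokens):
        logits = next_token_logits(np.array(seq))
        seq.append(int(np.argmax(logits)))
    return np.array(seq[len(prefix):])

def toy_model(seq, vocab_size=8):
    """Toy stand-in for GPT: always predicts (last token + 1) mod vocab_size."""
    logits = np.zeros(vocab_size)
    logits[(seq[-1] + 1) % vocab_size] = 1.0
    return logits

# Two tokenized 2x2 prefix images; generate the 4 tokens of the next image.
prefix = flatten_prefix([np.array([[0, 1], [2, 3]]), np.array([[4, 5], [6, 7]])])
next_image = greedy_generate(toy_model, prefix, num_new_tokens=4).reshape(2, 2)
```

In a real setup the generated token grid would be passed through the VQ decoder to obtain the predicted image.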

Check out run_raven_solver.sh and run_raven_eval.sh for training and evaluation.

Slot Attention

See this paper for technical details. I trained and evaluated models on Abstract Scenes and CLEVR. Below are some illustrations:

Check out run_slot_abscene.sh and run_slot_clevr.sh for training and evaluation.

License

MIT

zhaoyanpeng / vreason
