MERIt: Meta-Path Guided Contrastive Learning for Logical Reasoning

This is the pytorch implementation of the paper:

MERIt: Meta-Path Guided Contrastive Learning for Logical Reasoning. Fangkai Jiao, Yangyang Guo, Xuemeng Song, Liqiang Nie. Findings of ACL. 2022.

paper link

Project Structure

|--- conf/   // The configs of all experiments in yaml.
|--- dataset/   // The classes or functions to convert raw text inputs as tensor and utils for batch sampling.
|--- experiments/   // We may provide the prediction results on the datasets, e.g., the .npy file that can be submitted to ReClor leaderboard.
|--- general_util/  // Metrics and training utils.
|--- models/    // The transformers for pre-training and fine-tuning.
|--- modules/
|--- preprocess/    // The scripts to pre-process Wikipedia documents for pre-training.
|--- scripts/   // The bash scripts of our experiments.
|--- reclor_trainer....py   // Trainers for fine-tuning.
|--- trainer_base....py     // Trainers for pre-training.

Requirements

For general requirements, please follow the requirements.txt file.

Some special notes:

The fairscale is used for fast and memory efficient distributed training, and it's not necessary.
NVIDIA Apex is required if you want to use the FusedLAMB optimizer for pre-training. It's also not necessary since you can use AdamW.
RoBERTa-large requires at least 12GB memory on single GPU, ALBERT-xxlarge requires at least 14GB, and we use A100 GPU for Deberta-v2-xlarge pre-training. Using fairscale can help reduce the memory requirements with more GPUs. We have tried to use Deepspeed for Deberta pre-training through CPU-offload but failed.

Pre-training

The pre-trained models are available at HuggingFace:

For all model checkpoints (including models of different pre-training steps and the models for ablation studies), we will provide them later.

You can also pre-train the models by yourself as following.

Data Preprocess

Our pre-training procedure uses the data pre-processed by Qin et al., which includes the entities and the distantly annotated relations in Wikipedia. You can also download it from here.

To further pre-process the data, run:

python preprocess wiki_entity_path_reprocess_v7.py --input_file <glob path for input data> --output_dir <output path>

The processed data can also be downloaded from here.

Running

We use hydra to manage the configuration of our experiments. The configs for pre-training are listed below:

# RoBERTa-large-v1
conf/wiki_path_large_v8_2_aug1.yaml

# RoBERTa-large-v2
conf/roberta/wiki_path_large_v8_2_aug1_fix_v3.yaml

# ALBERT-v2-xxlarge
conf/wiki_path_albert_v8_2_aug1.yaml

# DeBERTa-v2-xlarge
conf/deberta_v2/wiki_path_deberta_v8_2_aug1.yaml

# DeBERTa-v2-xxlarge
conf/deberta_v2/wiki_path_deberta_v8_2_aug1_xxlarge.yaml

To run pre-training:

python -m torch.distributed.launch --nproc_per_node N trainer_base_mul_v2.py -cp <directory of config> -cn <name of the config file>

DeepSpeed is also supported by using trainer_base_mul_ds_v1.py and relevant config is specified in the ds_config field of the config.

Fine-tuning

To run fine-tuning:

python -m torch.distributed.launch --nproc_per_node N reclor_trainer_base_v2.py -cp <directory of config> -cn <name of the config file>

The configs for different experiments are pending...

Citation

If the paper and code are helpful, please kindly cite our paper:

@inproceedings{Jiao22merit,
  author    = {Fangkai Jiao and
               Yangyang Guo and
               Xuemeng Song and
               Liqiang Nie},
  title     = {{MERI}t: Meta-Path Guided Contrastive Learning for Logical Reasoning},
  booktitle = {Findings of ACL},
  publisher = {{ACL}},
  year      = {2022}
}

laplacekorea / merit Goto Github PK

merit's Introduction

MERIt: Meta-Path Guided Contrastive Learning for Logical Reasoning

Project Structure

Requirements

Pre-training

Data Preprocess

Running

Fine-tuning

Citation

merit's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent