CLAP: Compact Linearization with an Adaptable Parser

Welcome to the official repository for CLAP, an innovative architecture for AMR (Abstract Meaning Representation) parsing, presented at LREC-COLING 2024.

Features

  1. AMR Parsing and Generation: CLAP introduces a flexible and efficient AMR parsing architecture. It supports seamless transitions between different language models and facilitates multilingual adaptability.

  2. Crosslingual AMR Alignment: Integration of the Crosslingual AMR Aligner enables extraction of span-to-node alignments between sentences and graphs, using the model's cross-attention weights.

  3. Perplexity Extraction: By incorporating AMRs Assemble!, CLAP can compute perplexity scores and supports training on assembly tasks.

Citing This Work

If you use CLAP in your research, please cite our paper:

@inproceedings{martinez-lorenzo-navigli-2024-efficient-amr,
    title = "Efficient {AMR} Parsing with {CLAP}: Compact Linearization with an Adaptable Parser",
    author = "Martinez Lorenzo, Abelardo Carlos and Navigli, Roberto",
    editor = "Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro and Sakti, Sakriani and Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.495",
    pages = "5578--5584",
}

Repository Structure

  • conf/: Configuration files for data paths, model specifications, and training parameters.
  • data/: Datasets for benchmarking AMR evaluation.
  • experiments/: Checkpoints saved by training runs.
  • models/: Trained Hugging Face models.
  • src/: Source code for the project.
    • constant.py: Defines the tokens added to the model; edit it to add new tokens.
    • linearization.py: Implements graph linearization in Depth-First Search and compact formats.
    • pl_data_modules.py: Data module classes for training.
    • pl_modules.py: Contains new modular components for the architecture.
    • predict.py: Script for making predictions using trained models.
    • predict_alignment.py: Script for extracting alignments.
    • predict_perplexity.py: Script for computing perplexity.
    • train.py: Entry point for training models.
    • utils.py: Utility functions for various operations.
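
To make the idea behind linearization.py concrete, here is a minimal sketch of a depth-first linearization of a toy AMR graph. It is not the repository's implementation: the tuple-based graph format and the `linearize_dfs` function are hypothetical, chosen only for illustration, and the compact format described in the paper is not reproduced here.

```python
# Hypothetical toy graph format: (variable, concept, [(role, child), ...]).
# This is an illustration only, not the format used in src/linearization.py.

def linearize_dfs(node, seen=None):
    """Return a PENMAN-style token list by visiting the graph depth-first."""
    if seen is None:
        seen = set()
    if isinstance(node, str):
        # A bare string is a constant or a reference to an earlier variable.
        return [node]
    var, concept, edges = node
    if var in seen:
        # Re-entrant node: emit only the variable the second time.
        return [var]
    seen.add(var)
    tokens = ["(", var, "/", concept]
    for role, child in edges:
        tokens.append(role)
        tokens.extend(linearize_dfs(child, seen))
    tokens.append(")")
    return tokens


# "The boy wants to go": the boy (b) is both the wanter and the goer.
graph = ("w", "want-01", [
    (":ARG0", ("b", "boy", [])),
    (":ARG1", ("g", "go-02", [(":ARG0", "b")])),
])

linear = " ".join(linearize_dfs(graph))
print(linear)
# ( w / want-01 :ARG0 ( b / boy ) :ARG1 ( g / go-02 :ARG0 b ) )
```

A sequence-to-sequence model is trained to emit a token sequence of this kind, which is then parsed back into a graph.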

Installation

# Create a Python 3.9 environment
conda create -n clap-env python=3.9
conda activate clap-env

# Install dependencies
pip install -r requirements.txt

Training

Configure paths and hyperparameters in the files under conf/:

  • conf/data.yaml: Specify dataset paths for training and evaluation.
  • conf/model.yaml: Define the model architecture, e.g., google/flan-t5-small.
  • conf/train.yaml: Adjust training-specific hyperparameters.

Then run:

python src/train.py

Prediction

Set up the necessary paths in conf/data.yaml and conf/model.yaml. Then run:

python src/predict.py

Alignment Extraction

Configure as per the prediction step and execute:

python src/predict_alignment.py
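
As a rough illustration of how cross-attention can yield alignments, the sketch below maps each graph node to the sentence token that receives the highest attention weight. It is a simplification under stated assumptions: the `align_by_attention` function is hypothetical, the attention values are made up, and the actual aligner extracts spans rather than single tokens.

```python
def align_by_attention(attention, sentence_tokens, graph_tokens):
    """Align each graph token to the sentence token with the highest weight.

    attention[i][j] is the (assumed) cross-attention weight that graph token i
    places on sentence token j.
    """
    alignments = {}
    for node, row in zip(graph_tokens, attention):
        best = max(range(len(row)), key=row.__getitem__)
        alignments[node] = sentence_tokens[best]
    return alignments


sentence_tokens = ["The", "boy", "wants", "to", "go"]
graph_tokens = ["want-01", "boy", "go-02"]

# Made-up weights for illustration; a real model would produce these.
attention = [
    [0.05, 0.10, 0.70, 0.10, 0.05],  # want-01
    [0.10, 0.80, 0.05, 0.03, 0.02],  # boy
    [0.02, 0.03, 0.10, 0.15, 0.70],  # go-02
]

alignments = align_by_attention(attention, sentence_tokens, graph_tokens)
print(alignments)
# {'want-01': 'wants', 'boy': 'boy', 'go-02': 'go'}
```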

Perplexity Calculation

Configure as per the prediction step and execute:

python src/predict_perplexity.py
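
For reference, perplexity is the exponential of the average negative log-likelihood per token. The sketch below computes it from a list of token log-probabilities; the `perplexity` function and the input values are illustrative assumptions and do not reflect how predict_perplexity.py is implemented.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(-mean(log p(token))) over a non-empty sequence."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Illustrative input: four tokens, each predicted with probability 0.5.
log_probs = [math.log(0.5)] * 4
ppl = perplexity(log_probs)
print(round(ppl, 6))
# 2.0
```

A lower perplexity means the model assigns higher probability to the given graph linearization.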

License

This project is released under the CC-BY-NC-SA 4.0 license (see LICENSE). If you use CLAP, please cite the paper and link to this repository.

Contributing

We welcome contributions to CLAP. If you have ideas, bug fixes, or improvements, feel free to open an issue or submit a pull request.

Contact

For any questions or inquiries, please contact Abelardo Carlos Martínez Lorenzo at [email protected]
