This project forked from utopia-group/typet5

TypeT5: Seq2seq Type Inference using Static Analysis and CodeT5

License: BSD 3-Clause "New" or "Revised" License

TypeT5: Seq2seq Type Inference using Static Analysis

TypeT5 Workflow

This repo contains the source code for the paper TypeT5: Seq2seq Type Inference using Static Analysis.

@inproceedings{Wei2023TypeT5,
    title={TypeT5: Seq2seq Type Inference using Static Analysis},
    author={Jiayi Wei and Greg Durrett and Isil Dillig},
    booktitle={International Conference on Learning Representations},
    year={2023},
    url={https://openreview.net/forum?id=4TyNEhI2GdN}
}

Installation

This project uses pipenv to manage its package dependencies. Pipenv tracks the exact package versions and manages the project-specific virtual environment for you. To install all dependencies, make sure pipenv and Python 3.10 are installed, then run the following two commands at the project root:

pipenv --python <path-to-your-python-3.10>  # create a new environment for this project
pipenv sync --dev # install all specified dependencies

More about pipenv:

  • To add new dependencies to the virtual environment, you can either add them via pipenv install .. (using pipenv) or pipenv run pip install .. (using pip from within the virtual environment).
  • If your PyTorch installation is not working properly, you might need to reinstall it via the pipenv run pip install approach rather than pipenv install.
  • All .py scripts below can be run via pipenv run python <script-name.py>. For .ipynb notebooks, make sure you select the pipenv environment as the kernel. You can run all unit tests by running pipenv run pytest at the project root.

If you are not using pipenv:

  • Make sure to add the environment variables in the .env file to your shell environment when you run the scripts (needed by the parsing library).
  • We also provide a requirements.txt file so that you can install the dependencies via pip install -r requirements.txt.
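
If you would rather set those variables from inside Python than from your shell, the following is a minimal sketch. It assumes the .env file contains only simple KEY=VALUE lines (optionally prefixed with export), which this README does not confirm. The helper name load_env_file is hypothetical and is not part of this project.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Hypothetical helper: copy simple KEY=VALUE lines into os.environ."""
    loaded: dict[str, str] = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Tolerate shell-style "export KEY=VALUE" lines.
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line; ignore it
        key = key.strip()
        # Assumption: values are plain strings, optionally wrapped in quotes.
        value = value.strip().strip("'\"")
        os.environ[key] = value
        loaded[key] = value
    return loaded
```

Calling load_env_file() at the top of a script, before importing any project modules, would make the variables visible to libraries that read them at import time. This sketch does no variable expansion or multi-line handling, so a dedicated .env parser may be a better fit for anything more complex.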

Using the Trained Model

The notebook scripts/run_typet5.ipynb shows you how to download the TypeT5 model from Hugging Face and then use it to make type predictions for a specified codebase.

Training a New Model

  • First, run the notebook scripts/collect_dataset.ipynb to download and split the BetterTypes4Py dataset used in our paper.
    • The exact list of repos we used for the experiments in the paper can be loaded from data/repos_split.pkl using pickle.load.
  • Then, run scripts/train_model.py to train a new TypeT5 model. Training takes about 11 hours on a single Quadro RTX 8000 GPU with 48 GB of memory.
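
As a minimal sketch of loading the repo split mentioned above: the path data/repos_split.pkl and the use of pickle.load come from this README, but the structure of the unpickled object is not documented here, so the code only reports what it finds. The function names load_repos_split and describe are hypothetical.

```python
import pickle
from pathlib import Path


def load_repos_split(path: str = "data/repos_split.pkl"):
    """Unpickle the repo split; nothing is assumed about the object's shape."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


def describe(obj) -> str:
    """Return a short summary of an unpickled object for inspection."""
    if isinstance(obj, dict):
        parts = [f"{k!r}: {type(v).__name__}" for k, v in obj.items()]
        return "dict with " + ", ".join(parts)
    return type(obj).__name__
```

Running print(describe(load_repos_split())) from the project root shows the top-level layout. If the pickle stores instances of project-specific classes, unpickling needs those classes to be importable, so run it inside the project environment (for example via pipenv run python). As with any pickle file, only load it from a source you trust.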

Development

  • Formatter: We use black for formatting with the default options.
  • Type Checker: We use Pylance to type check this codebase. It's the built-in type checker shipped with the VS Code Python extension and can be enabled by setting Python > Analysis > Type Checking Mode to basic.
