multifix's Introduction

MultiFix: Learning to Repair Multiple Errors by Optimal Alignment Learning

Overview

This project is a PyTorch implementation of MultiFix, which learns to repair multiple errors by optimal alignment learning.

Hardware

The models were trained using the following hardware:

  • Ubuntu 18.04.5 LTS
  • NVIDIA TITAN Xp 24GB * 4
  • Intel(R) Xeon(R) W-2145 CPU @ 3.70GHz
  • 64GB RAM

Dependencies

  • Python version is 3.6.7

We use one of the following versions of PyTorch, depending on the CUDA version:

  • GPU support (CUDA==10.1): torch==1.1.0
  • GPU support (CUDA>10.1): torch==1.5.0

Other packages (included in "requirements.txt"):

  • torchtext==0.3.1
  • numpy==1.16.1
  • tqdm
  • matplotlib
  • regex
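
The dependency list pairs torch==1.1.0 with CUDA 10.1 and torch==1.5.0 with newer CUDA versions. A minimal sketch of that rule is shown here; the helper names `parse_cuda_version` and `pick_torch_pin` are hypothetical and not part of this repository, and CUDA versions older than 10.1 are rejected because the README does not cover them.

```python
def parse_cuda_version(text):
    """Turn a string such as '10.1' or '10.1.243' into a (major, minor) tuple."""
    parts = [int(p) for p in text.strip().split(".")]
    while len(parts) < 2:  # treat '11' as '11.0'
        parts.append(0)
    return tuple(parts[:2])


def pick_torch_pin(cuda_version):
    """Return the torch requirement the README lists for a given CUDA version."""
    cuda = parse_cuda_version(cuda_version)
    if cuda == (10, 1):
        return "torch==1.1.0"
    if cuda > (10, 1):
        return "torch==1.5.0"
    # Assumption: the README gives no pin for CUDA older than 10.1.
    raise ValueError("no torch version listed for CUDA " + cuda_version)


if __name__ == "__main__":
    print(pick_torch_pin("10.1"))
    print(pick_torch_pin("10.2"))
```

The returned string can be passed to `pip install` inside the virtualenv before installing the rest of "requirements.txt".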

Prerequisite

  • Use virtualenv
    sudo apt-get install build-essential libssl-dev libffi-dev python-dev
    sudo apt install python3-pip
    sudo pip3 install virtualenv
    virtualenv -p python3 venv
    . venv/bin/activate
    # code your stuff
    deactivate

Datasets

Our dataset is based on the dataset provided by DeepFix. https://www.cse.iitk.ac.in/users/karkare/prutor/prutor-deepfix-09-12-2017.zip
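
As a rough sketch, the DeepFix archive at the URL above could be fetched and unpacked with the Python standard library. The destination directory `data` and the function names are assumptions made for illustration; data_processing.sh may expect the files in a different location, so check the script before relying on this.

```python
import urllib.request
import zipfile
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

DEEPFIX_URL = (
    "https://www.cse.iitk.ac.in/users/karkare/prutor/"
    "prutor-deepfix-09-12-2017.zip"
)


def archive_name(url):
    """Return the file name at the end of the URL path."""
    return PurePosixPath(urlparse(url).path).name


def download_and_extract(url=DEEPFIX_URL, dest="data"):
    """Download the archive into dest (if not already there) and unzip it."""
    dest_dir = Path(dest)  # 'data' is a hypothetical default location
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / archive_name(url)
    if not archive.exists():  # skip the download when the file is already present
        urllib.request.urlretrieve(url, str(archive))
    with zipfile.ZipFile(str(archive)) as zf:
        zf.extractall(str(dest_dir))
    return dest_dir
```

Calling `download_and_extract()` once is enough; later calls reuse the archive that is already on disk.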

HOW TO EXECUTE OUR MODEL?

Data Processing

Generate training data based on the DeepFix and DrRepair dataset.

    bash data_processing.sh

Model training

Train our model on the generated data.

    bash model_training.sh

However, training takes a significant amount of time, so we provide two pretrained models in:

log/pth

Evaluation

Evaluate the repair results using the saved model.

    bash evaluation.sh
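
The three scripts above are meant to be run in order. A minimal sketch of a Python driver that does so and stops at the first failure is given here. The `run_pipeline` function and its `skip_training` flag are hypothetical; skipping training assumes the provided pretrained models are sufficient for evaluation, which this README does not state explicitly.

```python
import subprocess

# Script names and their order are taken from this README.
PIPELINE = ("data_processing.sh", "model_training.sh", "evaluation.sh")


def run_pipeline(scripts=PIPELINE, skip_training=False, run=subprocess.run):
    """Run each script with bash, in order, and return the scripts that ran.

    check=True makes subprocess.run raise CalledProcessError on a non-zero
    exit code, so a failed step prevents the later steps from starting.
    """
    executed = []
    for script in scripts:
        if skip_training and script == "model_training.sh":
            continue  # assumption: rely on the pretrained models instead
        run(["bash", script], check=True)
        executed.append(script)
    return executed


if __name__ == "__main__":
    run_pipeline(skip_training=True)
```

The `run` parameter exists only so the driver can be exercised without launching the real scripts.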

Known issues

  • Running with a beam size of 100 takes a significant amount of time.
  • We did not fix the random seed, so training results may differ slightly between runs. The results we use are the average of three training runs.
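
For readers who want repeatable runs, one common approach is to seed every random number generator before training and then average a metric over several seeds. The sketch below is not what the released scripts do (they leave the seed unfixed); `set_seed`, `average_over_runs`, and the seed values 0, 1, 2 are hypothetical, and `train_and_eval` stands for any function that trains a model and returns a single score.

```python
import random
from statistics import mean


def set_seed(seed):
    """Seed Python's RNG, plus NumPy and PyTorch when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # ignored when CUDA is unavailable
    except ImportError:
        pass


def average_over_runs(train_and_eval, seeds=(0, 1, 2)):
    """Call train_and_eval(seed) once per seed and return the mean score."""
    scores = []
    for seed in seeds:
        set_seed(seed)
        scores.append(train_and_eval(seed))
    return mean(scores)
```

Seeding alone does not guarantee bit-identical results on a GPU, so small differences between runs can remain.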

multifix's Issues

Some questions about the test dataset

Hi, thanks for your impressive work. I have some questions about the test dataset used in your paper.

  1. In the abstract, the paper says "On a set of 6,975 erroneous C programs from the DeepFix dataset, our approach achieves the state-of-the-art result in terms of full repair rate on the DeepFix dataset". However, Section 4.1 says "The dataset contains 37,415 correct programs (compiled without error) and 6,971 erroneous programs." The two numbers differ (6,975 vs. 6,971). Is this just a typo?
  2. Based on the description above and the released code, are you testing on all erroneous programs, including those that solve the same problems as the programs in the training set? For example, the correct programs in Prob10 are used for training while the erroneous programs in Prob10 are used for testing. If so, I think the comparison with DrRepair may be unfair, since they trained their model on the bin0/1/2/3 correct programs and tested it on the bin4 erroneous programs.

Looking forward to your reply and thanks for your contribution!
