Coder Social home page Coder Social logo

auto-code-rover's Introduction

AutoCodeRover: Autonomous Program Improvement

overall-workflow

ArXiv Paper

๐Ÿ‘‹ Overview

AutoCodeRover is a fully automated approach for resolving GitHub issues (bug fixing and feature addition) where LLMs are combined with analysis and debugging capabilities to prioritize patch locations ultimately leading to a patch.

On SWE-bench lite, which consists of 300 real-world GitHub issues, AutoCodeRover resolves ~22% of issues, improving over the current state-of-the-art efficacy of AI software engineers.

AutoCodeRover works in two stages:

  • ๐Ÿ”Ž Context retrieval: The LLM is provided with code search APIs to navigate the codebase and collect relevant context.
  • ๐Ÿ’Š Patch generation: The LLM tries to write a patch, based on retrieved context.

โœจ Highlights

AutoCodeRover has two unique features:

  • Code search APIs are Program Structure Aware. Instead of searching over files by plain string matching, AutoCodeRover searches for relevant code context (methods/classes) in the abstract syntax tree.
  • When a test suite is available, AutoCodeRover can take advantage of test cases to achieve an even higher repair rate, by performing statistical fault localization.

๐Ÿ—Ž arXiv Paper

AutoCodeRover: Autonomous Program Improvement [arXiv 2404.05427]

For referring to our work, please cite and mention:

@misc{zhang2024autocoderover,
      title={AutoCodeRover: Autonomous Program Improvement},
      author={Yuntong Zhang and Haifeng Ruan and Zhiyu Fan and Abhik Roychoudhury},
      year={2024},
      eprint={2404.05427},
      archivePrefix={arXiv},
      primaryClass={cs.SE}
}

โœ”๏ธ Example: Django Issue #32347

As an example, AutoCodeRover successfully fixed issue #32347 of Django. See the demo video for the full process:

acr-final.mp4

๐Ÿš€ Setup & Running

We recommend running AutoCodeRover in a Docker container. First of all, build and start the docker image:

docker build -f Dockerfile -t acr .
docker run -it acr

In the docker container, set the OPENAI_KEY env var to your OpenAI key:

export OPENAI_KEY=xx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Set up one or more tasks in SWE-bench

In the docker container, we need to first set up the tasks to run in SWE-bench (e.g., django__django-11133). The list of all tasks can be found in conf/swe_lite_tasks.txt.

The tasks need to be put in a file, one per line:

cd /opt/SWE-bench
echo django__django-11133 > tasks.txt

Then, set up these tasks by running:

cd /opt/SWE-bench
conda activate swe-bench
python harness/run_setup.py --log_dir logs --testbed testbed --result_dir setup_result --subset_file tasks.txt

Once the setup for this task is completed, the following two lines will be printed:

setup_map is saved to setup_result/setup_map.json
tasks_map is saved to setup_result/tasks_map.json

The testbed directory will now contain the cloned source code of the target project. A conda environment will also be created for this task instance.

If you want to set up multiple tasks together, put their ids in tasks.txt and follow the same steps.

Run a single task

Before running the task (django__django-11133 here), make sure it has been set up as mentioned above.

cd /opt/auto-code-rover
conda activate auto-code-rover
PYTHONPATH=. python app/main.py --enable-layered --model gpt-4-0125-preview --setup-map ../SWE-bench/setup_result/setup_map.json --tasks-map ../SWE-bench/setup_result/tasks_map.json --output-dir output --task django__django-11133

The output of the run can then be found in output/. For example, the patch generated for django__django-11133 can be found at a location like this: output/applicable_patch/django__django-11133_yyyy-MM-dd_HH-mm-ss/extracted_patch_1.diff (the date-time field in the directory name will be different depending on when the experiment was run).

Run multiple tasks

First, put the id's of all tasks to run in a file, one per line. Suppose this file is tasks.txt, the tasks can be run with

PYTHONPATH=. python app/main.py --enable-layered --model gpt-4-0125-preview --setup-map ../SWE-bench/setup_result/setup_map.json --tasks-map ../SWE-bench/setup_result/tasks_map.json --output-dir output --task-list-file tasks.txt

NOTE: make sure that the tasks in tasks.txt have all been set up in SWE-bench. See the steps above.

Using a config file

Alternatively, a config file can be used to specify all parameters and tasks to run. See conf/vanilla-lite.conf for an example. Also see EXPERIMENT.md for the details of the items in a conf file. A config file can be used by:

python scripts/run.py conf/vanilla-lite.conf

Experiment Replication

Please refer to EXPERIMENT.md for information on experiment replication.

โœ‰๏ธ Contacts

For any queries, you are welcome to open an issue.

Alternatively, contact us at: {yuntong,hruan,zhiyufan}@comp.nus.edu.sg.

Acknowledgements

This work was partially supported by a Singapore Ministry of Education (MoE) Tier 3 grant "Automated Program Repair", MOE-MOET32021-0001.

auto-code-rover's People

Contributors

crhf avatar yuntongzhang avatar zhiyufan avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.