
xgdsmileboy / simfix


Automatically fix programs by leveraging existing patches from other projects and similar code snippets from the faulty project.

License: GNU General Public License v2.0

Java 96.12% Shell 1.47% Python 1.17% R 1.07% Perl 0.16% Dockerfile 0.02%
automatic debugging-tool

simfix's Introduction

NOTE: SimFix depends on several Defects4J commands. Therefore, to run experiments on projects outside Defects4J, please first adapt the project to the Defects4J framework.
Alternatively, if you don't want to compile SimFix yourself, you can download the replication package [HERE] directly, which saves the time needed to set up the running environment.

SimFix

I. Introduction

SimFix is an automatic program repair technique that leverages existing patches from other projects and similar code snippets in the same project to generate patches. The following figure shows the workflow of our approach.

Figure: The workflow of this technique.

Mining Stage

  1. Mine repair patterns from existing open-source projects, after which we can obtain a set of frequent repair patterns. Those patterns can be reused for other repairing scenarios as well.

Repairing Stage

  1. Fault Localization : obtain a ranking list of candidate faulty statements and extract corresponding code snippets.
  2. Donor Snippet Identification : compute the similarity between the faulty code snippet and each candidate similar code snippet using three similarity metrics, which yields a list of candidate similar snippets in descending order of similarity.
  3. Variable Mapping : establish the mapping between variables in the faulty and similar code snippets by leveraging similarity metrics, producing a mapping table; the variables in the donor code snippet are then replaced with the corresponding variables.
  4. Modification Extraction and Intersection : extract code modifications to the faulty code snippet via AST (Abstract Syntax Tree) matching and differencing against the donor snippet; the frequent patterns from the mining stage are then intersected with those modifications to further rule out invalid ones.
  5. Patch Generation & Validation : generate repair patches by combining and ranking those modifications and applying them to the faulty code snippet, then use the test suite to validate the candidate patches until a correct patch is found or the timeout is reached.
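
To make the ranking in step 2 concrete, here is a minimal sketch of ordering donor candidates by descending similarity. It is not SimFix's implementation: the class names, the three score fields, and the choice to combine the scores by a plain sum are all assumptions made for illustration, since this README does not say what the three metrics are or how they are combined. The code is kept Java 7 compatible to match the environment below.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical holder: one candidate donor snippet with three similarity scores.
final class DonorCandidate {
    final String id;
    final double scoreA;
    final double scoreB;
    final double scoreC;

    DonorCandidate(String id, double scoreA, double scoreB, double scoreC) {
        this.id = id;
        this.scoreA = scoreA;
        this.scoreB = scoreB;
        this.scoreC = scoreC;
    }

    // Assumption: the overall similarity is the plain sum of the three scores.
    double similarity() {
        return scoreA + scoreB + scoreC;
    }
}

final class DonorRanking {
    // Returns a new list sorted by descending similarity; the input is not modified.
    static List<DonorCandidate> rank(List<DonorCandidate> candidates) {
        List<DonorCandidate> sorted = new ArrayList<DonorCandidate>(candidates);
        Collections.sort(sorted, new Comparator<DonorCandidate>() {
            @Override
            public int compare(DonorCandidate a, DonorCandidate b) {
                return Double.compare(b.similarity(), a.similarity());
            }
        });
        return sorted;
    }
}
```

The first element of the returned list is the candidate that would be tried first as a donor.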

If you want to use this project, please cite our technical paper published at ISSTA'18.

@inproceedings{Simfix:2018,
    author   = {Jiang, Jiajun and Xiong, Yingfei and Zhang, Hongyu and
                Gao, Qing and Chen, Xiangqun},
    title    = {Shaping Program Repair Space with Existing Patches and Similar Code},
    series   = {ISSTA},
    year     = {2018},
    location = {Amsterdam, Netherlands},
    doi      = {10.1145/3213846.3213871},
} 

II. Environment

  • OS: Linux (Tested on Ubuntu 16.04.2 LTS)
  • JDK: Oracle jdk1.7 (important!)
  • Download and configure the Defects4J (branch fee5ddf020) running environment.
  • Configure the following path.
    • DEFECTS4J_HOME="home_of_defects4j"

III. How to run

SimFix was originally developed as an Eclipse Java project; you can simply import this project into your workspace and run it as an ordinary Java program. The main class is cofix.main.Main; for the running options, please refer to Running Options.

Before running

  • Unzip the file sbfl/data.zip to sbfl/data (used for fault localization).

  • Use the command line provided by Defects4J to check out a buggy version of a benchmark project for testing.

    Example: defects4j checkout -p Chart -v 1b -w ${buggy_program_path}/chart/chart_1_buggy

    NOTE : the path of the buggy version must follow this layout:

    …/projectName/projectName_id_buggy => Example: /home/user/chart/chart_1_buggy
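
As a small illustration of this layout, the helper below builds the expected directory from a home directory, a project name, and a bug identifier. The class and method names are hypothetical and not part of SimFix; the sketch assumes a Linux-style "/" separator, as in the tested environment.

```java
// Hypothetical helper, not part of SimFix: builds the directory layout required above.
final class BuggyPathLayout {
    // Returns <projHome>/<projName>/<projName>_<bugId>_buggy.
    static String buggyDir(String projHome, String projName, int bugId) {
        // Drop one trailing slash so the result never contains "//".
        String home = projHome.endsWith("/")
                ? projHome.substring(0, projHome.length() - 1)
                : projHome;
        return home + "/" + projName + "/" + projName + "_" + bugId + "_buggy";
    }
}
```

For example, buggyDir("/home/user", "chart", 1) yields /home/user/chart/chart_1_buggy, matching the example path above.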

Step 1, Build The Project

Originally, SimFix was developed as an Eclipse Java project; you can simply import the project into your workspace. The class cofix.main.Main is the entry point of the whole program.

Step 2, Running Options

Our prototype of SimFix needs three input options for running.

  • --proj_home : the home directory of the buggy benchmark program. (${buggy_program_path} in the example)

  • --proj_name : the project name of the buggy benchmark program. (chart in the example)

  • --bug_id : the identifier of the buggy program. (1 in the example)

    • The --bug_id option supports multiple formats:

      single_id : repair a single bug, e.g., 1.

      startId-endId : repair a series of bugs with consecutive identifiers, e.g., 1-3.

      single_id,single_id,single_id : repair several chosen bugs of the specific program, e.g., 1,5,9.

      all : repair all buggy versions of a specific project, i.e., all.

    Usage: --proj_home=${proj_home} --proj_name=${proj_name} --bug_id=${bug_id}
    Example: --proj_home=/home/user --proj_name=chart --bug_id=1
    Another: --proj_home=/home/user --proj_name=chart --bug_id=1,4,8
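
The four --bug_id formats can be expanded into a list of identifiers roughly as follows. This is a sketch under assumptions, not the parser SimFix ships: the class name, the method name, and the totalBugs parameter (the number of bugs in the project, needed to expand all) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for the --bug_id formats listed above; not SimFix's own code.
final class BugIdParser {
    // totalBugs is an assumed input: the number of bugs in the project, used for "all".
    static List<Integer> parse(String spec, int totalBugs) {
        List<Integer> ids = new ArrayList<Integer>();
        String s = spec.trim();
        if (s.equals("all")) {
            for (int i = 1; i <= totalBugs; i++) {
                ids.add(i);
            }
        } else if (s.contains("-")) {
            // startId-endId: an inclusive range of consecutive identifiers.
            String[] range = s.split("-");
            if (range.length != 2) {
                throw new IllegalArgumentException("Bad range: " + spec);
            }
            int start = Integer.parseInt(range[0].trim());
            int end = Integer.parseInt(range[1].trim());
            if (start > end) {
                throw new IllegalArgumentException("Range start exceeds end: " + spec);
            }
            for (int i = start; i <= end; i++) {
                ids.add(i);
            }
        } else {
            // A single id or a comma-separated list of ids.
            for (String part : s.split(",")) {
                ids.add(Integer.parseInt(part.trim()));
            }
        }
        return ids;
    }
}
```

Malformed numbers surface as NumberFormatException from Integer.parseInt, which the caller can report as a usage error.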

OPTION 1 : Run within Eclipse (please use an old version; tested on Mars, which depends on Java 7).

  • From the Main class:

    Run As → Run Configurations… → Arguments : set the above arguments as Program Arguments.

OPTION 2 : Run from the command line.

  • We also provide a runnable jar file of SimFix, simfix.jar, in the home folder of the project.

    Make the home directory of the SimFix project your current working directory and then run:

    java -jar simfix.jar --proj_home=/home/user --proj_name=chart --bug_id=1

Step 3, Result Analysis

After the repair finishes, there will be two additional folders: log and patch.

  • log : debug output, including the buggy statements already tried, the patches, and the reference code snippets used for correct patch generation.

  • patch : a single source file repaired by SimFix that can pass the test suite. In the source file, you can find the patch, which is formatted as (example of Chart_1):

    // start of generated patch
    int index=this.plot.getIndexOf(this);
    CategoryDataset dataset=this.plot.getDataset(index);
    if(dataset==null){
    return result;
    }
    // end of generated patch
    /* start of original code
            int index = this.plot.getIndexOf(this);
            CategoryDataset dataset = this.plot.getDataset(index);
            if (dataset != null) {
                return result;
            }
     end of original code*/
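
Because the generated patch is delimited by the two comment markers shown above, it can be pulled out of the repaired source file with a simple scan. The sketch below assumes the markers appear exactly as in the Chart_1 example; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the lines between the generated-patch markers.
final class PatchExtractor {
    static final String START = "// start of generated patch";
    static final String END = "// end of generated patch";

    // Returns the patch lines (markers excluded); empty if no marker pair is found.
    static List<String> extract(List<String> sourceLines) {
        List<String> patch = new ArrayList<String>();
        boolean inside = false;
        for (String line : sourceLines) {
            String trimmed = line.trim();
            if (trimmed.equals(START)) {
                inside = true;
            } else if (trimmed.equals(END)) {
                inside = false;
            } else if (inside) {
                patch.add(line);
            }
        }
        return patch;
    }
}
```

The input lines can be read on Java 7 with java.nio.file.Files.readAllLines(path, StandardCharsets.UTF_8).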

IV. Evaluation Result

In total, SimFix successfully repairs 34 of the 357 bugs in the Defects4J v1.0 benchmark, while generating 22 plausible but incorrect patches. The following table and Venn diagram compare SimFix with existing approaches. More details are presented in the sub-folder final (latest).

Table: The comparison with existing approaches.
Figure: Intersection among different approaches.

V. Structure of the project

  |--- README.md   :  user guidance
  |--- bin         :  binary code
  |--- d4j-info    :  defects4j information
  |--- doc         :  document
  |--- final       :  evaluation result
  |--- lib         :  dependent libraries
  |--- sbfl        :  fault localization tool
  |--- src         :  source code
  |--- test        :  test suite

All suggestions are welcome.

simfix's People

Contributors

claudeyj, tomldickson, xgdsmileboy


simfix's Issues

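Repair failed

Hello, while following the tutorial to set things up, we also ran into the problem of Lang and Chart failing to be repaired. I saw your answer saying it was an environment problem; could you explain in detail which part of the environment was at fault?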

Generation of subsequent patches

Hello,

as far as I understood the SimFix paper, the evaluation of SimFix was performed on the first generated patch which passed the test suite. I would like to take a look at the patches which are generated beyond that. Can the configuration switch Constant.PATCH_NUM be increased to achieve the generation of subsequent patches?

best regards
Florian

Does SimFix work under Java-8?

Hi,

The README.md file mentions that SimFix requires Java-7; would it be incorrect/wrong to run it under Java-8?

--
Best,
Jose

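Hello, a question about Lang and Chart failing to be repaired

Hello, we have configured the environment according to your README, and the project runs after being imported into Eclipse. Several of us ran Lang and Chart and all attempts failed (Failed to repair); we don't know where the problem is. We would appreciate your advice. Thanks~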

SimFix crashed in some bugs

Hi Prof. Jiang, I'm using SimFix to fix some new bugs, and it's really powerful and easy to use. But I found a potential issue with SimFix and hope to confirm it with you. The workflow of SimFix is G & V, i.e., generation and validation. In the validation stage, for each patch, SimFix executes the failing tests first. If any of them still fails, it skips to the next patch. Otherwise it continues to execute the whole test suite. Here

if(!Runner.testSingleTest(_subject, testinfo[0], testinfo[1])){
splits the failing test information into testClass and testMethod, which is reasonable in most cases, because that's the format of the failing tests provided by defects4j in most cases. However, in some cases that may not hold. For example, the program for logging failing tests information: https://github.com/rjust/defects4j/blob/29d84487b1d3828a4ba7bdd0ed8110099c821883/framework/lib/formatter/src/edu/washington/cs/mut/testrunner/Formatter.java#L100 will output testClass::testMethod in most cases, but when the bug leads to the JUnit Initialization Failure(which happens in some Time projects), it will only output testClass. In this case L456 in Repair.java will trigger a crash due to index error (list out of range). If you can confirm this then I'd be happy to create a pull request to fix this issue so that SimFix won't fail on similar cases. Thanks a lot for taking a look!
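
One possible shape for the defensive parsing this report describes is sketched below. It is not the fix that was applied in SimFix: the class and its members are hypothetical, and how a class-only entry should then be executed is left to the caller.

```java
// Hypothetical value class: a failing test given as "testClass::testMethod" or just "testClass".
final class FailingTestSpec {
    final String testClass;
    final String testMethod; // null when the entry names only a class

    private FailingTestSpec(String testClass, String testMethod) {
        this.testClass = testClass;
        this.testMethod = testMethod;
    }

    // Splits on the first "::" only, so a missing method never causes an index error.
    static FailingTestSpec parse(String line) {
        String[] parts = line.trim().split("::", 2);
        String method = (parts.length == 2 && !parts[1].isEmpty()) ? parts[1] : null;
        return new FailingTestSpec(parts[0], method);
    }

    boolean hasMethod() {
        return testMethod != null;
    }
}
```

A caller could check hasMethod() and fall back to running the whole test class when it returns false, instead of indexing into the split array unconditionally.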

Questions Regarding "Frequent Modifications Mining"

Hello, Jiang!
My name is Tae Eun Kim.
I am a Computer Science undergraduate student in Handong Univ., Korea.

I am currently interested in APR and recently read your paper, the one about SimFix.
I am also interested in using patches of similar codes, since it can reduce the search space, complementing the limitation of search-based APR. So it is not a surprise that SimFix caught my attention. It was a mind blowing approach on effective exploitation of search space.

After reading your paper, I wanted to see how it works, so I was going through the implementation you opened on Github.
However, I came across a question.

I couldn't find the mining stage of "frequent modifications" on-line, neither the implementation of intersection of S1 and S2, as you mentioned in the paper.

So I was wondering if you can point out where this stage takes place.

Could you help me find answers to my questions?

With all the respect,
Kim.

How to obtain d4j-info for newer bugs?

Hello,

I would like to run SimFix on bugs that were added with later versions of Defects4J.
When trying to run such a bug, SimFix requires some additional information from the d4j-info directory. Where does this information come from or how can I generate/obtain it?
Thank you for your time.

Best regards
Florian

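Error when running the jar

I ran the following in an Ubuntu virtual machine:
java -jar simfix.jar --proj_home=/home/baoqian/buggy --proj_name=chart --bug_id=1

and got this error:
UNKNOWN ASTNode type : catch (NoSuchMethodException e) {
result=object;
}
What is the cause of this? The Java version is 1.7. I know you are busy, but a brief answer would be much appreciated. Thank you!!!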

Timeout different for different bugs

Hi!

I was running SimFix on the newer version of Defects4J and, for some reason, it didn't generate any patches for any defects. A quick skim through the logs shows that the TIMEOUT value used differs between defects. For example, for Math 1, TIMEOUT = 1080000, while for Chart 1, TIMEOUT = 18000000, and for Time 26, TIMEOUT = 480000.
Is this intentional?

Question about create a patch file

Hi,

We have tried to run your tool on our side with the Defects4J data set. So far, we can make the code run, and we have downloaded the Defects4J data and the buggy line information. But when using your code, there are always messages like "No block found!", and it cannot create the patch file because it cannot find the buggy information.

The command we are using is "java -jar simfix.jar --proj_home=/home/user --proj_name=chart --bug_id=1". Is there any point we missed or could you please provide us some suggestions on this problem?

Thanks a lot.

Question about code snippet size (default 10)

Sorry for interrupting you with this question.

As shown in the SimFix paper, the code snippet size is 10. (i.e., given faulty line n, the snippet ranges from n-5 to n+4 ).

But when I use SimFix to repair Math 57, I got such log:

Sat Apr 20 13:15:33 PDT 2019 org.apache.commons.math.stat.clustering.KMeansPlusPlusClusterer,321
0 s for the statement.

Sat Apr 20 13:15:33 PDT 2019 **org.apache.commons.math.stat.clustering.KMeansPlusPlusClusterer,196**

----------------------------------------
----------------------------------------
Pass Single Test : [=>0,0]
while(resultSet.size()<k){
double sum=0;
for(int i=0;i<pointSet.size();i++){
 final T p=pointSet.get(i);
 final Cluster<T> nearest=getNearestCluster(resultSet,p);
 final double d=p.distanceFrom(nearest.getCenter());
sum+=d*d;
dx2[i]=sum;
}
 final double r=random.nextDouble()*sum;
for(int i=0;i<dx2.length;i++){
if(dx2[i]>=r){
 final T p=pointSet.remove(i);
resultSet.add(new Cluster<T>(p));
break;
}
}
}

Time : 19/04/20 13:17
----------------------------------------

----------------------------------------

This log indicates that SimFix chose org.apache.commons.math.stat.clustering.KMeansPlusPlusClusterer,196 as the buggy line, but the whole code snippet spans from KMeansPlusPlusClusterer:175 to KMeansPlusPlusClusterer:196. This is not what I expected and also exceeds the expected snippet size.

Therefore, I would like to ask whether there is something I missed in the process.

Any help would be much appreciated.

Regards,
Dale.

Failed to repair chart1/20

Hi! Both Chart 1/20 failed to repair. Is there any possible reason why that happened? We use Ubuntu 20.04, jdk1.7 and Defects4J 1.1.0. Thank you very much!

A potential bug in NodeUtils.java implementation

In the method listNodeMatching() of NodeUtils.java, I find a potential bug (if I do not misunderstand): as is shown in the following code snippet of listNodeMatching(), the initial value of index is supposed to be -1 rather than 0. If it is 0, the if(index != -1){ condition will always be true. I am not sure how much impact this would bring. Therefore, I'd like to present such issue for your check and consideration. Thank you.

for(int i = 0; i < otherLen; i++){
				if(record[i] == -1){
					int index = 0;
					for(int j = i + 1; j < otherLen; j ++){
						if (record[j] != -1) {
							index = record[j];
							break;
						}
					}
					if(index != -1){
						Node insert = tarNodeList.get(i);

apply new data

hi,

I really like your work! I would like to apply this project to my own data for the mining and evaluation after that.
I need help on how can I apply the mining stage to my data, so any instruction for how can I do that?
Thanks a lot!

Regard,
Haifan Wu
