codematcher's Introduction

Usage

1. codematcher_analysis.py computes token frequencies in the original dataset. For the GitHub competition dataset I wrote github_analysis along the same lines; just use that one.

  • stat_method(), get statistics of the method names in the codebase.
  • stat_parsed(), get statistics of the method bodies in the codebase.
  • analyze_method(), get the method-name vocabulary from the output of stat_method().
  • analyze_parsed(), get the method-body vocabulary from the output of stat_parsed().
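
The statistics-then-vocabulary flow above can be illustrated with a minimal sketch. It is not the code in codematcher_analysis.py or github_analysis: the helper names (split_identifier, count_tokens, build_vocabulary) and the identifier-splitting rule are assumptions made for illustration.

```python
import re
from collections import Counter


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase tokens (assumed rule)."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def count_tokens(method_names):
    """Count how often each token occurs across all method names."""
    counts = Counter()
    for name in method_names:
        counts.update(split_identifier(name))
    return counts


def build_vocabulary(counts, min_count=1):
    """Return tokens seen at least min_count times, most frequent first."""
    return [tok for tok, c in counts.most_common() if c >= min_count]


if __name__ == "__main__":
    names = ["readFile", "writeFile", "readLine", "parseXMLFile"]
    counts = count_tokens(names)
    print(counts)
    print(build_vocabulary(counts, min_count=2))
```

The same two helpers could be applied to method bodies by feeding them the identifiers extracted from each body instead of the method names.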

5. Preprocess queries and parsed methods with codematcher_parse.py

  • query_parse(), preprocess queries and get metadata for each token in queries.
  • query_parse_tree(), generate query keywords with importance order.
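
As a rough illustration of turning a query into keywords ordered by importance, the sketch below drops stop words and ranks the remaining tokens. The stop-word list, the function names, and the ranking heuristic (rarer tokens in the codebase first, unknown tokens last) are all assumptions for this example; the rule actually used by query_parse_tree() may differ.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "to", "of", "in", "how", "do", "i", "for", "from", "with"}


def query_tokens(query):
    """Lowercase the query, split it into alphanumeric tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", query.lower()) if t not in STOP_WORDS]


def rank_keywords(tokens, codebase_counts):
    """Order unique tokens by an assumed importance heuristic.

    Tokens that occur in the codebase come first, rarer ones before common
    ones; tokens that never occur in the codebase go last.
    """
    unique = list(dict.fromkeys(tokens))  # de-duplicate, keep first-seen order
    return sorted(
        unique,
        key=lambda t: (codebase_counts.get(t, 0) == 0, codebase_counts.get(t, 0)),
    )


if __name__ == "__main__":
    counts = {"read": 50, "file": 120}  # e.g. taken from the vocabulary statistics
    tokens = query_tokens("How to read a file in Java")
    print(rank_keywords(tokens, counts))
```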

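6. Index the codebase with codematcher_elasticsearch.py. Elasticsearch is already bundled in the project directory; just unzip it, add the bin directory under its root to your environment variables, and start it.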

  • format_data(), save reprocessed method components to a JSON data structure.
  • create_index(), create the codebase index for Elasticsearch. For simplicity, use create_simple_index() directly.
  • fill_data(), fill the formatted data into Elasticsearch. For simplicity, use fill_simple_data() directly.
  • fuzzy_search(), perform fuzzy search on the indexed code with parsed queries.
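
To make the index and search steps more concrete, the sketch below builds the JSON request bodies one might send to Elasticsearch. It only constructs dictionaries and does not contact a server. The field names (method_name, method_body), the typeless mapping layout (Elasticsearch 7 or later), and the use of a regexp query are assumptions; the index that create_simple_index() actually creates is not shown in this README. The sketch also assumes method names are stored lowercased, since a regexp query on a keyword field must match the whole term.

```python
import json


def build_simple_index_body():
    """Assumed mapping: exact-term method names, full-text method bodies."""
    return {
        "mappings": {
            "properties": {
                "method_name": {"type": "keyword"},
                "method_body": {"type": "text"},
            }
        }
    }


def build_fuzzy_query(keywords, field="method_name", size=10):
    """Build a regexp query matching terms that contain the keywords in order."""
    for k in keywords:
        if not k.isalnum():
            raise ValueError(f"keyword must be alphanumeric: {k!r}")
    pattern = ".*" + ".*".join(k.lower() for k in keywords) + ".*"
    return {"size": size, "query": {"regexp": {field: {"value": pattern}}}}


if __name__ == "__main__":
    print(json.dumps(build_simple_index_body(), indent=2))
    print(json.dumps(build_fuzzy_query(["read", "file"]), indent=2))
```

With a running Elasticsearch instance, these bodies could be passed to the index-creation and search endpoints through whichever client the project already uses.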

7. Rerank search results with codematcher_rerank.py

  • reranking(), reorder the search results returned by fuzzy_search() and generate the final results in 'search.txt'.
  • '_search_codematcher_parsed_lzw.txt' in the Baidu Pan link shows the final result.
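
The reranking idea can be sketched as follows. This is not the scoring used by reranking() in codematcher_rerank.py: the candidate layout (dictionaries with name_tokens and body_tokens) and the score (fraction of query keywords found in the method name, with the method body as a tie-breaker) are assumptions chosen to keep the example short.

```python
def overlap(keywords, tokens):
    """Fraction of query keywords that appear among the given tokens."""
    if not keywords:
        return 0.0
    token_set = set(tokens)
    return sum(1 for k in keywords if k in token_set) / len(keywords)


def rerank(keywords, candidates):
    """Sort candidates by name overlap first, then by body overlap (assumed scoring)."""
    def score(candidate):
        return (
            overlap(keywords, candidate["name_tokens"]),
            overlap(keywords, candidate["body_tokens"]),
        )
    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    hits = [
        {"id": "a", "name_tokens": ["open", "stream"], "body_tokens": ["read", "file"]},
        {"id": "b", "name_tokens": ["read", "file"], "body_tokens": []},
        {"id": "c", "name_tokens": ["read", "line"], "body_tokens": ["file"]},
    ]
    for hit in rerank(["read", "file"], hits):
        print(hit["id"])
```

The reordered list would then be written out one result per line, which is the role 'search.txt' plays in the step above.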

codematcher's People

Contributors

leorpoirot

codematcher's Issues

Experiment result

May I ask you a question?
Did you follow the experimental procedure in the CodeMatcher paper to verify that your results are consistent with those reported in the paper, that is, on the MRR evaluation metric?
Thanks a lot.
