GithubEvaluator

Given a GitHub username, evaluate all of that user's repositories, select the most complex ones, and generate GPT4 reports for them.

Metrics to cover

Given a repository, we want to evaluate its Complexity, Code Quality, Useful/Impactful-ness, and Uniqueness/Toughness.

The following are possible metrics:

  • Complexity:
    • Number of languages
    • Amount of bytes per language
    • Number of dependencies*
    • Number of files
    • Number of commits
    Each of the above metrics can be converted into a numeric score using defined lower and upper thresholds for that metric. The final score can be a weighted sum of all these scores.
  • Code Quality:
    • Code Quality from Repository Map
    • Readability of README
    The Repository Map can be used to create a comprehensive GPT4 report covering naming conventions, structure, modularity and understandability. Readability of the README can be evaluated by using GPT4 to generate a report on the README.
  • Useful/Impactful-ness:
    • Number of stars
    • Number of forks
    • Number of watchers
    • Number of pull requests
    • Number of issues
    As with complexity, we set thresholds for each of these metrics and arrive at a score for each one.
  • Uniqueness/Toughness:
    • README + Description
    This is a subjective metric. It can be obtained by using GPT4 to generate a report from the README and description that judges the difficulty and uniqueness of the implementation.
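
The threshold-based scoring described for Complexity and Useful/Impactful-ness can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the thresholds and the weights are all assumptions made for the example.

```python
def threshold_score(value, lower, upper):
    """Map a raw metric onto [0, 1] using a lower and an upper threshold."""
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    # Values outside the thresholds are clamped to the nearest bound.
    clamped = max(lower, min(value, upper))
    return (clamped - lower) / (upper - lower)


def weighted_sum(scores, weights):
    """Combine per-metric scores; weights are normalised so they sum to 1."""
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total


# Hypothetical thresholds and weights, chosen only for illustration.
THRESHOLDS = {"languages": (1, 5), "files": (5, 205), "commits": (0, 100)}
WEIGHTS = {"languages": 1.0, "files": 2.0, "commits": 1.0}

raw = {"languages": 3, "files": 105, "commits": 250}
scores = {name: threshold_score(raw[name], *THRESHOLDS[name]) for name in raw}
complexity = weighted_sum(scores, WEIGHTS)
print(scores)
print(complexity)
```

Clamping keeps a single outlier, such as a repository with thousands of commits, from dominating the final score.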

The goal is to generate the Uniqueness/Toughness and Code Quality evaluations with GPT4 only for the top 5 repositories, as ranked by the other metrics.
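
Selecting the repositories that go on to the GPT4 stage might look like the sketch below. It assumes each repository already carries a numeric score; the `select_top` helper, the `score` key and the sample data are hypothetical.

```python
import heapq


def select_top(repos, n=5):
    """Return the n highest-scoring repositories, best first."""
    return heapq.nlargest(n, repos, key=lambda repo: repo["score"])


# Made-up scores for illustration only.
ranked_input = [
    {"name": "alpha", "score": 0.42},
    {"name": "beta", "score": 0.91},
    {"name": "gamma", "score": 0.10},
    {"name": "delta", "score": 0.77},
    {"name": "epsilon", "score": 0.63},
    {"name": "zeta", "score": 0.58},
    {"name": "eta", "score": 0.05},
]

shortlist = select_top(ranked_input)
for repo in shortlist:
    print(repo["name"], repo["score"])
```

Only the repositories in the shortlist would then be sent to GPT4, which keeps the number of API calls small.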

NOTE *: The number-of-dependencies function is currently commented out. It needs more testing to ensure that it generalises.

How to run

  • Clone the repo
  • Install the requirements using pip install -r requirements.txt
  • Create a file .env and add the following lines
GITHUB_TOKEN='<GITHUB_TOKEN>'
OPENAI_API_KEY='<OPENAI_API_KEY>'

To generate a GitHub token, follow the instructions here

To generate an OpenAI API key, follow the instructions here

  • Run the script using
python main.py <github_username>
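
For reference, the two values in .env could be read at startup roughly as shown below. This sketch uses only the standard library, and the `load_env` and `require` helpers are hypothetical. The project itself may well use a package such as python-dotenv instead.

```python
import os


def load_env(path=".env"):
    """Parse simple KEY='value' lines from a .env file into a dict."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines, comments and anything that is not KEY=value.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("'\"")
    return values


def require(values, name):
    """Return a setting from the parsed file or the environment, or fail loudly."""
    value = values.get(name) or os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env")
    return value


# Typical use (assumes a .env file exists in the working directory):
#   settings = load_env()
#   github_token = require(settings, "GITHUB_TOKEN")
#   openai_key = require(settings, "OPENAI_API_KEY")
```

Failing early with a clear message is usually friendlier than an authentication error halfway through a run.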

The script generates a folder named ReportCard containing the reports for the top 5 repositories. Each report covers the following for its repository:

  • Complexity
  • Code Quality
  • Useful/Impactful-ness

Customization

The prompts used to generate the reports are in the file prompts.py and can be modified to produce different reports. Experimenting with the prompts is highly recommended to get the best results.

The variables that govern the thresholds for each metric are in the file constants.py. They can be modified to change the number of repositories selected, the weight given to impact versus complexity, and the upper and lower bounds used to generate the scores.
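
As an illustration of how a weight between impact and complexity could enter the ranking, consider the sketch below. The names `TOP_N`, `IMPACT_WEIGHT` and `overall_score` and their values are assumptions for this example. The actual contents of constants.py may differ.

```python
# Hypothetical settings; the real constants.py may use other names and values.
TOP_N = 5            # number of repositories to select
IMPACT_WEIGHT = 0.4  # share of the overall score given to impact


def overall_score(complexity, impact, impact_weight=IMPACT_WEIGHT):
    """Blend complexity and impact scores, both assumed to lie in [0, 1]."""
    if not 0.0 <= impact_weight <= 1.0:
        raise ValueError("impact_weight must be between 0 and 1")
    return impact_weight * impact + (1.0 - impact_weight) * complexity


print(overall_score(complexity=0.8, impact=0.5))
```

Raising `IMPACT_WEIGHT` favours popular repositories, while lowering it favours larger and more involved ones.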

The token limit used to avoid hitting the maximum context length error is set in threshold.py.
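
One simple way to stay under such a limit is to truncate the text before sending it, as sketched below. The helper names are hypothetical, and the four-characters-per-token figure is only a rough rule of thumb for English text. An exact count would require the model's own tokenizer.

```python
CHARS_PER_TOKEN = 4  # rough heuristic for English text, not an exact count


def estimate_tokens(text):
    """Approximate the token count of a string, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def truncate_to_limit(text, token_limit):
    """Cut text so that its estimated token count does not exceed token_limit."""
    if token_limit < 0:
        raise ValueError("token_limit must be non-negative")
    return text[: token_limit * CHARS_PER_TOKEN]


readme = "word " * 1000
shortened = truncate_to_limit(readme, token_limit=100)
print(estimate_tokens(readme), estimate_tokens(shortened))
```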

To Note

The script only considers code repositories authored by the given GitHub user. Contributions to other repositories, including open source projects, are not considered.

This can be explored and implemented in the future.
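
The restriction to authored repositories could be applied as in the sketch below. It assumes the repository list comes from the GitHub REST API, whose repository objects include a boolean `fork` field and an `owner.login` field. Treating forks as not authored by the user is an assumption of this example, and `authored_repos` is a hypothetical helper.

```python
def authored_repos(repos, username):
    """Keep repositories owned by the user that are not forks."""
    wanted = username.lower()
    return [
        repo
        for repo in repos
        if not repo.get("fork", False)
        and repo.get("owner", {}).get("login", "").lower() == wanted
    ]


# Trimmed-down, made-up repository objects for illustration.
sample = [
    {"name": "own-project", "fork": False, "owner": {"login": "octocat"}},
    {"name": "forked-lib", "fork": True, "owner": {"login": "octocat"}},
    {"name": "org-project", "fork": False, "owner": {"login": "some-org"}},
]

kept = authored_repos(sample, "Octocat")
print([repo["name"] for repo in kept])
```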

Credits

Credit for the aider module and the repomap function used here goes to paul-gauthier.

Check out his explanation of how to improve GPT4's visibility into a code repository using CTags here

Improvements

Pull requests and issues to improve the script are very welcome. Some possible improvements are:

  • Improve the prompts used to generate the reports
  • Improve the thresholds for the metrics used to evaluate the repositories
  • Explore and implement contributions made to other repositories
  • Explore and implement other metrics to evaluate repositories

Contributors

sumba101
