labhr / octohatrack
Show _all_ the contributors to a GitHub repository.
License: BSD 3-Clause "New" or "Revised" License
https://developer.github.com/changes/2016-06-07-reactions-api-update/
Still in preview mode, but once it's official, octohatrack should be updated to suit.
Make the section of the README that self-reports this repo's contributors update itself.
Similar to the workflow in glasnt/glasnt/.github/workflows/build.yaml, probably on a weekly schedule.
The update process should replace a specially named section in the README, and commit it to the repo. If there's no change, then no commit is made.
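A minimal sketch of the replace-and-compare step described above, assuming the section is delimited by a pair of HTML comment markers. The marker names and both function names are hypothetical; the issue only asks for "a specially named section".

```python
import re

# Hypothetical marker names; the issue does not specify them.
START = "<!-- contributors:start -->"
END = "<!-- contributors:end -->"


def replace_section(readme_text, new_body):
    """Return readme_text with the text between the two markers replaced."""
    pattern = re.compile(
        re.escape(START) + r".*?" + re.escape(END), flags=re.DOTALL
    )
    replacement = START + "\n" + new_body.strip() + "\n" + END
    # A function replacement avoids backslash-escape processing of new_body.
    updated, count = pattern.subn(lambda _m: replacement, readme_text, count=1)
    if count == 0:
        raise ValueError("marker section not found in README")
    return updated


def needs_commit(readme_text, new_body):
    """True only when the regenerated section differs from the current text."""
    return replace_section(readme_text, new_body) != readme_text
```

A scheduled workflow could call needs_commit first and skip the commit step when it returns False, which covers the "no change, no commit" requirement.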
I did a quick look and it looks like the project doesn't currently include/count wiki edits as a form of contribution. Did I miss something? Is this planned? If not, I could help implement that.
I just let octohatrack run on a moderately large repository and the cache file easily reached 1MB after two runs, which is a bit large for inspection (especially since dumping everything in one line makes syntax highlighting take a while…). I think reducing the amount of data written to the cache file might be a good idea :)
Following the GitHub Satellite 2019 keynote, I want to review what that native functionality adds, and whether octohatrack adds any value any more.
*dun dun dun*
Hi there,
I'm trying to run this on fchollet/keras. It's an active project. I am using a github token to increase the number of requests, but it keeps running out of tokens anyway. I'm not sure if the results from previous runs are cached (re-runs will help) or not (re-runs don't help).
I could use some advice on how to run this on a bigger project.
Thanks,
-Tennessee
It'd be nice to have pretty colours in the CLI without having to re-learn TERM colour codes.
Absolutely minor, and much lower priority than anything else open.
Somehow, in my packaging messes, I have removed the ability to run octohat from the local folder. I need to work out how to make ./octohat.py user/repo work for local testing.
I think it's a case of making an octohat.py file that does the same as the packaging entrypoint.
This would be very fancy.
This is totally a wishlist item.
we have an internal Bugzilla instance, it's even linked into other bug systems (big company)
There is an upcoming REST API to Bugzilla, but in the meantime there's the old XMLRPC one ( https://www.bugzilla.org/docs/4.4/en/html/api/Bugzilla/WebService/Server/XMLRPC.html ).
Through that, it should be possible to search for bugs on a particular product and extract who's filing and commenting on them.
For people with commits in a repo that do not have an associated GitHub account (e.g. git-based commits made outside of GitHub, particularly for repos imported that predate GitHub), octohatrack should ensure they are listed in the contributors list.
Firstly - thank you for making this! My team and I really want to recognise all types of contributions to our project and this seems like a great way to do just that!
I've just installed octohatrack on a windows machine (successfully) and it took me a little while to figure out a couple of the options so I thought I'd offer to update the README.md with the info I was looking for to make it easier for the next person.
Specifically, the README.md doesn't explain the -g, --generate-html option, which meant I was scratching my head for a little bit trying to figure out what I was supposed to do with the .json file that was generated. It wasn't hard to figure out, but I just thought I could add a little note to point out how useful it is!
Then there was one other question: am I right that a simple use case would be to download this locally and then every now and again re-run the code to update your html file? Is there a better/different way that you'd recommend?
Thanks again!
Because pagination is hard, I'd like to clean up the internals and possibly use https://github.com/sigmavirus24/github3.py as a replacement for octohub
Not having to embed things would be a plus. And I believe when I was choosing a github wrapper library I discounted github3.py because the github releases aren't being kept up to date, but it's hit 1.0.0 on pypi, and being actively maintained, so yay!
For example, this would be very handy for showing top10 contributors of a repo.
Client output when attempting to run locally:
amcasari-macbookpro% python3 -m octohatrack LABHR/octohatrack
amcasari-macbookpro% pip install octohatrack
Requirement already satisfied: octohatrack in /Users/amcasari/repos-mine/octohatrack (1.0.0a0)
Requirement already satisfied: requests in /Users/amcasari/.pyenv/versions/3.8.6/lib/python3.8/site-packages (from octohatrack) (2.24.0)
Requirement already satisfied: gitpython in /Users/amcasari/.pyenv/versions/3.8.6/lib/python3.8/site-packages (from octohatrack) (3.1.11)
Requirement already satisfied: requests-cache in /Users/amcasari/.pyenv/versions/3.8.6/lib/python3.8/site-packages (from octohatrack) (0.5.2)
Requirement already satisfied: idna<3,>=2.5 in /Users/amcasari/.pyenv/versions/3.8.6/lib/python3.8/site-packages (from requests->octohatrack) (2.10)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /Users/amcasari/.pyenv/versions/3.8.6/lib/python3.8/site-packages (from requests->octohatrack) (1.25.10)
Requirement already satisfied: certifi>=2017.4.17 in /Users/amcasari/.pyenv/versions/3.8.6/lib/python3.8/site-packages (from requests->octohatrack) (2020.6.20)
Requirement already satisfied: chardet<4,>=3.0.2 in /Users/amcasari/.pyenv/versions/3.8.6/lib/python3.8/site-packages (from requests->octohatrack) (3.0.4)
Requirement already satisfied: gitdb<5,>=4.0.1 in /Users/amcasari/.pyenv/versions/3.8.6/lib/python3.8/site-packages (from gitpython->octohatrack) (4.0.5)
Requirement already satisfied: smmap<4,>=3.0.1 in /Users/amcasari/.pyenv/versions/3.8.6/lib/python3.8/site-packages (from gitdb<5,>=4.0.1->gitpython->octohatrack) (3.0.4)
WARNING: You are using pip version 20.2.4; however, version 20.3.1 is available.
You should consider upgrading via the '/Users/amcasari/.pyenv/versions/3.8.6/bin/python3.8 -m pip install --upgrade pip' command.
amcasari-macbookpro% octohatrack --help
usage: octohatrack [-h] [--no-cache] [--wait-for-reset] [-v] username/repo
positional arguments:
username/repo the name of the repo to parse
optional arguments:
-h, --help show this help message and exit
--no-cache Disable local caching of API results
--wait-for-reset Enable waiting for rate limit reset rather than erroring
-v, --version show program's version number and exit
amcasari-macbookpro% python3 -m octohatrack amcasari/octohatrack
amcasari-macbookpro% octohatrack amcasari/octohatrack
Checking repo exists....Repo does not exist: amcasari/octohatrack
I think I need to find someone to explain this one to me IRL because I'm not getting it.
What I want:
pip install octohatrack; octohatrack blah/blah to "just work" as a command-line tool
git clone https://github.com/labhr/octohatrack; cd octohatrack; python3 somefile.py blah/blah to "just work" as an invocation of the local code
What I have is octohatrack.py, which goes
import octohatrack
octohatrack.main()
But this doesn't run on systems without the octohatrack package already installed.
pkg_resources.DistributionNotFound: The 'octohatrack' distribution was not found and is required by the application
What I think might be happening is that I've done doofed the __init__.py file, so that I can't import the local file functionality. The current setup gives a valid setup.py and installation mechanism, but it just doesn't work for local invocation.
However, if you install octohatrack locally, the code in the local package doesn't actually run (provable by changing the files in the current directory to change the functionality, and checking whether those changes are seen in the output).
I think this is something to do with modules, packages, setup.py's, and the hackery I did in #1
The information is being gathered anyway, so it's no extra work to output this.
Bonus: only the top 100 code contributors are readily accessible from the github interface anyway, so this is super helpful
On https://github.com/coala/gh-board, the following all cause an exception:
{'user_name': 'domlobo', 'name': None}
{'user_name': 'greenkeeper[bot]', 'name': None}
{'user_name': 'brentirwin', 'name': None}
{'user_name': 'adjohnson916', 'name': None}
{'user_name': 'domlobo', 'name': None}
{'user_name': 'brentirwin', 'name': None}
{'user_name': 'ANURADHAJHA99', 'name': None}
All Contributors:
Traceback (most recent call last):
File "/home/jayvdb/.local/bin/octohatrack", line 11, in <module>
load_entry_point('octohatrack', 'console_scripts', 'octohatrack')()
File "/home/jayvdb/projects/ghuser/octohatrack/octohatrack/__main__.py", line 53, in main
display_results(repo_name, contributors, len(api))
File "/home/jayvdb/projects/ghuser/octohatrack/octohatrack/helpers.py", line 16, in display_results
for user in sorted(contributors, key=lambda k: k['name'].lower()):
File "/home/jayvdb/projects/ghuser/octohatrack/octohatrack/helpers.py", line 16, in <lambda>
for user in sorted(contributors, key=lambda k: k['name'].lower()):
AttributeError: 'NoneType' object has no attribute 'lower'
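One possible fix for the traceback above is to fall back to the login when a profile has no display name. This is only a sketch: sort_key is a hypothetical helper name, and the dictionaries mirror the shape printed in the debug output.

```python
def sort_key(user):
    # 'name' is None when the user has not set a display name on GitHub,
    # so fall back to the login, which is always present.
    return (user.get("name") or user["user_name"]).lower()


contributors = [
    {"user_name": "domlobo", "name": None},
    {"user_name": "greenkeeper[bot]", "name": None},
    {"user_name": "brentirwin", "name": None},
]

# Same sorted() call shape as in helpers.py, with the safer key function.
ordered = sorted(contributors, key=sort_key)
```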
This would help identify the people when many contributors don't have Gravatars, as well as making the text output more usable in release notes.
When I enter octohatrack -h, the result is as follows
optional arguments:
-h, --help show this help message and exit
--no-cache Disable local caching of API results
-v, --version show program's version number and exit
-l 10, --limit 10 Limit to the last x Issues/Pull Requests
Which version should I use or what should I do to get the optional arguments as shown in the readme?
Major releases can totally break backwards compatibility, right?
General plan:
[0] Yes, this is contentious. As BDFL[1], I reckon that the amazing work in LABHR/js-hatrack in regards to generating visual representation of contributions overrides anything my hacky python CLI can generate.
[1] I think I can call myself this, in jest[2]
[2] Are nested footnotes legal?
It'd be great if the HTML output could not have <title> in it and instead be a candidate for directly including in a web page somewhere, rather than having to either edit it or keep it in a separate file.
Environment:
Executed command:
pip3 install octohatrack
Expected output: Installed package.
Actual output:
Collecting octohatrack
Collecting gitpython (from octohatrack)
Using cached GitPython-2.1.8-py2.py3-none-any.whl
Collecting requests (from octohatrack)
Using cached requests-2.18.4-py2.py3-none-any.whl
Collecting simplejson (from octohatrack)
Collecting gitdb2>=2.0.0 (from gitpython->octohatrack)
Using cached gitdb2-2.0.3-py2.py3-none-any.whl
Collecting idna<2.7,>=2.5 (from requests->octohatrack)
Using cached idna-2.6-py2.py3-none-any.whl
Collecting urllib3<1.23,>=1.21.1 (from requests->octohatrack)
Using cached urllib3-1.22-py2.py3-none-any.whl
Collecting chardet<3.1.0,>=3.0.2 (from requests->octohatrack)
Using cached chardet-3.0.4-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests->octohatrack)
Using cached certifi-2017.11.5-py2.py3-none-any.whl
Collecting smmap2>=2.0.0 (from gitdb2>=2.0.0->gitpython->octohatrack)
Using cached smmap2-2.0.3-py2.py3-none-any.whl
Installing collected packages: smmap2, gitdb2, gitpython, idna, urllib3, chardet, certifi, requests, simplejson, octohatrack
Successfully installed certifi-2017.11.5 chardet-3.0.4 gitdb2-2.0.3 gitpython-2.1.8 idna-2.6 octohatrack-0.6.1 requests-2.18.4 simplejson-3.13.2 smmap2-2.0.3 urllib3-1.22
Segmentation fault (core dumped)
I already have glasnt.github.io -> glasnt.com, so if glasnt/octohat gets a gh-pages branch, then it can be all shiny over on glasnt.com/octohat
I just need some shiny CSS and copy to make it look nice.
Unsure if this will turn into the landing point for #34 or not. Probably not, because static. 🤷
This idea has been bouncing around my head for a bit, so I thought I'd braindump here if anyone had any comments.
Projects sometimes maintain a CONTRIBUTORS file, where people are listed. In the age of (known limited) contribution lists already being aggregated, I thought it might be an idea to add a semi-automated grokking of these kinds of files along with the existing parsing.
My thought is that using a git-commit-like format with git log formatting, there could be a pseudo-standard of human readable and machine parseable contributions. When getting the metadata for a repo, the master version of this file, if it exists, can be added to the listings.
I'm thinking something like this:
# These people are additionally awesome for helping with things
Alanis Morrisett <[email protected]>
Kaywinnet Lee Frye <[email protected]>
# More comments could go anywhere here
Any line that looks like a display name followed by an angle-bracketed address should be added to the non-coding contributions list.
In my use case, I'd add people who didn't commit code to this list. They might not have a GitHub account, so I'd probably fall back to pulling their Gravatar for the display versions.
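The proposed format could be parsed along these lines. This is a sketch only: the regular expression and the parse_contributors name are assumptions, not existing octohatrack code.

```python
import re

# A display name followed by an address in angle brackets: "Name <address>".
LINE = re.compile(r"^\s*(?P<name>[^<>#]+?)\s*<(?P<email>[^<>]+)>\s*$")


def parse_contributors(text):
    """Return (name, email) pairs, skipping comments and unrecognised lines."""
    people = []
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # comment lines can appear anywhere in the file
        match = LINE.match(line)
        if match:
            people.append((match.group("name"), match.group("email")))
    return people
```

Lines that don't match are ignored rather than rejected, so free-form text in the file doesn't break the run.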
At the moment, if you hit the rate limit (60/hour untokened, 5000/hour tokened), the app will say the repo doesn't exist, instead of saying you've hit the limit. It should ideally also get the X-RateLimit-Reset header and return the time to try again, possibly with an (in xx minutes) message.
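A rough sketch of telling a rate-limit response apart from a missing repo. GitHub sends X-RateLimit-Remaining and X-RateLimit-Reset (a UTC epoch timestamp in seconds) on API responses. The function names and message wording here are made up for illustration.

```python
import math
import time


def minutes_until_reset(headers, now=None):
    """Return whole minutes until the limit resets, or None if not limited.

    headers is any mapping of response headers; a plain dict is
    case-sensitive, unlike the header mapping the requests library returns.
    """
    remaining = headers.get("X-RateLimit-Remaining")
    reset = headers.get("X-RateLimit-Reset")
    if remaining is None or reset is None or int(remaining) > 0:
        return None
    now = time.time() if now is None else now
    return max(0, math.ceil((int(reset) - now) / 60))


def describe(status_code, headers, now=None):
    """Turn a response status and headers into a message for the user."""
    minutes = minutes_until_reset(headers, now)
    if status_code == 403 and minutes is not None:
        return "Rate limit reached. Try again in ~%d minutes." % minutes
    if status_code == 404:
        return "Repo does not exist."
    return "OK" if status_code == 200 else "Unexpected status %d" % status_code
```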
Probably would mock github responses, and/or do a self live test.
https://github.com/pydanny/contributors has a date-range limitation, which is quite useful for working out contributors to sprints and other time-based things.
octohatrack does have a --limit flag already, but that's for "x most recent issues/PRs"
Date ranges would be useful.
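As a sketch of what a date-range filter might look like: GitHub's REST API returns created_at on issues and pull requests as an ISO 8601 UTC string such as 2016-06-07T12:34:56Z. The created_between name is hypothetical, and how the range would be exposed on the command line is left open.

```python
from datetime import datetime

# Timestamp format used by created_at in GitHub REST API responses.
GITHUB_TIME = "%Y-%m-%dT%H:%M:%SZ"


def created_between(items, start, end):
    """Keep items whose created_at falls in the half-open range [start, end)."""
    kept = []
    for item in items:
        created = datetime.strptime(item["created_at"], GITHUB_TIME)
        if start <= created < end:
            kept.append(item)
    return kept
```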
Reference: https://twitter.com/glasnt/status/741102742119157761
With e1b52cd,
# docker build -t octohatrack .
.....
# docker run -e GITHUB_TOKEN octohatrack
Traceback (most recent call last):
File "octohatrack.py", line 2, in <module>
octohatrack.main()
AttributeError: module 'octohatrack' has no attribute 'main'
Reticketed from #87 (comment)
After a little stumbling around, I got this working (I use pipenv)
brew install pipenv # for keeping each app in a virtualenv
brew install pyenv # for installing other versions of python
pipenv --version
>>> pipenv, version 2018.10.9
pyenv --version
>>> pyenv 1.2.7
pipenv run python --version
>>> Python 3.6.5
# Install a 3.x version with pyenv if you don't have it
git clone LABHR/octohatrack
cd octohatrack
pipenv run python setup.py install
pipenv run octohatrack
>>> usage: octohatrack [-h] [--no-cache] [--wait-for-reset] [-v] username/repo
>>> octohatrack: error: the following arguments are required: username/repo
pipenv run octohatrack LABHR/octohatrack
>>> Checking repo exists....Repo does not exist: LABHR/octohatrack
export GITHUB_TOKEN=xxxxxxxxxxxxxxxxx
pipenv run octohatrack LABHR/octohatrack
>>> Checking repo exists....
>>> Getting API Contributors................. etc etc working
cc: @Chandler-Song
The DEBUG function shows that the GITHUB_TOKEN isn't being seen by the script. I've set this up as a secret environment variable so as not to expose it to the public, but it should still be visible to the script.
The problem is that to get all the contributor information for octohat now takes more than 60 API calls, which means that without the token the self-referencing test cannot run.
(Possible cause: Pull Requests with multiple participants)
I need to dive into this more, but it appears that the statistics GitHub reports in the banner of a repo and what the API endpoint reports might not be the same. I think this is possibly because of issues when the participants of a Pull Request differ from the participants in a git branch.
A PR would be attributed to one person, but when there's more than one person working on a branch and it gets merged, there appears to be an inconsistency.
In the wild: https://twitter.com/nibalizer/status/696279727120654336
Just ran octohatrack on voxpupuli/puppet-collectd:
Code Contributors: 30
Non Code Contributors: 133
However, https://github.com/voxpupuli/puppet-collectd reports 94 contributors
Full investigation pending
Potentially then able to use datasette for ad-hoc analytics
Needs to test for: a full up repo, a repo with no activity, generate_html, and probably some sort of unit testing.
Helpful for debugging and versioning
The app should not create files locally unless asked, and should only output to stdout in the default case
This error is reproducible with python 2.7.9 on Debian 8.1:
U+0153 is LATIN SMALL LIGATURE OE: œ
$> octohat -g sebsauvage/MinigalNano
Collecting contributors....
Collecting commentors...........................................................
................................................................................
Code contributions: 11
Non-code contributions: 28
jdedba
AnatomicJC
Riduidel
andersk
Niols
cepcasa
qwertygc
bitbybit
ygbillet
thomas30
Denys06
meh-uk
cborne
lanner
fufroma
k1ka
jabalv
10cre
indoushka
arthurlutz
farbrorlarry
nicosomb
Mermouy
fpp-gh
patcou
fredtantini
Knah-Tsaeb
danc
Traceback (most recent call last):
File "/usr/local/bin/octohat", line 9, in <module>
load_entry_point('octohat==0.1.3', 'console_scripts', 'octohat')()
File "/usr/local/lib/python2.7/dist-packages/octohat/__init__.py", line 59, in
main
f.write(display_user_html(user, args))
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0153' in position 1
97: ordinal not in range(128)
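The usual way around this kind of UnicodeEncodeError is to open the output file with an explicit encoding instead of relying on the ASCII default. A minimal sketch, with write_html as a hypothetical stand-in for the writing loop in main:

```python
import io


def write_html(path, fragments):
    """Write text fragments to path as UTF-8, regardless of the locale."""
    # io.open accepts encoding= on both Python 2.7 and Python 3;
    # on Python 2.7 the fragments must be unicode objects, not byte strings.
    with io.open(path, "w", encoding="utf-8") as f:
        for fragment in fragments:
            f.write(fragment)
```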
I have a proof of concept implementation of octohatrack using GitHub's GraphQL API v4
proof of concept graphql implementation
As per the limitations, because the API won't allow more than the last 100 results for any item, the results are limited.
But compared to octohatrack 0.6.1 it's just a touch faster.
$ time octohatrack labhr/octohatrack
All Contributors: 31
real 2m47.853s
$ time python octohatrack_graphql.py labhr/octohatrack
# 28 results returned
real 0m3.106s
Just a touch.
The numbers are out because I haven't included the CONTRIBUTORS file contributors, but that's just a case of appending the existing functionality in contributors_file.py
To Do:
I've been mulling over this after conversations I've had with @ossanna16; I don't think I'm using the correct naming conventions for the sections of the report.
I think it should be renamed, but I'm unsure on the new names.
I have two thoughts for this at the moment:
Repo: (repo_name)
GitHub contributors:
[list a]
All Contributors:
[list b]
GitHub Contributors: len(list a)
All Contributors: len(list b)
Where [list a] is the list defined by GitHub, and [list b] is everything else
Repo: (repo_name)
Contributors:
[list]
Contributors (as per GitHub): len(list a)
All Contributors: len(list)
In this form, len(list a) would be the size of the original list as per the GitHub contributor API endpoint, but the entire list would be an alphabetic list of all contributors (maybe with an emoji at the end for what was previously defined as 'non-coding' contributors)
@ossanna16: if you have the time, I'd appreciate your input on this.
Probably should check to see if the GITHUB_TOKEN is set, and stop (unless there's a --force set) because
octohatrack -c -n boot2docker/boot2docker
Collecting contributors....
Collecting commentors................Traceback (most recent call last):
File "octohatrack.py", line 2, in <module>
octohatrack.main()
File "/usr/src/app/octohatrack/__init__.py", line 25, in main
code_commentors = get_code_commentors(repo_name, args.limit)
File "/usr/src/app/octohatrack/helpers.py", line 44, in get_code_commentors
users.append(get_user("/repos/%s/pulls/%d" % (repo_name, index)))
File "/usr/src/app/octohatrack/helpers.py", line 86, in get_user
entry = get_data(uri)
File "/usr/src/app/octohatrack/helpers.py", line 55, in get_data
resp = conn.send("GET", uri)
File "/usr/src/app/octohatrack/connection.py", line 71, in send
return parse_response(response)
File "/usr/src/app/octohatrack/response.py", line 103, in parse_response
raise ValueError(message)
ValueError: You have run out of GitHub request tokens. Set a GITHUB_TOKEN to increase your limit to 5000/hour. Try again in ~44 minutes.
just looks ew.
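A sketch of the up-front check suggested here, so the run stops with a short message instead of a traceback. check_token is a hypothetical name, and the --force flag is the one proposed in this issue, not an existing option.

```python
import os


def check_token(environ, force=False):
    """Return None when it is fine to proceed, else a message for the user."""
    if environ.get("GITHUB_TOKEN") or force:
        return None
    return (
        "GITHUB_TOKEN is not set; unauthenticated requests are limited to "
        "60/hour. Set a token, or pass --force to continue anyway."
    )
```

In main this might be used as message = check_token(os.environ, force=args.force), followed by sys.exit(message) when message is not None, which prints the text to stderr and exits with a non-zero status.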
https://twitter.com/shiftkey/status/958081366469492736
https://github.com/blog/2496-commit-together-with-co-authors
Pending a check of how the API presents these to REST requests
So that I can run the info on a large repo, without constantly re-requesting everything.
Human readable, important information, etc.
There are some robots that automatically create issues and pull requests that probably shouldn't be attributed as contributions.
From what I can tell, these include:
Ideally, there should be a file listing robots, which are removed from the two contributor lists before they are compared
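A sketch of that filtering step. The ROBOTS set below is illustrative only: greenkeeper[bot] is taken from a traceback elsewhere on this tracker, and a real list would come from the proposed file. GitHub App accounts carry a "[bot]" suffix on their login, so that check catches many robots without listing them.

```python
# Illustrative only; the real list would be loaded from a file.
ROBOTS = {"greenkeeper[bot]"}


def is_robot(login, robots=ROBOTS):
    """True for listed robots and for GitHub App logins ending in [bot]."""
    return login in robots or login.endswith("[bot]")


def without_robots(users, robots=ROBOTS):
    """Return the users whose login is not recognised as a robot."""
    return [u for u in users if not is_robot(u["user_name"], robots)]
```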
Hello, @glasnt!
I got this error in v0.5.0 while trying to run octohatrack against https://github.com/phpdocbrbridge/traducao.
$ octohatrack --version
octohatrack version 0.5.0
$ octohatrack phpdocbrbridge/traducao
Collecting API contributors...
Collecting all repo contributors...
Collecting wiki contributors.....'ascii' codec can't encode character u'\xe1' in position 1: ordinal not in range(128)
$ docker run \
--interactive \
--tty \
--rm \
--env GITHUB_TOKEN=$GITHUB_TOKEN \
--volume $HOME/.octohatrack:/cache \
\
rogeriopradoj/octohatrack:latest \
\
--version
octohatrack version 0.5.0
$ docker run \
--interactive \
--tty \
--rm \
--env GITHUB_TOKEN=$GITHUB_TOKEN \
--volume $HOME/.octohatrack:/cache \
\
rogeriopradoj/octohatrack:latest \
\
phpdocbrbridge/traducao
Collecting API contributors...
Collecting all repo contributors...
Collecting wiki contributors....Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/git/cmd.py", line 614, in execute
**subprocess_kwargs
File "/usr/local/lib/python3.5/subprocess.py", line 950, in __init__
restore_signals, start_new_session)
File "/usr/local/lib/python3.5/subprocess.py", line 1544, in _execute_child
raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: 'git'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/octohatrack", line 9, in <module>
load_entry_point('octohatrack==0.5.0', 'console_scripts', 'octohatrack')()
File "/usr/local/lib/python3.5/site-packages/octohatrack/__init__.py", line 50, in main
wiki_contributors = get_wiki_contributors(repo_name)
File "/usr/local/lib/python3.5/site-packages/octohatrack/wiki.py", line 54, in get_wiki_contributors
repo = Repo.clone_from(wiki_url, tmp_folder)
File "/usr/local/lib/python3.5/site-packages/git/repo/base.py", line 957, in clone_from
return cls._clone(git, url, to_path, GitCmdObjectDB, progress, **kwargs)
File "/usr/local/lib/python3.5/site-packages/git/repo/base.py", line 898, in _clone
v=True, **add_progress(kwargs, git, progress))
File "/usr/local/lib/python3.5/site-packages/git/cmd.py", line 459, in <lambda>
return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/git/cmd.py", line 920, in _call_process
return self.execute(make_call(), **_kwargs)
File "/usr/local/lib/python3.5/site-packages/git/cmd.py", line 617, in execute
raise GitCommandNotFound(str(err))
git.exc.GitCommandNotFound: [Errno 2] No such file or directory: 'git'
https://cloud.google.com/blog/products/gcp/github-on-bigquery-analyze-all-the-open-source-code
https://github.blog/2016-06-29-making-open-source-data-more-available/
Blockers would be: