dfm / osrc

The Open Source Report Card

Home Page: http://osrc.dfm.io

License: MIT License

Python 52.85% Lua 3.23% CSS 18.80% JavaScript 5.93% HTML 18.92% Dockerfile 0.27%

osrc's Introduction

The Open Source Report Card (v3)

A work in progress... hopefully we'll have the OSRC back up soon!

Installation (docker-compose)

If you have docker-compose installed, just run:

docker-compose up

To run on a port other than 5000, set the WEB_RUNSERVER_PORT environment variable:

WEB_RUNSERVER_PORT=8000 docker-compose up

For more details, check docker-compose.yml and Dockerfile.

Installation

Set up the environment:

conda env create -f environment.yml
source activate osrc

Create the tables:

createdb osrc
python manage.py create

These tables can also be dropped using:

python manage.py drop

License & Credits

The Open Source Report Card was created by Dan Foreman-Mackey and it is made available under the MIT License.


osrc's Issues

Doesn't like single quotes in names

1. Change name to 'Dragon' Dave McKee
2. Second paragraph now starts:

's behaviour is...

3. Change name to not have quotes - works fine:

Dragon's behaviour is...

Broken links in the main site

https://github.com/dfm/osrc/blob/master/ghdata/static/swears.txt

https://github.com/dfm/osrc/blob/master/ghdata/__init__.py#L195

Witty wording is sometimes rather problematic

Some of the wording can be misunderstood, especially:

... who would rather be commenting on issues instead of pushing code.

Read correctly (or wrongly :)), this could be read as characterizing this user as one of those two:

This especially applies to users that do a lot of issue triage and document strange behaviours they find on projects bugtrackers - which is definitely a very important task.

I accidentally removed myself from http://osrc.dfm.io/vilmospapp

Please add me back

On the repo page, anyone who has merely starred or forked the repo is listed under 'main contributors'. You can see the mistake on any average repo, for example:
http://osrc.dfm.io/pravj/Fubot
I think starring a repo should never push a user into the main contributors list; maybe something is wrong with 'stats.get_repo_info()'.
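
A minimal sketch of the fix this report seems to ask for, assuming the repo statistics are built from GitHub-style events. The function `main_contributors` and the event dictionaries are hypothetical and are not taken from `stats.get_repo_info()`; only the event type names `WatchEvent` (starring) and `ForkEvent` come from GitHub's public event types.

```python
from collections import Counter

# GitHub reports a star as "WatchEvent" and a fork as "ForkEvent".
# Treating these as passive is an assumption made for this sketch.
PASSIVE_EVENT_TYPES = {"WatchEvent", "ForkEvent"}


def main_contributors(events, limit=5):
    """Rank users by how many non-passive events they produced on a repo.

    `events` is an iterable of dicts with hypothetical keys "actor" and "type".
    """
    counts = Counter(
        event["actor"]
        for event in events
        if event["type"] not in PASSIVE_EVENT_TYPES
    )
    return [user for user, _ in counts.most_common(limit)]
```

With this filter, a user who only starred or forked the repo never appears in the list.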

Comparison Bug

Comparison of a user with himself doesn't make any sense.

About box on report page

At the bottom of the report page, add a one sentence description and link to front page.

Clicking "OK. I promise!" in IE11 shows a blank page with "false"

Should return true.

Report card showing deleted repos

Upon generating my report card: http://osrc.dfm.io/amhed

I saw that the repo amhed/azureoid came up, but that repo was deleted a couple of months ago. We need to check for that

Nonsensical stats

On the @torvalds page it says he's similar to http://osrc.dfm.io/bbond007 who is "one of the top 38% most active C users". Just look at @bbond007's profile. I doubt he is.

support for sourceforge

Sourceforge still has a huge amount of open source development happening (they say 3.4 million devs and 324,000 projects).

API: http://sourceforge.net/apps/trac/sourceforge/wiki/API
User example: http://sourceforge.net/api/user/username/unhammer/json
Project example: http://sourceforge.net/api/project/id/143781/json

Not sure how to get number of commits etc (https://www.ohloh.net/p/apertium does this, but I guess they just look at the actual repo, e.g. "svn log https://svn.code.sf.net/p/apertium/svn")

Distinguish organisations from users

It looks strange when people are compared to organisations from which something like this is generated:

... from their activity streams—that XYZ and ValveSoftware are probably friends or at least virtual friends.

Add CSS!

GitHub has CSS in its language list. Can it be added to OSRC?

Swearing stats

The swearing stats still need to be added to the report card page.

Nice effort, but careers are serious business.

It was fun to see what it generated for my profile, but I wouldn't want anyone seeing it. Thanks for adding the option of opting out.

Is this counting code in repositories or commits?

Firstly, great tool! Just one whinge:

The app lists me as 70% PHP and 37% Javascript.

Which is strange, because the majority of my commits are Javascript. The only difference is that I have a large repository that I commit to (and have forked) in PHP, however I have very few commits to it (relatively). On the other hand this repository is the bulk of my GitHub commits, and it's JavaScript.

Is the tool counting the code in repos or the code I have committed? If it's the former, maybe you may want to choose the latter? A lot of people fork code and make small contributions to it which probably shouldn't make the entire repository count as their code.

Thanks.

EvilPenguin breaks it

http://osrc.dfm.io/evilpenguin yields Internal Server Error. Help! (/cc @ac3xx, @evilpenguin)

Why doesn't my profile show?

Why?

Allow import of private data

Add a way to dump an activity report to run the analysis on so we can generate stats on private repos/non github data (like bitbucket).

I don't actually know Python

One project I contribute to is primarily Python, but nearly all of my contributions deal with Javascript, which kind of throws off my report card: http://osrc.dfm.io/rummik

I haven't looked at the code, but I'm guessing to fix something like this would require looking at changed files in commits, and then working things out based on the primary language within the commit. Which I understand could be tricky
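
A rough sketch of the idea described above: attribute each commit to the most common language among its changed files, then compute per-language shares over commits. The extension table and both function names are hypothetical, and the input (a list of changed file paths per commit) is assumed to be available from somewhere; this is not how OSRC currently works.

```python
import os
from collections import Counter

# Hypothetical, deliberately tiny extension-to-language table.
EXTENSION_LANGUAGES = {
    ".py": "Python",
    ".js": "JavaScript",
    ".php": "PHP",
    ".c": "C",
}


def commit_language(changed_files):
    """Return the most common known language among a commit's files, or None."""
    counts = Counter()
    for path in changed_files:
        extension = os.path.splitext(path)[1].lower()
        language = EXTENSION_LANGUAGES.get(extension)
        if language is not None:
            counts[language] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def language_shares(commits):
    """Map each language to the fraction of commits attributed to it.

    `commits` is an iterable of lists of changed file paths.
    Commits with no recognised files are ignored.
    """
    totals = Counter()
    for changed_files in commits:
        language = commit_language(changed_files)
        if language is not None:
            totals[language] += 1
    n = sum(totals.values())
    return {language: count / n for language, count in totals.items()} if n else {}
```

Under this scheme, JavaScript commits to a mostly-Python project count as JavaScript, which is the behaviour the report asks for.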

PS: Really liking OSRC either way! :P

Register an IRC channel for osrc

It would be great if there was an IRC channel specifically for osrc.

Error when my username is requested

The following link http://osrc.dfm.io/sschaef gives:

Internal Server Error

The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

let's get TeX as a language into github

point @davidwhogg at the relevant repo to make this change?

UTF-8 problem

Look at my card (http://osrc.dfm.io/zdroid). My name contains ć, a Latin Unicode character. The same goes for č, đ, ž and š. Please add full UTF-8 support.

Renamed github accounts are not linked to previous account names

My new account name (madisonmay) is not integrated with my old account name (madisonmay13). I'm not sure of the technical difficulty of a fix for this, but I just thought I'd bring it up in case it's an easy fix.

Non-existing Github username leads to ugly 404 page

Support for bitbucket.org

bitbucket.org is also very popular.

Not showing unworthy contributions?

Hi -- and thanks for this great/fun tool!

I'd like to know whether a limit was set on the number of languages to display.
It used to show around 10 languages on my page, but it only shows 4 now, even though the other languages are not that far off. Is there something I'm missing?

Does osrc account for gist contributions?

Some developers have significant gists; it might be worth incorporating them into the report card somehow.

Consider replacing pie charts

Consider replacing evil pie charts with bar charts or some other useful visualization.

Unicode names are broken again

For example:

http://osrc.dfm.io/DasIch
http://osrc.dfm.io/adaoraul

Come on!

More language master names

It's not enough to just have sysadmin, useR and a few more master names. We need many more. 😄

Examples:

AI freak (Lisp)
Scripter (Shell)
Nubist (Nu)

Timezone Issue

Which timezone does it use? We should be able to set the timezone, or we'll get wrong working time.

Server down

osrc.dfm.io is down!?

Recognize groups

This is a relatively minor issue, but it would be nice if this project would recognize trying to feed a group into it, instead of just saying "not enough data".

Data has aged

It would be nice if the data would regularly update, e.g. every week or so.

Some people live not in the US…

Who knew!

Turns out that the daily histogram is offset by a day for people in Australia/NZ/etc. This is slightly difficult to fix in the current setup (because the timezone is currently applied at render time) but I think I know of a way to fix this. We could use the average daily schedule to compute an average offset vector for the days and apply that. This won't be included in the KNN indices, but I think that's probably OK.
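
One possible reading of the proposal above, sketched in Python. It assumes the report stores a 7-bin weekday histogram and a 24-bin hourly histogram, both in UTC, and that the hourly pattern is roughly the same on every day. The function name, the argument layout, and the restriction to whole-hour offsets are all assumptions made for illustration; the actual fix may differ.

```python
def shift_weekly_histogram(weekly, hourly, utc_offset_hours):
    """Approximate the local-time weekday histogram from UTC histograms.

    weekly: 7 event counts, one per UTC weekday.
    hourly: 24 event counts, one per UTC hour (the average daily schedule).
    utc_offset_hours: whole-hour offset of the user's timezone from UTC.
    """
    total = float(sum(hourly))
    offset = int(utc_offset_hours)
    if total == 0 or offset == 0:
        return [float(count) for count in weekly]

    if offset > 0:
        # Events late in the UTC day land on the next local day, so local
        # day d receives a share of UTC day d - 1.
        moved = sum(hourly[24 - offset:]) / total
        source = -1
    else:
        # Events early in the UTC day land on the previous local day, so
        # local day d receives a share of UTC day d + 1.
        moved = sum(hourly[:-offset]) / total
        source = 1

    return [
        (1.0 - moved) * weekly[day] + moved * weekly[(day + source) % 7]
        for day in range(7)
    ]
```

The total number of events is preserved; only the split between adjacent days changes.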

What happens if a repository is renamed?

I renamed the ZDroid/nginZ repository to ZDroid/hackwork in September, but I found this on my page:

These days, Zlatan is most actively contributing to the repositories: twbs/bootstrap, ZDroid/bootstrap, ZDroid/nginZ, ZDroid/mistype, and ZDroid/XDroid.

So, what happens if a repository is renamed (not moved to a different account/organization)?

Write an about page

The splash page should have a username box, a description of the dataset/data collection, a discussion of K-means (and the results) and KNN.

Typo Javascripter -> JavaScripter

I noticed on the report card that "Javascripter" should be "JavaScripter".

Interest in authing to get private events data?

I have ~70 private repos and most of my 9-5 activity is in private repos. Is there any interest in allowing user authentication to display a more accurate report card?

I understand this goes against the 'OpenSource' in the name!

Liberal license + README

yurp.

Need to consider markup when considering content

So, first: super amused with this project!

Second, to the point: data collection likely needs to consider context.

As an example, when looking at my own profile (http://osrc.dfm.io/weierophinney), perhaps the most amusing stat to me was this:

I hate to say it but Matthew is becoming—as one of the top 66% most vulgar users on GitHub—a tad foul-mouthed (with a particular affinity for filthy words like 'coochie').

What was amusing is that I interact with a contributor by the handle @hoochie-coochie; I'm definitely not prone to saying the word "coochie" in comments, commits, or other GitHub-related dialog, but I will reference the user when addressing them in comments, and always with the @ annotation. (Ironically, I know that this comment will just inflate the counts of that word!)
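
A small sketch of what "considering markup" could look like for this case: strip @-mentions before counting flagged words. The function name and the regular expressions are illustrative assumptions, not the project's code; the mention pattern is a simplified approximation of GitHub usernames (alphanumerics and hyphens).

```python
import re
from collections import Counter

# Simplified approximation of a GitHub @-mention.
MENTION_RE = re.compile(r"@[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")
WORD_RE = re.compile(r"[a-z']+")


def count_flagged_words(text, flagged_words):
    """Count occurrences of flagged words, ignoring text inside @-mentions."""
    without_mentions = MENTION_RE.sub(" ", text)
    words = WORD_RE.findall(without_mentions.lower())
    return Counter(word for word in words if word in flagged_words)
```

A comment that only addresses a user by handle then contributes nothing to the count, while the same word written outside a mention is still counted.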

As another data point, from my own profile:

Matthew and zendframework are probably friends or at least virtual friends.

Well, yes, yes, that is true! However, "zendframework" in this case is not a user, but an organization. The algorithm should check the status of a "user" to see if they are actually an organization, and omit organizations when considering affinity -- or at least alter how the data is presented. (As an example, considering "zendframework" is an organization, the next sentence causes a lot of amusement: "it's worth noting that zendframework is less of a PHP aficionado.")
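
A minimal sketch of the check suggested here. It relies on the `type` field that GitHub's REST API returns for an account (`"User"` or `"Organization"`); the two helper functions and the way account data reaches them are hypothetical.

```python
def is_organization(account):
    """Return True if a parsed GitHub account object describes an organization.

    `account` is assumed to be the decoded JSON for one account, as returned
    by the GitHub REST API, which includes a "type" field.
    """
    return account.get("type") == "Organization"


def friend_candidates(accounts):
    """Keep only the logins of accounts that are not organizations."""
    return [a["login"] for a in accounts if not is_organization(a)]
```

Organizations could then be dropped from the "probably friends" sentence, or routed to different wording.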

Again, however, very much enjoying the project!

Too many open connections, can't download data

ConnectionError: HTTPConnectionPool(host='data.githubarchive.org', port=80): Max retries exceeded with url: /2011-02-12-31.json.gz (Caused by <class 'socket.error'>: [Errno 24] Too many open files)
<Greenlet at 0x107fbcaf0: fetch(2011, 2, 12, 31)> failed with ConnectionError

Traceback (most recent call last):
File "/Library/Python/2.7/site-packages/gevent/greenlet.py", line 390, in run
File "fetch.py", line 40, in fetch
File "/Library/Python/2.7/site-packages/requests/api.py", line 55, in get
File "/Library/Python/2.7/site-packages/requests/api.py", line 44, in request
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 335, in request
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 438, in send
File "/Library/Python/2.7/site-packages/requests/adapters.py", line 327, in send
ConnectionError: HTTPConnectionPool(host='data.githubarchive.org', port=80): Max retries exceeded with url: /2011-02-16-31.json.gz (Caused by <class 'socket.error'>: [Errno 24] Too many open files)
<Greenlet at 0x107fbcd70: fetch(2011, 2, 16, 31)> failed with ConnectionError
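
"[Errno 24] Too many open files" usually means too many sockets were open at once. A common remedy is to cap the number of downloads in flight. The sketch below uses the standard library's thread pool as a stand-in; it is not the project's `fetch.py`, and `fetch_all` and its parameters are hypothetical. With gevent, a `gevent.pool.Pool` of fixed size would likely play the same role.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(jobs, fetch, max_workers=16):
    """Call fetch(*job) for every job with at most max_workers running at once.

    jobs: iterable of argument tuples, e.g. (year, month, day, hour).
    fetch: the download function; supplied by the caller.
    Results are returned in the same order as the jobs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch, *job) for job in jobs]
        return [future.result() for future in futures]
```

Keeping `max_workers` well below the process's open-file limit leaves headroom for the archive files being written to disk.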

nickname change

Hey man, thank you for the great app! It's really awesome. But I don't think it has anything to handle a nickname change.

Thanks.

feature request: opt-out

A user should be able to opt-out of this. The "analysis" being done is really misleading and some people might not want overzealous recruiters making assumptions about them based on what they would read on a page like this.

Feature request: add a simple summary for social sharing, like og:title?

Hi,

You asked me to create a new issue (I am unsure about how to make a pull request...) for this feature request, so here goes.

When sharing the Open Source Report Card for a GitHub user on, say, Facebook, it would be great if a simple summary appeared. Adding the tags below should provide that information.

The example is based on my own user "netsi1964".

Thank you :-)

/Sten Hougaard

<meta property="og:site_name" content="osrc.dfm.io"/>
<meta property="og:url" content="http://osrc.dfm.io/netsi1964"/>
<meta property="og:title" content="netsi1964 on the open source report card"/>
<meta property="og:image" content="https://secure.gravatar.com/avatar/0b0d810ea35987a35df121157e6a4ffd?s=220&d=https://a248.e.akamai.net/assets.github.com%2Fimages%2Fgravatars%2Fgravatar-user-420.png"/>
<meta property="og:description" content="Sten is a trend setting Javascripter. Sten is an early-week worker who seems to work best in the wee hours."/>
<meta property="og:type" content="article"/>
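
The tags above could be generated per user rather than written by hand. A minimal sketch, assuming the username, summary sentence, and avatar URL are already available to the page; the function `og_meta_tags` is hypothetical and only mirrors the properties listed in this request.

```python
from html import escape


def og_meta_tags(username, description, image_url):
    """Build the Open Graph <meta> tags for one report card page."""
    properties = [
        ("og:site_name", "osrc.dfm.io"),
        ("og:url", "http://osrc.dfm.io/" + username),
        ("og:title", username + " on the open source report card"),
        ("og:image", image_url),
        ("og:description", description),
        ("og:type", "article"),
    ]
    # Escape attribute values so quotes and ampersands stay valid HTML.
    return "\n".join(
        '<meta property="{}" content="{}"/>'.format(name, escape(content, quote=True))
        for name, content in properties
    )
```

Escaping matters here because avatar URLs typically contain `&` and the generated summary may contain quotes.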

Potentially misleading summary

You've built a very impressive tool here. The reason I opted out is that it characterised me as a Common Lisp hacker, whereas I'm much more of a CL n00b. It was interesting to see what your tool made of my profile, but I opted out in case it misled others about my skillset.

FWIW, I'm primarily a Ruby and Javascript (Coffeescript if I can help it) dev these days, with a background in C# and, earlier, C.

"sysadmin" for mainly C work?

Hi, I stumbled across osrc.dfm.io and I'm surprised I'm classified as "sysadmin". Is this a bug? With pretty much 75% C work and mainly push activities it should be kind of obvious my main occupation is C hacking, not system administration.

Wrong verb tense

I saw this in the report:

We already know that Brian loves to commenting on issues whenever they're not pushing code [...]