github-authors-emails's Introduction

Collect git author emails

$ time ./author_emails_fast.py jrzaurin/pytorch-widedeep
real	2m48.776s
user	0m3.534s
sys	0m0.712s

This dumps a pandas DataFrame to jrzaurin-pytorch-widedeep.csv:

$ head jrzaurin-pytorch-widedeep.csv 
sha,author_email,repo
74e9de1d8f24d0f4d126c1f123c1b8a48991fbc5,[email protected],jrzaurin/pytorch-widedeep
4325f6017afb9a34ca975c1179ce4f889251facb,[email protected],jrzaurin/pytorch-widedeep
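The resulting file is ordinary CSV with the header sha,author_email,repo, so it can be read back without pandas. A minimal sketch, assuming that header; the function name commits_per_author and the sample rows are hypothetical and not part of the repository:

```python
import csv
import io
from collections import Counter


def commits_per_author(csv_text):
    """Count commits per author email in CSV text with the header sha,author_email,repo."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["author_email"] for row in reader)


# Hypothetical sample rows: the SHAs and addresses are placeholders, not real data.
sample = (
    "sha,author_email,repo\n"
    "aaa111,alice@example.com,owner/project\n"
    "bbb222,bob@example.com,owner/project\n"
    "ccc333,alice@example.com,owner/project\n"
)

counts = commits_per_author(sample)
print(counts.most_common())
```

For a real run, the text of jrzaurin-pytorch-widedeep.csv would be passed in place of the sample string.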

How to fetch author emails fast

E.g. for jrzaurin/pytorch-widedeep:

| How | time, s | disk | gain |
|---|---|---|---|
| GitHub API | 172 = 2m 52.1s | | |
| git clone | 35.9 | 97M with --bare, or 155M for a full clone | 20% of GitHub API |
| the trick | 2.4 | 632K | 1.3% of GitHub API |
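The source does not say how the GitHub API baseline was implemented. One plausible reading, sketched here as an assumption, is paging through the REST "list commits" endpoint, which returns at most 100 commits per request, so a long history needs many sequential round trips. The helper names commits_url and emails_from_page are hypothetical, and the sample page is a heavily trimmed placeholder:

```python
import json


def commits_url(repo, page, per_page=100):
    """Build the URL for one page of the REST commit listing (assumed baseline)."""
    return f"https://api.github.com/repos/{repo}/commits?per_page={per_page}&page={page}"


def emails_from_page(page_json, repo):
    """Extract (sha, author_email, repo) rows from one decoded page of commits."""
    rows = []
    for item in page_json:
        # The git author lives under commit.author in each listed commit.
        author = item.get("commit", {}).get("author") or {}
        rows.append((item["sha"], author.get("email"), repo))
    return rows


# Hypothetical, trimmed response page with placeholder values.
page = json.loads(
    '[{"sha": "aaa111", "commit": {"author": {"email": "alice@example.com"}}}]'
)
rows = emails_from_page(page, "owner/project")
print(commits_url("owner/project", 1))
print(rows)
```

No request is made here; the sketch only shows the per-page work that would be repeated for every 100 commits.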

git clone --bare takes about the same time as a plain git clone (0m22.245s) but uses 97M on disk instead of 155M. Adding a blob filter looks like this:

git clone --bare --filter=blob:none git@github.com:jrzaurin/pytorch-widedeep.git

The trick is to run git clone --bare --filter=blob:none, because "commit information is stored in commit objects and file names are tracked using tree objects" (and we don't need file names), so author emails can be read without downloading any file contents: https://stackoverflow.com/a/23253517
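A minimal sketch of the blob-less clone approach in Python, not the actual contents of author_emails_fast.py. It assumes git is on PATH and uses an HTTPS URL instead of the SSH form; the function names clone_command, parse_log and collect are hypothetical:

```python
import subprocess
import tempfile


def clone_command(repo, dest):
    """Arguments for a bare clone that skips all blobs (file contents)."""
    url = f"https://github.com/{repo}.git"  # assumption: HTTPS rather than SSH
    return ["git", "clone", "--bare", "--filter=blob:none", url, dest]


def parse_log(output, repo):
    """Turn `git log --format=%H,%ae` output into (sha, author_email, repo) rows."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # A SHA never contains a comma, so split on the first one only.
        sha, _, email = line.partition(",")
        rows.append((sha, email, repo))
    return rows


def collect(repo):
    """Clone into a temporary directory and list commits reachable from HEAD."""
    with tempfile.TemporaryDirectory() as dest:
        subprocess.run(clone_command(repo, dest), check=True)
        log = subprocess.run(
            ["git", f"--git-dir={dest}", "log", "--format=%H,%ae"],
            check=True, capture_output=True, text=True,
        )
        return parse_log(log.stdout, repo)


# Offline demonstration on placeholder log output; collect() needs network access.
demo = parse_log("aaa111,alice@example.com\nbbb222,bob@example.com\n", "owner/project")
print(demo)
```

Calling collect("jrzaurin/pytorch-widedeep") would perform the clone and return rows in the same sha, author_email, repo shape as the CSV.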
