
Comments (7)

marnovo commented on June 13, 2024

All very good points. In effect, the underlying use case and the technical questions for personal users may turn out to be quite different from those for orgs, rather than being just a matter of conforming to the API…

from sourced-ce.

dpordomingo commented on June 13, 2024

As for whether it is possible, I'd say yes.
I'd maybe replace "org" with "owner", which could be either a "user" or an "org"; that would also avoid problems if the user becomes an org at some point.

But:
with "org", we fetch metadata from its members.
with "user", we won't fetch that metadata.

But I'm not sure what the purpose of getting the org members is.
If the purpose is to assign the activity in the repos to the org's members, then some activity won't be assigned, because it belongs to GitHub users who are not members of that org and so won't be imported. For example, an issue opened in bblfsh by someone who is not a bblfsh member won't be assigned to any user in our DB.

If we need to get info about all the users contributing to a repo (as in the example above), we should also:

  1. fetch all GitHub users who have activity in that repo but are not members of the repo's org,
  2. try to find GitHub users from the repo's commits (to be able to assign commits to users, not only GitHub activity).

If we also import repos from users, as suggested by this issue, the activity in their repos won't be assigned to any user other than the imported one, unless we also do (1) and (2).
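To make the gap concrete, here is a minimal sketch of points (1) and (2). It is not sourced-ce code: the function names and the event shape (a dict with an "author" login) are hypothetical, and it assumes commit authors have already been resolved to GitHub logins, which in practice is a separate problem.

```python
def unassigned_activity(members, events):
    """Return the events whose author is not among the imported members."""
    known = set(members)
    return [event for event in events if event["author"] not in known]


def users_to_import(members, events, commit_authors):
    """Return the extra logins needed so all activity can be assigned.

    (1) authors of GitHub activity who are not org members
    (2) authors found in the repo's commits (assumed to be logins already)
    """
    known = set(members)
    from_activity = {event["author"] for event in events}
    from_commits = set(commit_authors)
    return sorted((from_activity | from_commits) - known)
```

With only org members imported, every event returned by unassigned_activity would have no matching user in the DB; users_to_import lists who would have to be fetched to close that gap.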


smacker commented on June 13, 2024

@marnovo technically it's not that different from an org, but the results might be very unexpected for users, and we should do something about it. Problems I see:

  • half (or more) of the repos that I, or any other dev at src-d, have are forks. The same goes for external devs. The problem with forks: nobody updates master. Most of our charts rely on HEAD, so forked repos would only produce results up to the moment they were forked
  • there are no issues or pull requests in forks, so all the metadata charts would become useless

As a solution for the user command, I would propose resolving forks and downloading the code/metadata of the original repo. In some cases (example) it would make more sense to download the fork, but such cases are exceptions.
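A rough sketch of what "resolving forks" could look like. It assumes each repo is a dict shaped like the GitHub REST API response for a single repository, where a fork has "fork" set to true and carries a "parent" object with its own "full_name". The function name is hypothetical, and this is not how sourced-ce does it.

```python
def resolve_forks(repos):
    """Map each fork to its parent repo and drop duplicates, keeping order."""
    resolved = []
    seen = set()
    for repo in repos:
        # Use the parent only when the repo is a fork and the parent is known.
        if repo.get("fork") and repo.get("parent"):
            target = repo["parent"]
        else:
            target = repo
        name = target["full_name"]
        if name not in seen:
            seen.add(name)
            resolved.append(name)
    return resolved
```

Deduplicating matters here because several users' forks can point at the same original repo.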


dpordomingo commented on June 13, 2024

I wouldn't do it automatically, but maybe with options: --use-parent to use the parent repo instead, --add-parent to fetch both the user's repo and its parent, or even --no-forks to ignore forks entirely, as requested by @warenlg at #109.
Or also --exclude, passing a list of repos to be ignored (in case of repos causing known failures, or for whatever other reason).
This way everything would be more explicit, which I think would be better, and more flexible.
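A sketch of how these proposed options could behave, using argparse. The flags are the ones suggested in this comment, not existing sourced-ce flags; the program name, the comma-separated --exclude format, and the choice to make the three fork options mutually exclusive are assumptions. Repos are assumed to be dicts with "full_name", "fork", and (for forks) a "parent" dict.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="init-user-sketch")  # hypothetical name
    parser.add_argument("owner")
    forks = parser.add_mutually_exclusive_group()
    forks.add_argument("--use-parent", action="store_true",
                       help="use the parent repo instead of the fork")
    forks.add_argument("--add-parent", action="store_true",
                       help="fetch both the fork and its parent")
    forks.add_argument("--no-forks", action="store_true",
                       help="ignore forks entirely")
    parser.add_argument("--exclude", default="",
                        help="comma-separated repos to ignore (assumed format)")
    return parser


def select_repos(repos, args):
    """Return the repo names to import, honoring the fork options."""
    excluded = {name for name in args.exclude.split(",") if name}
    selected = []
    for repo in repos:
        parent = (repo.get("parent") or {}).get("full_name")
        is_fork = bool(repo.get("fork"))
        if is_fork and args.no_forks:
            continue
        if is_fork and parent and args.use_parent:
            names = [parent]
        elif is_fork and parent and args.add_parent:
            names = [repo["full_name"], parent]
        else:
            names = [repo["full_name"]]
        for name in names:
            if name not in excluded and name not in selected:
                selected.append(name)
    return selected
```

With no fork option given, forks are imported as they are, which keeps the default behavior explicit rather than automatic.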


se7entyse7en commented on June 13, 2024

I'd love to have this feature, and I also think it would greatly increase the chances of people trying the tool.

BTW, regarding forks, I agree that there could be different needs depending on the user. But in general I think it's either --ignore-forks or not. If the user is interested in resolving forks to the original repo, then maybe it's more straightforward to just initialize sourced-ce with the owner (whether an org or a user) of that original repo, and maybe provide some filtering capabilities such as init orgs apache --repositories=incubator-superset.

Also, the repositories most likely to be forked are popular ones, and I think including popular repos together with mine would just hide a lot of insights by adding a lot of noise.


smacker commented on June 13, 2024

I agree with Marvin on most points. Though I'd like to point out that many GitHub users (I don't have numbers, but most probably a majority) don't have real repositories, that is, ones that aren't forks and aren't just a dump of some code (for a school, a workshop or something like that). So analyzing only their profile doesn't make sense for them at all. Exploring information about the repositories they contributed to, on the other hand, can be interesting.


se7entyse7en commented on June 13, 2024

Though I'd like to point out that many GitHub users (I don't have numbers, but most probably a majority) don't have real repositories, that is, ones that aren't forks and aren't just a dump of some code (for a school, a workshop or something like that).

I don't know whether it's the majority of users, but you're absolutely right about this type of user; I hadn't thought about it. I'm just wondering how likely this type of user is to use a tool like this for their forked repos, but that is a different point.

