open-source-survey's Introduction

The Open Source Survey

We've run the largest survey of the open source community to date, and the results are an open dataset for us all to use and learn from. We hope the dataset informs some of the most pressing questions about open source software, the people who create it, their experience, and their relationship to the industry that depends on it.

Learn more about the survey design and the topics we're studying.

Why is GitHub doing this?

At GitHub our goal is to help everyone build better software. We believe open source code, communities, and principles create better software. As an industry, we know a lot about how open source software is created but very little about the people who create and use it. Are they professional developers, students, or hobbyists?

To build better software, we need a software community where anyone, regardless of what they look like or where they come from, can participate. This survey will help us see how we, as a community, are doing.

Open data

Open source is bigger than any company or community. The dataset is released under CC0-1.0 for anyone to use and learn from.

Contributors

This survey is primarily designed and implemented by GitHub:

  • @franniez - Data and social scientist at GitHub. New to open source but not to studying people or movements, she's done extensive survey research in Washington, D.C., from inside the ivory tower, and within the technology sector.
  • @arfon - Program Manager for Open Source Data at GitHub. A lapsed academic with a passion for new models of scientific collaboration, he's used big telescopes to study dust in space, built sequencing technologies in Cambridge, and has engaged millions of people in online citizen science by co-founding the Zooniverse.
  • @mlinksva - Open Source Maven at GitHub. A lapsed engineer and non-lawyer with a passion for increasing the efficacy and scope of open production and policy, he is an advisor/director/volunteer for various open initiatives and was previously a manager and technologist at Creative Commons.

This isn't a solo effort for us; these awesome individuals and organizations have helped us design this survey:

Check out the contributing guidelines if you want to get involved.

License

The material in this repo is open data released under CC0-1.0. This means you need no copyright or database right (if any) permissions to make use of this data and survey questions. However:

  • Survey participants have not waived their privacy rights; read our Privacy Statement regarding Public Information on GitHub. In particular, do not attempt to reidentify survey participants.
  • If you use this dataset in a publication, a link to or citation of this repository would be appreciated.
  • If you extend this dataset, sharing your additions as open data would also be appreciated.
  • CC0-1.0 does not grant any trademark permissions. GitHub® and its stylized versions and the Invertocat mark are GitHub's Trademarks or registered Trademarks. When using GitHub's logos, be sure to follow the GitHub logo guidelines.

Citation info

The data is additionally published on Zenodo, which provides a DOI as well as an easy way to generate citations in a number of formats. We suggest modifying autogenerated citations to reflect the original publication source, e.g. as below.

@misc{GitHubOpenSourceSurvey2017,
  author       = {Zlotnick, Frances},
  title        = {GitHub Open Source Survey 2017},
  month        = jun,
  year         = 2017,
  doi          = {10.5281/zenodo.806811},
  publisher    = {GitHub, Inc.},
  howpublished = {\url{http://opensourcesurvey.org/2017/}}
}

Citations and Reuse

  • R. Stuart Geiger, Summary Analysis of the 2017 GitHub Open Source Survey, "presenting frequency counts, proportions, and frequency or proportion bar plots for every question asked in the survey."
  • The LibreOffice Design Team asked users what aspects of open source are important, using questions from the Open Source Survey. Their summary includes a comparison with Open Source Survey responses, and their data is also released under CC0-1.0.

open-source-survey's People

Contributors

amalloy, annafil, arfon, benbalter, bkeepers, bvasiles, caged, edmz, franniez, jdmaturen, mattyoho, mikemcquaid, mlinksva, mwarkentin, ncoghlan, peterdavehello, watilde, zafarella

open-source-survey's Issues

GitHub robots giving false positives

I am an active GitHub user working with two organisations. I push code almost daily and participate in issue discussions for various libraries. Recently, GitHub's robots hid my profile from public view when I reported an issue on the EduSec repo. I reported the problem to support and my profile was restored in less than half an hour. I was actively working from my office, and the block caused me some panic. I think the NOT SO HARMLESS robots are programmed with weird, too-strict rules. It would be great if you could revise the rules to avoid false positives.

BibTeX entry for easier citation

The README.md in the "License" section states

If you use this dataset in a publication, a link to or citation of this repository would be appreciated.

Who should be attributed as the author(s), GitHub, Inc.?
Would it be possible to provide a BibTeX entry or even a DOI for the repository to allow easier citations?

E.g.

@misc{OpenSourceSurvey17,
  author = {GitHub, Inc.},
  title = {Open Source Survey},
  year = {2017},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/github/open-source-survey}}
}

I'd like to suggest a survey topic: financial support

I am interested in the proportion of projects that currently receive, or have in the past received, direct or indirect financial support for their work. I am also interested in whether this support was used to fund the time of contributors as opposed to incidental costs (hardware, travel, accommodation, food, etc.). The current workplace topic:

Much open source development is subsidized [sic] by companies, who either employ engineers to work on open source projects, or allow them to work on those projects when they also provide value to the firm. We’d like to explore how supported developers feel by their employers to contribute to open source

explores some aspects of this, but I think a wider interpretation would provide a more comprehensive overview of how underfinanced this community is and help move the discussion about under-supported 'core infrastructure' forward.

Feedback on translations

We are looking for feedback on translations of the survey instrument. If you are fluent in any of these languages, we would love help reviewing the translations.

If you review a translation and it looks good to you, could you comment with a 👍 here?

If you find any issues with a translation, could you open a pull request to suggest changes and find at least one other person that is fluent in that language to check it out?

Update: There's also a glossary of terms.

Explain our design process

We should explain some of the principles we're using when designing this survey. For example, where possible, re-using existing survey language from past survey instruments to maximize the value of the dataset for the downstream consumers of the data.

We might also want to think about saying something about our planned sampling strategies?

@franniez - is this something you might be able to put together? It doesn't need to be long.

Feedback on topic: Consumption

As described in the README, we'd love to hear from you about any existing survey instruments you know of that explore the survey topic areas we're interested in.

Please use this issue to let us know about existing survey instruments and research on Consumption:

Previous studies of open source communities have focused on maintainers and contributors, to the near exclusion of those who are solely consumers of open source projects. We know from analyses of traffic on our site that the passive consumers of content on any public repository vastly outnumber the active contributors. Who are the consumers? What do they use these projects for? Why do they select open source projects instead of commercial products? To what extent are they engaged with, or even aware of, the philosophy and values of free and open source software?

Sexual orientation

"Queer" should be added to the lesbian, gay, bisexual question. Although this was a contentious word at one time, it is now recognized and adopted by a significant portion of people who are not straight.

Adding an "Other" answer to the questions about Harassment

I think the HARASSMENT-WITNESS and HARASSMENT-EXPERIENCE sections could benefit from an "Other" answer. For example, harassers can do the following to targets, none of which is clearly included in the current list: DDoS the personal site of another person or the site of their employer, ban them from open source projects by revoking privileges, and incite others to (often anonymously) engage in threats and harassment. I think it would be helpful to allow write-in responses instead of prescribing the range of likely answers.

"Real" name?

The language in the survey instrument puts "real name" and "pseudonym" in contrast with each other.

What is the "real name" that you're talking about? Is it one's legal name (or a derivative, such as someone named Alice who goes by Allie), is it one's professional name (the name one's co-workers and boss call one, and the name on one's resume), or is it the name that one feels is most legitimately and deeply one's own (such as the name that one's partner or closest friends use)?

For many people, these are the same thing.

For those people where these are not the same, what are you interested in hearing about?

Feedback on topic: Demographics

As described in the README, we'd love to hear from you about any existing survey instruments you know of that explore the survey topic areas we're interested in.

Please use this issue to let us know about existing survey instruments and research on Demographics:

How representative is open source of the software developer community? Of the world population? How do these characteristics correlate with experiences, motivations, and other aspects of participation in open source?

Feedback on topic: Mentorship & contributors

As described in the README, we'd love to hear from you about any existing survey instruments you know of that explore the survey topic areas we're interested in.

Please use this issue to let us know about existing survey instruments and research on Mentorship & contributors:

Software development is an unusual trade in that autodidacticism is common and people often learn from working with or observing others who are not geographically proximate and will likely never meet face to face. Nonetheless, previous research has demonstrated that 1:1 help and mentorship are critical to building skills, confidence, and professional networks. How do beginners seek and select mentorship within the open source community? What motivates some individuals to make casual (one-off) contributions to new projects rather than finding a mentor and diving deep into a project and a community? How do people willing to mentor make that known, and how do they select mentees?

Expanding Demographics to Include Inclusion

The Demographics section of the survey seems to focus on diversity by asking how representative the open source community is with regard to the developer community and the world population. However, what about expanding this to also discuss inclusion?

I think that talking about how representation in open source correlates "with experiences, motivations, and other aspects of participation in open source" hints at the topic of inclusion, but can this section also explicitly ask something like, "How inclusive is open source when it comes to contributions from underrepresented groups?" Or, maybe, "Why do certain demographic groups not contribute to open source software development?"

As it currently stands, the Demographics section is like a downtown restaurant asking, "Who are the people who come into our restaurant at lunch time?" That's an important question to answer, but I think a useful followup question would be, "Why are some people not coming into our restaurant at lunch time?" That's why I think this section should be expanded to explicitly inquire about inclusion.

Suggest clarification around "first access to a computer with internet"

I suspect the question "when did you first have access to a computer connected to the internet" is a bit misleading when it comes to older developers like myself, who had access to computers before there was a public-access internet to connect to.

Maybe this should be "first have access to a computer connected to the internet or other public networks such as BBSs"? Unless the intent is genuinely about use of the internet - in which case you still might need clarification that this is the case.

Q 30: Observations of events in the context of an open source project

This is a really nice survey, I'm glad to have participated.

There is one question, however, that made me a bit concerned: Question 30, "Have you ever observed any of the following in the context of an open source project?"
The conclusions one can draw from this question are obvious. At the same time, I am convinced that in any project in any context where humans are involved, one would answer yes to every item after having spent enough time in that setting.

I think these questions should allow answers on a scale between "never" and "very often" to allow estimating the importance of such events. Any conclusion drawn from the current answers (yes/no) will be completely meaningless.
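To make the concern concrete, here is a minimal sketch of how a frequency scale keeps information that a yes/no answer throws away. The five scale labels and the sample answers are made up for illustration; neither comes from the survey instrument.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical five-point frequency scale; the labels are an assumption,
# not wording taken from the survey.
SCALE = ["never", "rarely", "sometimes", "often", "very often"]


def to_yes_no(answer: str) -> str:
    """Collapse a frequency answer into the yes/no form Q30 uses."""
    return "no" if answer == "never" else "yes"


def summarize(answers: List[str]) -> Dict[str, int]:
    """Count answers per scale point, keeping the scale order."""
    counts = Counter(answers)
    return {label: counts.get(label, 0) for label in SCALE}


# Made-up answers for illustration only.
sample = ["rarely", "never", "rarely", "very often", "rarely"]

yes_no = Counter(to_yes_no(a) for a in sample)
by_frequency = summarize(sample)

print(dict(yes_no))
print(by_frequency)
```

In the yes/no view, someone who saw an event once and someone who sees it constantly land in the same bucket; the per-label counts keep them apart.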

Bad link in readme on this repo for @arfon's thesis (HTTP Status 404 - Not Found)

In the readme for this repo there is a link to @arfon's thesis: https://www.arfon.org/thesis
However this link is now dead (HTTP Status 404 - Not Found).
I've had a look on arfon's website but cannot see this document under another link.

Could you advise @arfon if this link can be found elsewhere?
If not then possibly this link could be used instead(?): https://www.arfon.org/#-some-background

Extra info

I found this using a new tool that I recently recreated that uses the GitHub API:
https://github.com/MrCull/GitHub-Repo-ReadMe-Dead-Link-Finder

This tool can be used to find bad links in repo's readme files.

See example url below that will re-check this repo:
http://githubreadmechecker.com/Home/Search?SingleRepoUri=https%3a%2f%2fgithub.com%2fgithub%2fopen-source-survey
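The re-check URL above carries the repository address as a percent-encoded `SingleRepoUri` query parameter. A minimal sketch of building such a URL for any repository, assuming the checker keeps accepting that parameter (the helper name `recheck_url` is hypothetical):

```python
from urllib.parse import quote

# Endpoint copied from the example URL in this issue.
CHECKER_SEARCH = "http://githubreadmechecker.com/Home/Search"


def recheck_url(repo_url: str) -> str:
    """Build a re-check link for one repository (hypothetical helper)."""
    # safe="" makes quote() percent-encode ":" and "/" as well.
    encoded = quote(repo_url, safe="")
    return f"{CHECKER_SEARCH}?SingleRepoUri={encoded}"


url = recheck_url("https://github.com/github/open-source-survey")
print(url)
```

Python emits uppercase hex digits (`%3A`, `%2F`) where the example uses lowercase; percent-encoded octets are case-insensitive, so the two forms are equivalent.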

I'd love to hear some feedback on this tool, especially from someone at GitHub, so feel free to pass on any comments or suggestions etc...

turning GitHub.com into an intelligent "all-ware" metaframework

Heyho everyone

GitHub is already, in itself, an awesome tool. Thanks to it I've been able to engage in open source development, meet new people, and eventually even find work. Even so, I think it would be awesome to turn GitHub into an intelligent allware metaframework. What do I mean by this?

  • all GitHub code would be turned into a single framework, which could even be hosted online;
  • this code would be able to embed and cross-bind any language;
  • high-level actions would be turned into boxes, just like in Scratch;
  • you could connect these boxes to make new apps, and then just compile them;
  • you could constantly add new functions in between;
  • this could all be done using a graphical interface similar to what you find in Quartz Composer, vvvv, TouchDesigner, or the Unreal Engine Blueprint system;
  • you could turn new user modules into new commands, and make apps out of apps;
  • you could group actions by class, and make mutant programs by recombining the called objects with other objects from the same classes.

Feedback on topic: Relationship to closed source development & the workplace

As described in the README, we'd love to hear from you about any existing survey instruments you know of that explore the survey topic areas we're interested in.

Please use this issue to let us know about existing survey instruments and research on Relationship to closed source development & the workplace:

Much open source development is subsidized by companies, who either employ engineers to work on open source projects, or allow them to work on those projects when they also provide value to the firm. We’d like to explore how supported developers feel by their employers to contribute to open source, i.e., are there IP agreements covering their open source contributions? If so, what do they look like?

In addition, we’d like to explore what the decision-making process looks like for companies adopting open source or incorporating an open source dependency in their technology stack.

Q40 makes no distinction between copyleft and non-copyleft licenses

Right now, Q40 reads

Which is closest to your employer’s policy on incorporating open source dependencies into your codebase?

Many employers (and their legal departments) are very skittish about incorporating dependencies on GPL-licensed (and copyleft generally) code, but accept or even encourage the use of MIT- or BSD-licensed (non-copyleft) dependencies.

As it stands, Q40 is difficult to answer accurately because it makes no distinction between these two different categories of licensing.

Please identify yourself on the survey

The survey just says “we”. It doesn’t say that it is GitHub that is behind it.
Granted, the survey is accessed from GitHub, but I don’t think it would hurt
to be clearer about it. First of all, I thought that there might be some
third party that had permission to request stuff from GitHub users. Secondly,
you might open pages like the survey in a new tab and only get back to the page
later, at which point you might have closed the tab from which you accessed the
survey, and forgotten the exact context in which you ended up clicking on the
survey.

Feedback on topic: Licensing

As described in the README, we'd love to hear from you about any existing survey instruments you know of that explore the survey topic areas we're interested in.

Please use this issue to let us know about existing survey instruments and research on Licensing:

How do open source project maintainers select licenses? What considerations go into the selection of an open source license? Is a project's license (or lack of one) a significant factor when choosing whether or not to contribute to or use a project?

Feedback on topic: Transparency vs Privacy

As described in the README, we'd love to hear from you about any existing survey instruments you know of that explore the survey topic areas we're interested in.

Please use this issue to let us know about existing survey instruments and research on Transparency vs Privacy:

Two strong values within the open source community are commitment to transparency and commitment to privacy and security of information. There is an inherent tension between these two values, as full commitment to either one limits the extent to which the other can be achieved. Let's explore the boundaries of this tension and establish what values the community holds and how they're ordered (the World Values Survey, https://en.wikipedia.org/wiki/World_Values_Survey, is related here).

"Libre", not "Libré"

The word does not take an accent in either Spanish or French.

(As a side note, the misspelling of "libre" and the participation of the OSI in the preparation of the survey does hint that the choice of using "open source" goes beyond mere "simplicity" as stated and suggests a closer affinity with one ideology over the other. The attempt at neutrality is still appreciated anyway.)

HSTS for opensourcesurvey.org

opensourcesurvey.org supports HTTPS as of #96. It would be great to go all the way with preloaded HSTS: https://hstspreload.org/?domain=opensourcesurvey.org

GH Pages supports HSTS headers, but it's not available externally yet so a Hubber like me needs to set it manually. Would it be alright for me to set the header to Strict-Transport-Security: max-age=63072000; includeSubDomains; preload?
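As a sanity check on the proposed value, here is a minimal sketch that parses a Strict-Transport-Security header and tests it against the preload criteria as I understand them: a max-age of at least one year (31536000 seconds), plus the includeSubDomains and preload directives. The function names are hypothetical, and the thresholds are an assumption that should be confirmed against hstspreload.org before relying on them.

```python
from typing import Dict, Optional

# Assumed minimum max-age for preload eligibility: one year in seconds.
ONE_YEAR_SECONDS = 31536000


def parse_hsts(value: str) -> Dict[str, Optional[str]]:
    """Split a Strict-Transport-Security value into lowercase directives."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directives without "=" (e.g. preload) map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def meets_preload_criteria(value: str) -> bool:
    """Check the three assumed preload criteria on a header value."""
    directives = parse_hsts(value)
    try:
        max_age = int(directives.get("max-age") or 0)
    except ValueError:
        return False
    return (
        max_age >= ONE_YEAR_SECONDS
        and "includesubdomains" in directives
        and "preload" in directives
    )


proposed = "max-age=63072000; includeSubDomains; preload"
print(parse_hsts(proposed))
print(meets_preload_criteria(proposed))
```

The proposed max-age of 63072000 seconds is two years, so it clears the assumed one-year floor with room to spare.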

Is the data available?

This is an issue in response to #71. I opened this issue with the hope that you will close it once the data is available in this repository. That way, anybody who watches this issue will get a notification when the data arrives.

Looking for off-site communities

We're looking for more off-site open source communities to participate in the off-site sample! If you are part of a community that works primarily somewhere other than GitHub (e.g. hosted elsewhere, or via mailing list, etc.), please fill out this form and we'll send you a community-specific link and email template to share.

Feedback on topic: Community safety

As described in the README, we'd love to hear from you about any existing survey instruments you know of that explore the survey topic areas we're interested in.

Please use this issue to let us know about existing survey instruments and research on Community safety:

Online harassment is a widespread problem that has discouraged participation in online spaces, such as those where open source development takes place, by women and other underrepresented groups. While there have been high-profile cases that generated substantial coverage in the press, and our support team is aware of cases that are escalated to GitHub, there has been no representative study of the prevalence, dynamics, and consequences of online harassment in open source.

Collection of references to existing developer & open source community surveys

The design guidelines make reference to considering (and potentially aligning with) other surveys of developer communities and open source communities, so I figure it makes sense to start an issue to let folks submit information on such surveys that they're aware of. With this survey planned to be delivered in multiple languages, hopefully folks may be able to highlight non-English language surveys as well (my own are all in English).

To start the list, some general surveys:

And some project & technology specific surveys:

These are some older UK focused surveys I hadn't seen before, but came across just now:
