
data-review's Introduction

Forms for Firefox Data Collection Review Process

This repository contains templates for the Firefox data collection review process.

New Firefox data collection (for the client, e.g. telemetry) and services (e.g. Firefox Accounts) must be reviewed and approved prior to deployment of collection code. Our data collection review process is designed to ensure that data collection meets our data and privacy policies and that there is sufficient documentation for all data collection in Firefox.

If you are seeking review for new data collection, please use the request.md form in this repository. Data stewards should fill out the review.md form in this repository in response to a request. We provide both forms so that requesters know what stewards are looking for when performing a review of a request for data collection.

You can read more about the process and view a current list of data steward peers at https://wiki.mozilla.org/Data_Collection.

data-review's People

Contributors

badboy, brizental, chutten, kennylong, liuche, mozilla-github-standards, rfk, rjweiss, tdsmith, travis79, wlach

data-review's Issues

Create a question around private browsing behavior

For now, and for Firefox measurement only, consider adding a question to the existing data collection request forms that asks how the measurement should behave in private browsing mode.

Wiki changes

FYI: The following changes were made to this repository's wiki:

These were made as the result of a recent automated defacement of publicly writable wikis.

[request_renewal.md] doesn't require responsible individual

If a probe is being renewed and the expiration is being changed to forever, then the requester needs to specify a responsible individual.

The request.md form has this:

data-review/request.md, lines 44 to 50 in fe0b96a:

7) How long will this data be collected? Choose one of the following:
* This is scoped to a time-limited experiment/project until date MM-DD-YYYY.
* I want this data to be collected for 6 months initially (potentially renewable).
* I want to permanently monitor this data. (put someone’s name here)

The request_renewal.md form has:

2) When will this collection now expire?

I claim the two should be in sync and we should use the same language.
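
To make the proposal concrete, here is a minimal sketch of what "in sync" could mean if both forms shared one set of duration options and one check. The names `Duration`, `ExpiryAnswer`, and `validate_expiry` are hypothetical and exist only for illustration; nothing like this is part of the repository, which contains only markdown forms.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class Duration(Enum):
    """The three choices offered by the duration question in request.md."""
    TIME_LIMITED = "time-limited experiment/project"
    SIX_MONTHS = "6 months initially (potentially renewable)"
    PERMANENT = "permanent monitoring"


@dataclass
class ExpiryAnswer:
    """Hypothetical shared answer shape for request.md and request_renewal.md."""
    duration: Duration
    end_date: Optional[date] = None               # expected for TIME_LIMITED
    responsible_individual: Optional[str] = None  # expected for PERMANENT


def validate_expiry(answer: ExpiryAnswer) -> List[str]:
    """Return a list of problems; an empty list means the answer is complete."""
    problems = []
    if answer.duration is Duration.TIME_LIMITED and answer.end_date is None:
        problems.append("time-limited collection needs an end date")
    if answer.duration is Duration.PERMANENT:
        name = (answer.responsible_individual or "").strip()
        if not name:
            problems.append("permanent collection needs a responsible individual")
    return problems
```

Under this sketch, a renewal that switches a probe to permanent collection without naming anyone would be flagged in the same way as a new request would.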

CODE_OF_CONDUCT.md file missing

As of January 1, 2019, Mozilla requires that all GitHub projects include this CODE_OF_CONDUCT.md file in the project root. The file has two parts:

  1. Required Text - All text under the headings Community Participation Guidelines and How to Report is required and should not be altered.
  2. Optional Text - The Project Specific Etiquette heading provides a space to speak more specifically about ways people can work effectively and inclusively together. Some examples of those can be found on the Firefox Debugger project, and Common Voice. (The optional part is commented out in the raw template file, and will not be visible until you modify and uncomment that part.)

If you have any questions about this file, or Code of Conduct policies and procedures, please see Mozilla-GitHub-Standards or email [email protected].

(Message COC001)

[request.md] Highlight that "The Firefox Data Collection Preference" is an acceptable option for Question 8

Question 8 reads "If this data collection is default on, what is the opt-out mechanism for users?"

The most common answer I want to see from people is "The Firefox Data Collection Preference" (because I get a lot of bog-standard Telemetry requests). Sometimes requesters are confused and ask if they need to implement something separate from the global telemetry opt-out if they're adding opt-out telemetry (they don't. Please don't do this.)

Shall we add this as a preferred/recommended option?

[wiki] Category 3 language for opt-out eligibility on release is a little awkward

A reader writes in noting some confusion on the following paragraph:

Release: Default off. May be eligible for opt-out on a case-by-case basis if mitigations are identified. Mitigations may include UX changes that make users aware of additional risk, technical mechanisms that remove the risk, or a risk assessment done of a case-by-case basis that determines the risk is limited.

The first and easiest to remedy is "risk assessment done of a case-by-case basis". This should probably be "on a case-by-case basis".

The second is that the reader wasn't clear on why, if collection is default off, we say it is eligible for opt-out. To use their words directly:

If the default is off, shouldn't the option be to opt-in instead of opt-out, or do I misunderstand something?

I think we can restructure this slightly to make it clear that "Default off" is not only the default setting but is the default posture for collecting Category 3 data on release. We can maybe use formatting or word choice to highlight that mitigations to allow opt-out are an exception, not a clarifying statement.
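
One way to read the intended logic is sketched below. This is only my interpretation of the paragraph, not wording from the wiki: the `Posture` enum and `category3_release_posture` function are hypothetical names, and the boolean stands in for a data steward's case-by-case decision.

```python
from enum import Enum


class Posture(Enum):
    OPT_IN = "default off: nothing is collected unless the user opts in"
    OPT_OUT = "default on: data is collected unless the user opts out"


def category3_release_posture(steward_approved_mitigations: bool) -> Posture:
    """Category 3 data on release is opt-in unless an exception was granted.

    The argument represents the outcome of a case-by-case review; mitigations
    merely existing does not change the posture without that approval.
    """
    if steward_approved_mitigations:
        return Posture.OPT_OUT  # the exception
    return Posture.OPT_IN       # the default posture
```

Read this way, "default off" describes the starting posture, and opt-out is a separate outcome that only an approved exception can produce.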

[wiki] Users have expressed confusion about Step 1

Users have come to me for clarification about how to file for Data Collection Review. The topics of confusion usually manifest as questions like

  • "So am I supposed to just copy the form?"
  • "Can I reuse the bug where I'm adding the probe?"
  • "Does bugzilla support markdown now?"

I think this stems from a couple of wording choices and some perceived rigidity. For instance "clone" means something in git, so its use for the form suggests that users should fork the repo or create a gist or something, which is in conflict with the following substeps.

I propose a rewording:

To request a review, Data Review requesters require the following:

  • A completed Request Form, documenting what data is to be collected, why Mozilla needs to collect this data, how much data will be collected, and for how long it will be collected:
    • Take this request and fill it out completely (It is in markdown for easy reading of the template, but we understand bugzilla will likely render it as plain text. That's fine.)
  • A bug to attach the completed Request Form to:
    • Attach the completed Request Form to the bug, or paste the contents as a comment in the Issue
    • Tip: reuse the bug you filed to add the collection code
  • A notification so the Data Steward knows it's time to review your Request Form:
    • r? the attachment, ni? the bug, or @ the Data Steward on the GitHub Issue
    • If the Data Steward doesn't get back to you within a couple of days, please email the list

Bikeshed: begin

The form is too long and discouraging me from adding telemetry probes

tl;dr This form imposes on everybody the burden of the process necessary for the most dramatic cases, which adds more work to an already complicated procedure.

First of all, I'm proud that we at Mozilla have managed to create an internal culture of seeing user privacy protection as one of our central principles (and in fact my job mostly revolves around protecting user privacy in Firefox). Thank you for your work on data stewardship.

Adding a telemetry probe has never been quick and easy. The simple technical limitation that it can't be done using artifact builds, the (rightfully) short expiration time and the data steward review process have provided some necessary overhead that I anecdotally know has discouraged the addition of some "trivial" telemetry probes so far.

This form complicates things even further.

I know that the intention behind this is good (if I understand it correctly it's intended to be like the uplift request comment template on Bugzilla). In fact I was always a bit uncertain how to properly request data-review, so I can get fully behind a more formalized process.

But the questionnaire in its size and tone (and the fact that it's not a Bugzilla comment template) makes me urgently want to do something other than add a new telemetry probe to Firefox.

Excerpts I'm skeptical about (note that these complaints are specifically about adding telemetry to Firefox; this form looks like it could be used for other things as well):

All questions are mandatory.

Doesn't set a great mood. "Please fill out all questions"?

  3. What alternative methods did you consider to answer these questions? Why were they not sufficient?

For harmless data, this question feels inappropriate. This question should only be asked when the data we are collecting is in fact category 3 or 4 data. For other categories the honest and correct answer to this is "we didn't consider alternative methods, why should we?". This kind of question should be left to the data reviewer, IMO.

  4. Can current instrumentation answer these questions?

I probably just misunderstand, but what's the difference to number 3?

  5. List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories on the found on the Mozilla wiki.
    Measurement Description | Data Collection Category | Tracking Bug #

The table there shows a "Tracking Bug #", what is that supposed to mean?

  6. How long will this data be collected? Choose one of the following:

Firefox telemetry has an expiration version on every probe.

  7. What populations will you measure?

Have data stewards historically had trouble finding out about this? (Honest question, this might be a good thing to ask, I would just like to find out while I'm here).

  8. Please provide a general description of how you will analyze this data.

Why bother? How would my answer influence the decision? Almost all the data is public, anyone can do any kind of analysis with it after it gets recorded, right?

  9. Where do you intend to share the results of your analysis?

See 8., is this question necessary for public data?
