This issue is about issue metadata (priority, versions, status, etc.), how/where to import it into GitHub, and what metadata to keep/add/remove/update. User/comment/file metadata will be discussed in a separate issue.
bpo tracks different metadata for each issue (see e.g. https://bugs.python.org/issue2771 ) including: title, comments, files (attachments), creator, creation, actor, activity, type, stage, components, versions, status, resolution, dependencies, superseder, assigned to, nosy list, priority, keywords, remote HG repos, linked PRs
The meaning of each field is explained in the devguide. The fields are defined in the schema.py of the bpo instance. The creator, creation (datetime), (last) actor, (last) activity (datetime) are common to all classes.
- GitHub already has corresponding fields for the following: title, messages (comments), linked PRs, assigned to (assignees), creator (user), and creation (created_at).
  - bpo stores messages as a list of ids on the issue; GitHub has a separate list of comments linked to the issue
  - GitHub issues have a body that contains the first comment
  - Linked PRs seem to be generated automatically at runtime, not at import/export time
- ❓ Does GitHub have fields for (last) actor, (last) activity (datetime)? Do we need them?
  - ✔️ there is an updated_at field (datetime), but no last actor. We probably don't need the last actor.
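The correspondences above can be sketched as a small lookup table. This is only an illustration: the bpo key names are the human-readable field names used in this issue (not necessarily the names in schema.py), `split_fields` is a hypothetical helper, and linked PRs are left out because GitHub seems to generate them at runtime.

```python
# Hypothetical mapping from bpo field names (as written in this issue)
# to the GitHub issue fields that already cover them.
BPO_TO_GITHUB = {
    "title": "title",
    "messages": "comments",
    "assigned to": "assignees",
    "creator": "user",
    "creation": "created_at",
    "activity": "updated_at",
    "actor": None,  # no GitHub equivalent; probably not needed
}


def split_fields(bpo_issue):
    """Split a bpo issue dict into natively supported fields and the rest."""
    native, leftover = {}, {}
    for name, value in bpo_issue.items():
        target = BPO_TO_GITHUB.get(name)
        if target is not None:
            native[target] = value
        else:
            # Anything without a GitHub field needs a label or another home.
            leftover[name] = value
    return native, leftover
```

Everything that ends up in `leftover` is what the rest of this issue is about.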
The other fields will need to be replaced with something else (mostly labels) or removed.
Labels in GitHub can be grouped with colors and/or with a prefix like priority-high, priority-medium, priority-low. GitHub is working on adding custom fields, but they will be available in ~6 months.
Actions can be used to automate certain tasks in addition to or instead of bots (e.g. adding labels, closing stale issues, etc.).
Unused metadata that is not converted to labels (or anything else) can be stored in a comment so that it can be retrieved if needed (e.g. if we move away from GH).
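One way to keep such metadata retrievable is to serialize it in a machine-readable form inside the comment. The sketch below assumes a hidden HTML comment with a JSON payload; the `bpo-metadata` marker and both function names are made up for illustration and are not part of the exporter.

```python
import json
import re

MARKER = "bpo-metadata"  # hypothetical marker name


def metadata_to_comment(metadata):
    """Serialize leftover metadata into a hidden HTML comment."""
    payload = json.dumps(metadata, sort_keys=True)
    return f"<!-- {MARKER}: {payload} -->"


def metadata_from_comment(body):
    """Recover the metadata dict from a comment body, or None if absent."""
    match = re.search(rf"<!-- {MARKER}: (.*?) -->", body, re.DOTALL)
    return json.loads(match.group(1)) if match else None
```

A value that itself contains `-->` would break this format, so a real implementation would need to escape it or use a different container.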
On the python/cpython repo there are currently 32 labels:
- 5 stage labels (yellow), apparently set by bedevere-bot: awaiting change review, awaiting changes, awaiting core review, awaiting merge, awaiting review
- 6 type-related (blue/red) labels: type-bugfix, type-documentation, type-enhancement, type-performance, type-security, type-tests
- 5 version-related (gray) labels for backports (used by bots): needs backport to 3.6 through needs backport to 3.10
- 5 more labels used by bots: automerge, DO-NOT-MERGE, skip issue, skip news, test-with-buildbots
- 2 CLA-related labels (used by bots): CLA not signed, CLA signed
- 2 OS-related labels: OS-mac, OS-windows
- 7 more misc labels: invalid, ctypes, dependencies, expert-asyncio, spam, sprint, stale
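To illustrate grouping by prefix, here is a minimal sketch that buckets label names by a known set of prefixes. The prefix set (`type`, `OS`, `expert`) is an assumption read off the list above; labels such as DO-NOT-MERGE contain a dash but are not treated as prefixed.

```python
from collections import defaultdict


def group_by_prefix(labels, prefixes=("type", "OS", "expert")):
    """Group label names by a known prefix; everything else is ungrouped."""
    groups = defaultdict(list)
    for label in labels:
        prefix, sep, _ = label.partition("-")
        # Only treat the part before the first dash as a group
        # when it is one of the agreed-upon prefixes.
        key = prefix if sep and prefix in prefixes else "(ungrouped)"
        groups[key].append(label)
    return dict(groups)
```

New prefixed groups (e.g. priority-*) would only require adding the prefix to the tuple.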
This is the full list of all the fields we have in Roundup, and how we could convert them to GitHub Issues:
- The exporter creates an event when the title has been updated
- The exporter exports comment author, content, and date.
- See #3 for more info on the msg content.
- Files will still be hosted on bpo
- The exporter will create direct links to the files
- The exporter sets the Assignees field and creates events when the assignee changes
- Linked PRs cannot be imported, and the list can't be populated automatically
- PRs are now listed in the table at the top of each imported issue
- To replace the nosy list users can (un)subscribe to individual issues, and can be @mentioned.
- The nosy list users are listed/mentioned in the table at the top of each issue, but this doesn't affect subscriptions.
- ❓ How can we preserve the initial nosy list? @mention all nosy list users in the first message?
- ✅ it's possible to subscribe people to the issue without sending out any notification when the issues are imported, and to enable notifications afterwards so that they will get updates.
- ❌ Subscribing other people is not possible, but it might be possible to retrigger mentions by editing the imported messages to have them notified.
- #12 might also help
- ❓ How can we replace the nosy autocomplete?
- ✅ probably not possible, but GitHub suggests reviewers and there is a CODEOWNERS file
- ❓ Can we automatically add people when a certain label is added?
- ✔️ this is now possible, see #16
- ❓ What options do we have to track dependencies with GitHub? (Projects might be one way, but they are probably overkill for simpler cases -- other ways?)
- ❌ currently there is no built-in support for dependencies, GitHub might add it later.
- ✔️ It is now possible to add a checkbox list of issues, and GitHub will track them as tasks (won't enforce closing all the dependencies before closing the issue though)
- ❌ this doesn't work in tables, so either we list them in a table as a plain list with no checkboxes, or the list of deps should be moved after the table. Since these are `bpo-xxxxx` issues, even if they are moved after the table the checkboxes won't be updated automatically.
- Dependencies are now listed on the table at the top
- Projects/milestones could also be used to track complex issues that are broken down into multiple issues.
- ❓ Does GitHub have a way to mark an issue as duplicate?
- ✅ writing `Duplicate of #xxxxx` as a reply marks the issue as duplicate. A default "duplicate" reply can also be added to the saved replies (the icon with the left-pointing arrow on the top-right).
- ❌ This doesn't work with a `bpo-xxxxx` ref, so it can't be used for imported issues
- ✅ we might be able to replace the `bpo-xxxxx` ref with a GH ref after the migration
- The superseder is now included in the table at the top
- If the link to a remote HG repo still works, it should be converted to a PR (or a patch)
- ❓ Do we need to import the link into GitHub?
- there are currently 340 valid links and 228 unique ones
- of the 228 unique ones, 88 are reachable, 125 are 404, and 14 are unreachable
- of the 88 that are reachable, 55 are hg.python.org links, 26 are GH/Gist links (so invalid HG links, but might contain a valid patch/branch), and 7 link to other repos
- I could add a "linked repos" row to the table, a simple link to the bpo issue that says "There are repos with patches linked to the original issue", or just ignore them.
- There are currently 7 types on bpo: behavior, crash, compile error, resource usage, security, performance, enhancement
- There are currently 6 type-* labels on GitHub: type-bugfix, type-documentation, type-enhancement, type-performance, type-security, type-tests
so:
- type-bugfix seems to replace behavior, crash, compile error
- type-enhancement, type-performance, and type-security replace the corresponding fields
- resource usage is gone (possibly included in type-performance)
- type-tests and type-documentation are set automatically for `test_*.py` and `*.rst` files (not sure if they should be types -- they were components on bpo and got added in python/bedevere#108)
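The type conversion described above can be written down as a lookup table. This is a sketch, not the exporter's code: the names `BPO_TYPE_TO_LABEL` and `label_for_type` are hypothetical, and mapping resource usage to type-performance is only the "possibly included" guess from the list.

```python
# Hypothetical bpo type -> GitHub label table, following the list above.
BPO_TYPE_TO_LABEL = {
    "behavior": "type-bugfix",
    "crash": "type-bugfix",
    "compile error": "type-bugfix",
    "enhancement": "type-enhancement",
    "performance": "type-performance",
    "security": "type-security",
    # Assumption: resource usage is folded into type-performance.
    "resource usage": "type-performance",
}


def label_for_type(bpo_type):
    """Return the GitHub label for a bpo type, or None if the type is unset/unknown."""
    return BPO_TYPE_TO_LABEL.get(bpo_type)
```

type-tests and type-documentation are deliberately absent, since they are set from the changed files rather than from the bpo type.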
The stage could use the existing stage labels. An awaiting triaging label might be added.
- There are currently 3 statuses: open, pending, closed
- Events are now created for closed/reopened issues
- Issues are labeled with the stale label when pending
- There are currently 27 components: 2to3 (2.x to 3.x conversion tool), Argument Clinic, asyncio, Build, C API, Cross-Build, ctypes, Demos and Tools, Distutils, Documentation, email, Extension Modules, FreeBSD, IDLE, Installation, Interpreter Core, IO, Library (Lib), macOS, Regular Expressions, SSL, Subinterpreters, Tests, Tkinter, Unicode, Windows, XML
- ❓ People can be automatically added to the nosy list when a component is selected; can we automatically do the same with labels?
- There are currently 5 versions: Python 3.10, Python 3.9, Python 3.8, Python 3.7, Python 3.6
- Versions need to be added/removed as new versions of Python are released/retired.
- ❓ Do we want to keep versions?
- There are currently 11 resolutions: duplicate, fixed, not a bug, later, out of date, postponed, rejected, remind, wont fix, works for me, third party
- ❓ Do we want to keep resolutions?
- There are currently 6 priorities: release blocker, deferred blocker, critical, high, normal, low
- We might be able to get rid of this field and use milestones for release/deferred blocker.
- ❓ Can we automatically warn release managers somehow?
- ✅ if we keep the release/deferred blocker labels we could set autonosy for the RMs (see #16)
- ✅ we could use milestones/projects to track release/deferred blockers for each release and the RMs can use/follow those more easily.
- There are currently 17 keywords: 3.2regression, 3.3regression, 3.4regression, 3.5regression, 3.6regression, 3.7regression, 3.8regression, 3.9regression, buildbot, easy, easy (C), gsoc, needs review, newcomer friendly, patch, pep3121, security_issue
- ❓ Do we want to keep any of these?
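The idea mentioned earlier of replacing `bpo-xxxxx` refs with GH refs after the migration could look roughly like the sketch below. It assumes a bpo-number-to-GitHub-number mapping is available once all issues are imported; `rewrite_refs` and the mapping are hypothetical, and refs without a known GitHub issue are left untouched.

```python
import re

BPO_REF = re.compile(r"\bbpo-(\d+)\b")


def rewrite_refs(text, bpo_to_gh):
    """Replace bpo-NNNN refs with #MMMM using a bpo -> GitHub number mapping."""
    def replace(match):
        gh_number = bpo_to_gh.get(int(match.group(1)))
        # Keep the original ref when the issue has no known GitHub number.
        return f"#{gh_number}" if gh_number is not None else match.group(0)
    return BPO_REF.sub(replace, text)
```

Applied to superseder or dependency text, this would turn a `bpo-xxxxx` ref into a `#xxxxx` ref that GitHub recognizes, e.g. in a `Duplicate of #xxxxx` reply.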