timokau / marvin-mk2 Goto Github PK

Discontinued! See https://github.com/timokau/marvin-mk2/issues/34#issuecomment-1100656280 (Previously: "Making sure your PR gets a review and your reviews don't get lost.")

Home Page: https://github.com/apps/marvin-mk2

License: MIT License

Nix 4.42% Python 95.54% Procfile 0.04%

bot github nixpkgs

marvin-mk2's People

Contributors

Stargazers

Watchers

Forkers

mic92 ryantm blaggacao fgaz kevincox bennyandresen ekleog asymmetric lunnova

marvin-mk2's Issues

Marking checklist

We often have checklist like this: NixOS/nixpkgs#55370 (comment)

but the issue is that only nixpkgs-committers can mark them as done or relate issues to them. There could be a bot command that allows users to say something like:

@marvin-mk2 mark #55370 nixos/modules/services/network-filesystems/samba.nix

And than the bot just does simple pattern matching in the first issue post for this checkbox.

PR reviews

Should work in PR reviews. I thought I covered this (listening to the pull_request_review_comment event), but apparently not: NixOS/nixpkgs#85023 (comment)

Store the reviewer team in the nixpkgs repo

This would make it easier to update (without me as the bottleneck and not requiring a deployment). It would also potentially make it more discoverable (could be integrated with maintainers.nix, though it would lose some expressiveness in that case) and make testing easier (different set of reviewers for the playground so that people are no longer pinged on accident).

Allow reviewers to tell marvin to opt-in

Is there a typo in NixOS/nixpkgs#92152 (comment) ? Should I be part of the team?

Make "active PRs below limit" check more precise

It currently uses github's search to filter by dates. It would be better to be more precise, since being assigned one PR one minute before midnight and another one one minute after isn't quite the intention.

Look into a testing approach

#25 could've been prevented by tests, even if they just test the hooks on some github sample payloads. Having such sample payloads in the repository would be useful anyway.

https://github.com/python/bedevere does that.

pr got approved, but not merged. bot does nothing

the bot should do something to get the pr merged

NixOS/nixpkgs#77009

the bot could set /status needs_merger, when a reviewer has no commit rights and approved

maybe there should be a rule that a pr should have at least 2 approvals before it can get merged, so an impactful pr get's not merged by the first reviewer

Do not distinguish between '-' and '_' in /status command

After using opt-in using - feels more natural than _. Accepting both avoids users doing stupid things like here.

Plus, with some keyboard layouts, - can be entered without a modifier key, where e.g. _ needs -+ Shift.

Notify reviewers when awaiting_changes->awaiting_reviewer through code changes

GitHub doesn't always notify of those, for example I didn't receive a notification for the force-push in NixOS/nixpkgs#85115. There is a tradeoff with noisiness and false-positives here.

Automatic state transitions

Pushing a change should reset any review. Maybe github approvals should count as a review, though that's debatable since people interpret them in different ways.

Marvin for issues

Hey!

This is again an idea that can safely be postponed to after marvin with its current feature set lands.

I'm thinking, something similar to marvin would probably make sense too for issues too? With a state graph that'd look like:

needs_feedback: can become needs_more_details or needs_pr
needs_more_details: can become needs_feedback
needs_pr: equivalent of needs_merger, can become needs_more_details or has_pr
has_pr: can become needs_more_details, needs_feedback or needs_pr

What do you think? There'd basically be three roles:

reporting issues: equivalent of PR author for PRs
giving feedback, asking for more details or a reproducer: equivalent of PR reviewer in the current system
creating a PR for solving an issue: equivalent of PR merger in the current system

The allowed transitions are:

needs_feedback -> needs_more_details or needs_pr: can be done only by “issue reviewers” (role 2)
needs_more_details -> needs_feedback: can be done by anyone (to avoid an unresponsive issue opener to block other people concerned with the issue from getting resolution)
needs_pr -> needs_more_details: can be done by PR creators (role 3) (closing the issue is done by just having “closes #issuenumber” in the created PR)
has_pr -> needs_more_details: can be done by roles 2 and 3
has_pr -> needs_feedback: can be done by anyone (eg. if the PR doesn't actually solve the issue)
has_pr -> needs_pr: probably should eventually be done automatically if the PR is closed, though in a first time this should probably be manual.

(I originally added a thing for “needs_reproducer,” and then decided it was not worth the complexity increase in the state transition graph)

Note that this would probably requires a special handling system for darwin-specific things, as I don't expect people can make PRs for darwin without having a darwin system themselves.

And then marvin could become like stale bot but better, by “stale”-ing only issues that are being blocked on needs_more_details as this is the only state in which it makes sense to stale issues.

What do you think?

(Again, this is exploratory only, I definitely believe there's strong value in marvin as it currently is, just taking advantage of the bugtracker to track ideas for future long-term improvements)

Handle merge conflicts

It would be nice if marvin automatically handled merge conflicts, for example by asking the author to resolve them and set the status to awaiting_changes. References:

Request reviews on old PRs

If no reviewer was available when the state changed to needs_merge (or needs_review once that is implemented), we should go back after some time and try to find a reviewer again.

Short forms

E.g: /status needs_review -> /s nr

Might supersede #31 ex resultatio.

Marvin should not change the status when ofBorg requests a reviewer

Here it changed the status to awaiting_reviewer. From my experience, ofBorg requested maintainers are often unresponsive. So until they actually start the review we should treat the PR as needs_reviewer.

Marvin is pretty verbose/spammy

Currently, you need to enable Marvin via commands in the PR description or a separate comment when you forget to include it in the PR like I always do. It then adds a comment describing itself, adds two labels (separately but GH does batch them after a short while thankfully), requests a review from a suitable reviwer and then changes labels to reflect the status after its action.

You can find a block like this at the very top of pretty much all Marvin PRs:

I think that is pretty verbose and could be simplified:

The comment commands are nice but the boilerplate setup that needs to happen in every Marvin PR could also be implied by the PR owner setting the marvin label on their PR. Simple and unintrusive.
needs_reviewer is implicit when marvin is first enabled (every PR needs at least one reviewer) and this is the only entry point AFAICT. Marvin adds this label only to remove again it seconds later if it finds a suitable reviwer.
Marvin doesn't need to explain itself in every PR. If it's not immediately obvious what it does from its actions or if one wants to find out more, a user can simply click on Marvin's name.

I think it'd be better if the boilerplate was limited to:

Owner sets marvin label
Marvin requests review
Marvin sets awaiting_reviewer label

Evict marvin from draft PR

marvin should evict draft PRs entirely.

Arguably, marvin's state machine is meant to kick in and push on the "last mile".
Draft PRs are a great way to signal "don't bother looking at this yet".

Example of awaiting_changes - in the context of marvin's state diagram - signalling a review process while being draft / [ WIP ]: NixOS/nixpkgs#58230

Idea (once marvin graduates into opt-out):

draft (thus evict) all new PRs upon creation
kick in marvin's state machine upon an activation intent by the author
- /status needs_review or a semantic analogon, like /marvin go (which has going viral potential)

PR not put back to 'needs_reviewer'

In NixOS/nixpkgs#98938 (comment) , the reviewer is inactive, and marvin (correctly) adds a comment that the PR will be set back from awaiting_reviewer to needs_reviewer, but it doesn't actually do so.

No reaction after 20 minutes

They should at least add a label i think.

NixOS/nixpkgs#89124 (comment)

Take check status into account

Triage could automatically set to awaiting_changes and post an appropriate message when checks failed. Assigning a reviewer could be postponed until checks succeed.

https://docs.github.com/en/github/searching-for-information-on-github/searching-issues-and-pull-requests#search-by-commit-status

Planned changes on the status graph

Trying to implement #23 has caused me to re-think the status labels. The intention from the beginning was to make PRs that need somebody (either any reviewer or a reviewer with commit bit) easy to find for the relevant people.

The needs_review and needs_work labels kind of do that, but they are not specific enough. We don't really care that much about the status of a PR which is currently in review. What do care about is which PRs are currently in need of a new reviewer.

Supporting this needs a bit of a more complicated state graph. Here's what I have come up with:

The two interesting states are (2) and (5) (highlighted in blue). They entirely unambiguously mark PRs that need somebody new to look at them. This makes them easier discoverable for reviewers with some time to spare. It also makes it easy for marvin-mk2 to automatically triage them. The plan is to regularly

Search for (3), (6) sorted by least recently updated. Timeout those that are stale for let's say >=3d.
Search for (5) sorted by oldest (since least recently updated could create perverse incentives). Assign privileged reviewers as long as available (i.e. people who have signed up as a reviewer and whose rate limit has not yet been reached)
Search for (2) sorted by oldest (since least recently updated could create perverse incentives). Assign privileged or unprivileged reviewers as long as available.

Also note that only a single transition (the green one) needs manual intervention. People seem (understandably) a bit hesitant to spam PRs with /status messages, so this should be an improvement. Manual changes should still be possible if the heuristics fail.

Regarding naming of the labels, I'm thinking
(1) -> unlabeled / work_in_progress
(2) -> needs_reviewer
(3) -> awaiting_reviewer
(4) -> awaiting_changes
(5) -> needs_merger
(6) -> awaiting_merger

How to handle typos

When you mistype a /status something command and then edit the comment to correct it, this change is not picked up by marvin.

Do not transition needs_reviewer -> awaiting_changes on self-review

Example: NixOS/nixpkgs#107148. We have to change the line

marvin-mk2/marvin/status.py

Line 105 in 34c1b7f

elif "needs_reviewer" in labels:

to add a check for the author of the review.

We already make an exception on issue_comment and pull_request_review_comment, but its missing for pull_request_review_submitted.

Request reviews on needs_review

This needs a bit more care than needs_merge requests. It likely has a higher frequency, and we'd want to preferably re-request the previous reviewer after a needs_review->needs_work->needs_review iteration. Probably other things to consider too.

Switch to awaiting_changes when converted to draft

As suggested by @cole-h here: timokau/marvins-playground#21 (comment)

Dont transition to awaiting changes for simple comments

NixOS/nixpkgs#78451 (comment)

(I think #49 would avoid this, by acting upon github "unresolved reviews")

Call for beta-testing reviewers

I just pushed a new version with a new usage model. This new model now makes it possible to assign reviewers and automate triage. For now I am the only reviewer. I'd like to have a handful more to scale up the test. Not too many though, as everything is still in development.

As a reviewer you can set a limit on how many active nixpkgs PRs you want to be involved in. For example if you have already created 2 PRs today and reviewed one more, you are involved in 3 PRs in the period of one day. If you have set a limit of 4PRs/day, marvin will still request one additional review.

4PRs is likely too much, start slow. Be aware that you're not only signing up for reviews, but also for marvin-mk2 beta testing. Things might go wrong, though I try to avoid that of course.

For now, I will get pinged on every PR that is marked as needs_merger. Your reviews will lead to something. Once that gets too much for me, I'll look for help on that front too.

The easiest way to get started is to just make a PR and add yourself here. The format may not be the most understandable, so I can also do that for you if you prefer.

Any interest @cole-h, @turion, @doronbehar, @glittershark, @symphorien, @fgaz, @evils, @JJJollyjim, @bbigras, @mdlayher?

Document the testing process

Hey! In the same idea as #112 ; would it be possible for you to write some documentation for how you test marvin when you develop on it? Creating multiple github accounts / repositories is totally OK, I just have no idea how to run marvin myself :)

~topical reviewers?

I saw someone in #nixos ask specifically for a reviewer comfortable with C, and it just made me wonder if the ability to summon another reviewer who knows more about a specific language or software stack would be a useful ability for marvin in the longer term.

I would guess it's actually a reviewer command, so that it's only used if it poses domain-specific questions that a general reviewer can't answer.

Schedule triage runs more intelligently

Currently it just runs whenever requested (when a relevant state was set) or after one day. It might also run whenever reviewers become available (maybe conditioned on the cached number of needs_reviewer PRs) (#71) or whenever we predict time-out action to be necessary. As a result, reviewer would be assigned more quickly and timeouts would happen more timely.

Transition needs_reviewer -> awaiting_reviewer on non-author comment

NixOS/nixpkgs#92243 (comment)

Do not transition to awaiting_changes when another review is requested via @mention

Example: NixOS/nixpkgs#92243 (comment)

Maybe the transition should only happen on explicit change request. That would increase false-negatives, but that might be okay. False positives is what we should avoid here, since awaiting_changes takes a PR out of the triage cycle.

Simplified state diagram

Coming from #29, superseding #40, abiding by #29 (comment) :

automatic reviewer assignment (making sure no PR gets drowned out in the queue),

merger assignment (making sure reviews of non-committers mean something),

easy discovery of PRs that need attention and

automatic triage (making sure no PR gets lost due to inactive reviewers).

Also supersedes #22, #44

Link To SVG

Notes:

/marvin go menas to gitub.undraft and LastMile.label
/marvin merge means to assign another github reviewer (really: "merger") , which has committer rights

marmaidjs source

stateDiagram

  %% definitions
  Draft : Github
  Draft : Draft

  NoReviewPending : Github
  NoReviewPending : NoReviewPending ("all resolved")

  Merged : Github
  Merged : Merged

  Closed : Github
  Closed : Closed

  Stale : Stalebot
  Stale : Stale

  LastMile : Marvin
  LastMile : LastMile.label

  LastMeter : Marvin
  LastMeter : LastMeter.label

  %% going stale
  Draft --> Stale
  Stale --> Closed
  Closed --> [*]

  %% going marvin
  Draft --> LastMile : `/marvin go`.commented - anyone
  LastMile --> Assigned : <mv assignment heuristic>.done
  Assigned --> NoReviewPending : <github transition>.done
  NoReviewPending --> Merged : <manual action>.done - committers only
  NoReviewPending --> LastMeter : `/marvin merge`.commented - reviewers only
  LastMeter --> Merged : <manual action>.done - committers only
  Merged --> [*]

  %% internal marvin states during state Assinged & LastMeter
  state Assigned {
    _ReviewerIsUnresponsive : Marvin
    _ReviewerIsUnresponsive : ReviewerIsUnresponsive.internal_state
    [*] --> _ReviewerIsUnresponsive : <3 days w/o review>.passed
    _ReviewerIsUnresponsive --> [*] : <mv assignment heuristic>.done
  }
  note left of Assigned
      Github ...
  end note


  state LastMeter {
    _MergerRequested : Marvin
    _MergerRequested : MergerRequested.internal_state
    [*] --> _MergerRequested
    _MergerRequested --> [*] : <mv assignment heuristic-committers>.done
    --
    _MergerIsUnresponsive : Marvin
    _MergerIsUnresponsive : MergerIsUnresponsive.internal_state
    [*] --> _MergerIsUnresponsive : <3 days w/o merge>.passed
    _MergerIsUnresponsive --> [*] : <mv assignment heuristic-committers>.done
  }
  note left of LastMeter
      Marvin Label ...
  end note

Features:

only introduces two actionable labels:
- LastMile -> in review
- LastMeter -> towards merge
subjective reviewer primarily actionable filters are "one of those labels combined with assignment to self"
subjective reviewer helping hands filters are "one of those labels; optionally with or—explicitly—without pending review"
subjective author needs action filters are "one of those labels combined with pending review"

~~Short forms: (#41)~~

~~- /marvin go -> /m g~~
~~- /marvin merge -> /m m~~
~~- /marvin delegate @timokau -> /m d @timokau~~

Reviewer pool

One exciting addition to this would be a reviewer pool. People could enter themselves, their permissions (commit bit or not) and how many review requests they want to get. Marvin could then assign reviewers from the pool as long as they are available.

Bot not activated if commands are in initial PR comment

Have a look at this example: NixOS/nixpkgs#98084

I needed to say the commands in a separate comment before the bot was activated.

Set awaiting review upon opt-in

If a reviewer was already assigned at the moment I /marvin opt-in, marvin should transition to awaiting_reviewer-state.

Avoid re-requesting the same reviewer

See this PR: NixOS/nixpkgs#91662

The bot has added and removed the same labels a few times. Maybe that was just for debugging purposes, or maybe it's a bug.

Label for approved by maintainers/users

Sometimes maintainers/users approve a PR but cannot merge it.
I wonder if there should be a flag for that so people with merge rights
can go quickly over a list of tested PRs and merge them.
This might be especially interesting for ryantm bot updates.

Avoid labeling race conditions

This is especially relevant after #76. When somebody leaves multiple comments in a pull request review, all of them (and the final review submission) can trigger the same -> awaiting_reviewer transition. They will race, and all but the first one will raise an exception since they try to re-remove the old label.

Not sure how best to prevent this. Just ignoring the error is an option, but not a very nice one. Another option would be to re-fetch the issue before setting a label, but that only tightens the window for the race condition.

Switch to needs_work when changes are requested

Approvals probably should not generally mean needs_merge, but change requests are pretty unambiguous.

add blacklist for comment-triggered transitions

comments by bots such as nixos-discourse should not trigger transitions such as #75

RFC-30 / RFC-0036 / RFC-0039 / RFC-0051 / RFC-0055

Marvin is basically a trial run of rfc 30. (comment)

NixOS/rfcs#30

Make label compatible with existing ones

Our current labels start with number for priority and also have a color scheme according to priority: NixOS/nixpkgs#92249
Marvin should also follow the same scheme.

Add /marvin delegate @[...]

NixOS/nixpkgs#92243 (comment)

(Aside: marvin could apply those same heuristics exposed in the comment on git logs to assign in the first place.)

Enables reliable identification of #44

I would argue a best-effort manual delegation by the delegating reviewer (which is marvin enabled, thus trustworthy) could be opt-out for the targeted reviewer. That implied those reviewers need not necesarily be in marvin's internal maintainer list.

Ping existing reviewers before requesting new ones

Currently marvin times out awaiting_merger and awaiting_reviewer to needs_reviewer after 3 days. In most cases, the reviewer probably just forgot and could use a reminder. Maybe we can remind them after 2 days (this would likely need an extra label, like timeout_started (better name TBD)). The text might suggest different actions, such as

To the best of my knowledge, this PR has been blocked on reviewer feedback for two days now. Reviewers, please have another look.

If you have pinged someone with specific expertise and they have not responded yet, consider looking for a different person with similar expertise. If this PR is not blocked on reviewer feedback but on action from the PR author, change the status with /status awaiting_changes. If you want a new reviewer now, change the status to with /status needs_reviewer.

Lacking and update until tomorrow, I will set the status back to needs_reviewer.

Stick to awaiting_changes as long as the PR is in draft

#22 is only a one-time transition, but it should probably hold until the PR is moved out of draft. This reminder should not have happened (though it apparently wasn't unwelcome in that situation).

Cache reviewer-availability "no"s

If a reviewer is unavailable, we can also determine the earliest time to re-query (since we know how old the oldest relevant PR they are involved in is and when it will "drop out" of the search). This might be necessary, since GitHub rate limits searches relatively tightly.

Put cheat sheet with common commands on marvin website

On https://github.com/apps/marvin-mk2 it would be great if there was a short list with the typical commands

Improving on the rate-limiting algorithm

Based on the comments on #101, it sounds like currently marvin's rate-limiting algorithm is based on “involves the person and was updated in the last N days”.

This attempts to limit the amount of notification, and thus makes sense.

However, IMO it would make more sense to instead limit the amount of comments a person is expected to make: we can't count notifications from the discourse anyway, etc. So IMO instead of trying to count the notifications a person received (and necessarily failing at it because github does not say who (un)subscribed), it might make sense to instead count the messages a person sent in the last N days (plus the tags marvin explicitly sent / reviews explicitly requested by PR authors, maybe).

This way, if one is passive in nixpkgs interactions (eg. due to commenting on one or a few high-traffic nixpkgs issues/PRs), they would get new marvin requests for things that are actually actionable.

What do you think?

(Note: this is just a prospective issue, for gathering ideas. IMO the current rate-limiting algorithm is good enough to go in production, and then it'll be possible to iterate on it — assuming that the reviewers list can somehow get pushed to nixpkgs so that people could easily adjust their reviewing settings)

Tag release-blocking issues

Hey! From my experience with it I'm thinking Marvin is really great already, and could probably already be released on some percentage of all incoming github PRs (just changing the regular-poke to something that sounds more like “people needing to /status needs_reviewer when reviewers don't answer is by design and not a bug”; and maybe making the introductory message more self-explanatory for beginners — verbosity is rarely a bad thing).

So I wonder: which of the current issues, in your mind, should block releasing Marvin to eg. 5% of all incoming PRs? (well… or at least publishing an RFC to that effect?)

If I can find some spare time I'd love to try to help you in getting Marvin used in production :)