gnuxie / matrix-protection-suite


library for interacting with matrix policy lists for moderation.

JavaScript 0.38% Shell 0.02% TypeScript 99.61%

matrix-protection-suite's Introduction

matrix-protection-suite

The matrix protection suite is a library that provides interfaces and models for interacting with matrix policy lists. These models are not attached to any single client backend.

The aim of the project is to extract components from Draupnir and Mjolnir into a complete library that can be used in any application, from other bots, to widgets, and web apps.

Currently, anything involving policy lists has a well fleshed-out interface and is less likely to change than other components while this library matures.

This library was generated from a template.

matrix-protection-suite's Issues

Allow context to access enabled protections through a `Record<ProtectionName, Protection>`

This will help a lot with keeping Draupnir modular: rather than endlessly gluing things onto the Draupnir class, functionality can live in protections, and the class doesn't fill up with irrelevant glue.
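As a rough illustration of the idea, a context could expose the enabled protections as a record keyed by name. This is only a sketch: `SketchProtection`, `SketchProtectionContext`, `buildSketchContext` and `findEnabledProtection` are hypothetical names and do not reflect the library's actual types.

```typescript
// Hypothetical, simplified shapes; the real matrix-protection-suite types differ.
type SketchProtectionName = string;

interface SketchProtection {
  readonly name: SketchProtectionName;
  readonly description: string;
}

// Assumed shape of the context handed to each protection.
interface SketchProtectionContext {
  readonly enabledProtections: Readonly<
    Record<SketchProtectionName, SketchProtection>
  >;
}

function buildSketchContext(
  enabled: readonly SketchProtection[]
): SketchProtectionContext {
  // A prototype-free object avoids accidental hits on inherited keys
  // such as "toString".
  const byName: Record<SketchProtectionName, SketchProtection> =
    Object.create(null);
  for (const protection of enabled) {
    byName[protection.name] = protection;
  }
  return { enabledProtections: byName };
}

// A protection can look up a sibling by name instead of reaching through
// the Draupnir class.
function findEnabledProtection(
  context: SketchProtectionContext,
  name: SketchProtectionName
): SketchProtection | undefined {
  return context.enabledProtections[name];
}
```

With something like this, a protection that needs another protection's functionality only depends on the context it is given, not on the bot's main class.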

The `Context` object in protections only exists because different consequence providers can't be configured to be given to different protections.

If protections need more authority than the consequence provider gives them, then they should be able to get a consequence provider with a different interface and be configured to handle that. https://github.com/Gnuxie/matrix-protection-suite/blob/main/src/Protection/ProtectionsConfig/ProtectionsConfig.ts#L51

It's important that we remove the context object as soon as we have the config support for matching up different providers to different protections. https://github.com/Gnuxie/matrix-protection-suite/blob/main/src/Protection/ProtectionsConfig/MjolnirProtectionsConfig.ts#L45-L46. This is the only way to make a safe "dry run" mode of Draupnir.

make sure the goddamned `MemberBanSynchronisationProtection` will not reban members that are already banned.

I've got other stuff to do and just noticed this, but it obviously shouldn't be forgotten about.

Room state concepts need more consideration.

Something that we've missed for years is that if you manage to find a state event in the timeline, and it's for a brand new type+key combination, then you can always treat it like a state delta. It's not possible for the event to be stale state from your and your server's perspective.

To represent this, we need new state change concepts: Added should be split into Introduced and Reintroduced, where Introduced means that we haven't seen an event of that type+key combination before.

Events that have been Introduced can then be directly inserted by revision issuers into the next revision.
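A minimal sketch of how the distinction could be tracked is below. All names are hypothetical. The issue only defines Introduced, so the reading of Reintroduced used here (content returning to a type+key whose last known event was blank) is an assumption.

```typescript
// Hypothetical names; "Reintroduced" is interpreted here as content returning
// to a type+key whose last known event was blank. That reading is an assumption.
type SketchStateChange = "Introduced" | "Reintroduced" | "Other";

interface SketchStateEvent {
  readonly type: string;
  readonly state_key: string;
  readonly content: Record<string, unknown>;
}

function isBlankSketchEvent(event: SketchStateEvent): boolean {
  return Object.keys(event.content).length === 0;
}

class SketchStateTracker {
  // Last known event for each type+key combination.
  private readonly lastSeen = new Map<string, SketchStateEvent>();

  classify(event: SketchStateEvent): SketchStateChange {
    const key = JSON.stringify([event.type, event.state_key]);
    const previous = this.lastSeen.get(key);
    this.lastSeen.set(key, event);
    if (previous === undefined) {
      // Never seen this type+key before, so it cannot be stale state.
      return "Introduced";
    }
    if (isBlankSketchEvent(previous) && !isBlankSketchEvent(event)) {
      return "Reintroduced";
    }
    return "Other";
  }
}
```

Only the `"Introduced"` branch is safe to insert directly into the next revision in the way described above; the other branches would still need the usual state fetch.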

Batching of revisions is something that is inherited from Mjolnir to make sure that server ACL application wouldn't flood a room with m.room.server_acl events if using a script to bulk add policies to a policy room. I'm now of the opinion that batching should be controlled protection side. However, revision issuers should at least perform some batching so that introduced events received within the same sync response end up within the same revision.

Revision issuers should take a ULID before beginning the process of fetching state and revision issuers should then give this ULID to the next revision. This ensures that consumers of revisions that make assumptions and pre-empt the state of the room can invalidate their predictions by comparing the ULID of the revision to the ULID associated with their predictions and whether they came true or not.

(For onlookers, we only use ULIDs within the same process)

Preemption can only be done in cases where the state is a certainty, e.g. introduced state. We can't use ULIDs like this because introduced state can revise a revision while a request is being sent across the network.

Add a handle for protected room permission changes to protections, so they can reapply their consequences etc.

For example, the member ban synchronization protection could be informed that it now has permissions in a room, and then it can reapply settings to it. I'm not sure if the handle should be something simple that tells the protection that all the permissions it requires have been gained or lost in the room.

Why `TrackedStateEvent` might not be worth it.

Currently the RoomStateRevision can be configured to store a minimal representation of events, provided they fit the TrackedStateEvent interface. This is done by marking events as InformOnly within a StateTrackingMeta object and then giving that StateTrackingMeta object to the RoomStateRevision, either at initialization or later. In theory this means that the RoomStateRevision will use less memory, because events will be stored with most keys redacted. However, it also means that PolicyListRevisions and RoomMembershipRevisions cannot be instantiated from a RoomStateRevision, only from a blank revision that is incrementally built by listening to a RoomStateRevisionIssuer. It's also unclear whether there is going to be much benefit from the reduced memory footprint of some events, given that for Draupnir, all protected rooms will need their membership tracked, which preserves the event content and most top-level fields.

It's going to be expensive to run Draupnir to protect a room with 100k members, and this sort of saving is going to be linear. If we needed to think seriously about this for whatever reason, homeservers will probably reach the limit first. Telegram for instance only supports 200k members per group.

Let's pretend the average membership event takes 2000 bytes for Draupnir to hold in memory (my estimates are more in the region of 800 bytes, but let's raise it to be conservative; the real serialized size is probably less still, in the region of 500-600 bytes per event). Now let's say there are a million of them. That would mean only 2 GB is needed to represent them, which is quite reasonable, even if we double or quadruple that.
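The arithmetic can be checked with a few lines. This is only a back-of-the-envelope sketch using the figures quoted above; the function and constant names are made up for illustration, and decimal gigabytes are assumed.

```typescript
// Back-of-the-envelope estimate using the per-event figures from the text.
function estimateMembershipBytes(
  memberCount: number,
  bytesPerEvent: number
): number {
  return memberCount * bytesPerEvent;
}

const BYTES_PER_GB = 1_000_000_000; // decimal gigabytes (assumption)

// One million membership events at the conservative 2000 bytes each.
const conservativeGB =
  estimateMembershipBytes(1_000_000, 2000) / BYTES_PER_GB; // 2

// The same count at the 800 byte estimate.
const estimatedGB = estimateMembershipBytes(1_000_000, 800) / BYTES_PER_GB; // 0.8

// Quadrupling the conservative figure still stays in single-digit gigabytes.
const quadrupledGB = conservativeGB * 4; // 8
```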

So I think it's fair to say for simplicity, we're not going to have to worry about this quite yet and if we did, we would need to change the architecture of MPS regardless of the linear savings for TrackedStateEvent. We'd reach the point where we have to be able to change the cached membership to be a partial understanding of room state based on who is active in the room at around the same time in either case.

`MembershipChangeType` `Rejoined` is probably error prone

A user who rejects an invite, or is kicked after knocking for the first time, and who later joins probably shouldn't be classified as rejoined, as they never really joined the room in the first place. Of course, this relies on transitions being skipped from our perspective due to downtime and not chasing up the chain of events, because normally the user would be classified as reknocked or reinvited.

PersistentData, AccountData, StateData don't seem to be aware of `404` and can't be; implementations need to manage it.

Currently they don't, and we can't get Draupnir MPS to start up because of it.
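One way an implementation could manage this is to treat a not-found response as "nothing stored yet" and fall back to a default, while still surfacing every other failure. The sketch below assumes the underlying client throws an object with a numeric `statusCode` property; that error shape, `SketchResult`, and the function names are all hypothetical.

```typescript
// Hypothetical error and result shapes; not the library's actual ActionResult.
type SketchResult<T> = { ok: true; value: T } | { ok: false; error: unknown };

function isNotFoundError(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    "statusCode" in error &&
    (error as { statusCode: unknown }).statusCode === 404
  );
}

// Treat "nothing stored yet" as a default value, but keep every other
// failure as an error.
function recoverFromNotFound<T>(
  error: unknown,
  defaultValue: T
): SketchResult<T> {
  return isNotFoundError(error)
    ? { ok: true, value: defaultValue }
    : { ok: false, error };
}

// Wraps any request (e.g. an account data fetch) with the recovery above.
async function loadWithDefault<T>(
  request: () => Promise<T>,
  defaultValue: T
): Promise<SketchResult<T>> {
  try {
    return { ok: true, value: await request() };
  } catch (error) {
    return recoverFromNotFound(error, defaultValue);
  }
}
```

On first startup, when no account data has been written, `loadWithDefault` would then hand back the default rather than failing.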

Would it be inconsistent to provide access to invitations outside of the protected rooms set?

Probably not, since you might want this to be able to detect scrapers

The standard `ProtectedRoomsConfig['addRoom']` needs to use a `RoomJoiner` and join the room with it.

Currently this causes `rooms add` and the test for the `BanPropagationProtection` to fail.

Consider embracing the suite as a monorepo and renaming to reflect a mega SDK: `neightrix`

Do we need to add attribution text to our files that reference the repository?

e.g. we have attribution text for mjolnir

// SPDX-FileAttributionText: <text>
// This modified file incorporates work from mjolnir
// https://github.com/matrix-org/mjolnir
// </text>

We aren't legally obligated to do this, but we do it because it's the right thing to do.
We want to try to enforce that other projects do the same for us when they copy from us, and it seems like the only way to do that is by adding self-referential attribution text in the same way. Which is interesting, since that text can't say modified but has to say modified when it has been copied? Maybe incorporates is enough.

There should be a way to go from changes and a revision to the revision issuer

async function groupRulesByIssuer(policyRoomManager: PolicyRoomManager, changesByList: ChangesByRoomID): Promise<ActionResult<GroupedChanges[]>> {
    for (const [roomID, changes] of changesByList) {
        const issuer = await policyRoomManager.getPolicyRoomRevisionIssuer(MatrixRoomReference.fromRoomID(roomID));
    }
}

This kind of code shouldn't be happening... We can't avoid aggregating changes into other revisions, so maybe this needs to happen at the rule level?

Draupnir Account data & state migration is verified.

org.matrix.mjolnir.enabled_protections https://github.com/the-draupnir-project/Draupnir/blob/v1.86.0/src/protections/ProtectionManager.ts#L102-L108
- Has a Schema in Draupnir to enable the BanPropagationProtection on schema upgrade, with the intention to enable it by default on first use, and existing instances when the protection was released.
- Account Data
org.matrix.mjolnir.protected_rooms https://github.com/the-draupnir-project/Draupnir/blob/v1.86.0/src/ProtectedRoomsConfig.ts#L33
- No Schema
- Account Data
org.matrix.mjolnir.watched_lists https://github.com/the-draupnir-project/Draupnir/blob/v1.86.0/src/models/PolicyList.ts#L45 & https://github.com/the-draupnir-project/Draupnir/blob/v1.86.0/src/models/PolicyListManager.ts#L121
- No Schema
- Account Data

It doesn't seem like any "migration" actually happens.

Should there be a handle specifically for policy matching a user in a protected room?

For example member ban synchronisation and redaction synchronisation might share this code?

`MembershipChangeType` doesn't account for skipped transitions

For example, leave->leave implies a user temporarily rejoined the room, so it's wrong to classify it as NoChange, since we might want to warn protections about the rejoin even if we don't have a record of it (because of downtime). Of course, each of these special cases has to be its own enum variant.
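A sketch of what a dedicated variant for this case might look like is below. `SkippedRejoin` is a hypothetical variant name, the types are simplified stand-ins rather than the library's `MembershipChangeType`, and treating other same-membership transitions (such as join->join) as `NoChange` is an assumption.

```typescript
type SketchMembership = "join" | "leave" | "invite" | "knock" | "ban";

interface SketchMembershipEvent {
  readonly eventID: string;
  readonly membership: SketchMembership;
}

// "SkippedRejoin" is a hypothetical variant name for the leave->leave case.
type SketchMembershipChange = "NoChange" | "SkippedRejoin" | "Changed";

function classifyMembershipChange(
  previous: SketchMembershipEvent | undefined,
  next: SketchMembershipEvent
): SketchMembershipChange {
  if (previous === undefined) {
    return "Changed";
  }
  if (previous.eventID === next.eventID) {
    return "NoChange"; // literally the same event
  }
  if (previous.membership === "leave" && next.membership === "leave") {
    // Two distinct leave events: something happened in between that we
    // never saw, so report it rather than hiding it as NoChange.
    return "SkippedRejoin";
  }
  return previous.membership === next.membership ? "NoChange" : "Changed";
}
```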

Add a `MatrixRoomReference['fromString']` method

https://github.com/the-draupnir-project/Draupnir/blob/main/src/appservice/AppService.ts#L107-L119

Then look for call sites that use `fromPermalink` and replace the ones that also try to check for room aliases and IDs.
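A rough sketch of what such a method might do is below, written as a free function over a hypothetical `SketchRoomReference` type rather than the real `MatrixRoomReference` class. It assumes only three input forms need handling: raw room IDs, raw room aliases, and `https://matrix.to/#/` permalinks.

```typescript
// Hypothetical result type; the real MatrixRoomReference class differs.
type SketchRoomReference =
  | { kind: "room_id"; value: string; via: string[] }
  | { kind: "room_alias"; value: string; via: string[] };

const MATRIX_TO_PREFIX = "https://matrix.to/#/";

function referenceFromEntity(
  entity: string,
  via: string[]
): SketchRoomReference | undefined {
  if (entity.startsWith("!")) return { kind: "room_id", value: entity, via };
  if (entity.startsWith("#")) return { kind: "room_alias", value: entity, via };
  return undefined;
}

function roomReferenceFromString(
  input: string
): SketchRoomReference | undefined {
  const trimmed = input.trim();
  if (!trimmed.startsWith(MATRIX_TO_PREFIX)) {
    // Not a permalink: accept a bare room ID or alias.
    return referenceFromEntity(trimmed, []);
  }
  const [path = "", query = ""] = trimmed
    .slice(MATRIX_TO_PREFIX.length)
    .split("?");
  try {
    // The first path segment is the room; a second segment would be an event.
    const entity = decodeURIComponent(path.split("/")[0] ?? "");
    const via = query
      .split("&")
      .filter((pair) => pair.startsWith("via="))
      .map((pair) => decodeURIComponent(pair.slice("via=".length)));
    return referenceFromEntity(entity, via);
  } catch {
    return undefined; // malformed percent-encoding
  }
}
```

Call sites that currently try `fromPermalink` and then separately test for aliases and IDs could collapse into a single call of this kind.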

It's really important that the API for accessing membership infers and creates a bs join event if it can't find one

This needs to be typed in as a distinct thing, and the consumer needs to know how to handle it. It'll be a special MembershipChange returned by the SetMembership

Capability methods that accept a policy list revision with user policies should be called something else

We need to change the user consequences capability to take a revision and apply it to a single room, for when power levels have changed within one room and the protection is prompted to retry taking action.

Ensure `PolicyListRevisionIssuer`s and their dependents are initialized in a consistent way

This can be done by enforcing that the initial data used to build a revision (e.g. when fetching state for the first time) is applied by calling reviseFromChanges/reviseFromState on an empty revision (blankRevision), and then adding a flag to the revision event (or an event that can be subscribed to with once). I don't think the latter is necessary, given that everything will need to subscribe to these events and only client code will need to distinguish between the two.

Clients shouldn't be forced to chase state changes

There should be visibility in the spec about why the transitions being "skipped" from our perspective is a thing unless we want to do a tonne of work to chase the chain of events.

#21

`SetRoomState`/`SetRoomMembership` 'revision' event consistency

So I was wondering if I needed to create a new listener specifically for when the ProtectedRoomsManager gets a call to addRoom, instead of just abusing the existing 'revision' event in the setRoomState and membership objects.

When a new policy list is added to the PolicyListsConfig, the issuer manager creates a revision that incorporates all these new changes. But that's because the revision represents the final result of all config and fancy filtering.

So the thing is the protection handles for handleStateChange and handleMembershipChange are called when the setRoomState emits the revision event.

`ActionError['addContext']` probably needs changing and renaming.

This should be used as a way to set the most relevant message to be shown first; the other messages are just a paper trail that doesn't need showing to the user.
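One possible shape for this is sketched below: an error that keeps a single headline message and pushes older messages into a trail. `SketchActionError`, `elaborate`, and the field names are hypothetical and are not the library's actual `ActionError` API.

```typescript
// Hypothetical shape; not the library's actual ActionError.
class SketchActionError {
  readonly mostRelevantMessage: string;
  readonly paperTrail: readonly string[];

  constructor(mostRelevantMessage: string, paperTrail: readonly string[] = []) {
    this.mostRelevantMessage = mostRelevantMessage;
    this.paperTrail = paperTrail;
  }

  // Returns a new error whose headline is `message`; the previous headline
  // moves to the front of the paper trail. The original error is unchanged.
  elaborate(message: string): SketchActionError {
    return new SketchActionError(message, [
      this.mostRelevantMessage,
      ...this.paperTrail,
    ]);
  }
}
```

A caller higher up the stack can then call `elaborate` with a user-facing summary, and only `mostRelevantMessage` needs to be rendered to the user.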

Need a hook to run the power level check when rooms are added/removed from a set

Startup time is much worse in 0.19.0

By about a minute. It's something to do with the way the set membership and set room state are initialised in the ProtectedRoomsManager.

We need a derivative of Draupnir's `RoomUpdateException` that has a reference to the room for additional context

`ProtectedRoomsConfig['add/removeRoom']` doesn't ensure consistency with `SetMembership` and `SetRoomState`.

Before the room is added and an event is emitted, we need to atomically modify set membership and set room state to include the new room.
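The ordering could look roughly like the sketch below, which reduces the problem to in-memory sets. It assumes any network fetching for the new room has already completed, so the commit step can run synchronously with no `await` between the updates and the emit. All names are hypothetical.

```typescript
type SketchRoomID = string;
type SketchRoomListener = (roomID: SketchRoomID) => void;

// Hypothetical stand-in for the config plus the SetMembership and
// SetRoomState objects, reduced to plain sets.
class SketchProtectedRoomsSet {
  private readonly rooms = new Set<SketchRoomID>();
  private readonly membershipRooms = new Set<SketchRoomID>();
  private readonly stateRooms = new Set<SketchRoomID>();
  private readonly listeners: SketchRoomListener[] = [];

  onRoomAdded(listener: SketchRoomListener): void {
    this.listeners.push(listener);
  }

  hasMembershipFor(roomID: SketchRoomID): boolean {
    return this.membershipRooms.has(roomID);
  }

  hasStateFor(roomID: SketchRoomID): boolean {
    return this.stateRooms.has(roomID);
  }

  // Everything here is synchronous, so listeners can never observe a room
  // that is in the config but missing from set membership or set room state.
  addRoom(roomID: SketchRoomID): void {
    if (this.rooms.has(roomID)) {
      return;
    }
    this.membershipRooms.add(roomID);
    this.stateRooms.add(roomID);
    this.rooms.add(roomID);
    for (const listener of this.listeners) {
      listener(roomID);
    }
  }
}
```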

Why we are not using a bundler (rollup to be removed shortly)

I haven't seen a compelling argument to use one. People have suggested that bundling libraries makes packages smaller and improves compatibility across CommonJS, ESM, and even UMD. But it's not actually clear to me whether this is a real problem for more recent npm packages anyway.
If you are going to write a web app where the size of the dependencies matters, you will have a bundler anyway. If your dependencies are themselves bundled, their dependencies probably will not be de-duplicated by your own bundler. So I don't really buy this argument. It seems like there is just an obsession with saving bytes for no purpose and with creating tooling that doesn't actually match up with npm. https://cmdcolin.github.io/posts/2022-05-27-youmaynotneedabundler.

For our project, we tried to use rollup and add typebox as an external peer dependency using @rollup/plugin-node-resolve. However, this will not work when you try to use the bundled library in another project that has a subdependency that is installed first which doesn't use the same version of typebox as you. And there doesn't seem to be an obvious way to fix that at all. The tooling is inadequate and a waste of time. And is probably quite damaging in other ways (dependants not being able to override dependencies and all).