

ACL and Access: inheritance between arbitrary relayed ADOs/Nodes bearing a SBF

DiegoPino commented on July 17, 2024

ACL and Access: inheritance between arbitrary relayed ADOs/Nodes bearing a SBF

from strawberryfield.

Comments (4)

pcambra commented on July 17, 2024

A word of caution regarding node access grants: they are... node based :), so if we wanted to attach an SBF to other entity types to make ADOs out of them, we would be constrained by that.
Also, this issue moves slowly but it is there: https://www.drupal.org/project/drupal/issues/777578. People are on the fence about this whole system, and it might actually go away.

I think we should separate Drupal permissions from the ones derived from the SBF. Drupal permissions have a certain granularity, as do both the content access and entity access handlers, so I believe the two should have completely separate implementations. Drupal permissions are quite fast, cached, and "could be enough" for some use cases.

Now, the SBF based permissions could be precalculated as you mention, but we have "two issues": one is how we store/manage/invalidate these permissions, and the second is how we check and "bubble" them. I don't know how we would implement this, because we need to lay out some use cases first. Maybe something such as https://github.com/graphp/graph would help? At the end of the day, if the granularity is the entity, we need to escalate the result of the SBF ACL to the entity level permissions, and if it is the field, we need to push it up to both field and entity.

A couple of random notes:

Entity API provides a query_access handler that I think could help us with this task: https://www.drupal.org/node/3002038
We have a lot of data in Solr, and in the right hands that could technically be used to bypass the Drupal permissions.


DiegoPino commented on July 17, 2024

@pcambra @patdunlavey @alliomeria @giancarlobi updating a bit on ACL and some ideas on inheritance I have been thinking about/experimenting with while I do a large rsync between production archipelagos (so I have time). I will state the obvious first:

1. We need an ACL system that is easy to manage.
2. We need an ACL system that is fast at resolving Allow/Disallow access to resources.
3. We need an ACL system that allows inheritance based on the Parent's wishes without having to mark/annotate or even modify the data of the Child. Both Parent and Child are ADOs, of course.
4. We need an ACL system that allows inheritance but also complementing. E.g. a parent Collection holds all the data about IP address restrictions, roles that can access certain resources, etc., while a Child Object has user-level allowances that override the global ones.
5. Our ACL system needs to operate as a filter over the JSON data.
6. The ACL system needs to allow resources, wildcards, and conditions. Conditions may be matches against JSON data or plugins like IP addresses, Roles, Users, etc.

I will respond to these with what I know first/have been thinking about these days (can barely sleep because of this!).

1. ACLs have three key factors: a definition, a context, and a resolution. The definition, a JSON document, can be quite generic and will normally state the most general allowances/restrictions, probably based on generic resources. The definition needs storage (a place to live) plus methods/accessors and processors so that we can ask it, based on a context (the current ADO being evaluated, the User, its role, some IPv4/IPv6 address), which resources are allowed or not, and get back the resolution. Here is the first interesting point: a single ACL may/will be evaluated at different moments of an ADO request. Some rules may block access totally and can act before any actual render happens; others need the actual content and filter it. That means a single ACL definition can be sorted/split and dissected by the resources it affects. Being that atomic in storage (like storing each rule in a separate place) makes no sense, but we can still statically/permanently cache each group by resource/access layer to make things faster.
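To make the "split by access layer" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the JSON schema (`rules`, `layer`, `resource`, `condition`, `effect`), the layer names `request` and `content`, and the helper names are hypothetical and not part of strawberryfield. Wildcards are matched with the standard library's `fnmatch`.

```python
import json
from collections import defaultdict
from fnmatch import fnmatchcase

# Hypothetical ACL definition. The schema is an assumption, not a real format.
ACL_JSON = """
{
  "rules": [
    {"layer": "request", "resource": "*",
     "condition": {"ip_range": "10.0.0.0/8"}, "effect": "allow"},
    {"layer": "request", "resource": "*",
     "condition": {"role": "anonymous"}, "effect": "deny"},
    {"layer": "content", "resource": "files/*",
     "condition": {"role": "member"}, "effect": "allow"}
  ]
}
"""


def split_by_layer(acl: dict) -> dict:
    """Group the rules of one ACL by the layer at which they can be evaluated."""
    groups = defaultdict(list)
    for rule in acl["rules"]:
        groups[rule["layer"]].append(rule)
    return dict(groups)


def rules_for(groups: dict, layer: str, resource: str) -> list:
    """Return the rules of one layer whose resource pattern matches `resource`."""
    return [r for r in groups.get(layer, [])
            if fnmatchcase(resource, r["resource"])]


# Each group could be cached on its own, keyed by layer.
groups = split_by_layer(json.loads(ACL_JSON))
print(sorted(groups))
print(rules_for(groups, "content", "files/photo.jpg"))
```

The ACL stays a single stored document; only the derived per-layer groups are what a cache would hold.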

I have said nothing really new here. But it is clear that reusing ACLs instead of creating new ones every time is something we want. And ease of use implies that, during the UI/UX selection of a rule, the code provides a human-readable representation of how that rule will be evaluated (so the user can make a well-informed choice) and a simulation so users can see how the rule will affect access for a given context. This part may be complex but can be done if we are smart enough.

Conclusion: our ACLs go into custom entities with an SBF attached that stores the actual ACL JSON. Nodes that use one, either by direct reference or by inheritance, may have to load it at least once from source; in subsequent calls with the same context, the resolution can be returned from cache. These entities have, of course, no ACLs of their own, only the simplest permission-based control for editing/viewing, but they can be accessed programmatically by our system independently of the current user viewing them (so a programmatic access bypass to their methods).

2. Fast to resolve. That is tricky. Fast means fast retrieval first. For an ADO that has a single (always a single) ACL referenced directly, the flow is this:
2.1 Does the ADO have an ACL? Fetch the ACL.
2.2 Split the ACL into discrete definitions based on the context available at this level.
2.3 Evaluate the definitions using the context, statically cache the results, and return either deny, allow, or no opinion.
2.4 Permanently cache this resolution at the ADO level, using the context as cache tags.
2.5 Deny access, or on allow/no opinion let the next layer of resolution happen, if there is one (jump back to 2.2).

As you see, we will have permanent caches per resolution level, dependent on context-based cache tags, so subsequent calls will resolve quite fast if conditions have not changed. We can also cache unresolved definitions to make the call even faster for varying contexts. I will talk about that later on.
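The flow in 2.1 to 2.5 can be sketched roughly as below, in Python for brevity. This is a sketch under stated assumptions, not the module's implementation: the rule shape (`when`, `effect`), the "first matching rule wins" policy, and the plain dict standing in for a permanent cache backend are all hypothetical. The cache key combines the ADO, the layer, and the context, which plays the role of the context-based cache tags.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NO_OPINION = "no opinion"


_cache = {}       # stand-in for a permanent cache backend (assumption)
evaluations = 0   # counts real evaluations so cache hits become visible


def evaluate_layer(rules, context):
    """Assumed policy: the first rule whose conditions all match wins."""
    global evaluations
    evaluations += 1
    for rule in rules:
        if all(context.get(k) == v for k, v in rule["when"].items()):
            return Verdict(rule["effect"])
    return Verdict.NO_OPINION


def resolve(ado_id, layers, context):
    """Walk the layers in order; a DENY stops the chain (step 2.5)."""
    result = Verdict.NO_OPINION
    for name, rules in layers:
        key = (ado_id, name, tuple(sorted(context.items())))
        if key not in _cache:                       # steps 2.3 and 2.4
            _cache[key] = evaluate_layer(rules, context)
        if _cache[key] is Verdict.DENY:
            return Verdict.DENY
        if _cache[key] is Verdict.ALLOW:
            result = Verdict.ALLOW
    return result


layers = [
    ("request", [{"when": {"role": "anonymous"}, "effect": "deny"}]),
    ("content", [{"when": {"role": "member"}, "effect": "allow"}]),
]
first = resolve("ado:1", layers, {"role": "member"})
again = resolve("ado:1", layers, {"role": "member"})   # served from cache
blocked = resolve("ado:1", layers, {"role": "anonymous"})
print(first, again, blocked, evaluations)
```

The second call with the same context triggers no new evaluation, while a different context gets its own cache entries.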

Where this becomes trickier is inheritance, of course. How do I know which ACL the ADO is inheriting? How many steps do I have to travel, CHILD1 -> DADOFCHILD -> GRANDMOTHERWITHACL? Too many queries. Slow. What if the Object is an orphan? Did I say too many queries? Buuuuhhh. Graph path membership caching to the rescue. I will propose a few options here, and if you managed to read to this point I want your opinion.

The first Child to be called does the heavy work. Imagine that a graph path is really a state machine. I traverse the tree up and collect parents. Once I fetch the ACL from the first available parent (so 2.1), I run through 2.4 and, for each visited parent node (now in reverse), permanently cache the definition (not the resolution), because again, the resolution is context based and the context is on the Child. The next time another Child of the same chain is called (or even one from a deeper hierarchy), we check whether a cache entry for the first parent exists (I mean, we do it every time, but the first time there won't be any). The fun part is that each parent's cached ACL definition will carry cache tags for its own parents, so a single change to the ACL of any member of the path/tree will trigger a cache invalidation. There is some beauty in the fact that we are not really storing a cache entry per ADO parent but one per "being a child of" relationship, which means all level 2 collections that are children of the same level 1 parent share a single cache entry. Does this make sense? It also means that moving one collection out of that tree won't change the fact that the cache entry still exists; it just won't apply to the moved collection anymore.
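A rough Python sketch of this first option follows. It simplifies the idea: the cache here is keyed per visited node rather than per "being a child of" relationship, and invalidation is done by ACL owner instead of real cache tags. `PARENT`, `ACL_OF`, and all function names are hypothetical in-memory stand-ins for whatever queries the real system would run.

```python
# Hypothetical stand-ins: PARENT maps an ADO to its parent, ACL_OF maps an
# ADO to the ACL definition attached directly to it.
PARENT = {"child1": "dad", "child2": "dad", "dad": "grandmother"}
ACL_OF = {"grandmother": {"rules": [{"effect": "deny"}]}}

definition_cache = {}   # ADO id -> (ACL owner id, definition)
lookups = 0             # each increment simulates one query


def inherited_acl(ado_id):
    """Walk up until an ACL (or a cached ancestor) is found, then cache the
    definition for every node visited on the way."""
    global lookups
    visited, node, found = [], ado_id, None
    while node is not None:
        if node in definition_cache:      # an earlier child did the work
            found = definition_cache[node]
            break
        lookups += 1
        visited.append(node)
        if node in ACL_OF:
            found = (node, ACL_OF[node])
            break
        node = PARENT.get(node)
    if found is not None:
        for seen in visited:
            definition_cache[seen] = found
    return found


def invalidate(owner):
    """Drop every cached entry that derives from the ACL owned by `owner`."""
    stale = [k for k, (o, _) in definition_cache.items() if o == owner]
    for key in stale:
        del definition_cache[key]


owner, _ = inherited_acl("child1")   # walks child1 -> dad -> grandmother
inherited_acl("child2")              # one lookup, then hits the cache at dad
print(owner, lookups)
```

Only the definition is cached, never the resolution, since the resolution depends on the context of whichever child is being evaluated.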

Mathematically this seems to be quite performant. Normally you will have more Children than Parents, of course, and this allows us to make a quick fetch from the direct parent any time we need it. Now to the second option.

The Grandparent does all the work. On an ACL change we traverse the tree down. This maybe feels safer and simpler since it's like letting water flow. Water comes from the source (the Parent owning the ACL) and spreads to all its children and their children, except the leaves of the tree. Those are always left untouched. But it is mathematically bad, unless we only traverse shallow paths and (hear me out, this is crazy) stop when there are too many paths ahead.
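A bounded top-down push could look something like the following Python sketch. The `CHILDREN` map, the `max_depth` and `max_fanout` limits, and the returned "frontier" list are assumptions made for illustration; the comment only says that traversal should stay shallow and stop when too many paths lie ahead.

```python
from collections import deque

# Hypothetical tree: each ADO id maps to its direct children.
CHILDREN = {
    "collection": ["sub_a", "sub_b"],
    "sub_a": ["obj1", "obj2"],
    "sub_b": ["obj3"],
    "obj1": [], "obj2": [], "obj3": [],
}


def push_down(root, definition, cache, max_depth=2, max_fanout=100):
    """Breadth-first push of an ACL definition from its owner downwards.

    Leaves are never written to. Returns the nodes where the push stopped
    early, because going further would be too deep or too wide.
    """
    frontier = []
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        kids = CHILDREN.get(node, [])
        if not kids:
            continue                      # leaves stay untouched
        cache[node] = definition
        if depth + 1 > max_depth or len(kids) > max_fanout:
            frontier.append(node)         # stop here, too many paths ahead
            continue
        for kid in kids:
            queue.append((kid, depth + 1))
    return frontier


cache = {}
stopped_at = push_down("collection", {"rules": []}, cache, max_depth=1)
print(sorted(cache), stopped_at)
```

Anything below a frontier node would then have to be resolved by walking up from the child until it reaches a node that the push already filled.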

In that case we can mix both, and that is quite nice. We let both parties play a role and do some of the work, eventually meeting in the middle, which makes the final result incredibly fast. (nice! applause!)

There is a third option which I'm evaluating: a fast hash-based graph path pattern/membership access algorithm. This goes into the dark world of genomics (and music!), but there are algorithms out there that allow you to find, in a computationally performant way, whether a given partial path (let's say A->B->C) belongs to another, already cached path such as B->C->D->E or B->X->Y (which in this case may share an ACL for a little part of the road). This sounds "obvious", but having a single cached definition attached to a road, and not to each corner and each bus stop, makes things incredibly fast to fetch, resolve, and manage. Even if a path changes, or we can only discover a tiny piece of it, we can always check whether our tiny piece belongs to the larger map and reuse it almost instantly. I hope someone asks questions here about the implementation (which I will really share better in code), but I would love to hear your opinions on this.
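As a very plain illustration of path membership (not the genomics-style algorithm alluded to above, which the comment does not name), the Python sketch below indexes every contiguous segment of each cached path in a hash map and then looks for the longest segment a partial path shares with any of them. The `PathIndex` class and its methods are hypothetical, and the index grows quadratically with path length, so this only suits short hierarchies.

```python
from collections import defaultdict


class PathIndex:
    """Hash index from contiguous path segments to the cached paths holding them."""

    def __init__(self):
        self._index = defaultdict(set)   # segment tuple -> ids of cached paths

    def add(self, path_id, nodes):
        """Index every contiguous segment of at least two nodes (one edge)."""
        n = len(nodes)
        for i in range(n):
            for j in range(i + 2, n + 1):
                self._index[tuple(nodes[i:j])].add(path_id)

    def longest_shared_segment(self, nodes):
        """Return (segment, path_ids) for the longest contiguous segment of
        `nodes` present in the index, or ((), set()) when nothing is shared."""
        n = len(nodes)
        for length in range(n, 1, -1):
            for i in range(n - length + 1):
                segment = tuple(nodes[i:i + length])
                if segment in self._index:
                    return segment, set(self._index[segment])
        return (), set()


index = PathIndex()
index.add("p1", ["B", "C", "D", "E"])
index.add("p2", ["B", "X", "Y"])
segment, owners = index.longest_shared_segment(["A", "B", "C"])
print(segment, owners)
```

Here the partial path A->B->C shares the stretch B->C with the cached path `p1`, so whatever definition is attached to that road could be reused for it.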

Will continue writing tomorrow. This is really working out in my head and code-wise, and speaking about implementation, having these ACLs cached in Solr/Redis will make things so easy and fast we may get Xmas cards next year (not this one, 2021 is lost). The rsync also finished about an hour ago.

Good night!


DiegoPino commented on July 17, 2024

FYI:

There are two modules in Drupal already that handle access control in interesting ways. Neither is really performant, but we can learn from them or extend one:

https://www.drupal.org/project/content_access
and

https://www.drupal.org/project/acl

The funny thing is that the ACL project is actually a bit the opposite, more like an https://en.wikipedia.org/wiki/Access_Control_Matrix
And it's quite heavy on the DB, with no actual entities but one direct HUGE table of limited resources/users/roles.

We may do better, but probably still use access grants, extending/implementing https://git.drupalcode.org/project/node_access_grants/-/tree/8.x-3.0 (Daniel Sipo's simple but effective wrapper) for at least some of the resolution levels where they apply. (For IP ranges I will go for a middleware if the resolution is already cached.)


DiegoPino commented on July 17, 2024

@pcambra I agree here. We may want to take a separate/complementary approach to Drupal's built-in one. As you say, for the simplest, most common cases the core ones or simple extensions are good enough. My first task will be to separate access control into different "action" layers based on the resources affected and on where the rules are defined. Based on that, I also agree that Solr (and soon Redis) will play an important role in making this performant.
Since the enforcement of the rules happens in code, even if some rules are exposed a little to certain users (and we can limit that), no real security breach will happen.
