
Comments (22)

eskimor commented on July 21, 2024

use 4 bytes out of this reclaimed space for a new core_index: u32 field.

2 bytes should suffice.

from rfcs.

sandreim commented on July 21, 2024

No. I'm speaking about creating a new version of the CandidateCommitments struct. A clean cut and not some hacking.

I agree that a clean cut is generally better than hacking, but we have to take other concerns into consideration before we decide yay/nay on this topic.

Development cost

The clean way involves the following big changes on top of what is needed for my proposal:

  • PVF versioning paritytech/polkadot-sdk#645, which needs some big changes in Cumulus as well as in the node PVF and candidate-validation modules at least
  • New network protocols, which are currently a PITA to do in Polkadot: one small change in one protocol requires changes in all networking protocols just to account for the new version
  • Parachains runtime changes so we can juggle current and new receipts at the same time.

Set of changes required for my proposal:

  • Removing usage of the collator signature (which is a good thing to do even outside the scope of this issue) and converting the now-unused fields into some reserved bytes and a core_index in the candidate descriptor
  • A few isolated changes in the relay chain runtime to allow using UMP for both messages and commitments
  • A Cumulus parachain API to allow sending the commitments via the UMP transport
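A rough sketch of the space accounting behind the first bullet. The field names and the layout are illustrative, not the actual polkadot-primitives definitions; only the sizes (a 32-byte CollatorId and a 64-byte CollatorSignature) come from the thread:

```rust
/// Old layout: 32 bytes of collator id + 64 bytes of collator signature.
const COLLATOR_ID_LEN: usize = 32;
const SIGNATURE_LEN: usize = 64;

/// Hypothetical new layout carved out of the same 96 bytes: a 2-byte
/// core index (2 bytes suffice, as noted above) plus reserved padding.
#[repr(C)]
struct ReclaimedFields {
    core_index: u16,
    reserved: [u8; 94],
}

fn main() {
    // The reclaimed region is exactly as large as the two old fields,
    // so the overall descriptor size (and thus its encoding) is unchanged.
    assert_eq!(
        std::mem::size_of::<ReclaimedFields>(),
        COLLATOR_ID_LEN + SIGNATURE_LEN
    );
}
```

Keeping the total size identical is what lets old nodes keep decoding the descriptor blindly, treating the reclaimed bytes as opaque.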

Timeline

The PVF changes and networking-protocol changes take a lot of time to test and deploy in production, and they will surface additional bugs that need fixing, further delaying the launch. Also, audits will take much longer, since there would be more code to cover.

On top of that, parachain teams (Moonbeam's roadmap, for example) already expect to use Elastic Scaling. The feature as it stands now is unlikely to be usable by most teams, since their collator sets are open.

Why we should postpone the clean cut

IMO the cost of not delivering Elastic Scaling without limitations this year outweighs the concerns about doing my isolated hack. Time to market matters. Launching Elastic Scaling but not lifting the constraints on the MVP as soon as possible because we want to make things perfect doesn't have any value for parachain teams.

Introduce some new networking version that will only support this new commitment. After it is enabled on the relay chain, collators can start using it, but they can also continue using the old version. The validators will accept both versions.

The reclaimed space in the descriptor and the possibility to commit to additional things via the UMP transport layer provide enough headroom that we won't need to change the CandidateDescriptor/CandidateCommitments in the near-to-medium term. IMO the best time to do it would be as part of the migration to JAM.


bkchr commented on July 21, 2024
  • reclaim 32 bytes from collator: CollatorId and 64 bytes from signature: CollatorSignature fields as reserved fields

Why? Don't we need this for on demand to verify that the claimant is the one that signed the collation?

Older, non-upgraded validator nodes, for example, would get a different commitment hash if they don't decode and hash the whole thing, and this would break consensus.

You want to hack this into the protocol? I don't get why you don't introduce a new version of the networking protocol to be able to append the core_index.


sandreim commented on July 21, 2024
  • reclaim 32 bytes from collator: CollatorId and 64 bytes from signature: CollatorSignature fields as reserved fields

Why? Don't we need this for on demand to verify that the claimant is the one that signed the collation?

No, that was the initial idea AFAIK, but we now want to rely only on the parachain runtime to decide who builds the parachain block. Which collator provides the collation is irrelevant; it just has to be valid.

There are plans to have some countermeasures against collators spamming with invalid collations, the Collator Protocol Revamp, which should hopefully be implemented this year.

Older, non-upgraded validator nodes, for example, would get a different commitment hash if they don't decode and hash the whole thing, and this would break consensus.

You want to hack this into the protocol? I don't get why you don't introduce a new version of the networking protocol to be able to append the core_index.

A new version doesn't solve the problem; the problem is that we always need to check all of the commitments when we validate a candidate. We also rely on candidate hashes to identify unique candidates. Having two distinct candidate commitments for the same parachain block breaks consensus.

Simply put, we need all nodes to encode/decode/reason about the same thing, regardless of whether there is a minority that doesn't understand the CoreIndex commitment.
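A toy illustration of why an appended field breaks hash agreement. The thread states the candidate hash is computed as hash(CandidateDescriptor, hash(CandidateCommitments)); here std's DefaultHasher stands in for the real Blake2-based hashing, and raw byte strings stand in for the SCALE-encoded structs:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for the real hashing: any change in the encoded commitment
// bytes yields a different candidate hash.
fn hash_bytes(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

// candidate_hash = hash(descriptor ++ hash(commitments)), mirroring
// hash(CandidateDescriptor, hash(CandidateCommitments)) from the thread.
fn candidate_hash(descriptor: &[u8], commitments: &[u8]) -> u64 {
    let mut buf = descriptor.to_vec();
    buf.extend_from_slice(&hash_bytes(commitments).to_le_bytes());
    hash_bytes(&buf)
}

fn main() {
    let descriptor = b"descriptor-bytes";
    let old = b"old-commitments".to_vec();

    // Appending a core_index: u32 field changes the encoded commitments...
    let mut new = old.clone();
    new.extend_from_slice(&5u32.to_le_bytes());

    // ...so a node hashing the extended struct and a node hashing only the
    // old fields arrive at different candidate hashes: consensus splits.
    assert_ne!(
        candidate_hash(descriptor, &old),
        candidate_hash(descriptor, &new)
    );
}
```

This is why the proposal commits the core index inside an existing hashed field (the UMP messages) rather than appending a new one.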


bkchr commented on July 21, 2024

A new version doesn't solve the problem; the problem is that we always need to check all of the commitments when we validate a candidate. We also rely on candidate hashes to identify unique candidates. Having two distinct candidate commitments for the same parachain block breaks consensus.

You already answered yourself that this will not work, because the old nodes will try to validate the signature. Thus, you cannot simply "reclaim" space in the candidate. Why not use the node features to make a clean cut? To switch from the old to the new format?


acatangiu commented on July 21, 2024

CandidateCommitments doesn't really need to be changed, but one idea is to define a special XCM instruction like CommitCoreIndex(u32) that is appended as the last message in the upward_messages vec. The message wouldn't ever be executed, but would just serve as a commitment to a specific CoreIndex.

I have also thought about appending an additional core_index: u32 field to the end of the structure, but that doesn't really seem to work, because we compute the CandidateHash as hash(CandidateDescriptor, hash(CandidateCommitments)). Older, non-upgraded validator nodes, for example, would get a different commitment hash if they don't decode and hash the whole thing, and this would break consensus.

Any better and less hacky ideas, especially regarding the XCM stuff, would help me a lot.

This CommitCoreIndex(u32) isn't a good fit for XCM because it would never be executed in the XCVM; it is also not a cross-chain messaging primitive. We'd really be abusing the language.

But IIUC we need a way for the parachain runtime to specify to the relay chain validators/backing group which core index the current commitment is for. So, following up on your idea: we don't really need to add it to XCM; what you really want is the XCM transport protocol, not the language and executor. You could still use UMP/MQ to transport this extra data, but do it outside the XCM transport flow.

Disclaimer: I'm not familiar with the details of parachain consensus and elastic scaling, so there might be better solutions than "abusing" the UMP queue (implementation abuse), but I wanted to offer a slight alteration to your idea so we don't abuse the XCM language (spec abuse).


sandreim commented on July 21, 2024

A new version doesn't solve the problem; the problem is that we always need to check all of the commitments when we validate a candidate. We also rely on candidate hashes to identify unique candidates. Having two distinct candidate commitments for the same parachain block breaks consensus.

You already answered yourself that this will not work, because the old nodes will try to validate the signature. Thus, you cannot simply "reclaim" space in the candidate. Why not use the node features to make a clean cut? To switch from the old to the new format?

That is a minor setback; forcing validators to upgrade is tractable. Forcing all collators to switch to a new format is a hard breaking change that would put pressure on parachain teams and delay the enablement of open collator sets for elastic scaling. My plan is to only require an upgrade for collators if they want to use elastic scaling; otherwise they can keep signing their collations and still not commit to a core index, just like they do now, at least for a good period of time. Possibly until Omni-node becomes ubiquitous.


bkchr commented on July 21, 2024

That is a minor setback; forcing validators to upgrade is tractable. Forcing all collators to switch to a new format is a hard breaking change that would put pressure on parachain teams and delay the enablement of open collator sets for elastic scaling.

I don't see how this forces collators to upgrade. Basically you just enable support for the new format on the validators. The collators can send whatever they support, but the validators will reject the new format until it is enabled.
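The gating bkchr describes could look roughly like the sketch below. All names are illustrative (the real node-features bitfield and descriptor versions live in the parachains runtime and primitives): validators always accept the old format and accept the new one only once the corresponding feature bit is enabled on the relay chain.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum DescriptorVersion {
    V1, // collator id + signature present
    V2, // reclaimed space carrying a core_index
}

/// Hypothetical bit in the node-features bitfield gating the new format.
const FEATURE_V2_DESCRIPTOR: usize = 3;

/// A validator accepts V1 unconditionally and V2 only when enabled.
fn accepts(version: DescriptorVersion, node_features: &[bool]) -> bool {
    match version {
        DescriptorVersion::V1 => true,
        DescriptorVersion::V2 => node_features
            .get(FEATURE_V2_DESCRIPTOR)
            .copied()
            .unwrap_or(false),
    }
}

fn main() {
    let before = [false; 8];
    let mut after = [false; 8];
    after[FEATURE_V2_DESCRIPTOR] = true;

    // Collators can send whatever they support; validators reject the
    // new format until the feature is enabled on the relay chain.
    assert!(accepts(DescriptorVersion::V1, &before));
    assert!(!accepts(DescriptorVersion::V2, &before));
    assert!(accepts(DescriptorVersion::V2, &after));
}
```

The point is that no collator is forced to move: V1 keeps working on both sides of the switch.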


sandreim commented on July 21, 2024

This CommitCoreIndex(u32) isn't a good fit for XCM because it would never be executed in the XCVM; it is also not a cross-chain messaging primitive. We'd really be abusing the language.

But IIUC we need a way for the parachain runtime to specify to the relay chain validators/backing group which core index the current commitment is for. So, following up on your idea: we don't really need to add it to XCM; what you really want is the XCM transport protocol, not the language and executor. You could still use UMP/MQ to transport this extra data, but do it outside the XCM transport flow.

Thanks @acatangiu for the comment and the discussion last week; indeed, it makes no sense to add a new XCM instruction.

Disclaimer: I'm not familiar with the details of parachain consensus and elastic scaling, so there might be better solutions than "abusing" the UMP queue (implementation abuse), but I wanted to offer a slight alteration to your idea so we don't abuse the XCM language (spec abuse).

How I would see this working is altering the UMP transport-layer definition so it can carry both XCM messages and commitments, which seems to be a small surface area of change in the parachains runtime. One way I am thinking this can be implemented is to have a terminator for XCM messages (something like an empty Vec) followed by the core index commitment: in receive_upward_messages we would stop sending messages to the MessageQueue pallet once we hit the terminator.
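A minimal sketch of that terminator idea (illustrative, not the actual runtime code): upward messages are opaque byte vectors, an empty one acts as the separator, everything before it is XCM destined for the MessageQueue pallet, and everything after it is commitment data for candidate validation.

```rust
type UpwardMessage = Vec<u8>;

/// Split `messages` at the first empty message (the terminator).
/// Returns the XCM messages to forward and any trailing commitments.
fn split_ump(messages: &[UpwardMessage]) -> (&[UpwardMessage], &[UpwardMessage]) {
    match messages.iter().position(|m| m.is_empty()) {
        Some(sep) => (&messages[..sep], &messages[sep + 1..]),
        // No terminator: the whole batch is plain XCM, as today.
        None => (messages, &[]),
    }
}

fn main() {
    let core_index_commitment = 1u32.to_le_bytes().to_vec();
    let msgs = vec![
        b"xcm-1".to_vec(),
        b"xcm-2".to_vec(),
        vec![], // terminator
        core_index_commitment.clone(),
    ];

    let (xcm, commitments) = split_ump(&msgs);
    assert_eq!(xcm.len(), 2); // forwarded to the MessageQueue pallet
    assert_eq!(commitments, &[core_index_commitment]); // read by candidate validation
}
```

Because the split is defined purely over the existing upward_messages field, the CandidateCommitments struct and its hash stay untouched.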


sandreim commented on July 21, 2024

That is a minor setback; forcing validators to upgrade is tractable. Forcing all collators to switch to a new format is a hard breaking change that would put pressure on parachain teams and delay the enablement of open collator sets for elastic scaling.

I don't see how this forces collators to upgrade. Basically you just enable support for the new format on the validators. The collators can send whatever they support, but the validators will reject the new format until it is enabled.

Yeah, sure, we can also do that; this sounds like an optimisation of what I just said, assuming we are talking about no longer checking collator signatures and using the bytes for more useful things.


bkchr commented on July 21, 2024

No. I'm speaking about creating a new version of the CandidateCommitments struct. A clean cut and not some hacking. Introduce some new networking version that will only support this new commitment. After it is enabled on the relay chain, collators can start using it, but they can also continue using the old version. The validators will accept both versions.


eskimor commented on July 21, 2024

On top of that, while yes, this is not the cleanest solution, it is also not that hacky*), and:

  1. If we ever need to do a new version, that hack will go away with it, so we can think of this as an intermediate solution until we really cannot help it and actually need a new version.
  2. This "hack" allows us to get rid of the much uglier hack we did in the paras inherent to account for the missing core index.

So while not the most pretty solution:

  • it should work
  • I don't see this backfiring
  • it is quite isolated
  • it allows us to get rid of an even uglier hack, one I am personally uneasy about and only accepted under the premise that it would go away soon.

*) In the end, why not think of this as a message being sent to the relay chain. The only really hacky part is that we did not properly account for messages that are not XCM, but the delimiter solution sounds good enough. The only code that needs to look at this is candidate validation. Code impacted:

  1. The relay chain needs to stop processing messages at the delimiter.
  2. Candidate validation needs to verify the core index in the message after the delimiter.
  3. The parachain needs to send the core index message. (This message should obviously be an enum to allow for extensibility.)

(3) is the most annoying, as it is user facing, but the amount of breakage should stay the same:

If we ever end up doing the properly versioned thing (before JAM), then the reason will be that we introduced something else on top. Hence it would be another breaking change regardless.

I agree that something we want to be mandatory eventually does not feel right as an extra message after a delimiter after the XCM messages. 😬 But OK, if that contract is well documented, it is not that bad.
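The extensible message from point (3) might look like the sketch below. The name UmpSignal and the variant are hypothetical, and a hand-rolled one-byte tag plus little-endian payload stands in for the real SCALE encoding; the point is only that an enum leaves room for future signals without another breaking change:

```rust
#[derive(Debug, PartialEq)]
enum UmpSignal {
    /// Commit the candidate to a specific core.
    SelectCore(u32),
    // Future signals can be added as new variants with new tags.
}

fn encode(signal: &UmpSignal) -> Vec<u8> {
    match signal {
        UmpSignal::SelectCore(core) => {
            let mut out = vec![0u8]; // variant tag
            out.extend_from_slice(&core.to_le_bytes());
            out
        }
    }
}

fn decode(bytes: &[u8]) -> Option<UmpSignal> {
    match bytes.split_first()? {
        (&0, rest) if rest.len() == 4 => {
            Some(UmpSignal::SelectCore(u32::from_le_bytes(rest.try_into().ok()?)))
        }
        // Unknown tag or malformed payload: reject the candidate.
        _ => None,
    }
}

fn main() {
    let signal = UmpSignal::SelectCore(2);
    assert_eq!(decode(&encode(&signal)), Some(signal));
    assert_eq!(decode(b"junk"), None);
}
```

Candidate validation would decode the message after the delimiter with something like `decode` and reject candidates whose committed core doesn't match the occupied core.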


bkchr commented on July 21, 2024

PVF versioning paritytech/polkadot-sdk#645, which needs some big changes in Cumulus as well as in the node PVF and candidate-validation modules at least

This is at most a one-day change in Cumulus and the validation modules. We are just exposing a new function and, on validation, checking which version exists.
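The version check described here could be as simple as probing which entry point the PVF exports. The export names below are illustrative, not the actual Cumulus exports, and the real discovery would inspect the wasm blob rather than a string list:

```rust
/// Pick the PVF version from the set of exported entry points:
/// a blob exporting the (hypothetical) `validate_block_v2` is v2,
/// anything else falls back to the legacy `validate_block` (v1).
fn pvf_version(exports: &[&str]) -> u32 {
    if exports.contains(&"validate_block_v2") {
        2
    } else {
        1
    }
}

fn main() {
    assert_eq!(pvf_version(&["validate_block"]), 1);
    assert_eq!(pvf_version(&["validate_block", "validate_block_v2"]), 2);
}
```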

  • New network protocols, which are currently a PITA to do in Polkadot: one small change in one protocol requires changes in all networking protocols just to account for the new version

Why do all protocols change when one of them changes?

IMO the cost of not delivering Elastic Scaling without limitations this year outweighs the concerns about doing my isolated hack. Time to market matters. Launching Elastic Scaling but not lifting the constraints on the MVP as soon as possible because we want to make things perfect doesn't have any value for parachain teams.

It was clear since the beginning of the year, or at least quite early on, that this change would be required. I still remember talking to you about this.

If we ever end up doing the properly versioned thing (before JAM), then the reason will be that we introduced something else on top. Hence it would be another breaking change regardless.

If we open this door, I don't see us ever closing it again, because the same arguments will be brought up again and again.

