
Comments (53)

peterbourgon avatar peterbourgon commented on July 18, 2024 1

indexCounter appears to be used to populate an index field for MapItem values, and that index field appears to be used to establish an order of items in a MapSlice value. That order is stable only within a specific instance of a program; a MapSlice produced by one PID has an order that is not necessarily equal to the order of the same MapSlice produced (or received) by a different PID. So MapSlice order is not deterministic, and does not provide "a given order for maps" beyond a process boundary.

from coze.

zamicol avatar zamicol commented on July 18, 2024 1

On the topic of MapSlice, this library actually does look like a suitable replacement for MapSlice: https://github.com/iancoleman/orderedmap

I don't think it would take me too much time to implement.

I would vendor that repo so Coze doesn't have external dependencies.


peterbourgon avatar peterbourgon commented on July 18, 2024 1

Coze/orderedmap.go

Lines 108 to 114 in a3fbc8a

func (o *orderedMap) Values() []any {
	v := []any{}
	for _, k := range o.values {
		v = append(v, k)
	}
	return v
}

o.values is a map type, and iteration over a map is non-deterministic, so the order of the values in the v slice is also non-deterministic. But it seems like you expect a specific order in the tests:

Coze/orderedmap_test.go

Lines 93 to 99 in a3fbc8a

values := o.Values()
expectedValues := []any{
	4, // 3 is overwritten
	"x",
	[]string{"t", "u"},
	[]any{1, "1"},
}

If values should be returned in key order, then you probably want something like

func (o *orderedMap) Values() []any {
	var values []any
	for _, k := range o.keys {
		if v, ok := o.values[k]; ok {
			values = append(values, v)
		}
	}
	return values
}

edit: Note that this example implementation does not provide stable order of values recursively, for example if a value were itself a JSON object (a map of key to value) then the order of the keys and/or values in that value would not be deterministic.


zamicol avatar zamicol commented on July 18, 2024 1

The Coze signing/verification step is not JSON aware, it's just UTF-8. Non-JSON,
UTF-8 octets are required for signing and verification. On ingress, after
verifying the UTF-8 payload, unmarshalling may result in an unordered JSON
data structure. On egress, unordered JSON is marshaled into UTF-8 before signing
by Coze.

Touching on any potential implementation concerns, it's just a few lines of
code to transform from the Coze form into the correct UTF-8 form. The resulting
UTF-8 octets are valid JSON.

To summarize previous posts and to come back to address the origin of order
information, this is my recommendation for transmitting order information in
this ranking.

  1. Convey order by UTF-8. (Intermediaries should not modify pay and must
    leave it alone.)
  2. Use content-type:application/coze to instruct intermediaries to leave
    cozies alone.
  3. Pass can in the coze, e.g. {"pay":...,"can":[...], "sig":...}, and
    reserialize pay into the correct UTF-8 form needed for verification.
  4. Serialize the coze into a format that intermediaries leave alone.

If order was a practical problem, I'd expect to see issues caused by
JOSE/JWS/JWT as the payload is JSON and must also be exact UTF-8 bytes.


zamicol avatar zamicol commented on July 18, 2024

Thank you for opening an issue. We're taking a look.


zamicol avatar zamicol commented on July 18, 2024

Sorry for taking a bit to get useful communication back to you. I've had some
time today to dig into this issue and consider what the other options are.
There's a few other things concerning this matter, so I'll take a moment to
comment on some other aspects.

MapSlice exists because encoding/json has no way to preserve the order of map keys.
See golang/go#27179. There's a lot of discussion in that thread, and I'm not a fan of any of the solutions.

The ultimate goal is to use JSONv2, which solves field order, other issues, and has other best practices.

As currently written MapSlice is not safe for concurrent use, meaning Go Coze itself is not safe for concurrent use.

Another problem is the singleton pattern of indexCounter. I would go
so far as to call it "bad design", and normally, it would be unacceptable.
However, MapSlice is supposed to be an interim solution until JSONv2. (Go maps themselves are not safe for concurrent use and require consideration when used concurrently.)

We could make it safe fairly easily by exposing something useful for sync.RWMutex, but that wouldn't solve the "singleton pattern" problem, and we don't want downstream packages having to worry about it now when it should just be fixed later.
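
As a rough illustration of the sync.RWMutex idea, here is a minimal sketch of an insertion-ordered map whose order lives in the instance instead of in a package-level counter. The type safeOrderedMap and its methods are hypothetical names chosen for this example; they are not the MapSlice or OrderedMap API.

```go
package main

import (
	"fmt"
	"sync"
)

// safeOrderedMap is a hypothetical type for illustration only.
// The key order is stored per instance, so there is no shared global counter.
type safeOrderedMap struct {
	mu     sync.RWMutex
	keys   []string
	values map[string]any
}

// Set stores v under k. New keys are appended to the order;
// overwriting an existing key keeps its original position.
func (m *safeOrderedMap) Set(k string, v any) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.values == nil {
		m.values = map[string]any{}
	}
	if _, ok := m.values[k]; !ok {
		m.keys = append(m.keys, k)
	}
	m.values[k] = v
}

// Keys returns a copy of the keys in insertion order.
func (m *safeOrderedMap) Keys() []string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return append([]string(nil), m.keys...)
}

func main() {
	var m safeOrderedMap
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			m.Set(fmt.Sprintf("k%d", i), i)
		}(i)
	}
	wg.Wait()
	fmt.Println("keys stored:", len(m.Keys()))
}
```

The lock makes concurrent writes safe, but the resulting order still depends on which goroutine wins each race, so it is only meaningful when writes are sequenced by the caller.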

I will add writing a concurrency-safe MapSlice to my TODO list, as I'd like to fix this even in the interim, but I don't have time to make it concurrency safe at the moment.

For now, I think these are the two actionable items that should be done now:

  1. MapSlice should not be exported as Coze isn't the proper place for something like this.
    If MapSlice was useful to other packages, MapSlice should go into its own package and then be imported by Go Coze. Packages should not be importing package coze just for MapSlice.
  2. It should be noted that Go Coze is not currently safe for concurrent use.


peterbourgon avatar peterbourgon commented on July 18, 2024

MapSlice exists because encoding/json has no way to preserve the order of map keys.

JSON maps (as well as Go maps) don't have an order of keys which can be preserved. If you want order in JSON, you gotta use an array! 😅


zamicol avatar zamicol commented on July 18, 2024

Sorry, I stole the verbiage from the "map keys" issue. MapSlice exists because JSON field order is not preserved by Go, because internally it uses a map which does not preserve order. Order is not set by JSON objects, but order is relevant for the canon aspect of Coze.

You got me thinking, perhaps MapSlice is overkill for even what is needed now. MapSlice is currently only used once, in Canon(), in all of Go Coze.

At the moment, I can't think of a good way of replacing MapSlice in Canon() without writing a custom unmarshaler, at which point it may be better to use JSONv2 even if it means vendoring the existing available library.


peterbourgon avatar peterbourgon commented on July 18, 2024

If Coze should provide "valid and idiomatic JSON" then by definition it cannot rely on any specific ordering of map keys in the JSON encoded forms of its values. JSON does not require any specific key order, or really any deterministic encoded form at all.

If order is relevant to Coze, then it must transform (unordered) JSON encoded values to (ordered) spec-defined encoded values, and operate against those transformed values. It's fine if you want to define that second thing as JSON with additional constraints, including (but not only) lexicographical order of object keys. But that transform has to be explicit.

edit: One example of this issue becoming a problem is the can field, which states that it is an "Array of 'pay' field names in order of appearance". But the pay field is a JSON object, and JSON Object field names do not have any defined order of appearance. That is, {"pay":{"a":1,"b":2}} is fully equivalent to {"pay":{"b":2,"a":1}} when encoded or decoded. It would be an error if can was one thing for the first encoded form, but a different thing for the second encoded form; can is undefined until it's redefined to be more specific.


zamicol avatar zamicol commented on July 18, 2024

Base64 isn't defined by JSON. Using base64 doesn't make Coze not JSON, just
as Coze employing canonicalization does not make Coze not JSON.

A Coze message is valid JSON. Coze needs bit perfect messages for verification,
which isn't relevant to JSON.


peterbourgon avatar peterbourgon commented on July 18, 2024

Coze employing (relying upon) canonicalization of encoded data definitely makes Coze not-JSON.

A Coze message is valid JSON, but JSON is not necessarily valid Coze. Bit perfect messages are not, and can not, be provided by JSON. If some code accepts encoded JSON then it necessarily can not assume any specific properties of that encoded form, while remaining JSON-compatible. It must treat {"a":1} as equivalent to { "a": 1} or { "a": 1} or etc.


zamicol avatar zamicol commented on July 18, 2024

A canonicalized coze is JSON.
A pre-canonicalized coze is JSON.

JSON is not necessarily valid Coze, but Coze is valid JSON.


peterbourgon avatar peterbourgon commented on July 18, 2024

Agreed. So if Coze claims it accepts JSON, then it cannot assume any property of canonicalized Coze which is not also a property guaranteed by JSON. That includes JSON object key ordering.

If the JSON encoded bytes {"a":1,"b":2} are considered valid, or evaluate to some digest X, then the JSON bytes {"b":2, "a": 1} also must be considered valid, and to evaluate to the same digest X, because they are equivalent JSON objects.
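
To make the distinction concrete, the following sketch compares the two encodings both as decoded JSON values and as raw bytes. It uses only the Go standard library; the helper names sameObject and sameDigest are made up for this example, and SHA-256 is used simply as a representative digest.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
	"reflect"
)

// sameObject reports whether two JSON documents decode to equal values.
func sameObject(a, b []byte) (bool, error) {
	var x, y any
	if err := json.Unmarshal(a, &x); err != nil {
		return false, err
	}
	if err := json.Unmarshal(b, &y); err != nil {
		return false, err
	}
	return reflect.DeepEqual(x, y), nil
}

// sameDigest reports whether the raw bytes hash to the same SHA-256 digest.
func sameDigest(a, b []byte) bool {
	return sha256.Sum256(a) == sha256.Sum256(b)
}

func main() {
	a := []byte(`{"a":1,"b":2}`)
	b := []byte(`{"b":2, "a": 1}`)

	eq, err := sameObject(a, b)
	if err != nil {
		panic(err)
	}
	fmt.Println("equal as JSON objects:", eq)
	fmt.Println("equal as byte digests:", sameDigest(a, b))
}
```

The two inputs are equal as JSON objects but hash differently, which is the gap the thread is arguing about: a digest over bytes is only stable if something outside JSON fixes the byte form.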


zamicol avatar zamicol commented on July 18, 2024

The numbers 000640 and 640 are equivalent in decimal because decimal does not
assign meaning to preceding 0s. That decimal doesn't interpret meaning from
preceding zeros doesn't mean preceding 0s cannot convey useful information
to other systems. A decimal credit card system may consider 640 invalid while
000640 valid. Does it mean that the credit card system isn't using decimal?
No, it just means it has additional constraints.

It doesn't matter how a JSON library marshals a message before employing Coze to
sign a message; however, once it is marshaled and signed that form has meaning
to Coze even if it does not have meaning to JSON. And that makes sense since
Coze is crypto aware and JSON is not. JSON may consider {"a":1,"b":2} valid
while Coze may not. Conversely, anything that Coze considers valid must be
considered valid by JSON. That's correct, Coze is JSON, JSON is not Coze.


peterbourgon avatar peterbourgon commented on July 18, 2024

It doesn't matter how a JSON library marshals a message before employing Coze to sign a message

What is the type of a message which Coze signs? Is it []byte or a more abstract concept of "a JSON object"?


zamicol avatar zamicol commented on July 18, 2024

JSON messages get "compactified" (unneeded spaces are removed) and then serialized into []byte before signing.
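
A minimal sketch of that "compactify" step using the standard library's json.Compact. The helper name compactify is an assumption for this example and is not necessarily what Go Coze calls it. Note that json.Compact only strips insignificant whitespace; it leaves the order of object keys as it found them.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// compactify is a hypothetical helper: it removes insignificant whitespace
// from a JSON document and returns the resulting bytes.
func compactify(src []byte) ([]byte, error) {
	var buf bytes.Buffer
	if err := json.Compact(&buf, src); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	out, err := compactify([]byte(`{ "b": 2,  "a": 1 }`))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"b":2,"a":1}
}
```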


peterbourgon avatar peterbourgon commented on July 18, 2024

OK, so then Coze does not sign objects, it signs specific serialized bytes, abiding specific rules. Those rules include "is valid JSON" but also additional rules like "has compacted whitespace" which are not guaranteed by JSON. Where are those additional rules specified?


zamicol avatar zamicol commented on July 18, 2024

The Coze object details how signing works, e.g. signing is over the bytes of cad.

The Canon section covers how Coze serializes JSON. When serializing pay the canon is the present fields in the given order. After hashing the compactified and then serialized form, the resulting digest is named cad.

For example, the message {"msg": "Hello Peter!"} has a cad of
HDOZ1KSGu2y69_c6isVsdsOVXcC6PFywtMrz0Msg6-Y. This value (in its binary form) is what is signed by Coze.
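
For readers who want to see the shape of that computation, here is a sketch under stated assumptions: the digest is SHA-256, it is taken over the compactified bytes, and it is rendered as unpadded base64url. Those choices and the name cadSketch are assumptions made for this example, not taken from the Coze source, and the sketch is not claimed to reproduce the digest quoted above.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// cadSketch is a hypothetical helper.
// ASSUMPTIONS: SHA-256 over the compactified bytes, unpadded base64url output.
func cadSketch(pay []byte) (string, error) {
	var buf bytes.Buffer
	if err := json.Compact(&buf, pay); err != nil {
		return "", err
	}
	sum := sha256.Sum256(buf.Bytes())
	return base64.RawURLEncoding.EncodeToString(sum[:]), nil
}

func main() {
	spaced, err := cadSketch([]byte(`{"msg": "Hello Peter!"}`))
	if err != nil {
		panic(err)
	}
	compact, err := cadSketch([]byte(`{"msg":"Hello Peter!"}`))
	if err != nil {
		panic(err)
	}
	// Whitespace differences vanish after compaction.
	fmt.Println("same digest despite whitespace:", spaced == compact)

	ab, _ := cadSketch([]byte(`{"a":1,"b":2}`))
	ba, _ := cadSketch([]byte(`{"b":2,"a":1}`))
	// Key order survives compaction, so the digests differ.
	fmt.Println("same digest despite key order:", ab == ba)
}
```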

After signing a coze, if you hit "advanced" on the verifier the bottom will show cad's value for the given coze. The simple verifier will do the same.


peterbourgon avatar peterbourgon commented on July 18, 2024

When serializing pay the canon is the present fields in the given order.

There is no "given order" of fields in a JSON object. If you encode a JSON object as the bytes {"pay":{"a":1,"b":2}}, it is totally valid for any intermediating software or middlebox to parse and re-encode those bytes to e.g. {"pay": {"b": 2, "a": 1} } before delivering them to the destination. Both of those sequences of bytes represent the exact same JSON object. The order [a, b] is not guaranteed to be preserved between sender and receiver.

edit: But I suspect I've not managed to convince you this is the case 😉 so I'll disengage at this point as well.


zamicol avatar zamicol commented on July 18, 2024

On branch orderedmap, we've migrated from mapSlice to orderedMap, which is concurrency safe.

https://github.com/Cyphrme/OrderedMap

I'll likely merge it into master on Monday and tag it.


zamicol avatar zamicol commented on July 18, 2024

Thank you Peter for pointing that out!

We fixed that in the latest commit and tagged it: 76f019b


zamicol avatar zamicol commented on July 18, 2024

Also, we just published https://github.com/Cyphrme/OrderedMap as a stand alone package.

We added a thank you to you in the README.


zamicol avatar zamicol commented on July 18, 2024

@peterbourgon, I don't think I mentioned before Normal:

https://github.com/Cyphrme/CozeGoX/blob/master/normal/normal.go

Normals, such as Canon, are useful for declaratively describing ordering of fields and other normalization for objects.

That package defines 5 normals, Canon, Only, Need, Option, and Extra.

In the context of Coze, Coze is only aware of a single Normal, Canon. In Coze, a pay has an implicit canon, the current present fields in the current order.


Related, I understand the concern about JSON not ordering fields, so there are a few other ways to view this:

  1. UTF-8 is ordered, so Coze could be considered inheriting order from UTF-8 and not JSON.
  2. Coze is JSON, JSON is not necessarily Coze. It's okay for Coze to define features lacking in JSON as long as they are upward compatible with JSON. (This was the point I was making, I understand that this is the point being questioned)
  3. If order was still really a problem, arrays are ordered in JSON, so there are other ways of communicating order via an array, and that would be a technical fix. For example, transporting can along with pay. We felt that this was entirely unnecessary and bloated messages, so that's why Coze omits an explicit canon from coze.
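
The third option, transporting can along with pay, can be sketched as follows. The receiver ignores whatever key order pay arrived in and re-serializes it in the order listed by can. The message struct and the functions serializeInOrder and orderedPay are hypothetical names for this example, not part of Go Coze.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// message is a hypothetical wire shape carrying an explicit canon.
type message struct {
	Pay map[string]json.RawMessage `json:"pay"`
	Can []string                   `json:"can"`
}

// serializeInOrder writes the fields of pay in the order listed by can.
// It rejects a can that does not name exactly the fields present in pay.
func serializeInOrder(pay map[string]json.RawMessage, can []string) ([]byte, error) {
	if len(can) != len(pay) {
		return nil, fmt.Errorf("can has %d names, pay has %d fields", len(can), len(pay))
	}
	var buf bytes.Buffer
	buf.WriteByte('{')
	for i, k := range can {
		v, ok := pay[k]
		if !ok {
			return nil, fmt.Errorf("can names field %q, which is missing from pay", k)
		}
		if i > 0 {
			buf.WriteByte(',')
		}
		key, err := json.Marshal(k) // quote and escape the key
		if err != nil {
			return nil, err
		}
		buf.Write(key)
		buf.WriteByte(':')
		if err := json.Compact(&buf, v); err != nil {
			return nil, err
		}
	}
	buf.WriteByte('}')
	return buf.Bytes(), nil
}

// orderedPay decodes a message and returns pay serialized in can order.
func orderedPay(msg []byte) ([]byte, error) {
	var m message
	if err := json.Unmarshal(msg, &m); err != nil {
		return nil, err
	}
	return serializeInOrder(m.Pay, m.Can)
}

func main() {
	// pay arrives with its keys reordered by some intermediary.
	in := []byte(`{"can":["msg","alg"],"pay":{"alg":"ES256","msg":"hi"}}`)
	out, err := orderedPay(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"msg":"hi","alg":"ES256"}
}
```

The cost is the one noted above: every message carries the field names twice.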


peterbourgon avatar peterbourgon commented on July 18, 2024

Coze is JSON, JSON is not necessarily Coze.

Exactly! Equivalently: the MarshalJSON method of a Coze type should produce output bytes that are valid JSON, but the UnmarshalJSON method of a Coze type can't assume the input bytes it receives are valid Coze.

It's okay for Coze to define features lacking in JSON as long as they are upward compatible with JSON. (This was the point I was making, I understand that this is the point being questioned)

Following from above, this works for encoding, but not for decoding: a Coze type can't be decoded from a JSON payload.


zamicol avatar zamicol commented on July 18, 2024

Theory aside, and only considering the practical, I can't imagine a circumstance where order information is not present. Can you envision a circumstance where order information is not present?

If there was the practical need for order information, transporting can along with pay would be the best solution. Alternatively, the type of pay could be an array, but that's cumbersome and breaks the spirit of pay permitting arbitrary JSON.

"Coze is JSON, JSON is not necessarily Coze" is equivalent to saying, "JSON is UTF-8. UTF-8 isn't necessarily JSON."

The signature of Coze.UnmarshalJSON is []byte, which is UTF-8, which is JSON, which is Coze. The signature of UnmarshalJSON is not type UTF-8 or json.RawMessage. []byte -> UTF-8 -> JSON -> Coze


zamicol avatar zamicol commented on July 18, 2024

On the practical side, the Coze specification could say something along the lines of "since JSON specifies that objects are unordered, the order for canonicalizing pay is derived from the UTF-8 serialized form of the JSON message "

I believe that would satisfy the concern.


peterbourgon avatar peterbourgon commented on July 18, 2024

By order information, do you mean the order of the e.g. keys in an encoded JSON payload, or something else?

If it's the former, then yes, anything which handles a JSON payload is totally free to parse and re-encode that payload, as long as the change is semantically equivalent according to the JSON spec. This could be done by an HTTP middleware in the program that produces the JSON payload, a middleware in the program that receives the JSON payload, or an HTTP middlebox anywhere between those two nodes.
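
A small sketch of such a re-encoding step in Go. Decoding into a generic value and marshaling it back is a semantically neutral operation as far as JSON is concerned, yet encoding/json writes map keys in sorted order, so the bytes change. The function name reencode is made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reencode stands in for any intermediary that parses a JSON payload
// and serializes it again without changing its JSON meaning.
func reencode(src []byte) ([]byte, error) {
	var v any
	if err := json.Unmarshal(src, &v); err != nil {
		return nil, err
	}
	// json.Marshal emits map keys in sorted order.
	return json.Marshal(v)
}

func main() {
	in := []byte(`{"pay":{"msg":"hi","alg":"ES256"}}`)
	out, err := reencode(in)
	if err != nil {
		panic(err)
	}
	fmt.Println("sent:    ", string(in))
	fmt.Println("received:", string(out)) // {"pay":{"alg":"ES256","msg":"hi"}}
}
```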


zamicol avatar zamicol commented on July 18, 2024

Perhaps that's the root of the differing viewpoints:

  1. The fair-game perspective: JSON specifies UTF-8 as its serialization format.
    Even if JSON objects are unordered, order information exists in the
    serialized form. Since JSON prescribes UTF-8 serialization in its
    specification, it's fair-game for downstream use.

  2. The pure abstract JSON perspective: Even though JSON specifies UTF-8 as its
    serialization format, serialization is ingress/egress for JSON and should not
    be considered as a part of the core of JSON. Since JSON objects are unordered,
    object order information does not exist for JSON. This includes anything
    downstream of JSON as well.

I think either viewpoint has the same end result. As long as the serialized
form is available object order information exists. Even if Coze doesn't
inherit pay's order from the abstract JSON, order is derivable from UTF-8.

Even if potentially excessive, I think adding something like the following to
the Coze specification alleviates semantic concerns: "The order of pay's canon
must match the UTF-8 serialized form."
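
One way to read order off the serialized form in Go is to walk the tokens with json.Decoder instead of decoding into a map. The sketch below returns the top-level keys of an object in order of appearance; keysInOrder is a hypothetical helper name, and this is not claimed to be how Go Coze implements Canon().

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// keysInOrder returns the top-level keys of a JSON object in the order
// they appear in the serialized bytes.
func keysInOrder(src []byte) ([]string, error) {
	dec := json.NewDecoder(bytes.NewReader(src))
	tok, err := dec.Token()
	if err != nil {
		return nil, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return nil, errors.New("not a JSON object")
	}
	var keys []string
	for dec.More() {
		tok, err := dec.Token()
		if err != nil {
			return nil, err
		}
		key, ok := tok.(string)
		if !ok {
			return nil, fmt.Errorf("unexpected token %v", tok)
		}
		keys = append(keys, key)
		// Skip over the value, whatever its type.
		var skip json.RawMessage
		if err := dec.Decode(&skip); err != nil {
			return nil, err
		}
	}
	return keys, nil
}

func main() {
	keys, err := keysInOrder([]byte(`{"msg":"Coze Rocks","alg":"ES256","iat":1623132000}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(keys) // [msg alg iat]
}
```

This reports the order of the bytes as received. Whether that matches the order the sender signed is exactly the point under dispute.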


peterbourgon avatar peterbourgon commented on July 18, 2024

I don't understand how UTF-8 establishes "order information" in the sense that you mean. A JSON payload may be UTF-8, but that doesn't prevent the bytes {"a": 1, "b": 2} from being transformed to {"b":2,"a":1}. The first form doesn't establish an order of [a b], and the second form doesn't establish an order of [b, a] — both a and b are unordered. Can you give me an example that demonstrates your point?


zamicol avatar zamicol commented on July 18, 2024

UTF-8 orders bytes. Out of order bytes breaks UTF-8. Coze is order aware as well as UTF-8.

Coze knows what the order is from {"a": 1, "b": 2}.


peterbourgon avatar peterbourgon commented on July 18, 2024

UTF-8 orders bytes in individual runes (characters), not entire strings. But I think that's a red herring. Coze can't assert anything about order of map keys from a received JSON payload. Receiving {"a": 1, "b": 2} does not guarantee that the sender encoded it that way.


zamicol avatar zamicol commented on July 18, 2024

Coze isn't asserting anything about a JSON payload, but it is asserting order in a Coze payload.


zamicol avatar zamicol commented on July 18, 2024

If anything downstream of JSON was strictly not allowed to be order aware, and Coze requires a UTF-8 payload in order, that is the worst-case scenario that meets Coze's requirements. Since JSON isn't cryptography aware, and the downstream Coze is, this doesn't seem like a reach. Moreover, on the practical side, we've not only not run into any sort of issue with this design, but we can't think of a circumstance where this would be an issue. If there is an issue, it's only regarding design theory, not the implementation. I guess I'd also like to see a practical Go test/example where coze.pay being order aware is unacceptable. I cannot envision a circumstance where this would be the case. If there is a case, I suspect it is something resolvable.

Moreover, I'd like to hear solutions to any design deficiency. Of all the available options, (which are diverse: shuffling permutations, serialization, passing along can, changing type of pay to array, etc...) coze.pay being order aware seems like by far the best option. If coze.pay being order aware is unacceptable, what are the next best solutions?


peterbourgon avatar peterbourgon commented on July 18, 2024

If you put a Coze payload into an HTTP request body, would it be Content-Type: application/json, or Content-Type: application/coze?


zamicol avatar zamicol commented on July 18, 2024

That's a great thought. Most specifically, it would be application/coze.

Less specifically, application/json is suitable, but the ingesting application still needs to be aware that the JSON message is Coze, otherwise none of the Coze features are useful.

Our API endpoints at the moment are sending application/json, but the ingesting applications know that the JSON is Coze.
https://cyphr.me/api/v1/e/nPdArkL2UJrn7S46sTcdpKZ7znHi4k8BAI0TS9lPXmA?pretty



peterbourgon avatar peterbourgon commented on July 18, 2024

application/json is suitable, but the ingesting application still needs to be aware that the JSON message is Coze

Not only the ingesting application itself, but any intermediary with access to the response.

For example, if cyphr.me used a TLS-terminating CDN, then that CDN is free to transform the response body served by your cyphr.me origin before forwarding it to the requesting client, as long as the transform satisfied the JSON spec. Compression is one common way this can happen: the response body can be parsed and re-encoded in a more compact form, or gzipped, or etc.

As another example, if the client application happened to use any kind of general-purpose decorator on its HTTP client, those decorators could transform the response body arbitrarily, as long as the transform satisfied the JSON spec. Logging is one common way this can happen: the response body could be parsed, individual keys are logged, and then the parsed form re-encoded and forwarded on.

There are many other possibilities: WAF proxies, etc.


peterbourgon avatar peterbourgon commented on July 18, 2024

The Coze signing/verification step is not JSON aware, it's just UTF-8.

Coze may not be JSON aware, but since you shuttle the Coze payload thru HTTP with a Content-Type of application/json, it is by definition a JSON payload as it traverses the network. Any HTTP middle-box, including any HTTP middleware in any intermediating application, is free to parse and re-encode that JSON payload in any way which preserves JSON semantics.

That means that sending {"a":1,"b":2} does not guarantee that the receiver will receive exactly those {"a":1,"b":2} bytes without modification. It's entirely possible that the receiver will receive {"b": 2, "a": 1}, or {"a": 1, "b":2 }, or any other transformation of that initial JSON object which represents the same (abstract) JSON object.

This is not just a theoretical possibility, it is a common and practical reality. Tons of systems -- CDNs, in-process decorators, etc. -- do stuff like this.

If order was a practical problem, I'd expect to see issues caused by JOSE/JWS/JWT as the payload is JSON and must also be exact UTF-8 bytes.

AFAICT, those payloads are represented as fields in a JSON object, and not as the JSON object itself, which means the issues we're discussing here aren't applicable.

If you wanted to lean on the same guarantees that JOSE/JWS/JWT/etc. rely upon, then you would need to encode the Coze UTF-8 bytes as a base64-encoded string field in a JSON object, e.g.

{"coze": "<base64>"}

which would be faithfully preserved over the network.
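
A sketch of that envelope idea in Go. encoding/json encodes a []byte field as a base64 string, so the inner bytes ride along as a single JSON string value that a parse-and-re-encode step cannot reorder. The envelope type and the wrap, reencode, unwrap, and roundTrip helpers are hypothetical names; the sketch uses the standard base64 alphabet that encoding/json applies to []byte, which may differ from what JOSE uses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope is a hypothetical wrapper. The []byte field is marshaled
// as a base64 string by encoding/json.
type envelope struct {
	Coze []byte `json:"coze"`
}

// wrap places the exact coze bytes inside the envelope.
func wrap(coze []byte) ([]byte, error) {
	return json.Marshal(envelope{Coze: coze})
}

// reencode stands in for an intermediary that parses and pretty-prints JSON.
func reencode(msg []byte) ([]byte, error) {
	var v any
	if err := json.Unmarshal(msg, &v); err != nil {
		return nil, err
	}
	return json.MarshalIndent(v, "", "  ")
}

// unwrap recovers the original coze bytes from the envelope.
func unwrap(msg []byte) ([]byte, error) {
	var e envelope
	if err := json.Unmarshal(msg, &e); err != nil {
		return nil, err
	}
	return e.Coze, nil
}

// roundTrip wraps, re-encodes, and unwraps the given bytes.
func roundTrip(coze []byte) ([]byte, error) {
	wrapped, err := wrap(coze)
	if err != nil {
		return nil, err
	}
	mangled, err := reencode(wrapped)
	if err != nil {
		return nil, err
	}
	return unwrap(mangled)
}

func main() {
	in := []byte(`{"pay":{"msg":"hi","alg":"ES256"},"sig":"placeholder"}`)
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	fmt.Println("bytes preserved:", string(out) == string(in))
}
```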


zamicol avatar zamicol commented on July 18, 2024

If middleware risks re-encoding, that stack should use content-type:application/coze. I would say the four point waterfall from above applies.

Those payloads are represented as fields in a JSON object, and not as the JSON object itself, which means the issues we're discussing here aren't applicable.

I'm not sure if I understand how Coze and JOSE are different here. This is the aspect where they are the most alike.

JWS has JSON payloads. https://datatracker.ietf.org/doc/html/rfc7797#section-4.2


peterbourgon avatar peterbourgon commented on July 18, 2024

If middleware risks re-encoding, that stack should use content-type:application/coze.

Any HTTP request body that is content-type:application/json "risks re-encoding" by definition, from every intermediating node: the sending application's JSON library, any sending application middleware(s), any middle-box, any CDN proxy, any other proxy, any middleware(s) in the receiving application, the receiving application's JSON library, etc. etc.

If the specific order of bytes is significant, then the content-type cannot be application/json, it must be application/coze.

I'm not sure if I understand how Coze and JOSE are different here. This is the aspect where they are the most alike.

JWS has JSON payloads. https://datatracker.ietf.org/doc/html/rfc7797#section-4.2

Read that RFC carefully. The relevant data is not the entire JSON object as transmitted over the network, it is a specific field within the JSON object, which has a well-defined and deterministic representation.


zamicol avatar zamicol commented on July 18, 2024

I don't think middleware manipulation of a coze is a concern that's significant
enough to warrant any design changes to Coze.

If middleware manipulation is a concern, there are plenty of employable tools
to resolve any hypotheticals, minimally this toolbox:

  1. Intermediaries should leave cozies alone.
  2. Convey order by UTF-8.
  3. Use content-type:application/coze.
  4. Pass can in the coze, e.g. {"pay":...,"can":[...], "sig":...}.
  5. Encode the coze into a format that intermediaries leave alone.
  6. Instruct intermediaries to leave cozies alone as signified by the presence of coze and/or pay.

Beyond what I've previously highlighted, although not strictly required, Coze
recommends that when larger JSON payloads include a coze to use the coze
label. Middleware must interpret JSON to perform useful operations, and the
coze label is a useful signifier to these applications that already perform
interpretation. The presence of the label coze can be used to instruct
intermediaries to perform specific behavior. If a coze is not included in larger
JSON payloads, pay itself is also a useful label. In any situation, the
labels coze and minimally pay are available to signify to intermediaries the
presence of fields relevant to Coze and to instruct intermediaries accordingly.

The relevant data is not the entire JSON object as transmitted over the network,
it is a specific field within the JSON object

It is exactly the same for Coze.

has a well-defined and deterministic representation

I'm all ears if you can point me to this deterministic representation, but this
is not the case. The first JOSE RFC says the payload is "an arbitrary sequence
of octets" (in the introduction and terminology sections). There is nothing
about a deterministic representation, but it does reiterate a few times that
it's arbitrary. RFC 7797, which further details unencoded JSON payloads, again
states it is arbitrary in the abstract and in the introduction.

Beyond the RFCs, this is also how the industry is using it. The
practical application is relevant.

I am unaware of a JOSE equivalent to Coze's canon functionality, although since
it is very useful, I would not be surprised if it were added in the future.
One of the reasons why we wrote Coze was to leverage canonicalization to the
extent we felt was warranted. It would be neat if JWS had some sort of
deterministic ordering. Since it is so useful, Coze's canon and normal
functions are intentionally exported so they are usable by any application; for
example Coze's canon/normal can be used on JOSE JSON payloads.

HTTP [...] from every intermediating node

That is not the case for HTTPS, as an application must explicitly permit
middleware manipulation. However, the assumption of "application stacks should
not be doing unnecessary manipulations" is fair for HTTP as well, as any
manipulation of payloads should be considered authorized by the application.
Non-authorized modifications are equivalent to man-in-the-middle attacks. Said
again: any middleware manipulations of payloads should be considered to be in
control of the application's operator, who should not be performing unnecessary
manipulations.


peterbourgon avatar peterbourgon commented on July 18, 2024

The relevant data is not the entire JSON object as transmitted over the network, it is a specific field within the JSON object

It is exactly the same for Coze.

I'm not sure this is true. One example among many is Key.SignPay, which signs the bytes produced from calling Marshal on the Pay struct. But Marshal just JSON encodes the provided value with the standard library's encoding/json package, which isn't stable. But maybe I'm misunderstanding this whole thing, I dunno.


zamicol avatar zamicol commented on July 18, 2024

which isn't stable

It doesn't need to be. The exact UTF-8 form is required only for signing and verification. Coze can sign a JSON struct that's in any order, but once it's signed, the exact UTF-8 form is required.

However, although not relevant here, Pay is stable as Go marshals in order of the struct if I'm not mistaken. But when verifying, order is not derived from the struct, rather the UTF-8 form.


peterbourgon avatar peterbourgon commented on July 18, 2024

UTF-8 form

You keep using this term, but I'm still not sure what you mean by it. Can you explain? UTF-8 guarantees valid byte order within a single rune, but it doesn't guarantee anything about the order of runes within a string...


zamicol avatar zamicol commented on July 18, 2024

UTF-8 is just a series of bytes.

but it doesn't guarantee anything about the order of runes within a string

How could it not?

Even more primitive than UTF-8, cryptographic signatures sign byte strings. Strings are in order.

If strings were not in order, the letters of the sentence would be out of order.


peterbourgon avatar peterbourgon commented on July 18, 2024

UTF-8 is just a series of bytes.

UTF-8 defines a variable-length encoding for individual characters (runes). A valid UTF-8 string is guaranteed to be a sequence of one or more valid UTF-8 runes, but UTF-8 doesn't guarantee anything about the order of those runes in the string. Example in Go.
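
The distinction between byte order inside a rune and rune order inside a string can be shown with a short sketch. This is not the playground program linked in the thread; the helper names are made up for this example. Rearranging whole runes keeps a string valid UTF-8, while rearranging the bytes of a multi-byte rune does not.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// reverseRunes reverses a string rune by rune.
func reverseRunes(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// reverseBytes reverses a string byte by byte, ignoring rune boundaries.
func reverseBytes(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

// validUTF8 wraps utf8.ValidString for use in the checks below.
func validUTF8(s string) bool {
	return utf8.ValidString(s)
}

func main() {
	s := "héllo" // 'é' is encoded as two bytes
	fmt.Println("original valid:      ", validUTF8(s))
	fmt.Println("runes reversed valid:", validUTF8(reverseRunes(s)))
	fmt.Println("bytes reversed valid:", validUTF8(reverseBytes(s)))
}
```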

Cryptographic signatures sign byte strings. Strings are in order. If strings were not in order, the letters of the sentence would be out of order.

Opaque byte sequences, sure, but JSON byte sequences (or payloads) aren't opaque, and aren't guaranteed to be preserved without modification between sender and receiver. Code that receives bytes designated as JSON can modify those bytes in any way that doesn't violate the JSON spec. For example the prettify middleware here is perfectly valid.


zamicol avatar zamicol commented on July 18, 2024

Strings are an ordered series of bytes.

https://go.dev/play/p/OHozWGw0RdE

Firstly, I appreciate your discourse using code. Thank you!

That example highlights that

  1. Indeed, strings are ordered.
  2. Shuffling bytes around is possible in UTF-8. This isn't a surprise, but it's nice to know as a first principle. Yes, shuffling bytes when UTF-8 unaware (not aware of runes) may result in invalid UTF-8.

Shuffling order doesn't mean that order doesn't exist. On the contrary, the ability to shuffle is predicated on order itself.

The point being that Coze is inheriting the order from the byte string which is ordered.

Coze's order design decision: "just use what you're given"

This was the design choice Coze faced: 1. Use strings to convey order 2. use a deterministic order (UTF-8 ordered).

Coze was originally UTF-8 ordered, but we found it to be anti-ergonomic since keys would be ordered rigidly. In JSON, some fields are much more important than other fields, and it's nice to have those first. Deterministic ordering is a legitimate design decision, but it's one we found to be not as minimal as simply saying "you're already sending a string, just use the given order". The more minimal design choice is typically better.

Further, we wanted the ability to order keys as desired. For example, if msg was first, we wanted that to be okay. When hand writing JSON, object keys can be written in any order and it is valid. We found that to be the most ergonomic.

A third legitimate option was to order by Coze fields first and then by UTF-8 order. Overall, we found "any order" was more aligned with JSON and more ergonomic.
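The difference between "given order" and "UTF-8 order" can be sketched as follows. This is an illustration using only encoding/json, not Coze's actual implementation; keysInGivenOrder and keysInUTF8Order are hypothetical helper names. The first reads top-level keys in the order they appear in the string; the second sorts them bytewise.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// keysInGivenOrder returns the top-level keys of a JSON object in the order
// they appear in the document.
func keysInGivenOrder(doc string) ([]string, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	tok, err := dec.Token()
	if err != nil {
		return nil, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return nil, fmt.Errorf("expected a JSON object")
	}
	var keys []string
	for dec.More() {
		tok, err := dec.Token()
		if err != nil {
			return nil, err
		}
		key, ok := tok.(string)
		if !ok {
			return nil, fmt.Errorf("expected a string key")
		}
		keys = append(keys, key)
		// Skip over the value that belongs to this key.
		var skip json.RawMessage
		if err := dec.Decode(&skip); err != nil {
			return nil, err
		}
	}
	return keys, nil
}

// keysInUTF8Order returns a bytewise-sorted copy of keys.
func keysInUTF8Order(keys []string) []string {
	sorted := append([]string(nil), keys...)
	sort.Strings(sorted)
	return sorted
}

func main() {
	doc := `{"msg":"Coze Rocks","alg":"ES256","iat":1623132000}`
	given, err := keysInGivenOrder(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println("given order:", given)
	fmt.Println("UTF-8 order:", keysInUTF8Order(given))
}
```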

The coffin nail is can. As said above, can may be used to adjust order as needed.

(On a side note, are you peterGo? That account chimed in on a UTF-8/ASCII issue a while back; see issue 52062 in the Go project.)

https://go.dev/play/p/2Frk7rJyvvB

Again, thank you for demonstrating using code. I understand why http.Handler is used in that example; however, because it's on the playground, it's not executed.

I think the following is better for the sake of demonstration since the code is executed: https://go.dev/play/p/oguhQ5xU8NW

And it works. ;-) Yes, even when re-encoding using json.MarshalIndent. 👍

Here's the code:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"

	"github.com/cyphrme/coze"
)

func main() {
	goldenCoze := `{"pay":{"msg":"Coze Rocks","alg":"ES256","iat":1623132000,"tmb":"cLj8vsYtMBwYkzoFVZHBZo6SNL8wSdCIjCKAwXNuhOk","typ":"cyphr.me/msg"},"sig":"Jl8Kt4nznAf0LGgO5yn_9HkGdY3ulvjg-NyRGzlmJzhncbTkFFn9jrwIwGoRAQYhjc88wmwFNH5u_rO56USo_w"}`
	var obj map[string]json.RawMessage
	if err := json.NewDecoder(bytes.NewReader([]byte(goldenCoze))).Decode(&obj); err != nil {
		panic(fmt.Sprintf("decode JSON: %v", err))
	}
	pretty, err := json.MarshalIndent(obj, "", "    ")
	if err != nil {
		panic(fmt.Sprintf("prettify JSON: %v", err))
	}
	log.Printf("pretty:\n %s", string(pretty))

	var GoldenKey = coze.Key{
		Alg: coze.SEAlg(coze.ES256),
		Kid: "Zami's Majuscule Key.",
		Iat: 1623132000,
		X:   coze.MustDecode("2nTOaFVm2QLxmUO_SjgyscVHBtvHEfo2rq65MvgNRjORojq39Haq9rXNxvXxwba_Xj0F5vZibJR3isBdOWbo5g"),
		D:   coze.MustDecode("bNstg4_H3m3SlROufwRSEgibLrBuRq9114OvdapcpVA"),
		Tmb: coze.MustDecode("cLj8vsYtMBwYkzoFVZHBZo6SNL8wSdCIjCKAwXNuhOk"),
	}

	cz := new(coze.Coze)
	err = json.Unmarshal([]byte(goldenCoze), cz)
	if err != nil {
		panic(err)
	}

	fmt.Println(GoldenKey.VerifyCoze(cz))
}

from coze.

peterbourgon avatar peterbourgon commented on July 18, 2024

But this fails, even though it represents 100% equivalent JSON.

Coze's order design decision: "just use what you're given" — This was the design choice Coze faced: 1. Use strings to convey order 2. use a deterministic order (UTF-8 ordered).

UTF-8 does not define, enforce, or guarantee order beyond byte order of individual runes (characters). It explicitly does not provide any kind of order of characters (runes) in a string, or of keys in a JSON object.

When hand writing JSON, object keys can be written in any order and it is valid. We found that to be the most ergonomic.

Again, the order of keys as encoded by the sender — in hand-written JSON or otherwise — is not guaranteed to be preserved in the payload received by the receiver.

edit: The solution here is straightforward: anywhere you pass a "sig":"..." parameter representing a signature, also pass a "msg":"..." parameter representing (some encoding of) the bytes that were signed by the sig, and when verifying the signature, use those bytes, rather than whatever bytes you received over the wire (which aren't reliable). No problem! Why not just do that?
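A rough sketch of this suggestion, under several assumptions: Ed25519 from the standard library stands in for Coze's algorithms, the envelope type and the helpers seal, open, prettify, and roundTrip are invented for illustration, and the signed bytes travel as a base64 string (which is how encoding/json encodes []byte). Because the signed bytes are carried as an opaque string value, an intermediary that re-encodes the outer JSON does not disturb them.

```go
package main

import (
	"crypto/ed25519"
	"encoding/json"
	"fmt"
)

// envelope carries the exact signed bytes next to the signature.
// encoding/json encodes []byte fields as base64 strings.
type envelope struct {
	Msg []byte `json:"msg"`
	Sig []byte `json:"sig"`
}

// seal signs payload and wraps both in a JSON envelope.
func seal(priv ed25519.PrivateKey, payload []byte) ([]byte, error) {
	return json.Marshal(envelope{Msg: payload, Sig: ed25519.Sign(priv, payload)})
}

// open verifies the signature over the bytes carried in msg,
// not over the bytes of the surrounding JSON document.
func open(pub ed25519.PublicKey, wire []byte) ([]byte, bool) {
	var e envelope
	if err := json.Unmarshal(wire, &e); err != nil {
		return nil, false
	}
	return e.Msg, ed25519.Verify(pub, e.Msg, e.Sig)
}

// prettify stands in for an intermediary that decodes and re-encodes JSON.
func prettify(wire []byte) ([]byte, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(wire, &obj); err != nil {
		return nil, err
	}
	return json.MarshalIndent(obj, "", "    ")
}

// roundTrip seals payload, passes it through prettify, and opens it again.
func roundTrip(payload string) (string, bool) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil means crypto/rand.Reader
	if err != nil {
		panic(err)
	}
	wire, err := seal(priv, []byte(payload))
	if err != nil {
		panic(err)
	}
	pretty, err := prettify(wire)
	if err != nil {
		panic(err)
	}
	msg, ok := open(pub, pretty)
	return string(msg), ok
}

func main() {
	msg, ok := roundTrip(`{"msg":"Coze Rocks","alg":"ES256"}`)
	fmt.Println(ok, msg)
}
```

The trade-off raised later in the thread still applies: the payload is no longer readable as nested JSON on the wire.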

from coze.

zamicol avatar zamicol commented on July 18, 2024

But this fails, even though it represents 100% equivalent JSON.

Again, order isn't relevant after verification or before signing. Verification is over UTF-8, not JSON; once a message is verified, its items may be ordered in any way. Signing is not over JSON either; pay must be serialized into UTF-8 before signing. Coze's functions are over UTF-8, not JSON. And that's what the JSON spec says: JSON is encoded to and decoded from UTF-8.

So yes, this is correct:

values -> JSON    -> UTF-8   -> Coze                To create a coze from given values.  
Coze   -> UTF-8   ->  JSON   -> values              To decode a coze and retrieve arbitrary values.  

For your example, Coze UTF-8 strings should not be interpreted as JSON until after verification. Afterwards, they may be ordered in any way. (And yes, that's what we do on our servers. We verify coze messages first, before unmarshalling. After verification, payloads may be manipulated as needed.)
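The "verify first, then unmarshal" flow can be sketched like this. It is not Coze's code: Ed25519 signs the compacted pay bytes directly (Coze's algorithms and digesting are not modeled), sig uses encoding/json's default base64 for []byte, and the names signed, verifyThenDecode, and demo are hypothetical. json.RawMessage keeps the received pay bytes in their given order, and json.Compact strips insignificant whitespace without reordering keys.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"encoding/json"
	"errors"
	"fmt"
)

// signed keeps pay as raw bytes so the received key order is preserved.
type signed struct {
	Pay json.RawMessage `json:"pay"`
	Sig []byte          `json:"sig"`
}

// verifyThenDecode checks the signature over the compacted pay bytes and only
// then interprets pay as JSON values.
func verifyThenDecode(pub ed25519.PublicKey, wire []byte) (map[string]any, error) {
	var s signed
	if err := json.Unmarshal(wire, &s); err != nil {
		return nil, err
	}
	var compact bytes.Buffer
	if err := json.Compact(&compact, s.Pay); err != nil {
		return nil, err
	}
	if !ed25519.Verify(pub, compact.Bytes(), s.Sig) {
		return nil, errors.New("invalid signature")
	}
	var pay map[string]any
	if err := json.Unmarshal(s.Pay, &pay); err != nil {
		return nil, err
	}
	return pay, nil
}

// demo reports whether verification succeeds after (a) whitespace-only
// re-indentation and (b) a re-encode that reorders keys.
func demo() (indentedOK, reorderedOK bool) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil means crypto/rand.Reader
	if err != nil {
		panic(err)
	}
	pay := `{"msg":"Coze Rocks","alg":"ES256"}`
	wire, err := json.Marshal(signed{Pay: json.RawMessage(pay), Sig: ed25519.Sign(priv, []byte(pay))})
	if err != nil {
		panic(err)
	}

	var indented bytes.Buffer
	if err := json.Indent(&indented, wire, "", "    "); err != nil {
		panic(err)
	}
	_, err = verifyThenDecode(pub, indented.Bytes())
	indentedOK = err == nil

	var generic map[string]any
	if err := json.Unmarshal(wire, &generic); err != nil {
		panic(err)
	}
	reordered, err := json.Marshal(generic) // nested map keys come out sorted
	if err != nil {
		panic(err)
	}
	_, err = verifyThenDecode(pub, reordered)
	reorderedOK = err == nil
	return indentedOK, reorderedOK
}

func main() {
	indentedOK, reorderedOK := demo()
	fmt.Println("verifies after re-indent:", indentedOK)
	fmt.Println("verifies after key reorder:", reorderedOK)
}
```

Under these assumptions the scheme tolerates whitespace changes but not key reordering, which is the behavior the two sides of this thread are debating.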

also pass a "msg":"..."

This is an uglier solution as it requires JSON escaping, significantly hurting readability and ballooning the message size:

{
"pay":"{\"msg\":\"Coze Rocks\",\"alg\":\"ES256\",\"iat\":1623132000,\"tmb\":\"cLj8vsYtMBwYkzoFVZHBZo6SNL8wSdCIjCKAwXNuhOk\",\"typ\":\"cyphr.me\/msg\"}",
"sig":"Jl8Kt4nznAf0LGgO5yn_9HkGdY3ulvjg-NyRGzlmJzhncbTkFFn9jrwIwGoRAQYhjc88wmwFNH5u_rO56USo_w"
}

from coze.

peterbourgon avatar peterbourgon commented on July 18, 2024

Coze UTF-8 strings should not be interpreted as JSON until after verification

OK, no problem, but in this case you can't use MarshalJSON/UnmarshalJSON, as those methods produce and consume JSON. You would need to define and use e.g. MarshalCoze/UnmarshalCoze.

from coze.

zamicol avatar zamicol commented on July 18, 2024

There is no problem using MarshalJSON/UnmarshalJSON as the example demonstrates: https://go.dev/play/p/oguhQ5xU8NW. Further, the example is idiomatic; it's not a hack.

from coze.

peterbourgon avatar peterbourgon commented on July 18, 2024

https://go.dev/play/p/yBrTPLzHQWN

If you designate a payload as JSON, then step 2 is perfectly valid, and can occur anywhere between sender and receiver.

edit: I'm not trying to be argumentative, or anything like that. I'm only trying to point out an invalid assumption in the project. At this point I think I've done about as much as I can do to point out the issue, so I'll bow out; do with my comments what you will.

from coze.

zamicol avatar zamicol commented on July 18, 2024

Coze signs strings. That string has an invalid signature and that example is working as specified. It's doing exactly what it should do.

What invalid assumption? That Coze signs and verifies strings?

from coze.

peterbourgon avatar peterbourgon commented on July 18, 2024

What invalid assumption? That Coze signs and verifies strings?

No, that JSON serialization (as defined by the spec) produces "strings" (byte sequences) that can be signed/verified in the first place.

edit: You seem to be assuming that the specific bytes that MarshalJSON produces will not be modified between sender and receiver. That's simply not true, as a statement of fact, regardless of whether those bytes are encoded as UTF-8 or otherwise.

from coze.
