Coder Social home page Coder Social logo

Comments (4)

thehenrytsai avatar thehenrytsai commented on August 21, 2024

Hi @OR13, please ignore the incorrect referenced bug fix referenced above.

I do not understand the issue. Currently the CAS layer operates on Buffer and makes no assumptions on how the buffer is constructed. The content (e.g. anchor or batch file) is written only once to CAS by a rooter, no other instances/entities ever need to reproduce the exact serialization/file without fetching the file itself directly from CAS which would yield the same hash.

from sidetree.

OR13 avatar OR13 commented on August 21, 2024

https://github.com/decentralized-identity/sidetree-core/blob/761cfd6277a31f16d802ec240e0b16e7b09b72fd/src/BatchFile.ts#L31

Its true that sha256(Buffer.from(JSON.stringify(someDeepObject))) will generate a contentAddress, and so long as some other system does not try and derive the same address from their version of someDeepObject, it will work fine.

I have encountered issues with this in other projects, where a client constructs an object and then attempts to verify its existence in another system by generating a contentAddress for the object.

https://stackoverflow.com/questions/42491226/is-json-stringify-deterministic-in-v8

It may not be an issue with the way sidetree is being used today, I just worry about this sort of thing in protocols / standards.

JSON-LD has a similar challenge, and uses a much more complicated way to achieve a canonical representation of a json document: https://github.com/digitalbazaar/jsonld.js/#canonize-normalize

These things matter more when you have protocol implementations in multiple languages, for example mastodon uses ActivityPub and JSON-LD Signatures, and canonizes their objects using the method above before signing them:

https://github.com/tootsuite/mastodon/blob/cabdbb7f9c1df8007749d07a2e186bb3ad35f62b/app/lib/activitypub/linked_data_signature.rb#L55

See also: https://stackoverflow.com/questions/4670494/how-to-cryptographically-hash-a-json-object

In the future, when sidetree supports Swarm, Storj, Sia, Madesafe, subspace or any of the other content addressing storage systems out there, how you construct the object, will determine the buffer, which will determine what content address you get.

Its possible this is not going to be an issue for sidetree, I'm just sensitive to this issue having been burned by it before, its also one of the main reasons I like JSON-LD (there is standard way of normalizing json documents with a context)

from sidetree.

thehenrytsai avatar thehenrytsai commented on August 21, 2024

Thanks @OR13 for the useful info.

In principle, I do like the idea of having a standardized deterministic JSON serializer, but only if it is based on an industry standardized deterministic JSON serialization specification (if there is such a thing); depending a library without such spec can lead to variance/divergence if ported to different languages, or if the library breaks backwards compatibility for in future version, both would result in non-deterministic serialization. I do not want to tie Sidetree implementation to a particular version of a library indefinitely.

All is to say that I would rather not impose serialization rules that are unnecessary and keep things simple if possible. Since it is a non-issue for Sidetree in the foreseeable future, I will won't-fix this for now until it becomes an issue.

Will wait for your comments before closing.

from sidetree.

OR13 avatar OR13 commented on August 21, 2024

sorry for my delay. I think its totally fair to mark as won't fix. It may in practice never actually be an issue if users can not reconstructing objects locally, and are always working backwards from the contentID.

from sidetree.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.