digitalbazaar / cborld
A Javascript CBOR-LD processor for web browsers and Node.js apps.
Home Page: https://digitalbazaar.github.io/cbor-ld-spec/
License: BSD 3-Clause "New" or "Revised" License
Hi,
the current term-to-code dictionary implementation includes only the scoped contexts that were actually applied, indexed in the order of application?! This prevents generating static dictionaries for well-known contexts (or context sets), because the dictionary is tightly coupled to a document. It is also very hard to implement.
The goal is to create a dictionary mapping strings to codes.
An easy solution to implement is to collect the set of all property names (and perhaps string values as well) found in all included contexts, regardless of context order, occurrence, or usage, then sort it alphabetically, and that's it.
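The proposal above could be sketched roughly as follows. This is only an illustration of the idea, not cborld's implementation: the helper names, the starting code of 100, the step of 1 between codes, and the restriction to inline object contexts are all assumptions made for the example.

```javascript
// Hypothetical helpers, not part of cborld.
// Collect every term name in a context object, including terms declared
// inside scoped contexts, whether or not a document ever uses them.
function collectTerms(context, terms) {
  for (const [key, value] of Object.entries(context)) {
    if (key.startsWith('@')) {
      continue; // skip keywords such as @vocab or @version
    }
    terms.add(key);
    const scoped = value && typeof value === 'object' ? value['@context'] : null;
    if (scoped && typeof scoped === 'object' && !Array.isArray(scoped)) {
      collectTerms(scoped, terms);
    }
  }
}

// Build a term-to-code map that depends only on the set of contexts,
// never on their order or on the document being encoded.
function buildStaticDictionary(contexts, firstCode = 100) {
  const terms = new Set();
  for (const ctx of contexts) {
    collectTerms(ctx['@context'], terms);
  }
  const dictionary = new Map();
  // Default sort order (UTF-16 code units) stands in for "alphabetical".
  [...terms].sort().forEach((term, i) => dictionary.set(term, firstCode + i));
  return dictionary;
}

const first = {
  '@context': {
    type: '@type',
    y: {'@id': 'idy', '@context': {c: 'longitude'}}
  }
};
const second = {'@context': {name: 'https://schema.org/name'}};

const dictionary = buildStaticDictionary([first, second]);
console.log(dictionary.get('c'));    // 100
console.log(dictionary.get('type')); // 102
```

Because the terms are sorted before codes are assigned, passing the same contexts in a different order yields the same dictionary, which is what makes it reusable across documents.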
Currently, this library is Node-only, but we need it to work in the browser (without Buffer polyfill and other stuff).
(Depends on PR #17).
We need to:
fs, os and path.
cbor dependency to be isomorphic (to run in the browser); currently it's Node-only.
cbor-web variant uses Buffer polyfills; no good.
@ellipticoin/cbor is isomorphic, but is way too basic (does not do what we need).
Are we supposed to provide our own document loader as an option input to encode and decode? This line suggests that documentLoader is provided by this library, but this value appears to be undefined 🤔
Hi,
just an idea to improve the compression ratio. Consider this example:
{
"@context": {
"type": "@type"
}
}
{
"@context": "...",
"type": ["type-id1"]
}
the compressed output by this library is:
[{ 0: https...context.jsonld, 101: [type-id1] }]
101 says it's an array, so there is no need to use the CBOR array marker array(1). It could be compressed as
[{ 0: https...context.jsonld, 101: type-id1 }]
saving one byte.
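The one-byte saving can be checked by hand. The sketch below hand-encodes the string from the example with and without the surrounding one-element array; it is a tiny illustrative encoder for short strings only, not the CBOR library that cborld uses.

```javascript
// Minimal CBOR encoder for short text strings (fewer than 24 UTF-8 bytes).
function encodeShortText(text) {
  const utf8 = new TextEncoder().encode(text);
  if (utf8.length >= 24) {
    throw new RangeError('sketch only handles strings shorter than 24 bytes');
  }
  // Major type 3 (text string) with the length in the low five bits.
  return Uint8Array.of(0x60 | utf8.length, ...utf8);
}

// Wrap an already-encoded item in a one-element array: header byte 0x81.
function wrapInArrayOfOne(encodedItem) {
  return Uint8Array.of(0x81, ...encodedItem);
}

const bare = encodeShortText('type-id1');
const wrapped = wrapInArrayOfOne(bare);
console.log(bare.length, wrapped.length); // 9 10
```

The array(1) header is the single byte 0x81, so dropping it saves exactly one byte per single-valued array term.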
Possibly house it at https://digitalbazaar.github.io/cbor-ld-spec/
Hi,
the first three bytes generated by the library are 0xd90501 for compressed output, but the specification defines 0xd95001 (see Compressed CBOR-LD Buffer Algorithm).
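To see what the two prefixes mean, the sketch below decodes the tag number from each three-byte head. In CBOR, the initial byte 0xd9 announces a tag whose number follows in the next two bytes; the function name is made up for this example.

```javascript
// Decode a CBOR tag head whose tag number is stored in two bytes
// (initial byte 0xd9 = major type 6, additional information 25).
function readTwoByteTag(bytes) {
  if (bytes[0] !== 0xd9) {
    throw new Error('not a tag with a 2-byte argument');
  }
  return (bytes[1] << 8) | bytes[2];
}

const fromLibrary = readTwoByteTag(Uint8Array.of(0xd9, 0x05, 0x01));
const fromSpec = readTwoByteTag(Uint8Array.of(0xd9, 0x50, 0x01));
console.log(fromLibrary, fromSpec); // 1281 20481
```

The two byte sequences therefore denote different tag numbers (0x0501 versus 0x5001), so a decoder following one of them would not recognize output produced with the other.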
For example, a VC with multiple types, each of which brings in its own type-scoped context.
First, let me say that I am beyond excited for CBOR-LD, and this bug that I am hunting is almost certainly related to my relative inexperience combined with my excitement.
{
"@context": [
"https://www.w3.org/2018/credentials/v1",
"https://w3id.org/vaccination/v1"
],
"type": [
"VerifiableCredential",
"VaccinationCertificate"
],
"id": "urn:uvci:af5vshde843jf831j128fj",
"name": "COVID-19 Vaccination Certificate",
"description": "COVID-19 Vaccination Certificate",
"issuanceDate": "2019-12-03T12:19:52Z",
"expirationDate": "2029-12-03T12:19:52Z",
"issuer": "did:key:z6MkiY62766b1LJkExWMsM3QG4WtX7QpY823dxoYzr9qZvJ3",
"credentialSubject": {
"type": "VaccinationEvent",
"batchNumber": "1183738569",
"administeringCentre": "MoH",
"healthProfessional": "MoH",
"countryOfVaccination": "NZ",
"recipient": {
"type": "VaccineRecipient",
"givenName": "JOHN",
"familyName": "SMITH",
"gender": "Male",
"birthDate": "1958-07-17"
},
"vaccine": {
"type": "Vaccine",
"disease": "COVID-19",
"atcCode": "J07BX03",
"medicinalProductName": "COVID-19 Vaccine Moderna",
"marketingAuthorizationHolder": "Moderna Biotech"
}
},
"proof": {
"type": "Ed25519Signature2018",
"created": "2021-02-18T23:00:15Z",
"jws": "eyJhbGciOiJFZERTQSIsImI2NCI6ZmFsc2UsImNyaXQiOlsiYjY0Il19..vD_vXJCWdeGpN-qKHDIlzgGC0auRPcwp3O1sOI-gN8z3UD4pI0HO_77ob5KHhhU1ugLrrwrMsKv71mqHBn-dBg",
"proofPurpose": "assertionMethod",
"verificationMethod": "did:key:z6MkiY62766b1LJkExWMsM3QG4WtX7QpY823dxoYzr9qZvJ3#z6MkiY62766b1LJkExWMsM3QG4WtX7QpY823dxoYzr9qZvJ3"
}
}
In hex, all instances of the encoded “Base58DidURL” are prefixed with d840 in my library but not in yours.
As far as I can tell, this happens during CBOR encoding, not during the encoding mapping stage, since the encoded values are the same in both libraries.
For example:
issuer: did:key:z6MkiY62766b1LJkExWMsM3QG4WtX7QpY823dxoYzr9qZvJ3
proof.verificationMethod: did:key:z6MkiY62766b1LJkExWMsM3QG4WtX7QpY823dxoYzr9qZvJ3#z6MkiY62766b1LJkExWMsM3QG4WtX7QpY823dxoYzr9qZvJ3
I see the same encoded representation before the cbor encoding runs:
[
1025,
Uint8Array(34) [
237, 1, 60, 171, 80, 213, 89, 231,
138, 27, 171, 113, 132, 31, 217, 203,
217, 232, 2, 153, 144, 56, 104, 94,
248, 75, 126, 73, 13, 39, 23, 194,
177, 16
]
]
[
1025,
Uint8Array(34) [
237, 1, 60, 171, 80, 213, 89, 231,
138, 27, 171, 113, 132, 31, 217, 203,
217, 232, 2, 153, 144, 56, 104, 94,
248, 75, 126, 73, 13, 39, 23, 194,
177, 16
],
Uint8Array(34) [
237, 1, 60, 171, 80, 213, 89, 231,
138, 27, 171, 113, 132, 31, 217, 203,
217, 232, 2, 153, 144, 56, 104, 94,
248, 75, 126, 73, 13, 39, 23, 194,
177, 16
]
]
I worry that these changes may be the reason for this inconsistency:
https://github.com/digitalbazaar/cbor/blob/main/CHANGELOG.md
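For reference, the bytes d840 read as a CBOR tag head with tag number 64, which RFC 8746 registers for uint8 typed arrays. The sketch below hand-encodes a 34-byte value with and without that tag to show where the two extra bytes come from. Whether either library emits the tag for this reason is an assumption, and the key bytes are a placeholder.

```javascript
// Encode a byte string of 24..255 bytes: 0x58, one length byte, then the data.
function encodeByteString(bytes) {
  if (bytes.length < 24 || bytes.length > 255) {
    throw new RangeError('sketch only handles 24..255 byte strings');
  }
  return Uint8Array.of(0x58, bytes.length, ...bytes);
}

// Prefix an encoded item with a tag whose number fits in one byte (24..255).
function tagOneByte(tagNumber, encodedItem) {
  return Uint8Array.of(0xd8, tagNumber, ...encodedItem);
}

const toHex = bytes =>
  Array.from(bytes, b => b.toString(16).padStart(2, '0')).join('');

const key = new Uint8Array(34).fill(0xab); // placeholder for the 34 key bytes
const plain = encodeByteString(key);
const tagged = tagOneByte(64, plain);
console.log(toHex(plain).slice(0, 4));  // 5822
console.log(toHex(tagged).slice(0, 8)); // d8405822
```

If one CBOR encoder tags Uint8Array values and the other writes them as plain byte strings, the mapped values would match before encoding while the final bytes differ in exactly this way.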
The current code base was written to iterate on a number of intermediate formats to home in on the final result. This means that there is an intermediate format, used as a sort of abstract syntax tree, which is then converted to the final form. This probably results in close to twice as much processing, and even more memory usage, than necessary. Once the final algorithm for v1.0 is settled on, we will want to refactor the code base so that the intermediate format is optimized out. I believe the current algorithm only makes one pass over the documents and thus there is no need to keep previous state around. We should be able to do stream processing (since stream processing is possible in JSON-LD and because I don't remember having to implement jumping around when processing contexts or encoding/decoding).
Hi,
please try this example:
context:
{
"@context": {
"@vocab": "https://w3id.org/security#",
"value": { "@type": "multibase" }
}
}
document:
{
"@context": "https://.../0012-context.jsonld",
"value": "z4mAs9uHU16jR4xwPcbhHyRUc6BbaiJQE5MJwn3PCWkRXsriK9AMrQQMbjzG9XXFPNgngmQXHKUz23WRSu9jSxPCF"
}
the expected output is
[{ 0: https://..../0012-context.jsonld, 100: [0x7A,0xBC,0x24,0x3F, ... 65 bytes] }]
but got:
[{ 0: https://.../0012-context.jsonld, 100: z4mAs9uHU16jR4xwPcbhHyRUc6BbaiJQE5MJwn3PCWkRXsriK9AMrQQMbjzG9XXFPNgngmQXHKUz23WRSu9jSxPCF }]
We express the URI scheme encoding IDs (such as 1024 and 1025 for did:v1:nym and did:key) in multiple places. We should figure out a way to consolidate them and reference them through constants/static vars or similar. However, it's important not to make it so we have to import any decoder code when encoding and vice versa to enable builds for just encoding or just decoding.
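One possible shape for such consolidation is a data-only table that both the encoder and the decoder can reference without pulling in each other's code. The constant names below are hypothetical; only the two ID values come from the text above.

```javascript
// Hypothetical shared table: holds only data, no encoder or decoder logic,
// so an encode-only or decode-only build can reference it cheaply.
const SCHEME_TO_ID = Object.freeze({
  'did:v1:nym': 1024,
  'did:key': 1025
});

// Reverse table derived once from the forward table, for the decoding side.
const ID_TO_SCHEME = Object.freeze(Object.fromEntries(
  Object.entries(SCHEME_TO_ID).map(([scheme, id]) => [id, scheme])
));

console.log(SCHEME_TO_ID['did:key']); // 1025
console.log(ID_TO_SCHEME[1024]);      // did:v1:nym
```

Deriving the reverse table from the forward one keeps a single source of truth, so the two directions cannot drift apart.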
Feel free to copy any of the fixtures from here: https://github.com/transmute-industries/decentralized-cbor/tree/57d2f524a1b8927e6e19113b66890183a0c69871/src/__fixtures__/inputs
Some are failing currently.
Audit the code to look for places to make it more DRY. One example of repetitious code is the initialization of a new active context when the context stack is empty.
The current document loader does some bespoke on-disk caching. This should be reworked or moved outside of the core module as an optional module -- and its implementation should be replaced with something less custom, if we decide we want to keep using it. It is a premature optimization at this point and may create a number of foreseen issues (inability to invalidate caches, etc.) and unforeseen issues that other utility modules we could make use of may help us mitigate.
Looking at the coverage reports, there are many uses of f({value} = {}) or similar where a value is required for the code to work. Will those default paths ever be taken? If so, error handling is needed; if not, those default constructs can be removed.
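A small sketch of what the default path does in practice, and of one way to guard it. The function names are invented for the example and do not correspond to cborld functions.

```javascript
// The "= {}" default only prevents a destructuring error when the function is
// called without arguments; the required value is still undefined afterwards.
function describeUnchecked({value} = {}) {
  return `value is ${value}`;
}

// Same signature, but the missing value is reported at the call boundary.
function describeChecked({value} = {}) {
  if (value === undefined) {
    throw new TypeError('"value" is required.');
  }
  return `value is ${value}`;
}

console.log(describeUnchecked());         // value is undefined
console.log(describeChecked({value: 1})); // value is 1
```

Calling describeChecked() with no argument throws a TypeError immediately, instead of letting undefined travel further into the code.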
The algorithm for determining which codec to use when encoding/decoding a value is currently too simple and doesn't entirely take scoped terms into account. That is, when there is a redeclaration of an outer term, for example when @protected isn't used, the last definition wins when it shouldn't.
The value encode/decode codec selection algorithm needs to be updated to take into account where the encoder/decoder is in the parsing tree, as that can affect which codec is used for complex JSON-LD contexts.
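A rough way to picture position-dependent lookup is a stack of term definitions that follows the encoder through the document tree. The class below is a hypothetical sketch, not cborld code, and it ignores JSON-LD details such as @protected and context propagation.

```javascript
// Hypothetical scope stack; each entry holds the term definitions of one
// context, with the innermost active scope last.
class TermScopes {
  constructor() {
    this.stack = [];
  }

  push(definitions) {
    this.stack.push(definitions);
  }

  pop() {
    this.stack.pop();
  }

  // Search from the innermost scope outwards so that a scoped redefinition
  // shadows the outer term only while that scope is active.
  lookup(term) {
    for (let i = this.stack.length - 1; i >= 0; i--) {
      const definitions = this.stack[i];
      if (Object.prototype.hasOwnProperty.call(definitions, term)) {
        return definitions[term];
      }
    }
    return undefined;
  }
}

const scopes = new TermScopes();
scopes.push({a: '@type'});           // top-level context
console.log(scopes.lookup('a'));     // @type
scopes.push({a: '@id', b: '@type'}); // entering a property with a scoped context
console.log(scopes.lookup('a'));     // @id
scopes.pop();                        // leaving that property
console.log(scopes.lookup('a'));     // @type
```

With a structure like this, the definition used to pick a codec depends on where the value sits in the tree rather than on which definition was processed last.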
This value is used here but not registered in the spec:
_addRegistration(0x32, 'https://purl.imsglobal.org/spec/ob/v3p0/context.json');
If the id is stable enough to last forever, it should be registered in the spec. If not, then this lib and the spec need a new feature to support use of unregistered ids.
CBOR-LD is designed such that it expects valid JSON-LD to be passed to it for compression, however, we should have some better error messages both during compression and decompression when we encounter unexpected types, etc.
Many static and instance methods use destructuring for default argument values. Add tests for the default values and add checks (with an assert or assert-like library) to handle errors as needed.
Related:
#41
It appears the XsdDateTimeCodec encodes a valid date-time string as a Unix epoch value and then decodes it in accordance with ISO 8601. However, if the input string contains only a date component, e.g. '2021-02-22', then after encoding and decoding the returned value becomes 2021-02-22T00:00:00Z, so the round trip is not lossless.
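The behaviour described above can be reproduced with plain Date arithmetic. The sketch assumes the codec stores whole seconds since the epoch and decodes to an ISO 8601 string without fractional seconds; the function names are made up, and the round-trip check is one possible guard rather than the library's actual fix.

```javascript
// Assumed codec behaviour: whole seconds since the Unix epoch in,
// ISO 8601 date-time without fractional seconds out.
function toEpochSeconds(text) {
  return Math.floor(Date.parse(text) / 1000);
}

function fromEpochSeconds(seconds) {
  return new Date(seconds * 1000).toISOString().replace('.000Z', 'Z');
}

// Only treat a value as compressible when decoding reproduces it exactly.
function isLosslessAsEpoch(text) {
  const seconds = toEpochSeconds(text);
  return Number.isFinite(seconds) && fromEpochSeconds(seconds) === text;
}

console.log(fromEpochSeconds(toEpochSeconds('2021-02-22'))); // 2021-02-22T00:00:00Z
console.log(isLosslessAsEpoch('2021-02-22'));                // false
console.log(isLosslessAsEpoch('2019-12-03T12:19:52Z'));      // true
```

A codec could fall back to leaving the string uncompressed whenever such a check fails, which also covers values with fractional seconds.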
import { encode } from "@digitalbazaar/cborld";

// Custom loader: serves a minimal ActivityStreams context and rejects any other IRI.
const documentLoader = (iri: string) => {
  if (iri === "https://www.w3.org/ns/activitystreams") {
    return {
      document: {
        "@context": {
          "@version": 1.1,
          "@vocab": "https://www.w3.org/ns/activitystreams#",
        },
      },
    };
  }
  console.log(iri);
  throw new Error("iri not supported");
};

const jsonldDocument = {
  "@context": "https://www.w3.org/ns/activitystreams",
  type: "Note",
  summary: "CBOR-LD",
  content: "CBOR-LD is awesome!",
};

it("can compress linked data", async () => {
  const cborldBytes = await encode({ jsonldDocument, documentLoader });
  console.log(cborldBytes);
});
ERR_UNKNOWN_CBORLD_TERM: Unknown term 'content' was detected in the JSON-LD input.
Steps to reproduce:
npm i @digitalbazaar/[email protected]
https://github.com/digitalbazaar/cborld#encode-to-cbor-ld
On node v14.15.5
Appears to be thrown by your wrapper around cbor.... node_modules/@digitalbazaar/cbor/lib/util.js:1
The examples need to be fixed since the package no longer exports a documentLoader.
please fix this immediately, before it starts to make everyone cry about JSON-LD :)
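Until the examples are updated, a caller has to supply a loader of their own. The sketch below is a hypothetical in-memory loader that returns the { contextUrl, document, documentUrl } shape used by jsonld.js document loaders. It assumes encode and decode accept such a function through the documentLoader option, as the reports above suggest, and the context content is a placeholder rather than the real ActivityStreams context.

```javascript
// Hypothetical static loader: serves contexts from an in-memory map and
// fails loudly for anything that is not in the map.
function createStaticDocumentLoader(contexts) {
  return async function documentLoader(url) {
    if (!contexts.has(url)) {
      throw new Error(`Context "${url}" is not in the static context map.`);
    }
    return {contextUrl: null, document: contexts.get(url), documentUrl: url};
  };
}

const documentLoader = createStaticDocumentLoader(new Map([
  ['https://www.w3.org/ns/activitystreams', {
    // Placeholder content; a real map would hold the full context document.
    '@context': {'@vocab': 'https://www.w3.org/ns/activitystreams#'}
  }]
]));

documentLoader('https://www.w3.org/ns/activitystreams')
  .then(({documentUrl}) => console.log(documentUrl));
```

Serving contexts from a fixed map also avoids network access during encoding, at the cost of having to bundle every context a document may reference.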
I've tried running the examples, but can get neither encode nor decode to work.
I set up a new npm init and installed @digitalbazaar/cborld.
When I tried the first code,
import { encode } from '@digitalbazaar/cborld';
const jsonldDocument = {
'@context': 'https://www.w3.org/ns/activitystreams',
type: 'Note',
summary: 'CBOR-LD',
content: 'CBOR-LD is awesome!'
};
// encode a JSON-LD Javascript object into CBOR-LD bytes
const cborldBytes = await encode({ jsonldDocument, documentLoader });
console.log(cborldBytes);
Unfortunately, that complained about the use of import outside a module
^^^^^^
SyntaxError: Cannot use import statement outside a module
If I add "type": "module" to my package.json, I get past that error, only to get this one:
import { encode } from '@digitalbazaar/cborld';
^^^^^^
SyntaxError: Named export 'encode' not found. The requested module '@digitalbazaar/cborld' is a CommonJS module, which may not support all module.exports as named exports.
CommonJS modules can always be imported via the default export, for example using:
import pkg from '@digitalbazaar/cborld';
const { encode } = pkg;
FWIW, I tried this alternative import and it didn't work.
On a lark, I tried the "type": "module" hack in the cborld package.json and got a different error:
import { encode } from '@digitalbazaar/cborld';
^^^^^^
SyntaxError: The requested module '@digitalbazaar/cborld' does not provide an export named 'encode'
Any suggestions?
p.s.
This was on a windows bash shell, running node v16.123.1
The following LD document (VC) has trouble running through a CBOR encode mechanism:
{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://credreg.net/ctdlasn/schema/context/json"
  ],
  "type": [
    "VerifiableCredential"
  ],
  "issuer": "did:key:z6MkpYpkXs9u5ZzV3nij3AbdmPkwmiBQvxNeNFiRPvNz5ArP",
  "issuanceDate": "2022-05-02T11:03:53Z",
  "credentialSubject": {
    "id": "did:key:z6MkfwmZep5ZvkHfeXszxhxEuvkmGFRc8H9Nv9ZaQG4vhFzZ",
    "schema:hasCredential": {
      "type": "ceterms:MicroCredential",
      "ceterms:name": "Ogi ogi",
      "ceterms:description": "This is a proof that things work!",
      "ceterms:relatedAction": {
        "type": "ceterms:CredentialingAction",
        "ceterms:startDate": "2022-05-02T09:03:48.964Z",
        "ceterms:endDate": "2022-05-04T09:03:48.964Z"
      },
      "ceterms:subject": [
        {
          "type": "ceterms:CredentialAlignmentObject",
          "ceterms:targetNodeName": {
            "en-US": "Making sure you know Javascript"
          }
        }
      ]
    }
  }
}
It complains about the schema:hasCredential term, even though I can run this document through an issuance and verification loop using the same documentLoader.
You can run the code under the CBOR create function to test: https://github.com/vongohren/cbor-ld-test
Hi,
please look at this example:
document
{
"@context": "https://raw.githubusercontent.com/filip26/iridium-cbor-ld/main/src/test/resources/com/apicatalog/cborld/encoder/0025-context.jsonld",
"a": "Note",
"x": { "a": "x", "b": "y", "d": 106 },
"y": {"a": "x", "b": "y", "c": 102 }
}
context
{
"@context": {
"@vocab": "https://www.w3.org/ns/activitystreams#",
"y": {
"@id": "idy",
"@context": {
"a": "@type",
"b": "@id",
"c": "longitude"
}
},
"a": "@type",
"x": {
"@id": "idx",
"@context": {
"a": "@id",
"b": "@type",
"d": "latitude"
}
}
}
}
The output produced by the library is:
[{ 0: ...0025-context.jsonld,
100: Note,
102: { 100: 102, 106: y, 108: 106 },
104: { 100: x, 106: 104, 110: 102 } }]
where the term codes are
100: a, 102: x, 104: y, 106: b, 108: d, 110: c
Why are the @type values x.b: y and y.a: x not encoded?
By reverse engineering encoded examples produced by the library, I arrived at this result:
[{ 0: ...0025-context.jsonld,
100: Note,
102: { 100: 102, 106: 104, 108: 106 },
104: { 100: 102, 106: 104, 110: 102 } }]
Thank you!
Hi,
try this example:
{
"@context": [
"https://www.w3.org/2018/credentials/v1",
"https://www.w3.org/2018/credentials/examples/v1",
"https://w3id.org/security/suites/ed25519-2020/v1"
],
"id": "http://example.edu/credentials/3732",
"type": [
"VerifiableCredential",
"UniversityDegreeCredential"
],
"issuer": "https://example.edu/issuers/565049",
"issuanceDate": "2010-01-01T00:00:00Z",
"credentialSubject": {
"id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
"degree": {
"type": "BachelorDegree",
"name": "Bachelor of Science and Arts"
}
}
}
it fails with ERR_UNKNOWN_CBORLD_TERM: ... degree ... but degree is defined in https://www.w3.org/2018/credentials/examples/v1.
const jsonld = require('jsonld');
const cborld = require('@digitalbazaar/cborld');
// Use require here too, so the file does not mix CommonJS and ES module syntax.
const { contexts, resolvedContexts } = require('../__fixtures__');

it('can use jsonld.js documentLoader', async () => {
  const result = await jsonld.documentLoader(contexts.activitystreams);
  expect(result).toEqual(resolvedContexts.activitystreams);
});

it('can use cborld documentLoader', async () => {
  const result = await cborld.documentLoader(contexts.activitystreams);
  expect(result.toString()).toEqual(resolvedContexts.activitystreams.document);
});
{
"@context": "https://www.w3.org/ns/activitystreams",
"summary": "Object history",
"type": "Collection",
"totalItems": 2,
"items": [
{
"type": "Create",
"actor": "http://www.test.example/sally",
"object": "http://example.org/foo"
},
{
"type": "Like",
"actor": "http://www.test.example/joe",
"object": "http://example.org/foo"
}
]
}
Seeing
- Object {
- "actor": "http://www.test.example/sally",
- "object": "http://example.org/foo",
- "type": "Create",
+ Map {
+ 143 => 15,
+ 64 => "http://www.test.example/sally",
+ 112 => "http://example.org/foo",
},
- Object {
- "actor": "http://www.test.example/joe",
- "object": "http://example.org/foo",
- "type": "Like",
+ Map {
+ 143 => 33,
+ 64 => "http://www.test.example/joe",
+ 112 => "http://example.org/foo",
@dlongley said:
diagnose seems more like a log function ... perhaps the docs should reflect this. Not sure what to think about this pattern
We should put some thought into how to do logging from deep within the cborld library.
same issue reported on #29
@mattcollier wrote:
The default documentLoader is used for CLI operations. documentLoader issues are very popular in jsonld related repos. Seems like a word or two about that would be good in the README.md.
Since arrays are objects, the second conditional here will never be hit:
Lines 29 to 31 in 499759e
The order needs to be swapped and better test coverage added.
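The referenced lines are not reproduced here, so the following is only a generic sketch of the ordering problem with a made-up function name: because typeof reports 'object' for arrays, the array check has to run first.

```javascript
// Arrays satisfy typeof value === 'object', so the array branch must come
// before the generic object branch or it can never be reached.
function classify(value) {
  if (Array.isArray(value)) {
    return 'array';
  }
  if (value !== null && typeof value === 'object') {
    return 'object';
  }
  return typeof value;
}

console.log(classify([1, 2])); // array
console.log(classify({a: 1})); // object
console.log(classify('text')); // string
```

A test that passes an array, a plain object, and null through the function would cover all three branches.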
context: https://schema.org/version/latest/schemaorg-current-http.jsonld
input:
{
"@context": "http://schema.org",
"@type": "Person",
"name": "Brent",
"makesOffer": {
"@type": "Offer",
"priceSpecification": {
"@type": "UnitPriceSpecification",
"priceCurrency": "USD",
"price": "18000"
},
"itemOffered": {
"@type": "Car",
"name": "2009 Volkswagen Golf V GTI MY09 Direct-Shift Gearbox",
"description": "2009 Volkswagen Golf V GTI MY09 Direct-Shift Gearbox in perfect mechanical condition and low kilometres. It's impressive 2.0 litre turbo engine makes every drive a fun experience. Well looked after by one owner with full service history. It drives like new and has only done 50,000kms. (...)",
"image": "2009_Volkswagen_Golf_V_GTI_MY09.png",
"color": "Black",
"numberOfForwardGears": "6",
"vehicleEngine": {
"@type": "EngineSpecification",
"name": "4 cylinder Petrol Turbo Intercooled 2.0 L (1984 cc)"
}
}
}
}
The activitystream example in the readme does not make use of appContextMap at all.
https://github.com/digitalbazaar/cborld#encode-to-cbor-ld
This feature seems like dangerous complexity, can it be safely removed?
When using an alias for @id, the optimized codecs, such as Base58DidUrlCodec, are not used. Example:
{ "@context": "....", "id": "did:key:..." }
Fix may be in lib/codec.js _generateTermEncodingMap.
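One way to think about the fix is to resolve a term to the keyword it aliases before choosing a codec. The helpers below are hypothetical and are not taken from lib/codec.js; they only illustrate the alias resolution step on a flat map of term definitions.

```javascript
// Hypothetical helper: return the keyword a term stands for, or null if the
// term is not a keyword alias. Accepts string and expanded term definitions.
function resolveKeyword(term, termDefinitions) {
  if (term.startsWith('@')) {
    return term; // already a keyword
  }
  const definition = termDefinitions[term];
  const target = typeof definition === 'string' ?
    definition : definition && definition['@id'];
  return typeof target === 'string' && target.startsWith('@') ? target : null;
}

// An @id-specific codec would be selected for "@id" and for any alias of it.
function usesIdCodec(term, termDefinitions) {
  return resolveKeyword(term, termDefinitions) === '@id';
}

const definitions = {id: '@id', type: '@type', name: 'https://schema.org/name'};
console.log(usesIdCodec('id', definitions));   // true
console.log(usesIdCodec('@id', definitions));  // true
console.log(usesIdCodec('name', definitions)); // false
```

With this step in place, a value under "id" would be routed to the same codec selection as a value under "@id".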