digitalbazaar / cborld

A Javascript CBOR-LD processor for web browsers and Node.js apps.

Home Page: https://digitalbazaar.github.io/cbor-ld-spec/

License: BSD 3-Clause "New" or "Revised" License

JavaScript 100.00%
jsonld cbor cborld linked-data

cborld's Issues

Term to Code dictionary includes only applied scoped contexts

Hi,
The current term-to-code dictionary implementation includes only the scoped contexts that are actually applied, indexed in application order. This prevents generating static dictionaries for well-known contexts (or context sets), because the dictionary is tightly coupled to a single document. It is also very hard to implement.

The goal is to create a dictionary mapping strings to codes.

An easy solution to implement is to collect the set of all property names (and perhaps string values as well) found in all included contexts, regardless of context order, occurrence, or usage, and then sort it alphabetically. That's it.
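A minimal sketch of the proposal above, assuming the contexts are already loaded as plain objects. The names `collectTerms` and `buildStaticDictionary`, the starting code 100, and the step of 2 are illustrative assumptions (the numbers mirror term codes quoted elsewhere in these issues); none of this is the library's API.

```javascript
// Hypothetical helper: gather every term name from a context object,
// descending into scoped contexts declared inside term definitions.
function collectTerms(context, terms = new Set()) {
  if (!context || typeof context !== 'object') {
    return terms;
  }
  for (const [key, definition] of Object.entries(context)) {
    if (key.startsWith('@')) {
      continue; // skip keywords such as @vocab or @version
    }
    terms.add(key);
    if (definition && typeof definition === 'object') {
      collectTerms(definition['@context'], terms);
    }
  }
  return terms;
}

// Hypothetical helper: assign codes to the sorted term list, independent of
// context order, occurrence, or usage in any particular document.
function buildStaticDictionary(contexts, {firstCode = 100, step = 2} = {}) {
  const terms = new Set();
  for (const doc of contexts) {
    collectTerms(doc['@context'], terms);
  }
  // default sort() orders strings by UTF-16 code units
  const sorted = [...terms].sort();
  return new Map(sorted.map((term, i) => [term, firstCode + i * step]));
}
```

Because the result depends only on the set of contexts, the same dictionary could be generated once and shipped alongside a well-known context set.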

Convert 'cborld' library to isomorphic (to run in browser)

Currently, this library is Node-only, but we need it to work in the browser (without Buffer polyfill and other stuff).

(Depends on PR #17).

We need to:

  • Remove the Node-only fs caching documentLoader (have it throw a 'Not Implemented' error just like the browser loader). This will remove the need for polyfills for fs, os and path.
  • Convert all usages of Buffer to Uint8Array usages.
  • (main blocker) Convert the cbor dependency to be isomorphic (to run in the browser); currently it's Node-only.
    • the cbor-web variant uses Buffer polyfills; no good.
    • @ellipticoin/cbor is isomorphic, but is way too basic (does not do what we need)
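As a rough illustration of the Buffer-to-Uint8Array step in the list above, the following sketch shows Buffer-free equivalents of three common operations. The helper names are hypothetical; only standard `Uint8Array` and `TextEncoder` APIs are used, which are available in both browsers and Node.js.

```javascript
// Concatenate byte arrays without Buffer.concat().
function concatBytes(chunks) {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

// Hex-encode bytes without Buffer#toString('hex').
function toHex(bytes) {
  return Array.from(bytes, b => b.toString(16).padStart(2, '0')).join('');
}

// UTF-8 encode a string without Buffer.from(text, 'utf8').
function utf8Bytes(text) {
  return new TextEncoder().encode(text);
}
```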

Unable to retrieve documentLoader

Are we supposed to provide our own document loader as an option input to encode and decode? This line suggests that documentLoader is provided by this library, but this value appears to be undefined 🤔

Single item array could be compressed as a single value saving one byte

Hi,
Just an idea to improve the compression ratio. Consider this example:

{
  "@context": {
    "type": "@type"
  }
}
{
  "@context": "...",
  "type": ["type-id1"]
}

The compressed output produced by this library is:

[{ 0: https...context.jsonld, 101: [type-id1] }]

The term code 101 already says it's an array, so there is no need to use the CBOR array marker array(1). It could be compressed as [{ 0: https...context.jsonld, 101: type-id1 }], saving one byte.
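The idea above can be sketched as a pair of helpers, assuming (as the issue states) that an odd term code such as 101 already signals an array value. The function names are hypothetical, and nested arrays are deliberately left wrapped so the round trip stays unambiguous.

```javascript
// Assumption taken from the issue: an odd term code marks an array value.
const isArrayCode = code => code % 2 === 1;

// Encoder side: drop the array wrapper when it holds exactly one item.
function packValue(code, value) {
  if (isArrayCode(code) && Array.isArray(value) && value.length === 1 &&
    !Array.isArray(value[0])) {
    return value[0];
  }
  return value;
}

// Decoder side: restore the wrapper for array codes.
function unpackValue(code, value) {
  if (isArrayCode(code) && !Array.isArray(value)) {
    return [value];
  }
  return value;
}
```

With these helpers, `packValue(101, ['type-id1'])` yields the bare string and `unpackValue(101, 'type-id1')` restores the single-item array.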

Interoperability issue with vanilla cbor for base58didurl codec

First, let me say that I am beyond excited for CBOR-LD, and this bug that I am hunting is almost certainly related to my relative inexperience combined with my excitement.

{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://w3id.org/vaccination/v1"
  ],
  "type": [
    "VerifiableCredential",
    "VaccinationCertificate"
  ],
  "id": "urn:uvci:af5vshde843jf831j128fj",
  "name": "COVID-19 Vaccination Certificate",
  "description": "COVID-19 Vaccination Certificate",
  "issuanceDate": "2019-12-03T12:19:52Z",
  "expirationDate": "2029-12-03T12:19:52Z",
  "issuer": "did:key:z6MkiY62766b1LJkExWMsM3QG4WtX7QpY823dxoYzr9qZvJ3",
  "credentialSubject": {
    "type": "VaccinationEvent",
    "batchNumber": "1183738569",
    "administeringCentre": "MoH",
    "healthProfessional": "MoH",
    "countryOfVaccination": "NZ",
    "recipient": {
      "type": "VaccineRecipient",
      "givenName": "JOHN",
      "familyName": "SMITH",
      "gender": "Male",
      "birthDate": "1958-07-17"
    },
    "vaccine": {
      "type": "Vaccine",
      "disease": "COVID-19",
      "atcCode": "J07BX03",
      "medicinalProductName": "COVID-19 Vaccine Moderna",
      "marketingAuthorizationHolder": "Moderna Biotech"
    }
  },
  "proof": {
    "type": "Ed25519Signature2018",
    "created": "2021-02-18T23:00:15Z",
    "jws": "eyJhbGciOiJFZERTQSIsImI2NCI6ZmFsc2UsImNyaXQiOlsiYjY0Il19..vD_vXJCWdeGpN-qKHDIlzgGC0auRPcwp3O1sOI-gN8z3UD4pI0HO_77ob5KHhhU1ugLrrwrMsKv71mqHBn-dBg",
    "proofPurpose": "assertionMethod",
    "verificationMethod": "did:key:z6MkiY62766b1LJkExWMsM3QG4WtX7QpY823dxoYzr9qZvJ3#z6MkiY62766b1LJkExWMsM3QG4WtX7QpY823dxoYzr9qZvJ3"
  }
}

In hex, all instances of the encoded “Base58DidURL” are prefixed with d840 in my library but not in yours.

As far as I can tell, this happens during CBOR encoding, not during the encoding mapping stage, since the encoded values are the same in both libraries.

For example:

issuer: did:key:z6MkiY62766b1LJkExWMsM3QG4WtX7QpY823dxoYzr9qZvJ3
proof.verificationMethod did:key:z6MkiY62766b1LJkExWMsM3QG4WtX7QpY823dxoYzr9qZvJ3#z6MkiY62766b1LJkExWMsM3QG4WtX7QpY823dxoYzr9qZvJ3

I see the same encoded representation before the cbor encoding runs:

[
  1025,
  Uint8Array(34) [
    237,   1,  60, 171,  80, 213,  89, 231,
    138,  27, 171, 113, 132,  31, 217, 203,
    217, 232,   2, 153, 144,  56, 104,  94,
    248,  75, 126,  73,  13,  39,  23, 194,
    177,  16
  ]
]
[
  1025,
  Uint8Array(34) [
    237,   1,  60, 171,  80, 213,  89, 231,
    138,  27, 171, 113, 132,  31, 217, 203,
    217, 232,   2, 153, 144,  56, 104,  94,
    248,  75, 126,  73,  13,  39,  23, 194,
    177,  16
  ],
  Uint8Array(34) [
    237,   1,  60, 171,  80, 213,  89, 231,
    138,  27, 171, 113, 132,  31, 217, 203,
    217, 232,   2, 153, 144,  56, 104,  94,
    248,  75, 126,  73,  13,  39,  23, 194,
    177,  16
  ]
]

I worry that these changes may be the reason for this inconsistency:

https://github.com/digitalbazaar/cbor/blob/main/CHANGELOG.md
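As a reading aid for the d840 prefix reported in this issue, here is a small sketch that decodes the head of a CBOR data item from hex. Per the CBOR head layout (RFC 8949), `d8 40` is major type 6 (a tag) with tag number 64, which RFC 8746 registers for uint8 typed arrays, and `58 22` is a byte string of length 34. The function name is hypothetical, and the sketch does not determine which library's output is correct.

```javascript
// Decode the head (major type and argument) of the first CBOR item in a
// hex string. Only short arguments are handled in this sketch.
function decodeCborHead(hex) {
  const bytes = hex.match(/../g).map(pair => parseInt(pair, 16));
  const majorType = bytes[0] >> 5; // top 3 bits
  const additional = bytes[0] & 0x1f; // low 5 bits
  let argument;
  if (additional < 24) {
    argument = additional; // value stored directly in the initial byte
  } else if (additional === 24) {
    argument = bytes[1]; // one-byte argument follows
  } else if (additional === 25) {
    argument = (bytes[1] << 8) | bytes[2]; // two-byte argument follows
  } else {
    throw new Error('Longer or indefinite arguments are omitted here.');
  }
  return {majorType, argument};
}
```

Under that reading, a tagged 34-byte value would begin `d8 40 58 22`, while an untagged one would begin `58 22`.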

Refactor to do encoding/decoding without re-encoding twice.

The current code base was written to iterate on a number of intermediate formats to home in on the final result. This means that there is an intermediate format, used as a sort of abstract syntax tree, which is then converted to the final form. This probably results in close to twice as much processing and even more memory usage than necessary. Once the final algorithm for v1.0 is settled on, we will want to refactor the code base so that the intermediate format is optimized out. I believe the current algorithm only makes one pass over the documents, and thus there is no need to keep previous state around. We should be able to do stream processing (since stream processing is possible in JSON-LD and because I don't remember having to implement jumping around when processing contexts or encoding/decoding).

Multibase term value is not properly compressed when using @vocab

Hi,
please try this example:
context:

{
    "@context": {
        "@vocab": "https://w3id.org/security#",
        "value": { "@type": "multibase" }
    }
}

document:

{
    "@context": "https://.../0012-context.jsonld",
    "value": "z4mAs9uHU16jR4xwPcbhHyRUc6BbaiJQE5MJwn3PCWkRXsriK9AMrQQMbjzG9XXFPNgngmQXHKUz23WRSu9jSxPCF"
}

the expected output is

[{ 0: https://..../0012-context.jsonld, 100: [0x7A,0xBC,0x24,0x3F, ... 65 bytes] }]

but got:

[{ 0: https://.../0012-context.jsonld, 100: z4mAs9uHU16jR4xwPcbhHyRUc6BbaiJQE5MJwn3PCWkRXsriK9AMrQQMbjzG9XXFPNgngmQXHKUz23WRSu9jSxPCF }]
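To illustrate the expected output above, here is a minimal sketch that turns a `z`-prefixed (base58btc) multibase string into bytes, keeping the prefix as a leading 0x7A byte as the expected output shows. The function names are hypothetical and this is not the library's codec; it only covers the `z` prefix.

```javascript
const BASE58_ALPHABET =
  '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';

// Decode a base58btc string into bytes using BigInt arithmetic.
function base58Decode(text) {
  let value = 0n;
  for (const char of text) {
    const digit = BASE58_ALPHABET.indexOf(char);
    if (digit === -1) {
      throw new Error(`Invalid base58 character: ${char}`);
    }
    value = value * 58n + BigInt(digit);
  }
  const bytes = [];
  while (value > 0n) {
    bytes.unshift(Number(value & 0xffn));
    value >>= 8n;
  }
  // each leading '1' encodes one leading zero byte
  for (const char of text) {
    if (char !== '1') {
      break;
    }
    bytes.unshift(0);
  }
  return Uint8Array.from(bytes);
}

// Convert a 'z' multibase string to [0x7A, ...decoded bytes].
function multibaseToBytes(value) {
  if (typeof value !== 'string' || value[0] !== 'z') {
    throw new Error('Only the base58btc ("z") prefix is handled here.');
  }
  const decoded = base58Decode(value.slice(1));
  const out = new Uint8Array(decoded.length + 1);
  out[0] = 0x7a; // ASCII 'z', the multibase prefix
  out.set(decoded, 1);
  return out;
}
```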

Consolidate how URI scheme encoding IDs are referenced

We express the URI scheme encoding IDs (such as 1024 and 1025 for did:v1:nym and did:key) in multiple places. We should figure out a way to consolidate them and reference them through constants/static vars or similar. However, it's important not to make it so we have to import any decoder code when encoding and vice versa to enable builds for just encoding or just decoding.
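One possible shape for this, sketched under assumptions: a small constants module that imports no codec code, so both the encoder and the decoder can depend on it without pulling in each other. The IDs 1024 and 1025 come from this issue; the exact prefix strings and all names below are hypothetical.

```javascript
// Hypothetical shared constants; this module would import nothing else.
const URL_SCHEME_IDS = Object.freeze({
  'did:v1:nym:': 1024,
  'did:key:': 1025
});

// Reverse table for decoders, derived so each ID is declared only once.
const URL_SCHEMES_BY_ID = new Map(
  Object.entries(URL_SCHEME_IDS).map(([prefix, id]) => [id, prefix]));

// Encoder-side lookup: find the scheme ID for a URL, if any.
function schemeIdFor(url) {
  for (const [prefix, id] of Object.entries(URL_SCHEME_IDS)) {
    if (url.startsWith(prefix)) {
      return id;
    }
  }
  return undefined;
}
```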

Make code more DRY

Audit the code to look for places to make it more DRY. One example of repetitious code is the initialization of a new active context when the context stack is empty.

Replace/rework node.js document loader

The current document loader does some bespoke on-disk caching. This should be reworked or moved outside of the core module as an optional module -- and its implementation should be replaced with something less custom, if we decide we want to keep using it. It is a premature optimization at this point and may create a number of foreseen issues (inability to invalidate caches, etc.) and unforeseen issues that other utility modules we could make use of may help us mitigate.

Fixup default argument usage.

Looking at the coverage reports, there are many uses of f({value} = {}) or similar where a value is required for the code to work. Will those default paths ever be taken? If so, error handling is needed; if not, those default constructs can be removed.
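The two options can be illustrated with a small sketch. The function names are hypothetical; the behavior shown is standard JavaScript destructuring.

```javascript
// Option 1: keep the `= {}` default and add explicit error handling.
// Calling encodeWithDefault() does not fail at destructuring time; `value`
// is simply undefined, so the check below is what reports the problem.
function encodeWithDefault({value} = {}) {
  if (value === undefined) {
    throw new TypeError('"value" is required.');
  }
  return String(value);
}

// Option 2: remove the default. Calling encodeWithoutDefault() with no
// argument throws a generic TypeError from the destructuring itself.
function encodeWithoutDefault({value}) {
  return String(value);
}
```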

Applying scoped term codecs fails for redeclarations

The algorithm for determining which codec to use when encoding/decoding a value is currently too simple and doesn't entirely take scoped terms into account. That is, when there is a redeclaration of an outer term, for example when @protected isn't used, last defined wins when it shouldn't.

The value encode/decode codec selection algorithm needs to be updated to take where the encoder/decoder is in the parsing tree into account as that can affect which codec is used for complex JSON-LD contexts.
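A rough sketch of position-aware lookup, assuming term definitions are tracked on a stack that follows the encoder/decoder through the tree. The class name is hypothetical, and the sketch ignores `@protected`, `@propagate`, and type-scoped contexts, which a real implementation would need to consider.

```javascript
// Each scope maps term -> definition; inner scopes shadow outer ones only
// while the processor is inside the subtree that introduced them.
class TermScopes {
  constructor() {
    this.stack = [];
  }

  // Enter a subtree that applies a (scoped) context.
  push(definitions) {
    this.stack.push(new Map(Object.entries(definitions)));
  }

  // Leave the subtree; its redeclarations stop applying.
  pop() {
    this.stack.pop();
  }

  // Resolve from the innermost scope outward.
  resolve(term) {
    for (let i = this.stack.length - 1; i >= 0; --i) {
      if (this.stack[i].has(term)) {
        return this.stack[i].get(term);
      }
    }
    return undefined;
  }
}
```

With this shape, a term redeclared inside a property-scoped context affects codec selection only within that property's value, instead of "last defined wins" globally.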

Ensure codec values are registered in the spec.

This value is used here but not registered in the spec:

_addRegistration(0x32, 'https://purl.imsglobal.org/spec/ob/v3p0/context.json');

If the id is stable enough to last forever, it should be registered in the spec. If not, then this lib and the spec need a new feature to support use of unregistered ids.

Improve error reporting

CBOR-LD is designed to expect valid JSON-LD as input for compression. However, we should have better error messages during both compression and decompression when we encounter unexpected types, etc.
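One way such messages could look, sketched with a stand-in error class. The class name, the `ERR_UNSUPPORTED_TYPE` code, and the `path` option are hypothetical and only illustrate attaching a machine-readable code and a location to the error.

```javascript
// Stand-in error type for this sketch; the library's own class may differ.
class SketchCborldError extends Error {
  constructor(code, message, {path = []} = {}) {
    super(`${code}: ${message}`);
    this.name = 'SketchCborldError';
    this.code = code;
    this.path = path;
  }
}

// Reject values that cannot appear in JSON, reporting where they occurred.
function assertJsonValue(value, path = []) {
  let type = typeof value;
  if (value === null) {
    type = 'null';
  } else if (Array.isArray(value)) {
    type = 'array';
  }
  const allowed = ['string', 'number', 'boolean', 'object', 'array', 'null'];
  if (!allowed.includes(type)) {
    throw new SketchCborldError(
      'ERR_UNSUPPORTED_TYPE',
      `Unsupported value of type '${type}' at '${path.join('.')}'.`,
      {path});
  }
}
```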

Add tests and checks for default values.

Many static and instance methods use destructuring for default argument values. Add tests for the default values and add checks (with an assert or assert-like library) to handle errors as needed.

Related:
#41

`@vocab` not implemented

import { encode } from "@digitalbazaar/cborld";

const documentLoader = (iri: string) => {
  if (iri === "https://www.w3.org/ns/activitystreams") {
    return {
      document: {
        "@context": {
          "@version": 1.1,
          "@vocab": "https://www.w3.org/ns/activitystreams#",
        },
      },
    };
  }
  console.log(iri);
  throw new Error("iri not supported");
};

const jsonldDocument = {
  "@context": "https://www.w3.org/ns/activitystreams",
  type: "Note",
  summary: "CBOR-LD",
  content: "CBOR-LD is awesome!",
};

it("can compress linked data", async () => {
  const cborldBytes = await encode({ jsonldDocument, documentLoader });
  console.log(cborldBytes);
});

ERR_UNKNOWN_CBORLD_TERM: Unknown term 'content' was detected in the JSON-LD input.
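For context on why `content` is a valid term here even though the context defines no terms explicitly, the following sketch shows a simplified version of JSON-LD's `@vocab` fallback. The function name is hypothetical and real JSON-LD expansion handles many more cases (compact IRIs, keyword aliases, scoped contexts); how CBOR-LD should assign codes to such terms is not addressed by this sketch.

```javascript
// Simplified term expansion: explicit definitions first, then @vocab.
function expandTerm(context, term) {
  const definition = context[term];
  if (typeof definition === 'string') {
    return definition;
  }
  if (definition && typeof definition['@id'] === 'string') {
    return definition['@id'];
  }
  // fall back to the vocabulary mapping for plain (non-IRI-like) terms
  if (typeof context['@vocab'] === 'string' && !term.includes(':')) {
    return context['@vocab'] + term;
  }
  return undefined;
}
```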

Fix README examples.

The examples need to be fixed since the package no longer exports a documentLoader.
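Until the README is updated, a caller-supplied loader could look roughly like the sketch below. The helper name is hypothetical; the returned object follows the `{contextUrl, documentUrl, document}` shape used by jsonld.js document loaders, and the `encode({jsonldDocument, documentLoader})` call in the comment is the one shown in the examples in these issues.

```javascript
// Hypothetical helper: build a documentLoader from an in-memory Map of
// URL -> context document, so no network or file system access is needed.
function createStaticDocumentLoader(contexts) {
  return async function documentLoader(url) {
    if (!contexts.has(url)) {
      throw new Error(`Context not found in static map: ${url}`);
    }
    return {contextUrl: null, documentUrl: url, document: contexts.get(url)};
  };
}

// Usage, assuming `encode` has been imported from the package:
//   const documentLoader = createStaticDocumentLoader(new Map([[url, doc]]));
//   const cborldBytes = await encode({jsonldDocument, documentLoader});
```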

Examples don't work with node v16.123.1

I've tried running the examples, but can get neither encode nor decode to work.

I set up a new npm init and installed @digitalbazaar/cborld.

When I tried the first code,

import { encode } from '@digitalbazaar/cborld';

const jsonldDocument = {
  '@context': 'https://www.w3.org/ns/activitystreams',
  type: 'Note',
  summary: 'CBOR-LD',
  content: 'CBOR-LD is awesome!'
};

// encode a JSON-LD Javascript object into CBOR-LD bytes
const cborldBytes = await encode({ jsonldDocument, documentLoader });

console.log(cborldBytes);

Unfortunately, that complained about the use of import outside a module:

^^^^^^

SyntaxError: Cannot use import statement outside a module

If I add "type": "module" to my package.json, I get past that error, only to get this one:

import { encode } from '@digitalbazaar/cborld';
         ^^^^^^
SyntaxError: Named export 'encode' not found. The requested module '@digitalbazaar/cborld' is a CommonJS module, which may not support all module.exports as named exports.
CommonJS modules can always be imported via the default export, for example using:

import pkg from '@digitalbazaar/cborld';
const { encode } = pkg;

FWIW, I tried this alternative import and it didn't work.

On a lark, I tried the "type": "module" hack in the cborld package.json and got a different error:

import { encode } from '@digitalbazaar/cborld';
         ^^^^^^
SyntaxError: The requested module '@digitalbazaar/cborld' does not provide an export named 'encode'

Any suggestions?

p.s.
This was on a windows bash shell, running node v16.123.1

Issues with loading certain terms

The following LD document (a VC) fails when run through the CBOR-LD encode mechanism:

{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://credreg.net/ctdlasn/schema/context/json"
  ],
  "type": [
    "VerifiableCredential"
  ],
  "issuer": "did:key:z6MkpYpkXs9u5ZzV3nij3AbdmPkwmiBQvxNeNFiRPvNz5ArP",
  "issuanceDate": "2022-05-02T11:03:53Z",
  "credentialSubject": {
    "id": "did:key:z6MkfwmZep5ZvkHfeXszxhxEuvkmGFRc8H9Nv9ZaQG4vhFzZ",
    "schema:hasCredential": {
      "type": "ceterms:MicroCredential",
      "ceterms:name": "Ogi ogi",
      "ceterms:description": "This is a proof that things work!",
      "ceterms:relatedAction": {
        "type": "ceterms:CredentialingAction",
        "ceterms:startDate": "2022-05-02T09:03:48.964Z",
        "ceterms:endDate": "2022-05-04T09:03:48.964Z"
      },
      "ceterms:subject": [
        {
          "type": "ceterms:CredentialAlignmentObject",
          "ceterms:targetNodeName": {
            "en-US": "Making sure you know Javascript"
          }
        }
      ]
    }
  }
}

It complains about the schema:hasCredential term, even though I can run the same document through an issuance and verification loop using the same documentLoader.

You can run the code under the CBOR create function to test: https://github.com/vongohren/cbor-ld-test

Scoped Contexts Processing Algorithm

Hi,
please look at this example:

document

{
  "@context": "https://raw.githubusercontent.com/filip26/iridium-cbor-ld/main/src/test/resources/com/apicatalog/cborld/encoder/0025-context.jsonld",
  "a": "Note",
  "x": { "a": "x", "b": "y", "d": 106 },
  "y": {"a": "x", "b": "y", "c": 102 }
}

context

{
  "@context": {
    "@vocab": "https://www.w3.org/ns/activitystreams#",
    "y": {
      "@id": "idy",
      "@context": {
        "a": "@type",
        "b": "@id",
        "c": "longitude"
      }
    },
    "a": "@type",
    "x": {
      "@id": "idx",
      "@context": {
        "a": "@id",
        "b": "@type",
        "d": "latitude"
      }
    }
  }
}

The output produced by the library is:

[{ 0: ...0025-context.jsonld, 
  100: Note, 
  102: { 100: 102, 106: y, 108: 106 }, 
  104: { 100: x, 106: 104, 110: 102 } }]

where

100: a, 102: x, 104: y, 106: b, 108: d, 110: c

Why are the @type values x.b: y and y.a: x not encoded?

Using encoded examples produced by the library as a source for reverse engineering, I got this result:

[{ 0: ...0025-context.jsonld, 
   100: Note, 
    102: { 100: 102, 106: 104, 108: 106 }, 
    104: { 100: 102, 106: 104, 110: 102 } }]

Thank you!

Encoding fails with undefined term

Hi,
try this example:

{
    "@context": [
        "https://www.w3.org/2018/credentials/v1",
        "https://www.w3.org/2018/credentials/examples/v1",
        "https://w3id.org/security/suites/ed25519-2020/v1"
    ],
    "id": "http://example.edu/credentials/3732",
    "type": [
        "VerifiableCredential",
        "UniversityDegreeCredential"
    ],
    "issuer": "https://example.edu/issuers/565049",
    "issuanceDate": "2010-01-01T00:00:00Z",
    "credentialSubject": {
        "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
        "degree": {
            "type": "BachelorDegree",
            "name": "Bachelor of Science and Arts"
        }
    }
}

It fails with ERR_UNKNOWN_CBORLD_TERM: ... degree ... but degree is defined in https://www.w3.org/2018/credentials/examples/v1.

documentLoader interface should not change...

const jsonld = require('jsonld');
const cborld = require('@digitalbazaar/cborld');

import { contexts, resolvedContexts } from '../__fixtures__';

it('can use jsonld.js documentLoader', async () => {
  const result = await jsonld.documentLoader(contexts.activitystreams);
  expect(result).toEqual(resolvedContexts.activitystreams);
});

it('can use cborld documentLoader', async () => {
  const result = await cborld.documentLoader(contexts.activitystreams);
  expect(result.toString()).toEqual(resolvedContexts.activitystreams.document);
});

Codec error handling activity stream collection

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "summary": "Object history",
  "type": "Collection",
  "totalItems": 2,
  "items": [
    {
      "type": "Create",
      "actor": "http://www.test.example/sally",
      "object": "http://example.org/foo"
    },
    {
      "type": "Like",
      "actor": "http://www.test.example/joe",
      "object": "http://example.org/foo"
    }
  ]
}

Seeing

    -     Object {
    -       "actor": "http://www.test.example/sally",
    -       "object": "http://example.org/foo",
    -       "type": "Create",
    +     Map {
    +       143 => 15,
    +       64 => "http://www.test.example/sally",
    +       112 => "http://example.org/foo",
          },
    -     Object {
    -       "actor": "http://www.test.example/joe",
    -       "object": "http://example.org/foo",
    -       "type": "Like",
    +     Map {
    +       143 => 33,
    +       64 => "http://www.test.example/joe",
    +       112 => "http://example.org/foo",

Add documentation for documentLoader

@mattcollier wrote:

The default documentLoader is used for CLI operations. documentLoader issues are very popular in jsonld related repos. Seems like a word or two about that would be good in the README.md.

CborldError: Unknown term 'name' was detected in JSON-LD input. Error: ERR_UNKNOWN_CBORLD_TERM: Unknown term 'name' was detected in JSON-LD input.

context: https://schema.org/version/latest/schemaorg-current-http.jsonld
input:

{
  "@context": "http://schema.org",
  "@type": "Person",
  "name": "Brent",
  "makesOffer": {
    "@type": "Offer",
    "priceSpecification": {
      "@type": "UnitPriceSpecification",
      "priceCurrency": "USD",
      "price": "18000"
    },
    "itemOffered": {
      "@type": "Car",
      "name": "2009 Volkswagen Golf V GTI MY09 Direct-Shift Gearbox",
      "description": "2009 Volkswagen Golf V GTI MY09 Direct-Shift Gearbox in perfect mechanical condition and low kilometres. It's impressive 2.0 litre turbo engine makes every drive a fun experience. Well looked after by one owner with full service history. It drives like new and has only done 50,000kms. (...)",
      "image": "2009_Volkswagen_Golf_V_GTI_MY09.png",
      "color": "Black",
      "numberOfForwardGears": "6",
      "vehicleEngine": {
        "@type": "EngineSpecification",
        "name": "4 cylinder Petrol Turbo Intercooled 2.0 L (1984 cc)"
      }
    }
  }
}

Handle codecs for aliased `@id`.

When using an alias for @id, the optimized codecs, such as Base58DidUrlCodec, are not used. Example:

{ "@context": "....", "id": "did:key:..." }

Fix may be in lib/codec.js _generateTermEncodingMap.
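As a starting point for such a fix, a sketch of how aliases of a keyword could be detected in a context, so that values of aliased terms such as `id` can be routed to the same codecs as `@id`. The function name is hypothetical and this is not the code in lib/codec.js.

```javascript
// Collect the terms that a context declares as aliases of a keyword,
// e.g. {"id": "@id"} or the expanded form {"id": {"@id": "@id"}}.
function findKeywordAliases(context, keyword) {
  const aliases = [];
  for (const [term, definition] of Object.entries(context)) {
    const target = typeof definition === 'string' ?
      definition : definition && definition['@id'];
    if (target === keyword && term !== keyword) {
      aliases.push(term);
    }
  }
  return aliases;
}
```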
