moleike / haskell-jsonnet

๐ŸŽ Haskell implementation of Jsonnet

Home Page: https://hackage.haskell.org/package/jsonnet

License: Other

Haskell 39.90% Jsonnet 59.67% Nix 0.43%
compiler configuration-language haskell infrastructure-as-code jsonnet

haskell-jsonnet's People

Contributors

cristhianmotoche, gitter-badger, moleike, ozkutuk, vaibhavsagar

haskell-jsonnet's Issues

Evaluating `expr in super` fails when there is no super class

This is expected, given that we treat super as a variable, whereas in the Jsonnet core AST super is just another literal.

C++ impl. returns the following:

{ foo: "bar", bar: super.foo }

RUNTIME ERROR: attempt to use super when there is no super class.

While:

{ foo: "bar", bar: "foo" in super }

outputs:

{
  "bar": false,
  "foo": "bar"
}

The following test fails due to this:

std.assertEqual({ f+: 3 }, { f: 3 }) 

Conflate constructors in Core that are constant functions

Since Core expressions are not strictly evaluated, CIfElse need not be defined as primitive.

We could generalize this to include CBinOp, CUnyOp, and CLookup, lumping them together in a common constructor like CConst Op (with the arity defined implicitly by Op), so the Core definition goes from 14 (!) term constructors to 11.

Here's what the desugaring would look like:

EUnyOp op e -> CApp (CConst op) (Args [Pos e] Lazy)
EBinOp op e1 e2 -> CApp (CConst op) (Args [Pos e1, Pos e2] Lazy)
EIfElse c t e -> CApp (CConst IfElse) (Args [Pos c, Pos t, Pos e] Lazy)
ELookup e1 e2 -> CApp (CConst Lookup) (Args [Pos e1, Pos e2] Lazy)
EIndex e1 e2 -> CApp (CConst Lookup) (Args [Pos e1, Pos e2] Lazy)
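
A minimal sketch of the proposed encoding, assuming much smaller stand-ins for the project's types: Expr, Core, Op, Arg, Args and Strictness below are hypothetical and only mirror the names used in the desugaring rules above. Unlike the rules as written, the sketch also desugars the sub-expressions recursively.

```haskell
-- Hypothetical, simplified stand-ins for the real surface and core syntax.
data Op = Neg | Add | IfElse | Lookup
  deriving (Eq, Show)

data Strictness = Strict | Lazy
  deriving (Eq, Show)

newtype Arg a = Pos a
  deriving (Eq, Show)

data Args a = Args [Arg a] Strictness
  deriving (Eq, Show)

-- Surface syntax: one constructor per operator form.
data Expr
  = ENum Double
  | EUnyOp Op Expr
  | EBinOp Op Expr Expr
  | EIfElse Expr Expr Expr
  | ELookup Expr Expr
  deriving (Eq, Show)

-- Core syntax: every operator is an application of a constant.
data Core
  = CNum Double
  | CConst Op
  | CApp Core (Args Core)
  deriving (Eq, Show)

desugar :: Expr -> Core
desugar (ENum n) = CNum n
desugar (EUnyOp op e) = app op [e]
desugar (EBinOp op e1 e2) = app op [e1, e2]
desugar (EIfElse c t e) = app IfElse [c, t, e]
desugar (ELookup e1 e2) = app Lookup [e1, e2]

-- Apply a constant operator to lazily evaluated positional arguments.
app :: Op -> [Expr] -> Core
app op es = CApp (CConst op) (Args (map (Pos . desugar) es) Lazy)
```

Because the arguments are passed lazily, an evaluator for CConst IfElse can force only the condition and the selected branch.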

Static checking

There's no type checker in this impl. (since Jsonnet is not statically typed after all), but the Jsonnet spec. does provide a static semantics (in the form of typing judgments) which mostly consists of rejecting programs with:

  • free variables (other than std)
  • duplicate local bindings, duplicate function parameters,
  • positional arguments after named arguments in a function application,
  • self referenced outside an object's field values (i.e. computed keys should not reference self)

We can easily collect all free variables in a Core expression via the binding structure (unbound-generics).
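
To illustrate the free-variable check without depending on unbound-generics, here is a sketch over a toy expression type with explicit names. Expr, freeVars and unboundVars are hypothetical; the real Core type gets this from its binding structure instead.

```haskell
import qualified Data.Set as Set

-- A toy expression language with explicit names (hypothetical).
data Expr
  = Var String
  | Lam [String] Expr            -- function with named parameters
  | App Expr [Expr]
  | Local [(String, Expr)] Expr  -- mutually recursive local bindings
  deriving (Show)

-- Variables that occur in an expression without an enclosing binder.
freeVars :: Expr -> Set.Set String
freeVars (Var x) = Set.singleton x
freeVars (Lam ps body) = freeVars body `Set.difference` Set.fromList ps
freeVars (App f args) = Set.unions (freeVars f : map freeVars args)
freeVars (Local binds body) =
  Set.unions (freeVars body : map (freeVars . snd) binds)
    `Set.difference` Set.fromList (map fst binds)

-- A program is rejected if it mentions a free variable other than std.
unboundVars :: Expr -> [String]
unboundVars = Set.toList . Set.delete "std" . freeVars
```

For example, unboundVars applied to the encoding of local x = y; std(x, z) reports y and z.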

Comprehensions can't handle chained `ifspec`s

Minimal way to reproduce

$ cabal run hs-jsonnet -- -e '[x for x in [1,2,3] if x > 1 if x < 3]'
1:30:
  |
1 | [x for x in [1,2,3] if x > 1 if x < 3]
  |                              ^
unexpected 'i'
expecting "!=", "&&", "<<", "<=", "==", ">=", ">>", "for", "in", "||", '%', '&', '(', '*', '+', '-', '/', '<', '>', '[', ']', '^', '|', ., or object

Expected output

$ jsonnet -e '[x for x in [1,2,3] if x > 1 if x < 3]'
[
   2
]

The problem

In our implementation, an ifspec is parsed as an optional part of a forspec, which allows at most one ifspec per forspec. However, the Jsonnet specification allows an array comprehension to contain any number of compspecs after an initial mandatory forspec, where each compspec can be either a forspec or an ifspec.
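
One way to model the grammar from the specification is a flat list of compspecs following the mandatory forspec. The sketch below is an assumption about a possible AST shape, not the project's actual types: CompSpec, ArrayComp and evalComp are hypothetical, and values are restricted to Int to keep it short.

```haskell
import qualified Data.Map as Map

type Env = Map.Map String Int

-- Each compspec is either a forspec or an ifspec.
data CompSpec
  = ForSpec String (Env -> [Int])  -- for <var> in <array>
  | IfSpec (Env -> Bool)           -- if <cond>

-- Body, mandatory first forspec, then any number of compspecs.
data ArrayComp = ArrayComp (Env -> Int) (String, Env -> [Int]) [CompSpec]

evalComp :: ArrayComp -> [Int]
evalComp (ArrayComp body (v, xs) specs) =
  map body (go (ForSpec v xs : specs) Map.empty)
  where
    -- Enumerate every environment that survives all specs, left to right.
    go :: [CompSpec] -> Env -> [Env]
    go [] env = [env]
    go (ForSpec x arr : rest) env =
      concatMap (\i -> go rest (Map.insert x i env)) (arr env)
    go (IfSpec cond : rest) env
      | cond env = go rest env
      | otherwise = []

-- [x for x in [1,2,3] if x > 1 if x < 3]
example :: [Int]
example =
  evalComp $
    ArrayComp
      (Map.! "x")
      ("x", const [1, 2, 3])
      [IfSpec ((> 1) . (Map.! "x")), IfSpec ((< 3) . (Map.! "x"))]
```

With this shape, chained ifspecs need no special case: the parser only has to accept many compspecs after the first forspec.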

Top-level arguments

Jsonnet programs cannot always be self-contained. Consider, for example, a generated config that needs to contain secrets: you do not want to commit your secrets alongside the code. More generally, you might want to parameterise your program with commit hashes, version numbers, etc.

Top-level arguments (TLAs) allow you to call Jsonnet programs with parameters, as long as your program evaluates to a function.

$ jsonnet -e 'std.map' --tla-code 'func=function(x) x * x' --tla-code arr='[1, 2, 3]'
[
   1,
   4,
   9
]

If a program does not reduce to a top-level function, then the arguments are ignored.

See here for how it's done in go-jsonnet: https://github.com/google/go-jsonnet/blob/v0.17.0/interpreter.go#L1227

folds with nested arrays do not terminate

The following is a valid Jsonnet program:

std.assertEqual(std.foldr(function(x, y) [x, y], [1, 2, 3, 4], []), [1, [2, [3, [4, []]]]])

But the interpreter currently fails to reduce it to normal form. The problem comes from using monadic folds.

Desugar function parameters with no defaults to error

This might simplify the evaluation of applications a bit.
Currently, we represent parameter defaults with a Maybe t.
When a parameter is not given a default expression, we could desugar it with the following default expr: error "Parameter not bound"
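
A minimal sketch of that desugaring step, assuming a hypothetical, simplified Core type (CLit, CStr and CErr are illustrative names, not the project's constructors):

```haskell
-- Hypothetical, simplified core expressions.
data Core
  = CLit Int
  | CStr String
  | CErr Core  -- error <expr>
  deriving (Eq, Show)

-- Current representation: the default expression is optional.
type Param = (String, Maybe Core)

-- After desugaring, every parameter carries a default expression.
desugarParam :: Param -> (String, Core)
desugarParam (name, Just def) = (name, def)
desugarParam (name, Nothing) = (name, CErr (CStr "Parameter not bound"))
```

Since defaults are evaluated lazily, the error expression would only be raised when the parameter is used without an argument having been supplied.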

Parse escape characters in string literals

The parsing of strings in its current form is very incomplete.

A Jsonnet string is very much like a JSON string, but with some additional flexibility: string literals use " or '. Either can be used to avoid escaping the other, e.g. "Farmer's Gin" instead of 'Farmer\'s Gin'.

As a first step, we should handle:

  • escape characters \ + oneOf "'\/bfnrt
  • \uXXXX with 4 hex digits that encode the character's code point (we assume characters in the Basic Multilingual Plane for now)
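
The two bullet points can be sketched as a plain function over the body of a string literal (the text between the quotes). This is an illustration using only base, not a parser combinator for the project's parser; unescape and escapes are hypothetical names.

```haskell
import Data.Char (chr)
import Numeric (readHex)

-- Decode escape sequences; returns Left on a malformed escape.
unescape :: String -> Either String String
unescape [] = Right []
unescape ('\\' : 'u' : rest) =
  let hex = take 4 rest
   in case readHex hex of
        -- All four characters must be consumed as hex digits.
        [(n, "")] | length hex == 4 -> (chr n :) <$> unescape (drop 4 rest)
        _ -> Left "expected 4 hex digits after \\u"
unescape ('\\' : c : rest) =
  case lookup c escapes of
    Just r -> (r :) <$> unescape rest
    Nothing -> Left ("unknown escape character: " ++ [c])
unescape "\\" = Left "unterminated escape sequence"
unescape (c : rest) = (c :) <$> unescape rest

-- Single-character escapes: \ followed by one of "'\/bfnrt
escapes :: [(Char, Char)]
escapes =
  [ ('"', '"'), ('\'', '\''), ('\\', '\\'), ('/', '/')
  , ('b', '\b'), ('f', '\f'), ('n', '\n'), ('r', '\r'), ('t', '\t')
  ]
```

As in the issue, each \uXXXX is decoded on its own, so surrogate pairs for characters outside the Basic Multilingual Plane are not combined.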

Ref: https://tools.ietf.org/html/rfc8259#section-7

If in doubt, please ask!

Implement std.objectHas and std.objectHasAll

This issue depends on #5.

It is possible to check a field's visibility using the std.objectHas and std.objectHasAll standard library functions.

Ref: https://jsonnet.org/ref/stdlib.html

std.objectHas(o, f):
Returns true if the given object has the field (given as a string), otherwise false. Raises an error if the arguments are not object and string respectively. Returns false if the field is hidden.

std.objectHasAll(o, f):
As std.objectHas but also includes hidden fields.
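
A sketch of the two functions over a hypothetical object representation that stores a visibility flag next to each field value. Visibility and Object are assumptions for illustration, and the argument type errors described above are left to the type checker here.

```haskell
import qualified Data.Map as Map

data Visibility = Visible | Hidden
  deriving (Eq, Show)

-- Hypothetical object representation: field name -> (visibility, value).
type Object v = Map.Map String (Visibility, v)

-- True if the field exists, hidden or not.
objectHasAll :: Object v -> String -> Bool
objectHasAll o f = Map.member f o

-- True only if the field exists and is visible.
objectHas :: Object v -> String -> Bool
objectHas o f = fmap fst (Map.lookup f o) == Just Visible
```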

Gradual types?

Jsonnet (the spec) defines the language as dynamically typed. An extension that seems natural to incorporate into Jsonnet is gradual typing with structural subtyping à la TypeScript.

Gradual typing enables mixing typed and untyped code, where users decide where (or when) to add type annotations to increase static checking. Fully annotated programs should be statically type-safe. Programs with no annotations at all should behave as current Jsonnet.

In a first (simpler) approach, we erase type annotations after type checking and interpret the program as if it were dynamically typed. A benefit of this approach is the typechecker becomes a standalone component, and thus it can be used with other compilers. There are some quirks though:

local foo(x) = 
  local bar(y: int) = {};
  bar(x);
foo(true)

The above program should intuitively fail, but it runs to completion.

A more elaborate approach performs run-time type checks at dynamically and statically typed code boundaries, by adding explicit casts. In this case, the type checking could be done after desugaring.

The original Jsonnet implementation has a related open issue: google/jsonnet#605

`self` not in scope from an object local binding

The following does not work:

local Fib = {
  n: 1,
  local outer = self,
  r: if self.n <= 1 then 1 else (Fib { n: outer.n - 1 }).r + (Fib { n: outer.n - 2 }).r,
};

(Fib { n: 25 }).r

We get a VarNotFound exception.

Encoding JSON strings

std.manifestJson("foo\nbar") should return the following escaped string "\"foo\\nbar\""
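
A sketch of the string escaping this requires, following the JSON grammar in RFC 8259. escapeJson is a hypothetical helper over String; the project's own value and text types may differ.

```haskell
import Data.Char (ord)
import Numeric (showHex)

-- Render a string as a quoted JSON string literal.
escapeJson :: String -> String
escapeJson s = "\"" ++ concatMap escapeChar s ++ "\""
  where
    escapeChar '"' = "\\\""
    escapeChar '\\' = "\\\\"
    escapeChar '\b' = "\\b"
    escapeChar '\f' = "\\f"
    escapeChar '\n' = "\\n"
    escapeChar '\r' = "\\r"
    escapeChar '\t' = "\\t"
    escapeChar c
      -- Remaining control characters must be written as \u00XX.
      | ord c < 0x20 = "\\u" ++ pad (showHex (ord c) "")
      | otherwise = [c]
    pad h = replicate (4 - length h) '0' ++ h
```

Applied to the string foo, newline, bar, this produces the quoted text with the newline written as a backslash followed by n.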

Preloading of stdlib

Find a way to preload the stdlib library e.g. using a serialisation library instead of a TH splice producing a huge Haskell expression. Parsing and desugaring still happens at compile-time, but we instead store the Jsonnet.Core output in a file (or embedded depending on the size), via a library like cereal or binary.

Unquoted strings are parsed as identifiers

Both identifiers and unquoted strings are pretty-printed the same way, but identifier parsing has higher priority than string parsing during the parsing stage, causing the following behavior:

ghci> import Language.Jsonnet.Syntax
ghci> import Language.Jsonnet.Common
ghci> import Language.Jsonnet.Pretty
ghci> import Language.Jsonnet.Test.Roundtrip
ghci> import qualified Data.Text as T
ghci> e = EStr "foo" Unquoted
ghci> parseExpr . T.pack . show . ppExpr $ e
Right (Fix (InL (EIdent "foo")))

This causes roundtrip tests to sometimes fail, since a pretty-printed unquoted string is later parsed as an identifier. Here is a hedgehog seed to see an example failing roundtrip test:

cabal run jsonnet-test -- -p "roundtrip" --hedgehog-replay "Size 91 Seed 12461984714529812419 8668381392682229415"

Add hidden fields

Jsonnet objects have a concept of visibility. By default all fields are visible, but Jsonnet allows hidden fields using :: syntax.
Hidden fields are ignored for both manifestation and equality checking, e.g. { hidden:: "foo"} == {}.

See the tutorial for examples. Join our gitter channel if you have further questions or need help!

object composition

Jsonnet has object orientation features, incl. composition of objects, using the overloaded + operator. Object composition is not just simply merging objects, since that would hinder the ability to override fields that appear in the left-hand side:

local foo = { a: 'foo', b: self.c - 1 };
foo + { a: 1, c: 3 }

{
  "a": 1,
  "b": 2,
  "c": 3
}

In the example above, had we just merged the objects, we would get a NoSuchKey exception for the missing key c in the first object. To avoid this we need to bind self late: after the object fields have been merged, but before the field values are reduced (keys should be in rnf). Field values are right-biased, but a hidden field stays hidden when it is overridden with a plain : field:

{ a:: 2 } + { a: 1 } --> { a:: 1 }

Visibility of a field can be overridden using the ::: field separator.
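
The late binding of self can be sketched by representing each field as a function of the final, merged object and tying the knot only once. All names below (Value, Open, Closed, compose, close) are hypothetical, and the sketch leaves out super and field visibility.

```haskell
import qualified Data.Map as Map

data Value = VNum Double | VStr String
  deriving (Eq, Show)

-- A closed object has plain values; an open object still awaits self.
type Closed = Map.Map String Value
type Open = Map.Map String (Closed -> Value)

-- Right-biased merge: fields of the right operand win.
compose :: Open -> Open -> Open
compose l r = Map.union r l

-- Bind self after merging; lazy map values make the recursion well defined.
close :: Open -> Closed
close o = self
  where
    self = Map.map ($ self) o

-- b: self.c - 1
fieldB :: Closed -> Value
fieldB self =
  case Map.lookup "c" self of
    Just (VNum c) -> VNum (c - 1)
    _ -> error "NoSuchKey: c"

-- local foo = { a: 'foo', b: self.c - 1 };
foo :: Open
foo = Map.fromList [("a", const (VStr "foo")), ("b", fieldB)]

-- foo + { a: 1, c: 3 }
example :: Closed
example =
  close (compose foo (Map.fromList [("a", const (VNum 1)), ("c", const (VNum 3))]))
```

Closing foo on its own and forcing b would raise the NoSuchKey error, which matches the behaviour described above for a plain merge.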

Implicit + operator

When the right-hand side is an object literal, the + can be omitted, i.e. foo + { ... } and foo {...} are the same expression.

This is (apparently) the missing feature to be able to parse the official jsonnet std library.

Debugging: add `std.trace`

std.trace(str, rest) outputs the given string str to stderr and returns rest as the result.

Example:

local conditionalReturn(cond, in1, in2) =
  if (cond) then
      std.trace('cond is true returning '
              + std.toString(in1), in1)
  else
      std.trace('cond is false returning '
              + std.toString(in2), in2);

{
    a: conditionalReturn(true, { b: true }, { c: false }),
}

Prints:

TRACE: test.jsonnet:3 cond is true returning {"b": true}
{
    "a": {
        "b": true
    }
}

Multiple file output

Add a mode for generating multiple JSON files from a single Jsonnet file

// multiple_output.jsonnet
{
  "a.json": {
    x: 1,
    y: $["b.json"].y,
  },
  "b.json": {
    x: $["a.json"].x,
    y: 2,
  },
}

$ jsonnet -m . multiple_output.jsonnet
a.json
b.json
$ cat a.json 
{
   "x": 1,
   "y": 2
}
$ cat b.json 
{
   "x": 1,
   "y": 2
}

Details here: https://jsonnet.org/learning/getting_started.html#multi

Better error messages

We do not have an execution stack, so we can't just provide a stack trace when an error is raised. However, AST nodes are annotated with source spans, so reporting backtraces of call sites would help debug errors.

Surprisingly, I can't find many examples of how to do this.

One idea is:

  • add SourcePos to the application constructor: CApp SourcePos Core (Args Core)
  • before beta reducing, we can first check the name of the function since it will be a variable (else it's "anonymous")
  • add the pair (Name, SourcePos) to the call stack that we keep in a Reader env.

Cyclic imports

Cyclic imports are valid and well defined:

$ cat a.jsonnet 
{
  a:: 'a',
  c: (import 'b.jsonnet').b,
}
$ cat b.jsonnet 
{
  b:: (import 'a.jsonnet').a,
}
$ /google/data/ro/teams/jsonnet/jsonnet a.jsonnet 
{
   "c": "a"
}

So are imports on self:

$ cat a.jsonnet 
{
  a:: 'a',
  c: (import 'a.jsonnet').a,
}
$ jsonnet a.jsonnet 
{
   "c": "a"
}

The problem in the code snippet in the description is that the object in "main.jsonnet" is recursively defined in a way where the recursion doesn't have a terminating condition (i.e. bottomless). It's equivalent to:

local a = a + {};
a

Which produces similar results:

$ jsonnet main.jsonnet 
RUNTIME ERROR: max stack frames exceeded.
        main.jsonnet:1:11       thunk <a>
        main.jsonnet:1:11       thunk <a>
        main.jsonnet:1:11       thunk <a>
        main.jsonnet:1:11       thunk <a>
        main.jsonnet:1:11       thunk <a>
        main.jsonnet:1:11       thunk <a>
        main.jsonnet:1:11       thunk <a>
        main.jsonnet:1:11       thunk <a>
        main.jsonnet:1:11       thunk <a>
        main.jsonnet:1:11       thunk <a>
        ...

But there are plenty of valid ways to define a variable or set of variables recursively that do bottom out. For example:

{
 x: {
   a: 1,
   y: $.y,
 },
 y: {
   x: $.x,
   a: 1,
 },
}.x.y.x.y.x.y.x.a

{
  a: 1,
  b: 1,
  fib: self {
    a: super.b,
    b: super.a + super.b,
  },
}.fib.fib.fib.fib.fib.a

Both are valid, terminating programs.

Originally posted by @mikedanese in google/go-jsonnet#353 (comment)

Replace slow std functions with native built-ins

We are using the exact same Jsonnet implementation of the std object that is used in the original C++ implementation.
This replacement would not need to modify the vanilla std.jsonnet, since we can override the slow functions in the native impl.

This should heavily rely on micro-benchmarks so that we only re-implement methods that are known to be slow.

Methods:

External variables

Any Jsonnet value, even a function, can be bound to an external variable, which is then accessible anywhere in the config, in any file, using std.extVar("foo").

jsonnet --ext-str version="0.9" --ext-code dry-run=true ...

local build_image_version = std.extVar('version');
local dry_run = std.extVar('dry-run');
...
local run(name, commands) = {  
  name: name,  
  image: 'foo/build-image:%s' % build_image_version,  
  commands: commands,
};

Object field caching

As currently implemented, mixins are extremely slow (and possibly leak). Take this benchmark, for example:

  local fibnext = {
    a: super.a + super.b,
    b: super.a,
  };
  local fib(n) =
    if n == 0 then
      { a: 1, b: 1 }
    else
      fib(n - 1) + fibnext;

  fib(25)

This runs in exponential time, like a naive recursive implementation, because super doesn't allow us to use memoization, so we end up recomputing every Fibonacci number.

This seems to have been addressed in go-jsonnet, and sjsonnet also supports some form of caching.

Contracts?

Here's an article on programming with contracts in Nickel, or as they call them, glorified assertions.
