typst-hs's Introduction

Typst-hs

Typst-hs is a Haskell library for parsing and evaluating typst syntax. Typst (https://typst.app) is a document formatting and layout language, like TeX.

Currently this library targets v0.10.0 of typst, and offers only partial support. There are two main components:

  • a parser, which produces an AST from a typst document
  • an evaluator, which evaluates the typst expressions in the AST

typst-hs's People

Contributors

adelhult, cmdjojo, glguy, hack3ric, jgm

typst-hs's Issues

Bug in parsing function in math

$lr(#sym.alpha#sym.beta)$
"stdin" (line 1, column 15):
unexpected "#"
expecting "//", "/*", operator, "," or ")"

typst.app handles this fine.

Parse error with function-applications as dictionary keys

Explain the problem.
The following is a valid typst document that compiles under typst 0.10.0 (70ca0d25)
pandoc online link

#let nums = (1, none, 2)
#nums.map(num => {
  if num == none {
    return 1
  }
  return ((str(num)): 1)
})

But converting with pandoc 3.1.10, it results in the following error:

".\tmp.typ" (line 2, column 17):
unexpected '>'

Regex inconsistencies

Hello,
I've observed several inconsistencies between the regex pandoc uses when reading Typst documents and the regex Typst uses.

Here are a few of them:

  1. Not all flags are supported. Typst regex supports the flags i, m, s, u, x. Of those, only i appears to be supported by Pandoc.
    For example, #(regex("(?m)a") in "A") compiles in Typst, but doesn't in Pandoc (3.1.11.1 via try.pandoc.org), with the error (line 1, column 2): parseRegex for Text.Regex.TDFA.Text failed:"({0,1}m)a" (line 1, column 4): unexpected '0' expecting an atom.
    • I especially miss the m (multiline) flag in order to be able to match the start of a line with ^ and the end of a line with $.
  2. Unnamed capture groups are not supported: #(regex("(?:x)") in "x") compiles in Typst, but not in pandoc ((line 1, column 2): parseRegex for Text.Regex.TDFA.Text failed:"({0,1}:x)" (line 1, column 4): unexpected '0' expecting an atom).
    • This is needed to avoid unnecessary capture groups in the output, and is frequently used across my packages.
  3. Explicitly named capture groups are not supported: #(regex("(?P<a>x)") in "x") compiles in Typst, but not in Pandoc ((line 1, column 2): parseRegex for Text.Regex.TDFA.Text failed:"({0,1}P<a>x)" (line 1, column 4): unexpected '0' expecting an atom).

Besides non-compilation, there are inconsistencies in the results of regex matching as well.

  1. #(regex("[\s\S]+") in "x") returns true in Typst, but false in Pandoc.
  2. #("a \n b" == "a \n b".match(regex("[^.]+")).text) returns true in Typst, but false in Pandoc. In general, a [ ] character class seems unable to match newlines, when it should.

There are probably inconsistencies I haven't found yet as well, but they could be added to this issue as they are found.
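
For comparison, the constructs listed above can be exercised in another engine that happens to accept the same syntax. The sketch below uses Python's `re` module purely as a stand-in: it is not the engine Typst or Pandoc uses, and it only illustrates the behavior this report expects (inline flags, non-capturing and named groups, and character classes that match newlines).

```python
import re

# Inline flag: (?m) makes ^ and $ match at line boundaries.
multiline = re.findall(r"(?m)^b$", "a\nb\nc")

# Non-capturing group: (?:x) groups without creating a capture.
non_capturing = re.search(r"(?:x)(y)", "xy").groups()

# Named capture group: (?P<a>x) can be retrieved by name.
named = re.search(r"(?P<a>x)", "x").group("a")

# A class such as [\s\S] or a negated class such as [^.] matches newlines too.
any_char = re.search(r"[\s\S]+", "x") is not None
negated = re.search(r"[^.]+", "a \n b").group(0)

print(multiline, non_capturing, named, any_char, negated == "a \n b")
```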

Parse error on raw inline literals containing `#` in `let` expression

#let x = `#`
// Also happens with ```#```

Parsing produces

unexpected "`"
expecting "//", "/*", "none", "auto", "true", "false", "0b", "0x", "0o", digit, ".", "\"", "let", "set", "show", "if", "while", "for", "import", "include", "break", "continue", "return", "(", "$", "<", "{" or "["    

Cannot handle `dict.at(variable) = value` expression

#let x = (a: 5)
#let key = "a"
#{
  x.at(key) = 6
}
#x

This is a valid typst document but pandoc produces the following error:

Cannot update expression FuncCall (FieldAccess (Ident (Identifier "at")) (Ident (Identifier "x"))) [NormalArg (Ident (Identifier "key"))]

Note that pandoc succeeds when directly indexing x with a string, i.e., x.at("a") = 6.

`container.at(...) = value` shouldn't insert new values

Typst only allows at to modify existing values, not insert new ones. So this should fail

#let x = (a: 4)
#{
  x.at("b") = 5
}
#x

With the error

dictionary does not contain key "b" and no default value was specified

But pandoc will compile it.

Also,

#let x = (1,2)
#{
  x.at(2) = 5
}
#x

This should fail with an "out of bounds" error for the same reason, but it produces an internal pandoc stack trace instead of a human-facing message.
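
The rule described in this issue (update an existing slot, never create one) can be sketched roughly as follows. This is a Python illustration, not typst-hs code: `assign_at` is a hypothetical helper, the array error text is made up for the example, and negative indices are left out for simplicity.

```python
def assign_at(container, key, value):
    """Update an existing slot in a dict or list; never insert a new one."""
    if isinstance(container, dict):
        if key not in container:
            # Wording taken from the Typst error quoted above.
            raise KeyError(
                f'dictionary does not contain key "{key}" '
                "and no default value was specified"
            )
    elif isinstance(container, list):
        if not 0 <= key < len(container):
            # Hypothetical message; only the "out of bounds" idea is from the report.
            raise IndexError(
                f"array index out of bounds (index: {key}, len: {len(container)})"
            )
    else:
        raise TypeError("expected a dictionary or an array")
    container[key] = value
    return container


print(assign_at({"a": 4}, "a", 6))
```

With this rule, `assign_at({"a": 4}, "b", 5)` and `assign_at([1, 2], 2, 5)` both raise instead of silently growing the container.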

Typst math formatting issues

  • for double dots, uses \overset{¨}{x} instead of \ddot{x} from the amsmath package
  • for simple dot, uses \overset{\cdot}{x} instead of \dot{x} from the amsmath package
  • for overline, \underset{¯}{...} is used instead of \overline
  • for norm, the bars are rendered with \left.\parallel and there is actually no scaling of the delimiters. \left\| and \right\| should be used instead
  • tilde(x) translated to \overset{\sim}{x} instead of \tilde{x}
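
For reference, the output requested in the list above would look roughly like this. The Typst inputs in the comments are assumptions about how each case is written (only tilde(x) is quoted in the report):

```latex
% dot.double(x)  (assumed Typst input)
\ddot{x}
% dot(x)         (assumed Typst input)
\dot{x}
% overline(x)    (assumed Typst input)
\overline{x}
% norm(x)        (assumed Typst input)
\left\| x \right\|
% tilde(x)
\tilde{x}
```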

Reproduction : https://github.com/Doublonmousse/pandoc-typst-reproducer/tree/main/formatting

Error when importing from ctheorems:1.1.2

A Typst document test.typ containing only the import statement #import "@preview/ctheorems:1.1.2": * will fail with pandoc 3.1.12.2, giving the message:

"test.typ" (line 1, column 2):
"/home/kaarel/.cache/typst/packages/preview/ctheorems/1.1.2/lib.typ" (line 223, column 18):
unexpected '>'
expecting operator

test.typ compiles with Typst version 0.10.0.

testsuite failures in Stackage Nightly

Test suite failure for package typst-0.3.1.0:

             test/typ/math/spacing-04.typ:                FAIL (0.01s)                                                              
               Test output was different from 'test/out/math/spacing-04.out'. Output of ["diff","-u","test/out/math/spacing-04.out",
"/tmp/spacing-0444-40.actual"]:                                                                                                     
               --- test/out/math/spacing-04.out 2023-07-24 03:03:19.932092787 +0000                                                 
               +++ /tmp/spacing-0444-40.actual  2023-08-19 06:26:08.131211805 +0000                                                 
               @@ -283,96 +283,5 @@                                                                                                 
                    ]                                                                                                               
                , ParBreak                                                                                                          
                ]                                                                                                                   
               ---- evaluated ---                                                                                                   
               -{ text(body: [                                                                                                      
               -]),                                                                                                                 
               -  text(body: [                                                                                                      
               -]),                                                                                                                 
               -  math.equation(block: false,                                                                                       
               -                body: { text(body: [a]),                                                                            
               -                        text(body: [≡]),                                                                            
               -                        text(body: [b]),                                                                            
               -                        text(body: [+]),                                                                            
               -                        text(body: [c]),                                                                            
               -                        text(body: [-]),                                                                            
               -                        text(body: [d]),                                                                            
               -                        text(body: [⇒]),                                                                            
               -                        text(body: [e]),                                                                            
               -                        math.op(limits: false,                                                                      
               -                                text: "log"),                                                                       
               -                        text(body: [5]),                                                                            
               -                        math.op(text: text(body: [ln])),                                                            
               -                        text(body: [6]) },
               -                numbering: none),
               -  text(body: [ ]),                                                                                                  
               -  linebreak(),                                                                                                      
               -  math.equation<truncated>                                                                                          
               Use --accept or increase --size-cutoff to see full output.                                                           
               Use -p '/test\/typ\/math\/spacing-04.typ/' to rerun this test only.
             test/typ/math/spacing-02.typ:                FAIL (0.01s)
               Test output was different from 'test/out/math/spacing-02.out'. Output of ["diff","-u","test/out/math/spacing-02.out",
"/tmp/spacing-0244-98.actual"]:                                   
               --- test/out/math/spacing-02.out 2023-07-24 03:03:19.932092787 +0000
               +++ /tmp/spacing-0244-98.actual  2023-08-19 06:26:08.443205900 +0000
               @@ -110,47 +110,5 @@                         
                    ]                                                                                                               
                , ParBreak
                ]
               ---- evaluated --- 
               -{ text(body: [
               -]), 
               -  math.equation(block: false, 
               -                body: { text(body: [a]), 
               -                        text(body: [ ]), 
               -                        text(body: [b]), 
               -                        text(body: [,]), 
               -                        text(body: [a]), 
               -                        text(body: [ ]), 
               -                        text(body: [b]), 
               -                        text(body: [,]), 
               -                        text(body: [a]), 
               -                        text(body: [ ]), 
               -                        text(body: [b]), 
               -                        text(body: [,]), 
               -                        text(body: [a]), 
               -                        text(body: [ ]), 
               -                        text(body: [b]) }, 
               -                numbering: none), 
               -  text(body: [ ]), 
               -  linebreak(), 
               -  math.equation(b<truncated>
               Use --accept or increase --size-cutoff to see full output.
               Use -p '/test\/typ\/math\/spacing-02.typ/' to rerun this test only.

If you would like the test suite to count again after resolving this, please open a PR to revert the following stackage commit.

`set` rule followed by `if-else` statement is incorrectly interpreted

#let fn() = {
  set text(fill: red)
  if true [
    test
  ] else [
    test2
  ]
}

#fn()

Produces the pandoc error:

unexpected end of input
expecting end of input
Identifier "else" not found

According to typst, this is a set rule followed by an independent if-statement. I suppose this means set-if code must have the if and set keywords on the same line to cause a conditional set.

Some inconsistencies with your parser vs the Rust parser

Hi, @adelhult, @leopoldwigbratt, and I have been playing around a bit with typst-hs. We made a test suite that parses all packages in the Typst package repository to check for parser errors.

In total, there are about 1300 files, of which 76 failed. A few of them are actually invalid Typst, but most were problems with typst-hs. You can check out our test suite here, but here is a summary:

Trailing comment lines

It seems like ending a file with a comment line does not parse (unexpected EOF):

foo
// Hello

example

Colon as part of label in argument treated as named argument

#let x = selector(<foo:bar>)

Typst treats this as a label named foo:bar while Typst-hs treats it like a named arg <foo (which is an invalid name and fails to parse)
example

Raw literal in code mode is not supported

#let x = `hello`

simply does not parse.
example

Space after math mode in lambda is not allowed

#let nada(ignore_me) = {
    "foo"
}
// removing space after $ works
#let w = nada(x => $x$ + x)

We encountered a strange bug where $x$+x works while $x$ + x doesn't.
example

NBSP/other spaces are not regarded as whitespace

In Typst, nbsp and hair space (and possibly many additional space characters) are regarded as whitespace, while in Typst-hs they are not.
example
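
The gap between an ASCII-only whitespace check and a Unicode-aware one can be seen in a few lines. This is a Python sketch of the general idea only; it does not show which predicate typst-hs or Typst actually uses, and both helper names are hypothetical.

```python
import unicodedata

NBSP = "\u00a0"        # NO-BREAK SPACE
HAIR_SPACE = "\u200a"  # HAIR SPACE


def is_ascii_space(ch):
    """Narrow check: only the ASCII whitespace characters."""
    return ch in " \t\n\r\x0b\x0c"


def is_unicode_space(ch):
    """Broader check: anything Unicode classifies as whitespace."""
    return ch.isspace()


for ch in (NBSP, HAIR_SPACE):
    print(
        unicodedata.name(ch),
        unicodedata.category(ch),
        is_ascii_space(ch),
        is_unicode_space(ch),
    )
```

Both characters are in the Unicode space-separator category, so a parser that only tests for ASCII whitespace will treat them as ordinary text.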

Colon for declaring dict literal does not work with spreading

In Typst, you can write this:

#let d = (a: true, b: false)
#let nd = (: ..d)

: ..d is needed to first make nd of type dictionary before spreading the dictionary d into it, otherwise it will try to spread it as an array and fail. Typst-hs does not compile this.
example

These examples were reduced by hand from some of the failures found when testing on typst/packages, and there might be even smaller minimal examples. You can check out the log from running the tests here. Note that some files in the typst/packages repo do not parse with the Rust implementation either, but the examples in our counter-examples folder have been tested: they compile with the Rust implementation but not with typst-hs.

Hopefully we can contribute with fixing some of these bugs or adding some documentation!

Pandoc error on multiline string

#let a = "
This is a
multiline string
"

This is a valid typst document but results in the following pandoc error:

".\tmp.typ" (line 1, column 11):
unexpected "\n"
expecting "\"" or "\\"

Use utf-8 encoding independent of the system locale

According to the documentation of Data.Text.IO.readFile, an unexpected system locale can cause an invalid argument (invalid byte sequence) error.
In fact, on my computer, whose locale is gb2312, typst-hs throws that error if the input file is in gb2312, or produces wrong characters if the input file is utf-8.
(I can change my locale, but this is...not robust.)

Maybe one needs System.IO.hSetEncoding or something similar.
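
The underlying idea is to name the encoding explicitly when opening the file instead of inheriting it from the locale. Below is a minimal Python sketch of that idea; it is not the Haskell fix itself, and `read_source_utf8` is a hypothetical helper name.

```python
import os
import tempfile


def read_source_utf8(path):
    """Read a source file as UTF-8 regardless of the system locale."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Round-trip a small non-ASCII document through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.typ")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("测试文本*加粗*\n")
    text = read_source_utf8(path)

print(text)
```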

Array parser fails on just spreading `none`

#let x=(..none) doesn't parse as intended: it should parse as spreading none into an array but it parses as spreading it into a dict. Since the array parser has priority over the dictionary parser, the error is in the array parser. This issue leads to the test spread-09 not being truthful; before #46 it simply failed to parse but now it fails to run due to this parser bug.

Implement "style" properly

Most elements we pass through to be interpreted elsewhere, but this one needs to be evaluated here, because only here do we have the needed style information.

We also need to implement styles properly.

Pandoc looks for Typst packages with an incorrect path

When running Pandoc on a Typst file, it looks for packages in ~/.cache/typst/packages/preview/packagename-major.minor.patch. However, the correct path is ~/.cache/typst/packages/preview/packagename/major.minor.patch (note that the version is a subdirectory, not separated with a hyphen).

For instance:

theo@dev ~/P/typst-test> cat main.typ
#import "@preview/tablex:0.0.6": tablex, cellx, colspanx, rowspanx

theo@dev ~/P/typst-test> typst compile main.typ main.pdf

theo@dev ~/P/typst-test> ls ~/.cache/typst/packages/preview
tablex/

theo@dev ~/P/typst-test> ls ~/.cache/typst/packages/preview/tablex/
0.0.6/

theo@dev ~/P/typst-test> pandoc --from typst --to docx -o main.docx ./main.typ
"./main.typ" (line 1, column 2):
Could not find package in local packages or cache. Looked in
/home/theo/.local/share/typst/packages/preview/tablex-0.0.6
/home/theo/.cache/typst/packages/preview/tablex-0.0.6
Compile with typst compile to bring the package into your local cache.

theo@dev ~/P/typst-test> pandoc --version
pandoc 3.1.9
Features: +server +lua
Scripting engine: Lua 5.4
User data directory: /home/theo/.local/share/pandoc
Copyright (C) 2006-2023 John MacFarlane. Web: https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.

theo@dev ~/P/typst-test> typst --version
typst 0.9.0 (7bb4f6df)

theo@dev ~/P/typst-test> uname -a
Linux dev.theo-lpt 6.5.11-300.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Nov  8 22:37:57 UTC 2023 x86_64 GNU/Linux

theo@dev ~/P/typst-test> cat /etc/os-release 
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
VERSION_ID=20231105.0.189722
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://bugs.archlinux.org/"
PRIVACY_POLICY_URL="https://terms.archlinux.org/docs/privacy-policy/"
LOGO=archlinux-logo
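
The directory layout described at the top of this issue, with the version as a subdirectory rather than a hyphenated suffix, can be sketched as follows. This is Python for illustration only; `parse_spec` and `package_dir` are hypothetical helpers, not functions from pandoc or typst-hs.

```python
from pathlib import PurePosixPath


def parse_spec(spec):
    """Split '@preview/tablex:0.0.6' into (namespace, name, version)."""
    namespace, rest = spec.lstrip("@").split("/", 1)
    name, version = rest.split(":", 1)
    return namespace, name, version


def package_dir(cache_root, spec):
    """Build <cache_root>/<namespace>/<name>/<version>."""
    namespace, name, version = parse_spec(spec)
    return PurePosixPath(cache_root) / namespace / name / version


print(package_dir("/home/theo/.cache/typst/packages", "@preview/tablex:0.0.6"))
```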

Please consider replacing 'digits' library

Hi @jgm, happy new year!

typst depends on digits, which seems to be unmaintained (the last upload of digits was ~8 years ago, and the code repository doesn't exist anymore).

I noticed this because we are about to package typst for Debian, and we need to package digits as well. This is not a blocker, but I thought it would be good to let you know.

Subscript on composite symbol fails in math mode

I'm testing this with the standalone binary for pandoc 3.1.12.2. Unfortunately I was unable to find what version of typst-hs it is tied to, so I apologize if this has already been fixed.

A typst document of the form $plus_2$ works fine, as does $plus.circle$, while $plus.circle_2$ fails with the complaint

(line 1, column 2):
Symbol does not have a method "circle_2" or Symbol does not have variant "circle_2"

Math parsing issue (factorial)

$ 3! $
"stdin" (line 1, column 5):
unexpected " "
expecting digit, "\"", "\\", "&", "\8730", "none", "auto", "true", "false", "0b", "0x", "0o", ".", "(", "{", "[", "|", "#", "!=", ">=", "<=", "<-", "->", "=>", "<--", "-->", "<==", "==>", "...", "'" or "$"

Some `color` primitives make the conversion fail

These are:

  • desaturate method, fails with Color does not have a method "desaturate" or FieldAccess requires a dictionary
  • color.hsl constructor, fails with Identifier "color" not found
  • color.hsv constructor, fails with Identifier "color" not found
  • color.linear-rgb constructor, fails with Identifier "color" not found
  • .mix method, fails with Color does not have a method "mix" or FieldAccess requires a dictionary
  • oklab constructor, fails with Identifier "oklab" not found
  • oklch constructor, fails with Identifier "oklch" not found
  • rotate method, fails with Color does not have a method "rotate" or FieldAccess requires a dictionary
  • saturate method fails with Color does not have a method "saturate" or FieldAccess requires a dictionary

Typst error: unexpected end of input, expecting new-line

Typst to markdown.

pandoc -s test.typ -o test.md
  • test.typ:
he'*llo Worl*d
  • error:
"test.typ" (line 3, column 1):
unexpected end of input
expecting new-line, "*", "=", "//", "/*", "\\", "_", "$", "-", "+", digit, "/", "http://", "https://", "```", "`", "~", ".", "'", "\"", "<", "@", "#" or "["

or test.typ

l’*exactitude*.

Explain the problem.

French uses many single quotes ('); the bug occurs when converting strong Typst text that sits directly next to a quote.

Pandoc version?

  • pandoc v3.1.8
  • Linux (void-linux compile from source)

Pandoc incorrectly resolves file path referenced by package

#import "@preview/tada:0.1.0"

Converting this fails with

pandoc: typst.toml: withBinaryFile: does not exist (No such file or directory)

This is because tada reads the version number from typst.toml that exists in its folder (https://github.com/typst/packages/blob/main/packages/preview/tada/0.1.0/lib.typ#L23). However, it appears pandoc is trying to read from the working directory, since placing a typst.toml file there resolves the warning. The solution is to ensure filepaths referenced by a package have their root resolve to the package directory.
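
A rough sketch of the proposed resolution rule is given below in Python. It assumes that a path with a leading slash is taken relative to the package root and that any other path is taken relative to the file that references it; `resolve_in_package` and the directory names are hypothetical.

```python
from pathlib import PurePosixPath


def resolve_in_package(package_root, importing_file, requested):
    """Resolve a path requested by a file that lives inside a package."""
    root = PurePosixPath(package_root)
    if requested.startswith("/"):
        # Rooted paths are anchored at the package directory (assumption).
        return root / requested.lstrip("/")
    # Other paths are relative to the directory of the requesting file.
    return PurePosixPath(importing_file).parent / requested


root = "/cache/preview/tada/0.1.0"  # hypothetical package directory
print(resolve_in_package(root, root + "/lib.typ", "typst.toml"))
```

In neither case does the current working directory take part in the lookup.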

Identifier `<x>` not found

Sorry for so many back-to-back issues, haha. I'm converting a somewhat large project to markdown and reporting all the errors that ensue.

#rect(fill: gradient.linear(..color.map.viridis))

Produces the error

Identifier "gradient" not found

error from typst input: unexpected end of input

Explain the problem.
test.typ:

测试文本*加粗*

Run command pandoc test.typ -o test1.md, get:

"test.typ" (line 4, column 1):
unexpected end of input
expecting new-line, "*", "=", "//", "/*", "\\", "_", "$", "-", "+", digit, "/", "http://", "https://", letter or digit, "```", "`", "~", ".", "'", "\"", "<", "@", "#" or "["

The *。 sequence is what causes the issue. Either of the following works around it:

  • replacing "。" with another symbol
  • replacing *...* with #text(weight...)[...]

Pandoc version?
What version of pandoc are you using, on what OS? (If it's not the latest release, please try with the latest release before reporting the issue.)

pandoc -v                  
pandoc 3.2-nightly-2024-05-15
Features: +server +lua
Scripting engine: Lua 5.4
User data directory: /var/home/yan/.local/share/pandoc
Copyright (C) 2006-2024 John MacFarlane. Web: https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.

from docker.io/pandoc/latex:edge

≡ causes issues

Explain the problem.
A Markdown file with the ≡ sign, or the \equiv sign in inline LaTeX, fails to compile with the Typst backend:

[apropos@arch math342]$ pandoc --pdf-engine=typst foo.md -o foo.pdf
error: unknown variable: ident
   ┌─ toPdfViaTempFile113034-0.html:96:3
   │
96 │ ≡ $ident$ $ident$
   │    ^^^^^

Error producing PDF.
[apropos@arch math342]$ cat foo.md
≡
$≡$
$\equiv$

Pandoc version?
3.1.6

Add more tracking of source position to the AST

I'm working on a tool that consumes Typst ASTs (ideally typed) and this package looks amazing and exactly what I was looking for. Thank you for making it available!

Currently, it looks like SourcePos is only used in the Code constructor of the Markup type (in Typst.Syntax).
It would be extremely useful if this was more widely available, being able to tell where in the input each AST node corresponds to.

Unless there is some non-obvious reason for why we cannot do this, here are two proposals of how we could go about doing this:

  1. We could add a SourcePos field to essentially every AST-related constructor (seems like this could be a bit annoying); or

  2. We could add some special constructors that wrap an AST with a span. For example, we would add to the Markup constructor:

    data Markup = ... | Tick SourcePos Markup
    

    (This is very much inspired by what GHC does internally; in fact, I borrowed the name Tick from there. Perhaps MarkupSourcePos would be better here...)

    This is nice, because people who don't care about source position can just ignore this constructor.
    Also, if this causes too much of an increase in memory usage, we can add an argument to parseTypst for whether to emit these Tick constructors.

    Finally, there would be the matter of whether to keep the single already existing SourcePos field or to just reconstruct it via these Ticks. I don't have any strong opinions on this, but I imagine that a better awareness of how this API is currently used in Pandoc may be helpful in making this judgment 😄.
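
The second proposal can be illustrated with a small model, written in Python for brevity rather than Haskell. `Text` and `Strong` are hypothetical stand-ins for real AST constructors, and `strip_ticks` is a made-up helper showing how a consumer that does not care about positions could discard the wrappers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourcePos:
    name: str
    line: int
    column: int


@dataclass(frozen=True)
class Text:
    content: str


@dataclass(frozen=True)
class Strong:
    children: tuple


@dataclass(frozen=True)
class Tick:
    """Wrapper that attaches a source position to any node."""
    pos: SourcePos
    node: object


def strip_ticks(node):
    """Drop every Tick wrapper, for consumers that ignore positions."""
    if isinstance(node, Tick):
        return strip_ticks(node.node)
    if isinstance(node, Strong):
        return Strong(tuple(strip_ticks(child) for child in node.children))
    return node


tree = Tick(
    SourcePos("stdin", 1, 1),
    Strong((Tick(SourcePos("stdin", 1, 2), Text("hi")),)),
)
print(strip_ticks(tree))
```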

Parse error with nested equations

$ #let g = $3$
#g $
(line 1, column 15):
unexpected "\n"
expecting digit, "\"", "\\", "&", "\8730", "none", "auto", "true", "false", "0b", "0x", "0o", ".", "(", "{", "[", "|", "#", "!=", ">=", "<=", "<-", "->", "=>", "<--", "-->", "<==", "==>", "...", "'" or "$"

Colon for declaring dict literal should always be allowed

In regards to #43,

A leading colon in dict literals should always be allowed (even if it isn't always required). Following are examples that Typst accepts but not typst-hs:

#let b = (:b:2)
#let c = ( : c : 3)
#let d = (:..a)
#let e = ( : ..b)

The grammar should be updated accordingly:

-- dict-expr ::= '(' (':' | ':'? (pair (',' pair)* ','?)) ')'
-- pair ::= (ident | str) ':' expr

I am working on this as we speak
