jgm / djot.lua
Lua parser for the djot light markup language
License: MIT License
A single stray space before an attribute block:
...
... some text {.class}
Yields a warning:
{
message = "Ignoring unattached attribute",
pos = 30
}
Where pos is the byte position.
When the sourcepos map is enabled, it would be handy to have the usual line:col:charpos in warnings, as we already have on nodes, for easier investigation by the typist.
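As a rough illustration of what is being asked for, a byte position can be turned into a line and column by counting the newlines that precede it. This is only a sketch: pos_to_linecol is a hypothetical helper, not part of djot.lua, and the column it returns counts bytes rather than characters.

```lua
-- Hypothetical helper (not part of djot.lua): map a 1-based byte position
-- to a 1-based line number and a 1-based byte column.
local function pos_to_linecol(input, pos)
  local line, line_start = 1, 1
  -- "()" is a position capture: it yields the byte index of each newline.
  for nl in input:gmatch("()\n") do
    if nl >= pos then break end
    line = line + 1
    line_start = nl + 1
  end
  return line, pos - line_start + 1
end

local input = "ab\ncd {.x}"
local line, col = pos_to_linecol(input, 7)
print(line, col)
```

A warning handler could call such a helper on the pos field to print line:col:charpos alongside the message.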
I've found a couple of cases where I'd love to ask djot to parse only inline elements. This might be an XY problem, so let me describe specific use-cases.
First is the formatting in code blocks:
```
dtw[i, j] =
min(dtw[i-1, j-1], dtw[i, j-1], [i-1, j]) + (xs[i] - ys[i])^2^
```
Here, I'd love to render this as a verbatim block, except for the superscript 2.
Second is formatting in the attributes:
{caption="Code with _inline markup_"}
```adoc
[source,rust,subs="+quotes"]
----
let x = 1;
let r: &i32;
{
let y = 2;
r = [.hl-error]##&y##; // borrowed value does not live long enough
}
println!("{}", *r);
----
```
Here, as we don't have dedicated syntax for captions yet, I want to use the caption= attribute, and the value of that attribute is a djot inline.
The way asciidoctor solves this is by letting attributes control the parser's behavior (amusingly, the example above shows asciidoctor's own use of this feature: +quotes enables processing of some inlines for the following code block).
I don't like this solution: it feeds semantic information back into the parsing, which is a the-lexer-hack-shaped layering violation.
What I'd love to do here is to parse this as a normal string-valued attribute/verbatim block, and then leave it to the conversion layer to recursively feed those strings to djot for processing. And that's why I think I want an API to parse only inlines! Implementation-wise, I can of course just manually call the relevant Lua function, but I think making it accessible via the CLI would signal that this is an official way to use djot, and something that the spec and other impls should support.
Learning from the asciidoctor experience, which has a lot of toggles to control what exactly is parsed, I think we also might want this to be more fine-grained than just inlines. "Just inlines" would work perfectly for the "caption as an attribute" use case, but for code blocks I'd probably want to disable _ and *, as those are very likely to need escaping.
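The layering described above could look roughly like the following. This is a sketch under stated assumptions: parse_inlines is a hypothetical callback standing in for whatever inline-only entry point djot would expose, the node.attr.caption shape is assumed for illustration, and toy_inlines is a stand-in that only understands _emphasis_.

```lua
-- Sketch only: the block parser leaves the caption as a plain string, and
-- the conversion layer decides to feed it back through an inline parser.
-- `parse_inlines` is a hypothetical callback, not an existing djot.lua API.
local function render_caption(node, parse_inlines)
  local caption = node.attr and node.attr.caption
  if not caption then return nil end
  return "<figcaption>" .. parse_inlines(caption) .. "</figcaption>"
end

-- Stand-in inline parser for the demo: handles only _emphasis_.
local function toy_inlines(s)
  -- Parentheses drop gsub's second return value (the match count).
  return (s:gsub("_(.-)_", "<em>%1</em>"))
end

-- Assumed node shape, for illustration only.
local node = { attr = { caption = "Code with _inline markup_" } }
local html = render_caption(node, toy_inlines)
print(html)
```

The parser itself never learns that caption is special; only the converter does, which avoids feeding semantic information back into parsing.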
We should now be able to supply self-sufficient binaries for linux, windows, macos, since we're building these in CI.
In principle we could also provide the compiled wasm, but it would be good to package this up in a node module or something.
I don't know Lua very well, but I've been using this library to make a plugin, and it errors out because of this line, which appears in a few places:
local unpack = unpack or table.unpack
It works if you change it to
local unpack = table.unpack or unpack
I'm guessing that's because of short-circuit evaluation: table.unpack is found first, so the global unpack is never looked up.
If that's an acceptable change, I can open a PR for it.
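One possible variant, assuming the plugin host raises an error whenever an undefined global is read (the report does not say what the host actually does), is to look up the legacy global with rawget so that no metamethod on _G is triggered:

```lua
-- Assumption: the host may error on reads of undefined globals.
-- table.unpack exists in Lua 5.2+; the global unpack exists in Lua 5.1.
-- rawget bypasses any __index metamethod installed on _G.
local unpack = table.unpack or rawget(_G, "unpack")

local a, b, c = unpack({ 1, 2, 3 })
print(a, b, c)
```

This picks the same function as the proposed change on Lua 5.2 and later, and still falls back cleanly on Lua 5.1.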
When doing:
$ djot --ast --json my.dj |pandoc -f json -s -t html -o my.html
JSON parse error: Error in $: JSON missing pandoc-api-version.
@jgm,
I tried to add a class to a complete blockquote, such as in the following sample:
_a_{.dz}
> {.a}
>
> a
>
> a
>
> a
>
> a
But this only applies the class to the first paragraph, as an attribute exclusive to it.
Is this really a feature, or is it a bug?
Many thanks for your help.
- a
- b
yields
doc
  list (1:2:2-4:1:11) list_style="-" tight="true"
    list_item (1:2:2-2:2:7)
      para (1:4:4-2:1:6)
        str (1:4:4-1:4:4) text="a"
    list_item (2:2:7-4:1:11)
      para (2:4:9-2:5:10)
        str (2:4:9-2:4:9) text="b"
references = {
}
footnotes = {
}
Ideally the first list item would end at 2:1 or 1:4 rather than 2:2.
djot 0.2.0-1 depends on lua >= 5.1 (5.4-1 provided by VM)
djot 0.2.0-1 is now installed in /usr (license: MIT)
Sample:
This is a test
- This is a single list item
Output:
$ pandoc -f djot-reader.lua -t html test.dj
Error running Lua:
djot-reader.lua:223: bad argument #1 to 'gsub' (string expected, got nil)
stack traceback:
djot-reader.lua:223: in function <djot-reader.lua:208>
(...tail calls...)
djot-reader.lua:98: in method 'render_children'
djot-reader.lua:111: in function <djot-reader.lua:108>
(...tail calls...)
A variety of other markup (headings, quotes, blockquotes, most inline markup, divs) causes no problems; lists such as the sample above are among the items that generate errors with djot-reader.lua.
(I haven't tried everything; most of the issues appear to be with blocks, but not all block formatting has issues.)
Hi.
djot-reader.lua is not able to process tables. (See the output two paragraphs below.)
The problem seems to be exclusively in djot-reader.lua, given that lua bin/main.lua properly renders the HTML of the table, and that the AST representation produced by djot.lua is almost identical to the one from djot.net/playground (the TypeScript version).
$ pandoc --version
pandoc 3.1.2
Features: +server +lua
Scripting engine: Lua 5.4
User data directory: /home/user/.local/share/pandoc
Copyright (C) 2006-2023 John MacFarlane. Web: https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.
$ GIT_PAGER=cat git log --oneline -- djot-reader.lua
c0109d4 Initial commit (split from jgm/djot)
$ GIT_PAGER=cat git log --oneline -1
5970f1c (HEAD -> main, tag: 0.2.1, origin/main, origin/HEAD) Bump to 0.2.1.
$ cat test.dj
| Header 1 | Header 2 |
|---|---|
| Row 1.1 | Row 1.2 |
| Row 2.1 | Row 2.2 |
$ pandoc -f djot-reader.lua -t html test.dj -o test.html
Error running Lua:
table expected, got nil
while retrieving list
while retrieving function argument align
while retrieving arguments for function SimpleTable
stack traceback:
djot-reader.lua:201: in function <djot-reader.lua:164>
(...tail calls...)
djot-reader.lua:98: in method 'render_children'
djot-reader.lua:111: in function <djot-reader.lua:108>
(...tail calls...)
$ lua bin/main.lua test.dj
<table>
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>Row 1.1</td>
<td>Row 1.2</td>
</tr>
<tr>
<td>Row 2.1</td>
<td>Row 2.2</td>
</tr>
</table>
I tried to quickly look into the origin of the problem (see djot-reader-logging.diff and pandoc-djot-reader-stderr.txt in djot-reader-bug-report.zip), and I have the impression that the parameter (node) sent to Renderer:table is not well-formed.
That is as far as I managed to get by myself given my limited knowledge of Lua, but let me know whether I can support you further.
As a final remark, I insist on djot.lua instead of djot.js because of limitations in my environment: since djot.js is not distributed as a binary (as Pandoc is), it is not an option for me there.
Thanks in advance.
djot-reader-logging.diff and pandoc-djot-reader-stderr.txt were extracted with the commands below:
$ git diff > djot-reader-logging.diff
$ pandoc --verbose -f djot-reader.lua -t html test.dj 1>/dev/null 2> pandoc-djot-reader-stderr.txt
When converting Markdown to Djot, the writer does not respect the argument --wrap=none.
Here is the command used:
pandoc -f Markdown mydoc.md -t djot-writer.lua --wrap=none -o mydoc.dj
Currently we don't emit match objects for lists, only list items. List items are gathered together into lists in the AST building phase. I dimly recall there was some reason for this, but it would simplify AST building, and make the match stream more useful, if the work were done at the earlier stage.
Line 17 in a0583ef
It is probably more efficient to use "\\%0" as the replacement rather than a function. In a replacement string, %0 gets replaced with the whole match, %1 with the first capture group, and so on up to %9.
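A small comparison of the two forms, as a sketch: the referenced line is not reproduced here, so the character class [*_] and the sample text are assumptions chosen for illustration.

```lua
-- Assumed pattern and input; the original line 17 is not shown above.
local text = "a*b_c"

-- Replacement string: "\\%0" is a literal backslash followed by the whole match.
local with_string = text:gsub("[*_]", "\\%0")

-- Equivalent replacement function, called once per match.
local with_function = text:gsub("[*_]", function(m) return "\\" .. m end)

print(with_string)
print(with_function)
```

Both calls produce the same escaped string; the string form avoids one Lua function call per match.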