lunamark's Introduction

Lunamark

Lunamark is a lua library and command-line program for conversion of markdown to other textual formats. Currently HTML, dzslides (HTML5 slides), Docbook, ConTeXt, LaTeX, and Groff man are the supported output formats, but it is easy to add new writers or modify existing ones. The markdown parser is written using a PEG grammar and can also be modified by the user.

The library is as portable as lua and has very good performance. It is roughly as fast as the author's own C library peg-markdown, two orders of magnitude faster than Markdown.pl, and three orders of magnitude faster than markdown.lua.

Extensions

Lunamark's markdown parser currently supports a number of extensions (which can be turned on or off individually), including:

  • Smart typography (fancy quotes, dashes, ellipses)
  • Significant start numbers in ordered lists
  • Footnotes (both regular and inline)
  • Definition lists
  • Pandoc-style title blocks
  • Pandoc-style citations
  • Fenced code blocks
  • Flexible metadata using lua declarations

See the lunamark(1) man page for a complete list.

It is very easy to extend the library by modifying the writers, adding new writers, and even modifying the markdown parser. Some simple examples are given in the API documentation.

Benchmarks

Generated with

PROG=$program make bench

This converts the input files from the original markdown test suite concatenated together 25 times.

     0.04s   sundown
     0.15s   discount
->   0.56s   lunamark + luajit
     0.80s   peg-markdown
->   0.97s   lunamark
     4.05s   PHP Markdown
     6.11s   pandoc
   113.13s   Markdown.pl
  2322.33s   markdown.lua
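
As a rough check on the speed claims in the introduction, the ratios can be recomputed from the table. The Lua sketch below only redoes the arithmetic on the timings listed above; the helper names `slowdown` and `magnitude` are made up for this illustration and are not part of lunamark.

```lua
-- Timings in seconds, copied from the benchmark table above.
local seconds = {
  lunamark = 0.97,
  ["peg-markdown"] = 0.80,
  ["Markdown.pl"] = 113.13,
  ["markdown.lua"] = 2322.33,
}

-- How many times slower `other` is than plain lunamark (hypothetical helper).
local function slowdown(other)
  return seconds[other] / seconds.lunamark
end

-- Nearest base-10 order of magnitude of a ratio (hypothetical helper).
-- math.log(x) / math.log(10) is used so the code runs on Lua 5.1 through 5.4.
local function magnitude(ratio)
  return math.floor(math.log(ratio) / math.log(10) + 0.5)
end

for _, name in ipairs({ "peg-markdown", "Markdown.pl", "markdown.lua" }) do
  local ratio = slowdown(name)
  print(string.format("%-14s %8.1fx  (about 10^%d)", name, ratio, magnitude(ratio)))
end
```

On these numbers, peg-markdown lands within the same order of magnitude as lunamark, Markdown.pl about two orders slower, and markdown.lua about three, which agrees with the introduction.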

Installing

If you want a standalone version of lunamark that doesn't depend on lua or other lua modules being installed on your system, just do

make standalone

Your executable will be created in the standalone directory.

If you are a lua user, you will probably prefer to install lunamark using luarocks. You can install the latest development version this way:

git clone http://github.com/jgm/lunamark.git
cd lunamark
luarocks make

Released versions will be uploaded to the luarocks repository, so you should be able to install them using:

luarocks install lunamark

There may be a short delay between the release and the luarocks upload.

Using the library

Simple usage example:

local lunamark = require("lunamark")
local opts = { }
local writer = lunamark.writer.html.new(opts)
local parse = lunamark.reader.markdown.new(writer, opts)
print(parse("Here's my *text*"))

For more examples, see API documentation.

lunamark

The lunamark executable allows easy markdown conversion from the command line. For usage instructions, see the lunamark(1) man page.

lunadoc

Lunamark comes with a simple lua library documentation tool, lunadoc. For usage instructions, see the lunadoc(1) man page. lunadoc reads source files and parses specially marked markdown comment blocks. Here is an example of the result.

Tests

The source directory contains a large test suite in tests. This includes existing Markdown and PHP Markdown tests, plus more tests for lunamark-specific features and additional corner cases.

To run the tests, use bin/shtest.

bin/shtest --help            # get usage
bin/shtest                   # run all tests
bin/shtest indent            # run all tests matching "indent"
bin/shtest -p Markdown.pl -t # run all tests using Markdown.pl, and normalize using 'tidy'

Lunamark currently fails four of the PHP Markdown tests:

  • tests/PHP_Markdown/Quotes in attributes.test: The HTML is semantically equivalent; using the -t/--tidy option to bin/shtest makes the test pass.

  • tests/PHP_Markdown/Email auto links.test: The HTML is semantically equivalent. PHP markdown does entity obfuscation, and lunamark does not. This feature could be added easily enough, but the test would still fail, because the obfuscation involves randomness. Again, using the -t/--tidy option makes the test pass.

  • tests/PHP_Markdown/Ins & del.test: PHP markdown puts extra <p> tags around <ins>hello</ins>, while lunamark does not. It's hard to tell from the markdown spec which behavior is correct.

  • tests/PHP_Markdown/Emphasis.test: A bunch of corner cases with nested strong and emphasized text. These corner cases are left undecided by the markdown spec, so in my view the PHP test suite is not normative here; I think lunamark's behavior is perfectly reasonable, and I see no reason to change.

The make test target only runs the Markdown and lunamark tests, skipping the PHP Markdown tests.

Authors

lunamark is released under the MIT license.

Most of the library is written by John MacFarlane. Hans Hagen made some major performance improvements. Khaled Hosny added the original ConTeXt writer.

The dzslides HTML, CSS, and javascript code is by Paul Rouget, released under the DWTFYWT Public License.

lunamark's People

Contributors

daurnimator, jgm, khaledhosny, omikhleia, tarleb, tst2005, witiko

lunamark's Issues

Error on generating context, latex and man

There is a problem on my computer with generating context, latex and man output (html and html5 work fine).

sh-3.2# lunamark --to context urk.txt
/usr/local/bin/lua: /usr/local/share/lua/5.1/lunamark/writer/context.lua:101: bad argument #3 to 'format' (string expected, got table)
stack traceback:
[C]: in function 'format'
/usr/local/share/lua/5.1/lunamark/writer/context.lua:101: in function </usr/local/share/lua/5.1/lunamark/writer/context.lua:86>
[C]: in function 'lpegmatch'
...usr/local/share/lua/5.1/lunamark/reader/markdown.lua:122: in function 'parse_blocks'
...usr/local/share/lua/5.1/lunamark/reader/markdown.lua:1001: in function 'parse'
...local/lib/luarocks/rocks/lunamark/0.2-1/bin/lunamark:425: in main chunk
[C]: ?

If urk.txt contains the text:

This paragraph has two sentences.

lunamark runs without errors and output is equal to input.

When this text is changed to

This paragraph has two sentences

the error occurs.

Can you help?

Best, Roelof Langman

I'm unable to build it through mingw (on windows)

Hello,

I'm by no means competent in either Lua or C, so please bear that in mind when reading this report.
The first error I hit (output below) concerns the main.squished.lua file, which I'm not able to understand.

Thank you for your software.

Here is the output of "make standalone":

make -C standalone
make[1]: Entering directory `/C/Users/laurent/Downloads/_Gitted/lunamark/standalone'
(cd src && lua ../squish.lua)
C:\Msys_1.0\bin\lua.exe: ../squish.lua:219: attempt to index field 'path' (a nil value)
stack traceback:
../squish.lua:219: in main chunk
[C]: ?
Writing ../main.squished.lua...
Couldn't resolve module: lunamark.util
Couldn't resolve module: lunamark.entities
Couldn't resolve module: lunamark.writer
Couldn't resolve module: lunamark.writer.generic
Couldn't resolve module: lunamark.writer.xml
Couldn't resolve module: lunamark.writer.docbook
Couldn't resolve module: lunamark.writer.html
Couldn't resolve module: lunamark.writer.html5
Couldn't resolve module: lunamark.writer.dzslides
Couldn't resolve module: lunamark.writer.tex
Couldn't resolve module: lunamark.writer.latex
Couldn't resolve module: lunamark.writer.context
Couldn't resolve module: lunamark.writer.groff
Couldn't resolve module: lunamark.writer.man
Couldn't resolve module: lunamark.reader
Couldn't resolve module: lunamark.reader.markdown
make[1]: *** [main.squished.lua] Error 1
make[1]: Leaving directory `/C/Users/laurent/Downloads/_Gitted/lunamark/standalone'
make: *** [standalone] Error 2

Support tex_math_dollars extension

The title says it all: with a Pandoc-like tex_math_dollars extension, one could write $ math formula $ or $$ math formula $$.

EDIT: I mean here, to clarify, minimally, the reader (parsing) support for it, and the basic generic writer handling methods. Transforming the input in some appropriate format in the writers (e.g. MathML in HTML ?) could be left to other implementers.
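
To make the reader-side request concrete, here is a minimal pure-Lua sketch of the inline delimiter rule as the Pandoc manual describes it for tex_math_dollars: the opening $ must be followed immediately by a non-space character, and the closing $ must be preceded by a non-space character and not followed by a digit. This is not lunamark code and does not use lpeg; the function name `find_inline_math` is hypothetical, and display math ($$ ... $$) and backslash escapes are deliberately left out.

```lua
-- Collect the bodies of inline math spans in `s` (hypothetical helper).
local function find_inline_math(s)
  local found, pos = {}, 1
  while true do
    -- Candidate span: a dollar sign, some non-dollar text, a dollar sign.
    local open, close, body = s:find("%$([^%$]+)%$", pos)
    if not open then break end
    local next_char = s:sub(close + 1, close + 1)
    if body:match("^%S") and body:match("%S$") and not next_char:match("%d") then
      found[#found + 1] = body
      pos = close + 1
    else
      -- Not math: retry from the character after the rejected opening "$".
      pos = open + 1
    end
  end
  return found
end

for _, sample in ipairs({
  "Euler: $e^{i\\pi}+1=0$ and $x$.",
  "$20,000 and $30,000",
}) do
  print(#find_inline_math(sample), sample)
end
```

Under this rule the currency example yields no math spans, which is the behavior the Pandoc manual calls out.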

inline <style> element results in parse error

example markdown contents:

a

<style>

  img.grid2 {
    width: 370px;
    height: auto;
  }
  img.grid3 {
    width: 244px;
    height: auto;
  }
  img.grid2 + img.grid2 {
    margin-left: 4px;
  }
  img.grid3 + img.grid3 {
    margin-left: 4px;
  }

</style>

hello

error:

lua: .../vi/.luarocks/share/lua/5.4/lunamark/reader/markdown.lua:196: parse_blocks failed on:
a

<style>

  im
stack traceback:
	[C]: in function 'error'
	.../vi/.luarocks/share/lua/5.4/lunamark/reader/markdown.lua:196: in upvalue 'parse_blocks'
	.../vi/.luarocks/share/lua/5.4/lunamark/reader/markdown.lua:1193: in upvalue 'md_to_html'

version info:

> luarocks --local --lua-version 5.4 show lunamark

lunamark 0.5.0-1 - General markup format converter using lpeg.

alter_syntax: more parameters

A feature enhancement suggestion:

When changing or extending the markdown syntax with the help of alter_syntax, one might need parse_blocks in addition to the syntax parameter.

A simple change in a line does it:

syntax = options.alter_syntax(syntax, parse_blocks)

Although it's easy to pass writer to the new parser, one might even go further with the following, for convenience:

syntax = options.alter_syntax(syntax, parse_blocks, writer)

I've successfully used this second version in a personal project.

However, I don't have the "whole picture" like you people, so don't know if the above would break something somewhere I haven't looked.

Support fenced divs with implicit class names and trailing colons

Pandoc's documentation for fenced divs includes the following example:

::: Warning ::::::
This is a warning.

::: Danger
This is a warning within a warning.
:::
::::::::::::::::::

Pandoc produces the following HTML output for the above input:

<div class="Warning">
<p>This is a warning.</p>
<div class="Danger">
<p>This is a warning within a warning.</p>
</div>
</div>

By contrast, Lunamark does not support implicit classnames (::: Danger) and trailing colons (::: Warning ::::::) and produces the following output for the above input when the fenced_divs extension is enabled:

<p>::: Warning :::::: This is a warning.</p>

<p>::: Danger This is a warning within a warning. ::: ::::::::::::::::::</p>
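
The two missing features come down to how a fence line is classified. Below is a small pure-Lua sketch of that classification, following the Pandoc manual's description (at least three colons, then attributes in braces or a single unbraced word, then optional trailing colons; a line of only colons closes a div). It is an illustration with Lua string patterns rather than lunamark's lpeg grammar, `classify_fence` is a hypothetical name, and indented fences are not considered.

```lua
-- Returns "open" plus the attribute text, "close", or nil for a non-fence line.
local function classify_fence(line)
  -- At least three colons, then whatever follows with outer spaces trimmed.
  local rest = line:match("^:::+%s*(.-)%s*$")
  if not rest then return nil end
  if rest == "" then return "close" end
  -- Drop an optional run of trailing colons ("::: Warning ::::::").
  local attrs = rest:match("^(.-)%s*:*$")
  if attrs:match("^{.*}$") then return "open", attrs end      -- {.class key=value}
  if attrs:match("^[^%s{}]+$") then return "open", attrs end  -- implicit class name
  return nil
end

local depth = 0
for _, line in ipairs({
  "::: Warning ::::::",
  "This is a warning.",
  "::: Danger",
  "This is a warning within a warning.",
  ":::",
  "::::::::::::::::::",
}) do
  local kind = classify_fence(line)
  if kind == "open" then depth = depth + 1 end
  print(depth, kind or "text", line)
  if kind == "close" then depth = depth - 1 end
end
```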

Plain, Paragraph and parse_blocks

Not sure if this is intended or a bug.

If a multiblock document ends with a text line (only Inlines) not followed by a new line, that last line is interpreted as a Plain, not a Paragraph.

More generally, here are some results when applying parse_blocks to the following strings:

s1 = "A\n\nB"     -- A will be a Para, B will be a Plain
s2 = "A\n\nB\n"   -- A will be a Para, B will be a Plain
s3 = "A\n\nB\n\n" -- A will be a Para, B will be a Para

It seems to me that B should be a Para for s1 and s2, like in s3.

If however this is as intended, please disregard.

Content slicing, table support, and miscellaneous fixes

The witiko/markdown package has recently seen the introduction of several new options and syntax extensions sponsored by David Vins and Omedym, which may be worth porting back to jgm/lunamark:

There have also been several bugfixes in witiko/markdown, which apply to jgm/lunamark:

  • the removal of an unreachable branch of the parsers.line parser (see 5653f51), and
  • the completion of support for named HTML5 entities (see 4d24fc3).

Let me know which patches you would like to backport, and if you would like me to do the backporting. Any other comments are also appreciated.

sample code on front page doesn't work

The sample on the front page doesn't work as is, because the function returns a parser (a table) rather than something callable (a function or a callable table).

I added the following to my markdown_parser.lua:

-- Metatable that makes the parser table callable: parser(inp) == parser.to_string(inp)
local mt_parser = {
  __call = function(self, inp)
    return self.to_string(inp)
  end
}

return setmetatable({
  writer = writer,
  options = options,
  parse = parse,
  to_string = function(inp) return util.to_string(parse(inp)) end,
  write = function(f, inp) return util.to_file(f, parse(inp)) end
}, mt_parser)

this way the parser can be called as if it were a function.
Anyways great job, it was just sort of a surprise the sample didn't work.

Create download archive

Hi,

It would be very helpful for packagers to have predictable release archive downloads.

You can do that by:

git archive --format tar.gz --prefix lunamark-0.2/ -o ../lunamark-0.2.tar.gz 0.2

and uploading the result to the Downloads section on GitHub.

Thanks!

Support Djot-like symbols?

Djot has this concept of a symbol, as a word surrounded by colons (:symb:)... which could be an emoji or something else, it's currently left to the decision of the renderer...

Heh, perhaps we could have this in Lunamark too, behind a djot_symbols extension?

(Full disclosure: I am seriously considering toying with it in rather arcane and likely unintended ways in my Djot support for SILE, just 'cause I kind of like the idea of variable substitution without external preprocessing)
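
For illustration, the variable-substitution idea can be sketched in a few lines of plain Lua. This is not a proposal for how the reader should parse symbols; it is a toy string-level expansion, under the assumption that a symbol name consists of ASCII alphanumerics plus `_`, `+` and `-`. The function name `expand_symbols` and the lookup table are hypothetical.

```lua
-- Replace :name: with symbols[name]; unknown symbols are left untouched.
local function expand_symbols(text, symbols)
  local expanded = text:gsub(":([%w_+%-]+):", function(name)
    -- Returning nil tells gsub to keep the original match.
    return symbols[name]
  end)
  return expanded
end

print(expand_symbols("Hello :name:, it is 10:30 :unknown:", { name = "World" }))
```

A real djot_symbols extension would presumably hand the symbol name to the writer instead, leaving the rendering decision (emoji, substitution, or something else) to it.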

Discrepancy on option (require_)blank_before_fenced_code_block

  • require_blank_before_fenced_code_block in lunamark/bin/lunamark (both in help text and in the initial option table)
  • require_blank_before_fenced_code_block too in the comment block in lunamark/lunamark/reader/markdown.lua...

... but later there the code uses options.blank_before_fenced_code_blocks

Smart quotes sometimes too aggressive

Running with the smart extension enabled.

It's clear it shouldn't happen. Aujourd'hui n'est pas demain.

With Pandoc: <p>It’s clear it shouldn’t happen. Aujourd’hui n’est pas demain.</p>
(with right quotes = apostrophes everywhere)

But Lunamark generates <p>It‘s clear it shouldn’t happen. Aujourd‘hui n’est pas demain.</p>
(with left quotes in "It's" and "Aujourd'hui")
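
The expected behavior can be stated as a rule: a single quote with a letter on both sides is an apostrophe and should become a right quote. Here is a minimal pure-Lua sketch of that rule alone; it is not lunamark's smart-typography code (which is written with lpeg), the name `apostrophes` is hypothetical, and `%a` only recognizes ASCII letters in the default C locale, so accented neighbors are not covered.

```lua
local RSQUO = "\226\128\153" -- U+2019 RIGHT SINGLE QUOTATION MARK, UTF-8 encoded

-- Turn every letter'letter quote into a right single quote.
-- The frontier pattern %f[%a] checks that a letter follows without consuming it,
-- so adjacent cases such as "a'b'c" are all handled.
local function apostrophes(text)
  local fixed = text:gsub("(%a)'%f[%a]", "%1" .. RSQUO)
  return fixed
end

print(apostrophes("It's clear it shouldn't happen. Aujourd'hui n'est pas demain."))
print(apostrophes("She said 'hello' to us."))
```

Quotes at word boundaries, as in the second sample, are left alone by this rule and still need the usual opening/closing logic.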

4 tests failing

This is known (it's in the README: https://github.com/jgm/lunamark#tests) but perhaps we shouldn't run these tests by default?

Running tests with lua 5.1, lpeg 0.10, git HEAD:

[OK]     tests/Markdown_1.0.3/Amps and angle encoding.test
[OK]     tests/Markdown_1.0.3/Auto links.test
[OK]     tests/Markdown_1.0.3/Backslash escapes.test
[OK]     tests/Markdown_1.0.3/Blockquotes with code blocks.test
[OK]     tests/Markdown_1.0.3/Code Blocks.test
[OK]     tests/Markdown_1.0.3/Code Spans.test
[OK]     tests/Markdown_1.0.3/Hard-wrapped paragraphs with list-like lines.test
[OK]     tests/Markdown_1.0.3/Horizontal rules.test
[OK]     tests/Markdown_1.0.3/Inline HTML (Advanced).test
[OK]     tests/Markdown_1.0.3/Inline HTML (Simple).test
[OK]     tests/Markdown_1.0.3/Inline HTML comments.test
[OK]     tests/Markdown_1.0.3/Links, inline style.test
[OK]     tests/Markdown_1.0.3/Links, reference style.test
[OK]     tests/Markdown_1.0.3/Links, shortcut references.test
[OK]     tests/Markdown_1.0.3/Literal quotes in titles.test
[OK]     tests/Markdown_1.0.3/Markdown Documentation - Basics.test
[OK]     tests/Markdown_1.0.3/Markdown Documentation - Syntax.test
[OK]     tests/Markdown_1.0.3/Nested blockquotes.test
[OK]     tests/Markdown_1.0.3/Ordered and unordered lists.test
[OK]     tests/Markdown_1.0.3/Strong and em together.test
[OK]     tests/Markdown_1.0.3/Tabs.test
[OK]     tests/Markdown_1.0.3/Tidyness.test
[OK]     tests/PHP_Markdown/Auto Links.test
[OK]     tests/PHP_Markdown/Backslash escapes.test
[OK]     tests/PHP_Markdown/Code Spans.test
[OK]     tests/PHP_Markdown/Code block in a list item.test
[OK]     tests/PHP_Markdown/Code block on second line.test
[FAILED] tests/PHP_Markdown/Email auto links.test
--- expected
+++ actual
- <p><a href="&#109;&#x61;&#105;&#x6c;&#116;&#x6f;&#58;&#x6d;&#105;&#x63;&#104;&#x65;&#108;&#x2e;&#102;&#x6f;&#114;&#x74;&#105;&#x6e;&#64;&#x6d;&#105;&#x63;&#104;&#x65;&#108;&#x66;&#46;&#x63;&#111;&#x6d;">&#x6d;&#105;&#x63;&#104;&#x65;&#108;&#x2e;&#102;&#x6f;&#114;&#x74;&#105;&#x6e;&#64;&#x6d;&#105;&#x63;&#104;&#x65;&#108;&#x66;&#46;&#x63;&#111;&#x6d;</a></p>
+ <p><a href="mailto:michel.fortin@michelf.com">michel.fortin@michelf.com</a></p>

- <p>International domain names: <a href="&#x6d;&#97;&#105;&#x6c;&#x74;&#111;&#58;&#x68;&#x65;&#108;&#112;&#x40;&#x74;ū&#x64;&#x61;&#108;&#105;ņ&#46;&#108;&#x76;">&#x68;&#x65;&#108;&#112;&#x40;&#x74;ū&#x64;&#x61;&#108;&#105;ņ&#46;&#108;&#x76;</a></p>
+ <p>International domain names: <a href="mailto:help@tūdaliņ.lv">help@tūdaliņ.lv</a></p>
[FAILED] tests/PHP_Markdown/Emphasis.test
expected / actual
<p>Combined emphasis:</p>

<ol>
<li><strong><em>test test</em></strong></li>
<li><strong><em>test test</em></strong></li>
<li><em>test <strong>test</strong></em></li>
<li><strong>test <em>test</em></strong></li>
<li><strong><em>test</em> test</strong></li>
<li><em><strong>test</strong> test</em></li>
<li><strong><em>test</em> test</strong></li>
<li><strong>test <em>test</em></strong></li>
<li><em>test <strong>test</strong></em></li>
<li><em>test <strong>test</strong></em></li>
<li><strong>test <em>test</em></strong></li>
<li><strong><em>test</em> test</strong></li>
<li><em><strong>test</strong> test</em></li>
<li><strong><em>test</em> test</strong></li>
<li><strong>test <em>test</em></strong></li>
<li><em>test <strong>test</strong></em></li>
</ol>

<p>Incorrect nesting:</p>

<ol>
<li>*test  <strong>test* <strong>test* test</strong></li>
<li>_test  <strong>test_  test</strong></li>
<li><strong>test<li><em><em>test  *test</strong><em>test</em></em> test*</li>test</em></li>
<li><strong>test <li><em><em>test _test</strong><em>test</em></em> test_</li>test</em></li>
<li><em>test <em>test</em>  *test</em>  test*</li>test</em></li>
<li><em>test <em>test</em>  _test</em>  test_</li>test</em></li>
<li><strong>test **test</strong><strong>test</strong> test**</li>test</strong></li>
<li><strong>test __test</strong><strong>test</strong> test__</li>test</strong></li>
</ol>

<p>No emphasis:</p>

<ol>
<li>test*  test  *test</li>
<li>test**<li>test<em>* test **test</li>*</em>test</li>
<li>test_ <li>test_ test  _test</li>
<li>test__<li>test<em>_ test __test</li>_</em>test</li>
</ol>

<p>Middle-word emphasis (asterisks):</p>

<ol>
<li><em>a</em>b</li>
<li>a<em>b</em></li>
<li>a<em>b</em>c</li>
<li><strong>a</strong>b</li>
<li>a<strong>b</strong></li>
<li>a<strong>b</strong>c</li>
</ol>

<p>Middle-word emphasis (underscore):</p>

<ol>
<li><em>a</em>b</li>
<li>a<em>b</em></li>
<li>a<em>b</em>c</li>
<li><strong>a</strong>b</li>
<li>a<strong>b</strong></li>
<li>a<strong>b</strong>c</li>
</ol>

<p>my<em>precious</em>file.txt</p>

<h2>Tricky Cases</h2>

<p>E**. <strong>Test</strong> TestTestTest</p>

<p>E**. <strong>Test</strong> Test Test Test</p>


<h2>Overlong emphasis</h2>

<p>Name: ____________<br____________<br/>Organization: />
Organization: ____<br />
Region/Country:____<br/>Region/Country: __</p>

<p>_____Cut here_____</p>

<p>____Cut here____</p>
[OK]     tests/PHP_Markdown/Empty List Item.test
[OK]     tests/PHP_Markdown/Headers.test
[OK]     tests/PHP_Markdown/Horizontal Rules.test
[OK]     tests/PHP_Markdown/Inline HTML (Simple).test
[OK]     tests/PHP_Markdown/Inline HTML (Span).test
[OK]     tests/PHP_Markdown/Inline HTML comments.test
[FAILED] tests/PHP_Markdown/Ins & del.test
--- expected
+++ actual
  <p>Here is a block tag ins:</p>

  <ins>
  <p>Some text</p>
  </ins>

- <p><ins>And here it is inside a paragraph.</ins></p>
+ <ins>And here it is inside a paragraph.</ins>

  <p>And here it is <ins>in the middle of</ins> a paragraph.</p>

  <del>
  <p>Some text</p>
  </del>

- <p><del>And here is ins as a paragraph.</del></p>
+ <del>And here is ins as a paragraph.</del>

  <p>And here it is <del>in the middle of</del> a paragraph.</p>
[OK]     tests/PHP_Markdown/Links, inline style.test
[OK]     tests/PHP_Markdown/MD5 Hashes.test
[OK]     tests/PHP_Markdown/Mixed OLs and ULs.test
[OK]     tests/PHP_Markdown/Nesting.test
[OK]     tests/PHP_Markdown/PHP-Specific Bugs.test
[OK]     tests/PHP_Markdown/Parens in URL.test
[FAILED] tests/PHP_Markdown/Quotes in attributes.test
--- expected
+++ actual
- <p><a href="/&quot;style=&quot;color:red">Test</a> <a href="/'style='color:red">Test</a></p>
+ <p><a href="/&quot;style=&quot;color:red">Test</a> <a href="/&#39;style=&#39;color:red">Test</a></p>

  <p><img src="/&quot;style=&quot;border-color:red;border-size:1px;border-style:solid" alt="" />
- <img src="/'style='border-color:red;border-size:1px;border-style:solid" alt="" /></p>
+ <img src="/&#39;style=&#39;border-color:red;border-size:1px;border-style:solid" alt="" /></p>
[OK]     tests/PHP_Markdown/Tight blocks.test
[OK]     tests/lunamark/backticks_in_links.test
[OK]     tests/lunamark/case_insensitive_references.test
[OK]     tests/lunamark/case_insensitive_tags.test
[OK]     tests/lunamark/consecutive_lists.test
[OK]     tests/lunamark/definition_lists.test
[OK]     tests/lunamark/escape_lt.test
[OK]     tests/lunamark/hash_enumerators.test
[OK]     tests/lunamark/indented_code_in_list_item.test
[OK]     tests/lunamark/line_break.test
[OK]     tests/lunamark/list_spacing.test
[OK]     tests/lunamark/lists_and_hrules.test
[OK]     tests/lunamark/nested_divs.test
[OK]     tests/lunamark/notes.test
[OK]     tests/lunamark/reference_defined_in_blockquote.test
[OK]     tests/lunamark/require_blank_before_blockquote.test
[OK]     tests/lunamark/require_blank_before_header.test
[OK]     tests/lunamark/smart.test
[OK]     tests/lunamark/unicode.test
Passed: 58
Failed: 4

(minor) Notes without blank line at end of document aren't parsed

Found by @jslabovitz (via discussion)

If a (foot)note is the last element in a document, without blank line(s) after it, it fails to be parsed.

Test for reproducing

local lunamark = require("lunamark")
local opts = { notes = true }
local writer = lunamark.writer.html5.new(opts)
local parse = lunamark.reader.markdown.new(writer, opts)
local result, metadata = parse([[
Here is a note.[^mynote]

[^mynote]: This is supposed to be the note's footnote.
]])

print(result)

Observed result

<p>Here is a note.[^mynote]</p>

[^mynote]: This is supposed to be the note&#39;s footnote.

Expected result

<p>Here is a note.<sup><a href="#fn1" class="footnoteRef" id="fnref1">1</a></sup></p>

<hr />

<ol class="notes">
<li id="fn1"><p>This is supposed to be the note&#39;s footnote. <a href="#fnref1" class="footnoteBackLink">↩</a></p></li>
</ol>

Rationale

For comparison, Pandoc is happy parsing the note whether there's a blank line or not.

The workaround is easy (just adding a blank line), but it could be misleading to users.
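
Until the parser handles this case, the workaround can be applied mechanically before parsing. Below is a small sketch in plain Lua; `ensure_trailing_blank_line` is a hypothetical helper, not part of lunamark, and it simply normalizes the end of the input to exactly one blank line.

```lua
-- Strip any trailing newlines, then append one blank line.
-- The empty replacement keeps gsub's behavior identical on Lua 5.1 through 5.4.
local function ensure_trailing_blank_line(markdown)
  local trimmed = markdown:gsub("\n*$", "")
  return trimmed .. "\n\n"
end

local input = "Here is a note.[^mynote]\n\n[^mynote]: This is supposed to be the note's footnote.\n"
io.write(ensure_trailing_blank_line(input))

-- With lunamark installed, the result would be passed to the parser from the
-- reproduction above, e.g. parse(ensure_trailing_blank_line(input)).
```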

Fenced Div + List problem

The following produces neither a div nor a list:

::: foo
- a
- b
:::

But adding an extra blank line will make it work (list inside a div):

::: foo
- a
- b

:::

Same goes for ordered lists.

This is similar to Issue #60

lunamark/reader/markdown.lua:150: parse_blocks failed

Error:

/usr/local/share/lua/5.1/lunamark/reader/markdown.lua:150: parse_blocks failed on:
THIS IS A TEST
====

Source text:

THIS IS A TEST
============

[a link](http://google.com)

Same error with following settings all enabled, and all disabled (have not tried other combinations):

  • definition_lists
  • require_blank_before_blockquote
  • require_blank_before_header
  • hash_enumerators

Same error:

/usr/local/share/lua/5.1/lunamark/reader/markdown.lua:150: parse_blocks failed on:
# This
# is
# a li

Different source text:

# This
# is
# a list?

## This is a level 2 headr?

This is a header
===============

- list
- with items

[and a link](http://google.com) with *some bold **and italics?** text*

Woo!

> blockquote maybe

Oh yea...maybe.

Same options used.

Version: latest on luarocks, 0.4.0-1

hard crash on cygwin in call to API

I successfully installed lunamark on my system (CYGWIN_NT-10.0 3.2.0(0.340/5/3) 2021-03-29 08:42 x86_64 Cygwin), but when I try to build the standalone lunamark, the executable crashes with a mysterious message:

PANIC: unprotected error in call to Lua API (attempt to call a string value)

I have not the faintest clue about where to look for the faulty value. How does the debugging mechanism work in Lua modules?

Trying with Lua 5.3.6 Copyright (C) 1994-2020 Lua.org, PUC-Rio

Broken Pandoc title blocks

This change breaks Pandoc title blocks for me: 17d53e4#L0R833

I get the following error because of it:

lua5.1: /usr/share/lua/5.1/lunamark/reader/markdown.lua:995: invalid replacement value (a table)

Missing parts in link_attributes extension support

For the record and possible future improvement, the link_attributes extension support added in #36 is partial: it is supported on direct images only, but not on links and indirect images.

For parity with Pandoc, these should be supported:

On [link](#target){ #source .class key=value }

On [indirect link][reflink]

[reflink]: #target { #source .class key=value }

On ![indirect image][imgref]

[imgref]: image.png "opt title" {#id .class key=value}

Generating, e.g. in HTML, something such as:

<p>On <a href="#target" id="source" class="class" data-key="value">link</a></p>
<p>On <a href="#target" id="source" class="class" data-key="value">indirect link</a></p>
<p>On <img src="image.png" title="opt title" id="id" class="class" data-key="value" alt="indirect image" /></p>

Fenced div blocks nesting

For the record and possible future improvement, the fenced_divs extension support added in #36 is partial: it doesn't support nesting of fenced blocks (as it always assumes ::: as the div marker; in my defense, I wasn't aware of that possibility, logical as it is).

For parity with Pandoc, this should be supported, e.g.

::: { .diva }
I am in div A

:::: { .divb }
I am in nested div B
::::

Back to div A
:::

Generating, e.g. in HTML, something such as:

<div class="diva">
    <p>I am in div A</p>
    <div class="divb">
        <p>I am in nested div B</p>
    </div>
    <p>Back to div A</p>
</div>

(HTML indented for readability)

lunamark, 5.1, and lulpeg: Emphasis and Strong

Hello jgm,

I've been hunting this bug for a while now, and I have to reach out directly because I'm still out of my element.

I'm trying to run lunamark to get a markdown parser in pure LuaJIT. This requires me to use a utf8 Lua library, and LuLPeg to provide the dependencies for lunamark.

All that aside, I don't believe that is what is tripping me up with my test code. Very specifically, I can't parse *emphasis* or _emphasis_ or anything for Strong. See here

Looking at the code, it has something to do with parsers.between, a small function you wrote to parse multiple inlines per emphasis or strong grouping.

Now, I'm not saying it can't be the UTF8 library or something else, I did disable "smart" processing because that was also causing issues.

Here is my repo where I cobbled all this together. Just clone, edit test.lua to turn off smart mode, and put some italics in there and run lua test.lua and you'll get the error I'm getting.

I made a minor change (check the first commit I applied) to headerstart and one other spot, but they aren't going to affect emphasis or strong parsing.

Thank you in advance for any help you can provide!

Do not recognize fenced divs with no closing elements

Since #51, we have been treating fenced div beginnings and ends as separate elements. As a result, this allows fenced divs without a closing tag to be recognized. Consider the following markdown text:

::: {.some-classname}
This should not parse as a div.

Currently, lunamark will produce the following invalid HTML output:

<div class="some-classname">
<p>This should not parse as a div.</p>

We should at least always produce the closing </div> tag(s):

<div class="some-classname">
<p>This should not parse as a div.</p>
</div>

However, for Pandoc compliance, we should not even match the opening tag:

<p>::: {.some-classname} This should not parse as a div.</p>

LuaLaTex module for direct markdown inclusion in LaTeX documents

I created a module with which it is possible to include markdown code directly into LaTeX documents. Currently it is based on peg-multimarkdown. I tried to use lunamark instead, since I'm also interested in ConTeXt output. Unfortunately lpeg.B doesn't exist in LuaTeX since it includes lpeg 0.9.

Is there a way to simulate lpeg.B in pure lua?
If not, how would one modify the parser so that it wouldn't use lpeg.B?

The module is here:

https://github.com/mmahnic/peg-multimarkdown/tree/master/lualatex

Thanks,
Marko

Please cut a new release

It seems that the last release was version 0.5.0 in 2016. It would be great if the current version could be installed via luarocks instead of having to build from source.

Not compatible with lpeg version 0.11-2

I installed lunamark (version 0.3-1) using luarocks on OS X, which installed lpeg version 0.11-2 to satisfy the dependency. When I ran lunamark, I got the following error:

/usr/local/opt/lua/bin/lua: ...usr/local/share/lua/5.1/lunamark/reader/markdown.lua:122: pattern too large
stack traceback:
        [C]: in function 'lpegmatch'
        ...usr/local/share/lua/5.1/lunamark/reader/markdown.lua:122: in function 'parse_blocks'
        ...usr/local/share/lua/5.1/lunamark/reader/markdown.lua:1001: in function 'parse'
        ...local/lib/luarocks/rocks/lunamark/0.3-1/bin/lunamark:425: in main chunk
        [C]: ?

I manually installed lpeg version 0.10-2 and then lunamark works.

Links to `https://jgm.github.com` not working

From the 404 page:

If you're the owner of this site, please update your links to use jgm.github.io instead.
Subdomains of github.com are deprecated for GitHub Pages.
They will not redirect to github.io after April 15, 2021. 

Support for inline_code_attributes extension

In #36 I added attribute support to several things, but not to verbatim inline code (extension inline_code_attributes in Pandoc), which I overlooked... It should be simple enough, though ^^

For parity with Pandoc,

As in `make build`{.bash}...

Corner cases for fenced div attributes

In #36, we added support for Pandoc's fenced divs. However, the fenced div attributes are treated as infostrings and only matched at capture time, which has a number of repercussions:

  1. Our implementation parses attributes as fenced code infostrings. Infostrings may not contain backticks, whereas attributes may (in Pandoc):

    ::: {key=value`}
    foo bar
    :::
  2. Divs with empty infostrings are not divs. Consider the following document:

    :::
    foo
    :::

    Pandoc converts this document to the following HTML output: <p>::: foo :::</p>, whereas we still consider this a div.

Key-value attributes are not general enough

The syntax for attribute values in key-value pairs ({ .... key=value } attributes) is not general enough:

parsers.attrvalue = (parsers.dquote * C((parsers.alphanumeric + S("._- "))^1) * parsers.dquote)
+ C((parsers.alphanumeric + S("._-"))^1)

FWIW, Pandoc allows more things:
https://github.com/jgm/pandoc/blob/bc670665a1ce18ddc48531867b7b77c2f3493f49/src/Text/Pandoc/Readers/Markdown.hs#L658-L662

I don't read Haskell well, but at least:

  • single quoted value key='value'
  • some escaping things
  • wider range of characters (i.e. not restricted as currently to alphanums and some additions).

Found the issue while trying to have a width="15%" attribute pair ;)
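
To show what a more permissive value rule could accept, here is a sketch in plain Lua string patterns. It is not the lpeg definition quoted above and not Pandoc's exact grammar: it assumes double-quoted, single-quoted, and HTML-like unquoted values (anything up to whitespace, a quote, `=`, `<`, `>`, a backtick, or the closing brace), and it does not handle backslash escapes. The name `parse_attrvalue` is hypothetical.

```lua
-- Return the attribute value at the start of `s`, or nil if there is none.
local function parse_attrvalue(s)
  return s:match('^"([^"]*)"')             -- key="value with spaces"
      or s:match("^'([^']*)'")             -- key='value with spaces'
      or s:match("^([^%s\"'=<>`}]+)")      -- key=value (unquoted)
end

print(parse_attrvalue('"15%"'))
print(parse_attrvalue("'two words'"))
print(parse_attrvalue("15% }"))
```

With a rule like this, the width="15%" pair that triggered the report would be accepted, since `%` is no longer excluded by an alphanumeric-only character set.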

Make `div_level` relative to the current blockquote level

Since #51, we have been keeping track of the nesting level of fenced divs using a local counter variable div_level. However, divs can be further nested inside blockquotes, which would require that div_level is a stack of counters. Consider the following markdown text:

::: {.some-classname}
This is the beginning of a div
> This is a blockquote
> :::
:::

Currently, lunamark will produce the following invalid HTML output:

<div class="some-classname">
<p>This is the beginning of a div</p>
<blockquote>
<p>This is a blockquote</p>
</div>
</blockquote>
<p>:::</p>

Here is the expected output:

<div class="some-classname">
<p>This is the beginning of a div</p>
<blockquote>
<p>This is a blockquote :::</p>
</blockquote>
</div>
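The stack-of-counters idea can be sketched in plain Lua as follows. The function names are hypothetical and the sketch is detached from the actual PEG grammar; it only shows the bookkeeping, assuming the parser can call a hook when it enters or leaves a blockquote.

```lua
-- One div counter per blockquote nesting level; the bottom entry
-- belongs to the document level and is never popped.
local div_levels = { 0 }

local function enter_blockquote()
  div_levels[#div_levels + 1] = 0
end

local function leave_blockquote()
  if #div_levels > 1 then
    div_levels[#div_levels] = nil
  end
end

local function open_div()
  div_levels[#div_levels] = div_levels[#div_levels] + 1
end

-- Closes a div and returns true if one is open at the current level;
-- returns false if the ":::" line should be kept as plain text.
local function try_close_div()
  local top = #div_levels
  if div_levels[top] == 0 then
    return false
  end
  div_levels[top] = div_levels[top] - 1
  return true
end

-- Walk through the example document above.
open_div()                              -- ::: {.some-classname}
enter_blockquote()                      -- > This is a blockquote
local closed_inside = try_close_div()   -- > :::  (no div open at this level)
leave_blockquote()
local closed_outside = try_close_div()  -- :::    (closes the outer div)
```

In this walk-through `closed_inside` is `false` and `closed_outside` is `true`, which corresponds to the expected output: the fence inside the blockquote stays text and the final fence closes the div.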

markdown.lua:974: empty loop in rule 'Blocks'

$ lunamark
/usr/bin/lua5.1: /usr/share/lua/5.1/lunamark/reader/markdown.lua:974: empty loop in rule 'Blocks'
stack traceback:
        [C]: in function 'Ct'
        /usr/share/lua/5.1/lunamark/reader/markdown.lua:974: in function 'new'
        ...r/lib/luarocks/rocks-5.1/lunamark/0.3-1/bin/lunamark:399: in main chunk
        [C]: ?

I have lpeg 0.12.1 and lunamark 0.3.

Moving the parsers into a hash table

I am thinking of moving the parsers in lunamark/reader/markdown.lua into a hash table. Lua imposes an upper limit of 200 local variables per function and 60 upvalues, and we are currently hitting both; storing the parsers as table fields instead of locals should fix the problem.

I am additionally thinking of splitting the hash table into a global one and a local one. The global one would store parsers that do not need to be created more than once, whereas the local one would store parsers whose behavior depends on the options received. I did not profile the code, so this might be a premature optimization, but I would argue that this does not impact the readability / maintainability of the code and I might as well do it, since I am refactoring it all.

So far, I did this for the initial couple of parsers (see Witiko@7681cbb). Since this will have a huge delta against the current code base and will take quite some time to finish, I would love to hear your thoughts on this.
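The proposed split might look roughly like this. It is a minimal sketch, not the code from the linked commit: the table name `larsers`, the `startnum` option, and the `enumerator` parser are used purely for illustration.

```lua
local lpeg = require("lpeg")
local P, R, S, C = lpeg.P, lpeg.R, lpeg.S, lpeg.C

-- Global table: option-independent parsers, built once when the module loads.
-- Table fields do not count against the limits on locals and upvalues.
local parsers = {}
parsers.digit     = R("09")
parsers.spacechar = S("\t ")
parsers.period    = P(".")

-- Local table: parsers whose behavior depends on the reader options,
-- rebuilt on every call to new().
local function new(options)
  local larsers = {}
  if options.startnum then
    -- capture the start number so that a writer could use it
    larsers.enumerator = C(parsers.digit^1) * parsers.period
  else
    larsers.enumerator = parsers.digit^1 * parsers.period
  end
  return larsers
end

local start = new({ startnum = true }).enumerator:match("12.")
```

Only `parsers` and `new` remain as locals at module level, however many parsers are added, and `start` holds the captured string `12`.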

image without attr will give bad element with writer.html

Consider:

![la lune](lalune.jpg "Voyage *to the* moon")

The html writer will output:

<p><img src="lalune.jpg" alt="la lune" title="Voyage *to the* moon"</p>

with the img end tag /> missing.

Cause: since the attr parameter was added to writer.html.image in July 2020, a nil attr causes two nil values, w and h, to be inserted into the rope.

Solution: make w and h empty strings if attr is nil.
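The effect can be reproduced in isolation with the plain Lua sketch below. It is a simplified stand-in, not lunamark's writer: `flatten`, `image_buggy`, and `image_fixed` are hypothetical names, and the sketch assumes the rope is walked in a way that stops at the first nil, as `ipairs` does.

```lua
-- Simplified stand-in for rope flattening: ipairs stops at the first nil.
local function flatten(rope)
  local out = {}
  for _, piece in ipairs(rope) do
    out[#out + 1] = piece
  end
  return table.concat(out)
end

-- Buggy variant: w and h are nil when attr is nil, leaving holes in the rope.
local function image_buggy(label, src, title, attr)
  local w = attr and attr.width
  local h = attr and attr.height
  return flatten({ '<img src="', src, '" alt="', label,
                   '" title="', title, '"', w, h, ' />' })
end

-- Fixed variant: w and h default to empty strings.
local function image_fixed(label, src, title, attr)
  local w = attr and attr.width and (' width="' .. attr.width .. '"') or ""
  local h = attr and attr.height and (' height="' .. attr.height .. '"') or ""
  return flatten({ '<img src="', src, '" alt="', label,
                   '" title="', title, '"', w, h, ' />' })
end

local broken = image_buggy("la lune", "lalune.jpg", "Voyage", nil)
local fixed  = image_fixed("la lune", "lalune.jpg", "Voyage", nil)
```

In the buggy variant everything after the first nil is dropped, including the closing ` />`, which matches the truncated tag shown above.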

Unclear license

Hi. I'm making a PKGBUILD for Arch Linux, so that this code is available to AUR users, and my big question is: what license should I put in this file? Custom? Public domain?

For now, I'm putting "Public domain", because that's what I determined from reading your LICENSE file.
Anyway, I hope this PKGBUILD works when I'm done.

Another fenced div + blockquote edge case (vs. Pandoc)

This doesn't parse as a fenced div:

::: {lang=fr}
> Cette citation est en français!
:::

Workaround is to have a blank line after the blockquote:

::: {lang=fr}
> Cette citation est en français!

:::

Pandoc parses the first syntax, though, without issue.
