Improve performance (commonmark-hs issue, open)

jgm commented on May 28, 2024
Improve performance

from commonmark-hs.

Comments (4)

jgm commented on May 28, 2024

What I've tried

  • rewriting to operate directly on Text instead of tokenizing first
  • rewriting to operate directly on Text, using megaparsec instead of parsec, and using the fast parsers takeWhileP etc.
  • rewriting to use ByteStrings instead of Texts in the Toks.

None of this achieved any speed improvement over the current version using [Tok]; indeed, in every case performance was worse.

Profiling reveals that block structure parsing is fast. Most of the time is taken up by tokenize and restOfLine (31%), and by inline parsing.
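To make the two hot spots concrete, here is a minimal sketch of what a tokenize-first design looks like. It is not the library's code: the `TokType` and `Tok` definitions are simplified stand-ins (the real tokens also carry source positions), and `restOfLine` is written here as a plain list function rather than a parsec parser, purely for illustration.

```haskell
import Data.Char (isAlphaNum)
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical, simplified token type (assumption: not the library's definition).
data TokType = Spaces | LineEnd | WordChars | Symbol Char
  deriving (Show, Eq)

data Tok = Tok
  { tokType     :: !TokType
  , tokContents :: !Text
  } deriving (Show, Eq)

-- Split the input into runs of spaces, runs of word characters,
-- single line endings, and single symbol characters.
tokenize :: Text -> [Tok]
tokenize t = case T.uncons t of
  Nothing -> []
  Just (c, rest)
    | c == ' '     -> run Spaces (== ' ')
    | c == '\n'    -> Tok LineEnd (T.singleton c) : tokenize rest
    | isAlphaNum c -> run WordChars isAlphaNum
    | otherwise    -> Tok (Symbol c) (T.singleton c) : tokenize rest
  where
    -- Take the longest prefix matching the predicate as one token.
    run ty p = let (x, y) = T.span p t in Tok ty x : tokenize y

-- Return the tokens up to and including the next line ending,
-- together with the remaining tokens.
restOfLine :: [Tok] -> ([Tok], [Tok])
restOfLine ts = case break ((== LineEnd) . tokType) ts of
  (line, nl : rest) -> (line ++ [nl], rest)
  (line, [])        -> (line, [])

-- Concatenating token contents recovers the original input.
untokenize :: [Tok] -> Text
untokenize = T.concat . map tokContents
```

Every character of the input passes through `tokenize` once, and block parsing then walks each line's tokens again via something like `restOfLine`, which is consistent with these two functions dominating a profile.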

Instructions for profiling

make prof

Current results (March 12 2020):

1.8 	 parseChunks
2.1 	 pDelimChunk
2.2 	 Commonmark.Blocks.runInlineParser
2.5 	 blockContinues
2.6 	 Commonmark.Inlines.processBs
2.9 	 MAIN
3.9 	 block_starts
6.6 	 renderHtml
9.0 	 pSymbol
11.9 	 defaultInlineParser
17.5 	 Commonmark.Tokens.tokenize
32.6 	 restOfLine

jgm commented on May 28, 2024

For a 1.4MB file:

(Screenshot taken 2020-03-12 at 9:23:26 PM; image not preserved.)

jgm commented on May 28, 2024

Benchmarks for different extensions:

extension                        mean
-xautolinks                      310.8 ms (309.3 ms .. 311.3 ms)
-xpipe_tables                    295.2 ms (293.2 ms .. 296.6 ms)
-xstrikethrough                  267.9 ms (265.6 ms .. 269.1 ms)
-xsuperscript                    267.8 ms (264.9 ms .. 269.5 ms)
-xsubscript                      266.8 ms (263.6 ms .. 267.9 ms)
-xsmart                          293.0 ms (292.0 ms .. 294.3 ms)
-xmath                           287.4 ms (285.4 ms .. 290.7 ms)
-xemoji                          281.6 ms (280.3 ms .. 282.8 ms)
-xfootnotes                      291.3 ms (286.1 ms .. 293.3 ms)
-xdefinition_lists               272.6 ms (271.0 ms .. 275.4 ms)
-xfancy_lists                    271.2 ms (269.3 ms .. 273.8 ms)
-xattributes                     284.2 ms (283.4 ms .. 285.7 ms)
-xraw_attribute                  280.7 ms (279.6 ms .. 281.6 ms)
-xbracketed_spans                268.5 ms (267.0 ms .. 269.4 ms)
-xfenced_divs                    269.6 ms (267.5 ms .. 271.6 ms)
-xauto_identifiers               274.9 ms (273.0 ms .. 277.8 ms)
-ximplicit_heading_references    269.8 ms (268.2 ms .. 272.8 ms)
-xall                            520.4 ms (515.5 ms .. 523.6 ms)

jgm commented on May 28, 2024

One idea to explore: use ShortText from text-short package instead of Text in Tok.
The public API could still use Text.
This should reduce the memory used by the tokens.
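A rough sketch of how that could look, assuming the text-short package's `Data.Text.Short` module with its `fromText` and `toText` conversions. The `Tok` type, `mkTok`, and `tokText` below are hypothetical names for illustration, not the library's definitions; the real token also carries a type and a source position.

```haskell
import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Short (ShortText)
import qualified Data.Text.Short as TS

-- Hypothetical internal token: contents stored compactly as ShortText.
newtype Tok = Tok { tokContents :: ShortText }
  deriving (Eq, Show)

-- Build a token from Text at the API boundary (the conversion copies the data).
mkTok :: Text -> Tok
mkTok = Tok . TS.fromText

-- Expose the contents as Text again, so callers never see ShortText.
tokText :: Tok -> Text
tokText = TS.toText . tokContents
```

Whether this helps would need measuring: a `ShortText` has less per-value overhead than a `Text` slice, but each token then owns a fresh copy of its bytes instead of sharing the input buffer, and the conversions at the boundary are not free.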
