Coder Social home page Coder Social logo

Comments (7)

ulysses4ever avatar ulysses4ever commented on July 22, 2024

After a couple of shots in the dark, it seems to me that the most straightforward way to go about it is to print violations in the main loop, right away, without building up a list of violations.

Flag sounds good! (I honestly thought that you'd require it from the original PR.) I wouldn't call it as generically as --verbose, but something like --print-violations.

from fix-whitespace.

andreasabel avatar andreasabel commented on July 22, 2024

After a couple of shots in the dark, it seems to me that the most straightforward way to go about it is to print violations in the main loop, right away, without building up a list of violations.

I think it is good to keep a pure interface, and do IO only in the layer using that interface.

transform
:: TabSize -- ^ Expand tab characters to so many spaces. Keep tabs if @<= 0@.
-> Text -- ^ Text before transformation.
-> (Text, [LineError]) -- ^ Text after transformation and violating lines if any.

Flag sounds good! (I honestly thought that you'd require it from the original PR.) I wouldn't call it as generically as --verbose, but something like --print-violations.

Ok, we can bikeshed it. For a --print-XXX flag I would expect stuff to be printed to stdout for later consumption by another program. (Kind of the proper output of the program.)
There isn't too much written about naming options. There is https://www.gnu.org/prep/standards/html_node/Command_002dLine-Interfaces.html . The subsequent option table mentions --quiet / --silent and ``--verbose`. This would be my first take, unless further diversification is needed.

from fix-whitespace.

ulysses4ever avatar ulysses4ever commented on July 22, 2024

#49 provides a way to work around the performance issue by resurrecting the old implementation and putting the new one under a flag. This is largely independent of the the issue itself.

I must say I find few things in live as sad as tuning performance of Haskell applications. Below are couple observations of my failed attempts at this in our case.

First, many times slow applications generate excessive amount of garbage and productivity goes down. Surprising as it is we don't experience this. Productivity stays above 99%.

Another observation: a lot of slowdown comes from just printing out stuff. I performed a little experiment where I don't do any analysis, and just print out every input line with the decoration displayLineError:

fix :: Mode -> Verbose -> TabSize -> FilePath -> IO Bool
fix mode verbose _tabSize f = do
  s <- Text.readFile f
  pure (CheckViolation s (buildVs s))
    >>= \case

    CheckViolation s vs ->  do
      Text.hPutStrLn stderr (msg vs)
      when (mode == Fix) $
        withFile f WriteMode $ \h -> do
          hSetEncoding h utf8
          Text.hPutStr h s
      return True

  where
    buildVs = zipWith LineError [1..] . Text.lines

    msg vs
      | mode == Fix =
        "[ Violation fixed ] " <> ft

      | otherwise =
        "[ Violation detected ] " <> ft <>
        (if not verbose then "" else
           ":\n" <> Text.unlines (map (displayLineError ft) vs))

    ft = Text.pack f
fix-whitespace on  performance-experiments-2 [?] via λ 9.2.6 
❯ time $FW --check 20000-violations.txt +RTS -s -p -RTS 2>/dev/null

________________________________________________________
Executed in    6.22 millis    fish           external
   usr time    1.31 millis  341.00 micros    0.97 millis
   sys time    5.00 millis  142.00 micros    4.86 millis


fix-whitespace on  performance-experiments-2 [?] via λ 9.2.6 
❯ time $FW -v --check 20000-violations.txt +RTS -s -p -RTS 2>/dev/null

________________________________________________________
Executed in  433.11 millis    fish           external
   usr time  253.06 millis    0.18 millis  252.88 millis
   sys time  180.00 millis    1.08 millis  178.91 millis

(i also use profiling but without profiling numbers are not much different)

from fix-whitespace.

ulysses4ever avatar ulysses4ever commented on July 22, 2024

My idea for improving performance of the output would be to try the Builder type, perhaps.

from fix-whitespace.

ulysses4ever avatar ulysses4ever commented on July 22, 2024

Unfortunately, the basic lazy builder didn't seem to help anything (the code is here ulysses4ever@faabc6a). I'm thinking to give this a try: https://github.com/Bodigrim/linear-builder.

from fix-whitespace.

ulysses4ever avatar ulysses4ever commented on July 22, 2024

I tried an even simpler experiment, and it does look like just printing stuff out is embarrassingly slow. Here's a little program that reads stdin, breaks it into lines, adds the index to every line and prints out the result:

-- test.hs
module Main where

import TextShow
import qualified Data.Text                    as T
import qualified Data.Text.IO                 as T

main = T.getContents >>= T.putStrLn . procText

procText =
  T.unlines .
  zipWith (\i l -> showt i <> l) [1::Int ..] .
  T.lines

It's pretty slow already:

fix-whitespace on  performance-experiments-2 [?] via λ 9.2.6 
❯ time cat 20000-violations.txt | runghc test.hs >/dev/null

________________________________________________________
Executed in  387.92 millis    fish           external
   usr time  300.18 millis    0.22 millis  299.96 millis
   sys time   86.46 millis    2.10 millis   84.37 millis


fix-whitespace on  performance-experiments-2 [?] via λ 9.2.6 
❯ time cat 200000-violations.txt | runghc test.hs >/dev/null

________________________________________________________
Executed in    1.72 secs    fish           external
   usr time    1.20 secs    0.00 millis    1.20 secs
   sys time    0.52 secs    2.90 millis    0.52 secs

from fix-whitespace.

ulysses4ever avatar ulysses4ever commented on July 22, 2024

Oops, using runghc was a slip. If I compile the above program, it runs on par with Rust analog. Back to square one...

from fix-whitespace.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.