Comments (7)
After a couple of shots in the dark, it seems to me that the most straightforward way to go about it is to print violations in the main loop, right away, without building up a list of violations.
Flag sounds good! (I honestly thought that you'd require it from the original PR.) I wouldn't name it as generically as `--verbose`, but something like `--print-violations`.
from fix-whitespace.
> After a couple of shots in the dark, it seems to me that the most straightforward way to go about it is to print violations in the main loop, right away, without building up a list of violations.
I think it is good to keep a pure interface, and do IO only in the layer using that interface.
fix-whitespace/src/Data/Text/FixWhitespace.hs
Lines 70 to 73 in 37220b7
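The separation argued for here might be sketched roughly as follows. This is only an illustration, not the actual code of fix-whitespace: the `Violation` type and the functions `findViolations`, `render` and `report` are hypothetical names, and only trailing whitespace is detected. Since the list of violations is built lazily, the printing layer can consume it as it is produced, which may go some way towards the "print right away" idea without giving up the pure interface.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import           Data.Char    (isSpace)
import           Data.Text    (Text)
import qualified Data.Text    as T
import qualified Data.Text.IO as T
import           System.IO    (stderr)

-- | Hypothetical violation record: 1-based line number and the offending line.
data Violation = Violation Int Text
  deriving (Eq, Show)

-- | Pure check: text in, violations out, no IO.
-- Only trailing whitespace is detected, to keep the sketch short.
findViolations :: Text -> [Violation]
findViolations s =
  [ Violation n l
  | (n, l) <- zip [1 ..] (T.lines s)
  , not (T.null l)
  , isSpace (T.last l)
  ]

-- | Pure rendering of a single violation.
render :: FilePath -> Violation -> Text
render f (Violation n l) =
  T.pack f <> ":" <> T.pack (show n) <> ": " <> l

-- | IO layer: the only place where anything is printed.
report :: FilePath -> Text -> IO ()
report f = mapM_ (T.hPutStrLn stderr . render f) . findViolations

main :: IO ()
main = report "example.txt" "ok\ntrailing  \nalso ok\n"
```

The pure functions can be tested without touching the file system, while `report` stays a thin wrapper around them.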
> Flag sounds good! (I honestly thought that you'd require it from the original PR.) I wouldn't call it as generically as `--verbose`, but something like `--print-violations`.
Ok, we can bikeshed it. For a `--print-XXX` flag I would expect stuff to be printed to `stdout` for later consumption by another program. (Kind of the proper output of the program.)
There isn't too much written about naming options. There is https://www.gnu.org/prep/standards/html_node/Command_002dLine-Interfaces.html. The subsequent option table mentions `--quiet` / `--silent` and `--verbose`. This would be my first take, unless further diversification is needed.
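To make the naming proposal concrete, here is a minimal sketch of how such flags could be declared with `System.Console.GetOpt` from `base`. The `Verbosity` type, the `parseVerbosity` helper, the short options and the help strings are all assumptions made for illustration; this is not the option parser that fix-whitespace actually uses.

```haskell
module Main where

import System.Console.GetOpt
  ( ArgDescr (NoArg), ArgOrder (Permute), OptDescr (Option)
  , getOpt, usageInfo )

-- | Hypothetical verbosity levels.
data Verbosity = Quiet | Normal | Verbose
  deriving (Eq, Show)

-- | Flag names follow the GNU option table; the wiring is illustrative.
options :: [OptDescr Verbosity]
options =
  [ Option ['q'] ["quiet", "silent"] (NoArg Quiet)   "suppress the report"
  , Option ['v'] ["verbose"]         (NoArg Verbose) "print every violation"
  ]

-- | The last verbosity flag wins; without any flag the level is 'Normal'.
parseVerbosity :: [String] -> Either String (Verbosity, [FilePath])
parseVerbosity args =
  case getOpt Permute options args of
    (flags, files, []) -> Right (last (Normal : flags), files)
    (_, _, errs)       ->
      Left (concat errs ++ usageInfo "Usage: prog [OPTION...] FILES" options)

main :: IO ()
main = print (parseVerbosity ["--verbose", "a.txt"])
```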
#49 provides a way to work around the performance issue by resurrecting the old implementation and putting the new one under a flag. This is largely independent of the issue itself.
I must say I find few things in life as sad as tuning the performance of Haskell applications. Below are a couple of observations from my failed attempts at this in our case.
First, slow applications often generate an excessive amount of garbage, so productivity (the share of run time not spent in the garbage collector, as reported by `+RTS -s`) goes down. Surprisingly, we don't experience this: productivity stays above 99%.
Another observation: a lot of the slowdown comes from just printing stuff out. I performed a little experiment where I don't do any analysis and just print out every input line with the `displayLineError` decoration:
fix :: Mode -> Verbose -> TabSize -> FilePath -> IO Bool
fix mode verbose _tabSize f = do
  s <- Text.readFile f
  pure (CheckViolation s (buildVs s))
    >>= \case
      CheckViolation s vs -> do
        Text.hPutStrLn stderr (msg vs)
        when (mode == Fix) $
          withFile f WriteMode $ \h -> do
            hSetEncoding h utf8
            Text.hPutStr h s
        return True
  where
    buildVs = zipWith LineError [1..] . Text.lines
    msg vs
      | mode == Fix =
          "[ Violation fixed ] " <> ft
      | otherwise =
          "[ Violation detected ] " <> ft <>
            (if not verbose then "" else
               ":\n" <> Text.unlines (map (displayLineError ft) vs))
    ft = Text.pack f
fix-whitespace on performance-experiments-2 [?] via λ 9.2.6
❯ time $FW --check 20000-violations.txt +RTS -s -p -RTS 2>/dev/null
________________________________________________________
Executed in 6.22 millis fish external
usr time 1.31 millis 341.00 micros 0.97 millis
sys time 5.00 millis 142.00 micros 4.86 millis
fix-whitespace on performance-experiments-2 [?] via λ 9.2.6
❯ time $FW -v --check 20000-violations.txt +RTS -s -p -RTS 2>/dev/null
________________________________________________________
Executed in 433.11 millis fish external
usr time 253.06 millis 0.18 millis 252.88 millis
sys time 180.00 millis 1.08 millis 178.91 millis
(I also use profiling here, but without profiling the numbers are not much different.)
My idea for improving the performance of the output would be to try the `Builder` type, perhaps.
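As a rough sketch of what that could look like, assuming the lazy text builder from the `text` package (`Data.Text.Lazy.Builder`): each line is turned into a `Builder`, the builders are concatenated, and the result is written out in one go. The names `numberLine` and `numberLines` are made up for this example, and whether this is any faster than plain `Text` concatenation would have to be measured.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import           Data.Text                  (Text)
import qualified Data.Text                  as T
import qualified Data.Text.Lazy             as TL
import qualified Data.Text.Lazy.Builder     as B
import qualified Data.Text.Lazy.Builder.Int as B
import qualified Data.Text.Lazy.IO          as TL

-- | Render one line as "<index>:<line>\n" without intermediate Text values.
numberLine :: Int -> Text -> B.Builder
numberLine i l =
  B.decimal i <> B.singleton ':' <> B.fromText l <> B.singleton '\n'

-- | Number all lines of the input and materialise the result as lazy Text.
numberLines :: Text -> TL.Text
numberLines = B.toLazyText . mconcat . zipWith numberLine [1 ..] . T.lines

main :: IO ()
main = TL.putStr (numberLines "foo\nbar\n")
```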
Unfortunately, the basic lazy builder didn't seem to help at all (the code is here: ulysses4ever@faabc6a). I'm thinking of giving this a try: https://github.com/Bodigrim/linear-builder.
I tried an even simpler experiment, and it does look like just printing stuff out is embarrassingly slow. Here's a little program that reads stdin, breaks it into lines, adds the index to every line and prints out the result:
-- test.hs
module Main where

import TextShow
import qualified Data.Text as T
import qualified Data.Text.IO as T

main = T.getContents >>= T.putStrLn . procText

procText =
  T.unlines .
  zipWith (\i l -> showt i <> l) [1::Int ..] .
  T.lines
It's pretty slow already:
fix-whitespace on performance-experiments-2 [?] via λ 9.2.6
❯ time cat 20000-violations.txt | runghc test.hs >/dev/null
________________________________________________________
Executed in 387.92 millis fish external
usr time 300.18 millis 0.22 millis 299.96 millis
sys time 86.46 millis 2.10 millis 84.37 millis
fix-whitespace on performance-experiments-2 [?] via λ 9.2.6
❯ time cat 200000-violations.txt | runghc test.hs >/dev/null
________________________________________________________
Executed in 1.72 secs fish external
usr time 1.20 secs 0.00 millis 1.20 secs
sys time 0.52 secs 2.90 millis 0.52 secs
Oops, using `runghc` was a slip: it interprets the program instead of compiling it. If I compile the above program, it runs on par with a Rust analog. Back to square one...