This project forked from haskell-streaming/streaming-bytestring

effectful sequences of bytes; an alternative no-lazy-io implementation of Data.ByteString.Lazy

License: BSD 3-Clause "New" or "Revised" License

streaming-bytestring

This library enables fast and safe streaming of byte data, in either Word8 or Char form. It is a core addition to the streaming ecosystem and avoids the usual pitfalls of combining lazy ByteStrings with lazy IO.

This library is used by streaming-attoparsec to enable vanilla Attoparsec parsers to work with streaming "for free".

Usage

Importing and Types

Modules from this library are intended to be imported qualified. To avoid conflicts with both the bytestring library and streaming, we recommend Q as the qualified name:

import qualified Streaming.ByteString.Char8 as Q

Like the bytestring library, leaving off the Char8 will expose an API based on Word8. Following the philosophy of streaming that "the best API is the one you already know", these APIs are based closely on bytestring. The core type is ByteStream m r, where:

  • m: The Monad used to fetch further chunks from the "source", usually IO.
  • r: The final return value after all streaming has concluded, usually () as in streaming.

You can think of this type as a potentially infinite sequence of bytes. Internally it holds a strict ByteString chunk no larger than 32 KB, followed by monadic instructions to fetch further chunks.
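As a minimal sketch of how the two type parameters behave, the snippet below builds a small in-memory stream and counts its characters. The names greeting and countChars are illustrative only and are not part of the library; Q.string and Q.length_ are assumed to be available from Streaming.ByteString.Char8.

```haskell
import qualified Streaming.ByteString.Char8 as Q

-- | A small in-memory stream. It stays polymorphic in the monad @m@
-- because producing it needs no effects; @r@ is @()@ as usual.
greeting :: Monad m => Q.ByteStream m ()
greeting = Q.string "hello, " >> Q.string "world"

-- | Count the characters of any stream, whatever its source or return value.
countChars :: Monad m => Q.ByteStream m r -> m Int
countChars = Q.length_

main :: IO ()
main = countChars greeting >>= print  -- prints 12
```

The same countChars works unchanged on a stream read from a file, because it only requires Monad m.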

Examples

File Input

To open a file of any size and count its characters:

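import Control.Monad.Trans.Resource (MonadResource, runResourceT)
import qualified Streaming.ByteString.Char8 as Q

-- | Represents a potentially-infinite stream of `Char`.
-- `Q.readFile` needs a `MonadResource` so the file handle is released safely.
chars :: MonadResource m => Q.ByteStream m ()
chars = Q.readFile "huge-file.txt"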

main :: IO ()
main = runResourceT (Q.length_ chars) >>= print

Note that file IO specifically requires the resourcet library.
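The same pattern extends to writing. The sketch below copies a file chunk by chunk, assuming Q.writeFile is available alongside Q.readFile; the function name copyFileStreaming and the destination file name are hypothetical.

```haskell
import Control.Monad.Trans.Resource (runResourceT)
import qualified Streaming.ByteString.Char8 as Q

-- | Copy a file chunk by chunk. Both handles are registered with
-- ResourceT and released when 'runResourceT' exits.
copyFileStreaming :: FilePath -> FilePath -> IO ()
copyFileStreaming src dst = runResourceT $ Q.writeFile dst (Q.readFile src)

main :: IO ()
main = copyFileStreaming "huge-file.txt" "huge-file.copy.txt"
```

Only one chunk needs to be resident at a time, so memory use should not grow with the size of the file.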

Line splitting and Stream interop

In the example above you may have noticed the absence of the Of functor that we usually see with Stream. Our old friend lines hints at this too:

lines :: Monad m => ByteStream m r -> Stream (ByteStream m) m r

A stream-of-streams, yet no Of here either. The return type can't naively be Stream (Of ByteString) m r, since the first line break might be at the very end of a large file. Forcing that into a single strict ByteString would crash your program.

To count the number of lines whose first letter is i:

-- Assumes: import qualified Streaming.Prelude as S
countOfI :: IO Int
countOfI = runResourceT
  . S.length_                   -- ResourceT IO Int
  . S.filter (== 'i')           -- Stream (Of Char) (ResourceT IO) ()
  . S.concat                    -- Stream (Of Char) (ResourceT IO) ()
  . S.mapped Q.head             -- Stream (Of (Maybe Char)) (ResourceT IO) ()
  . Q.lines                     -- Stream (ByteStream (ResourceT IO)) (ResourceT IO) ()
  $ Q.readFile "huge-file.txt"  -- ByteStream (ResourceT IO) ()
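The pipeline does not depend on file IO. As a sketch, it can be generalised over the monad and the target letter, which makes it easy to try on an in-memory stream; countLinesStartingWith is a hypothetical name, not a library function.

```haskell
import qualified Streaming.ByteString.Char8 as Q
import qualified Streaming.Prelude as S

-- | Count the lines whose first character equals @c@.
-- Empty lines yield 'Nothing' from 'Q.head' and are dropped by 'S.concat'.
countLinesStartingWith :: Monad m => Char -> Q.ByteStream m r -> m Int
countLinesStartingWith c =
    S.length_          -- m Int
  . S.filter (== c)    -- Stream (Of Char) m r
  . S.concat           -- Stream (Of Char) m r
  . S.mapped Q.head    -- Stream (Of (Maybe Char)) m r
  . Q.lines            -- Stream (ByteStream m) m r

main :: IO ()
main = countLinesStartingWith 'i' (Q.string "ice\napple\nink") >>= print  -- prints 2
```

Passing Q.readFile "huge-file.txt" inside runResourceT gives the file-based version shown above.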

Critically, there are several functions which when combined with mapped can bring us back into Of-land:

head     :: Monad m => ByteStream m r -> m (Of (Maybe Char) r)
last     :: Monad m => ByteStream m r -> m (Of (Maybe Char) r)
null     :: Monad m => ByteStream m r -> m (Of Bool r)
count    :: Monad m => Char -> ByteStream m r -> m (Of Int r) -- Apply to a Char first, e.g. mapped (count 'a').
toLazy   :: Monad m => ByteStream m r -> m (Of ByteString r) -- Be careful with this.
toStrict :: Monad m => ByteStream m r -> m (Of ByteString r) -- Be even *more* careful with this.
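As a hedged illustration of the mapped pattern, the sketch below turns a stream of lines into ordinary Of-based streams. The names lineLengths and strictLines are hypothetical, and Q.length is assumed to have the same shape as the functions listed above (returning Of Int r).

```haskell
import qualified Data.ByteString.Char8 as B
import Streaming (Of, Stream)
import qualified Streaming.ByteString.Char8 as Q
import qualified Streaming.Prelude as S

-- | Length of each line, computed without holding a whole line in memory.
lineLengths :: Monad m => Q.ByteStream m r -> Stream (Of Int) m r
lineLengths = S.mapped Q.length . Q.lines

-- | Each line as a strict ByteString. Every line is loaded fully into
-- memory, so this is only sensible when lines are known to be short.
strictLines :: Monad m => Q.ByteStream m r -> Stream (Of B.ByteString) m r
strictLines = S.mapped Q.toStrict . Q.lines

main :: IO ()
main = do
  S.toList_ (lineLengths (Q.string "ab\ncde\nf")) >>= print
  S.toList_ (strictLines (Q.string "ab\ncde\nf")) >>= print
```

lineLengths is the safer of the two: it consumes each line as it streams past, whereas strictLines inherits the caveat attached to toStrict.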

When moving in the opposite direction API-wise, consider:

fromChunks :: Monad m => Stream (Of ByteString) m r -> ByteStream m r
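A minimal sketch of that direction: a list of strict chunks becomes a Stream (Of ByteString), which fromChunks joins into a single ByteStream. The helper name rejoin is hypothetical, and Q.toStrict_ is used only to inspect the result, so it is suitable for small inputs only.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Streaming.ByteString.Char8 as Q
import qualified Streaming.Prelude as S

-- | Join strict chunks into one ByteStream, then collect it again.
rejoin :: Monad m => [B.ByteString] -> m B.ByteString
rejoin = Q.toStrict_ . Q.fromChunks . S.each

main :: IO ()
main = rejoin (map B.pack ["foo", "bar"]) >>= B.putStrLn  -- prints foobar
```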

Contributors

fosskers, michaelt, dependabot[bot], hs-viktor, andrewthad, chessai, endgame, unkindpartition, bergmark, bgamari, erikd, larskuhtz, carbolymer
