pipes-text-0.0.2.5: properly streaming text

Pipes.Prelude.Text

Contents

Synopsis

# Simple line-based Text IO

Line-based operations are marked with a final -Ln, like stdinLn, readFileLn, etc. They are drop-in Text replacements for the corresponding String operations in Pipes.Prelude and Pipes.Safe.Prelude - a final -Ln being added where necessary. This module can thus be imported unqualified if Pipes.Prelude is imported qualified, as it must be.

In using the line-based operations, one is producing and consuming semantically significant individual texts, understood as lines, just as one would produce or pipe Ints or Chars or anything else. The standard materials from Pipes and Pipes.Prelude and Data.Text are all you need to work with them, and you can use these operations without using any of the other modules in this package.

Thus, to take a trivial case, here we upper-case three lines from standard input and write them to a file. (runSafeT from Pipes.Safe just makes sure to close any handles opened in its scope; it is only needed for readFileLn and writeFileLn.)

>>> import Pipes
>>> import qualified Pipes.Prelude as P
>>> import qualified Pipes.Prelude.Text as Text
>>> import qualified Data.Text as T
>>> Text.runSafeT $runEffect$ Text.stdinLn >-> P.take 3 >-> P.map T.toUpper >-> Text.writeFileLn "threelines.txt"
one<Enter>
two<Enter>
three<Enter>
>>> :! cat "threelines.txt"
ONE
TWO
THREE


The point of view is very much that of Pipes.Prelude, substituting Text for String. It would still be the same even if we did something a bit more sophisticated, like run an ordinary attoparsec Text parser on each line, as is frequently desirable. Here we use a minimal attoparsec number parser, scientific, on separate lines of standard input, dropping bad parses with P.concat:

>>> import Data.Attoparsec.Text (parseOnly, scientific)
>>> P.toListM \$ Text.stdinLn >-> P.takeWhile (/= "quit") >-> P.map (parseOnly scientific) >-> P.concat
1<Enter>
2<Enter>
3<Enter>
quit<Enter>
[1.0,2.0,3.0]


The line-based operations are, however, subject to a number of caveats.

• Where these line-based operations read from a handle, they will accumulate indefinitely long lines. This makes sense for input typed in by a user, and for locally produced files of known characteristics, but otherwise not. See the post on perfect streaming to see why pipes-bytestring and this package, outside this module, take a different approach, in which lines themselves are permitted to stream without accumulation.
• The line-based operations, like those in Data.Text.IO, use the system encoding (and T.hGetLine, T.hPutLine etc.) and thus are slower than the 'official' route, which would use the very fast bytestring IO operations from Pipes.ByteString and the encoding and decoding functions in Pipes.Text.Encoding, which are also quite fast thanks to the streaming-commons package.
• The line-based operations (again like those in Data.Text.IO) will generate text exceptions after the fashion of Data.Text.Encoding, rather than returning the undigested bytes in the style of Pipes.Text.Encoding. This is the standard practice in the pipes libraries.

fromHandleLn :: MonadIO m => Handle -> Producer' Text m () Source #

Read separate lines of Text from a Handle using hGetLine, terminating at the end of input

This operation will accumulate indefinitely large strict texts. See the caveats above.

toHandleLn :: MonadIO m => Handle -> Consumer' Text m r Source #

Write separate lines of Text to a Handle using hPutStrLn

stdinLn :: MonadIO m => Producer' Text m () Source #

Read separate lines of Text from stdin using getLine, terminating on end of input.

This function will accumulate indefinitely long strict Texts. See the caveats above.

stdoutLn :: MonadIO m => Consumer' Text m () Source #

Write Text lines to stdout using putStrLn, terminating without error on a broken output pipe

stdoutLn' :: MonadIO m => Consumer' Text m r Source #

Write lines of Text to stdout. This does not handle a broken output pipe, but has a polymorphic return value.

readFileLn :: MonadSafe m => FilePath -> Producer Text m () Source #

Stream separate lines of text from a file. Apply runSafeT after running the pipeline to manage the opening and closing of the handle.

This operation will accumulate indefinitely long strict text chunks. See the caveats above.

Write lines to a file. Apply runSafeT after running the pipeline to manage the opening and closing of the handle.