{-# LANGUAGE RankNTypes, TypeFamilies, BangPatterns, Trustworthy #-}
-{-| This package provides @pipes@ utilities for \'text streams\', which are
- streams of 'Text' chunks. The individual chunks are uniformly @strict@, and thus you
+{-| This /package/ provides @pipes@ utilities for /text streams/, which are
+ streams of 'Text' chunks. The individual chunks are uniformly /strict/, and thus you
will generally want @Data.Text@ in scope. But the type @Producer Text m r@ is
in some ways the pipes equivalent of the lazy @Text@ type.
- This module provides many functions equivalent in one way or another to
- the 'pure' functions in
+ This /module/ provides many functions equivalent in one way or another to
+ the pure functions in
<https://hackage.haskell.org/package/text-1.1.0.0/docs/Data-Text-Lazy.html Data.Text.Lazy>.
They transform, divide, group and fold text streams. Though @Producer Text m r@
is the type of \'effectful Text\', the functions in this module are \'pure\'
in the sense that they are uniformly monad-independent.
- Simple IO operations are defined in @Pipes.Text.IO@ -- as lazy IO @Text@
- operations are in @Data.Text.Lazy.IO@. Interoperation with @ByteString@
+ Simple /IO/ operations are defined in @Pipes.Text.IO@ -- as lazy IO @Text@
+ operations are in @Data.Text.Lazy.IO@. Inter-operation with @ByteString@
is provided in @Pipes.Text.Encoding@, which parallels @Data.Text.Lazy.Encoding@.
- The Text type exported by @Data.Text.Lazy@ is basically '[Text]'. The implementation
- is arranged so that the individual strict 'Text' chunks are kept to a reasonable size;
- the user is not aware of the divisions between the connected 'Text' chunks.
+ The Text type exported by @Data.Text.Lazy@ is basically that of a lazy list of
+ strict Text: the implementation is arranged so that the individual strict 'Text'
+ chunks are kept to a reasonable size; the user is not aware of the divisions
+ between the connected 'Text' chunks.
So also here: the functions in this module are designed to operate on streams that
are insensitive to text boundaries. This means that they may freely split
- text into smaller texts and /discard empty texts/. However, the objective is
+ text into smaller texts and /discard empty texts/. The objective, though, is
that they should /never concatenate texts/ in order to provide strict upper
bounds on memory usage.
> import Pipes
> import qualified Pipes.Text as Text
> import qualified Pipes.Text.IO as Text
-> import Pipes.Group
+> import Pipes.Group (takes')
> import Lens.Family
>
> main = runEffect $ takeLines 3 Text.stdin >-> Text.stdout
> where
> takeLines n = Text.unlines . takes' n . view Text.lines
-> -- or equivalently:
-> -- takeLines n = over Text.lines (takes' n)
+
The above program will never bring more than one chunk of text (~ 32 KB) into
memory, no matter how long the lines are.
As this example shows, one superficial difference from @Data.Text.Lazy@
is that many of the operations, like 'lines',
- are \'lensified\'; this has a number of advantages where it is possible, in particular
+ are \'lensified\'; this has a number of advantages (where it is possible), in particular
it facilitates their use with 'Parser's of Text (in the general
<http://hackage.haskell.org/package/pipes-parse-3.0.1/docs/Pipes-Parse-Tutorial.html pipes-parse>
sense.)
Each such expression, e.g. 'lines', 'chunksOf' or 'splitAt', reduces to the
- intuitively corresponding function when used with @view@ or @(^.)@. The lens combinators
- you will find indispensible are \'view\'/ '(^.)', 'zoom' and probably 'over', which
+ intuitively corresponding function when used with @view@ or @(^.)@.
+
+ Note similarly that many equivalents of 'Text -> Text' functions are exported here as 'Pipe's.
+ They reduce to the intuitively corresponding functions when used with '(>->)'. Thus something like
+
+> stripLines = Text.unlines . Group.maps (>-> Text.stripStart) . view Text.lines
+
+ would drop the leading white space from each line.
+
+ The lens combinators
+ you will find indispensible are \'view\' / '(^.)', 'zoom' and probably 'over'. These
are supplied by both <http://hackage.haskell.org/package/lens lens> and
- <http://hackage.haskell.org/package/lens-family lens-family>
+ <http://hackage.haskell.org/package/lens-family lens-family> The use of 'zoom' is explained
+ in <http://hackage.haskell.org/package/pipes-parse-3.0.1/docs/Pipes-Parse-Tutorial.html Pipes.Parse.Tutorial>
+ and to some extent in Pipes.Text.Encoding. The use of
+ 'over' is simple, illustrated by the fact that we can rewrite @stripLines@ above as
+
+> stripLines = over Text.lines $ maps (>-> stripStart)
- A more important difference the example reveals is in the types closely associated with
- the central type, @Producer Text m r@. In @Data.Text@ and @Data.Text.Lazy@
- we find functions like
+ These simple 'lines' examples reveal a more important difference from @Data.Text.Lazy@ .
+ This is in the types that are most closely associated with our central text type,
+ @Producer Text m r@. In @Data.Text@ and @Data.Text.Lazy@ we find functions like
-> splitAt :: Int -> Text -> (Text, Text)
-> lines :: Int -> Text -> [Text]
+> splitAt :: Int -> Text -> (Text, Text)
+> lines :: Text -> [Text]
> chunksOf :: Int -> Text -> [Text]
- which relate a Text with a pair or list of Texts. The corresponding functions here (taking
- account of \'lensification\') are
+ which relate a Text with a pair of Texts or a list of Texts.
+ The corresponding functions here (taking account of \'lensification\') are
-> view . splitAt :: (Monad m, Integral n) => n -> Producer Text m r -> Producer Text.Text m (Producer Text.Text m r)
-> view lines :: Monad m => Producer Text m r -> FreeT (Producer Text m) m r
-> view . chunksOf :: (Monad m, Integral n) => n -> Producer Text m r -> FreeT (Producer Text m) m r
+> view . splitAt :: (Monad m, Integral n) => n -> Producer Text m r -> Producer Text m (Producer Text m r)
+> view lines :: Monad m => Producer Text m r -> FreeT (Producer Text m) m r
+> view . chunksOf :: (Monad m, Integral n) => n -> Producer Text m r -> FreeT (Producer Text m) m r
- In the type @Producer Text m (Producer Text m r)@ the second
- element of the \'pair\' of of \'effectful Texts\' cannot simply be retrieved
- with 'snd'. This is an \'effectful\' pair, and one must work through the effects
- of the first element to arrive at the second Text stream. Similarly in @FreeT (Producer Text m) m r@,
- which corresponds with @[Text]@, on cannot simply drop 10 Producers and take the others;
- we can only get to the ones we want to take by working through their predecessors.
-
Some of the types may be more readable if you imagine that we have introduced
our own type synonyms
-> type Text m r = Producer T.Text m r
+> type Text m r = Producer T.Text m r
> type Texts m r = FreeT (Producer T.Text m) m r
Then we would think of the types above as
-> view . splitAt :: (Monad m, Integral n) => n -> Text m r -> Text m (Text m r)
-> view lines :: (Monad m) => Text m r -> Texts m r
+> view . splitAt :: (Monad m, Integral n) => n -> Text m r -> Text m (Text m r)
+> view lines :: (Monad m) => Text m r -> Texts m r
> view . chunksOf :: (Monad m, Integral n) => n -> Text m r -> Texts m r
which brings one closer to the types of the similar functions in @Data.Text.Lazy@
+ In the type @Producer Text m (Producer Text m r)@ the second
+ element of the \'pair\' of \'effectful Texts\' cannot simply be retrieved
+ with something like 'snd'. This is an \'effectful\' pair, and one must work
+ through the effects of the first element to arrive at the second Text stream.
+ Note that we use Control.Monad.join to fuse the pair back together, since it specializes to
+
+> join :: Producer Text m (Producer m r) -> Producer m r
+
-}
module Pipes.Text (
#-}
-- | uppercase incoming 'Text'
-toUpper :: Monad m => Pipe Text Text m ()
+toUpper :: Monad m => Pipe Text Text m r
toUpper = P.map T.toUpper
{-# INLINEABLE toUpper #-}