Diffstat (limited to 'Pipes')
-rw-r--r-- Pipes/Text.hs 91
1 files changed, 53 insertions, 38 deletions
diff --git a/Pipes/Text.hs b/Pipes/Text.hs
index 2f69806..b90948f 100644
--- a/Pipes/Text.hs
+++ b/Pipes/Text.hs
@@ -1,26 +1,27 @@
 {-# LANGUAGE RankNTypes, TypeFamilies, BangPatterns, Trustworthy #-}
 
-{-| This package provides @pipes@ utilities for \'text streams\', which are
-    streams of 'Text' chunks. The individual chunks are uniformly @strict@, and thus you
+{-| This /package/ provides @pipes@ utilities for /text streams/, which are
+    streams of 'Text' chunks. The individual chunks are uniformly /strict/, and thus you
     will generally want @Data.Text@ in scope. But the type @Producer Text m r@ is
     in some ways the pipes equivalent of the lazy @Text@ type.
 
-    This module provides many functions equivalent in one way or another to
-    the 'pure' functions in
+    This /module/ provides many functions equivalent in one way or another to
+    the pure functions in
     <https://hackage.haskell.org/package/text-1.1.0.0/docs/Data-Text-Lazy.html Data.Text.Lazy>.
     They transform, divide, group and fold text streams. Though @Producer Text m r@
     is the type of \'effectful Text\', the functions in this module are \'pure\'
     in the sense that they are uniformly monad-independent.
-    Simple IO operations are defined in @Pipes.Text.IO@ -- as lazy IO @Text@
-    operations are in @Data.Text.Lazy.IO@. Interoperation with @ByteString@
+    Simple /IO/ operations are defined in @Pipes.Text.IO@ -- as lazy IO @Text@
+    operations are in @Data.Text.Lazy.IO@. Inter-operation with @ByteString@
     is provided in @Pipes.Text.Encoding@, which parallels @Data.Text.Lazy.Encoding@.
 
-    The Text type exported by @Data.Text.Lazy@ is basically '[Text]'. The implementation
-    is arranged so that the individual strict 'Text' chunks are kept to a reasonable size;
-    the user is not aware of the divisions between the connected 'Text' chunks.
+    The Text type exported by @Data.Text.Lazy@ is basically that of a lazy list of
+    strict Text: the implementation is arranged so that the individual strict 'Text'
+    chunks are kept to a reasonable size; the user is not aware of the divisions
+    between the connected 'Text' chunks.
     So also here: the functions in this module are designed to operate on streams that
     are insensitive to text boundaries. This means that they may freely split
-    text into smaller texts and /discard empty texts/. However, the objective is
+    text into smaller texts and /discard empty texts/. The objective, though, is
     that they should /never concatenate texts/ in order to provide strict upper
     bounds on memory usage.
 
@@ -30,66 +31,80 @@
 > import Pipes
 > import qualified Pipes.Text as Text
 > import qualified Pipes.Text.IO as Text
-> import Pipes.Group
+> import Pipes.Group (takes')
 > import Lens.Family
 >
 > main = runEffect $ takeLines 3 Text.stdin >-> Text.stdout
 >   where
 >     takeLines n = Text.unlines . takes' n . view Text.lines
-> -- or equivalently:
-> -- takeLines n = over Text.lines (takes' n)
+
 
     The above program will never bring more than one chunk of text (~ 32 KB) into
     memory, no matter how long the lines are.
 
     As this example shows, one superficial difference from @Data.Text.Lazy@
     is that many of the operations, like 'lines',
-    are \'lensified\'; this has a number of advantages where it is possible, in particular
+    are \'lensified\'; this has a number of advantages (where it is possible), in particular
     it facilitates their use with 'Parser's of Text (in the general
     <http://hackage.haskell.org/package/pipes-parse-3.0.1/docs/Pipes-Parse-Tutorial.html pipes-parse>
     sense.)
     Each such expression, e.g. 'lines', 'chunksOf' or 'splitAt', reduces to the
-    intuitively corresponding function when used with @view@ or @(^.)@. The lens combinators
-    you will find indispensible are \'view\'/ '(^.)', 'zoom' and probably 'over', which
+    intuitively corresponding function when used with @view@ or @(^.)@.
+
+    Note similarly that many equivalents of 'Text -> Text' functions are exported here as 'Pipe's.
+    They reduce to the intuitively corresponding functions when used with '(>->)'. Thus something like
+
+> stripLines = Text.unlines . Group.maps (>-> Text.stripStart) . view Text.lines
+
+    would drop the leading white space from each line.
+
+    The lens combinators
+    you will find indispensible are \'view\' / '(^.)', 'zoom' and probably 'over'. These
     are supplied by both <http://hackage.haskell.org/package/lens lens> and
-    <http://hackage.haskell.org/package/lens-family lens-family>
+    <http://hackage.haskell.org/package/lens-family lens-family> The use of 'zoom' is explained
+    in <http://hackage.haskell.org/package/pipes-parse-3.0.1/docs/Pipes-Parse-Tutorial.html Pipes.Parse.Tutorial>
+    and to some extent in Pipes.Text.Encoding. The use of
+    'over' is simple, illustrated by the fact that we can rewrite @stripLines@ above as
+
+> stripLines = over Text.lines $ maps (>-> stripStart)
 
-    A more important difference the example reveals is in the types closely associated with
-    the central type, @Producer Text m r@. In @Data.Text@ and @Data.Text.Lazy@
-    we find functions like
+    These simple 'lines' examples reveal a more important difference from @Data.Text.Lazy@ .
+    This is in the types that are most closely associated with our central text type,
+    @Producer Text m r@. In @Data.Text@ and @Data.Text.Lazy@ we find functions like
 
 > splitAt :: Int -> Text -> (Text, Text)
-> lines :: Int -> Text -> [Text]
+> lines :: Text -> [Text]
 > chunksOf :: Int -> Text -> [Text]
 
-    which relate a Text with a pair or list of Texts. The corresponding functions here (taking
-    account of \'lensification\') are
+    which relate a Text with a pair of Texts or a list of Texts.
+    The corresponding functions here (taking account of \'lensification\') are
 
-> view . splitAt :: (Monad m, Integral n) => n -> Producer Text m r -> Producer Text.Text m (Producer Text.Text m r)
+> view . splitAt :: (Monad m, Integral n) => n -> Producer Text m r -> Producer Text m (Producer Text m r)
 > view lines :: Monad m => Producer Text m r -> FreeT (Producer Text m) m r
 > view . chunksOf :: (Monad m, Integral n) => n -> Producer Text m r -> FreeT (Producer Text m) m r
 
-    In the type @Producer Text m (Producer Text m r)@ the second
-    element of the \'pair\' of of \'effectful Texts\' cannot simply be retrieved
-    with 'snd'. This is an \'effectful\' pair, and one must work through the effects
-    of the first element to arrive at the second Text stream. Similarly in @FreeT (Producer Text m) m r@,
-    which corresponds with @[Text]@, on cannot simply drop 10 Producers and take the others;
-    we can only get to the ones we want to take by working through their predecessors.
-
     Some of the types may be more readable if you imagine that we have introduced
     our own type synonyms
 
 > type Text m r = Producer T.Text m r
 > type Texts m r = FreeT (Producer T.Text m) m r
 
     Then we would think of the types above as
 
 > view . splitAt :: (Monad m, Integral n) => n -> Text m r -> Text m (Text m r)
 > view lines :: (Monad m) => Text m r -> Texts m r
 > view . chunksOf :: (Monad m, Integral n) => n -> Text m r -> Texts m r
 
     which brings one closer to the types of the similar functions in @Data.Text.Lazy@
 
+    In the type @Producer Text m (Producer Text m r)@ the second
+    element of the \'pair\' of \'effectful Texts\' cannot simply be retrieved
+    with something like 'snd'. This is an \'effectful\' pair, and one must work
+    through the effects of the first element to arrive at the second Text stream.
+    Note that we use Control.Monad.join to fuse the pair back together, since it specializes to
+
+> join :: Producer Text m (Producer m r) -> Producer m r
+
 -}
 
 module Pipes.Text (
@@ -294,7 +309,7 @@ toLower = P.map T.toLower
   #-}
 
 -- | uppercase incoming 'Text'
-toUpper :: Monad m => Pipe Text Text m ()
+toUpper :: Monad m => Pipe Text Text m r
 toUpper = P.map T.toUpper
 {-# INLINEABLE toUpper #-}
 
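
The last hunk generalizes the return type of `toUpper` from `()` to `r`. A minimal sketch of what that allows, assuming only the `pipes` and `text` packages: `(>->)` requires both sides to share a return type, so a pipe fixed at `()` cannot be composed with a producer that returns anything else, such as the inner producers that `maps (>-> ...)` is applied to in the documentation above. The names `counted`, `upperCounted`, `drain` and `collect` are hypothetical helpers written for this illustration and are not part of the patch; `toUpper` is redefined locally with the new signature so that the block stands alone.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Functor.Identity (runIdentity)
import Data.Text (Text)
import qualified Data.Text as T
import Pipes
import qualified Pipes.Prelude as P

-- Local copy of the definition from the patch, with the generalized
-- return type r (P.map itself is polymorphic in its return type).
toUpper :: Monad m => Pipe Text Text m r
toUpper = P.map T.toUpper

-- Hypothetical producer whose return value is not (): it yields each
-- chunk and then returns how many chunks it produced.
counted :: Monad m => [Text] -> Producer Text m Int
counted ts = do
  mapM_ yield ts
  return (length ts)

-- This composition type-checks only because toUpper's r unifies with Int.
-- With the old signature (Pipe Text Text m ()) it would be rejected.
upperCounted :: Monad m => [Text] -> Producer Text m Int
upperCounted ts = counted ts >-> toUpper

-- Hypothetical helper: step through a producer with 'next', collecting
-- the yielded values together with the final return value.
drain :: Monad m => Producer a m r -> m ([a], r)
drain p = do
  step <- next p
  case step of
    Left r -> return ([], r)
    Right (a, rest) -> do
      (as, r) <- drain rest
      return (a : as, r)

-- Run the pipeline purely in the Identity monad.
collect :: [Text] -> ([Text], Int)
collect = runIdentity . drain . upperCounted

main :: IO ()
main = print (collect ["ab", "cd"])
```

Whether this was the motivation for the change is an assumption; the patch itself only shows the new signature.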