path: root/Pipes/Text.hs
Diffstat (limited to 'Pipes/Text.hs')
-rw-r--r-- Pipes/Text.hs 89
1 files changed, 67 insertions, 22 deletions
diff --git a/Pipes/Text.hs b/Pipes/Text.hs
index 9641256..d0a219d 100644
--- a/Pipes/Text.hs
+++ b/Pipes/Text.hs
@@ -1,43 +1,88 @@
1{-# LANGUAGE RankNTypes, TypeFamilies, BangPatterns, Trustworthy #-} 1{-# LANGUAGE RankNTypes, TypeFamilies, BangPatterns, Trustworthy #-}
2 2
3{-| This package provides @pipes@ utilities for \"text streams\", which are 3{-| This package provides @pipes@ utilities for \'text streams\', which are
4 streams of 'Text' chunks. The individual chunks are uniformly @strict@, and you 4 streams of 'Text' chunks. The individual chunks are uniformly @strict@, and you
5 will generally want @Data.Text@ in scope. But the type @Producer Text m r@ is 5 will generally want @Data.Text@ in scope. But the type @Producer Text m r@ is
6 in many ways the pipes equivalent of lazy @Text@ . 6 in some ways the pipes equivalent of the lazy @Text@ type.
7 7
8 This module provides many functions equivalent in one way or another to 8 This module provides many functions equivalent in one way or another to
9 the 'pure' functions in 9 the 'pure' functions in
10 <https://hackage.haskell.org/package/text-1.1.0.0/docs/Data-Text-Lazy.html Data.Text.Lazy>. 10 <https://hackage.haskell.org/package/text-1.1.0.0/docs/Data-Text-Lazy.html Data.Text.Lazy>.
11 They transform, divide, group and fold text streams. The functions 11 They transform, divide, group and fold text streams. Though @Producer Text m r@
12 is \'effectful\' Text, functions
12 in this module are \'pure\' in the sense that they are uniformly monad-independent. 13 in this module are \'pure\' in the sense that they are uniformly monad-independent.
13 Simple IO operations are defined in 14 Simple IO operations are defined in @Pipes.Text.IO@ -- as lazy IO @Text@
14 @Pipes.Text.IO@ -- as lazy IO @Text@ operations are in @Data.Text.Lazy.IO@ Interoperation 15 operations are in @Data.Text.Lazy.IO@. Interoperation with @ByteString@
15 with @ByteString@ is provided in @Pipes.Text.Encoding@, which parallels @Data.Text.Lazy.Encoding@. 16 is provided in @Pipes.Text.Encoding@, which parallels @Data.Text.Lazy.Encoding@.
16 17
17 The Text type exported by @Data.Text.Lazy@ is similar to '[Text]' 18 The Text type exported by @Data.Text.Lazy@ is basically '[Text]'. The implementation
18 where the individual chunks are kept to a reasonable size; the user is not 19 is arranged so that the individual strict 'Text' chunks are kept to a reasonable size;
19 aware of the divisions between the connected (strict) 'Text' chunks. 20 the user is not aware of the divisions between the connected 'Text' chunks.
20 Similarly, functions in this module are designed to operate on streams that 21 So also here: the functions in this module are designed to operate on streams that
21 are insensitive to text boundaries. This means that they may freely split 22 are insensitive to text boundaries. This means that they may freely split
22 text into smaller texts, /discard empty texts/. However, the objective is that they should 23 text into smaller texts and /discard empty texts/. However, the objective is
23 /never concatenate texts/ in order to provide strict upper bounds on memory usage. 24 that they should /never concatenate texts/ in order to provide strict upper
24 25 bounds on memory usage.
25 One difference from @Data.Text.Lazy@ is that many of the operations are 'lensified'; 26
26 this has a number of advantages where it is possible, in particular it facilitate
27 their use with pipes-style 'Parser's of Text.
28 For example, to stream only the first three lines of 'stdin' to 'stdout' you 27 For example, to stream only the first three lines of 'stdin' to 'stdout' you
29 might write: 28 might write:
30 29
31> import Pipes 30> import Pipes
32> import qualified Pipes.Text as Text 31> import qualified Pipes.Text as Text
33> import qualified Pipes.Parse as Parse 32> import qualified Pipes.Text.IO as Text
34> 33> import Pipes.Group
34> import Lens.Family
35>
35> main = runEffect $ takeLines 3 Text.stdin >-> Text.stdout 36> main = runEffect $ takeLines 3 Text.stdin >-> Text.stdout
36> where 37> where
37> takeLines n = Text.unlines . Parse.takeFree n . Text.lines 38> takeLines n = Text.unlines . takes' n . view Text.lines
39> -- or equivalently:
40> -- takeLines n = over Text.lines (takes' n)
38 41
39 The above program will never bring more than one chunk of text (~ 32 KB) into 42 The above program will never bring more than one chunk of text (~ 32 KB) into
40 memory, no matter how long the lines are. 43 memory, no matter how long the lines are.
44
45 As this example shows, one superficial difference from @Data.Text.Lazy@
46 is that many of the operations, like 'lines',
47 are \'lensified\'; this has a number of advantages where it is possible, in particular
48 it facilitates their use with 'Parser's of Text in the general pipes sense.
49 Each such expression reduces to the naturally corresponding function when
50 used with @view@ or @(^.)@.
51
52 A more important difference the example reveals is in the types closely associated with
53 the central type, @Producer Text m r@. In @Data.Text@ and @Data.Text.Lazy@
54 we find functions like
55
56> splitAt :: Int -> Text -> (Text, Text)
57> lines :: Text -> [Text]
58
59 which relate a Text with a pair or list of Texts. The corresponding functions here (taking
60 account of \'lensification\') are
61
62> view . splitAt :: (Monad m, Integral n)
63> => n -> Producer Text m r -> Producer Text.Text m (Producer Text.Text m r)
64> view lines :: Monad m => Producer Text m r -> FreeT (Producer Text m) m r
65
66 In the type @Producer Text.Text m (Producer Text.Text m r)@ the second
67 element of the \'pair\' of \'effectful Texts\' cannot simply be retrieved
68 with 'snd'. This is an \'effectful\' pair, and one must work through the effects
69 of the first element to arrive at the second. Similarly in @FreeT (Producer Text m) m r@,
70 which corresponds with @[Text]@, one cannot simply drop 10 Producers and take the others;
71 we can only get to the ones we want to take by working through their predecessors.
72
73 Some of the types may be more readable if you imagine that we have introduced
74 our own type synonyms
75
76> type Text m r = Producer T.Text m r
77> type Texts m r = FreeT (Producer T.Text m) m r
78
79 Then we would think of the types above as
80
81> view . splitAt :: (Monad m, Integral n) => n -> Text m r -> Text m (Text m r)
82> view lines :: (Monad m) => Text m r -> Texts m r
83
84 which brings one closer to the types of the similar functions in @Data.Text.Lazy@.
85
41-} 86-}
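The "effectful pair" that the new documentation describes for @splitAt@ can be illustrated with a toy model. The sketch below does not use the real @pipes@ types: `Producer`, `splitAtChars`, `drain` and `fromChunks` are hypothetical stand-ins written against `base` only, and `String` chunks stand in for strict `Text`. It is meant only to show why the second half of the split cannot be reached with `snd`: it is the return value of the first half, so the first half's effects have to be run first.

```haskell
module Main where

-- Toy stand-in for a pipes Producer (an assumption for illustration,
-- NOT the definition used by the pipes library).
data Producer a m r
  = Yield a (Producer a m r)     -- emit one chunk, then continue
  | Effect (m (Producer a m r))  -- run an effect to find out what comes next
  | Done r                       -- finished, carrying a return value

-- Split after n characters. The outer producer yields the first n
-- characters; its return value is the producer for the remainder.
-- Chunks are only ever split, never concatenated.
splitAtChars
  :: Functor m
  => Int -> Producer String m r -> Producer String m (Producer String m r)
splitAtChars n p
  | n <= 0    = Done p
  | otherwise = case p of
      Done r   -> Done (Done r)
      Effect m -> Effect (fmap (splitAtChars n) m)
      Yield chunk rest
        | len <= n  -> Yield chunk (splitAtChars (n - len) rest)
        | otherwise ->
            let (front, back) = splitAt n chunk
            in  Yield front (Done (Yield back rest))
        where len = length chunk

-- Run a producer to completion, collecting its chunks and return value.
drain :: Monad m => Producer a m r -> m ([a], r)
drain (Done r)       = return ([], r)
drain (Effect m)     = m >>= drain
drain (Yield a rest) = do
  (as, r) <- drain rest
  return (a : as, r)

-- A source whose chunks each cost an IO effect to obtain.
fromChunks :: [String] -> Producer String IO ()
fromChunks = foldr step (Done ())
  where
    step c rest = Effect $ do
      putStrLn ("reading chunk " ++ show c)
      return (Yield c rest)

-- The remainder is only available after draining the first part.
demo :: IO ([String], [String])
demo = do
  (front, rest) <- drain (splitAtChars 7 (fromChunks ["hello", " wor", "ld"]))
  (back, ())    <- drain rest
  return (front, back)

main :: IO ()
main = demo >>= print
```

In this toy model the chunk `" wor"` is split in two at the boundary, while `"hello"` passes through untouched, which mirrors the stated aim of splitting texts freely but never concatenating them.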
42 87
43module Pipes.Text ( 88module Pipes.Text (