-rw-r--r-- | Pipes/Text.hs | 89 | ||||
-rw-r--r-- | pipes-text.cabal | 2 |
2 files changed, 68 insertions, 23 deletions
diff --git a/Pipes/Text.hs b/Pipes/Text.hs index 9641256..d0a219d 100644 --- a/Pipes/Text.hs +++ b/Pipes/Text.hs | |||
@@ -1,43 +1,88 @@ | |||
1 | {-# LANGUAGE RankNTypes, TypeFamilies, BangPatterns, Trustworthy #-} | 1 | {-# LANGUAGE RankNTypes, TypeFamilies, BangPatterns, Trustworthy #-} |
2 | 2 | ||
3 | {-| This package provides @pipes@ utilities for \"text streams\", which are | 3 | {-| This package provides @pipes@ utilities for \'text streams\', which are |
4 | streams of 'Text' chunks. The individual chunks are uniformly @strict@, and you | 4 | streams of 'Text' chunks. The individual chunks are uniformly @strict@, and you |
5 | will generally want @Data.Text@ in scope. But the type @Producer Text m r@ is | 5 | will generally want @Data.Text@ in scope. But the type @Producer Text m r@ is |
6 | in many ways the pipes equivalent of lazy @Text@ . | 6 | in some ways the pipes equivalent of the lazy @Text@ type. |
7 | 7 | ||
8 | This module provides many functions equivalent in one way or another to | 8 | This module provides many functions equivalent in one way or another to |
9 | the 'pure' functions in | 9 | the 'pure' functions in |
10 | <https://hackage.haskell.org/package/text-1.1.0.0/docs/Data-Text-Lazy.html Data.Text.Lazy>. | 10 | <https://hackage.haskell.org/package/text-1.1.0.0/docs/Data-Text-Lazy.html Data.Text.Lazy>. |
11 | They transform, divide, group and fold text streams. The functions | 11 | They transform, divide, group and fold text streams. Though @Producer Text m r@ |
12 | is \'effectful\' Text, functions | ||
12 | in this module are \'pure\' in the sense that they are uniformly monad-independent. | 13 | in this module are \'pure\' in the sense that they are uniformly monad-independent. |
13 | Simple IO operations are defined in | 14 | Simple IO operations are defined in @Pipes.Text.IO@ -- as lazy IO @Text@ |
14 | @Pipes.Text.IO@ -- as lazy IO @Text@ operations are in @Data.Text.Lazy.IO@ Interoperation | 15 | operations are in @Data.Text.Lazy.IO@. Interoperation with @ByteString@ |
15 | with @ByteString@ is provided in @Pipes.Text.Encoding@, which parallels @Data.Text.Lazy.Encoding@. | 16 | is provided in @Pipes.Text.Encoding@, which parallels @Data.Text.Lazy.Encoding@. |
16 | 17 | ||
17 | The Text type exported by @Data.Text.Lazy@ is similar to '[Text]' | 18 | The Text type exported by @Data.Text.Lazy@ is basically '[Text]'. The implementation |
18 | where the individual chunks are kept to a reasonable size; the user is not | 19 | is arranged so that the individual strict 'Text' chunks are kept to a reasonable size; |
19 | aware of the divisions between the connected (strict) 'Text' chunks. | 20 | the user is not aware of the divisions between the connected 'Text' chunks. |
20 | Similarly, functions in this module are designed to operate on streams that | 21 | So also here: the functions in this module are designed to operate on streams that |
21 | are insensitive to text boundaries. This means that they may freely split | 22 | are insensitive to text boundaries. This means that they may freely split |
22 | text into smaller texts, /discard empty texts/. However, the objective is that they should | 23 | text into smaller texts and /discard empty texts/. However, the objective is |
23 | /never concatenate texts/ in order to provide strict upper bounds on memory usage. | 24 | that they should /never concatenate texts/ in order to provide strict upper |
24 | 25 | bounds on memory usage. | |
25 | One difference from @Data.Text.Lazy@ is that many of the operations are 'lensified'; | 26 | |
26 | this has a number of advantages where it is possible, in particular it facilitate | ||
27 | their use with pipes-style 'Parser's of Text. | ||
28 | For example, to stream only the first three lines of 'stdin' to 'stdout' you | 27 | For example, to stream only the first three lines of 'stdin' to 'stdout' you |
29 | might write: | 28 | might write: |
30 | 29 | ||
31 | > import Pipes | 30 | > import Pipes |
32 | > import qualified Pipes.Text as Text | 31 | > import qualified Pipes.Text as Text |
33 | > import qualified Pipes.Parse as Parse | 32 | > import qualified Pipes.Text.IO as Text |
34 | > | 33 | > import Pipes.Group |
34 | > import Lens.Family | ||
35 | > | ||
35 | > main = runEffect $ takeLines 3 Text.stdin >-> Text.stdout | 36 | > main = runEffect $ takeLines 3 Text.stdin >-> Text.stdout |
36 | > where | 37 | > where |
37 | > takeLines n = Text.unlines . Parse.takeFree n . Text.lines | 38 | > takeLines n = Text.unlines . takes' n . view Text.lines |
39 | > -- or equivalently: | ||
40 | > -- takeLines n = over Text.lines (takes' n) | ||
38 | 41 | ||
39 | The above program will never bring more than one chunk of text (~ 32 KB) into | 42 | The above program will never bring more than one chunk of text (~ 32 KB) into |
40 | memory, no matter how long the lines are. | 43 | memory, no matter how long the lines are. |
44 | |||
45 | As this example shows, one superficial difference from @Data.Text.Lazy@ | ||
46 | is that many of the operations, like 'lines', | ||
47 | are \'lensified\'; this has a number of advantages where it is possible, in particular | ||
48 | it facilitates their use with 'Parser's of Text in the general pipes sense. | ||
49 | Each such expression reduces to the naturally corresponding function when | ||
50 | used with @view@ or @(^.)@. | ||
51 | |||
52 | A more important difference the example reveals is in the types closely associated with | ||
53 | the central type, @Producer Text m r@. In @Data.Text@ and @Data.Text.Lazy@ | ||
54 | we find functions like | ||
55 | |||
56 | > splitAt :: Int -> Text -> (Text, Text) | ||
57 | > lines :: Text -> [Text] | ||
58 | |||
59 | which relate a Text with a pair or list of Texts. The corresponding functions here (taking | ||
60 | account of \'lensification\') are | ||
61 | |||
62 | > view . splitAt :: (Monad m, Integral n) | ||
63 | > => n -> Producer Text m r -> Producer Text.Text m (Producer Text.Text m r) | ||
64 | > view lines :: Monad m => Producer Text m r -> FreeT (Producer Text m) m r | ||
65 | |||
66 | In the type @Producer Text.Text m (Producer Text.Text m r)@ the second | ||
67 | element of the \'pair\' of \'effectful Texts\' cannot simply be retrieved | ||
68 | with 'snd'. This is an \'effectful\' pair, and one must work through the effects | ||
69 | of the first element to arrive at the second. Similarly in @FreeT (Producer Text m) m r@, | ||
70 | which corresponds with @[Text]@, one cannot simply drop 10 Producers and take the others; | ||
71 | we can only get to the ones we want to take by working through their predecessors. | ||
72 | |||
73 | Some of the types may be more readable if you imagine that we have introduced | ||
74 | our own type synonyms | ||
75 | |||
76 | > type Text m r = Producer T.Text m r | ||
77 | > type Texts m r = FreeT (Producer T.Text m) m r | ||
78 | |||
79 | Then we would think of the types above as | ||
80 | |||
81 | > view . splitAt :: (Monad m, Integral n) => n -> Text m r -> Text m (Text m r) | ||
82 | > view lines :: (Monad m) => Text m r -> Texts m r | ||
83 | |||
84 | which brings one closer to the types of the similar functions in @Data.Text.Lazy@. | ||
85 | |||
41 | -} | 86 | -} |
42 | 87 | ||
43 | module Pipes.Text ( | 88 | module Pipes.Text ( |
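Note on the \'effectful pair\' described in the new documentation: the point that the second element of @Producer Text m (Producer Text m r)@ is reached only by working through the effects of the first can be seen in a small model. The sketch below is a toy that uses only @base@; it is not the pipes library. The `Producer` type, `splitAtP`, `drain` and `fromChunks` are all hypothetical names defined here for illustration, and `String` chunks stand in for strict `Text`.

```haskell
-- Toy model only (an assumption for illustration, NOT the pipes types):
-- a producer yields a chunk, runs an effect in m, or finishes with a result.
data Producer a m r
  = Yield a (Producer a m r)
  | Effect (m (Producer a m r))
  | Done r

-- Hypothetical analogue of 'view . splitAt': the outer producer streams the
-- first n characters and returns the remainder as its result.
-- Chunks are split when needed but never concatenated.
splitAtP :: Monad m
         => Int -> Producer String m r -> Producer String m (Producer String m r)
splitAtP n p
  | n <= 0    = Done p
  | otherwise = case p of
      Done r   -> Done (Done r)
      Effect m -> Effect (fmap (splitAtP n) m)
      Yield s rest
        | null s        -> splitAtP n rest            -- discard empty chunks
        | length s <= n -> Yield s (splitAtP (n - length s) rest)
        | otherwise     -> let (a, b) = splitAt n s
                           in Yield a (Done (Yield b rest))

-- Run a producer to completion, collecting its chunks and its return value.
drain :: Monad m => Producer a m r -> m ([a], r)
drain (Done r)       = return ([], r)
drain (Effect m)     = m >>= drain
drain (Yield a rest) = do
  (as, r) <- drain rest
  return (a : as, r)

-- A source that logs each chunk as it is "read", so the effects are visible.
fromChunks :: [String] -> Producer String IO ()
fromChunks = foldr step (Done ())
  where
    step c k = Effect (putStrLn ("reading " ++ show c) >> return (Yield c k))

main :: IO ()
main = do
  -- The remainder only becomes available after the first part has been run.
  (front, rest) <- drain (splitAtP 5 (fromChunks ["abc", "defg", "hi"]))
  print front
  (back, _) <- drain rest
  print back
```

In this model there is no counterpart to `snd`: `rest` exists only as the return value of the outer producer, so the reads for "abc" and "defg" are logged before `front` is printed, and the read for "hi" happens only when `rest` itself is drained.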
diff --git a/pipes-text.cabal b/pipes-text.cabal index f38b7f2..3ced2bc 100644 --- a/pipes-text.cabal +++ b/pipes-text.cabal | |||
@@ -1,5 +1,5 @@ | |||
1 | name: pipes-text | 1 | name: pipes-text |
2 | version: 0.0.0.7 | 2 | version: 0.0.0.8 |
3 | synopsis: Text pipes. | 3 | synopsis: Text pipes. |
4 | description: * This package will be in a draft, or testing, phase until version 0.0.1. Please report any installation difficulties, or any wisdom about the api, on the github page! | 4 | description: * This package will be in a draft, or testing, phase until version 0.0.1. Please report any installation difficulties, or any wisdom about the api, on the github page! |
5 | . | 5 | . |