]> git.immae.eu Git - github/fretlink/text-pipes.git/blame - Pipes/Text.hs
generalized signatures
[github/fretlink/text-pipes.git] / Pipes / Text.hs
CommitLineData
9667f797
GG
1{-# LANGUAGE RankNTypes, TypeFamilies, BangPatterns, Trustworthy #-}
2
c70edb9d 3{-| This /package/ provides @pipes@ utilities for /text streams/, which are
4 streams of 'Text' chunks. The individual chunks are uniformly /strict/, and thus you
0ac0c414 5 will generally want @Data.Text@ in scope. But the type @Producer Text m r@ is
74377aa0 6 in some ways the pipes equivalent of the lazy @Text@ type.
0ac0c414 7
c70edb9d 8 This /module/ provides many functions equivalent in one way or another to
9 the pure functions in
0ac0c414 10 <https://hackage.haskell.org/package/text-1.1.0.0/docs/Data-Text-Lazy.html Data.Text.Lazy>.
74377aa0 11 They transform, divide, group and fold text streams. Though @Producer Text m r@
02f89dfe 12 is the type of \'effectful Text\', the functions in this module are \'pure\'
13 in the sense that they are uniformly monad-independent.
c70edb9d 14 Simple /IO/ operations are defined in @Pipes.Text.IO@ -- as lazy IO @Text@
15 operations are in @Data.Text.Lazy.IO@. Inter-operation with @ByteString@
74377aa0 16 is provided in @Pipes.Text.Encoding@, which parallels @Data.Text.Lazy.Encoding@.
17
c70edb9d 18 The Text type exported by @Data.Text.Lazy@ is basically that of a lazy list of
19 strict Text: the implementation is arranged so that the individual strict 'Text'
20 chunks are kept to a reasonable size; the user is not aware of the divisions
21 between the connected 'Text' chunks.
74377aa0 22 So also here: the functions in this module are designed to operate on streams that
0ac0c414 23 are insensitive to text boundaries. This means that they may freely split
c70edb9d 24 text into smaller texts and /discard empty texts/. The objective, though, is
74377aa0 25 that they should /never concatenate texts/ in order to provide strict upper
26 bounds on memory usage.
27
0ac0c414 28 For example, to stream only the first three lines of 'stdin' to 'stdout' you
31f41a5d 29 might write:
91727d11 30
31> import Pipes
31f41a5d 32> import qualified Pipes.Text as Text
74377aa0 33> import qualified Pipes.Text.IO as Text
c70edb9d 34> import Pipes.Group (takes')
74377aa0 35> import Lens.Family
36>
31f41a5d 37> main = runEffect $ takeLines 3 Text.stdin >-> Text.stdout
74377aa0 38> where
39> takeLines n = Text.unlines . takes' n . view Text.lines
c70edb9d 40
91727d11 41
31f41a5d 42 The above program will never bring more than one chunk of text (~ 32 KB) into
74377aa0 43 memory, no matter how long the lines are.
44
45 As this example shows, one superficial difference from @Data.Text.Lazy@
46 is that many of the operations, like 'lines',
c70edb9d 47 are \'lensified\'; this has a number of advantages (where it is possible), in particular
e7ad3643 48 it facilitates their use with 'Parser's of Text (in the general
49 <http://hackage.haskell.org/package/pipes-parse-3.0.1/docs/Pipes-Parse-Tutorial.html pipes-parse>
50 sense.)
51 Each such expression, e.g. 'lines', 'chunksOf' or 'splitAt', reduces to the
c70edb9d 52 intuitively corresponding function when used with @view@ or @(^.)@.
53
54 Note similarly that many equivalents of 'Text -> Text' functions are exported here as 'Pipe's.
55 They reduce to the intuitively corresponding functions when used with '(>->)'. Thus something like
56
57> stripLines = Text.unlines . Group.maps (>-> Text.stripStart) . view Text.lines
58
59 would drop the leading white space from each line.
60
61 The lens combinators
62 you will find indispensible are \'view\' / '(^.)', 'zoom' and probably 'over'. These
02f89dfe 63 are supplied by both <http://hackage.haskell.org/package/lens lens> and
c70edb9d 64 <http://hackage.haskell.org/package/lens-family lens-family> The use of 'zoom' is explained
65 in <http://hackage.haskell.org/package/pipes-parse-3.0.1/docs/Pipes-Parse-Tutorial.html Pipes.Parse.Tutorial>
66 and to some extent in Pipes.Text.Encoding. The use of
67 'over' is simple, illustrated by the fact that we can rewrite @stripLines@ above as
68
69> stripLines = over Text.lines $ maps (>-> stripStart)
74377aa0 70
c70edb9d 71 These simple 'lines' examples reveal a more important difference from @Data.Text.Lazy@ .
72 This is in the types that are most closely associated with our central text type,
73 @Producer Text m r@. In @Data.Text@ and @Data.Text.Lazy@ we find functions like
74377aa0 74
c70edb9d 75> splitAt :: Int -> Text -> (Text, Text)
76> lines :: Text -> [Text]
e7ad3643 77> chunksOf :: Int -> Text -> [Text]
74377aa0 78
c70edb9d 79 which relate a Text with a pair of Texts or a list of Texts.
80 The corresponding functions here (taking account of \'lensification\') are
74377aa0 81
c70edb9d 82> view . splitAt :: (Monad m, Integral n) => n -> Producer Text m r -> Producer Text m (Producer Text m r)
83> view lines :: Monad m => Producer Text m r -> FreeT (Producer Text m) m r
84> view . chunksOf :: (Monad m, Integral n) => n -> Producer Text m r -> FreeT (Producer Text m) m r
74377aa0 85
74377aa0 86 Some of the types may be more readable if you imagine that we have introduced
87 our own type synonyms
88
c70edb9d 89> type Text m r = Producer T.Text m r
74377aa0 90> type Texts m r = FreeT (Producer T.Text m) m r
91
92 Then we would think of the types above as
93
c70edb9d 94> view . splitAt :: (Monad m, Integral n) => n -> Text m r -> Text m (Text m r)
95> view lines :: (Monad m) => Text m r -> Texts m r
e7ad3643 96> view . chunksOf :: (Monad m, Integral n) => n -> Text m r -> Texts m r
74377aa0 97
98 which brings one closer to the types of the similar functions in @Data.Text.Lazy@
99
c70edb9d 100 In the type @Producer Text m (Producer Text m r)@ the second
101 element of the \'pair\' of \'effectful Texts\' cannot simply be retrieved
102 with something like 'snd'. This is an \'effectful\' pair, and one must work
103 through the effects of the first element to arrive at the second Text stream.
104 Note that we use Control.Monad.join to fuse the pair back together, since it specializes to
105
106> join :: Producer Text m (Producer m r) -> Producer m r
107
91727d11 108-}
109
7faef8bc 110module Pipes.Text (
91727d11 111 -- * Producers
1a83ae4e 112 fromLazy
91727d11 113
114 -- * Pipes
1677dc12 115 , map
116 , concatMap
117 , take
118 , drop
119 , takeWhile
120 , dropWhile
121 , filter
122 , scan
1677dc12 123 , pack
124 , unpack
125 , toCaseFold
126 , toLower
127 , toUpper
128 , stripStart
91727d11 129
130 -- * Folds
1677dc12 131 , toLazy
132 , toLazyM
133 , foldChars
134 , head
135 , last
136 , null
137 , length
138 , any
139 , all
140 , maximum
141 , minimum
142 , find
143 , index
144 , count
145
146 -- * Primitive Character Parsers
1677dc12 147 , nextChar
148 , drawChar
149 , unDrawChar
150 , peekChar
9e9bb0ce 151 , isEndOfChars
1677dc12 152
153 -- * Parsing Lenses
9e9bb0ce 154 , splitAt
1677dc12 155 , span
156 , break
157 , groupBy
158 , group
9e9bb0ce 159 , word
160 , line
1677dc12 161
162 -- * FreeT Splitters
163 , chunksOf
164 , splitsWith
0f8c6f1b 165 , splits
1a83ae4e 166 , groupsBy
167 , groups
1677dc12 168 , lines
169 , words
170
91727d11 171 -- * Transformations
1677dc12 172 , intersperse
9e9bb0ce 173 , packChars
31f41a5d 174
91727d11 175 -- * Joiners
1677dc12 176 , intercalate
177 , unlines
178 , unwords
9e9bb0ce 179
1a83ae4e 180 -- * Re-exports
91727d11 181 -- $reexports
1677dc12 182 , module Data.ByteString
183 , module Data.Text
184 , module Data.Profunctor
1677dc12 185 , module Pipes.Parse
7ed76745 186 , module Pipes.Group
91727d11 187 ) where
188
0f8c6f1b 189import Control.Applicative ((<*))
70125641 190import Control.Monad (liftM, join)
9e9bb0ce 191import Control.Monad.Trans.State.Strict (StateT(..), modify)
91727d11 192import qualified Data.Text as T
91727d11 193import Data.Text (Text)
194import qualified Data.Text.Lazy as TL
91727d11 195import Data.Text.Lazy.Internal (foldrChunks, defaultChunkSize)
31f41a5d 196import Data.ByteString (ByteString)
1677dc12 197import Data.Functor.Constant (Constant(Constant, getConstant))
91727d11 198import Data.Functor.Identity (Identity)
1677dc12 199import Data.Profunctor (Profunctor)
200import qualified Data.Profunctor
91727d11 201import Pipes
7fc48f7c 202import Pipes.Group (concats, intercalates, FreeT(..), FreeF(..))
7ed76745 203import qualified Pipes.Group as PG
91727d11 204import qualified Pipes.Parse as PP
7ed76745 205import Pipes.Parse (Parser)
91727d11 206import qualified Pipes.Prelude as P
91727d11 207import Data.Char (isSpace)
1a83ae4e 208import Data.Word (Word8)
1677dc12 209
91727d11 210import Prelude hiding (
211 all,
212 any,
213 break,
214 concat,
215 concatMap,
216 drop,
217 dropWhile,
218 elem,
219 filter,
220 head,
221 last,
222 lines,
223 length,
224 map,
225 maximum,
226 minimum,
227 notElem,
228 null,
229 readFile,
230 span,
231 splitAt,
232 take,
233 takeWhile,
234 unlines,
235 unwords,
236 words,
237 writeFile )
238
239-- | Convert a lazy 'TL.Text' into a 'Producer' of strict 'Text's
240fromLazy :: (Monad m) => TL.Text -> Producer' Text m ()
241fromLazy = foldrChunks (\e a -> yield e >> a) (return ())
ca6f90a0 242{-# INLINE fromLazy #-}
91727d11 243
1677dc12 244
245type Lens' a b = forall f . Functor f => (b -> f b) -> (a -> f a)
246
247type Iso' a b = forall f p . (Functor f, Profunctor p) => p b (f b) -> p a (f a)
248
249(^.) :: a -> ((b -> Constant b b) -> (a -> Constant b a)) -> b
250a ^. lens = getConstant (lens Constant a)
251
252
91727d11 253-- | Apply a transformation to each 'Char' in the stream
254map :: (Monad m) => (Char -> Char) -> Pipe Text Text m r
255map f = P.map (T.map f)
256{-# INLINABLE map #-}
257
ff38b9f0 258{-# RULES "p >-> map f" forall p f .
259 p >-> map f = for p (\txt -> yield (T.map f txt))
260 #-}
261
31f41a5d 262-- | Map a function over the characters of a text stream and concatenate the results
91727d11 263concatMap
264 :: (Monad m) => (Char -> Text) -> Pipe Text Text m r
265concatMap f = P.map (T.concatMap f)
266{-# INLINABLE concatMap #-}
267
ff38b9f0 268{-# RULES "p >-> concatMap f" forall p f .
269 p >-> concatMap f = for p (\txt -> yield (T.concatMap f txt))
270 #-}
7faef8bc 271
ff38b9f0 272
c0343bc9 273-- | Transform a Pipe of 'String's into one of 'Text' chunks
7faef8bc 274pack :: Monad m => Pipe String Text m r
275pack = P.map T.pack
276{-# INLINEABLE pack #-}
277
ff38b9f0 278{-# RULES "p >-> pack" forall p .
279 p >-> pack = for p (\txt -> yield (T.pack txt))
280 #-}
281
282-- | Transform a Pipes of 'Text' chunks into one of 'String's
7faef8bc 283unpack :: Monad m => Pipe Text String m r
d4732515 284unpack = for cat (\t -> yield (T.unpack t))
7faef8bc 285{-# INLINEABLE unpack #-}
286
ff38b9f0 287{-# RULES "p >-> unpack" forall p .
288 p >-> unpack = for p (\txt -> yield (T.unpack txt))
289 #-}
d4732515 290
b0d86a59 291-- | @toCaseFold@, @toLower@, @toUpper@ and @stripStart@ are standard 'Text' utilities,
292-- here acting as 'Text' pipes, rather as they would on a lazy text
a4913c42 293toCaseFold :: Monad m => Pipe Text Text m r
7faef8bc 294toCaseFold = P.map T.toCaseFold
295{-# INLINEABLE toCaseFold #-}
296
ff38b9f0 297{-# RULES "p >-> toCaseFold" forall p .
298 p >-> toCaseFold = for p (\txt -> yield (T.toCaseFold txt))
299 #-}
300
301
c0343bc9 302-- | lowercase incoming 'Text'
a4913c42 303toLower :: Monad m => Pipe Text Text m r
7faef8bc 304toLower = P.map T.toLower
305{-# INLINEABLE toLower #-}
306
ff38b9f0 307{-# RULES "p >-> toLower" forall p .
308 p >-> toLower = for p (\txt -> yield (T.toLower txt))
309 #-}
310
c0343bc9 311-- | uppercase incoming 'Text'
c70edb9d 312toUpper :: Monad m => Pipe Text Text m r
7faef8bc 313toUpper = P.map T.toUpper
314{-# INLINEABLE toUpper #-}
315
ff38b9f0 316{-# RULES "p >-> toUpper" forall p .
317 p >-> toUpper = for p (\txt -> yield (T.toUpper txt))
318 #-}
319
c0343bc9 320-- | Remove leading white space from an incoming succession of 'Text's
7faef8bc 321stripStart :: Monad m => Pipe Text Text m r
322stripStart = do
323 chunk <- await
324 let text = T.stripStart chunk
325 if T.null text
326 then stripStart
b0d86a59 327 else do yield text
328 cat
7faef8bc 329{-# INLINEABLE stripStart #-}
330
31f41a5d 331-- | @(take n)@ only allows @n@ individual characters to pass;
332-- contrast @Pipes.Prelude.take@ which would let @n@ chunks pass.
91727d11 333take :: (Monad m, Integral a) => a -> Pipe Text Text m ()
334take n0 = go n0 where
335 go n
336 | n <= 0 = return ()
337 | otherwise = do
31f41a5d 338 txt <- await
339 let len = fromIntegral (T.length txt)
91727d11 340 if (len > n)
31f41a5d 341 then yield (T.take (fromIntegral n) txt)
91727d11 342 else do
31f41a5d 343 yield txt
91727d11 344 go (n - len)
345{-# INLINABLE take #-}
346
31f41a5d 347-- | @(drop n)@ drops the first @n@ characters
91727d11 348drop :: (Monad m, Integral a) => a -> Pipe Text Text m r
349drop n0 = go n0 where
350 go n
351 | n <= 0 = cat
352 | otherwise = do
31f41a5d 353 txt <- await
354 let len = fromIntegral (T.length txt)
91727d11 355 if (len >= n)
356 then do
31f41a5d 357 yield (T.drop (fromIntegral n) txt)
91727d11 358 cat
359 else go (n - len)
360{-# INLINABLE drop #-}
361
31f41a5d 362-- | Take characters until they fail the predicate
91727d11 363takeWhile :: (Monad m) => (Char -> Bool) -> Pipe Text Text m ()
364takeWhile predicate = go
365 where
366 go = do
31f41a5d 367 txt <- await
368 let (prefix, suffix) = T.span predicate txt
91727d11 369 if (T.null suffix)
370 then do
31f41a5d 371 yield txt
91727d11 372 go
373 else yield prefix
374{-# INLINABLE takeWhile #-}
375
31f41a5d 376-- | Drop characters until they fail the predicate
91727d11 377dropWhile :: (Monad m) => (Char -> Bool) -> Pipe Text Text m r
378dropWhile predicate = go where
379 go = do
31f41a5d 380 txt <- await
381 case T.findIndex (not . predicate) txt of
91727d11 382 Nothing -> go
383 Just i -> do
31f41a5d 384 yield (T.drop i txt)
91727d11 385 cat
386{-# INLINABLE dropWhile #-}
387
388-- | Only allows 'Char's to pass if they satisfy the predicate
389filter :: (Monad m) => (Char -> Bool) -> Pipe Text Text m r
390filter predicate = P.map (T.filter predicate)
391{-# INLINABLE filter #-}
392
ff38b9f0 393{-# RULES "p >-> filter q" forall p q .
394 p >-> filter q = for p (\txt -> yield (T.filter q txt))
395 #-}
396
31f41a5d 397-- | Strict left scan over the characters
91727d11 398scan
399 :: (Monad m)
400 => (Char -> Char -> Char) -> Char -> Pipe Text Text m r
11645cdc
GG
401scan step begin = do
402 yield (T.singleton begin)
403 go begin
91727d11 404 where
31f41a5d 405 go c = do
406 txt <- await
407 let txt' = T.scanl step c txt
408 c' = T.last txt'
11645cdc 409 yield (T.tail txt')
31f41a5d 410 go c'
91727d11 411{-# INLINABLE scan #-}
412
413{-| Fold a pure 'Producer' of strict 'Text's into a lazy
414 'TL.Text'
415-}
416toLazy :: Producer Text Identity () -> TL.Text
417toLazy = TL.fromChunks . P.toList
418{-# INLINABLE toLazy #-}
419
420{-| Fold an effectful 'Producer' of strict 'Text's into a lazy
421 'TL.Text'
422
423 Note: 'toLazyM' is not an idiomatic use of @pipes@, but I provide it for
424 simple testing purposes. Idiomatic @pipes@ style consumes the chunks
425 immediately as they are generated instead of loading them all into memory.
426-}
427toLazyM :: (Monad m) => Producer Text m () -> m TL.Text
428toLazyM = liftM TL.fromChunks . P.toListM
429{-# INLINABLE toLazyM #-}
430
31f41a5d 431-- | Reduce the text stream using a strict left fold over characters
64e03122 432foldChars
91727d11 433 :: Monad m
434 => (x -> Char -> x) -> x -> (x -> r) -> Producer Text m () -> m r
64e03122 435foldChars step begin done = P.fold (T.foldl' step) begin done
1677dc12 436{-# INLINABLE foldChars #-}
91727d11 437
438-- | Retrieve the first 'Char'
439head :: (Monad m) => Producer Text m () -> m (Maybe Char)
440head = go
441 where
442 go p = do
443 x <- nextChar p
444 case x of
445 Left _ -> return Nothing
31f41a5d 446 Right (c, _) -> return (Just c)
91727d11 447{-# INLINABLE head #-}
448
449-- | Retrieve the last 'Char'
450last :: (Monad m) => Producer Text m () -> m (Maybe Char)
451last = go Nothing
452 where
453 go r p = do
454 x <- next p
455 case x of
456 Left () -> return r
31f41a5d 457 Right (txt, p') ->
458 if (T.null txt)
91727d11 459 then go r p'
31f41a5d 460 else go (Just $ T.last txt) p'
91727d11 461{-# INLINABLE last #-}
462
463-- | Determine if the stream is empty
464null :: (Monad m) => Producer Text m () -> m Bool
465null = P.all T.null
466{-# INLINABLE null #-}
467
62e8521c 468-- | Count the number of characters in the stream
91727d11 469length :: (Monad m, Num n) => Producer Text m () -> m n
31f41a5d 470length = P.fold (\n txt -> n + fromIntegral (T.length txt)) 0 id
91727d11 471{-# INLINABLE length #-}
472
473-- | Fold that returns whether 'M.Any' received 'Char's satisfy the predicate
474any :: (Monad m) => (Char -> Bool) -> Producer Text m () -> m Bool
475any predicate = P.any (T.any predicate)
476{-# INLINABLE any #-}
477
478-- | Fold that returns whether 'M.All' received 'Char's satisfy the predicate
479all :: (Monad m) => (Char -> Bool) -> Producer Text m () -> m Bool
480all predicate = P.all (T.all predicate)
481{-# INLINABLE all #-}
482
62e8521c 483-- | Return the maximum 'Char' within a text stream
91727d11 484maximum :: (Monad m) => Producer Text m () -> m (Maybe Char)
485maximum = P.fold step Nothing id
486 where
31f41a5d 487 step mc txt =
488 if (T.null txt)
489 then mc
490 else Just $ case mc of
491 Nothing -> T.maximum txt
492 Just c -> max c (T.maximum txt)
91727d11 493{-# INLINABLE maximum #-}
494
62e8521c 495-- | Return the minimum 'Char' within a text stream (surely very useful!)
91727d11 496minimum :: (Monad m) => Producer Text m () -> m (Maybe Char)
497minimum = P.fold step Nothing id
498 where
31f41a5d 499 step mc txt =
500 if (T.null txt)
501 then mc
502 else case mc of
503 Nothing -> Just (T.minimum txt)
504 Just c -> Just (min c (T.minimum txt))
91727d11 505{-# INLINABLE minimum #-}
506
91727d11 507-- | Find the first element in the stream that matches the predicate
508find
509 :: (Monad m)
510 => (Char -> Bool) -> Producer Text m () -> m (Maybe Char)
511find predicate p = head (p >-> filter predicate)
512{-# INLINABLE find #-}
513
62e8521c 514-- | Index into a text stream
91727d11 515index
516 :: (Monad m, Integral a)
517 => a-> Producer Text m () -> m (Maybe Char)
518index n p = head (p >-> drop n)
519{-# INLINABLE index #-}
520
63ea9ffd 521
31f41a5d 522-- | Store a tally of how many segments match the given 'Text'
523count :: (Monad m, Num n) => Text -> Producer Text m () -> m n
524count c p = P.fold (+) 0 id (p >-> P.map (fromIntegral . T.count c))
525{-# INLINABLE count #-}
526
9e9bb0ce 527
1a83ae4e 528-- | Consume the first character from a stream of 'Text'
529--
530-- 'next' either fails with a 'Left' if the 'Producer' has no more characters or
531-- succeeds with a 'Right' providing the next character and the remainder of the
532-- 'Producer'.
9e9bb0ce 533
9e9bb0ce 534nextChar
535 :: (Monad m)
536 => Producer Text m r
537 -> m (Either r (Char, Producer Text m r))
538nextChar = go
539 where
540 go p = do
541 x <- next p
542 case x of
543 Left r -> return (Left r)
544 Right (txt, p') -> case (T.uncons txt) of
545 Nothing -> go p'
546 Just (c, txt') -> return (Right (c, yield txt' >> p'))
547{-# INLINABLE nextChar #-}
548
1a83ae4e 549-- | Draw one 'Char' from a stream of 'Text', returning 'Left' if the 'Producer' is empty
550
9e9bb0ce 551drawChar :: (Monad m) => Parser Text m (Maybe Char)
552drawChar = do
553 x <- PP.draw
554 case x of
555 Nothing -> return Nothing
556 Just txt -> case (T.uncons txt) of
557 Nothing -> drawChar
558 Just (c, txt') -> do
559 PP.unDraw txt'
560 return (Just c)
561{-# INLINABLE drawChar #-}
562
563-- | Push back a 'Char' onto the underlying 'Producer'
564unDrawChar :: (Monad m) => Char -> Parser Text m ()
565unDrawChar c = modify (yield (T.singleton c) >>)
566{-# INLINABLE unDrawChar #-}
567
568{-| 'peekChar' checks the first 'Char' in the stream, but uses 'unDrawChar' to
569 push the 'Char' back
570
571> peekChar = do
572> x <- drawChar
573> case x of
574> Left _ -> return ()
575> Right c -> unDrawChar c
576> return x
1a83ae4e 577
9e9bb0ce 578-}
1a83ae4e 579
9e9bb0ce 580peekChar :: (Monad m) => Parser Text m (Maybe Char)
581peekChar = do
582 x <- drawChar
583 case x of
584 Nothing -> return ()
585 Just c -> unDrawChar c
586 return x
587{-# INLINABLE peekChar #-}
588
589{-| Check if the underlying 'Producer' has no more characters
590
591 Note that this will skip over empty 'Text' chunks, unlike
592 'PP.isEndOfInput' from @pipes-parse@, which would consider
593 an empty 'Text' a valid bit of input.
594
595> isEndOfChars = liftM isLeft peekChar
596-}
597isEndOfChars :: (Monad m) => Parser Text m Bool
598isEndOfChars = do
599 x <- peekChar
600 return (case x of
601 Nothing -> True
602 Just _-> False )
603{-# INLINABLE isEndOfChars #-}
604
605
31f41a5d 606-- | Splits a 'Producer' after the given number of characters
91727d11 607splitAt
608 :: (Monad m, Integral n)
609 => n
9e9bb0ce 610 -> Lens' (Producer Text m r)
611 (Producer Text m (Producer Text m r))
612splitAt n0 k p0 = fmap join (k (go n0 p0))
91727d11 613 where
614 go 0 p = return p
615 go n p = do
616 x <- lift (next p)
617 case x of
618 Left r -> return (return r)
31f41a5d 619 Right (txt, p') -> do
620 let len = fromIntegral (T.length txt)
91727d11 621 if (len <= n)
622 then do
31f41a5d 623 yield txt
91727d11 624 go (n - len) p'
625 else do
31f41a5d 626 let (prefix, suffix) = T.splitAt (fromIntegral n) txt
91727d11 627 yield prefix
628 return (yield suffix >> p')
629{-# INLINABLE splitAt #-}
630
91727d11 631
1a83ae4e 632-- | Split a text stream in two, producing the longest
633-- consecutive group of characters that satisfies the predicate
634-- and returning the rest
635
91727d11 636span
637 :: (Monad m)
638 => (Char -> Bool)
9e9bb0ce 639 -> Lens' (Producer Text m r)
640 (Producer Text m (Producer Text m r))
641span predicate k p0 = fmap join (k (go p0))
91727d11 642 where
643 go p = do
644 x <- lift (next p)
645 case x of
646 Left r -> return (return r)
31f41a5d 647 Right (txt, p') -> do
648 let (prefix, suffix) = T.span predicate txt
91727d11 649 if (T.null suffix)
650 then do
31f41a5d 651 yield txt
91727d11 652 go p'
653 else do
654 yield prefix
655 return (yield suffix >> p')
656{-# INLINABLE span #-}
657
1a83ae4e 658{-| Split a text stream in two, producing the longest
62e8521c 659 consecutive group of characters that don't satisfy the predicate
91727d11 660-}
661break
662 :: (Monad m)
663 => (Char -> Bool)
9e9bb0ce 664 -> Lens' (Producer Text m r)
665 (Producer Text m (Producer Text m r))
91727d11 666break predicate = span (not . predicate)
667{-# INLINABLE break #-}
668
9e9bb0ce 669{-| Improper lens that splits after the first group of equivalent Chars, as
670 defined by the given equivalence relation
671-}
672groupBy
673 :: (Monad m)
674 => (Char -> Char -> Bool)
675 -> Lens' (Producer Text m r)
676 (Producer Text m (Producer Text m r))
677groupBy equals k p0 = fmap join (k ((go p0))) where
678 go p = do
679 x <- lift (next p)
680 case x of
681 Left r -> return (return r)
682 Right (txt, p') -> case T.uncons txt of
683 Nothing -> go p'
684 Just (c, _) -> (yield txt >> p') ^. span (equals c)
685{-# INLINABLE groupBy #-}
686
687-- | Improper lens that splits after the first succession of identical 'Char' s
688group :: Monad m
689 => Lens' (Producer Text m r)
690 (Producer Text m (Producer Text m r))
691group = groupBy (==)
692{-# INLINABLE group #-}
693
694{-| Improper lens that splits a 'Producer' after the first word
695
696 Unlike 'words', this does not drop leading whitespace
697-}
698word :: (Monad m)
699 => Lens' (Producer Text m r)
700 (Producer Text m (Producer Text m r))
701word k p0 = fmap join (k (to p0))
702 where
703 to p = do
704 p' <- p^.span isSpace
705 p'^.break isSpace
706{-# INLINABLE word #-}
707
708
709line :: (Monad m)
710 => Lens' (Producer Text m r)
711 (Producer Text m (Producer Text m r))
712line = break (== '\n')
713
714{-# INLINABLE line #-}
715
716
717-- | Intersperse a 'Char' in between the characters of stream of 'Text'
718intersperse
719 :: (Monad m) => Char -> Producer Text m r -> Producer Text m r
720intersperse c = go0
721 where
722 go0 p = do
723 x <- lift (next p)
724 case x of
725 Left r -> return r
726 Right (txt, p') -> do
727 yield (T.intersperse c txt)
728 go1 p'
729 go1 p = do
730 x <- lift (next p)
731 case x of
732 Left r -> return r
733 Right (txt, p') -> do
734 yield (T.singleton c)
735 yield (T.intersperse c txt)
736 go1 p'
737{-# INLINABLE intersperse #-}
738
739
740
741-- | Improper isomorphism between a 'Producer' of 'ByteString's and 'Word8's
742packChars :: Monad m => Iso' (Producer Char m x) (Producer Text m x)
743packChars = Data.Profunctor.dimap to (fmap from)
744 where
745 -- to :: Monad m => Producer Char m x -> Producer Text m x
7ed76745 746 to p = PG.folds step id done (p^.PG.chunksOf defaultChunkSize)
9e9bb0ce 747
748 step diffAs c = diffAs . (c:)
749
750 done diffAs = T.pack (diffAs [])
751
752 -- from :: Monad m => Producer Text m x -> Producer Char m x
753 from p = for p (each . T.unpack)
754{-# INLINABLE packChars #-}
755
0f8c6f1b 756
757-- | Split a text stream into 'FreeT'-delimited text streams of fixed size
758chunksOf
759 :: (Monad m, Integral n)
760 => n -> Lens' (Producer Text m r)
761 (FreeT (Producer Text m) m r)
762chunksOf n k p0 = fmap concats (k (FreeT (go p0)))
763 where
764 go p = do
765 x <- next p
766 return $ case x of
7ed76745 767 Left r -> Pure r
768 Right (txt, p') -> Free $ do
0f8c6f1b 769 p'' <- (yield txt >> p') ^. splitAt n
7ed76745 770 return $ FreeT (go p'')
0f8c6f1b 771{-# INLINABLE chunksOf #-}
772
773
62e8521c 774{-| Split a text stream into sub-streams delimited by characters that satisfy the
91727d11 775 predicate
776-}
1677dc12 777splitsWith
91727d11 778 :: (Monad m)
779 => (Char -> Bool)
780 -> Producer Text m r
7ed76745 781 -> FreeT (Producer Text m) m r
782splitsWith predicate p0 = FreeT (go0 p0)
91727d11 783 where
784 go0 p = do
785 x <- next p
786 case x of
7ed76745 787 Left r -> return (Pure r)
31f41a5d 788 Right (txt, p') ->
789 if (T.null txt)
91727d11 790 then go0 p'
7ed76745 791 else return $ Free $ do
9e9bb0ce 792 p'' <- (yield txt >> p') ^. span (not . predicate)
7ed76745 793 return $ FreeT (go1 p'')
91727d11 794 go1 p = do
795 x <- nextChar p
796 return $ case x of
7ed76745 797 Left r -> Pure r
798 Right (_, p') -> Free $ do
9e9bb0ce 799 p'' <- p' ^. span (not . predicate)
7ed76745 800 return $ FreeT (go1 p'')
1677dc12 801{-# INLINABLE splitsWith #-}
91727d11 802
31f41a5d 803-- | Split a text stream using the given 'Char' as the delimiter
0f8c6f1b 804splits :: (Monad m)
91727d11 805 => Char
0f8c6f1b 806 -> Lens' (Producer Text m r)
807 (FreeT (Producer Text m) m r)
808splits c k p =
7ed76745 809 fmap (PG.intercalates (yield (T.singleton c))) (k (splitsWith (c ==) p))
0f8c6f1b 810{-# INLINABLE splits #-}
811
812{-| Isomorphism between a stream of 'Text' and groups of equivalent 'Char's , using the
813 given equivalence relation
814-}
815groupsBy
816 :: Monad m
817 => (Char -> Char -> Bool)
818 -> Lens' (Producer Text m x) (FreeT (Producer Text m) m x)
7ed76745 819groupsBy equals k p0 = fmap concats (k (FreeT (go p0))) where
0f8c6f1b 820 go p = do x <- next p
7ed76745 821 case x of Left r -> return (Pure r)
0f8c6f1b 822 Right (bs, p') -> case T.uncons bs of
823 Nothing -> go p'
7ed76745 824 Just (c, _) -> do return $ Free $ do
0f8c6f1b 825 p'' <- (yield bs >> p')^.span (equals c)
7ed76745 826 return $ FreeT (go p'')
0f8c6f1b 827{-# INLINABLE groupsBy #-}
828
829
830-- | Like 'groupsBy', where the equality predicate is ('==')
831groups
832 :: Monad m
833 => Lens' (Producer Text m x) (FreeT (Producer Text m) m x)
834groups = groupsBy (==)
835{-# INLINABLE groups #-}
836
91727d11 837
91727d11 838
62e8521c 839{-| Split a text stream into 'FreeT'-delimited lines
91727d11 840-}
841lines
0f8c6f1b 842 :: (Monad m) => Iso' (Producer Text m r) (FreeT (Producer Text m) m r)
843lines = Data.Profunctor.dimap _lines (fmap _unlines)
91727d11 844 where
7ed76745 845 _lines p0 = FreeT (go0 p0)
0f8c6f1b 846 where
847 go0 p = do
848 x <- next p
849 case x of
7ed76745 850 Left r -> return (Pure r)
0f8c6f1b 851 Right (txt, p') ->
852 if (T.null txt)
853 then go0 p'
7ed76745 854 else return $ Free $ go1 (yield txt >> p')
0f8c6f1b 855 go1 p = do
856 p' <- p ^. break ('\n' ==)
7ed76745 857 return $ FreeT $ do
0f8c6f1b 858 x <- nextChar p'
859 case x of
7ed76745 860 Left r -> return $ Pure r
0f8c6f1b 861 Right (_, p'') -> go0 p''
862 -- _unlines
863 -- :: Monad m
864 -- => FreeT (Producer Text m) m x -> Producer Text m x
7fc48f7c 865 _unlines = concats . PG.maps (<* yield (T.singleton '\n'))
866
0f8c6f1b 867
91727d11 868{-# INLINABLE lines #-}
91727d11 869
31f41a5d 870
31f41a5d 871-- | Split a text stream into 'FreeT'-delimited words
91727d11 872words
0f8c6f1b 873 :: (Monad m) => Iso' (Producer Text m r) (FreeT (Producer Text m) m r)
874words = Data.Profunctor.dimap go (fmap _unwords)
91727d11 875 where
7ed76745 876 go p = FreeT $ do
cf10d6f1 877 x <- next (p >-> dropWhile isSpace)
878 return $ case x of
7ed76745 879 Left r -> Pure r
880 Right (bs, p') -> Free $ do
9e9bb0ce 881 p'' <- (yield bs >> p') ^. break isSpace
cf10d6f1 882 return (go p'')
7ed76745 883 _unwords = PG.intercalates (yield $ T.singleton ' ')
0f8c6f1b 884
91727d11 885{-# INLINABLE words #-}
886
cf10d6f1 887
31f41a5d 888{-| 'intercalate' concatenates the 'FreeT'-delimited text streams after
889 interspersing a text stream in between them
91727d11 890-}
891intercalate
892 :: (Monad m)
893 => Producer Text m ()
894 -> FreeT (Producer Text m) m r
895 -> Producer Text m r
896intercalate p0 = go0
897 where
898 go0 f = do
7ed76745 899 x <- lift (runFreeT f)
91727d11 900 case x of
7ed76745 901 Pure r -> return r
902 Free p -> do
91727d11 903 f' <- p
904 go1 f'
905 go1 f = do
7ed76745 906 x <- lift (runFreeT f)
91727d11 907 case x of
7ed76745 908 Pure r -> return r
909 Free p -> do
91727d11 910 p0
911 f' <- p
912 go1 f'
913{-# INLINABLE intercalate #-}
914
62e8521c 915{-| Join 'FreeT'-delimited lines into a text stream
91727d11 916-}
917unlines
918 :: (Monad m) => FreeT (Producer Text m) m r -> Producer Text m r
919unlines = go
920 where
921 go f = do
7ed76745 922 x <- lift (runFreeT f)
91727d11 923 case x of
7ed76745 924 Pure r -> return r
925 Free p -> do
91727d11 926 f' <- p
927 yield $ T.singleton '\n'
928 go f'
929{-# INLINABLE unlines #-}
930
31f41a5d 931{-| Join 'FreeT'-delimited words into a text stream
91727d11 932-}
933unwords
934 :: (Monad m) => FreeT (Producer Text m) m r -> Producer Text m r
7fc48f7c 935unwords = intercalate (yield $ T.singleton ' ')
91727d11 936{-# INLINABLE unwords #-}
937
91727d11 938
91727d11 939{- $reexports
91727d11 940
941 @Data.Text@ re-exports the 'Text' type.
942
0f8c6f1b 943 @Pipes.Parse@ re-exports 'input', 'concat', 'FreeT' (the type) and the 'Parse' synonym.
64e03122 944-}
945
bbdfd305 946