Portability | unportable (GADTs, Rank2Types) |
---|---|

Stability | experimental |

Maintainer | Brent Yorgey <byorgey@gmail.com> |

The Data.List.Split module contains a wide range of strategies for splitting lists with respect to some sort of delimiter, mostly implemented through a unified combinator interface. The goal is to be flexible yet simple. Scroll past the Synopsis for usage, examples, and detailed documentation of all exported functions. If you want to learn about the implementation, see Data.List.Split.Internals.

A darcs repository containing the source (including a module with over 40 QuickCheck properties) can be found at http://code.haskell.org/~byorgey/code/split.

- splitOn :: Eq a => [a] -> [a] -> [[a]]
- splitOneOf :: Eq a => [a] -> [a] -> [[a]]
- splitWhen :: (a -> Bool) -> [a] -> [[a]]
- sepBy :: Eq a => [a] -> [a] -> [[a]]
- sepByOneOf :: Eq a => [a] -> [a] -> [[a]]
- endBy :: Eq a => [a] -> [a] -> [[a]]
- endByOneOf :: Eq a => [a] -> [a] -> [[a]]
- unintercalate :: Eq a => [a] -> [a] -> [[a]]
- wordsBy :: (a -> Bool) -> [a] -> [[a]]
- linesBy :: (a -> Bool) -> [a] -> [[a]]
- splitEvery :: Int -> [e] -> [[e]]
- chunk :: Int -> [e] -> [[e]]
- splitPlaces :: Integral a => [a] -> [e] -> [[e]]
- splitPlacesBlanks :: Integral a => [a] -> [e] -> [[e]]
- chop :: ([a] -> (b, [a])) -> [a] -> [b]
- data Splitter a
- defaultSplitter :: Splitter a
- split :: Splitter a -> [a] -> [[a]]
- oneOf :: Eq a => [a] -> Splitter a
- onSublist :: Eq a => [a] -> Splitter a
- whenElt :: (a -> Bool) -> Splitter a
- dropDelims :: Splitter a -> Splitter a
- keepDelimsL :: Splitter a -> Splitter a
- keepDelimsR :: Splitter a -> Splitter a
- condense :: Splitter a -> Splitter a
- dropInitBlank :: Splitter a -> Splitter a
- dropFinalBlank :: Splitter a -> Splitter a
- dropBlanks :: Splitter a -> Splitter a
- startsWith :: Eq a => [a] -> Splitter a
- startsWithOneOf :: Eq a => [a] -> Splitter a
- endsWith :: Eq a => [a] -> Splitter a
- endsWithOneOf :: Eq a => [a] -> Splitter a

# Getting started

To get started, you should take a look at the functions `splitOn`

,
`splitWhen`

, `sepBy`

, `endBy`

, `splitEvery`

, `splitPlaces`

, and
other functions listed in the next two sections. These functions
implement various common splitting operations, and one of them will
probably do the job 90% of the time. For example:

> splitOn "x" "axbxc" ["a","b","c"] > splitOn "x" "axbxcx" ["a","b","c",""] > endBy ";" "foo;bar;baz;" ["foo","bar","baz"] > splitWhen (<0) [1,3,-4,5,7,-9,0,2] [[1,3],[5,7],[0,2]] > splitOneOf ";.," "foo,bar;baz.glurk" ["foo","bar","baz","glurk"] > splitEvery 3 ['a'..'z'] ["abc","def","ghi","jkl","mno","pqr","stu","vwx","yz"]

If you want more flexibility, however, you can use the combinator library in terms of which these functions are defined. For more information, skip to the section labeled "Splitting Combinators".

Note that the goal of this library is to be flexible yet simple. It does not implement any particularly sophisticated list-splitting methods, nor is it tuned for speed. If you find yourself wanting something more complicated or optimized, it probably means you should use a real parsing or regular expression library.

# Convenience functions

These functions implement some common splitting strategies. Note that all of the functions in this section drop delimiters from the final output, since that is a more common use case even though it is not the default.

splitOn :: Eq a => [a] -> [a] -> [[a]]Source

Split on the given sublist. Equivalent to

. For example:
`split`

. `dropDelims`

. `onSublist`

splitOn ".." "a..b...c....d.." == ["a","b",".c","","d",""]

splitOneOf :: Eq a => [a] -> [a] -> [[a]]Source

Split on any of the given elements. Equivalent to

. For example:
`split`

. `dropDelims`

. `oneOf`

splitOneOf ";.," "foo,bar;baz.glurk" == ["foo","bar","baz","glurk"]

splitWhen :: (a -> Bool) -> [a] -> [[a]]Source

Split on elements satisfying the given predicate. Equivalent to

. For example:
`split`

. `dropDelims`

. `whenElt`

splitWhen (<0) [1,3,-4,5,7,-9,0,2] == [[1,3],[5,7],[0,2]]

sepByOneOf :: Eq a => [a] -> [a] -> [[a]]Source

A synonym for `splitOneOf`

.

endBy :: Eq a => [a] -> [a] -> [[a]]Source

Split into chunks terminated by the given subsequence.
Equivalent to

. For example:
`split`

. `dropFinalBlank`

. `dropDelims`

. `onSublist`

endBy ";" "foo;bar;baz;" == ["foo","bar","baz"]

Note also that the `lines`

function from Data.List is equivalent
to

.
`endBy`

"\n"

endByOneOf :: Eq a => [a] -> [a] -> [[a]]Source

Split into chunks terminated by one of the given elements.
Equivalent to

.
`split`

. `dropFinalBlank`

. `dropDelims`

. `oneOf`

unintercalate :: Eq a => [a] -> [a] -> [[a]]Source

A synonym for `sepBy`

/ `splitOn`

.

Note that this is the right inverse of the `intercalate`

function
from Data.List, that is,

. It is also the case that `intercalate`

x . `unintercalate`

x
== `id`

is idempotent. `unintercalate`

x
. `intercalate`

x

is the identity on certain lists, but it is
tricky to state the precise conditions under which this holds.
(For example, it is not enough to say that `unintercalate`

x
. `intercalate`

x`x`

does not occur in
any elements of the input list. Working out why is left as an
exercise for the reader.)

wordsBy :: (a -> Bool) -> [a] -> [[a]]Source

Split into words, with word boundaries indicated by the given
predicate. Satisfies `words === wordsBy isSpace`

; equivalent to
`split . dropBlanks . dropDelims . whenElt`

. For example:

wordsBy (=='x') "dogxxxcatxbirdxx" == ["dog","cat","bird"]

linesBy :: (a -> Bool) -> [a] -> [[a]]Source

Split into lines, with line boundaries indicated by the given
predicate. Satisfies `lines === linesBy (=='\n')`

; equivalent to
`split . dropFinalBlank . dropDelims . whenElt`

. For example:

linesBy (=='x') "dogxxxcatxbirdxx" == ["dog","","","cat","bird",""]

# Other splitting methods

Other useful splitting methods which are not implemented using the combinator framework.

splitEvery :: Int -> [e] -> [[e]]Source

splits a list into length-n pieces. The last
piece will be shorter if `splitEvery`

n`n`

does not evenly divide the length of
the list. If `n <= 0`

,

returns an infinite list
of empty lists.
`splitEvery`

n l

Note that

is `splitEvery`

n []`[]`

, not `[[]]`

. This is
intentional, and is consistent with a recursive definition of
`splitEvery`

; it satisfies the property that

splitEvery n xs ++ splitEvery n ys == splitEvery n (xs ++ ys)

whenever `n`

evenly divides the length of `xs`

.

chunk :: Int -> [e] -> [[e]]Source

A common synonym for `splitEvery`

.

splitPlaces :: Integral a => [a] -> [e] -> [[e]]Source

Split a list into chunks of the given lengths. For example:

splitPlaces [2,3,4] [1..20] == [[1,2],[3,4,5],[6,7,8,9]] splitPlaces [4,9] [1..10] == [[1,2,3,4],[5,6,7,8,9,10]] splitPlaces [4,9,3] [1..10] == [[1,2,3,4],[5,6,7,8,9,10]]

The behavior of

when `splitPlaces`

ls xs

can
be inferred from the above examples and the fact that `sum`

ls /= `length`

xs`splitPlaces`

is total.

splitPlacesBlanks :: Integral a => [a] -> [e] -> [[e]]Source

Split a list into chunks of the given lengths. Unlike `splitPlaces`

, the
output list will always be the same length as the first input argument. For
example:

splitPlacesBlanks [2,3,4] [1..20] == [[1,2],[3,4,5],[6,7,8,9]] splitPlacesBlanks [4,9] [1..10] == [[1,2,3,4],[5,6,7,8,9,10]] splitPlacesBlanks [4,9,3] [1..10] == [[1,2,3,4],[5,6,7,8,9,10],[]]

chop :: ([a] -> (b, [a])) -> [a] -> [b]Source

A useful recursion pattern for processing a list to produce a new list, often used for "chopping" up the input list. Typically chop is called with some function that will consume an initial prefix of the list and produce a value and the rest of the list.

For example, many common Prelude functions can be implemented in
terms of `chop`

:

group :: (Eq a) => [a] -> [[a]] group = chop (\ xs@(x:_) -> span (==x) xs) words :: String -> [String] words = filter (not . null) . chop (span (not . isSpace) . dropWhile isSpace)

# Splitting combinators

The core of the library is the `Splitter`

type, which represents a
particular list-splitting strategy. All of the combinators revolve
around constructing or transforming `Splitter`

objects; once a
suitable `Splitter`

has been created, it can be run with the
`split`

function. For example:

> split (dropBlanks . condense $ whenElt (<0)) [1,2,4,-5,-6,4,9,-19,-30] [[1,2,4],[-5,-6],[4,9],[-19,-30]]

defaultSplitter :: Splitter aSource

The default splitting strategy: keep delimiters in the output as separate chunks, don't condense multiple consecutive delimiters into one, keep initial and final blank chunks. Default delimiter is the constantly false predicate.

Note that `defaultSplitter`

should normally not be used; use
`oneOf`

, `onSublist`

, or `whenElt`

instead, which are the same as
the `defaultSplitter`

with just the delimiter overridden.

The `defaultSplitter`

strategy with any delimiter gives a
maximally information-preserving splitting strategy, in the sense
that (a) taking the `concat`

of the output yields the original
list, and (b) given only the output list, we can reconstruct a
`Splitter`

which would produce the same output list again given
the original input list. This default strategy can be overridden
to allow discarding various sorts of information.

split :: Splitter a -> [a] -> [[a]]Source

Split a list according to the given splitting strategy. This is
how to "run" a `Splitter`

that has been built using the other
combinators.

## Basic strategies

All these basic strategies have the same parameters as the
`defaultSplitter`

except for the delimiter.

oneOf :: Eq a => [a] -> Splitter aSource

A splitting strategy that splits on any one of the given elements. For example:

split (oneOf "xyz") "aazbxyzcxd" == ["aa","z","b","x","","y","","z","c","x","d"]

onSublist :: Eq a => [a] -> Splitter aSource

A splitting strategy that splits on the given list, when it is encountered as an exact subsequence. For example:

split (onSublist "xyz") "aazbxyzcxd" == ["aazb","xyz","cxd"]

Note that splitting on the empty list is a special case, which splits just before every element of the list being split. For example:

split (onSublist "") "abc" == ["","","a","","b","","c"] split (dropDelims . dropBlanks $ onSublist "") "abc" == ["a","b","c"]

However, if you want to break a list into singleton elements like
this, you are better off using

, or better yet,
`splitEvery`

1

.
`map`

(:[])

whenElt :: (a -> Bool) -> Splitter aSource

A splitting strategy that splits on any elements that satisfy the given predicate. For example:

split (whenElt (<0)) [2,4,-3,6,-9,1] == [[2,4],[-3],[6],[-9],[1]]

## Strategy transformers

Functions for altering splitting strategy parameters.

dropDelims :: Splitter a -> Splitter aSource

Drop delimiters from the output (the default is to keep them). For example,

split (oneOf ":") "a:b:c" == ["a", ":", "b", ":", "c"] split (dropDelims $ oneOf ":") "a:b:c" == ["a", "b", "c"]

keepDelimsL :: Splitter a -> Splitter aSource

Keep delimiters in the output by prepending them to adjacent chunks. For example:

split (keepDelimsL $ oneOf "xyz") "aazbxyzcxd" == ["aa","zb","x","y","zc","xd"]

keepDelimsR :: Splitter a -> Splitter aSource

Keep delimiters in the output by appending them to adjacent chunks. For example:

split (keepDelimsR $ oneOf "xyz") "aazbxyzcxd" == ["aaz","bx","y","z","cx","d"]

condense :: Splitter a -> Splitter aSource

Condense multiple consecutive delimiters into one. For example:

split (condense $ oneOf "xyz") "aazbxyzcxd" == ["aa","z","b","xyz","c","x","d"] split (dropDelims $ oneOf "xyz") "aazbxyzcxd" == ["aa","b","","","c","d"] split (condense . dropDelims $ oneOf "xyz") "aazbxyzcxd" == ["aa","b","c","d"]

dropInitBlank :: Splitter a -> Splitter aSource

Don't generate a blank chunk if there is a delimiter at the beginning. For example:

split (oneOf ":") ":a:b" == ["",":","a",":","b"] split (dropInitBlank $ oneOf ":") ":a:b" == [":","a",":","b"]

dropFinalBlank :: Splitter a -> Splitter aSource

Don't generate a blank chunk if there is a delimiter at the end. For example:

split (oneOf ":") "a:b:" == ["a",":","b",":",""] split (dropFinalBlank $ oneOf ":") "a:b:" == ["a",":","b",":"]

## Derived combinators

Combinators which can be defined in terms of other combinators, but are provided for convenience.

dropBlanks :: Splitter a -> Splitter aSource

Drop all blank chunks from the output. Equivalent to

. For example:
`dropInitBlank`

. `dropFinalBlank`

. `condense`

split (oneOf ":") "::b:::a" == ["",":","",":","b",":","",":","",":","a"] split (dropBlanks $ oneOf ":") "::b:::a" == ["::","b",":::","a"]

startsWith :: Eq a => [a] -> Splitter aSource

Make a strategy that splits a list into chunks that all start
with the given subsequence (except possibly the first).
Equivalent to

.
For example:
`dropInitBlank`

. `keepDelimsL`

. `onSublist`

split (startsWith "app") "applyapplicativeapplaudapproachapple" == ["apply","applicative","applaud","approach","apple"]

startsWithOneOf :: Eq a => [a] -> Splitter aSource

Make a strategy that splits a list into chunks that all start
with one of the given elements (except possibly the first).
Equivalent to

. For
example:
`dropInitBlank`

. `keepDelimsL`

. `oneOf`

split (startsWithOneOf ['A'..'Z']) "ACamelCaseIdentifier" == ["A","Camel","Case","Identifier"]

endsWith :: Eq a => [a] -> Splitter aSource

Make a strategy that splits a list into chunks that all end with
the given subsequence, except possibly the last. Equivalent to

. For example:
`dropFinalBlank`

. `keepDelimsR`

. `onSublist`

split (endsWith "ly") "happilyslowlygnarlylily" == ["happily","slowly","gnarly","lily"]

endsWithOneOf :: Eq a => [a] -> Splitter aSource

Make a strategy that splits a list into chunks that all end with
one of the given elements, except possibly the last. Equivalent
to

. For example:
`dropFinalBlank`

. `keepDelimsR`

. `oneOf`

split (condense $ endsWithOneOf ".,?! ") "Hi, there! How are you?" == ["Hi, ","there! ","How ","are ","you?"]