# Optics Regular Expressions

This package is inspired by and based on the wonderful [`lens-regex-pcre`][lens-regex-pcre] library, but ported to work with [`optics`][optics] as a provider of lenses rather than [`lens`][lens]. That library has great documentation, which will generally be applicable to this library as well, since `optics` and `lens` are so similar. But it still makes sense to explain things for this library, so we do so here.

## Table Of Contents

- [Creating Regular Expressions](#creating-regular-expressions)
- [Flags](#flags)
- [Examples](#examples)
  - [Search For All Occurrences Of A Pattern](#search-for-all-occurrences-of-a-pattern)
  - [Search For N-th Occurrence Of A Pattern](#search-for-n-th-occurrence-of-a-pattern)
  - [Get All Groups For Every Occurrence Of A Pattern](#get-all-groups-for-every-occurrence-of-a-pattern)
  - [Get Just The N-th Groups In Every Occurrence Of A Pattern](#get-just-the-n-th-groups-in-every-occurrence-of-a-pattern)
  - [Replace Every Occurrence Of A Pattern](#replace-every-occurrence-of-a-pattern)
  - [Replace The N-th Occurrence Of A Pattern](#replace-the-n-th-occurrence-of-a-pattern)
  - [Transform All Groups In Every Occurrence Of A Pattern](#transform-all-groups-in-every-occurrence-of-a-pattern)
  - [Transform the Odd-Numbered Groups In Every Even-Numbered Occurrence Of A Pattern](#transform-the-odd-numbered-groups-in-every-even-numbered-occurrence-of-a-pattern)
- [On Lens/Optic Laws](#on-lensoptic-laws)
- [Implementation Notes](#implementation-notes)

## Creating Regular Expressions

Not all strings can be interpreted as a valid regular expression; for example, consider the strings `"a(b"` or `"*"`. To account for this possibility of failure, there are two ways to create a regular expression from a string with this library:

1. at compile time (taking advantage of the `QuasiQuotes` language extension)
2. at run time

Creating them at compile time means you get well-formed regular expressions without having to explicitly account for the possibility of failure. This is done by bringing the `regex` quasi-quoter into scope and then writing `[regex|pattern|]`. If a pattern is not well-formed, then you will get a compile-time error telling you as much.

The alternative is to create them at run time, in which case you will need to consider the possibility that a given string may not be a valid regular expression, and account for that. To do this, ...

## Flags

One complicating factor here is passing different flags to the regular expression engine. The quasi-quoters we provide in this library have a certain set of flags, which are fixed. If you need a quasi-quoter with a different set of flags then you will have to create one yourself. This is done like so: ...

## Examples

Here we will show a selection of use cases for regular expressions, and how you can express them in the vocabulary of this library. For the purpose of these examples, we will write

- `str` to mean some string type, e.g. `String`
- `pat` to mean some regular expression, e.g. `[regex|(\w+) (\w+)|]`
- `src` to mean some input `str`, e.g. `"a b c"`
- `repl` to mean some replacement `str`, e.g. `"x"`
- `n` to mean some `Int`
- `fn` to mean some function of type `str -> str`

We will also be using a number of functions and operators from [`optics`][optics], so if you are not familiar with any of them you can look them up in the documentation for that library.

### Search For All Occurrences Of A Pattern

```haskell
src ^.. pat % match
  :: [str]
```

### Search For N-th Occurrence Of A Pattern

```haskell
src ^? elementOf (pat % match) n
  :: Maybe str
```

Note: the indexing here starts at `0`.

### Get All Groups For Every Occurrence Of A Pattern

```haskell
src ^.. pat % groups
  :: [[str]]
```

Here the inner lists will all be of the same length, that length being the number of capture groups in the pattern `pat`.

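
To make the shape of these results concrete, here is a small sketch that calls [`pcre-heavy`][pcre-heavy] directly rather than going through the optics in this package. It assumes the `re` quasi-quoter and the `scan` function from `Text.Regex.PCRE.Heavy`; `matchesAndGroups` is a hypothetical helper name used only for illustration. We would expect `pat % match` and `pat % groups` to focus on values corresponding to the first and second components of each pair, but that correspondence is an assumption here and is not checked against this package.

```haskell
{-# LANGUAGE QuasiQuotes #-}

import Text.Regex.PCRE.Heavy (re, scan)

-- | Hypothetical helper: every occurrence of the pattern in the input,
-- paired with the capture groups of that occurrence.
matchesAndGroups :: String -> [(String, [String])]
matchesAndGroups = scan [re|(\w+) (\w+)|]

main :: IO ()
main = do
  let results = matchesAndGroups "a b c d"
  print (map fst results)  -- whole matches: ["a b","c d"]
  print (map snd results)  -- groups per match: [["a","b"],["c","d"]]
```

Each inner list of groups has length two because the pattern has two capture groups.
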
### Get Just The N-th Groups In Every Occurrence Of A Pattern

```haskell
src ^.. pat % group n
  :: [str]
```

or equivalently

```haskell
src ^.. pat % groups % ix n
  :: [str]
```

Note: the indexing here starts at `0`.

### Replace Every Occurrence Of A Pattern

```haskell
src & pat % match .~ repl
  :: str
```

### Replace The N-th Occurrence Of A Pattern

```haskell
src & elementOf (pat % match) n .~ repl
  :: str
```

### Transform All Groups In Every Occurrence Of A Pattern

```haskell
src & pat % groups % mapped %~ fn
  :: str
```

### Transform the Odd-Numbered Groups In Every Even-Numbered Occurrence Of A Pattern

This is of course a contrived example, but it does showcase how it is possible to express complex queries succinctly.

```haskell
src & elementsOf (pat % groups) even  -- get even matches
    % elementsOf traversed odd        -- get odd groups
    %~ fn                             -- apply transformation `fn`
  :: str
```

Again, the indexing here is zero-based, so this example will apply `fn` to groups 1, 3, 5, ... in matches 0, 2, 4, ... of the pattern `pat`.

## On Lens/Optic Laws

It is worth noting that the optics provided by this library do not obey the lens laws. This does not cause any harm - ...

## Implementation Notes

We make use of [Backpack][backpack] to provide the same interface over both `String` (from `base`) and `Text` (from [`text`][text], both lazy and strict varieties). We also provide the same for `ByteString` (from [`bytestring`][bytestring], again both lazy and strict). Of course, the `ByteString` type does not represent a string; it is simply a binary blob. But sometimes `ByteString` does get used as a string, and the internals make it easy, so we provide a `ByteString` interface for convenience. (Note: it works by assuming the `ByteString` is UTF-8 encoded.)

We follow suit with `lens-regex-pcre` in using [`pcre-heavy`][pcre-heavy] to provide the core regular expression functionality. However, it would not be too difficult to use a different regex provider, as most of the implementation involves creating the necessary optics and exposing them over different string types.

[backpack]: https://gitlab.haskell.org/ghc/ghc/wikis/backpack
[bytestring]: https://hackage.haskell.org/package/bytestring
[lens-regex-pcre]: https://hackage.haskell.org/package/lens-regex-pcre
[lens]: https://hackage.haskell.org/package/lens
[optics]: https://hackage.haskell.org/package/optics
[pcre-heavy]: https://hackage.haskell.org/package/pcre-heavy
[text]: https://hackage.haskell.org/package/text