optics-regexp: A lensy/optical interface to regular expressions

This package, in combination with the optics package, allows you to do regular-expression style string manipulations in a lensy manner. We provide support for the common string types String and Text, as well as for the common binary type ByteString (which works by assuming UTF-8-encoded ByteStrings).

It is probably easiest to explain with some quick examples.

This library is inspired by and based on the lens-regex-pcre library, which does the same thing but for lens rather than for optics.


Properties

Versions 0.1.0.0
Change log CHANGELOG.md
Dependencies base (>=4.12.0.0 && <4.13), bytestring, deepseq, gauge, hspec, optics-core, optics-regexp, pcre-heavy, template-haskell, text [details]
License BSD-3-Clause
Copyright 2020 lawrencebell
Author Lawrence Bell
Maintainer Lawrence Bell <lawrence.matthew.bell@gmail.com>
Category Lenses, Optics, Regex
Bug tracker https://github.com/lawrencebell/optics-regexp/issues
Source repo head: git clone https://github.com/lawrencebell/optics-regexp
Uploaded by lawrencebell at 2020-04-28T13:37:34Z


Readme for optics-regexp-0.1.0.0

Optics Regular Expressions

This package is inspired by and based on the wonderful lens-regex-pcre library, ported to use optics rather than lens as the provider of lenses.

That library has great documentation, most of which applies to this library as well, since optics and lens are so similar. It still makes sense to document this library in its own right, so we do so here.

Creating Regular Expressions

Not all strings can be interpreted as a valid regular expression - for example, consider the string "a(b" or "*". To account for this possibility of failure, there are two ways to create a regular expression from a string with this library:

  1. at compile time (taking advantage of the QuasiQuotes language extension)
  2. at run time

Creating them at compile time means you get well-formed regular expressions without having to explicitly account for the possibility of failure. This is done by bringing the regex quasi-quoter into scope and then writing [regex|pattern|]. If a pattern is not well-formed, you will get a compile-time error telling you as much.

The alternative is to create them at run time, in which case you will need to consider the possibility that a given string may not be a valid regular expression, and account for that. To do this, ...

Flags

One complicating factor here is passing different flags to the regular expression engine. The quasi-quoters we provide in this library have a certain set of flags, which are fixed. If you need a quasi-quoter with a different set of flags then you will have to create one yourself.

This is done like so:

...

Examples

Here we will show a selection of use cases for regular expressions, and how you can express them in the vocabulary of this library.

For the purpose of these examples, we will write

We will also be using a number of functions and operators from optics, so if you are not familiar with any of them you can look them up in the documentation for that library.

Search For All Occurrences Of A Pattern

src ^.. pat % match :: [str]

Search For N-th Occurrence Of A Pattern

src ^? elementOf (pat % match) n :: Maybe str

Note: the indexing here starts at 0.
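
To make the zero-based indexing concrete, here is a minimal sketch that uses only optics-core on a plain list. The list `occurrences` is a hypothetical stand-in for the matches of a pattern (it is not part of this library), and `traversed` takes the place of `pat % match`; only `elementOf` and `^?` are the same as in the expression above.

```haskell
import Optics.Core
import Optics.Operators ((^?))

-- Hypothetical stand-in for the matches of a pattern, in order of occurrence.
-- This is an assumption for illustration, not something the library provides.
occurrences :: [String]
occurrences = ["foo", "bar", "baz"]

-- Index 0 selects the first element, index 1 the second.
occ0, occ1 :: Maybe String
occ0 = occurrences ^? elementOf traversed 0
occ1 = occurrences ^? elementOf traversed 1

-- An index past the end selects nothing, so the result is Nothing.
occMissing :: Maybe String
occMissing = occurrences ^? elementOf traversed 3

main :: IO ()
main = do
  print occ0        -- Just "foo"
  print occ1        -- Just "bar"
  print occMissing  -- Nothing
```

The `Maybe` in the result type is what accounts for the case where the pattern occurs fewer than n + 1 times.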

Get All Groups For Every Occurrence Of A Pattern

src ^.. pat % groups :: [[str]]

Here the inner lists will all be of the same length, that length being the number of capture groups in the pattern pat.

Get Just The N-th Group In Every Occurrence Of A Pattern

src ^.. pat % group n :: [str]

or equivalently

src ^.. pat % groups % ix n :: [str]

Note: the indexing here starts at 0.

Replace Every Occurrence Of A Pattern

src & pat % match .~ repl :: str

Replace The N-th Occurrence Of A Pattern

src & elementOf (pat % match) n .~ repl :: str

Transform All Groups In Every Occurrence Of A Pattern

src & pat % groups % mapped %~ fn :: str

Transform The Odd-Numbered Groups In Every Even-Numbered Occurrence Of A Pattern

This is of course a contrived example, but it does showcase how it is possible to express complex queries succinctly.

src
  & elementsOf (pat % groups) even  -- get even matches
  % elementsOf traversed odd        -- get odd groups
  %~ fn                             -- apply transformation `fn`
    :: str

Again, the indexing here is zero-based, so this example will apply fn to groups 1, 3, 5, ... in matches 0, 2, 4, ... of the pattern pat.
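
The same selection logic can be tried out without any regular expressions. The sketch below uses only optics-core and a hypothetical nested list, `groupsPerMatch`, standing in for the groups of each match (one inner list per match); `traversed` replaces `pat % groups`, and `map toUpper` plays the role of `fn`. None of these stand-ins come from this library.

```haskell
import Data.Char (toUpper)
import Data.Function ((&))
import Optics.Core
import Optics.Operators ((%~))

-- Hypothetical stand-in for the groups of three matches of a pattern
-- with three capture groups (an assumption for illustration only).
groupsPerMatch :: [[String]]
groupsPerMatch =
  [ ["a", "b", "c"]
  , ["d", "e", "f"]
  , ["g", "h", "i"]
  ]

-- Upper-case the odd-numbered groups of the even-numbered matches.
transformed :: [[String]]
transformed =
  groupsPerMatch
    & elementsOf traversed even  -- matches 0, 2, ...
    % elementsOf traversed odd   -- groups 1, 3, ... within those matches
    %~ map toUpper

main :: IO ()
main = print transformed
-- [["a","B","c"],["d","e","f"],["g","H","i"]]
```

Only group 1 of matches 0 and 2 is changed; match 1 is skipped entirely because its index is odd.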

On Lens/Optic Laws

It is worth noting that the optics provided by this library do not obey the lens laws. This does not cause any harm - ...

Implementation Notes

We make use of Backpack to provide the same interface over both String (from base) and Text (from text, both lazy and strict varieties). We provide the same for ByteString (from bytestring, again both lazy and strict).

Of course, the ByteString type does not represent a string; it is simply a binary blob. But sometimes ByteString does get used as a string, and the internals make it easy, so we provide a ByteString interface for convenience. (Note: it works by assuming the ByteString is UTF-8 encoded.)

We follow suit with lens-regex-pcre in using pcre-heavy to provide the core regular expression functionality. However, it would not be too difficult to use a different regex provider, as most of the implementation involves creating the necessary optics and exposing them over different string types.