regex-posix-0.96.0.1: POSIX Backend for "Text.Regex" (regex-base)
Copyright: (c) Chris Kuklewicz 2006
License: BSD-3-Clause
Maintainer: hvr@gnu.org, Andreas Abel
Stability: stable
Portability: non-portable (regex-base needs MPTC+FD)
Safe Haskell: None
Language: Haskell2010

Text.Regex.Posix

Description

This module provides the Regex backend that wraps the C POSIX.2 regex API. It is the backend used by the regex-compat package to replace Text.Regex.

The Text.Regex.Posix module provides a backend for regular expressions. If you import this along with other backends, then you should do so with qualified imports, perhaps renamed for convenience.
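A minimal sketch of such a qualified import. The alias Posix and the helper hasDigit are illustrative names, not part of the library:

```haskell
-- A qualified import keeps this backend's names apart from those of any
-- other regex backend imported in the same module. The alias is arbitrary.
import qualified Text.Regex.Posix as Posix

-- True when the input contains at least one decimal digit.
-- The Bool result type selects the "does it match?" context.
hasDigit :: String -> Bool
hasDigit input = input Posix.=~ "[0-9]+"
```

A second backend could be imported alongside under its own alias, and each operator would then be chosen by its qualifier.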

If the =~ and =~~ functions are too high level, you can use the compile, regexec, and execute functions by importing either Text.Regex.Posix.String or Text.Regex.Posix.ByteString. If you want to use a low-level CString interface to the library, then import Text.Regex.Posix.Wrap and use the wrap* functions.
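As a rough sketch of the mid-level String interface, the helper below compiles a pattern and runs it once. The name firstMatch is hypothetical, and the choice of compExtended with execBlank is only one possible option set:

```haskell
import Text.Regex.Posix (compExtended, execBlank)
import Text.Regex.Posix.String (compile, regexec)

-- Compile a pattern, then run it against one input.
-- On a match, regexec yields (before, matched, after, subexpression captures).
firstMatch :: String -> String -> IO (Maybe String)
firstMatch pat input = do
  compiled <- compile compExtended execBlank pat
  case compiled of
    Left _ -> pure Nothing              -- the pattern failed to compile
    Right regex -> do
      result <- regexec regex input
      pure $ case result of
        Right (Just (_, matched, _, _)) -> Just matched
        _                               -> Nothing  -- no match, or an error
```

Unlike =~, both steps run in IO and report errors through Either, so a bad pattern can be handled without an exception.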

This module is efficient with a ByteString only if it is NUL terminated, i.e. (ByteString.last bs)==0. Otherwise the library must make a temporary copy of the ByteString and append the NUL byte.
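The check described above can be sketched with plain Data.ByteString functions. Both helper names are hypothetical, and whether a trailing NUL byte is acceptable in the caller's data is an assumption to verify:

```haskell
import qualified Data.ByteString as B

-- True when the ByteString already ends in a NUL byte, which is the case
-- the description says needs no temporary copy.
isNulTerminated :: B.ByteString -> Bool
isNulTerminated bs = not (B.null bs) && B.last bs == 0

-- Append the NUL byte once, up front, if it is missing.
ensureNulTerminated :: B.ByteString -> B.ByteString
ensureNulTerminated bs
  | isNulTerminated bs = bs
  | otherwise          = B.snoc bs 0
```

Terminating a ByteString once and reusing it across many matches would avoid paying for the copy on every call.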

A String will be converted into a CString for processing. Doing this repeatedly will be very inefficient.

Note that the POSIX library works with single-byte characters and does not understand Unicode. If you need Unicode support, you will have to use a different backend.

When offsets are reported for subexpression captures, a subexpression that did not match anything (as opposed to matching an empty string) will have its offset set to the unusedRegOffset value, which is (-1).
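A small sketch of how an unused subexpression shows up in a MatchArray. The names groups and unusedGroup are illustrative, and fromIntegral is used so the comparison does not depend on the exact integer type of unusedRegOffset:

```haskell
import Data.Array ((!))
import Text.Regex.Posix

-- Offset and length for the whole match (index 0) and for each subexpression.
-- Only the second alternative can match "b", so subexpression 1 is unused.
groups :: MatchArray
groups = "b" =~ "(a)|(b)"

-- True when the given subexpression took no part in the match.
unusedGroup :: Int -> Bool
unusedGroup i = fst (groups ! i) == fromIntegral unusedRegOffset
```

Checking the offset this way distinguishes "did not participate" from "matched the empty string", which has a real offset and length 0.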

Benchmarking shows the default regex library on many platforms is very inefficient. You might increase performance by an order of magnitude by obtaining libpcre and regex-pcre or libtre and regex-tre. If you do not need the captured substrings, then you can also get great performance from regex-dfa. If you do need the captured substrings, then you may be able to use regex-parsec to improve performance.

Documentation

Wrap, for =~ and =~~, types and constants

data Regex

A compiled regular expression.

Instances

RegexLike Regex String

Defined in Text.Regex.Posix.String

Methods

matchOnce :: Regex -> String -> Maybe MatchArray

matchAll :: Regex -> String -> [MatchArray]

matchCount :: Regex -> String -> Int

matchTest :: Regex -> String -> Bool

matchAllText :: Regex -> String -> [MatchText String]

matchOnceText :: Regex -> String -> Maybe (String, MatchText String, String)

RegexLike Regex ByteString

Defined in Text.Regex.Posix.ByteString.Lazy

RegexLike Regex ByteString

Defined in Text.Regex.Posix.ByteString

RegexContext Regex String String

Defined in Text.Regex.Posix.String

Methods

match :: Regex -> String -> String

matchM :: MonadFail m => Regex -> String -> m String

RegexContext Regex ByteString ByteString

Defined in Text.Regex.Posix.ByteString.Lazy

RegexContext Regex ByteString ByteString

Defined in Text.Regex.Posix.ByteString

RegexOptions Regex CompOption ExecOption

Defined in Text.Regex.Posix.Wrap

RegexMaker Regex CompOption ExecOption String

Defined in Text.Regex.Posix.String

RegexMaker Regex CompOption ExecOption ByteString

Defined in Text.Regex.Posix.ByteString.Lazy

RegexMaker Regex CompOption ExecOption ByteString

Defined in Text.Regex.Posix.ByteString

RegexMaker Regex CompOption ExecOption (Seq Char)

Defined in Text.Regex.Posix.Sequence

RegexLike Regex (Seq Char)

Defined in Text.Regex.Posix.Sequence

Methods

matchOnce :: Regex -> Seq Char -> Maybe MatchArray

matchAll :: Regex -> Seq Char -> [MatchArray]

matchCount :: Regex -> Seq Char -> Int

matchTest :: Regex -> Seq Char -> Bool

matchAllText :: Regex -> Seq Char -> [MatchText (Seq Char)]

matchOnceText :: Regex -> Seq Char -> Maybe (Seq Char, MatchText (Seq Char), Seq Char)

RegexContext Regex (Seq Char) (Seq Char)

Defined in Text.Regex.Posix.Sequence

Methods

match :: Regex -> Seq Char -> Seq Char

matchM :: MonadFail m => Regex -> Seq Char -> m (Seq Char)

newtype ExecOption

A bitmapped CInt containing options for execution of compiled regular expressions. Option values (and their man 3 regexec names) are

  • execBlank which is a completely zero value for all the flags. This is also the blankExecOpt value.
  • execNotBOL (REG_NOTBOL) can be set to prevent ^ from matching at the start of the input.
  • execNotEOL (REG_NOTEOL) can be set to prevent $ from matching at the end of the input (before the terminating NUL).
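To illustrate the effect of execNotBOL, the sketch below builds the same pattern twice with makeRegex and makeRegexOpts from regex-base. The names anchored and notAtStart are illustrative only:

```haskell
import Text.Regex.Posix

-- "^a" with the default options: ^ may match at the start of the input.
anchored :: Regex
anchored = makeRegex "^a"

-- The same pattern, executed with REG_NOTBOL set, so ^ no longer
-- matches at the start of the input.
notAtStart :: Regex
notAtStart = makeRegexOpts defaultCompOpt execNotBOL "^a"
```

With these definitions, matchTest anchored "abc" should succeed while matchTest notAtStart "abc" should not. This can be useful when the input is a fragment taken from the middle of a larger text.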

Constructors

ExecOption CInt 

Instances

Eq ExecOption

Defined in Text.Regex.Posix.Wrap

Num ExecOption

Defined in Text.Regex.Posix.Wrap

Show ExecOption

Defined in Text.Regex.Posix.Wrap

Bits ExecOption

Defined in Text.Regex.Posix.Wrap

RegexOptions Regex CompOption ExecOption

Defined in Text.Regex.Posix.Wrap

RegexMaker Regex CompOption ExecOption String

Defined in Text.Regex.Posix.String

RegexMaker Regex CompOption ExecOption ByteString

Defined in Text.Regex.Posix.ByteString.Lazy

RegexMaker Regex CompOption ExecOption ByteString

Defined in Text.Regex.Posix.ByteString

RegexMaker Regex CompOption ExecOption (Seq Char)

Defined in Text.Regex.Posix.Sequence

newtype CompOption

A bitmapped CInt containing options for compilation of regular expressions. Option values (and their man 3 regcomp names) are

  • compBlank which is a completely zero value for all the flags. This is also the blankCompOpt value.
  • compExtended (REG_EXTENDED) which can be set to use extended instead of basic regular expressions. This is set in the defaultCompOpt value.
  • compNewline (REG_NEWLINE) turns on newline sensitivity: The dot (.) and inverted set [^ ] never match newline, and ^ and $ anchors do match after and before newlines. This is set in the defaultCompOpt value.
  • compIgnoreCase (REG_ICASE) which can be set to match ignoring the distinction between upper and lower case.
  • compNoSub (REG_NOSUB) which turns off all information from matching except whether a match exists.
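Because CompOption has a Bits instance, flags can be combined with (.|.). The sketch below adds compIgnoreCase to the defaults; the name caseInsensitive and the pattern "hello" are illustrative:

```haskell
import Data.Bits ((.|.))
import Text.Regex.Posix

-- The default compile options plus REG_ICASE, with default exec options.
caseInsensitive :: Regex
caseInsensitive =
  makeRegexOpts (defaultCompOpt .|. compIgnoreCase) defaultExecOpt "hello"

-- The same pattern with the default options only, for comparison.
caseSensitive :: Regex
caseSensitive = makeRegex "hello"
```

Here matchTest caseInsensitive "Say HELLO" should succeed, whereas the same test against caseSensitive should fail.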

Constructors

CompOption CInt 

Instances

Eq CompOption

Defined in Text.Regex.Posix.Wrap

Num CompOption

Defined in Text.Regex.Posix.Wrap

Show CompOption

Defined in Text.Regex.Posix.Wrap

Bits CompOption

Defined in Text.Regex.Posix.Wrap

RegexOptions Regex CompOption ExecOption

Defined in Text.Regex.Posix.Wrap

RegexMaker Regex CompOption ExecOption String

Defined in Text.Regex.Posix.String

RegexMaker Regex CompOption ExecOption ByteString

Defined in Text.Regex.Posix.ByteString.Lazy

RegexMaker Regex CompOption ExecOption ByteString

Defined in Text.Regex.Posix.ByteString

RegexMaker Regex CompOption ExecOption (Seq Char)

Defined in Text.Regex.Posix.Sequence

compBlank :: CompOption

A completely zero value for all the flags. This is also the blankCompOpt value.

execBlank :: ExecOption

A completely zero value for all the flags. This is also the blankExecOpt value.

(=~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target) => source1 -> source -> target

(=~~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target, MonadFail m) => source1 -> source -> m target
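A short sketch contrasting the two operators. In both, the text to search is the left operand and the pattern is the right operand, and the requested result type selects what is returned. The helper names are illustrative:

```haskell
import Text.Regex.Posix

-- Pure match: with a String target, the result is the first matched
-- substring, or the empty string when nothing matches.
firstNumber :: String -> String
firstNumber input = input =~ "[0-9]+"

-- Monadic match: failure is reported through MonadFail, so in Maybe
-- a failed match becomes Nothing.
firstNumberM :: String -> Maybe String
firstNumberM input = input =~~ "[0-9]+"
```

The monadic form is the safer choice when "no match" must be told apart from a match on the empty string.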