regex-posix-0.96.0.0: POSIX Backend for "Text.Regex" (regex-base)

Copyright(c) Chris Kuklewicz 200620072008 derived from (c) The University of Glasgow 2002
LicenseBSD-3-Clause
Maintainerhvr@gnu.org
Stabilityexperimental
Portabilitynon-portable (regex-base needs MPTC+FD)
Safe HaskellNone
LanguageHaskell2010

Text.Regex.Posix.Wrap

Contents

Description

WrapPosix.hsc exports a wrapped version of the ffi imports. To increase type safety, the flags are newtype'd. The other important export is a Regex type that is specific to the Posix library backend. The flags are documented in Text.Regex.Posix. The defaultCompOpt is (compExtended .|. compNewline).

The Regex, CompOption, and ExecOption types and their RegexOptions instance is declared. The =~ and =~~ convenience functions are defined.

This module will fail or error only if allocation fails or a nullPtr is passed in.

2009-January : wrapMatchAll and wrapCount now adjust the execution option execNotBOL after the first result to take into account '\n' in the text immediately before the next matches. (version 0.93.3)

2009-January : wrapMatchAll and wrapCount have been changed to return all non-overlapping matches, including empty matches even if they coincide with the end of the previous non-empty match. The change is that the first non-empty match no longer terminates the search. One can filter the results to obtain the old behavior or to obtain the behavior of "sed", where "sed" eliminates the empty matches which coincide with the end of non-empty matches. (version 0.94.0)

Synopsis

High-level API

data Regex Source #

A compiled regular expression.

Instances
RegexLike Regex String Source # 
Instance details

Defined in Text.Regex.Posix.String

RegexLike Regex ByteString Source # 
Instance details

Defined in Text.Regex.Posix.ByteString.Lazy

RegexLike Regex ByteString Source # 
Instance details

Defined in Text.Regex.Posix.ByteString

RegexOptions Regex CompOption ExecOption Source # 
Instance details

Defined in Text.Regex.Posix.Wrap

RegexContext Regex String String Source # 
Instance details

Defined in Text.Regex.Posix.String

Methods

match :: Regex -> String -> String #

matchM :: MonadFail m => Regex -> String -> m String #

RegexContext Regex ByteString ByteString Source # 
Instance details

Defined in Text.Regex.Posix.ByteString.Lazy

RegexContext Regex ByteString ByteString Source # 
Instance details

Defined in Text.Regex.Posix.ByteString

RegexMaker Regex CompOption ExecOption String Source # 
Instance details

Defined in Text.Regex.Posix.String

RegexMaker Regex CompOption ExecOption ByteString Source # 
Instance details

Defined in Text.Regex.Posix.ByteString.Lazy

RegexMaker Regex CompOption ExecOption ByteString Source # 
Instance details

Defined in Text.Regex.Posix.ByteString

RegexMaker Regex CompOption ExecOption (Seq Char) Source # 
Instance details

Defined in Text.Regex.Posix.Sequence

RegexLike Regex (Seq Char) Source # 
Instance details

Defined in Text.Regex.Posix.Sequence

RegexContext Regex (Seq Char) (Seq Char) Source # 
Instance details

Defined in Text.Regex.Posix.Sequence

Methods

match :: Regex -> Seq Char -> Seq Char #

matchM :: MonadFail m => Regex -> Seq Char -> m (Seq Char) #

type RegOffset = Int64 Source #

RegOffset is "typedef int regoff_t" on Linux and ultimately "typedef long long __int64_t" on Max OS X. So rather than saying 2,147,483,647 is all the length you need, I'll take the larger: 9,223,372,036,854,775,807 should be enough bytes for anyone, no need for Integer. The alternative is to compile to different sizes in a platform dependent manner with "type RegOffset = (#type regoff_t)", which I do not want to do.

There is also a special value unusedRegOffset :: RegOffset which is (-1) and as a starting index means that the subgroup capture was unused. Otherwise the RegOffset indicates a character boundary that is before the character at that index offset, with the first character at index offset 0. So starting at 1 and ending at 2 means to take only the second character.

(=~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target) => source1 -> source -> target Source #

(=~~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target, MonadFail m) => source1 -> source -> m target Source #

Low-level API

type WrapError = (ReturnCode, String) Source #

The return code will be retOk when it is the Haskell wrapper and not the underlying library generating the error message.

wrapCompile Source #

Arguments

:: CompOption

Flags (bitmapped)

-> ExecOption

Flags (bitmapped)

-> CString

The regular expression to compile (ASCII only, no null bytes)

-> IO (Either WrapError Regex)

Returns: the compiled regular expression

wrapMatch :: Regex -> CString -> IO (Either WrapError (Maybe [(RegOffset, RegOffset)])) Source #

wrapMatch returns offsets for the begin and end of each capture. Unused captures have offsets of unusedRegOffset which is (-1)

wrapMatchAll :: Regex -> CString -> IO (Either WrapError [MatchArray]) Source #

wrapMatchAll returns the offset and length of each capture. Unused captures have an offset of unusedRegOffset which is (-1) and length of 0.

Miscellaneous

Compilation options

newtype CompOption Source #

A bitmapped CInt containing options for compilation of regular expressions. Option values (and their man 3 regcomp names) are

  • compBlank which is a completely zero value for all the flags. This is also the blankCompOpt value.
  • compExtended (REG_EXTENDED) which can be set to use extended instead of basic regular expressions. This is set in the defaultCompOpt value.
  • compNewline (REG_NEWLINE) turns on newline sensitivity: The dot (.) and inverted set [^ ] never match newline, and ^ and $ anchors do match after and before newlines. This is set in the defaultCompOpt value.
  • compIgnoreCase (REG_ICASE) which can be set to match ignoring upper and lower distinctions.
  • compNoSub (REG_NOSUB) which turns off all information from matching except whether a match exists.

Constructors

CompOption CInt 
Instances
Eq CompOption Source # 
Instance details

Defined in Text.Regex.Posix.Wrap

Num CompOption Source # 
Instance details

Defined in Text.Regex.Posix.Wrap

Show CompOption Source # 
Instance details

Defined in Text.Regex.Posix.Wrap

Bits CompOption Source # 
Instance details

Defined in Text.Regex.Posix.Wrap

RegexOptions Regex CompOption ExecOption Source # 
Instance details

Defined in Text.Regex.Posix.Wrap

RegexMaker Regex CompOption ExecOption String Source # 
Instance details

Defined in Text.Regex.Posix.String

RegexMaker Regex CompOption ExecOption ByteString Source # 
Instance details

Defined in Text.Regex.Posix.ByteString.Lazy

RegexMaker Regex CompOption ExecOption ByteString Source # 
Instance details

Defined in Text.Regex.Posix.ByteString

RegexMaker Regex CompOption ExecOption (Seq Char) Source # 
Instance details

Defined in Text.Regex.Posix.Sequence

compBlank :: CompOption Source #

A completely zero value for all the flags. This is also the blankCompOpt value.

Execution options

newtype ExecOption Source #

A bitmapped CInt containing options for execution of compiled regular expressions. Option values (and their man 3 regexec names) are

  • execBlank which is a complete zero value for all the flags. This is the blankExecOpt value.
  • execNotBOL (REG_NOTBOL) can be set to prevent ^ from matching at the start of the input.
  • execNotEOL (REG_NOTEOL) can be set to prevent $ from matching at the end of the input (before the terminating NUL).

Constructors

ExecOption CInt 
Instances
Eq ExecOption Source # 
Instance details

Defined in Text.Regex.Posix.Wrap

Num ExecOption Source # 
Instance details

Defined in Text.Regex.Posix.Wrap

Show ExecOption Source # 
Instance details

Defined in Text.Regex.Posix.Wrap

Bits ExecOption Source # 
Instance details

Defined in Text.Regex.Posix.Wrap

RegexOptions Regex CompOption ExecOption Source # 
Instance details

Defined in Text.Regex.Posix.Wrap

RegexMaker Regex CompOption ExecOption String Source # 
Instance details

Defined in Text.Regex.Posix.String

RegexMaker Regex CompOption ExecOption ByteString Source # 
Instance details

Defined in Text.Regex.Posix.ByteString.Lazy

RegexMaker Regex CompOption ExecOption ByteString Source # 
Instance details

Defined in Text.Regex.Posix.ByteString

RegexMaker Regex CompOption ExecOption (Seq Char) Source # 
Instance details

Defined in Text.Regex.Posix.Sequence

execBlank :: ExecOption Source #

A completely zero value for all the flags. This is also the blankExecOpt value.

Return codes

newtype ReturnCode Source #

ReturnCode is an enumerated CInt, corresponding to the error codes from man 3 regex:

  • retBadbr (REG_BADBR) invalid repetition count(s) in { }
  • retBadpat (REG_BADPAT) invalid regular expression
  • retBadrpt (REG_BADRPT) ?, *, or + operand invalid
  • retEcollate (REG_ECOLLATE) invalid collating element
  • retEctype (REG_ECTYPE) invalid character class
  • retEescape (REG_EESCAPE) \ applied to unescapable character
  • retEsubreg (REG_ESUBREG) invalid backreference number
  • retEbrack (REG_EBRACK) brackets [ ] not balanced
  • retEparen (REG_EPAREN) parentheses ( ) not balanced
  • retEbrace (REG_EBRACE) braces { } not balanced
  • retErange (REG_ERANGE) invalid character range in [ ]
  • retEspace (REG_ESPACE) ran out of memory
  • retNoMatch (REG_NOMATCH) The regexec() function failed to match

Constructors

ReturnCode CInt 
Instances
Eq ReturnCode Source # 
Instance details

Defined in Text.Regex.Posix.Wrap

Show ReturnCode Source # 
Instance details

Defined in Text.Regex.Posix.Wrap