Description
Convenience functions for the W3C XML Schema regular expression matcher. For internals, see Text.XML.HXT.RelaxNG.XmlSchema.Regex. The grammar can be found at http://www.w3.org/TR/xmlschema11-2/#regexs

Documentation

Match a string against a regular expression. The first argument is the regex, the second is the input string. If the regex is not well formed, Nothing is returned; otherwise the result is Just the match result.

Examples:

  matchRE "x*" "xxx" = Just True
  matchRE "x" "xxx" = Just False
  matchRE "[" "xxx" = Nothing

Split a string by taking the longest prefix that matches a regular expression. Nothing is returned if the regex string is syntactically wrong or if there is no matching prefix; otherwise the pair of prefix and rest is returned.

Examples:

  splitRE "a*b" "abc" = Just ("ab","c")
  splitRE "a*" "bc" = Just ("", "bc")
  splitRE "a+" "bc" = Nothing
  splitRE "[" "abc" = Nothing

sed-like editing function. All matching tokens are edited by the first argument, the editing function; all other characters remain as they are.

Examples:

  sedRE (const "b") "a" "xaxax" = Just "xbxbx"
  sedRE (\ x -> x ++ x) "a" "xax" = Just "xaax"
  sedRE undefined "[" undefined = Nothing

Split a string into tokens (words) by giving a regular expression that all tokens must match. This can be used for simple tokenizers. The words in the result list contain at least one character. All non-matching characters are discarded. If the given regex contains syntax errors, Nothing is returned.

Examples:

  tokenizeRE "a*b" "" = Just []
  tokenizeRE "a*b" "abc" = Just ["ab"]
  tokenizeRE "a*b" "abaab ab" = Just ["ab","aab","ab"]
  tokenizeRE "[a-z]{2,}|[0-9]{2,}|[0-9]+[.][0-9]+" "ab123 456.7abc" = Just ["ab","123","456.7","abc"]
  tokenizeRE "[a-z]*|[0-9]{2,}|[0-9]+[.][0-9]+" "cab123 456.7abc" = Just ["cab","123","456.7","abc"]
  tokenizeRE "[^ \t\n\r]*" "abc def\t\n\rxyz" = Just ["abc","def","xyz"]
  tokenizeRE "[^ \t\n\r]*" = words

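The relationship between longest-prefix splitting and tokenizing can be sketched roughly as follows. This only illustrates the documented behaviour and is not the library's implementation: tokensWith and lowerStar are hypothetical names, and lowerStar is a hand-written stand-in (built from span) for what splitRE "[a-z]*" is described as returning.

```haskell
import Data.Char (isLower)

-- Hypothetical stand-in for a prefix splitter such as splitRE "[a-z]*":
-- it always succeeds, possibly with an empty prefix.
lowerStar :: String -> Maybe (String, String)
lowerStar = Just . span isLower

-- Repeatedly take a matching prefix. Empty prefixes are not tokens, so in
-- that case (or when the splitter fails) one character is discarded.
tokensWith :: (String -> Maybe (String, String)) -> String -> [String]
tokensWith _ [] = []
tokensWith splitPrefix input@(_ : rest) =
  case splitPrefix input of
    Just (tok@(_ : _), remaining) -> tok : tokensWith splitPrefix remaining
    _                             -> tokensWith splitPrefix rest
```

Under these assumptions, tokensWith lowerStar "ab12 cd" evaluates to ["ab","cd"]: the digits and the space match no non-empty prefix and are dropped one character at a time.
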
Split a string into tokens and delimiters by giving a regular expression that all tokens must match. This is a generalisation of the tokenizeRE function above. The non-matching character sequences are marked with Left, the matching ones with Right. If the regular expression contains syntax errors, Nothing is returned.

The following law holds:

  concat . map (either id id) . fromJust . tokenizeRE' re == id

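The Left/Right marking makes several related operations easy to express. A minimal sketch using only base follows. The names reassemble, tokensOnly, editTokens and sample are hypothetical, and sample is an assumed token list for the regex "a*b" on the input "abaab ab", written by hand from the description above rather than produced by the library.

```haskell
import Data.Either (rights)

-- Assumed shape of a tokenizeRE' result for "a*b" on "abaab ab".
sample :: [Either String String]
sample = [Right "ab", Right "aab", Left " ", Right "ab"]

-- The law above: gluing all pieces back together restores the input.
reassemble :: [Either String String] -> String
reassemble = concat . map (either id id)

-- Keeping only the Right pieces gives a plain token list.
tokensOnly :: [Either String String] -> [String]
tokensOnly = rights

-- sed-like editing: transform the matches, keep everything else unchanged.
editTokens :: (String -> String) -> [Either String String] -> String
editTokens edit = concat . map (either id edit)
```

With this sample, reassemble sample gives "abaab ab", tokensOnly sample gives ["ab","aab","ab"], and editTokens (const "b") sample gives "bb b".
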
Convenience function for matchRE. Syntax errors in the regular expression are interpreted as no match found.

Convenience function for tokenizeRE. Syntax errors in the regular expression result in an empty list.

Convenience function for tokenizeRE'. When the regular expression contains errors, [Left input] is returned, which means that no tokens are found.

Convenience function for sedRE. When the regular expression contains errors, sed is the identity; otherwise the functionality is the same as sedRE.

  sed undefined "[" == id

Convenience function for splitRE. Syntax errors in the regular expression are interpreted as no matching prefix found.
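
The convenience variants are described as differing from the RE functions only in how a malformed regex is reported: a default value takes the place of Nothing. That pattern can be sketched with fromMaybe. This is an illustration under assumptions, not the library source: withDefault, withInputDefault and badRegex are hypothetical names, and badRegex merely models a function that reports a syntax error.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for any Maybe-returning function applied to a
-- malformed regex: it always reports a syntax error.
badRegex :: String -> Maybe a
badRegex _ = Nothing

-- Fixed default, e.g. False for matching or [] for tokenizing.
withDefault :: b -> (a -> Maybe b) -> a -> b
withDefault def f = fromMaybe def . f

-- Default derived from the input, e.g. the input itself for sed-like
-- editing or [Left input] for tokenizing with delimiters.
withInputDefault :: (a -> b) -> (a -> Maybe b) -> a -> b
withInputDefault mkDefault f x = fromMaybe (mkDefault x) (f x)
```

With these definitions, withDefault False badRegex "xxx" is False and withInputDefault id badRegex "xax" is "xax", mirroring the documented behaviour on a malformed regex.
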
Produced by Haddock version 2.4.2