case-insensitive-match-0.1.0.0: A simplified, faster way to do case-insensitive matching.

Safe Haskell: Safe
Language: Haskell2010

Data.CaseInsensitive.Ord

Description

Case folding of Unicode characters is a bit of a minefield, and ordinal comparison of characters is sketchy as well. For example, how should an alphabetic character compare with a non-alphabetic one, or a control character with a symbol? Unicode characters don't always have a one-to-one mapping between upper and lower case, so comparing one character at a time doesn't necessarily yield a valid string comparison. To make things worse, some applications might prefer to order "Apple" ahead of "apple", while others would prefer them to be EQ so long as "BANANA" compares GT to either of them.
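The two pitfalls above can be made concrete with nothing but base. This is a minimal sketch, not this module's implementation: the names foldedOnly, foldedThenExact, dotlessI, lowerFoldEqual and upperFoldEqual are hypothetical, and Data.Char's toLower and toUpper are simple per-character mappings rather than full Unicode case folding. It assumes a base recent enough to export (<>) from the Prelude (GHC 8.4 or later).

```haskell
import Data.Char (toLower, toUpper)
import Data.Ord (comparing)

-- Policy 1: "Apple" and "apple" are EQ. Compare the lower-cased strings only.
foldedOnly :: String -> String -> Ordering
foldedOnly = comparing (map toLower)

-- Policy 2: same primary order, but break ties with the ordinary
-- case-sensitive compare, so "Apple" sorts ahead of "apple"
-- ('A' precedes 'a' as a code point).
foldedThenExact :: String -> String -> Ordering
foldedThenExact = comparing (map toLower) <> compare

-- LATIN SMALL LETTER DOTLESS I (U+0131) upper-cases to 'I', while 'I'
-- lower-cases to the ordinary dotted 'i'. Folding down and folding up
-- therefore disagree about whether the dotless i equals "i".
dotlessI :: Char
dotlessI = '\x0131'

lowerFoldEqual, upperFoldEqual :: Bool
lowerFoldEqual = map toLower [dotlessI] == map toLower "i"  -- False
upperFoldEqual = map toUpper [dotlessI] == map toUpper "i"  -- True
```

Loaded into GHCi, foldedOnly "Apple" "apple" gives EQ and foldedThenExact "Apple" "apple" gives LT, while "BANANA" compares GT to either spelling under both policies.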

Herein is an attempt to combine both Unicode case folding and non-alphabetic character comparisons in one module. Be( a)?ware of the details and think critically about your specific needs.

That slippery disclaimer aside, this should exhibit most people's sense of 'normal' behavior when comparing and ordering strings in a case-insensitive manner. In particular, when comparing strings with all-alphanumeric characters or strings with matching non-alphanumeric characters in identical positions this module will yield satisfying results.

Reasonable comparisons:

  "James" and "Mary"

Two first names, no non-alphanumerics.

  "http://www.haskell.org/" and "HTTP://WWW.EXAMPLE.COM/"

All non-alphanumerics in matching positions until a comparison can be made. This would be much more satisfying if the URI were parsed, even to the point of breaking the hostname into components.

  ("Appleseed","Johnny") and ("Bunyon","Paul")

Comparing last-name-to-last-name and then first-to-first.
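As a rough illustration of the field-by-field idea, the sketch below orders (last, first) pairs using base only. The Name synonym and compareNames are hypothetical names, and the per-character toLower stands in for whatever folding this module actually performs.

```haskell
import Data.Char (toLower)
import Data.Ord (comparing)

-- (last name, first name); a hypothetical alias for this example.
type Name = (String, String)

-- Compare last names case-insensitively; only when they are EQ
-- fall through to the first names.
compareNames :: Name -> Name -> Ordering
compareNames =
  comparing (map toLower . fst) <> comparing (map toLower . snd)
```

Because each field is compared on its own, no separator character ever takes part in the comparison: compareNames ("Appleseed","Johnny") ("Bunyon","Paul") is LT purely on the strength of the last names.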

Questionable comparisons:

  "Franklin, Benjamin" and "Hitler, Adolph"
  "Smith, Snuffy" and "Smithers, Waylon"
  "me@example.com" and "YOU@EXAMPLE.COM"

Politics and morality aside, the "last, first" format will inevitably result in comparisons between a comma and an alphabetic character. The result, while intuitively correct in this case, is so only because a comma is 'less than' a letter for arbitrary reasons (OK, the ASCII creators probably thought deeply about this and maybe it's not simply arbitrary). Beware when someone decides to sort your User object on last_first because of some local or locale-based assumption and the results look a little backwards to normal humans.
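How much the separator matters can be seen by swapping it. The following is a sketch under the assumption of a plain code-point comparison after per-character lower-casing (foldedCompare, commaJoined and tildeJoined are hypothetical names, and this module may treat such characters differently):

```haskell
import Data.Char (toLower)
import Data.Ord (comparing)

-- Code-point comparison of the lower-cased strings.
foldedCompare :: String -> String -> Ordering
foldedCompare = comparing (map toLower)

-- The first difference is ',' against 'e'. The comma has the smaller
-- code point, so "Smith" lands ahead of "Smithers".
commaJoined :: Ordering
commaJoined = foldedCompare "Smith, Snuffy" "Smithers, Waylon"  -- LT

-- Same names joined with '~', whose code point is above every ASCII
-- letter. Now "Smithers" lands ahead of "Smith".
tildeJoined :: Ordering
tildeJoined = foldedCompare "Smith~Snuffy" "Smithers~Waylon"    -- GT
```

The names did not change between the two cases; only the separator did, which is the reason to compare the fields separately where that is possible.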

  "Https://www.example.com" and "http://www.haskell.org/"

Do all HTTP URIs come before all HTTPS ones, or should the hostname take precedence? Again, think carefully about why you are making case-insensitive comparisons.
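The two possible answers can be sketched side by side. This uses base only; afterScheme is a deliberately naive, hypothetical helper that just drops everything up to the first "://", and a real application would use a proper URI parser instead.

```haskell
import Data.Char (toLower)
import Data.List (isPrefixOf)
import Data.Ord (comparing)

-- Hypothetical helper: the text following the first "://", or the
-- whole input when no "://" is present. Not a real URI parser.
afterScheme :: String -> String
afterScheme s = go s
  where
    go rest
      | "://" `isPrefixOf` rest = drop 3 rest
      | null rest               = s
      | otherwise               = go (drop 1 rest)

-- Answer 1: compare the whole string, so the scheme decides first.
wholeString :: String -> String -> Ordering
wholeString = comparing (map toLower)

-- Answer 2: compare what follows the scheme first, and use the
-- whole string only as a tie-breaker.
hostFirst :: String -> String -> Ordering
hostFirst = comparing (map toLower . afterScheme) <> wholeString
```

For "Https://www.example.com" against "http://www.haskell.org/", wholeString yields GT (the 's' of "https" is compared with ':' and has the larger code point), while hostFirst yields LT because "www.example.com" precedes "www.haskell.org/".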

Documentation

class CaseInsensitiveOrd a where

A class for ordering strings in a case-insensitive manner.

Minimal complete definition

caseInsensitiveCompare

(^>) :: CaseInsensitiveOrd a => a -> a -> Bool

Greater than, for case-insensitive strings.

(^<) :: CaseInsensitiveOrd a => a -> a -> Bool

Less than, for case-insensitive strings.

(^>=) :: CaseInsensitiveOrd a => a -> a -> Bool

Greater than or equal to, for case-insensitive strings.

(^<=) :: CaseInsensitiveOrd a => a -> a -> Bool

Less than or equal to, for case-insensitive strings.
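To show how the pieces listed above fit together, here is a self-contained sketch that declares a local stand-in class with the same method names and signatures, rather than importing this module. It assumes the four operators are class methods with defaults derived from caseInsensitiveCompare, and that caseInsensitiveCompare returns an Ordering; neither is stated on this page. The UserName type and its instance are hypothetical, and the real module's instances and folding rules may differ.

```haskell
import Data.Char (toLower)
import Data.Ord (comparing)

-- Local stand-in mirroring the documented names; NOT the library's source.
class CaseInsensitiveOrd a where
  caseInsensitiveCompare :: a -> a -> Ordering

  -- Assumed defaults, so an instance only needs caseInsensitiveCompare.
  (^>), (^<), (^>=), (^<=) :: a -> a -> Bool
  x ^>  y = caseInsensitiveCompare x y == GT
  x ^<  y = caseInsensitiveCompare x y == LT
  x ^>= y = caseInsensitiveCompare x y /= LT
  x ^<= y = caseInsensitiveCompare x y /= GT

-- Hypothetical application type.
newtype UserName = UserName String

-- Only the minimal complete definition is supplied here.
instance CaseInsensitiveOrd UserName where
  caseInsensitiveCompare (UserName a) (UserName b) =
    comparing (map toLower) a b
```

With these definitions, UserName "apple" ^< UserName "BANANA" is True, and UserName "Apple" compares both ^>= and ^<= to UserName "apple", since the two are EQ once case is ignored.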