The full-text-search package
An in-memory full-text search engine library. It lets you run full-text queries over a collection of your documents.
Can search over any type of "document". (You supply the code that extracts search terms from each one.)
Supports documents with multiple fields (e.g. title, body)
Supports documents with non-term features (e.g. quality score, page rank)
Uses the state-of-the-art BM25F ranking function
Adjustable ranking parameters (including field weights and non-term feature scores)
In-memory but quite compact. It does not keep a copy of your original documents.
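To make the ranking features above concrete, here is a minimal sketch of the textbook BM25F score for a single query term in a single document. It is not this package's implementation or API: the `FieldStats` record, the function names, and the choice of the non-negative IDF variant are all assumptions made for illustration. It also assumes every average field length is non-zero.

```haskell
-- Illustrative only: a textbook BM25F term score, not the library's code.

-- | Statistics for one term in one field of one document
-- (hypothetical record, invented for this sketch).
data FieldStats = FieldStats
  { fieldWeight :: Double  -- ^ weight of this field, e.g. title > body
  , fieldB      :: Double  -- ^ length-normalisation strength, in [0, 1]
  , termFreq    :: Double  -- ^ occurrences of the term in this field
  , fieldLen    :: Double  -- ^ length of this field in this document
  , avgFieldLen :: Double  -- ^ average length of this field (assumed > 0)
  }

-- | Combine per-field term frequencies into one weighted,
-- length-normalised frequency.
weightedTf :: [FieldStats] -> Double
weightedTf fs = sum [ fieldWeight f * termFreq f / norm f | f <- fs ]
  where
    norm f = (1 - fieldB f) + fieldB f * (fieldLen f / avgFieldLen f)

-- | Inverse document frequency for a collection of @n@ documents,
-- @df@ of which contain the term. This is the variant that never
-- goes negative; other formulations exist.
idf :: Double -> Double -> Double
idf n df = log (1 + (n - df + 0.5) / (df + 0.5))

-- | Score of one term for one document, with saturation parameter @k1@.
termScore :: Double -> Double -> Double -> [FieldStats] -> Double
termScore k1 n df fs = idf n df * tf' / (k1 + tf')
  where
    tf' = weightedTf fs
```

A document's score for a whole query would then be the sum of `termScore` over the query terms, plus whatever contribution the non-term features make. The field weights, the per-field `b` values, and `k1` correspond to the kind of adjustable ranking parameters mentioned above.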
It is independent of the document type, so you have to write the document-specific parts: extracting search terms and any case-normalisation or stemming. This is quite easy using libraries such as tokenize and snowball.
For an example, see the code for the hackage-server where it is used for the package search feature.
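The document-specific part described above can be sketched as follows. This is a minimal illustration that uses only `Data.Text` and `Data.Set`; it is not taken from hackage-server. The `Doc` and `DocField` types, the stop-word list, and the function names are all hypothetical. Stemming is left out; a stemmer such as the one in snowball could be applied to each term after the last filtering step.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Char (isAlphaNum)
import qualified Data.Set  as Set
import           Data.Text (Text)
import qualified Data.Text as T

-- | A hypothetical document type with two searchable fields.
data Doc = Doc
  { docTitle :: Text
  , docBody  :: Text
  }

-- | Hypothetical tags naming the fields of 'Doc'.
data DocField = TitleField | BodyField
  deriving (Eq, Ord, Show, Enum, Bounded)

-- | A tiny, illustrative stop-word list; a real one would be longer.
stopWords :: Set.Set Text
stopWords = Set.fromList ["a", "an", "and", "the", "of", "to", "in"]

-- | Case-normalise, split on anything that is not a letter or digit,
-- then drop empty fragments and stop words.
extractTerms :: Text -> [Text]
extractTerms =
    filter (`Set.notMember` stopWords)
  . filter (not . T.null)
  . T.split (not . isAlphaNum)
  . T.toLower

-- | Extract the search terms for one field of a document.
extractFieldTerms :: Doc -> DocField -> [Text]
extractFieldTerms doc TitleField = extractTerms (docTitle doc)
extractFieldTerms doc BodyField  = extractTerms (docBody doc)
```

The same `extractTerms` function should also be applied to the user's query string, so that queries and indexed documents are normalised in the same way.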
|Versions||0.2.0.0, 0.2.1.0, 0.2.1.1, 0.2.1.3, 0.2.1.4|
|Dependencies||array (==0.4.*), base (>=4.5 && <4.7), containers (>=0.4 && <0.6), text (>=0.11 && <1.2), vector (==0.10.*)|
|Copyright||2013-2014 Duncan Coutts, 2014 Well-Typed LLP|
|Maintainer||Duncan Coutts <email@example.com>|
|Category||Data, Text, NLP|
|Source repo||head: darcs get http://code.haskell.org/full-text-search/|
|Uploaded||Wed Feb 12 22:26:23 UTC 2014 by DuncanCoutts|
|Downloads||1877 total (34 in the last 30 days)|
|Status||Docs available; successful builds reported|