rake: Rapid Automatic Keyword Extraction (RAKE)

[ keyword-extractor, library, natural-language-processing, nlp, rake ]

Rapid Automatic Keyword Extraction (RAKE) is an algorithm that automatically extracts keywords from documents. Keywords are sequences of one or more words that, together, provide a compact representation of the content (see reference below). RAKE is a well-known and widely used NLP technique, but its concrete application depends heavily on factors such as the language in which the content is written, the domain of the content, and the purpose of the keywords.

The implementation in this library is mainly aimed at English. With additional resources, it is also applicable to other languages. The library is inspired by a similar implementation in Python (https://github.com/aneesha/RAKE).

The algorithm is described, for instance, in: Rose, S., Engel, D., Cramer, N., & Cowley, W. (2010). Automatic Keyword Extraction from Individual Documents. In M. W. Berry & J. Kogan (Eds.), Text Mining: Applications and Theory. John Wiley & Sons. Available online at: http://www.cbs.dtu.dk/courses/introduction_to_systems_biology/chapter1_textmining.pdf.
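
To make the idea concrete, here is a minimal sketch of the RAKE scoring scheme as described by Rose et al.: candidate phrases are runs of non-stop words between punctuation marks, each word is scored as degree divided by frequency, and a phrase scores the sum of its word scores. This is not this library's API. The names stopWords, splitWhen, candidates, wordScores and rakeSketch are hypothetical and chosen for illustration only, and the stop list is a tiny stand-in for a real one.

```haskell
import           Data.Char       (isAlphaNum, isSpace, toLower)
import           Data.List       (sortBy)
import qualified Data.Map.Strict as M
import           Data.Ord        (Down (..), comparing)

-- Tiny illustrative stop list (an assumption; real lists are far longer).
stopWords :: [String]
stopWords = ["a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "to", "with"]

-- Split a list wherever the predicate holds, dropping empty pieces.
splitWhen :: (a -> Bool) -> [a] -> [[a]]
splitWhen p xs = case break p (dropWhile p xs) of
  ([], _)    -> []
  (ys, rest) -> ys : splitWhen p rest

-- Candidate phrases: runs of non-stop words inside fragments that are
-- delimited by punctuation. The text is lower-cased first.
candidates :: String -> [[String]]
candidates txt =
  [ phrase
  | fragment <- splitWhen isDelimiter (map toLower txt)
  , phrase   <- splitWhen (`elem` stopWords) (words fragment)
  ]
  where
    isDelimiter c = not (isAlphaNum c || isSpace c || c == '-')

-- Word score deg(w) / freq(w): freq counts occurrences of w in candidate
-- phrases, deg sums the lengths of the phrases in which w occurs.
wordScores :: [[String]] -> M.Map String Double
wordScores phrases = M.intersectionWith (/) deg freq
  where
    freq = M.fromListWith (+) [ (w, 1) | p <- phrases, w <- p ]
    deg  = M.fromListWith (+) [ (w, fromIntegral (length p)) | p <- phrases, w <- p ]

-- Score every distinct phrase as the sum of its word scores, best first.
rakeSketch :: String -> [(String, Double)]
rakeSketch txt = sortBy (comparing (Down . snd)) (M.toList scored)
  where
    phrases = candidates txt
    scores  = wordScores phrases
    scored  = M.fromList
      [ (unwords p, sum [ M.findWithDefault 0 w scores | w <- p ]) | p <- phrases ]
```

Loaded into GHCi, rakeSketch "Keyword extraction is fast. Keyword extraction and text mining" ranks "keyword extraction" and "text mining" at 4.0 and "fast" at 1.0. The sketch uses only base and containers; it does no stemming and has no language-specific resources, which is where a real implementation differs most.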

More information on this Haskell library is available at https://github.com/toschoo/Haskell-Libs.

Versions 0.0.1
Change log changelog.md
Dependencies base (>=4.0 && <=5.0), containers (>=0.5), text (>=1.2)
License LicenseRef-LGPL
Copyright Copyright (c) Tobias Schoofs, 2015
Author Tobias Schoofs
Maintainer Tobias Schoofs <tobias dot schoofs at gmx dot net>
Category Natural Language Processing, NLP, keyword extractor, RAKE
Home page http://github.com/toschoo/Haskell-Libs
Uploaded by TobiasSchoofs at 2015-07-17T11:58:36Z
Distributions NixOS:0.0.1
Downloads 984 total (8 in the last 30 days)
Docs available
Last success reported on 2015-07-17