eros: A text censorship library.


A Haskell library for censoring text, using DansGuardian phraselists. I converted the phraselists into JSON. You can view the converted phraselists here. I recommend looking at the API documentation for Text.Eros if you want an idea of how to use the library. I publish the documentation on GitHub.


Versions [RSS] 0.0.0.0, 0.1.0.0, 0.2.0.0, 0.2.0.1, 0.2.1.0, 0.2.1.1, 0.3.0.0, 0.3.0.1, 0.3.1.0, 0.4.0.0, 0.4.0.1, 0.4.1.0, 0.5.0.0, 0.5.1.0, 0.5.2.0, 0.5.3.0, 0.5.3.1, 0.5.3.2, 0.6.0.0
Dependencies aeson, base (>=4.7 && <4.8), bytestring, containers, text [details]
License BSD-3-Clause
Copyright 2014, Peter Harpending.
Author Peter Harpending
Maintainer Peter Harpending <pharpend2@gmail.com>
Category Text
Source repo head: git clone https://github.com/pharpend/eros.git -b master
Uploaded by pharpend at 2014-07-15T05:27:34Z
Distributions
Reverse Dependencies 1 direct, 0 indirect [details]
Downloads 13408 total (35 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Status Docs uploaded by user
Build status unknown [no reports yet]

Readme for eros-0.5.3.0


Eros

A Haskell library for text censorship, using DansGuardian Phraselists.

I converted those Phraselists to JSON. You can see the converted Phraselists here. There are compressed versions for use in your code.

Eros is still in development and is not ready for real use. If you would like to contribute, please do.

You can try the API documentation on Hackage if you want to learn how to use the library. Hackage isn't terribly reliable at building the documentation, so I also publish it on GitHub Pages.

Usage - v.0.5.2.0

This is a usage guide for version 0.5.2.0. There will be more up-to-date usage guides as more versions come, hopefully.

To install, add eros >=0.5 && <0.6 to the build-depends field in your library's .cabal file.

You can get all of the functions simply by importing Text.Eros.

Hackage seems to be unable to build the API documentation for Eros, but it won't hurt to check eros on Hackage. If that doesn't work, I publish the documentation here.

Using Text.Eros

The basic idea is that you take a Message and check it against a PhraseMap using messageScore. Message is just a type alias for Tl.Text, so enable the OverloadedStrings extension and write ordinary string literals.

In GHCi,

:set -XOverloadedStrings

import Text.Eros

In a file,

 {-# LANGUAGE OverloadedStrings #-}
 import Text.Eros
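As a small illustration of what the extension buys you, here is a sketch that does not depend on Eros at all. It assumes that Tl refers to Data.Text.Lazy; ToyMessage, greeting and shout are hypothetical names made up for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text.Lazy as Tl

-- Hypothetical stand-in for the library's Message alias.
type ToyMessage = Tl.Text

-- With OverloadedStrings, this literal is a lazy Text value, not a String.
greeting :: ToyMessage
greeting = "Hello, world"

-- Ordinary Text functions apply directly to such values.
shout :: ToyMessage -> ToyMessage
shout = Tl.toUpper
```

Without the extension, the literal in greeting would be a String and the definition would not type-check.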

Constructing PhraseMaps

A PhraseMap is just a Phraselist marshaled into the more Haskell-friendly Ms.Map type.

Eros provides a large number of Phraselists.

 data ErosList = Chat
               | Conspiracy
               | DrugAdvocacy
               | Forums
               | Gambling
               | Games
               | Gore
               | IdTheft
               | IllegalDrugs
               | Intolerance
               | LegalDrugs
               | Malware
               | Music
               | News
               | Nudism
               | Peer2Peer
               | Personals
               | Pornography
               | Proxies
               | SecretSocieties
               | SelfLabeling
               | Sport
               | Translation
               | UpstreamFilter
               | Violence
               | WarezHacking
               | Weapons
               | Webmail
   deriving (Eq)

The easiest way to marshal a Phraselist into a PhraseMap is to use the readPhraseMap function.

 readPhraseMap :: Phraselist t => t -> IO PhraseMap

Use it like this:

pornMap <- readPhraseMap Pornography

Internally, readPhraseMap reads JSON data containing the Phraselist, marshals it into a list of PhraseAlmostTrees, converts those into a PhraseForest, and then into a PhraseMap.

You can obviously use mkMap and readPhraselist to do it yourself, but it's a lot easier to just use readPhraseMap.

You can then use messageScore to see the Score (actually an Int) of each message.

messageScore "Go fuck yourself." pornMap

messageScore is not case sensitive, so go fUck YoUrself returns the same score as go fuck yourself, and so on.
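To make the idea of case-insensitive scoring concrete, here is a minimal sketch that uses only base. It is not how Eros implements messageScore; toyPhrases and toyScore are hypothetical, and the real library may weight or count matches differently.

```haskell
import Data.Char (toLower)
import Data.List (isInfixOf)

-- Hypothetical toy phrase table of (phrase, score) pairs.
-- This is not data from the DansGuardian phraselists.
toyPhrases :: [(String, Int)]
toyPhrases = [("poker", 10), ("casino", 20)]

-- Add up the score of each phrase that occurs at least once in the
-- message. Both sides are lower-cased first, so capitalization in the
-- message or in the table does not affect the result.
toyScore :: String -> [(String, Int)] -> Int
toyScore msg phrases =
  sum [score | (phrase, score) <- phrases, map toLower phrase `isInfixOf` lowered]
  where
    lowered = map toLower msg
```

The argument order (message first, phrase table second) mirrors the messageScore call above, so toyScore "some text" toyPhrases reads the same way.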

If you want to use multiple Eros lists, do something like this:

let myLists = [Chat, Pornography, Weapons]

myMaps <- mapM readPhraseMap myLists

map (messageScore "Go fuck yourself") myMaps

This returns [0, 30, 0]: one score per list, in the same order as myLists.
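A list of bare scores is only readable if you remember the order of the lists. One option is to pair each score with a label. The sketch below shows the pattern with toy data and base only; ToyList, toyLists, toyScore and scoreByList are all hypothetical, and the scorer is a simplified stand-in rather than the library's messageScore.

```haskell
import Data.Char (toLower)
import Data.List (isInfixOf)

-- Hypothetical toy phrase table of (phrase, score) pairs.
type ToyList = [(String, Int)]

-- Simplified, case-insensitive stand-in for a scoring function.
toyScore :: String -> ToyList -> Int
toyScore msg phrases =
  sum [s | (p, s) <- phrases, map toLower p `isInfixOf` map toLower msg]

-- Each toy list carries a label so results can be reported by name.
toyLists :: [(String, ToyList)]
toyLists =
  [ ("gambling", [("poker", 10), ("casino", 20)])
  , ("weapons",  [("rifle", 15)])
  ]

-- Score one message against every list, keeping the label next to
-- each score.
scoreByList :: String -> [(String, Int)]
scoreByList msg = zip (map fst toyLists) (map (toyScore msg . snd) toyLists)
```

With the real library you could apply the same zip to your own labels and the list returned by map (messageScore msg) myMaps.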

Using your own phraselists

I haven't added good support for this yet, but there is some support nonetheless. Your phraselist needs to be in JSON, in accordance with the Phraselist schema (I'm too lazy to find a link to it).

 data MyList = MyList
 instance Phraselist MyList where
   phraselistPath MyList = "/path/to/phraselist"

You can then do the normal stuff with messageScore and readPhraseMap.
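If the type class mechanism is unfamiliar, the following self-contained sketch shows why a custom list can be used wherever a built-in one can. ToyPhraselist, toyPath, BuiltIn and describe are hypothetical names, and the file paths are made up; the real Phraselist class may be defined differently.

```haskell
-- Hypothetical class with the same general shape as the instance above.
-- This is not the library's definition.
class ToyPhraselist t where
  toyPath :: t -> FilePath

-- Stand-in for a type of built-in lists.
data BuiltIn = Gambling | Weapons

-- Stand-in for a user-defined list.
data MyList = MyList

instance ToyPhraselist BuiltIn where
  toyPath Gambling = "phraselists/gambling.json"  -- made-up path
  toyPath Weapons  = "phraselists/weapons.json"   -- made-up path

instance ToyPhraselist MyList where
  toyPath MyList = "/path/to/phraselist"

-- Any function constrained only by the class accepts both kinds of
-- list, which is what lets one reader function serve built-in and
-- custom phraselists alike.
describe :: ToyPhraselist t => t -> String
describe l = "would load " ++ toyPath l
```

In GHCi, describe MyList and describe Gambling both type-check, because each argument type has an instance of the class.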

Contributing

I would love it if people would contribute. QuickCheck tests are desperately needed.

As far as functionality goes, this library is pretty cut and dried. I already added all of the features I envisioned.

Versions

Eros is under pretty heavy development, so the versions change quickly. I follow the Hackage standard of major.minor.even-more-minor.trivial, where major and minor entail API-breaking changes.

In the interest of not confusing myself, I keep Eros and the Eros Client on the same major.minor version. So, a bump in the major.minor number doesn't necessarily mean that there's an API-breaking change.

Contact

The best way to contact me is via IRC. I hang out on #archlinux and #haskell on FreeNode. My handles are l0cust and isomorpheous.