nerf-0.5.0: Nerf, the named entity recognition tool based on linear-chain CRFs

Safe Haskell: Safe-Inferred

NLP.Nerf.Tokenize

Description

The module implements the tokenization used within Nerf, together with utilities for synchronizing named-entity annotations with a given tokenization.

Tokenization

tokenize :: String -> [String]

Tokenize a sentence using the default tokenizer.
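The rules of the default tokenizer are not documented here. As a rough illustration of what a function of this type does, the sketch below splits on whitespace and turns every punctuation character into its own token. The name simpleTokenize and its splitting rules are assumptions made for this example; they are not Nerf's actual behaviour.

```haskell
import Data.Char (isPunctuation, isSpace)

-- Hypothetical helper, not part of Nerf: split on whitespace and make
-- every punctuation character a token of its own.
simpleTokenize :: String -> [String]
simpleTokenize [] = []
simpleTokenize s@(c:cs)
  | isSpace c       = simpleTokenize cs          -- skip whitespace
  | isPunctuation c = [c] : simpleTokenize cs    -- one-character token
  | otherwise       =
      -- Read a maximal run of non-space, non-punctuation characters.
      let (tok, rest) = break (\x -> isSpace x || isPunctuation x) s
      in  tok : simpleTokenize rest
```

With these rules, simpleTokenize "Hello, world!" yields the four tokens "Hello", ",", "world" and "!".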

Synchronization

class Word a where

A class of objects which can be converted to String.

Methods

word :: a -> String

Instances

Word String 
Word Text 
Word Text 
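To make the role of the class concrete, here is a minimal, self-contained sketch. It restates the class locally so the snippet compiles on its own, and adds an instance for a made-up Token record. The Token type, its fields, and the surface helper are hypothetical and do not come from Nerf.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Local restatement of the class, so the example stands alone.
class Word a where
  word :: a -> String

-- A String is already its own textual form.
instance Word String where
  word = id

-- Hypothetical token type carrying extra information next to the text.
data Token = Token
  { orth   :: String  -- orthographic form
  , offset :: Int     -- character offset in the sentence
  }

instance Word Token where
  word = orth

-- Hypothetical helper: the surface string covered by a token sequence.
surface :: Word a => [a] -> String
surface = concatMap word
```

Any type with a Word instance can then serve as the token type, since only its textual form is needed.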

moveNEs :: (Word b, Word c) => NeForest a b -> [c] -> NeForest a c

Synchronize the named entities with the tokenization given as the second argument. Both arguments must describe the same sentence.
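The idea of synchronization can be sketched as follows. This is not the implementation of moveNEs: the NeTree type is a simplified stand-in for NeForest, both token types are fixed to String, and a new token that straddles an entity boundary is assumed to belong to the node in which it starts. The sketch aligns old leaves and new tokens by counting non-whitespace characters.

```haskell
import Data.Char (isSpace)

-- Hypothetical stand-in for a named-entity forest: internal nodes carry
-- an entity label, leaves carry a token.
data NeTree a b = Node a [NeTree a b] | Leaf b
  deriving (Show, Eq)

-- State: characters already consumed ahead of the old leaves,
-- and the new tokens not yet placed.
type St = (Int, [String])

-- Number of non-whitespace characters in a token.
size :: String -> Int
size = length . filter (not . isSpace)

-- Take new tokens until at least `need` characters are covered.
-- Returns the tokens taken, the surplus characters, and the rest.
takeChars :: Int -> [String] -> ([String], Int, [String])
takeChars need ts
  | need <= 0 = ([], negate need, ts)
takeChars _ [] = ([], 0, [])
takeChars need (t:ts) =
  let (taken, surplus, rest) = takeChars (need - size t) ts
  in  (t : taken, surplus, rest)

-- Replace one old tree by the new leaves covering the same characters.
syncTree :: St -> NeTree a String -> (St, [NeTree a String])
syncTree (debt, ts) (Leaf w)
  | debt >= n = ((debt - n, ts), [])  -- covered by an earlier token
  | otherwise =
      let (taken, surplus, rest) = takeChars (n - debt) ts
      in  ((surplus, rest), map Leaf taken)
  where n = size w
syncTree st (Node x kids) =
  let (st', kids') = syncForest st kids
  in  (st', [Node x kids' | not (null kids')])  -- drop emptied nodes

syncForest :: St -> [NeTree a String] -> (St, [NeTree a String])
syncForest st [] = (st, [])
syncForest st (t:ts) =
  let (st1, xs) = syncTree st t
      (st2, ys) = syncForest st1 ts
  in  (st2, xs ++ ys)

-- Hypothetical counterpart of moveNEs for the simplified types above.
moveSketch :: [NeTree a String] -> [String] -> [NeTree a String]
moveSketch forest toks = snd (syncForest (0, toks) forest)
```

For instance, moving the forest [Node "PER" [Leaf "J.", Leaf "Smith"]] onto the tokens "J", ".", "Smith" keeps the PER node and gives it three leaves instead of two.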