| Safe Haskell | None |
|---|---|
| Language | Haskell2010 |
NLP.Chunk
- defaultChunker :: IO (Chunker Chunk Tag)
- conllChunker :: IO (Chunker Chunk Tag)
- train :: (ChunkTag c, Tag t) => Chunker c t -> [ChunkedSentence c t] -> IO (Chunker c t)
- chunk :: (ChunkTag c, Tag t) => Chunker c t -> [TaggedSentence t] -> [ChunkedSentence c t]
- chunkText :: (ChunkTag c, Tag t) => POSTagger t -> Chunker c t -> Text -> Text
- chunkStr :: (ChunkTag c, Tag t) => POSTagger t -> Chunker c t -> String -> String
- chunkerTable :: (ChunkTag c, Tag t) => Map ByteString (ByteString -> Either String (Chunker c t))
- saveChunker :: (ChunkTag c, Tag t) => Chunker c t -> FilePath -> IO ()
- loadChunker :: (ChunkTag c, Tag t) => FilePath -> IO (Chunker c t)
- serialize :: (ChunkTag c, Tag t) => Chunker c t -> ByteString
- deserialize :: (ChunkTag c, Tag t) => Map ByteString (ByteString -> Either String (Chunker c t)) -> ByteString -> Either String (Chunker c t)
Documentation
train :: (ChunkTag c, Tag t) => Chunker c t -> [ChunkedSentence c t] -> IO (Chunker c t)
Train a chunker on a set of additional examples.
chunk :: (ChunkTag c, Tag t) => Chunker c t -> [TaggedSentence t] -> [ChunkedSentence c t]
Chunk a list of TaggedSentences produced by a Chatter tagger,
returning a rich representation of the Chunks and Tags detected
in each sentence.
If you just want to see chunked output from standard text, you
probably want chunkText or chunkStr.
chunkText :: (ChunkTag c, Tag t) => POSTagger t -> Chunker c t -> Text -> Text
Convenience function to tokenize, POS-tag, then chunk the provided text, and format the result in an easy-to-read format.
> tgr <- defaultTagger
> chk <- defaultChunker
> chunkText tgr chk "The brown dog jumped over the lazy cat."
"[NP The/DT brown/NN dog/NN] [VP jumped/VBD] [NP over/IN the/DT lazy/JJ cat/NN] ./."
chunkStr :: (ChunkTag c, Tag t) => POSTagger t -> Chunker c t -> String -> String
A wrapper around chunkText for Strings: the input is packed into Text and the result is unpacked again.
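The pack/unpack wrapping can be illustrated without Chatter itself. This is a minimal sketch of the pattern, not the library's implementation; `viaText` is a hypothetical helper, and `T.toUpper` merely stands in for a partially applied `chunkText tgr chk`.

```haskell
import qualified Data.Text as T

-- | Hypothetical helper: lift a Text-to-Text function to Strings by
-- packing the input and unpacking the result.
viaText :: (T.Text -> T.Text) -> String -> String
viaText f = T.unpack . f . T.pack

main :: IO ()
main =
  -- T.toUpper is only a stand-in for a Text-based chunking function.
  putStrLn (viaText T.toUpper "the brown dog")
```

If you already hold Text values, calling chunkText directly avoids the two conversions.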
chunkerTable :: (ChunkTag c, Tag t) => Map ByteString (ByteString -> Either String (Chunker c t))
The default table mapping chunker IDs to the functions that read a serialized chunker of that kind. Each chunker packaged with Chatter should have an entry here. By convention, the IDs used are the fully qualified module names of the chunker packages.
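The ID-keyed lookup that a table of this shape supports can be sketched using only bytestring and containers. Everything named here (`Reader`, `readerTable`, `dispatch`, the `Example.*` IDs) is hypothetical, and the sketch takes the ID as a separate argument because how a serialized chunker carries its ID is not described on this page.

```haskell
import qualified Data.ByteString.Char8 as BS
import qualified Data.Map.Strict as Map

-- | A reader turns a serialized payload into a value or an error message.
type Reader a = BS.ByteString -> Either String a

-- | Hypothetical table keyed by module-style IDs.
readerTable :: Map.Map BS.ByteString (Reader Int)
readerTable = Map.fromList
  [ (BS.pack "Example.Length", Right . BS.length)
  , (BS.pack "Example.WholeInt", readWholeInt)
  ]
  where
    -- Succeeds only if the entire payload parses as one integer.
    readWholeInt bs = case BS.readInt bs of
      Just (n, rest) | BS.null rest -> Right n
      _                             -> Left "expected a single integer"

-- | Look up the reader registered for an ID and run it on the payload.
dispatch :: Map.Map BS.ByteString (Reader a)
         -> BS.ByteString -> BS.ByteString -> Either String a
dispatch table ident payload =
  case Map.lookup ident table of
    Nothing     -> Left ("no reader registered for ID: " ++ BS.unpack ident)
    Just reader -> reader payload

main :: IO ()
main = do
  print (dispatch readerTable (BS.pack "Example.WholeInt") (BS.pack "42"))
  print (dispatch readerTable (BS.pack "Example.Missing") (BS.pack "42"))
```

Returning `Either String` from both the lookup and the reader lets an unknown ID and a malformed payload surface through the same error channel.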
saveChunker :: (ChunkTag c, Tag t) => Chunker c t -> FilePath -> IO ()
Store a Chunker to disk.
loadChunker :: (ChunkTag c, Tag t) => FilePath -> IO (Chunker c t)
Load a Chunker from disk, gunzipping it first if the file
extension indicates that the file is compressed.
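A save/load round trip might look like the sketch below. It assumes that defaultTagger is exported from a module named NLP.POS, that the file name `chunker.model` is acceptable, and that a file written by saveChunker can be read back by loadChunker from the same path. A path without a compression extension is used because only the loading side is documented here as extension-sensitive.

```haskell
import NLP.Chunk (chunkStr, defaultChunker, loadChunker, saveChunker)
import NLP.POS (defaultTagger) -- assumption: the module exporting defaultTagger

main :: IO ()
main = do
  tgr <- defaultTagger
  chk <- defaultChunker
  let path = "chunker.model" -- hypothetical file name
  saveChunker chk path
  -- loadChunker is polymorphic in its result type; asTypeOf pins the
  -- reloaded chunker to the same type as the one that was saved.
  reloaded <- loadChunker path `asTypeOf` pure chk
  let sentence = "The brown dog jumped over the lazy cat."
  -- Compare the output of the original and the reloaded chunker.
  print (chunkStr tgr chk sentence == chunkStr tgr reloaded sentence)
```

Using `asTypeOf` avoids importing the concrete chunk and tag types just to write a type annotation.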
deserialize :: (ChunkTag c, Tag t) => Map ByteString (ByteString -> Either String (Chunker c t)) -> ByteString -> Either String (Chunker c t)