hist-pl-lexicon-0.3.1: A binary representation of the historical dictionary of Polish

Safe Haskell: None

NLP.HistPL

Description

The module provides functions for working with the binary representation of the historical dictionary of Polish.

It is intended to be imported qualified, to avoid name clashes with Prelude functions, e.g.

 import qualified NLP.HistPL as H

Use the save and load functions to save the entire dictionary to, or load it from, a given directory. They are particularly useful when you want to convert the LMF dictionary to the binary format (see the NLP.HistPL.LMF module).

To search the dictionary, first open the binary dictionary directory with the open function. For example, during a GHCi session:

>>> hpl <- H.open "srpsdp.bin"

Set the OverloadedStrings extension for convenience:

>>> :set -XOverloadedStrings

To search the dictionary, use the lookup function, e.g.

>>> entries <- H.lookup hpl "dufliwego"

You can use functions defined in the NLP.HistPL.Types module to query the entries for a particular feature, e.g.

>>> map (H.text . H.lemma) entries
[["dufliwy"]]

Entries

data BinEntry

An entry in the binary dictionary consists of a lexical entry and its unique identifier.

Constructors

BinEntry 

Fields

lexEntry :: LexEntry

Lexical entry.

uid :: Int

Unique identifier among lexical entries with the same first form (see the Key data type).

data Key

A dictionary key which uniquely identifies the lexical entry.

Constructors

Key 

Fields

keyForm :: Text

First form (presumably the lemma) of the lexical entry.

keyUid :: Int

Unique identifier among lexical entries with the same keyForm.

proxyForm :: LexEntry -> Text

Form representing the lexical entry.

binKey :: BinEntry -> Key

Key assigned to the binary entry.

Rules

data Rule

A rule for translating a form into a binary dictionary key.

Constructors

Rule 

Fields

cut :: !Int

Number of characters to cut from the end of the form.

suffix :: !Text

A suffix to paste after cutting.

ruleUid :: !Int

Unique identifier of the entry.

between :: Text -> Key -> Rule

Make a rule which translates between the string and the key.

apply :: Rule -> Text -> Key

Apply the rule.
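The cut and suffix fields suggest a simple suffix-rewriting scheme. The following is a minimal sketch of how such rules could behave, not the library's actual implementation: the Key and Rule types are local stand-ins that only mirror the documented field names, and the semantics (drop cut characters from the end of the form, then append suffix; build a rule from the longest common prefix) are assumptions.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Data.Text (Text)
import qualified Data.Text as T

-- Local stand-ins mirroring the documented fields (NOT the library's types).
data Key = Key { keyForm :: Text, keyUid :: Int } deriving (Show, Eq)
data Rule = Rule { cut :: !Int, suffix :: !Text, ruleUid :: !Int }
  deriving (Show, Eq)

-- | Assumed semantics: drop 'cut' characters from the end of the form,
-- then append 'suffix'; the rule's identifier becomes the key's identifier.
apply :: Rule -> Text -> Key
apply r x = Key (T.take (T.length x - cut r) x `T.append` suffix r) (ruleUid r)

-- | Assumed semantics: keep the longest common prefix of the form and the
-- key form, cut the rest of the form, and paste the rest of the key form.
between :: Text -> Key -> Rule
between source dest = Rule
  { cut     = T.length source - k
  , suffix  = T.drop k (keyForm dest)
  , ruleUid = keyUid dest }
  where
    -- Length of the longest common prefix (0 if there is none).
    k = case T.commonPrefixes source (keyForm dest) of
          Just (p, _, _) -> T.length p
          Nothing        -> 0

main :: IO ()
main = do
  let rule = between "dufliwego" (Key "dufliwy" 0)
  print rule                       -- cut = 3, suffix = "y"
  print (apply rule "dufliwego")   -- recovers the key form "dufliwy"
```

Under these assumptions, apply (between x k) x always gives back k, so a rule is a compact way to store the mapping from an inflected form to its dictionary key.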

Dictionary

data HistPL

A binary dictionary handle.

Open

tryOpen :: FilePath -> IO (Maybe HistPL)

Open the binary dictionary residing in the given directory. Return Nothing if the directory doesn't exist or if it doesn't constitute a dictionary.

open :: FilePath -> IO HistPL

Open the binary dictionary residing in the given directory. Raise an error if the directory doesn't exist or if it doesn't constitute a dictionary.
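When the dictionary path comes from user input, tryOpen lets a program handle a missing or invalid directory without an exception. A minimal sketch, assuming the same "srpsdp.bin" directory and query word as in the GHCi session above:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified NLP.HistPL as H

main :: IO ()
main = do
  -- tryOpen returns Nothing instead of raising an error.
  mhpl <- H.tryOpen "srpsdp.bin"
  case mhpl of
    Nothing  -> putStrLn "srpsdp.bin is missing or is not a binary dictionary"
    Just hpl -> do
      -- Same query as in the module description.
      entries <- H.lookup hpl "dufliwego"
      print (map (H.text . H.lemma) entries)
```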

Query

lookup :: HistPL -> Text -> IO [LexEntry]

Look up the form in the dictionary.

lookupBin :: HistPL -> Text -> IO [BinEntry]

Look up the form in the dictionary. Similar to lookup, but returns BinEntry values, which can be used to determine the place of each entry in the dictionary storage.

getIndex :: HistPL -> IO [Key]

List of dictionary keys.

withKey :: HistPL -> Key -> IO (Maybe BinEntry)

Extract the lexical entry with the given key.
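getIndex and withKey can be combined to walk the dictionary by key. The sketch below collects all entries that share a first form; the helper names sameForm and homonyms are hypothetical and not part of the library, and the "srpsdp.bin" path is the one assumed in the module description.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Data.Maybe (catMaybes)
import           Data.Text (Text)
import qualified NLP.HistPL as H

-- | Hypothetical helper: keys whose first form equals the given form.
sameForm :: Text -> [H.Key] -> [H.Key]
sameForm form = filter ((== form) . H.keyForm)

-- | Hypothetical helper: all binary entries stored under the given first form.
homonyms :: H.HistPL -> Text -> IO [H.BinEntry]
homonyms hpl form = do
  keys <- H.getIndex hpl
  -- withKey may return Nothing; such keys are skipped.
  fmap catMaybes (mapM (H.withKey hpl) (sameForm form keys))

main :: IO ()
main = do
  hpl <- H.open "srpsdp.bin"
  es  <- homonyms hpl "dufliwy"
  -- Print the identifier that distinguishes each entry with this form.
  mapM_ (print . H.uid) es
```

Note that this reads the whole index on every call, so for repeated queries it may be preferable to fetch the key list once and reuse it.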

Conversion

Save

save :: FilePath -> [LexEntry] -> IO ()

Save the HistPL dictionary in the given (empty) directory.

Load

load :: FilePath -> IO (Maybe [BinEntry])

Load the dictionary from disk lazily. Return Nothing if the path doesn't correspond to a binary representation of the dictionary.

Modules

The NLP.HistPL.Types module exports the hierarchy of data types stored in the binary dictionary.