Holumbus-Searchengine-1.2.3: A search and indexing engine.

Safe Haskell: None

Holumbus.Index.Inverted.CompressedPrefixMem

Documentation

newtype Inverted occ

The index consists of a table which maps documents to ids and a number of index parts.

Constructors

Inverted 

Fields

unInverted :: Parts occ

The parts of the index, each representing one context.

Instances

Eq occ => Eq (Inverted occ) 
Show occ => Show (Inverted occ) 
Binary occ => Binary (Inverted occ) 
NFData occ => NFData (Inverted occ) 
ComprOccurrences occ => XmlPickler (Inverted occ) 
(Binary occ, ComprOccurrences occ) => HolIndex (Inverted occ) 

type Parts occ = Map Context (Part occ)

The index parts are identified by a name, which should denote the context of the words.

type Part occ = PrefixTree occ

The index part is the real inverted index. Words are mapped to their occurrences. The part is implemented as a prefix tree.
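To make the shape of `Parts` and `Part` concrete, here is a minimal sketch of the same layering: contexts map to parts, and each part maps words to occurrences. It is not the library's code. An ordinary `Data.Map` stands in for `PrefixTree`, and the names `Occ`, `insertOcc`, and `prefixLookup` are hypothetical helpers introduced only for this illustration.

```haskell
import qualified Data.Map.Strict as M
import Data.List (isPrefixOf)

-- Assumed, simplified stand-ins for the library's types.
type Context  = String
type DocId    = Int
type Position = Int

-- Occurrences of one word: document id -> positions in that document.
type Occ   = M.Map DocId [Position]

-- A part maps words to occurrences (Data.Map replaces PrefixTree here).
type Part  = M.Map String Occ

-- The parts are keyed by context name, mirroring `Map Context (Part occ)`.
type Parts = M.Map Context Part

-- Record that a word occurs at a position of a document within a context.
insertOcc :: Context -> String -> DocId -> Position -> Parts -> Parts
insertOcc c w d p =
  M.insertWith (M.unionWith (M.unionWith (++))) c
    (M.singleton w (M.singleton d [p]))

-- All words in a context that start with the given prefix, in ascending order.
-- A real prefix tree answers this without scanning every key.
prefixLookup :: Context -> String -> Parts -> [(String, Occ)]
prefixLookup c pre parts =
  [ (w, o)
  | (w, o) <- maybe [] M.toAscList (M.lookup c parts)
  , pre `isPrefixOf` w
  ]
```

For example, after inserting "haskell" and "hash" into a "title" context, `prefixLookup "title" "ha"` returns both words, while the same query against a "body" context only sees words recorded there.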

type Inverted0 = Inverted Occ0

The pure inverted index implemented as a prefix tree without any space optimizations. This may be taken as a reference for space and time measurements for the other index structures.

type InvertedCompressed = Inverted OccCompressed

The inverted index with simple-9 encoding of the occurrence sets.
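For readers unfamiliar with simple-9, the following is a minimal sketch of the general scheme, not the encoder used by this package. Each 32-bit word carries a 4-bit selector and 28 data bits, and the selector picks one of nine layouts for how many equally sized integers share those 28 bits. The names `layouts`, `encode`, and `decode` are hypothetical, and the greedy layout choice is an assumption of this sketch.

```haskell
import Data.Bits (shiftL, shiftR, (.|.), (.&.))
import Data.Word (Word32)

-- The nine layouts: (how many integers, bits per integer).
layouts :: [(Int, Int)]
layouts = [(28,1),(14,2),(9,3),(7,4),(5,5),(4,7),(3,9),(2,14),(1,28)]

-- Greedily pack integers into 32-bit words.
-- Returns Nothing if some value needs more than 28 bits.
encode :: [Word32] -> Maybe [Word32]
encode [] = Just []
encode xs =
  case [ (sel, n, b) | (sel, (n, b)) <- zip [0 ..] layouts, fits n b ] of
    []              -> Nothing
    (sel, n, b) : _ -> (pack sel b (take n xs) :) <$> encode (drop n xs)
  where
    -- A layout fits if enough values remain and each one is small enough.
    fits n b = let ys = take n xs
               in length ys == n && all (< 2 ^ b) ys
    -- Selector in the top 4 bits, values in consecutive b-bit slots below.
    pack sel b ys =
      (sel `shiftL` 28)
        .|. foldr (\(i, v) acc -> acc .|. (v `shiftL` (i * b))) 0 (zip [0 ..] ys)

-- Unpack words produced by `encode`; assumes every selector is below 9.
decode :: [Word32] -> [Word32]
decode = concatMap unpack
  where
    unpack w =
      let (n, b) = layouts !! fromIntegral (w `shiftR` 28)
          mask   = (1 `shiftL` b) - 1
      in [ (w `shiftR` (i * b)) .&. mask | i <- [0 .. n - 1] ]
```

Small values pack densely: `[1,2,3,300,5]` needs two words here, three 9-bit slots followed by two 14-bit slots. This is why such encodings pay off for sets of small integers.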

type InvertedSerialized = Inverted OccSerialized

The inverted index with serialized occurrence maps with simple-9 encoded sets.

type InvertedCSerialized = Inverted OccCSerialized

The inverted index with serialized occurrence maps with simple-9 encoded sets, where the serialized bytestrings are additionally compressed with bzip2.

type InvertedOSerialized = Inverted OccOSerialized

The pure inverted index with serialized occurrence maps, where the serialized bytestrings are compressed with bzip2 and no simple-9 encoding is applied. This is the most space-efficient of the five variants: it is a few percent smaller than InvertedCSerialized and a few percent faster in lookup.

class ComprOccurrences s where

Instances

ComprOccurrences OccOSerialized 
ComprOccurrences OccCSerialized 
ComprOccurrences OccSerialized 
ComprOccurrences OccCompressed 
ComprOccurrences Occ0 

class Sizeof a where

Methods

sizeof :: a -> Int64

Instances

Sizeof OccOSerialized 
Sizeof OccCSerialized 
Sizeof OccSerialized 
Sizeof OccCompressed 
Sizeof Occ0 
Sizeof ByteString 

removeDocIdsInverted :: ComprOccurrences i => Occurrences -> Inverted i -> Inverted i

Remove DocIds from the index.
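The documentation does not spell out the semantics, so the following is only a sketch of one plausible reading: the document ids that appear as keys of the first argument are dropped from every word's occurrences in every context. The types here are simplified stand-ins (an `IntMap` of `IntSet` for `Occurrences`, `Data.Map` for the prefix tree), `removeDocIds` is a hypothetical name, and discarding words whose occurrences become empty is an assumption of this sketch.

```haskell
import qualified Data.IntMap.Strict as IM
import qualified Data.IntSet as IS
import qualified Data.Map.Strict as M

-- Assumed stand-ins: document id -> set of word positions.
type Occurrences = IM.IntMap IS.IntSet
type Part        = M.Map String Occurrences
type Parts       = M.Map String Part

-- Drop every document id that is a key of `del` from all parts.
-- Words left without any occurrences are removed (an assumption).
removeDocIds :: Occurrences -> Parts -> Parts
removeDocIds del = M.map prunePart
  where
    prunePart = M.filter (not . IM.null) . M.map (`IM.difference` del)
```

Only the keys of the first argument matter in this sketch, so a map from the unwanted document ids to empty position sets is enough to describe what should be removed.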