Safe Haskell: Safe-Inferred
Latent Dirichlet Allocation
Imperative implementation of a collapsed Gibbs sampler for LDA. This library uses topic modeling terminology (documents, words, topics), even though it is generic. For example, if used for word class induction, replace documents with word types, words with features, and topics with word classes.
- pass :: Int -> LDA s -> Vector Doc -> ST s (Vector Doc)
- passOne :: Int -> LDA s -> Doc -> ST s Doc
- data LDA s
- type Doc = (D, Vector (W, Maybe Z))
- type D = Int
- type W = Int
- type Z = Int
- type Table2D = IntMap Table1D
- type Table1D = IntMap Double
- data Finalized = Finalized {}
- initial :: Vector Word32 -> Int -> Double -> Double -> Maybe Double -> ST s (LDA s)
- finalize :: LDA s -> ST s Finalized
- docTopicWeights_ :: LDA s -> Doc -> ST s (Vector Double)
- priorDocTopicWeights_ :: LDA s -> D -> ST s (Vector Double)
- docTopicWeights :: Finalized -> Doc -> Vector Double
- wordTopicWeights :: Finalized -> D -> W -> Vector Double
- docCounts :: Finalized -> Table1D
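The Doc type in the synopsis pairs a document identifier with a sequence of (word, optional topic) entries, all encoded as Int. The following is a minimal sketch of how a tokenized corpus might be turned into that shape. It is not part of this library: the names DocList, encodeToken, encodeDoc and encodeCorpus are hypothetical, a plain list stands in for Vector to keep the example dependency-free, and reading Nothing as "no topic assigned yet" is an assumption.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (mapAccumL)

-- Aliases mirroring the synopsis. DocList is a hypothetical stand-in
-- for Doc that uses a list instead of Vector.
type D = Int
type W = Int
type Z = Int
type DocList = (D, [(W, Maybe Z)])

-- | Look up a token's integer id, allocating the next free id when the
-- token has not been seen before.
encodeToken :: Map.Map String W -> String -> (Map.Map String W, W)
encodeToken vocab tok =
  case Map.lookup tok vocab of
    Just w  -> (vocab, w)
    Nothing -> let w = Map.size vocab
               in (Map.insert tok w vocab, w)

-- | Encode one tokenized document. Every word starts with Nothing as its
-- topic (assumed here to mean "not yet assigned").
encodeDoc :: Map.Map String W -> (D, [String]) -> (Map.Map String W, DocList)
encodeDoc vocab (d, toks) =
  let (vocab', ws) = mapAccumL encodeToken vocab toks
  in (vocab', (d, [(w, Nothing) | w <- ws]))

-- | Encode a whole corpus, threading one shared vocabulary through all
-- documents and numbering the documents from 0.
encodeCorpus :: [[String]] -> (Map.Map String W, [DocList])
encodeCorpus docs = mapAccumL encodeDoc Map.empty (zip [0 ..] docs)

main :: IO ()
main = do
  let (vocab, docs) = encodeCorpus [words "the cat sat", words "the dog sat"]
  print (Map.toList vocab)
  mapM_ print docs
```

For word class induction the same encoding would apply with word types in place of documents and features in place of words.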
Samplers
pass :: Int -> LDA s -> Vector Doc -> ST s (Vector Doc)
pass batch runs one pass of Gibbs sampling on the documents in batch.
Datatypes
Access model information
Finalized
Initialization and finalization
initial :: Vector Word32 -> Int -> Double -> Double -> Maybe Double -> ST s (LDA s)
initial s k a b initializes the model with k topics, alpha hyperparameter a/k, beta hyperparameter b, and random seed s.
finalize :: LDA s -> ST s Finalized
Creates a transparent, immutable object holding the model information from the opaque internal representation.
Querying evolving model
Querying finalized model
docTopicWeights :: Finalized -> Doc -> Vector Double
docTopicWeights m doc returns unnormalized topic probabilities for document doc given LDA model m.
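Because the returned weights are unnormalized, a caller who wants a probability distribution over topics has to rescale them. The sketch below shows one way to do that on a plain list of weights. The helpers normalize and argmax are hypothetical and not part of this library, and the sketch assumes the weights are non-negative and that the Vector Double result has first been converted to a list (for example with the vector package's toList).

```haskell
-- | Scale weights so they sum to 1. The input is returned unchanged when
-- the total is not positive, to avoid dividing by zero.
normalize :: [Double] -> [Double]
normalize ws
  | total > 0 = map (/ total) ws
  | otherwise = ws
  where
    total = sum ws

-- | Index of the largest weight, or Nothing for an empty list.
-- When several weights tie for the maximum, the highest index wins.
argmax :: [Double] -> Maybe Int
argmax [] = Nothing
argmax ws = Just (snd (maximum (zip ws [0 :: Int ..])))

main :: IO ()
main = do
  -- Made-up weights for three topics, for illustration only.
  let weights = [2.0, 6.0, 2.0]
  print (normalize weights)
  print (argmax weights)
```

Normalizing does not change which topic has the largest weight, so argmax can be applied to the raw weights directly when only the most likely topic is needed.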