edit-distance-vector-1.0.0.4: Calculate edit distances and edit scripts between vectors.

Safe HaskellNone
LanguageHaskell2010

Data.Vector.Distance

Contents

Description

This module implements a variation on the Wagner-Fischer algorithm to find the shortest sequences of operations which transforms one vector of values into another.

Synopsis

Types

data Params v o c Source

Operations invoked by the Wagner-Fischer algorithm.

The parameters to this type are as follows:

  • v is the type of values being compared,
  • o is the type representing operations,
  • c is the type representing costs.

The chief restrictions on these type parameters is that the cost type c must have instances of Monoid and Ord. A good default choice might be the type (Sum Int).

Constructors

Params 

Fields

equivalent :: v -> v -> Bool

Are two values equivalent?

delete :: Int -> v -> o

Delete the element at an index.

insert :: Int -> v -> o

Insert an element at an index.

substitute :: Int -> v -> v -> o

Substitute an element at an index.

cost :: o -> c

Cost of a change.

positionOffset :: o -> Int

Positions to advance after a change. E.g. 0 for a deletion.

type ChangeMatrix o c = Vector (c, [o]) Source

Matrix of optimal edit scripts and costs for all prefixes of two vectors.

This is a representation of the n * m dynamic programming matrix constructed by the algorithm. The matrix is stored in a Vector in row-major format with an additional row and column corresponding to the empty prefix of the source and destination Vectors.

Operations

leastChanges Source

Arguments

:: (Monoid c, Ord c) 
=> Params v o c 
-> Vector v

"Source" vector.

-> Vector v

"Destination" vector.

-> (c, [o]) 

O(nm). Find the cost and optimal edit script to transform one Vector into another.

allChanges Source

Arguments

:: (Monoid c, Ord c) 
=> Params v o c 
-> Vector v

"Source" vector.

-> Vector v

"Destination" vector.

-> ChangeMatrix o c 

O(nm). Calculate the complete matrix of edit scripts and costs between two vectors.

Example

strParams :: Params Char (String, Int, Char) (Sum Int) Source

Example Params to compare (Vector Char) values.

The algorithm will produce edit distances in terms of (Sum Int) and edit scripts containing (String, Int, Char) values.

The first component of each operation is either "delete", "insert", or "replace".