edit-distance-vector-1.0: Calculate edit distances and edit scripts between vectors.

Data.Vector.Distance

Contents

Description

This module implements a variation on the Wagner-Fischer algorithm to find the shortest sequences of operations which transforms one vector of values into another.

Synopsis

# Types

data Params v o c Source

Operations invoked by the Wagner-Fischer algorithm.

The parameters to this type are as follows:

• `v` is the type of values being compared,
• `o` is the type representing operations,
• `c` is the type representing costs.

The chief restrictions on these type parameters is that the cost type `c` must have instances of `Monoid` and `Ord`. A good default choice might be the type `(Sum Int)`.

Constructors

 Params Fieldsequivalent :: v -> v -> BoolAre two values equivalent?delete :: Int -> v -> oDelete the element at an index.insert :: Int -> v -> oInsert an element at an index.substitute :: Int -> v -> v -> oSubstitute an element at an index.cost :: o -> cCost of a change.positionOffset :: o -> IntPositions to advance after a change. E.g. `0` for a deletion.

type ChangeMatrix o c = Vector (c, [o]) Source

Matrix of optimal edit scripts and costs for all prefixes of two vectors.

This is a representation of the `n * m` dynamic programming matrix constructed by the algorithm. The matrix is stored in a `Vector` in row-major format with an additional row and column corresponding to the empty prefix of the source and destination `Vectors`.

# Operations

Arguments

 :: (Monoid c, Ord c) => Params v o c -> Vector v "Source" vector. -> Vector v "Destination" vector. -> (c, [o])

O(nm). Find the cost and optimal edit script to transform one `Vector` into another.

Arguments

 :: (Monoid c, Ord c) => Params v o c -> Vector v "Source" vector. -> Vector v "Destination" vector. -> ChangeMatrix o c

O(nm). Calculate the complete matrix of edit scripts and costs between two vectors.

# Example

Example `Params` to compare `(Vector Char)` values.

The algorithm will produce edit distances in terms of `(Sum Int)` and edit scripts containing `(String, Int, Char)` values.

The first component of each operation is either `"delete"`, `"insert"`, or `"replace"`.