kdt-0.2.4: Fast and flexible k-d trees for various types of point queries.

Safe HaskellSafe
LanguageHaskell2010

Data.KdTree.Static

Contents

Synopsis

Introduction

Let's say you have a large set of 3D points called data points, and you'd like to be able to quickly perform point queries on the data points. One example of a point query is the nearest neighbor query: given a set of data points points and a query point p, which point in points is closest to p?

We can efficiently solve the nearest neighbor query (along with many other types of point queries) if we appropriately organize the data points. One such method of organization is called the k-d tree algorithm, which is implemented in this module.

Usage

Let's say you have a list of 3D data points, and each point is of type Point3d:

data Point3d = Point3d { x :: Double
                       , y :: Double
                       , z :: Double
                       } deriving Show

We call a point's individual values axis values (i.e., x, y, and z in the case of Point3d).

In order to generate a k-d tree of Point3d's, we need to define a PointAsListFn that expresses the point's axis values as a list:

point3dAsList :: Point3d -> [Double]
point3dAsList (Point3d x y z) = [x, y, z]

Now we can build a KdTree structure from a list of data points and perform a nearest neighbor query as follows:

>>> let dataPoints = [(Point3d 0.0 0.0 0.0), (Point3d 1.0 1.0 1.0)]

>>> let kdt = build point3dAsList dataPoints

>>> let queryPoint = Point3d 0.1 0.1 0.1

>>> nearest kdt queryPoint
Point3d {x = 0.0, y = 0.0, z = 0.0}

Variants

Dynamic k-d trees

The KdTree structure is meant for static sets of data points. If you need to insert points into an existing k-d tree, check out Data.KdTree.Dynamic.KdTree.

k-d maps

If you need to associate additional data with each point in the tree (i.e., points are keys associated with values), check out Data.KdMap.Static.KdMap and Data.KdMap.Dynamic.KdMap for static and dynamic variants of this functionality. Please do not try to fake this functionality with a KdTree by augmenting your point type with the extra data; you're gonna have a bad time.

Advanced

Custom distance functions

You may have noticed in the previous use case that we never specified what "nearest" means for our points. By default, build uses a Euclidean distance function that is sufficient in most cases. However, point queries are typically faster on a KdTree built with a user-specified custom distance function. Let's generate a KdTree using a custom distance function.

One idiosyncrasy about KdTree is that custom distance functions are actually specified as squared distance functions (SquaredDistanceFn). This means that your custom distance function must return the square of the actual distance between two points. This is for efficiency: regular distance functions often require expensive square root computations, whereas in our case, the squared distance works fine and doesn't require computing any square roots. Here's an example of a squared distance function for Point3d:

point3dSquaredDistance :: Point3d -> Point3d -> Double
point3dSquaredDistance (Point3d x1 y1 z1) (Point3d x2 y2 z2) =
  let dx = x1 - x2
      dy = y1 - y2
      dz = z1 - z2
  in  dx * dx + dy * dy + dz * dz

We can build a KdTree using our custom distance function as follows:

>>> let kdt = buildWithDist point3dAsList point3dSquaredDistance points

Axis value types

In the above examples, we used a point type with axis values of type Double. We can in fact use axis values of any type that is an instance of the Real typeclass. This means you can use points that are composed of Doubles, Ints, Floats, and so on:

data Point2i = Point2i Int Int

point2iAsList :: Point2i -> [Int]
point2iAsList (Point2i x y) = [x, y]

kdt :: [Point2i] -> KdTree Int Point2i
kdt dataPoints = build point2iAsList dataPoints

Reference

Types

type PointAsListFn a p = p -> [a] Source

Converts a point of type p with axis values of type a into a list of axis values [a].

type SquaredDistanceFn a p = p -> p -> a Source

Returns the squared distance between two points of type p with axis values of type a.

data KdTree a p Source

A k-d tree structure that stores points of type p with axis values of type a.

Instances

Foldable (KdTree a) Source 
(Show a, Show p) => Show (KdTree a p) Source 
Generic (KdTree a p) Source 
(NFData a, NFData p) => NFData (KdTree a p) Source 
type Rep (KdTree a p) Source 

k-d tree construction

empty :: Real a => PointAsListFn a p -> KdTree a p Source

Builds an empty KdTree.

emptyWithDist :: Real a => PointAsListFn a p -> SquaredDistanceFn a p -> KdTree a p Source

Builds an empty KdTree using a user-specified squared distance function.

singleton :: Real a => PointAsListFn a p -> p -> KdTree a p Source

Builds a KdTree with a single point.

singletonWithDist :: Real a => PointAsListFn a p -> SquaredDistanceFn a p -> p -> KdTree a p Source

Builds a KdTree with a single point using a user-specified squared distance function.

build Source

Arguments

:: Real a 
=> PointAsListFn a p 
-> [p]

non-empty list of data points to be stored in the k-d tree

-> KdTree a p 

Builds a KdTree from a list of data points using a default squared distance function defaultSqrDist.

Average complexity: O(n * log(n)) for n data points.

Worst case time complexity: O(n^2) for n data points.

Worst case space complexity: O(n) for n data points.

buildWithDist :: Real a => PointAsListFn a p -> SquaredDistanceFn a p -> [p] -> KdTree a p Source

Builds a KdTree from a list of data points using a user-specified squared distance function.

Average time complexity: O(n * log(n)) for n data points.

Worst case time complexity: O(n^2) for n data points.

Worst case space complexity: O(n) for n data points.

insertUnbalanced :: Real a => KdTree a p -> p -> KdTree a p Source

Inserts a point into a KdTree. This can potentially cause the internal tree structure to become unbalanced. If the tree becomes too unbalanced, point queries will be very inefficient. If you need to perform lots of point insertions on an already existing k-d tree, check out Data.KdTree.Dynamic.KdTree.

Average complexity: O(log(n)) for n data points.

Worse case time complexity: O(n) for n data points.

batchInsertUnbalanced :: Real a => KdTree a p -> [p] -> KdTree a p Source

Inserts a list of points into a KdTree. This can potentially cause the internal tree structure to become unbalanced, which leads to inefficient point queries.

Average complexity: O(n * log(n)) for n data points.

Worst case time complexity: O(n^2) for n data points.

Query

nearest :: Real a => KdTree a p -> p -> p Source

Given a KdTree and a query point, returns the nearest point in the KdTree to the query point.

Average time complexity: O(log(n)) for n data points.

Worst case time complexity: O(n) for n data points.

Throws an error if called on an empty KdTree.

inRadius Source

Arguments

:: Real a 
=> KdTree a p 
-> a

radius

-> p

query point

-> [p]

list of points in tree with given radius of query point

Given a KdTree, a query point, and a radius, returns all points in the KdTree that are within the given radius of the query point.

Points are not returned in any particular order.

Worst case time complexity: O(n) for n data points and a radius that subsumes all points in the structure.

kNearest :: Real a => KdTree a p -> Int -> p -> [p] Source

Given a KdTree, a query point, and a number k, returns the k nearest points in the KdTree to the query point.

Neighbors are returned in order of increasing distance from query point.

Average time complexity: log(k) * log(n) for k nearest neighbors on a structure with n data points.

Worst case time complexity: n * log(k) for k nearest neighbors on a structure with n data points.

inRange Source

Arguments

:: Real a 
=> KdTree a p 
-> p

lower bounds of range

-> p

upper bounds of range

-> [p]

all points within given range

Finds all points in a KdTree with points within a given range, where the range is specified as a set of lower and upper bounds.

Points are not returned in any particular order.

Worst case time complexity: O(n) for n data points and a range that spans all the points.

toList :: KdTree a p -> [p] Source

Returns a list of all the points in the KdTree.

Time complexity: O(n) for n data points.

size :: KdTree a p -> Int Source

Returns the number of elements in the KdTree.

Time complexity: O(1)

Utilities

defaultSqrDist :: Num a => PointAsListFn a p -> SquaredDistanceFn a p Source

A default implementation of squared distance given two points and a PointAsListFn.