Title: Optimized route dispatch code Let's start with our module declaration and imports.
> module Yesod.Routes.Dispatch
>     ( Piece (..)
>     , Route (..)
>     , Dispatch
>     , toDispatch
>     ) where
> import Data.Text (Text)
> import qualified Data.Vector as V
> import Data.Maybe (fromMaybe, mapMaybe)
> import qualified Data.Map as Map
> import Data.List (sortBy)
> import Data.Ord (comparing)
> import Control.Arrow (second)
> import Control.Exception (assert)
This module provides an efficient routing system. The code is pure, requires no fancy extensions, has no Template Haskell involved and is not Yesod specific. It does, however, assume a routing system similar to that of Yesod. Routing works based on splitting up a path into its components. This is handled very well by both the web-routes and http-types packages, and this module does not duplicate that functionality. Instead, it assumes that the requested path will be provided as a list of 'Text's. A route will be specified by a list of pieces (using the 'Piece' datatype).
> data Piece = Static Text | Dynamic
Each piece is either a static piece- which is required to match a component of the path precisely- or a dynamic piece, which will match any component. Additionally, a route can optionally match all remaining components in the path, or fail if extra components exist. Usually, the behavior of dynamic is not what you really want. Often times, you will want to match integers, or slugs, or some other limited format. This brings us nicely to the dispatch function. Each route provides a function of type:
> type Dispatch res = [Text] -> Maybe res
The res argument is application-specific. For example, in a simple WAI application, it could be the Application datatype. The important thing to point out about Dispatch is that is takes a list of 'Text's and returns its response in a Maybe. This gives you a chance to have finer-grained control over how individual components are parsed. If you don't want to deal with it, you return 'Nothing' and routing continues. Note: You do *not* need to perform any checking on your static pieces, this module handles that for you automatically. So each route is specified by:
> data Route res = Route
>     { rhPieces :: [Piece]
>     , rhHasMulti :: Bool
>     , rhDispatch :: Dispatch res
>     }
Your application needs to provide this module with a list of routes, and then this module will give you back a new dispatch function. In other words:
> toDispatch :: [Route res] -> Dispatch res
> toDispatch rhs =
>     bcToDispatch bc
>   where
>     bc = toBC rhs
In addition to the requirements listed above for routing, we add one extra rule: your specified list of routes is treated as ordered, with the earlier ones matching first. If you have an overlap between two routes, the first one will be dispatched. The simplest approach would be to loop through all of your routes and compare against the path components. But this has linear complexity. Many existing frameworks (Rails and Django at least) have such algorithms, usually based on regular expressions. But we can provide two optimizations: * Break up routes based on how many components they can match. We can then select which group of routes to continue testing. This lookup runs in constant time. * Use a Map to reduce string comparisons for each route to logarithmic complexity. Let's start with the first one. Each route has a fixed number of pieces. Let's call this *n*. If that route can also match trailing components (rhHasMulti above), then it will match *n* and up. Otherwise, it will match specifically on *n*. If *max(n)* is the maximum value of *n* for all routes, what we need is (*max(n)* + 2) groups: a zero group (matching a request for the root of the application), 1 - *max(n)* groups, and a final extra group containing all routes that can match more than *max(n)* components. This group will consist of all the routes with rhHasMulti, and only those routes.
> data ByCount res = ByCount
>     { bcVector :: !(V.Vector (PieceMap res))
>     , bcRest :: !(PieceMap res)
>     }
We haven't covered PieceMap yet; it is used for the second optimization. We'll discuss it below. The following function breaks up a list of routes into groups. Again, please ignore the PieceMap references for the moment.
> toBC :: [Route res] -> ByCount res
> toBC rhs =
>     ByCount
>         { bcVector = groups
>         , bcRest = allMultis
>         }
>   where
Determine the value of *max(n)*.
>     maxLen
>       | null rhs = 0
>       | otherwise = maximum $ map (length . rhPieces) rhs
Get the list of all routes which can have multis. This will make up the *rest* group.
>     allMultis = toPieceMap maxLen $ filter rhHasMulti rhs
And now get all the numbered groups. For each group, we need to get all routes with *n* components, __and__ all routes with less than *n* components and that have rhHasMulti set to True.
>     groups = V.map group $ V.enumFromN 0 (maxLen + 1)
>     group i = toPieceMap i $ filter (canHaveLength i) rhs
>     canHaveLength :: Int -> Route res -> Bool
>     canHaveLength i rh =
>         len == i || (len < i && rhHasMulti rh)
>       where
>         len = length $ rhPieces rh
Next we'll set up our routing by maps. What we need is a bunch of nested Maps. For example, if we have the following routings: /foo/bar/1 /foo/baz/2 We would want something that looks vaguely like: /foo /bar /1 /baz /2 But there's an added complication: we need to deal with dynamic compnents and HasMulti as well. So what we'd really have is routes looking like: /foo/bar/1 /foo/baz/2 /*dynamic*/bin/3 /multi/*bunch of multis* We can actually simplify away the multi business. Remember that for each group, we will have a fixed number of components to match. In the list above, it's three. Even though the last route only has one component, we can actually just fill up the missing components with *dynamic*, which will give the same result for routing. In other words, we'll treat it as: /foo /bar /1 /baz /2 /*dynamic* /bin /3 /multi /*dynamic* /*dynamic* What we need is then two extra features on our datatype: * Support both a 'Map Text PieceMap' for static pieces, and a general 'PieceMap' for all dynamic pieces. * An extra constructive after we've gone three levels deep, to provide all matching routes. What we end up with is:
> data PieceMap res = PieceMap
>     { pmDynamic :: PieceMap res
>     , pmStatic :: Map.Map Text (PieceMap res)
>     } | PieceMapEnd [(Int, Dispatch res)]
Note that the PieceMapEnd is a list of pairs, including an Int. Since the map process will confuse the original order of our routes, we need some way to get that back to make sure overlapping is handled correctly. We'll need two pieces of information to make a PieceMap: the depth to drill down to, and the routes in the current group. We'll immediately zip up those routes with an Int to indicate route priority.
> toPieceMap :: Int -> [Route res] -> PieceMap res
> toPieceMap depth = toPieceMap' depth . zip [1..]
> toPieceMap' :: Int
>             -> [(Int, Route res)]
>             -> PieceMap res
The stopping case: we've exhausted the full depth, so let's put together a PieceMapEnd. Technically speaking, the sort here is unnecessary, since we'll sort again later. However, that second sorting occurs during each dispatch occurrence, whereas this sorting only occurs once, in the initial construction of the PieceMap. Therefore, we presort here.
> toPieceMap' 0 rhs =
>     PieceMapEnd $ map (second rhDispatch)
>                 $ sortBy (comparing fst) rhs
Note also that we apply rhDispatch to the route. We are no longer interested in the rest of the route information, so it can be discarded. Now the heart of this algorithm: we construct the pmDynamic and pmStatic records. For both, we recursively call toPieceMap' again, with the depth knocked down by 1.
> toPieceMap' depth rhs = PieceMap
>     { pmDynamic = toPieceMap' depth' dynamics
>     , pmStatic = Map.map (toPieceMap' depth') statics
>     }
>   where
>     depth' = depth - 1
We turn our list of routes into a list of pairs. The first item in the pair gives the next piece, and the second gives the route again, minus that piece.
>     pairs = map toPair rhs
>     toPair (i, Route (p:ps) b c) = (p, (i, Route ps b c))
And as we mentioned above, for multi pieces we fill in the remaining pieces with Dynamic.
>     toPair (i, Route [] b c) = assert b (Dynamic, (i, Route [] b c))
Next, we break up our list of dynamics.
>     getDynamic (Dynamic, rh) = Just rh
>     getDynamic _ = Nothing
>     dynamics = mapMaybe getDynamic pairs
And now we make a Map for statics. Note that Map.fromList would not be appropriate here, since it would only keep one route per Text.
>     getStatic (Static t, rh) = Just $ Map.singleton t [rh]
>     getStatic _ = Nothing
>     statics = Map.unionsWith (++) $ mapMaybe getStatic pairs
The time has come to actually dispatch.
> bcToDispatch :: ByCount res -> Dispatch res
> bcToDispatch (ByCount vec rest) ts0 =
>     bcToDispatch' ts0 pm0
>   where
Get the PieceMap for the appropriate group. If the length of the requested path is greater than *max(n)*, then use the "rest" group.
>     pm0 = fromMaybe rest $ vec V.!? length ts0
Stopping case: we've found our list of routes. Sort them, then starting applying their dispatch functions. If the first one returns Nothing, go to the next, and so on.
>     bcToDispatch' _ (PieceMapEnd r) = firstJust (\f -> f ts0) $ map snd r
For each component, get the static PieceMap and the dynamic one, combine them together, and then continue dispatching.
>     bcToDispatch' (t:ts) (PieceMap dyn sta) = bcToDispatch' ts $
>         case Map.lookup t sta of
>             Nothing -> dyn
>             Just pm -> append dyn pm
Handle an impossible case that should never happen.
>     bcToDispatch' [] _ = assert False Nothing
Helper function: get the first Just response.
> firstJust :: (a -> Maybe b) -> [a] -> Maybe b
> firstJust _ [] = Nothing
> firstJust f (a:as) = maybe (firstJust f as) Just $ f a
Combine two PieceMaps together.
> append :: PieceMap res -> PieceMap res -> PieceMap res
At the end, just combine the list of routes. But we combine them in such a way so as to preserve their order. Since a and b come presorted (as mentioned above), we can just merge the two lists together in linear time.
> append (PieceMapEnd a) (PieceMapEnd b) = PieceMapEnd $ merge a b
Combine the dynamic and static portions of the maps.
> append (PieceMap a x) (PieceMap b y) =
>     PieceMap (append a b) (Map.unionWith append x y)
An impossible case.
> append _ _ = assert False $ PieceMapEnd []
Our O(n) merge.
> merge :: Ord a => [(a, b)] -> [(a, b)] -> [(a, b)]
> merge x [] = x
> merge [] y = y
> merge x@(a@(ai, _):xs) y@(b@(bi, _):ys)
>   | ai < bi   = a : merge xs y
>   | otherwise = b : merge x ys