Download

Here's full source (BSD3-licensed, Cabal sdist) for:

On Hackage: sai-shape-syb

Generic Shape

It is sometimes useful to "project out" the shape or skeleton of a data structure. In abstract algebra, structure-preserving maps are called homomorphisms. Category theory is sometimes described of as the study of homomorphisms, which are there called morphisms.

Although I like theory, I'm not a theoretician so this post may annoy those with high expectations in terms of knowledge of type theory, CT etc. However, I'm pleased with the results, and learned a fair bit, and it may help someone out of a practical bind down the road, so here it is. Motivated by a particular need, to work with some very complicated types (GHC AST) in a quick and dirty way. A couple days' work would have sufficed, but to bring the library up to a shareable quality took a few more. Not only did I get my GHC API hacks; by a slight generalisation of the original idea, it looks like it'll solve the most potentially work-intensive plumbing stage of a big project. Delightful! Whether it will be sufficiently performant remains to be seen...

A couple days ago I started searching for information about how to implement such a generic homomorphism in Haskell, and even asked on the IRC, but didn't turn up anything despite some advice from experienced people. If I'd hit on the right keyword -- shape -- I'd have doubtlessly been directed to one of the excellent libraries in packages shapely-data and fixplate.

I think either of these would probably have sufficed for the task, but I don't presently have an inclination to confirm this. These packages are not needed to build this library or test. At some point soon I'd like to make a careful study of shapely-data and perhaps fixplate. I feel humbled when I look at the beautiful API for shapely-data, even though I don't know how to use it or what its capabilities are yet. By comparison, my implementation is much less grounded in theory, and is probably weaker in every way. If nothing else, this mini-project has revitalised my interest in the theory side.

This implementation:

Code and Examples

Homomorphism can preserve some value information, in addition to the structure, mapping to a k-ary tree of homogeneous type, such as Rose a


  data Rose a = R a [Rose a]

with the "parentheses langauge" representation obtained for the unit type a ~ ().

Achieving this for a homogeneous recursive source type is straightforward. However, we would like a generic solution, to handle arbitrary (heterogeneous) recursive types.

In Haskell, this is possible using a generics library such as SYB or Uniplate. I chose SYB as I had a little past experience with that. If you want to run the code, you'll need to install the SYB package.

This package supports a generic mapping which works over arbitrary recursive heterogeneous types. It differs from existing generic fmap (SYB's gmapT, or Oleg's gmap), in that it allows you to obtain a homogeneous result in the type of your choice, rather than adhering to the original types. [Actually maybe Oleg's will do this.] Hence, this is not an fmap, it doesn't satisfy the laws. (It's a homomorphism.) It uses gfoldl, so fold-as-map. This theme is not new, but it's my take on a useful idea. I'm not trying to contribute theoretically, I just needed these tools and didn't find them. :)

What I came up with is summarised in the following code


  {-# LANGUAGE Rank2Types #-}

  import Data.Data ( Data )
  import Data.Data ( gmapQ )
  import Data.Generics.Aliases ( GenericQ )

  data Rose a = R a [Rose a] deriving Show
  type Homo a = Rose a
  type Shape = Homo ()

  -- (Note: This is now ghomK in the API.)
  ghom :: forall r d. Data d =>
             (r -> r -> r)
          -> GenericQ r
          -> d
          -> Homo r
  ghom k f x = foldl k' b (gmapQ (ghom k f) x)
   where
     b = R (f x) []
     k' (R r chs) nod@(R r' _) = R (r `k` r') (chs++[nod])

(If you use (nod:chs) the structure will be flipped horizontally.)

From this we can easily write the desired generic shapeOf


  shapeOf :: forall d. Data d => d -> Shape
  shapeOf = ghom (\_ _->()) (const ())

Without more ado, it's also possible to write a function which maps arbitrary data to its weighted tree representation

  weightedShapeOf :: forall d. Data d => d -> Homo Int
  weightedShapeOf = ghom (+) (const 1)

We can also use it to preserve choice values.

Let's run some tests.

[Apologies that the examples aren't very good, but they at least demonstrate usage.]


> showAsParens (shapeOf (1,2,3))
  
(•••)        4 nodes

> showAsParens (shapeOf [1,2,3])
  
(•(•(••)))   7 nodes

> showAsParensEnriched $ ghom (mkQ 0 (id::Int->Int)) [1,2,3])
  
(0(1)(0(2)(0(3)(0))))

> showAsParensEnrichedWhen (>0) $ ghom (mkQ 0 (id::Int->Int)) [1,2,3])
  
(.(1)(.(2)(.(3)(.))))

> showAsParensEnrichedM $ ghom (mkQ Nothing ((\x->Just x)::Int->Maybe Int)) [1,2,3]
  
(1(2(3.)))


Possibly, [[Int]], although polytypic, was not a good type for so many examples.
(Some examples with ADTs come after.)

> show test_list
  
[[1,2],[3],[4,5,6]]

> showHetero $ ghomDyn test_list
  
[[1,2],[3],[4,5,6]]
  [1,2]
    1
    [2]
      2
      []
  [[3],[4,5,6]]
    [3]
      3
      []
    [[4,5,6]]
      [4,5,6]
        4
        [5,6]
          5
          [6]
            6
            []
      []

> showHomo $ shapeOf test_list
  
()
| ()
| | ()
| | ()
| | | ()
| | | ()
| ()
| | ()
| | | ()
| | | ()
| | ()
| | | ()
| | | | ()
| | | | ()
| | | | | ()
| | | | | ()
| | | | | | ()
| | | | | | ()
| | | ()

> showAsParens $ shapeOf test_list
  
((•(••))((••)((•(•(••)))•)))

> showAsParensEnriched $ weightedShapeOf test_list
  
(19(5(1)(3(1)(1)))(13(3(1)(1))(9(7(1)(5(1)(3(1)(1))))(1))))

> showAsParensBool $ ghom (mkQ False (odd::Int->Bool)) test_list
  
(.(.(•)(.(.)(.)))(.(.(•)(.))(.(.(.)(.(•)(.(.)(.))))(.))))

The parentheses around the leaves are not strictly needed,
but the result would be harder to read:

(.(.•(...))(.(.•.)(.(..(.•(...))).)))

> showAsParensEnriched $ ghom (mkQ False (odd::Int->Bool)) test_list
  
(False(False(True)(False(False)(False)))(False(False(True)(False))(False(False(False)(False(True)(False(False)(False))))(False))))

> showAsParensEnrichedWhen id $ ghom (mkQ False (odd::Int->Bool)) test_list
  
(.(.(True)(.(.)(.)))(.(.(True)(.))(.(.(.)(.(True)(.(.)(.))))(.))))

> showHomo $ ghom (mkQ False (odd::Int->Bool)) test_list
  
False
| False
| | True
| | False
| | | False
| | | False
| False
| | False
| | | True
| | | False
| | False
| | | False
| | | | False
| | | | False
| | | | | True
| | | | | False
| | | | | | False
| | | | | | False
| | | False

> showHomoWhen id $ ghom (mkQ False (odd::Int->Bool)) test_list
  
.
| .
| | True
| | .
| | | .
| | | .
| .
| | .
| | | True
| | | .
| | .
| | | .
| | | | .
| | | | .
| | | | | True
| | | | | .
| | | | | | .
| | | | | | .
| | | .

> showHomoWhen (>0) $ ghom (mkQ 0 (id::Int->Int)) test_list
  
.
| .
| | 1
| | .
| | | 2
| | | .
| .
| | .
| | | 3
| | | .
| | .
| | | .
| | | | 4
| | | | .
| | | | | 5
| | | | | .
| | | | | | 6
| | | | | | .
| | | .

> showHomo $ filterHomo id $ ghom (mkQ False (odd::Int->Bool)) test_list
  
False
| True
| True
| True

> showHomo $ filterHomoM id $ ghom (mkQ False (odd::Int->Bool)) test_list
  
Nothing
| Just True
| Nothing
| | Just True
| | Just True

> showHomo $ filterHomoM odd $ ghom (mkQ 0 (id::Int->Int)) test_list
  
Nothing
| Just 1
| Nothing
| | Just 3
| | Just 5

> showHomo $ filterHomo (>0) $ ghom (mkQ 0 (id::Int->Int)) test_list
  
0
| 1
| 2
| 3
| 4
| 5
| 6

> showHomo $ filterHomoM (>0) $ ghom (mkQ 0 (id::Int->Int)) test_list
  
Nothing
| Nothing
| | Just 1
| | Just 2
| Nothing
| | Just 3
| | Nothing
| | | Just 4
| | | Nothing
| | | | Just 5
| | | | Just 6

> showHomo $ filterHomoM (>=0) $ ghom (mkQ (-1) (id::Int->Int) `extQ` ((\_->0)::[Int]->Int)) test_list
  
Nothing
| Just 0
| | Just 1
| | Just 0
| | | Just 2
| | | Just 0
| Nothing
| | Just 0
| | | Just 3
| | | Just 0
| | Just 0
| | | Just 4
| | | Just 0
| | | | Just 5
| | | | Just 0
| | | | | Just 6
| | | | | Just 0

> showHetero $ ghomDyn test_list
  
[[1,2],[3],[4,5,6]]
  [1,2]
    1
    [2]
      2
      []
  [[3],[4,5,6]]
    [3]
      3
      []
    [[4,5,6]]
      [4,5,6]
        4
        [5,6]
          5
          [6]
            6
            []
      []

> showHetero $ filterHetero (/=(3::Int)) $ ghomDyn test_list
  
[[1,2],[3],[4,5,6]]
  1
  2
  4
  5
  6

> showBi $ heteroToBi False (odd::Int->Bool) $ ghomDyn test_list
  
(<<[[Int]]>>,False)
| (<<[Int]>>,False)
| | (<<Int>>,True)
| | (<<[Int]>>,False)
| | | (<<Int>>,False)
| | | (<<[Int]>>,False)
| (<<[[Int]]>>,False)
| | (<<[Int]>>,False)
| | | (<<Int>>,True)
| | | (<<[Int]>>,False)
| | (<<[[Int]]>>,False)
| | | (<<[Int]>>,False)
| | | | (<<Int>>,False)
| | | | (<<[Int]>>,False)
| | | | | (<<Int>>,True)
| | | | | (<<[Int]>>,False)
| | | | | | (<<Int>>,False)
| | | | | | (<<[Int]>>,False)
| | | (<<[[Int]]>>,False)

> showBi $ ghomBi (mkQ False (odd::Int->Bool)) test_list
  
(<<[[Int]]>>,False)
| (<<[Int]>>,False)
| | (<<Int>>,True)
| | (<<[Int]>>,False)
| | | (<<Int>>,False)
| | | (<<[Int]>>,False)
| (<<[[Int]]>>,False)
| | (<<[Int]>>,False)
| | | (<<Int>>,True)
| | | (<<[Int]>>,False)
| | (<<[[Int]]>>,False)
| | | (<<[Int]>>,False)
| | | | (<<Int>>,False)
| | | | (<<[Int]>>,False)
| | | | | (<<Int>>,True)
| | | | | (<<[Int]>>,False)
| | | | | | (<<Int>>,False)
| | | | | | (<<[Int]>>,False)
| | | (<<[[Int]]>>,False)

> showBi $ filterBi id $ ghomBi (mkQ False (odd::Int->Bool)) test_list
  
(<<[[Int]]>>,False)
| (<<Int>>,True)
| (<<Int>>,True)
| (<<Int>>,True)

> showHomo $ biToHomo $ filterBi id $ ghomBi (mkQ False (odd::Int->Bool)) test_list
  
False
| True
| True
| True

> let f (x::Int) = if odd x then Just x else Nothing

> showHomo $ ghom (mkQ Nothing f) test_list
  
Nothing
| Nothing
| | Just 1
| | Nothing
| | | Nothing
| | | Nothing
| Nothing
| | Nothing
| | | Just 3
| | | Nothing
| | Nothing
| | | Nothing
| | | | Nothing
| | | | Nothing
| | | | | Just 5
| | | | | Nothing
| | | | | | Nothing
| | | | | | Nothing
| | | Nothing

> showHomo $ filterHomoMM $ ghom (mkQ Nothing f) test_list
  
Nothing
| Just 1
| Nothing
| | Just 3
| | Just 5

> showHomo $ unliftHomoM 0 $ filterHomoMM $ ghom (mkQ Nothing f) test_list
  
0
| 1
| 0
| | 3
| | 5

-------------------------------------------------------

data TA = A1 | A2 TB TA TB
data TB = B TA
exprAB = A2 (B A1) A1 (B A1)

data TC = C1 Float (Int,Int) | C2 TD TC TD | C3 TC
data TD = D TC
exprCD = C2 (D (C1 1.1 (4,5))) (C3 (C1 2.2 (6,7))) (D (C1 3.3 (8,9)))

data TE = E1 String | E2 (Int,Int) TF
data TF = F TE String
exprEF = E2 (2,5) (F (E1 "foo") "bar")

> showAsParens $ shapeOf exprAB
  
((•)•(•))

> showAsParens $ shapeOf exprCD
  
(((•(••)))((•(••)))((•(••))))

> showAsParens $ shapeOf exprEF
  
((••)(((•(•(••))))(•(•(••)))))

> showAsParens $ shapeOf_ exprEF
  
((••)((•)•))

> showAsParensEnriched $ weightedShapeOf_ exprEF
  
(12(3(1)(1))(8(4(3))(3)))

> show $ ( ( unGhomDyn $ ghomDyn exprEF ) :: TE )
  
E2 (2,5) (F (E1 "foo") "bar")

> showHomo $ ( gempty exprEF :: BiM Int)
  
(<<TE>>,Nothing)
| (<<(Int,Int)>>,Nothing)
| | (<<Int>>,Nothing)
| | (<<Int>>,Nothing)
| (<<TF>>,Nothing)
| | (<<TE>>,Nothing)
| | | (<<[Char]>>,Nothing)
| | | | (<<Char>>,Nothing)
| | | | (<<[Char]>>,Nothing)
| | | | | (<<Char>>,Nothing)
| | | | | (<<[Char]>>,Nothing)
| | | | | | (<<Char>>,Nothing)
| | | | | | (<<[Char]>>,Nothing)
| | (<<[Char]>>,Nothing)
| | | (<<Char>>,Nothing)
| | | (<<[Char]>>,Nothing)
| | | | (<<Char>>,Nothing)
| | | | (<<[Char]>>,Nothing)
| | | | | (<<Char>>,Nothing)
| | | | | (<<[Char]>>,Nothing)


Progressive refinement and accumulation:

> (showBi $
     ( grefine
         (\ x -> case x of { E2 (y,z) _ -> Just (z+3)
                           ; _ -> Nothing })
         ( gempty exprEF :: BiM Int)
     )
  )

(<<TE>>,Just 8)
| (<<(Int,Int)>>,Nothing)
| | (<<Int>>,Nothing)
| | (<<Int>>,Nothing)
| (<<TF>>,Nothing)
| | (<<TE>>,Nothing)
| | | (<<[Char]>>,Nothing)
| | | | (<<Char>>,Nothing)
| | | | (<<[Char]>>,Nothing)
| | | | | (<<Char>>,Nothing)
| | | | | (<<[Char]>>,Nothing)
| | | | | | (<<Char>>,Nothing)
| | | | | | (<<[Char]>>,Nothing)
| | (<<[Char]>>,Nothing)
| | | (<<Char>>,Nothing)
| | | (<<[Char]>>,Nothing)
| | | | (<<Char>>,Nothing)
| | | | (<<[Char]>>,Nothing)
| | | | | (<<Char>>,Nothing)
| | | | | (<<[Char]>>,Nothing)

> (showBi $
    ( gaccum
        ((\r1 r2 -> r1+r2) :: Int -> Int -> Int)
        (\ x -> case x of { E1 s -> Just (length s)
                          ; _ -> Nothing })
        ( grefine
            (\ x -> case x of { E2 (y,z) _ -> Just (z+3)
                              ; _ -> Nothing })
            ( gempty exprEF :: BiM Int)
        )
    )
  )

(<<TE>>,Just 8)
| (<<(Int,Int)>>,Nothing)
| | (<<Int>>,Nothing)
| | (<<Int>>,Nothing)
| (<<TF>>,Nothing)
| | (<<TE>>,Just 3)
| | | (<<[Char]>>,Nothing)
| | | | (<<Char>>,Nothing)
| | | | (<<[Char]>>,Nothing)
| | | | | (<<Char>>,Nothing)
| | | | | (<<[Char]>>,Nothing)
| | | | | | (<<Char>>,Nothing)
| | | | | | (<<[Char]>>,Nothing)
| | (<<[Char]>>,Nothing)
| | | (<<Char>>,Nothing)
| | | (<<[Char]>>,Nothing)
| | | | (<<Char>>,Nothing)
| | | | (<<[Char]>>,Nothing)
| | | | | (<<Char>>,Nothing)
| | | | | (<<[Char]>>,Nothing)


Testing filterHomoM and filterBiM:

> show test_list

[[1,2],[3],[4,5,6]]

> showHomo $ filterHomoM odd $ ghom (mkQ 0 (id::Int->Int)) test_list

Nothing
| Just 1
| Nothing
| | Just 3
| | Just 5

> showBi $ filterBiM odd $ ghomBi (mkQ 0 (id::Int->Int)) test_list

(<<[[Int]]>>,Nothing)
| (<<Int>>,Just 1)
| (<<[[Int]]>>,Nothing)
| | (<<Int>>,Just 3)
| | (<<Int>>,Just 5)


Testing abstract datatype:

> show exprN

fromList [("",1.1),("pdsfhp",3.3),("sfv",2.2)]

> show $ Map.toList exprN

[("",1.1),("pdsfhp",3.3),("sfv",2.2)]

> showHomo $ shapeOf_ exprN

()
| ()
| | ()
| | | ()
| | | ()
| | ()
| | | ()
| | | | ()
| | | | ()
| | | ()
| | | | ()
| | | | | ()
| | | | | ()
| | | | ()

> showHomoWhen (>0) $ ghomP (mkQ False (\ (_::String) -> True)) (mkQ 0 (\ (x::Float) -> x) `extQ` (\ (_::String) -> 0)) exprN

.
| .
| | .
| | | .
| | | 1.1
| | .
| | | .
| | | | .
| | | | 3.3
| | | .
| | | | .
| | | | | .
| | | | | 2.2
| | | | .

> showHomoWhen (>0) $ ghom (mkQ 0 (\ (x::Float) -> x)) exprN

.
| .
| | .
| | | .
| | | 1.1
| | .
| | | .
| | | | .
| | | | | .
| | | | | .
| | | | | | .
| | | | | | .
| | | | | | | .
| | | | | | | .
| | | | | | | | .
| | | | | | | | .
| | | | | | | | | .
| | | | | | | | | .
| | | | | | | | | | .
| | | | | | | | | | .
| | | | 3.3
| | | .
| | | | .
| | | | | .
| | | | | | .
| | | | | | .
| | | | | | | .
| | | | | | | .
| | | | | | | | .
| | | | | | | | .
| | | | | 2.2
| | | | .

> showHomo $ filterHomo (>0) $ ghom (mkQ 0 (\ (x::Float) -> x)) exprN

0.0
| 1.1
| 3.3
| 2.2

> showHomo $ filterHomoM (>0) $ ghom (mkQ 0 (\ (x::Float) -> x)) exprN

Nothing
| Nothing
| | Just 1.1
| | Nothing
| | | Just 3.3
| | | Just 2.2


I have also tested this on GHC parse trees (the motivating problem), and it works fine, but there is a catch. These results are reported here.

So, for what it's worth, I was able to implement the generic homomorphisms. Probably this is already supported by some library, but I was unlucky in my search.

Andrew Seniuk
June 12, 2014
rasfar@gmail.com