Free Sections

Haskell Syntax Extension

by Andrew Seniuk

The reader is strongly encouraged to navigate to fremissant.net/freesect as these documents are currently experiencing rapid growth and refinement. The local files represent the best documentation available at packaging of this v0.7 source distribution (March 15, 2012).

Free sections are syntactic sugar to extend the usual notion of "section" in functional programming. Recall that a section is just a function which has been applied to some but not all of its arguments. Unless the function is modified, arguments must be supplied in the lexical order determined by the function binding's parameters. For instance, f 4 True is partially applied, but is still a function type and can absorb two more arguments of types Char and Float, in that order.

  f :: Int -> Bool -> Char -> Float -> String
  f 4 True         -- Char -> Float -> String

By using wildcard symbols (__, a double-underscore), any subset of values occurring in a RHS context can be deferred, in the same spirit as partial function application. When necessary, special brackets (_[…]_) are used to delimit the lexical extent (or context) of the section, although these can usually be omitted (details below). The free section is then a function with arity at least equal to the number of wildcards, and behaves as if a helper function had been defined which shuffled the argument order of the functional expression, turning it into a normal section.

Philosophically, use of this sort of syntax promotes "higher-order programming", since any expression can so easily be made into a function, in numerous ways, simply by replacing parts of it with freesect wildcards. That this is worthwhile is demonstrated by the frequent usefulness of sections.

The complete table of Haskell expression translations probably forms the best set of examples I have to offer. Some other examples, including illustrations of context inferencing:

  map  _[ (+) __ 2 ]_  [1,2,3]  =  [3,4,5]
  map   ( (+) __ 2 )   [1,2,3]  =  [3,4,5]
  zipWith $ f __ $ g __ z       =  zipWith _[ f __ $ g __ z ]_

For a more complex example, consider a situation where some API provides you with the dreaded dreadme function, which you have need of using. The example is designed to show that you can really put the __'s anywhere on a RHS that an expression can go.

  {-# LANGUAGE FreeSections #-}  -- use "{- #" if compiling with ghc -F

  tableA = [ [1,2,3], [4,5], [6,7,8,9] ]
  tableB = [ [9,8,7], [6,5], [4,3,2,1] ]

  data D a = D Int [a]

  v = map3  ( dreadme tableA __ (repeat True,__) (D __ [(+),(*)]) )
            [tableA,tableB]
            [0..]
            [-1,1]

FreeSect rewrites the RHS of v (where fs0…fs2 must be fresh identifiers) to:

  v = map3  (\ fs0 fs1 fs2 -> dreadme tableA fs0 (repeat True,fs1) (D fs2 [(+),(*)]) )
            [tableA,tableB]
            [0..]
            [-1,1]

(Here is the rest of the code to make a closed module.)

  dreadme :: [[Int]] -> [[Int]] -> ([Bool],Int) -> D (Int->Int->Int) -> String
  dreadme tabA tabB (ps,q) (D wt flst)
    = show $ map (!!q) $ map2 f tabA tabB
    where f rowA rowB =   map2 (<) (repeat rowA)
                        $ map2' (filter' ps flst) rowB (map ((q+).(wt*)) rowB)
(base cases)  filter' :: [Bool] -> [a] -> [a]
  filter' (p:ps) (x:xs) = if p then x:filter' ps xs
                               else   filter' ps xs
  map2 :: (a->b->c) -> [a] -> [b] -> [c]
  map2 f (x:xs) (y:ys) = (f x y) : map2 f xs ys
  map2' :: [a->b->c] -> [a] -> [b] -> [[c]]
  map2' (f:fs) xs ys = (map2 f xs ys) : map2' fs xs ys
  map3 :: (a->b->c->d) -> [a] -> [b] -> [c] -> [d]
  map3 f (x:xs) (y:ys) (z:zs) = (f x y z) : map3 f xs ys zs
  filter' :: [Bool] -> [a] -> [a]
  filter' (p:ps) (x:xs) = if p then x:filter' ps xs
                               else   filter' ps xs
  map2 :: (a->b->c) -> [a] -> [b] -> [c]
  map2 f _ [] = []
  map2 f [] _ = []
  map2 f (x:xs) (y:ys) = (f x y) : map2 f xs ys
  map2' :: [a->a->a] -> [a] -> [a] -> [[a]]
  map2' [] _ _ = []
  map2' _ [] _ = []
  map2' _ _ [] = []
  map2' (f:fs) xs ys = (map2 f xs ys) : map2' fs xs ys
  map3 :: (a->b->c->d) -> [a] -> [b] -> [c] -> [d]
  map3 f _ _ [] = []
  map3 f _ [] _ = []
  map3 f [] _ _ = []
  map3 f (x:xs) (y:ys) (z:zs) = (f x y z) : map3 f xs ys zs
  main = print v

Some virtues of the new syntax

Although the rewrite provided by the extension is simple, there are advantages of free sections relative to explicitly written lambdas:

the lambda forces the programmer to invent names for the wildcards
the lambda forces the programmer to repeat those names, and place them correctly
freesect wildcards stand out vividly, indicating where the awaited expressions will go
reading the lambda requires visual pattern-matching between left and right sides of the lambda
the lambda is longer overall, and prefaces the expression of interest with something like micro-boilerplate

On the other hand, the lambda is more powerful:

it can achieve arbitrary permutations without further ado
it is more expressive when nesting is involved, because the variables are not anonymous

Implementation

The implementation seems to be working. The freesect package can also be installed from Hackage, for instance using cabal-install. I offer fair warning that I'm not a good Haskell coder, but this does demonstrate how a person of modest abilities can use SYB with HSE to create a robust Haskell syntax extension.

Default context inferencing

Note: The following policy is flawed as pointed out by <ski> on #haskell, since

  <ski> this clearly means that `(f __) a' (meaning `(\x -> f x) a') is
        different from `f __ a' (meaning `\x -> f x a')

A default context is automatically applied when the _[…]_ grouping syntax is omitted. Defining this default can be tricky, and a bit arbitrary, so for robust code it's best to use explicit bracketing. As of v0.6, when the _[…]_ are omitted, the defaulting rules are as follows: The semilattice join of all unbacketed wildcards in a RHS are given context of the innermost enclosing parentheses or infix $ operation. If no such parentheses or infix $ are present, the whole RHS becomes the default context. Some of the examples demonstrate these rules. One hopes that the default will result in a type error, should the inferred context differ from the intended. I haven't thought of any situation where more than one bracketing of freesect wildcards yield typeable expressions, but neither have I ruled it out. A context inference policy based on typeability tests among various possible scopes would also be possible, although probably ill-advised.

Background

There was discussion about this on the #haskell IRC channel. Strangely, <eyebloom> suggested the same thing an hour after I had started making inquiries about how to go about writing an extension -- and this is the one I had in mind. It's an extraordinary coincidence, as I first sketched the idea in 2003, and had never seen it mentioned per se, beyond discussions of point-free and the like. (Despite <eyebloom> asking to be contacted if I wanted to work on this, my email was never answered; I didn't run off with their idea!) The principal example from that chat is described here.

  <dolio> map (foo (g _) x) (h _)
  <dolio> Matter of fact, what does "map (foo (g _) x) (h _)" mean?

Given the present context inferencing policy,

  map (foo (g __) x) (h __)  =  \ a b -> map (foo (g a) x) (h b)

Outer parentheses might also be necessary, depending on the larger context of the expression. (The same is true for some of the others below, if the full expression extends further to the right.) This interpretation may not be what the programmer intended, in which case at least a $ is needed

  map $ (foo (g __) x) (h __)  =  map $ \ a b -> (foo (g a) x) (h b)

Finally, if two separate free sections were intended, the programmer would need to use _[…]_ bracketing for at least one of them

     map  _[foo (g __) x]_   _[h __]_
  =  map   (foo (g __) x)    _[h __]_
  =  map  _[foo (g __) x]_    (h __)
  =  map  _[foo (g __) x]_  $  h __

all of which are rewritten by FreeSect to

  map (\a->foo (g a) x) (\b->h b)

This interpretation cannot by typed, since the second argument of map must be have kind *, while \b->h b has kind *->*.

Other details

Nested free sections work correctly, thanks to bottom-up traversal. Note that, as wildcards are syntactically indistinguishable, it is impossible for an enclosing freesect to have a wildcard inside any enclosed freesect.

Fresh identifiers generated by FreeSect are guaranteed to be unique among names referenced within a module. There is no danger of exporting these names inadvertently, as they never have top-level scope. Imported names which are in scope but never used will not incur conflicts.

A few things don't work quite right yet, such as list comprehensions. Work continues...

How to run it

The extension runs as a preprocessor. This can be done with or without the knowledge of GHC. If you use ghc -F (see scripts in the download), then unfortunately GHC checks language pragmas before preprocessing, which means you cannot declare {-# LANGUAGE FreeSections #-} at the top. Although the patched HSE is capable of rejecting source which uses free sections without the pragma, this feature is disabled for interoperability with GHC. All compilers are supported by running FreeSect as an independent preprocessor, but there are some inconveniences involving temporary files.

Feedback

Please provide feedback on the FreeSect haskell-cafe mailing list thread.

Kind Reg'ds,

Andrew Seniuk

<rasfar> on #haskell

Feb. 29, 2012