# Ticket #7542 (new bug)

Opened 6 months ago

## GHC doesn't optimize (strict) composition with id

### Description

Newtype constructors and selectors have no runtime overhead, but some uses of them do. For example, given newtype Identity a = Identity { runIdentity :: a }, Identity turns into id, but Identity . f turns into id . f, which is distinct from f, because it gets eta-expanded to \x -> f x.
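To make the semantic difference concrete, here is a minimal sketch (not part of the original report) that uses seq to distinguish f from Identity . f when f is bottom. The helper isBottom is a hypothetical name introduced only for this illustration.

```haskell
import Control.Exception (SomeException, evaluate, try)

newtype Identity a = Identity { runIdentity :: a }

-- Hypothetical helper: True when forcing the argument to WHNF throws.
isBottom :: a -> IO Bool
isBottom x = do
  r <- try (evaluate (x `seq` ())) :: IO (Either SomeException ())
  case r of
    Left _  -> pure True
    Right _ -> pure False

main :: IO ()
main = do
  let f = undefined :: Int -> Int
  isBottom f >>= print               -- True: f itself is bottom
  isBottom (Identity . f) >>= print  -- False: the composition is a lambda
  isBottom (\x -> f x) >>= print     -- False: the eta-expanded form is a lambda too
```

Because seq can tell the two apart, the compiler is not free to rewrite Identity . f to f in general.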

It would be nice to be able to compose a newtype constructor with a function without any overhead. The obvious thing to try is strict composition:

(#) :: (b -> c) -> (a -> b) -> a -> c
(#) f g = f `seq` g `seq` \x -> f (g x)
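As a rough check of the intended semantics, the sketch below compares lazy and strict composition on a bottom argument. It assumes the module is built with -fpedantic-bottoms (set here via a pragma), since without that flag GHC may eta-expand through the case at higher optimization levels and change what seq observes. isBottom is a hypothetical helper, not something from the ticket.

```haskell
{-# OPTIONS_GHC -fpedantic-bottoms #-}
import Control.Exception (SomeException, evaluate, try)

-- Strict composition: force both functions before building the lambda.
(#) :: (b -> c) -> (a -> b) -> a -> c
(#) f g = f `seq` g `seq` \x -> f (g x)

-- Hypothetical helper: True when forcing the argument to WHNF throws.
isBottom :: a -> IO Bool
isBottom x = do
  r <- try (evaluate (x `seq` ())) :: IO (Either SomeException ())
  case r of
    Left _  -> pure True
    Right _ -> pure False

main :: IO ()
main = do
  let f = undefined :: Int -> Int
  isBottom (id . f) >>= print  -- lazy composition hides the bottom
  isBottom (id # f) >>= print  -- strict composition is meant to preserve it
  print ((id # (+ 1)) 41)      -- on defined functions it behaves like (.)
```

With strict composition, id # f and f agree even when f is bottom, which is what should make dropping the wrapper a valid optimization.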


In theory this should get rid of the eta-expansion. In practice, the generated Core looks like this:

foo :: (a -> b) -> [a] -> [b]
foo f = map (id # f)
-- becomes
foo = \f e -> map (case f of g { __DEFAULT -> \x -> g x }) e


Different variations of (#), and turning -fpedantic-bottoms on, don't seem to affect this. A simpler version, foo f = map (f `seq` \x -> f x), generates the same sort of Core.

In one library we resorted to defining a bunch of functions of the form identityDot :: (a -> b) -> a -> Identity b; identityDot = unsafeCoerce. It would be better to be able to rely on GHC to do the optimization directly, if we use strict composition anyway.

Can you give a concrete example? With this module

module T7542 where

newtype Id a = MkId a

f1 = map reverse

f2 = map (MkId . reverse)


compiled with ghc-7.6 -O -ddump-stg I get

==================== STG syntax: ====================

T7542.f1 :: forall a_afy. [[a_afy]] -> [[a_afy]]
[GblId, Arity=1, Str=DmdType, Unf=OtherCon []] =
\r [eta_B1] GHC.Base.map GHC.List.reverse eta_B1;
SRT(T7542.f1): []

T7542.f2 :: forall a_afr. [[a_afr]] -> [T7542.Id [a_afr]]
[GblId, Arity=1, Str=DmdType, Unf=OtherCon []] =
\r [eta_B1] GHC.Base.map GHC.List.reverse eta_B1;
SRT(T7542.f2): []


which looks fine to me.

Here's an example of the sort of context this comes up in:

module T7542 where

import Unsafe.Coerce

newtype Id a = MkId { unId :: a }

-- Think of mapped as mapM, but restricted to Id (we could make it work
-- with any Functor, rather than just []). over takes the Id wrappers back
-- off. The goal is to make it easy to compose mapped with other functions of
-- the same form. The wrapper should be "free" because it's just newtype noise.

mapped1 :: (a -> Id b) -> [a] -> Id [b]
mapped1 f = MkId . map (unId . f)

over1 :: ((a -> Id b) -> s -> Id t) -> (a -> b) -> s -> t
over1 l f = unId . l (MkId . f)

map1 :: (a -> b) -> [a] -> [b]
map1 f xs = over1 mapped1 f xs
-- Core: map1 = \f xs -> map (\x -> f x) xs

-- over1 mapped1 = unId . MkId . map (unId . MkId . f)
--               ~ map
-- However, if f = ⊥, unId . MkId . f /= f!
-- Therefore over1 mapped1 must turn into \f -> map (\x -> f x)
-- We can't expect GHC to compile it to map because it has different strictness.

-- Let's define strict versions of (MkId .) and (unId .):
mkIdDot2 :: (a -> b) -> a -> Id b
mkIdDot2 f = f `seq` \x -> MkId (f x)

unIdDot2 :: (a -> Id b) -> a -> b
unIdDot2 f = f `seq` \x -> unId (f x)

mapped2 :: (a -> Id b) -> [a] -> Id [b]
mapped2 f = mkIdDot2 (map (unIdDot2 f))

over2 :: ((a -> Id b) -> s -> Id t) -> (a -> b) -> s -> t
over2 l f = unIdDot2 (l (mkIdDot2 f))

map2 :: (a -> b) -> [a] -> [b]
map2 f xs = over2 mapped2 f xs
-- map2 should have the same semantics as map. But the Core isn't the same:
-- Without -fpedantic-bottoms: map2 = \f xs -> map (\e -> f e) xs
-- With -fpedantic-bottoms:
-- map2 = \f xs -> map (case f of g { __DEFAULT -> \x -> g x }) xs
-- Ideally, (case f of g { __DEFAULT -> \x -> g x }) would simply be f.

-- Let's try manually telling GHC that our newtype compositions are coercions:
-- (Ideally, this is what mkIdDot2 and unIdDot2 would compile into.)
mkIdDot3 :: (a -> b) -> a -> Id b
mkIdDot3 = unsafeCoerce

unIdDot3 :: (a -> Id b) -> a -> b
unIdDot3 = unsafeCoerce
-- (Note: Due to #7398, we couldn't define a strict composition operator and
-- rely on RULES to turn (MkId `dot`) into unsafeCoerce -- the MkId itself
-- gets turned into a coercion before any RULES have a chance to fire.)

mapped3 :: (a -> Id b) -> [a] -> Id [b]
mapped3 f = mkIdDot3 (map (unIdDot3 f))

over3 :: ((a -> Id b) -> s -> Id t) -> (a -> b) -> s -> t
over3 l f = unIdDot3 (l (mkIdDot3 f))

map3 :: (a -> b) -> [a] -> [b]
map3 f xs = over3 mapped3 f xs
-- Core: map3 = map


commit 35f1fc957d152c520c90c6bd2330266e57578eb2

Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Tue Jan 22 22:46:33 2013 +0000

Allow CaseElim if the case binder is the next thing to be eval'd

This makes CaseElim happen a bit more often.
See Note [Case binder next] in Simplify.
This came up when investigating Trac #7542.

compiler/simplCore/Simplify.lhs |   32 ++++++++++++++++++++++++++------
1 files changed, 26 insertions(+), 6 deletions(-)


Also this:

commit 7a1480c7c590d4d2fa7a105a4eebf299e921e056
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Tue Jan 22 22:43:22 2013 +0000

Allow eta-reduction of eval'd functions if of arity 1

See Note [Eta reduction of an eval'd function] in CoreUtils.
This doesn't fix Trac #7542, but that was the ticket that
pointed out this infelicity.

>---------------------------------------------------------------

compiler/coreSyn/CoreUtils.lhs |   24 ++++++++++++++++++++++--
1 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/compiler/coreSyn/CoreUtils.lhs b/compiler/coreSyn/CoreUtils.lhs
index 7017f70..9b527e7 100644
--- a/compiler/coreSyn/CoreUtils.lhs
+++ b/compiler/coreSyn/CoreUtils.lhs
@@ -1712,8 +1712,14 @@ tryEtaReduce bndrs body

---------------
fun_arity fun             -- See Note [Arity care]
-       | isLocalId fun && isStrongLoopBreaker (idOccInfo fun) = 0
-       | otherwise = idArity fun
+       | isLocalId fun
+       , isStrongLoopBreaker (idOccInfo fun) = 0
+       | arity > 0                           = arity
+       | isEvaldUnfolding (idUnfolding fun)  = 1
+            -- See Note [Eta reduction of an eval'd function]
+       | otherwise                           = 0
+       where
+         arity = idArity fun

---------------
ok_lam v = isTyVar v || isEvVar v
@@ -1737,6 +1743,20 @@ tryEtaReduce bndrs body
ok_arg _ _ _ = Nothing
\end{code}

+Note [Eta reduction of an eval'd function]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+In Haskell is is not true that    f = \x. f x
+because f might be bottom, and 'seq' can distinguish them.
+
+But it *is* true that   f = f `seq` \x. f x
+and we'd like to simplify the latter to the former.  This amounts to
+the rule that
+  * when there is just *one* value argument,
+  * f is not bottom
+we can eta-reduce    \x. f x  ===>  f
+
+This turned up in Trac #7542.


The previous two commits improve the situation a bit; both were triggered by looking at the code from this example, thanks.

I'm far from sure that we get perfect code now. But I'm also convinced that this is the wrong way to solve this problem: check out NewtypeWrappers.

Meanwhile do have a go with HEAD and see if you get better code now.

Simon
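For readers on later compilers: to my understanding, the NewtypeWrappers design fed into Data.Coerce, available from GHC 7.8 onwards. The following is only a sketch under that assumption, showing how the unsafeCoerce helpers from the example above might be written with coerce instead. The names mkIdDot4, unIdDot4, mapped4, over4 and map4 are hypothetical, chosen to follow the numbering in the example. Whether the resulting Core reduces to plain map is worth confirming with -ddump-simpl rather than taking on faith.

```haskell
import Data.Coerce (coerce)

newtype Id a = MkId { unId :: a }

-- Type-checked coercions in place of unsafeCoerce; these rely on the
-- MkId constructor being in scope.
mkIdDot4 :: (a -> b) -> a -> Id b
mkIdDot4 = coerce

unIdDot4 :: (a -> Id b) -> a -> b
unIdDot4 = coerce

mapped4 :: (a -> Id b) -> [a] -> Id [b]
mapped4 f = mkIdDot4 (map (unIdDot4 f))

over4 :: ((a -> Id b) -> s -> Id t) -> (a -> b) -> s -> t
over4 l f = unIdDot4 (l (mkIdDot4 f))

map4 :: (a -> b) -> [a] -> [b]
map4 f xs = over4 mapped4 f xs

main :: IO ()
main = print (map4 (+ 1) [1, 2, 3 :: Int])  -- prints [2,3,4]
```

Unlike unsafeCoerce, coerce is rejected by the type checker when the two types do not share a representation, so a wrong signature on a helper like mkIdDot4 becomes a compile error instead of a runtime crash.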

Simon, should this ticket be closed now?

