Ticket #7542 (new bug)

Opened 6 months ago

Last modified 2 months ago

GHC doesn't optimize (strict) composition with id

Reported by: shachaf Owned by: simonpj
Priority: normal Milestone: 7.8.1
Component: Compiler Version: 7.6.1
Keywords: Cc: ekmett@…, jwlato@…, hackage.haskell.org@…, pho@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

Newtype constructors and selectors have no runtime overhead, but some uses of them do. For example, given newtype Identity a = Identity { runIdentity :: a }, Identity turns into id, but Identity . f turns into id . f, which is distinct from f, because it gets eta-expanded to \x -> f x.

It would be nice to be able to compose a newtype constructor with a function without any overhead. The obvious thing to try is strict composition:

(#) :: (b -> c) -> (a -> b) -> a -> c
(#) f g = f `seq` g `seq` \x -> f (g x)

In theory this should get rid of the eta-expansion. In practice, the generated Core looks like this:

foo :: (a -> b) -> [a] -> [b]
foo f = map (id # f)
-- becomes
foo = \f e -> map (case f of g { __DEFAULT -> \x -> g x }) e

Different variations of (#), and turning -fpedantic-bottoms on, don't seem to affect this. A simpler version, foo f = map (f seq \x -> f x), generates the same sort of Core.

In one library we resorted to defining a bunch of functions of the form identityDot :: (a -> b) -> a -> Identity b; identityDot = unsafeCoerce. It would be better to be able to rely on GHC to do the optimization directly, if we use strict composition anyway.

Change History

Changed 6 months ago by simonpj

  • status changed from new to infoneeded
  • difficulty set to Unknown

Can you give a concrete example? With this module

module T7542 where

newtype Id a = MkId a

f1 = map reverse

f2 = map (MkId . reverse)

compiled with ghc-7.6 -O -ddump-stg I get

==================== STG syntax: ====================

T7542.f1 :: forall a_afy. [[a_afy]] -> [[a_afy]]
[GblId, Arity=1, Str=DmdType, Unf=OtherCon []] =
    \r [eta_B1] GHC.Base.map GHC.List.reverse eta_B1;
SRT(T7542.f1): []

T7542.f2 :: forall a_afr. [[a_afr]] -> [T7542.Id [a_afr]]
[GblId, Arity=1, Str=DmdType, Unf=OtherCon []] =
    \r [eta_B1] GHC.Base.map GHC.List.reverse eta_B1;
SRT(T7542.f2): []

which looks fine to me.

Changed 6 months ago by shachaf

  • status changed from infoneeded to new

Here's an example of the sort of context this comes up in:

module T7542 where

import Unsafe.Coerce

newtype Id a = MkId { unId :: a }

-- Think of `mapped` as `mapM`, but restricted to Id (we could make it work
-- with any Functor, rather than just []). `over` takes the Id wrappers back
-- off. The goal is to make it easy to compose mapped with other functions of
-- the same form. The wrapper should be "free" because it's just newtype noise.

mapped1 :: (a -> Id b) -> [a] -> Id [b]
mapped1 f = MkId . map (unId . f)

over1 :: ((a -> Id b) -> s -> Id t) -> (a -> b) -> s -> t
over1 l f = unId . l (MkId . f)

map1 :: (a -> b) -> [a] -> [b]
map1 f xs = over1 mapped1 f xs
-- Core: map1 = \f xs -> map (\x -> f x) xs

-- over1 mapped1 = unId . MkId . map (unId . MkId . f)
--               ~ map
-- However, if f = ⊥, unId . MkId . f /= f!
-- Therefore `over1 mapped1` must turn into \f -> map (\x -> f x)
-- We can't expect GHC to compile it to `map` because it has different strictness.

-- Let's define strict versions of (MkId .) and (unId .):
mkIdDot2 :: (a -> b) -> a -> Id b
mkIdDot2 f = f `seq` \x -> MkId (f x)

unIdDot2 :: (a -> Id b) -> a -> b
unIdDot2 f = f `seq` \x -> unId (f x)

mapped2 :: (a -> Id b) -> [a] -> Id [b]
mapped2 f = mkIdDot2 (map (unIdDot2 f))

over2 :: ((a -> Id b) -> s -> Id t) -> (a -> b) -> s -> t
over2 l f = unIdDot2 (l (mkIdDot2 f))

map2 :: (a -> b) -> [a] -> [b]
map2 f xs = over2 mapped2 f xs
-- map2 should have the same semantics as map. But the Core isn't the same:
-- Without -fpedantic-bottoms: map2 = \f xs -> map (\e -> f e) xs
-- With -fpedantic-bottoms:
-- map2 = \f xs -> map (case f of g { __DEFAULT -> \x -> g x }) xs
-- Ideally, (case f of g { __DEFAULT -> \x -> g x }) would simply be f.

-- Let's try manually telling GHC that our newtype compositions are coercions:
-- (Ideally, this is what mkIdDot2 and unIdDot2 would compile into.)
mkIdDot3 :: (a -> b) -> a -> Id b
mkIdDot3 = unsafeCoerce

unIdDot3 :: (a -> Id b) -> a -> b
unIdDot3 = unsafeCoerce
-- (Note: Due to #7398, we couldn't define a strict composition operator and
-- rely on RULES to turn (MkId `dot`) into unsafeCoerce -- the `MkId` itself
-- gets turned into a coercion before any RULES have a chance to fire.)

mapped3 :: (a -> Id b) -> [a] -> Id [b]
mapped3 f = mkIdDot3 (map (unIdDot3 f))

over3 :: ((a -> Id b) -> s -> Id t) -> (a -> b) -> s -> t
over3 l f = unIdDot3 (l (mkIdDot3 f))

map3 :: (a -> b) -> [a] -> [b]
map3 f xs = over3 mapped3 f xs
-- Core: map3 = map

Changed 5 months ago by ekmett

  • cc ekmett@… added

Changed 5 months ago by jwlato

  • cc jwlato@… added

Changed 5 months ago by liyang

  • cc hackage.haskell.org@… added

Changed 5 months ago by PHO

  • cc pho@… added

Changed 5 months ago by simonpj@…

commit 35f1fc957d152c520c90c6bd2330266e57578eb2

Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Tue Jan 22 22:46:33 2013 +0000

    Allow CaseElim if the case binder is the next thing to be eval'd
    
    This makes CaseElim happen a bit more often.
    See Note [Case binder next] in Simplify.
    This came up when investigating Trac #7542.

 compiler/simplCore/Simplify.lhs |   32 ++++++++++++++++++++++++++------
 1 files changed, 26 insertions(+), 6 deletions(-)

Changed 5 months ago by simonpj

Also this:

commit 7a1480c7c590d4d2fa7a105a4eebf299e921e056
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Tue Jan 22 22:43:22 2013 +0000

    Allow eta-reduction of eval'd functions if of arity 1
    
    See Note [Eta reduction of an eval'd function] in CoreUtils.
    This doesn't fix Trac #7542, but that was the ticket that
    pointed out this infelicity.

>---------------------------------------------------------------

 compiler/coreSyn/CoreUtils.lhs |   24 ++++++++++++++++++++++--
 1 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/compiler/coreSyn/CoreUtils.lhs b/compiler/coreSyn/CoreUtils.lhs index 7017f70..9b527e7 100644
--- a/compiler/coreSyn/CoreUtils.lhs
+++ b/compiler/coreSyn/CoreUtils.lhs
@@ -1712,8 +1712,14 @@ tryEtaReduce bndrs body
 
     ---------------
     fun_arity fun             -- See Note [Arity care]
-       | isLocalId fun && isStrongLoopBreaker (idOccInfo fun) = 0
-       | otherwise = idArity fun
+       | isLocalId fun
+       , isStrongLoopBreaker (idOccInfo fun) = 0
+       | arity > 0                           = arity
+       | isEvaldUnfolding (idUnfolding fun)  = 1  
+            -- See Note [Eta reduction of an eval'd function]
+       | otherwise                           = 0
+       where
+         arity = idArity fun
 
     ---------------
     ok_lam v = isTyVar v || isEvVar v
@@ -1737,6 +1743,20 @@ tryEtaReduce bndrs body
     ok_arg _ _ _ = Nothing
 \end{code}
 
+Note [Eta reduction of an eval'd function] 
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+In Haskell is is not true that    f = \x. f x
+because f might be bottom, and 'seq' can distinguish them.
+
+But it *is* true that   f = f `seq` \x. f x 
+and we'd like to simplify the latter to the former.  This amounts to 
+the rule that
+  * when there is just *one* value argument,
+  * f is not bottom
+we can eta-reduce    \x. f x  ===>  f
+
+This turned up in Trac #7542.  

Changed 5 months ago by simonpj

The previous two commits improve the situation a bit; both were triggered by looking at the code from this example, thanks.

I'm far from sure that we get perfect code now. But I'm also convinced that this is the wrong way to solve this problem: check out NewtypeWrappers.

Meanwhile do have a go with HEAD and see if you get better code now.

Simon

Changed 2 months ago by igloo

  • owner set to simonpj
  • milestone set to 7.8.1

Simon, should this ticket be closed now?

Note: See TracTickets for help on using tickets.