ghc-9.8.0.20230929: The GHC API
Safe HaskellNone
LanguageHaskell2010

GHC.Stg.CSE

Description

Note [CSE for Stg] ~~~~~~~~~~~~~~~~~~

This module implements a simple common subexpression elimination pass for STG. This is useful because there are expressions that we want to common up (because they are operationally equivalent), but that we cannot common up in Core, because their types differ. This was originally reported as #9291.

There are two types of common code occurrences that we aim for, see Note [Case 1: CSEing allocated closures] and Note [Case 2: CSEing case binders] below.

Note [Case 1: CSEing allocated closures] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The first kind of CSE opportunity we aim for is generated by this Haskell code:

bar :: a -> (Either Int a, Either Bool a) bar x = (Right x, Right x)

which produces this Core:

bar :: forall a. a -> (Either Int a, Either Bool a) bar a x = (Right Int a x, Right Bool @a x)

where the two components of the tuple are different terms, and cannot be commoned up (easily). On the STG level we have

bar [x] = let c1 = Right [x] c2 = Right [x] in (c1,c2)

and now it is obvious that we can write

bar [x] = let c1 = Right [x] in (c1,c1)

instead.

Note [Case 2: CSEing case binders] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The second kind of CSE opportunity we aim for is more interesting, and came up in #9291 and #5344: The Haskell code

foo :: Either Int a -> Either Bool a foo (Right x) = Right x foo _ = Left False

produces this Core

foo :: forall a. Either Int a -> Either Bool a foo a e = case e of b { Left n -> … , Right x -> Right Bool @a x }

where we cannot CSE `Right Bool a x` with the case binder b as they have different types. But in STG we have

foo [e] = case e of b { Left [n] -> … , Right [x] -> Right [x] }

and nothing stops us from transforming that to

foo [e] = case e of b { Left [n] -> … , Right [x] -> b}

Note [StgCse after unarisation] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Consider two unboxed sum terms:

(# 1 | #) :: (# Int | Int# #) (# 1 | #) :: (# Int | Int #)

These two terms are not equal as they unarise to different unboxed tuples. However if we run StgCse before Unarise, it'll think the two terms (# 1 | #) are equal, and replace one of these with a binder to the other. That's bad -- #15300.

Solution: do unarise first.

Documentation