Ticket #2273 (new bug)

Opened 5 years ago

Last modified 4 months ago

inlining defeats seq

Reported by: igloo Owned by:
Priority: lowest Milestone: 7.6.2
Component: Compiler Version: 6.9
Keywords: Cc: ezyang@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

Consider this module:

module Q (tcExtendIdEnv2) where

-- Interesting code:

tcExtendIdEnv2 :: M a
tcExtendIdEnv2 = do env <- getEnv
                    let level :: Int
                        level = thLevel (tcl_th_ctxt env)
                    level `seq` tc_extend_local_id_env level

{-# NOINLINE tc_extend_local_id_env #-}
tc_extend_local_id_env :: Int -> M a
tc_extend_local_id_env th_lvl = if read "foo"
                                then th_lvl `seq` return undefined
                                else return undefined

thLevel :: ThStage -> Int
thLevel Comp       = 0
thLevel (Splice l) = l
thLevel (Brack l)  = l

-- Dull code:

type M a = IOEnv TcLclEnv a

data TcLclEnv = TcLclEnv { tcl_th_ctxt :: !ThStage }

data ThStage = Comp | Splice Int | Brack  Int

getEnv :: IOEnv env env
getEnv = IOEnv (\ env -> return env)

newtype IOEnv env a = IOEnv { unIOEnv :: env -> IO a }

instance Monad (IOEnv m) where
    IOEnv m >>= f = IOEnv (\ env -> do r <- m env
                                       unIOEnv (f r) env )
    return a = IOEnv (\ _ -> return a)
    fail = error

Compiling with

ghc -O -ddump-simpl -ddump-stg -c Q.hs

we get, in the STG,

Q.$wa =
    \r srt:(0,*bitmap*) [ww_sDx w_sDO]
        case
            case ww_sDx of wild_sEs {
              Q.Comp -> Q.lvl;
              Q.Splice l_sDA -> l_sDA;
              Q.Brack l_sDC -> l_sDC;
            }
        of
        tpl_sEt
        { GHC.Base.I# ipv_sEu ->
              let { sat_sDN = NO_CCS Q.TcLclEnv! [ww_sDx]; } in
              let {
                sat_sDL =
                    \u []
                        case ww_sDx of wild_sEv {
                          Q.Comp -> Q.lvl;
                          Q.Splice l_sDH -> l_sDH;
                          Q.Brack l_sDJ -> l_sDJ;
                        };
              } in  Q.tc_extend_local_id_env sat_sDL sat_sDN w_sDO;
        };

GHC seems to have inlined level, forced it (due to the seq), but passed along a second, inlined, unevaluated copy to tc_extend_local_id_env. So the whole environment is retained, defeating the purpose of the seq!

If I mark level as NOINLINE then the STG looks like this:

Q.a5 =
    \r srt:(0,*bitmap*) [env_sD1 eta_sDh]
        case env_sD1 of tpl_sDg {
          Q.TcLclEnv ipv_sD5 ->
              case
                  case ipv_sD5 of wild_sDN {
                    Q.Comp -> Q.lvl;
                    Q.Splice l_sD8 -> l_sD8;
                    Q.Brack l_sDa -> l_sDa;
                  }
              of
              level_sDc
              { __DEFAULT ->
                    case level_sDc of tpl1_sDf {
                      GHC.Base.I# ipv1_sDO -> Q.tc_extend_local_id_env tpl1_sDf tpl_sDg eta_sDh;
                    };
              };
        };

which fixes the env-retained problem, although I don't understand why two cases are done.

It would be nice not to have to resort to this level of trickery, though!

Attachments

T2273.hs Download (1.0 KB) - added by morabbin 4 months ago.

Change History

Changed 5 years ago by simonpj

Thanks Ian -- nice report. What is happening is this. After inlining seq we get something like this:

  let level = case x of { A -> 1; B -> 2 }
  in case level of { _ -> ...level... }

Now GHC thinks that the case-expression for level is cheap (which it is), and therefore ok to duplicate, so it inlines it

  let level = case x of { A -> 1; B -> 2 }
  in case (case x of { A -> 1; B -> 2 }) of { _ -> ...level... }

Now there's only one remaining occurrence of level so that gets inlined too. Disaster.

This example has made me realise that what is really wrong here is that seq should really have a type more like

  fseq :: a -> (a -> b) -> b

Then we'd have

  level `fseq` (\level -> ...level...)

Now the inner level is nothing to do with the outer level and all will be well. So here we are saying what we mean: "evaluate level and use the evaluated version inside here".

This version of seq is just reversed strict function application, of course. Which is very like strict let. So with bang patterns we could also write

  let !level2 = level in ...level2...

This is satisfactorily explicit, but we must use a new name (level2). Perhaps that's not unreasonable.

Both fseq and a strict let could desugar to

   case level of level2 { _ -> ...level2... }

Meanwhile I think a good fix will be to change the desugarer to desugar saturated applications of seq to this same form. Not very robust to abstraction, but better than what we have now.

seq is subtler than it looks.

Changed 5 years ago by simonpj

  • owner set to igloo
  • type changed from bug to merge

Fixed by

Fri May 16 09:51:49 GMT Daylight Time 2008  simonpj@microsoft.com
  * Improve the treatment of 'seq' (Trac #2273)
  
  Trac #2273 showed a case in which 'seq' didn't cure the space leak
  it was supposed to.  This patch does two things to help
  
  a) It removes a now-redundant special case in Simplify, which
     switched off the case-binder-swap in the early stages.  This
     isn't necessary any more because FloatOut has improved since
     the Simplify code was written.  And switching off the binder-swap
     is harmful for seq.
  
  However fix (a) is a bit fragile, so I did (b) too:
  
  b) Desugar 'seq' specially.  See Note [Desugaring seq (2)] in DsUtils
     This isn't very robust either, since it's defeated by abstraction, 
     but that's not something GHC can fix; the programmer should use
     a let! instead.
  

    M ./compiler/basicTypes/MkId.lhs -6 +14
    M ./compiler/deSugar/DsUtils.lhs -9 +44
    M ./compiler/simplCore/Simplify.lhs -12 +15

Most of the new lines are comments!

I believe this could usefully be moved to the branch too.

I wonder if it'd be worth a test in eyeball?

Simon

Changed 5 years ago by Isaac Dupree

are the following equivalent?

a `fseq` \a' -> b `fseq` \b' -> (a',b')
b `fseq` \b' -> a `fseq` \a' -> (a',b')

(I guess this is the usual question where the order of strictness doesn't matter because "imprecise" exceptions? but the order might have non-obvious performance implications? It's simpler than let because it's not mutually/recursive by default...)

Changed 5 years ago by simonpj

They are semantically equivalent, but could perhaps have different space behaviour.

Simon

Changed 5 years ago by igloo

  • status changed from new to closed
  • resolution set to fixed

Merged

Changed 5 years ago by simonmar

  • architecture changed from Unknown to Unknown/Multiple

Changed 5 years ago by simonmar

  • os changed from Unknown to Unknown/Multiple

Changed 4 years ago by simonpj

  • status changed from closed to reopened
  • resolution fixed deleted

See also this thread,  http://www.haskell.org/pipermail/cvs-ghc/2009-September/050164.html, which describes another instance of the very same thing. Furthermore, this instance is NOT FIXED by the above patch. So I'm re-opening.

Simon

Changed 4 years ago by igloo

  • owner igloo deleted
  • status changed from reopened to new

Changed 4 years ago by simonpj

Note to self. I think the Right Solution is simply not to inline things that are "cheap", but only things that are "values". This is easy to achieve:

hunk ./compiler/coreSyn/CoreUnfold.lhs 637
- 	yes_or_no = active_inline && is_cheap && consider_safe
+ 	yes_or_no = active_inline && is_value && consider_safe

But it's not quite so easy: this change makes allocation go up in a couple of programs, and runtime goes up quite a bit. (We'd need to double-check that the runtime figures are right.)

--------------------------------------------------------------------------------
        Program           Size    Allocs   Runtime   Elapsed
--------------------------------------------------------------------------------
         fulsom          -2.8%    +17.2%    +39.1%    +39.8%
         puzzle          +1.0%     +8.1%    +15.2%    +18.6%
           atom          +0.7%     +7.3%    +55.7%    +64.7%
        circsim          +0.5%     -0.0%    +35.2%    +37.1%
      compress2          +0.7%     -0.0%    +25.8%    +29.9%
           lcss          +0.7%     -0.0%    +42.8%    +48.1%
--------------------------------------------------------------------------------
            Min          -2.8%     -4.3%     -4.8%    -10.0%
            Max          +1.0%    +17.2%    +55.7%    +64.7%
 Geometric Mean          +0.5%     +0.5%    +18.2%    +21.4%

So, annoyingly, more investigation required. I can't do it today, so I'm recording the breadcrumbs here for when I get back to it.

Simon

Changed 4 years ago by igloo

  • type changed from merge to bug

Changed 4 years ago by igloo

  • milestone set to 6.12 branch

Changed 3 years ago by igloo

  • milestone changed from 6.12 branch to 6.12.3

Changed 3 years ago by igloo

  • priority changed from normal to low
  • milestone changed from 6.12.3 to 6.14.1

Changed 2 years ago by igloo

  • milestone changed from 7.0.1 to 7.0.2

Changed 2 years ago by igloo

  • milestone changed from 7.0.2 to 7.2.1

Changed 2 years ago by ezyang

  • cc ezyang@… added
  • failure set to None/Unknown

Changed 20 months ago by igloo

  • milestone changed from 7.2.1 to 7.4.1

Changed 16 months ago by igloo

  • priority changed from low to lowest
  • milestone changed from 7.4.1 to 7.6.1

Changed 8 months ago by igloo

  • milestone changed from 7.6.1 to 7.6.2

Changed 4 months ago by morabbin

Changed 4 months ago by morabbin

Seems to work as desired with 7.6.1:

ghc -O -ddump-simpl -ddump-stg -c T2273.hs

yields (in the STG code):

[GblId, Arity=2, Str=DmdType S(S)L, Unf=OtherCon []] =
    \r srt:(0,*bitmap*) [env_sxC s_sxP]
        case env_sxC of wild_sxO {
          Q.TcLclEnv ds_sxF ->
              case
                  case ds_sxF of _ {
                    Q.Comp -> Q.tcExtendIdEnv3;
                    Q.Splice l_sxI -> l_sxI;
                    Q.Brack l_sxK -> l_sxK;
                  }
              of
              level_sxN
              { GHC.Types.I# ipv_syf ->
                    Q.tc_extend_local_id_env level_sxN wild_sxO s_sxP;
              };
        };

Changed 4 months ago by igloo

The ticket was reopened for a different example:  http://hackage.haskell.org/trac/ghc/ticket/2273#comment:8

Note: See TracTickets for help on using tickets.