Ticket #4138 (closed bug: fixed)

Opened 3 years ago

Last modified 11 months ago

Performance regression in overloading

Reported by: simonmar Owned by: pcapriotti
Priority: high Milestone: 7.6.1
Component: Compiler Version: 6.13
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Difficulty: Unknown
Test Case: simplCore/shouldCompile/T4138 Blocked By:
Blocking: Related Tickets:

Description

The following program goes 25% slower with HEAD compared to 6.12.3:

module Main (main) where

import DeepSeq

main :: IO ()
main = do
  rnf [ mk x | x <- [ 1 .. 1024 ] ] `seq` return ()
  where
    mk :: Float -> [(Float,Float)]
    mk x = [ (x+i,x+i+1) | i <- [ 1 .. 2048] ]

using the attached DeepSeq module, or indeed the standard Control.DeepSeq.

Simon and I diagnosed the problem to be the following dictionary for NFData (Float,Float) (this is HEAD):

Main.main6 :: DeepSeq.NFData (GHC.Types.Float, GHC.Types.Float)
Main.main6 =
  DeepSeq.$fNFData(,)
    @ GHC.Types.Float
    @ GHC.Types.Float
    DeepSeq.$fNFDataFloat
    DeepSeq.$fNFDataFloat

GHC has not inlined the dictionary function or the arguments here, even though this class is in fact just a single-method dictionary. With 6.12 we got:

Main.main6 =
  \ (ds_dBc :: (GHC.Types.Float, GHC.Types.Float)) ->
    case ds_dBc of _ { (x_awr, y_aws) ->
    case x_awr of _ { GHC.Types.F# _ ->
    case y_aws of _ { GHC.Types.F# _ -> GHC.Unit.() }
    }
    }

i.e. everything fully inlined and a nice efficient definition.

This is currently affecting parallel programs where we typically use rnf quite a lot.

Attachments

DeepSeq.hs Download (1.3 KB) - added by simonmar 3 years ago.

Change History

Changed 3 years ago by simonmar

Changed 3 years ago by simonpj

  • owner changed from simonpj to igloo

WIth HEAD (and hence STABLE I guess), the performance (looking at allocation) is the same as 6.12, so I declare this fixed.

Ian: might you turn this into a performance test? Thanks

Simon

Changed 3 years ago by simonmar

IIRC the difference here doesn't show up in allocations, just in runtime.

Changed 3 years ago by simonpj

Ahem, ok. Well, Ian can you see if it's still a problem and add a test anyway?

Changed 3 years ago by simonpj

  • owner changed from igloo to simonpj

Changed 3 years ago by simonpj

  • milestone changed from 7.0.1 to 7.0.2

Changed 2 years ago by simonmar

  • milestone changed from 7.0.2 to 7.2.1

We think the performance is almost back to where it was with 6.12, but not quite, and we'll take another look for 7.2.1.

Changed 18 months ago by simonpj

  • owner changed from simonpj to igloo

Ian: could you see if this is ok now, and if not how not? Thanks

Simon

Changed 16 months ago by igloo

  • difficulty set to Unknown
  • milestone changed from 7.4.1 to 7.4.2

Changed 14 months ago by igloo

  • milestone changed from 7.4.2 to 7.6.1

Changed 11 months ago by igloo

  • owner changed from igloo to simonpj
  • testcase set to T4138

We're currently getting a strange middle result:

  \ (ds_dvm :: (GHC.Types.Float, GHC.Types.Float)) ->
    case ds_dvm of _ { (x_aux, y_auy) ->
    case x_aux of _ { GHC.Types.F# ipv_svw ->
    DeepSeq.$fNFDataFloat_$crnf y_auy
    }
    }

I'd have expected $fNFDataFloat_$crnf to have been inlined.

I've added a test (T4138).

Simon, I wonder if it would make sense to look at this at the same time as #6104?

Changed 11 months ago by simonpj

  • testcase changed from T4138 to simplCore/shouldCompile/T4138

Changed 11 months ago by simonpj

Why do you think $fNFDataFloat_$crnf should be inlined? It is not called in an interesting context; it is not applied to an argument we know something about. Inlining would duplicate a case expression to elimiate a single jump. Is this really the source of the regression, if any?

Simon

Changed 11 months ago by igloo

I was only looking at the core generated, not looking at run-time, so if that is the expected core then we can just close this ticket.

Changed 11 months ago by simonpj

Well 17 months ago Simon M wrote "We think the performance is almost back to where it was with 6.12, but not quite, and we'll take another look for 7.2.1". Could you possibly check the performance as of today? It would be good to know whether we have a regression or not; and if so, whether it's due to this extra jump (which I rather doubt). Thanks

Changed 11 months ago by igloo

  • owner changed from simonpj to pcapriotti

Paolo, can you take a look at whether there is a performance issue here please?

Changed 11 months ago by pcapriotti

Here are some timings:

ghc-6.12.3: 287 ms
ghc-7.0.4:  316 ms
ghc-7.2.1:  269 ms
ghc-7.4.2:  265 ms
ghc-HEAD:   267 ms

It appears that the regression has been fixed sometime before 7.2.

Before we close the ticket, should we remove the test case which checks whether rnf is inlined?

Changed 11 months ago by igloo

Just change it to check for 1 F# rather than 2

Changed 11 months ago by pcapriotti

  • status changed from new to closed
  • resolution set to fixed

Pushed:

commit 8f3d6a513ebba2c0a1fb15f835af78ad9e0023ff
Author: Paolo Capriotti <p.capriotti@gmail.com>
Date:   Thu Jul 5 20:04:37 2012 +0100

    Fix test case for #4138.
    
    As per the comments on that ticket, *not* inlining rnf is correct, so
    make this test pass.

Closing the ticket now.

Note: See TracTickets for help on using tickets.