Ticket #5631 (closed bug: fixed)

Opened 7 months ago

Last modified 5 months ago

Compilation slowdown from 7.0.x to 7.2.x

Reported by: bjpop Owned by: igloo
Priority: highest Milestone: 7.4.1
Component: Compiler Version: 7.2.1
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Difficulty: Unknown
Test Case: perf/compiler/T5631 Blocked By:
Blocking: Related Tickets:

Description

Compile time performance for GHC appears to have slowed down significantly for the package:

 http://hackage.haskell.org/package/language-python

Time to compile the package:

  • GHC 7.0.4: 3m57s
  • GHC 7.2.1: 49m22s
  • GHC 7.2.2: 49m56s

The culprits are the Happy generated modules:

Language.Python.Version3.Parser.Parser Language.Python.Version2.Parser.Parser

GHC uses up to 700MB of memory compiling these files (my machine has 2GB of RAM and it does not appear to be thrashing during the compilation).

All versions of GHC in this ticket were downloaded as binary packages from the ghc website.

System:

OS X 10.7.2 1.4 GHz Intel Core 2 Duo 2GB RAM GCC 4.2.1

Attachments

test_rules2.hs Download (30.6 KB) - added by simonmar 6 months ago.
compilation-time regression test
test_rules2.y Download (2.0 KB) - added by simonmar 6 months ago.
happy input file
ghc-stage2.prof Download (159.8 KB) - added by simonmar 6 months ago.

Change History

Changed 7 months ago by simonmar

  • priority changed from normal to high
  • milestone set to 7.4.1

Thanks. Looks pretty serious, we'll investigate.

Changed 6 months ago by simonpj@…

commit 207a2a63e039af2bf6c331ef5a005512dd25fd77

Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Thu Nov 17 13:13:15 2011 +0000

    Remove a quadratic complexity blow-up in coercionKind
    thereby fixing Trac #5631.
    
    See Note [Nested InstCos] in Coercion

 compiler/types/Coercion.lhs |   51 ++++++++++++++++++++++++++++++------------
 1 files changed, 36 insertions(+), 15 deletions(-)

Changed 6 months ago by simonmar

While trying to make a test case for this bug, I appear to have found another compilation-time regression (or possibly the same one in a different guise). Compiling the attached module (test_rules2.hs), I get the following times:

GHC 7.0.3:
     642,777,576 bytes allocated in the heap
  Total time    1.44s  (  1.95s elapsed)

GHC 7.2.1:
     615,395,312 bytes allocated in the heap
  Total   time    1.38s  (  2.45s elapsed)

GHC 7.3.20111119:
   6,063,152,720 bytes allocated in the heap
  Total   time    6.81s  (  6.89s elapsed)

GHC is allocating 10x as much as 7.2.1 while compiling this module!

Changed 6 months ago by simonmar

compilation-time regression test

Changed 6 months ago by simonmar

happy input file

Changed 6 months ago by simonmar

  • priority changed from high to highest

Changed 6 months ago by simonmar

  • owner set to simonpj

We have a partial fix that I am validating right now, this brings it down to

GHC 7.3.20111119:
   1,647,677,024 bytes allocated in the heap
  Total   time    4.21s  (  4.27s elapsed)

I will also attach a profile. Simon is going to look into the remaining problems.

Changed 6 months ago by simonmar

Changed 6 months ago by marlowsd@…

commit ed043776f3363b2b9e56f1e9836a250f43c28d26

Author: Simon Marlow <marlowsd@gmail.com>
Date:   Tue Nov 22 14:09:05 2011 +0000

    uType_defer: only call mkErrInfo if -ddump-tc-trace is on (#5631)
    
    Profiling shows we're wasting a lot of time calling mkErrInfo and
    tidying the context, work that is wasted if we're not tracing.
    
    There is more to this compilation performance regression than this,
    but this is one bug.

 compiler/typecheck/TcUnify.lhs |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

Changed 6 months ago by simonmar

To remind us: we need to look into why uType_defer was being called here rather than unifying directly. Simon PJ was suspicious.

Changed 6 months ago by dimitris

I've done a lot of modifications to remove various bottlenecks in our new design and just brought the time for this down to 1.8 secs again. I've got a big patch pending along with several more performance benchmarks for the testuite which I will push on Monday.

Changed 6 months ago by simonpj@…

commit e99f323405a8398f0521686d87124453e85b599a

Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Sat Nov 26 21:47:39 2011 +0000

    Add missing cases in TcUnify.uUnfilledVars
    
    These missing cases dealt with unifying a meta type variable with a
    skolem when the kinds match -- a pretty common case.  The missing
    cases meant that instead of directly solving on the fly (which is easy
    in this situation) we were generating an equality constraint viat
    `utype_defer`.  This isn't *wrong*, but it's a lot less efficient than
    it could be!
    
    All this arose from investigating #5631.  This one change does this
    to the compiler allocation
    
    Before:
         821,257,552 bytes allocated in the heap
                  94 MB total memory in use (0 MB lost due to fragmentation)
    
      MUT     time    1.54s  (  1.67s elapsed)
      GC      time    1.36s  (  1.60s elapsed)
      Total   time    2.93s  (  3.27s elapsed)
    
    After:
         424,244,124 bytes allocated in the heap
                  49 MB total memory in use (0 MB lost due to fragmentation)
    
      MUT     time    0.64s  (  0.89s elapsed)
      GC      time    0.83s  (  0.77s elapsed)
      Total   time    1.47s  (  1.66s elapsed)
    
    Not bad for a 3-line change!

 compiler/typecheck/TcUnify.lhs |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

Changed 6 months ago by simonpj

  • owner changed from simonpj to igloo
  • testcase set to perf/cmpiler/T5631

OK I fixed utype_defer, and added a regression test (the one Simon constructed). Ian, would you like to compare the original report error: compiling language-python with 7.0 vs HEAD?

Simon

Changed 6 months ago by dimitris

  • testcase changed from perf/cmpiler/T5631 to perf/compiler/T5631

Changed 5 months ago by igloo

  • status changed from new to closed
  • difficulty set to Unknown
  • resolution set to fixed

language-python (with a handful of build fixes) just compiled for me with HEAD in 4:01.49 total.

Note: See TracTickets for help on using tickets.