Ticket #7450 (new bug)

Opened 6 months ago

Last modified 6 weeks ago

Regression in optimisation time of functions with many patterns (6.12 to 7.4)?

Reported by: iustin Owned by:
Priority: normal Milestone: 7.8.1
Component: Compiler Version: 7.6.1
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time performance bug Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

In our project, we build (via TH) some not-trivial data structures, including custom serialisation. After growing the number of constructors significantly, compiling same code with ghc 7.4/7.6 is about 10 times slower than with ghc 6.12.

Narrowing this down on a simpler test case, it turns out that TH is not the cause, just the optimisation of the generated code.

The attached Generator.hs file generates a simple data structure (our project is using a much smaller number of constructors, but with more complex data types, and the result is the same):

data F
  = F1 {fldF1 :: Int} |
    F2 {fldF2 :: Int} |
    F3 {fldF3 :: Int} |
    F4 {fldF4 :: Int} |
    F5 {fldF5 :: Int} |
  … deriving (Eq, Show, Read)

prettyF :: F -> String
prettyF (F1 v) = show v
prettyF (F2 v) = show v
…

The main.hs file just calls this splice and then does a trivial use of prettyF.

Compiling this with a variable number of constructors, with -O:

Constructors6.127.6
502.45s4.10s
1004.40s10.60s
2008.45s33.30s
40016.90s121.00s
80035.95s514.50s

As you can see, 6.12 looks liniar in the number of constructors, whereas 7.6 is not anymore (at least above 100 constructors).

Compiling without -O makes things sane again:

Constructors6.127.6
501.40s1.97s
1002.45s2.70s
2004.50s4.95s
4008.95s9.55s
80018.25s19.10s

After some more investigation, it turns out that just our function function (prettyF) with no automatic instances is cheap, compiling/deriving Eq/Show is also cheap, but Read is very expensive (all numbers with ghc 7.6):

ConstructorsNo instancesEqShowRead
500.75s0.90s1.20s2.95s
1000.85s1.00s1.70s6.80s
2001.20s1.45s2.85s19.15s
4002.05s2.50s5.40s64.45s
8004.30s5.40s11.65s259.40s

So I believe this is about optimisation of complex functions with many patterns/case statements. 6.12 seems nicely liniar, whereas 7.4/7.6 exhibit a much worse behaviour.

This might be a known issue, or even a non-issue ("Don't use Read for complex types"), but I thought it's better to report it. I don't have the skills to look at the core result, and see if it's different or not between 6.12/7.6, sorry.

Attachments

Generator.hs Download (1.2 KB) - added by iustin 6 months ago.
TH generator file
main.hs Download (117 bytes) - added by iustin 6 months ago.
main file (trivial)

Change History

Changed 6 months ago by iustin

TH generator file

Changed 6 months ago by iustin

main file (trivial)

  Changed 6 months ago by igloo

  • status changed from new to closed
  • difficulty set to Unknown
  • resolution set to duplicate

Thanks for the report. This looks like a duplicate of #7258.

follow-up: ↓ 4   Changed 6 months ago by simonpj

  • status changed from closed to new
  • resolution duplicate deleted

It may be a dup of #7258, but if it's true that the issue is solely concerned with deriving(Read) that would localise it much more than saying it's to do with DynFlags.lhs.

So if I understand it, simply having your data type with lots of constructors and deriving(Read) is enough to trigger this non-linear behaviour?

Simon

  Changed 6 months ago by igloo

Lots of named fields; see W2.hs in #7258.

in reply to: ↑ 2   Changed 6 months ago by iustin

Replying to simonpj:

It may be a dup of #7258, but if it's true that the issue is solely concerned with deriving(Read) that would localise it much more than saying it's to do with DynFlags.lhs. So if I understand it, simply having your data type with lots of constructors and deriving(Read) is enough to trigger this non-linear behaviour?

Indeed. The following conditions seem to be required:

  • many constructors (> ~100)
  • constructors should be record constructors, not normal ones
  • optimisations must be turned on

For normal constructors, I can't trigger it with even ~1500 constructors (it's a bit less than linear, but still very acceptable, e.g. 30s for 1600 normal constructors versus 259s for 800 record constructors).

  Changed 5 months ago by simonpj

One reason that the derived code was big was this. The derived Read code has lots of this

   do { ...
      ; Ident "foo" <- lexP
      ; Punc "=" <- lexP
      ;  ...

Each of these failable pattern matches generates a case expression with a call to error and an error string. This is very wasteful. Better instead to define in GHC.Read:

  expectP :: L.Lexeme -> ReadPrec ()

and use it thus

   do { ...
      ; expectP (Ident "foo")
      ; expectP (Punc "=")
      ;  ...

This makes the code significantly shorter. Without -O, and 200 constructors, the compiler itself allocates half as much as before.

This may or may not address the non-linearity, but it certainly improves Read instances.

commit 52e43004f63276c1342933e40a673ad25cf2113a
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Fri Dec 21 17:39:33 2012 +0000

    Use expectP in deriving( Read )
    
    Note [Use expectP]   in TcGenDeriv
    ~~~~~~~~~~~~~~~~~~
    Note that we use
       expectP (Ident "T1")
    rather than
       Ident "T1" <- lexP
    The latter desugares to inline code for matching the Ident and the
    string, and this can be very voluminous. The former is much more
    compact.  Cf Trac #7258, although that also concerned non-linearity in
    the occurrence analyser, a separate issue.

There is an accompanying patch to base:

commit d9b6b25a30bfdaefb69c29dedb30eed06ae71e61
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Fri Dec 21 17:40:08 2012 +0000

    Define GHC.Read.expectP and Text.Read.Lex.expect
    
    They are now used by TcGenDeriv

>---------------------------------------------------------------

 GHC/Read.lhs     |   29 ++++++++++++++++-------------
 Text/Read/Lex.hs |    7 ++++++-
 2 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/GHC/Read.lhs b/GHC/Read.lhs index c5024fc..c542274 100644
--- a/GHC/Read.lhs
+++ b/GHC/Read.lhs
@@ -32,7 +32,7 @@ module GHC.Read
   , lexDigits
 
   -- defining readers
-  , lexP
+  , lexP, expectP
   , paren
   , parens
   , list
@@ -270,12 +270,15 @@ lexP :: ReadPrec L.Lexeme
 -- ^ Parse a single lexeme
 lexP = lift L.lex
 
+expectP :: L.Lexeme -> ReadPrec ()
+expectP lexeme = lift (L.expect lexeme)
+
 paren :: ReadPrec a -> ReadPrec a
 -- ^ @(paren p)@ parses \"(P0)\"
 --      where @p@ parses \"P0\" in precedence context zero
-paren p = do L.Punc "(" <- lexP
-             x          <- reset p
-             L.Punc ")" <- lexP
+paren p = do expectP (L.Punc "(")
+             x <- reset p
+             expectP (L.Punc ")")
              return x
 
 parens :: ReadPrec a -> ReadPrec a
@@ -292,7 +295,7 @@ list :: ReadPrec a -> ReadPrec [a]
 -- using the usual square-bracket syntax.
 list readx =
   parens
-  ( do L.Punc "[" <- lexP
+  ( do expectP (L.Punc "[")
        (listRest False +++ listNext)
   )
  where
@@ -408,12 +411,12 @@ parenthesis-like objects such as (...) and [...] can be an argument to  instance Read a => Read (Maybe a) where
   readPrec =
     parens
-    (do L.Ident "Nothing" <- lexP
+    (do expectP (L.Ident "Nothing")
         return Nothing
      +++
      prec appPrec (
-        do L.Ident "Just" <- lexP
-           x              <- step readPrec
+        do expectP (L.Ident "Just")
+           x <- step readPrec
            return (Just x))
     )
 
@@ -427,7 +430,7 @@ instance Read a => Read [a] where
 
 instance  (Ix a, Read a, Read b) => Read (Array a b)  where
     readPrec = parens $ prec appPrec $
-               do L.Ident "array" <- lexP
+               do expectP (L.Ident "array")
                   theBounds <- step readPrec
                   vals   <- step readPrec
                   return (array theBounds vals) @@ -504,9 +507,9 @@ instance (Integral a, Read a) => Read (Ratio a) where
   readPrec =
     parens
     ( prec ratioPrec
-      ( do x            <- step readPrec
-           L.Symbol "%" <- lexP
-           y            <- step readPrec
+      ( do x <- step readPrec
+           expectP (L.Symbol "%")
+           y <- step readPrec
            return (x % y)
       )
     )
@@ -543,7 +546,7 @@ wrap_tup :: ReadPrec a -> ReadPrec a  wrap_tup p = parens (paren p)
 
 read_comma :: ReadPrec ()
-read_comma = do { L.Punc "," <- lexP; return () }
+read_comma = expectP (L.Punc ",")
 
 read_tup2 :: (Read a, Read b) => ReadPrec (a,b)
 -- Reads "a , b"  no parens!
diff --git a/Text/Read/Lex.hs b/Text/Read/Lex.hs index f5a07f1..8a64e21 100644
--- a/Text/Read/Lex.hs
+++ b/Text/Read/Lex.hs
@@ -22,7 +22,7 @@ module Text.Read.Lex
   , numberToInteger, numberToRational, numberToRangedRational
 
   -- lexer
-  , lex
+  , lex, expect
   , hsLex
   , lexChar
 
@@ -144,6 +144,11 @@ numberToRational (MkDecimal iPart mFPart mExp)  lex :: ReadP Lexeme  lex = skipSpaces >> lexToken
 
+expect :: Lexeme -> ReadP ()
+expect lexeme = do { skipSpaces 
+                   ; thing <- lexToken
+                   ; if thing == lexeme then return () else pfail }
+
 hsLex :: ReadP String
 -- ^ Haskell lexer: returns the lexed string, rather than the lexeme  hsLex = do skipSpaces

  Changed 5 months ago by simonpj

OK with these changes I now get this:

6.12.3 6.12.3 HEAD HEAD
#constructors Alloc (Mbytes) Time (s) Alloc (Mbytes) Time (s)
40 1075 1.7
80 1646 4 2184 5
160 3217 8 4862 10
320 6385 16 12242 23
640 12766 34 35009 60

So it still looks quite a bit less well-behaved than 6.12.3, for reasons I don't yet understand. But better than before.

  Changed 6 weeks ago by igloo

  • milestone set to 7.8.1
Note: See TracTickets for help on using tickets.