Regression in optimisation time of functions with many patterns (6.12 to 7.4)?

mentioned in issue #7258

changed weight to 5

added Tbug Trac import labels

Attached file Generator.hs ($801).

TH generator file

Attached file main.hs ($802).

main file (trivial)

closed

Thanks for the report. This looks like a duplicate of #7258.

Trac metadata

Trac field	Value
Resolution	Unresolved → ResolvedDuplicate

reopened

It may be a dup of #7258, but if it's true that the issue is solely concerned with deriving(Read) that would localise it much more than saying it's to do with DynFlags.lhs.

So if I understand it, simply having your data type with lots of constructors and deriving(Read) is enough to trigger this non-linear behaviour?

Simon

Trac metadata

Trac field	Value
Resolution	ResolvedDuplicate → Unresolved

Lots of named fields; see W2.hs in #7258.

Replying to [ticket:7450#comment:66753 simonpj]:

> It may be a dup of #7258, but if it's true that the issue is solely concerned with deriving(Read) that would localise it much more than saying it's to do with DynFlags.lhs.
>
> So if I understand it, simply having your data type with lots of constructors and deriving(Read) is enough to trigger this non-linear behaviour?

Indeed. The following conditions seem to be required:

many constructors (> ~100)
constructors should be record constructors, not normal ones
optimisations must be turned on

For normal constructors, I can't trigger it with even ~1500 constructors (it's a bit less than linear, but still very acceptable, e.g. 30s for 1600 normal constructors versus 259s for 800 record constructors).
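
For anyone wanting to reproduce the conditions listed above without the attached Template Haskell generator, here is a minimal sketch that emits plain source text instead. It is not the attached Generator.hs; the module name `Big`, the type name `T`, and the `C1`/`f1` naming scheme are assumptions made for illustration, and `n` is assumed to be at least 1.

```haskell
import Data.List (intercalate)

-- Hypothetical naming scheme: constructors C1..Cn, each with one named field.
recordCon :: Int -> String
recordCon i = "C" ++ show i ++ " { f" ++ show i ++ " :: Int }"

-- The same constructors without field names, for comparison.
plainCon :: Int -> String
plainCon i = "C" ++ show i ++ " Int"

-- Source text of a module with one data type of n constructors deriving Read.
-- Assumes n >= 1; the constructor style is chosen by the first argument.
genModule :: (Int -> String) -> Int -> String
genModule con n = unlines
  [ "module Big where"
  , ""
  , "data T"
  , "  = " ++ intercalate "\n  | " (map con [1 .. n])
  , "  deriving (Read)"
  ]
```

Writing `genModule recordCon 800` to `Big.hs` and compiling it with `ghc -O -c Big.hs` should exercise the record-constructor case; swapping in `plainCon` gives the normal-constructor case for comparison.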

One reason that the derived code was big was this. The derived Read code has lots of this

   do { ...
      ; Ident "foo" <- lexP
      ; Punc "=" <- lexP
      ;  ...

Each of these failable pattern matches generates a case expression with a call to error and an error string. This is very wasteful. Better instead to define in GHC.Read:

  expectP :: L.Lexeme -> ReadPrec ()

and use it thus

   do { ...
      ; expectP (Ident "foo")
      ; expectP (Punc "=")
      ;  ...

This makes the derived code significantly shorter. With 200 constructors and without -O, the compiler itself allocates half as much as before.

This may or may not address the non-linearity, but it certainly improves Read instances.
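
To make the two styles concrete, here is a small hand-written sketch of a reader for a single record constructor, once with failable pattern matches on `lexP` and once with `expectP`. The type `Foo`, its field, and the helper `run` are hypothetical, and this is only an approximation of the shape of derived code, not what `deriving (Read)` actually emits.

```haskell
import GHC.Read (expectP, lexP, parens)
import Text.ParserCombinators.ReadPrec (ReadPrec, prec, readPrec_to_S, reset)
import Text.Read (readPrec)
import Text.Read.Lex (Lexeme (..))

-- Hypothetical record type standing in for one constructor of a large type.
data Foo = Foo { foo :: Int } deriving (Eq, Show)

-- Style 1: every token is a failable pattern match on the result of lexP.
readFooLex :: ReadPrec Foo
readFooLex = parens $ prec 11 $ do
  Ident "Foo" <- lexP
  Punc "{" <- lexP
  Ident "foo" <- lexP
  Punc "=" <- lexP
  x <- reset readPrec
  Punc "}" <- lexP
  return (Foo x)

-- Style 2: the same parser, with each expected token checked by expectP.
readFooExpect :: ReadPrec Foo
readFooExpect = parens $ prec 11 $ do
  expectP (Ident "Foo")
  expectP (Punc "{")
  expectP (Ident "foo")
  expectP (Punc "=")
  x <- reset readPrec
  expectP (Punc "}")
  return (Foo x)

-- Run a parser at precedence 0, returning all (result, remaining input) pairs.
run :: ReadPrec a -> String -> [(a, String)]
run p = readPrec_to_S p 0
```

Both parsers should accept the same inputs, for example `run readFooExpect "Foo {foo = 3}"`; the difference is only in how much code each token check expands to.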

commit 52e43004f63276c1342933e40a673ad25cf2113a
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Fri Dec 21 17:39:33 2012 +0000

    Use expectP in deriving( Read )
    
    Note [Use expectP]   in TcGenDeriv
    ~~~~~~~~~~~~~~~~~~
    Note that we use
       expectP (Ident "T1")
    rather than
       Ident "T1" <- lexP
    The latter desugares to inline code for matching the Ident and the
    string, and this can be very voluminous. The former is much more
    compact.  Cf Trac #7258, although that also concerned non-linearity in
    the occurrence analyser, a separate issue.

There is an accompanying patch to base:

commit d9b6b25a30bfdaefb69c29dedb30eed06ae71e61
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Fri Dec 21 17:40:08 2012 +0000

    Define GHC.Read.expectP and Text.Read.Lex.expect
    
    They are now used by TcGenDeriv

>---------------------------------------------------------------

 GHC/Read.lhs     |   29 ++++++++++++++++-------------
 Text/Read/Lex.hs |    7 ++++++-
 2 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/GHC/Read.lhs b/GHC/Read.lhs
index c5024fc..c542274 100644
--- a/GHC/Read.lhs
+++ b/GHC/Read.lhs
@@ -32,7 +32,7 @@ module GHC.Read
   , lexDigits
 
   -- defining readers
-  , lexP
+  , lexP, expectP
   , paren
   , parens
   , list
@@ -270,12 +270,15 @@ lexP :: ReadPrec L.Lexeme
 -- ^ Parse a single lexeme
 lexP = lift L.lex
 
+expectP :: L.Lexeme -> ReadPrec ()
+expectP lexeme = lift (L.expect lexeme)
+
 paren :: ReadPrec a -> ReadPrec a
 -- ^ @(paren p)@ parses \"(P0)\"
 --      where @p@ parses \"P0\" in precedence context zero
-paren p = do L.Punc "(" <- lexP
-             x          <- reset p
-             L.Punc ")" <- lexP
+paren p = do expectP (L.Punc "(")
+             x <- reset p
+             expectP (L.Punc ")")
              return x
 
 parens :: ReadPrec a -> ReadPrec a
@@ -292,7 +295,7 @@ list :: ReadPrec a -> ReadPrec [a]
 -- using the usual square-bracket syntax.
 list readx =
   parens
-  ( do L.Punc "[" <- lexP
+  ( do expectP (L.Punc "[")
        (listRest False +++ listNext)
   )
  where
@@ -408,12 +411,12 @@ parenthesis-like objects such as (...) and [...] can be an argument to
 instance Read a => Read (Maybe a) where
   readPrec =
     parens
-    (do L.Ident "Nothing" <- lexP
+    (do expectP (L.Ident "Nothing")
         return Nothing
      +++
      prec appPrec (
-        do L.Ident "Just" <- lexP
-           x              <- step readPrec
+        do expectP (L.Ident "Just")
+           x <- step readPrec
            return (Just x))
     )
 
@@ -427,7 +430,7 @@ instance Read a => Read [a] where
 
 instance  (Ix a, Read a, Read b) => Read (Array a b)  where
     readPrec = parens $ prec appPrec $
-               do L.Ident "array" <- lexP
+               do expectP (L.Ident "array")
                   theBounds <- step readPrec
                   vals   <- step readPrec
                   return (array theBounds vals)
@@ -504,9 +507,9 @@ instance (Integral a, Read a) => Read (Ratio a) where
   readPrec =
     parens
     ( prec ratioPrec
-      ( do x            <- step readPrec
-           L.Symbol "%" <- lexP
-           y            <- step readPrec
+      ( do x <- step readPrec
+           expectP (L.Symbol "%")
+           y <- step readPrec
            return (x % y)
       )
     )
@@ -543,7 +546,7 @@ wrap_tup :: ReadPrec a -> ReadPrec a
 wrap_tup p = parens (paren p)
 
 read_comma :: ReadPrec ()
-read_comma = do { L.Punc "," <- lexP; return () }
+read_comma = expectP (L.Punc ",")
 
 read_tup2 :: (Read a, Read b) => ReadPrec (a,b)
 -- Reads "a , b"  no parens!
diff --git a/Text/Read/Lex.hs b/Text/Read/Lex.hs
index f5a07f1..8a64e21 100644
--- a/Text/Read/Lex.hs
+++ b/Text/Read/Lex.hs
@@ -22,7 +22,7 @@ module Text.Read.Lex
   , numberToInteger, numberToRational, numberToRangedRational
 
   -- lexer
-  , lex
+  , lex, expect
   , hsLex
   , lexChar
 
@@ -144,6 +144,11 @@ numberToRational (MkDecimal iPart mFPart mExp)
 lex :: ReadP Lexeme
 lex = skipSpaces >> lexToken
 
+expect :: Lexeme -> ReadP ()
+expect lexeme = do { skipSpaces 
+                   ; thing <- lexToken
+                   ; if thing == lexeme then return () else pfail }
+
 hsLex :: ReadP String
 -- ^ Haskell lexer: returns the lexed string, rather than the lexeme
 hsLex = do skipSpaces
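
As a quick illustration of what `Text.Read.Lex.expect` does at the `ReadP` level, the sketch below compares it with an equivalent formulation built from the exported `lex`, which already skips leading whitespace. The name `expectViaLex` is made up for this example; it is not part of base.

```haskell
import Text.ParserCombinators.ReadP (ReadP, pfail, readP_to_S)
import Text.Read.Lex (Lexeme (..), expect)
import qualified Text.Read.Lex as L

-- Hypothetical equivalent of expect: lex one token (skipping leading
-- whitespace) and fail unless it equals the expected lexeme.
expectViaLex :: Lexeme -> ReadP ()
expectViaLex lexeme = do
  thing <- L.lex
  if thing == lexeme then return () else pfail

-- Run both versions on the same input so the results can be compared.
both :: Lexeme -> String -> ([((), String)], [((), String)])
both lexeme s = (readP_to_S (expect lexeme) s, readP_to_S (expectViaLex lexeme) s)
```

For example, `both (Punc "=") " = 1"` should give the same single parse for both versions, while an input starting with `==` lexes as a different token and so yields no parse.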

OK with these changes I now get this:

#constructors	6.12.3 Alloc (Mbytes)	6.12.3 Time (s)	HEAD Alloc (Mbytes)	HEAD Time (s)
40			1075	1.7
80	1646	4	2184	5
160	3217	8	4862	10
320	6385	16	12242	23
640	12766	34	35009	60

So it still looks quite a bit less well-behaved than 6.12.3, for reasons I don't yet understand. But better than before.
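
One way to quantify "less well-behaved" is to look at how allocation grows each time the number of constructors doubles: if cost grows like n^k, then doubling n multiplies the cost by 2^k, so k is the base-2 logarithm of the ratio between successive rows. The sketch below does this for the allocation figures in the table above; the function and variable names are made up for illustration.

```haskell
-- Allocation figures (Mbytes) copied from the table above; each entry
-- corresponds to twice as many constructors as the previous one.
alloc6123, allocHead :: [Double]
alloc6123 = [1646, 3217, 6385, 12766]        -- 80 to 640 constructors
allocHead = [1075, 2184, 4862, 12242, 35009] -- 40 to 640 constructors

-- For each doubling of the input, the exponent k in cost ~ n^k,
-- computed as log2 of the ratio between successive measurements.
doublingExponents :: [Double] -> [Double]
doublingExponents xs = zipWith (\a b -> logBase 2 (b / a)) xs (drop 1 xs)
```

Evaluating `doublingExponents alloc6123` should give values close to 1 (linear growth), whereas the values from `doublingExponents allocHead` rise with each doubling, which is what one would expect from super-linear behaviour.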

changed milestone to %7.8.1

changed milestone to %7.10.1

Moving to 7.10.1

mentioned in issue #9669

removed milestone

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

Constructors	6.12	7.6
50	2.45s	4.10s
100	4.40s	10.60s
200	8.45s	33.30s
400	16.90s	121.00s
800	35.95s	514.50s

Constructors	6.12	7.6
50	1.40s	1.97s
100	2.45s	2.70s
200	4.50s	4.95s
400	8.95s	9.55s
800	18.25s	19.10s

Constructors	No instances	Eq	Show	Read
50	0.75s	0.90s	1.20s	2.95s
100	0.85s	1.00s	1.70s	6.80s
200	1.20s	1.45s	2.85s	19.15s
400	2.05s	2.50s	5.40s	64.45s
800	4.30s	5.40s	11.65s	259.40s

Trac field	Value
Version	7.6.1
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
