Ticket #6110 (closed bug: fixed)
Data.Vector.Unboxed performance regression of 7.4.1 relative to 7.0.4
| Reported by: | mdgabriel | Owned by: | pcapriotti |
|---|---|---|---|
| Priority: | high | Milestone: | 7.4.3 |
| Component: | Compiler | Version: | 7.4.1 |
| Keywords: | Vector Performance Regression | Cc: | |
| Operating System: | Linux | Architecture: | x86 |
| Type of failure: | Runtime performance bug | Difficulty: | Unknown |
| Test Case: | | Blocked By: | |
| Blocking: | | Related Tickets: | #6111 |
Description
Problem
Severe Data.Vector.Unboxed performance regression in 7.4.1 relative to 7.0.4:
(Sum GHC 7.4.1)/(Sum GHC 7.0.4) ~ 2.4
System
GNU/Linux 3.2.0-24-generic 38-Ubuntu i386
Compilers
GHC 7.0.4
GHC 7.4.1
GCC 4.6.3 for a baseline
Main.hs
module Main where

import System.Environment (getArgs)
import qualified Data.Vector.Unboxed as U (generate, sum)

main :: IO ()
main = do args <- getArgs
          if length args == 1
            then putSum (read (head args) :: Int)
            else error "need a count operand"

putSum :: Int -> IO ()
putSum cnt = let v = U.generate cnt (\i -> fromIntegral i :: Double)
                 s = U.sum v
             in putStrLn ("Sum="++show s)
GHC compilation
ghc --version
7.4.1
ghc -O2 -Wall --make -o sum Main.hs
ghc --version
7.0.4
ghc -O2 -Wall --make -o sum Main.hs
Baseline csum.c
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    unsigned long i, size;
    double tot=0;

    if (argc != 2)
    {
        (void)fprintf(stderr, "usage: %s size\n", basename(argv[0]));
        return(1);
    }
    size = atol(argv[1]);
    for(i = 0; i < size; i++) tot += (double)i;
    (void)printf("Sum=%.15e\n", tot);
    return(0);
}
GCC baseline compilation
gcc --version
4.6.3
gcc -O2 -Wall csum.c -o csum
Data: time sum-7.0.4 n
n seconds
100000000 0.74
200000000 1.46
300000000 2.24
400000000 2.94
500000000 3.70
600000000 4.40
700000000 5.14
800000000 5.89
900000000 6.62
1000000000 7.34
Data: time sum-7.4.1 n
n seconds
100000000 1.74
200000000 3.49
300000000 5.24
400000000 6.98
500000000 8.73
600000000 10.51
700000000 12.22
800000000 13.96
900000000 15.75
1000000000 17.51
Data: time csum-4.6.3 n
n seconds
100000000 1.04
200000000 2.10
300000000 3.12
400000000 4.19
500000000 5.23
600000000 6.26
700000000 7.32
800000000 8.37
900000000 9.41
1000000000 10.45
Linear in n
y is in seconds
GHC 7.0.4: y = (0.73/10^8) * n + 0.03
GCC 4.6.3: y = (1.04/10^8) * n + 0.03
GHC 7.4.1: y = (1.75/10^8) * n - 0.01
Severe performance regression:
GHC 7.4.1/GHC 7.0.4 ~ 1.75/0.73 ~ 2.4
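The slopes and the ratio above can be re-derived from the three timing tables with an ordinary least-squares fit. The following is a minimal sketch and not part of the original report: `fitLine` and `withN` are hypothetical helper names, and n is expressed in units of 10^8 so that each slope reads as seconds per 10^8 elements. The fitted intercepts may differ slightly from the ones quoted above, depending on how the slopes were rounded.

```haskell
module Main where

-- Ordinary least-squares fit of y = slope * x + intercept.
-- Returns (slope, intercept). Hypothetical helper, not from the ticket.
fitLine :: [(Double, Double)] -> (Double, Double)
fitLine pts = (slope, intercept)
  where
    n         = fromIntegral (length pts)
    sx        = sum (map fst pts)
    sy        = sum (map snd pts)
    sxx       = sum [x * x | (x, _) <- pts]
    sxy       = sum [x * y | (x, y) <- pts]
    slope     = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n

-- Pair each timing with n in units of 10^8 (1, 2, ..., 10).
withN :: [Double] -> [(Double, Double)]
withN = zip [1 ..]

-- Timings in seconds, copied from the three data tables.
ghc704, ghc741, gcc463 :: [(Double, Double)]
ghc704 = withN [0.74, 1.46, 2.24, 2.94, 3.70, 4.40, 5.14, 5.89, 6.62, 7.34]
ghc741 = withN [1.74, 3.49, 5.24, 6.98, 8.73, 10.51, 12.22, 13.96, 15.75, 17.51]
gcc463 = withN [1.04, 2.10, 3.12, 4.19, 5.23, 6.26, 7.32, 8.37, 9.41, 10.45]

main :: IO ()
main = do
  let report name pts = putStrLn (name ++ ": " ++ show (fitLine pts))
  report "GHC 7.0.4" ghc704
  report "GCC 4.6.3" gcc463
  report "GHC 7.4.1" ghc741
  -- Ratio of slopes: the regression factor per element.
  putStrLn ("7.4.1/7.0.4: " ++ show (fst (fitLine ghc741) / fst (fitLine ghc704)))
```

This uses only `base`, so it compiles with either compiler version without installing vector.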
Notes
1/ I discovered the problem in a slightly more complicated case when I recompiled a package that used some simple statistics. The sum of [0..(n-1)] was the simplest case I could think of to demonstrate the problem.
2/ I tried a similar experiment with Data.List, Data.Array.Unboxed, Data.Vector.Storable.MMap, and Foreign.Marshal.Alloc. In all cases, the GHC 7.4.1 version was faster than the GHC 7.0.4 version.
3/ It is the same Data.Vector.Unboxed code in both cases, compiled and installed separately for each version of the GHC compiler. Thus, the performance regression appears to be caused by the interaction between Data.Vector.Unboxed and the 7.4.1 compiler.
4/ I am impressed that the GHC 7.0.4 sum is faster than the GCC 4.6.3 sum. I expected it to be close, but not faster. Given this impressive result, I certainly would hope that the same result can be recovered once again.
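For reference, the Data.List comparison mentioned in note 2/ might look roughly like the sketch below. The exact code used for that experiment is not given in this report, so this is an assumption: `sumList` and `expected` are hypothetical names, and a strict left fold is just one plausible way to write the list version. The closed form n(n-1)/2 is included as a sanity check on the printed sum. It matches exactly for small counts, while for very large counts floating-point rounding can make the two differ in the last digits.

```haskell
module Main where

import Data.List (foldl')
import System.Environment (getArgs)

-- Sum of 0 .. cnt-1 as Doubles, using a strict left fold over a list.
-- Hypothetical stand-in for the Data.List experiment in note 2/.
sumList :: Int -> Double
sumList cnt = foldl' (+) 0 (map fromIntegral [0 .. cnt - 1])

-- Closed form n(n-1)/2 for the same sum, for checking the output.
expected :: Int -> Double
expected cnt = let n = fromIntegral cnt in n * (n - 1) / 2

main :: IO ()
main = do
  args <- getArgs
  case args of
    [a] -> let cnt = read a :: Int
           in putStrLn ("Sum=" ++ show (sumList cnt))
    _   -> error "need a count operand"
```

It can be built with the same command line as Main.hs and timed over the same values of n under each compiler.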
