Ticket #778 (closed bug: fixed)

Opened 6 years ago

Last modified 6 years ago

building with gcc-4.1.x causes ghc to enter infinite allocation loop

Reported by: guest
Owned by:
Priority: normal
Milestone: 6.4.3
Component: Runtime System
Version: 6.4.2
Keywords: memory gcc-4
Cc:
Operating System: Linux
Architecture: x86
Type of failure:
Difficulty: Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description (last modified by simonmar)

I have the same program using both ghci and ghc. ghci gives:

   ___         ___ _
  / _ \ /\  /\/ __(_)
 / /_\// /_/ / /  | |      GHC Interactive, version 6.4.2, for Haskell 98.
/ /_\\/ __  / /___| |      http://www.haskell.org/ghc/
\____/\/ /_/\____/|_|      Type :? for help.

ghc-6.4.2: out of memory (requested 1048576 bytes)

and ghc gives:

ghc -ignore-package Cabal --make -Wall -fno-warn-unused-matches -cpp -i. -odir dist/tmp -hidir dist/tmp Setup.lhs -o setup
ghc-6.4.2: out of memory (requested 1048576 bytes)

I have 2 GB of RAM and 3 GB of swap. When I run ghci, my free memory quickly drops from ~1.9 GB to less than 10 MB, at which point I get the out of memory error. I'm running Gentoo on my laptop, with gcc 4.1.1 and kernel 2.6.17-rc1-mm2.

Change History

Changed 6 years ago by guest

*problem. not program. note to self: don't try to type after midnight.

Changed 6 years ago by guest

Which Setup.lhs are you compiling exactly? Is it from a well-known Cabal package?

Changed 6 years ago by simonmar

  • milestone changed from 6.4.2 to 6.4.3

Does this happen with any ghc compilation, or just this particular one? Can you start up GHCi on its own?

If it is just this particular source file, can you send/attach the file?

Also, can you run 'strace -f' of the failing command and attach the output?

Changed 6 years ago by simonmar

  • cc gcc 4.1.1 & ghc 6.4.2 removed
  • description modified

Changed 6 years ago by guest

I couldn't start up ghci on its own. I just reinstalled ghc-bin, so after I get ghc reinstalled in a few hours, I'll run strace -f ghci. As for the cabal package, I'm using cabal-1.1.4, and the Setup.lhs is:

#!/usr/bin/runhaskell
> module Main where
> import Distribution.Simple
> main :: IO ()
> main = defaultMain

Changed 6 years ago by guest

The results of strace are here: http://www.contrib.andrew.cmu.edu/~mjrosenb/ghci.txt.gz

To anyone who's looking to read through that, I would suggest stripping the SIGALRMs first. There's nearly a meg of them.

I think I know what's going wrong. It may be a problem with ghc, and it may be a problem with gcc. When I compile with CFLAGS=-O3, I get these errors, and when I compile with CFLAGS=-O, it starts up normally (it still gives a meg of SIGALRMs though. Does anyone know why?). I think I'll report this on the gcc side as well.

Also, is there any way to just attach a file, without having to link it in from my personal site?
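
For readers who want to follow the suggestion of stripping the SIGALRM noise before reading the trace, here is a minimal sketch in C. It assumes the log has already been decompressed and that every line to be dropped contains the literal text "SIGALRM"; the function names mentions_sigalrm and strip_sigalrm are made up for this illustration and are not part of strace or GHC.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: nonzero if a line of strace output mentions SIGALRM. */
int mentions_sigalrm(const char *line) {
    return strstr(line, "SIGALRM") != NULL;
}

/* Hypothetical helper: copy 'in' to 'out', dropping every line that
 * mentions SIGALRM. Returns the number of lines dropped. */
long strip_sigalrm(FILE *in, FILE *out) {
    char *line = NULL;   /* buffer managed by getline() */
    size_t cap = 0;
    long dropped = 0;

    while (getline(&line, &cap, in) != -1) {
        if (mentions_sigalrm(line))
            dropped++;
        else
            fputs(line, out);
    }
    free(line);
    return dropped;
}
```

Calling strip_sigalrm(stdin, stdout) from a small main and redirecting the decompressed log into it would produce the filtered trace. Other lines associated with signal delivery that do not mention SIGALRM by name are left in place by this sketch.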

Changed 6 years ago by simonmar

The SIGALRMs are normal, that's just GHC's interval timer (we shouldn't really be using SIGALRM, but that's another issue).

It seems that GHC is just allocating all the memory, and I don't know why. It's odd that changing CFLAGS makes a difference: we don't actually use CFLAGS in our build system, so I have no idea what effect setting it has.

The next step for debugging is to compile GHC with debugging support, and generate some debugging logs. In your build tree:

$ cd ghc/compiler
$ rm stage2/ghc-*
$ make stage=2 EXTRA_HC_OPTS=-debug
$ ./stage2/ghc-inplace --interactive +RTS -Ds -Sstderr 2>&1 | tee log

and send/attach the log file. (you should be able to attach files using the "Attach File" button above).

Changed 6 years ago by dcoutts@…

The Gentoo GHC ebuild does partially use the user's CFLAGS settings.

It uses a filtered set of CFLAGS for building the .c parts of GHC, i.e. the rts, and an even more filtered set for building the .hs parts. -O2 and -O3 are currently allowed for the .c files. -O3 is only allowed if the user is running the testing ~x86 / ~amd64 profiles. For 'stable' x86 and amd64 profiles only -O2 is allowed for the .c files.

For building the .hs files we only pass through abi flags to gcc, eg things like -mcpu=ultrasparc.

So my suspicion in this case is that the bug is triggered by using -O3 in CFLAGS when emerging dev-lang/ghc. If so it will probably be due to the rts being compiled with gcc -O3. So either the rts is breaking some strict rule that gcc is relying on for correct optimisation (eg strict aliasing), or there is a mis-compilation bug in gcc-4.1.1.

So my recommendation to the original bug reporter is to try re-emerging with -O2 rather than -O3 in CFLAGS and see if the problem persists. It would be helpful to know if that does fix the problem, as if it does we'll change the ghc ebuild to downgrade -O3 to -O2 to prevent this problem. BTW, in future it would be better to report this kind of bug in the Gentoo bugzilla first (so we can investigate if it's due to using an unstable build environment and/or settings).

It might be interesting to try building the rts with gcc-4.1.1 and look to see if it warns about any strict aliasing issues.

Changed 6 years ago by guest

So I recompiled with -O2, and I still have the same problem. However, I do have some other flags in CFLAGS, such as

-fomit-frame-pointer -fgcse -fgcse-lm -fgcse-las

I don't know if you pass those into gcc for anything, but they're there.

Changed 6 years ago by dcoutts@…

Could you please open a bug at bugs.gentoo.org and specify exactly what CFLAGS you used to build and any other details you think might be relevant, including gcc version etc.?

The Gentoo ebuild does use a filtered set of CFLAGS when compiling the ghc rts.

Changed 6 years ago by dcoutts@…

  • keywords gcc-4 added
  • component changed from Compiler to Runtime System
  • architecture changed from Unknown to x86
  • summary changed from memory leak to building with gcc-4.1.1 causes ghc to enter infinite allocation loop

http://bugs.gentoo.org/show_bug.cgi?id=135651

I can reproduce this bug using a stable gentoo x86 profile with gcc-4.1.1 and plain CFLAGS="-O2 -pipe".

The symptoms are the same: after compiling ghc-6.4.2 using ghc-6.4.2 with gcc-4.1.1, running the freshly built ghc-6.4.2 gets part way through initialisation and then goes into an infinite loop, allocating more and more memory (in the usual 1 MB MBlocks) until the kernel refuses to allocate any more and the rts terminates.

I'm just testing with gcc-4.1.0 to see if it's present there too.

So it looks like this isn't just a bad combination of aggressive CFLAGS but is going to be a real problem once gcc-4.x gets more popular.

So the 6.4.3 milestone seems appropriate; it'd be good to investigate it more thoroughly in that timeframe.

My suspicion is that gcc-4.1.1 is mis-compiling part of the rts, that or the rts is breaking some strict C rule that gcc is now relying upon (eg aliasing). So I'm changing the summary and component to match.

Changed 6 years ago by dcoutts@…

  • summary changed from building with gcc-4.1.1 causes ghc to enter infinite allocation loop to building with gcc-4.1.x causes ghc to enter infinite allocation loop

Also reproducible with gcc-4.1.0.

Changed 6 years ago by simonmar

  • status changed from new to closed
  • resolution set to fixed

Now fixed. Annoyingly I'd already come across this bug in the HEAD and forgotten about it. The workaround is to compile GC.c in the RTS with -fno-strict-aliasing.
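
As background on why -fno-strict-aliasing can change behaviour, here is a minimal sketch in C of the general class of problem. It is not code from GC.c, and the ticket does not say which lines in GC.c were affected; the function names are hypothetical. gcc enables -fstrict-aliasing at -O2 and above, which lets it assume that pointers to incompatible types (here uint32_t and float) never refer to the same memory.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration, NOT taken from GHC's rts.
 *
 * Under strict aliasing the compiler may assume 'word' and 'tag' never
 * point at the same memory, because their types are incompatible. It is
 * therefore allowed to return the constant 1 without reloading *word.
 * Calling this with both pointers aimed at the same object is undefined
 * behaviour in ISO C; -fno-strict-aliasing makes gcc drop the assumption. */
uint32_t store_then_reload(uint32_t *word, float *tag) {
    *word = 1;
    *tag = 0.0f;
    return *word;
}

/* A well-defined variant: the second store goes through memcpy, which is
 * allowed to alias any object, so the compiler must reload *word.
 * Assumes sizeof(float) <= the size of the object 'tag' points at. */
uint32_t store_then_reload_bytes(uint32_t *word, void *tag) {
    float zero = 0.0f;
    *word = 1;
    memcpy(tag, &zero, sizeof zero);
    return *word;
}
```

With distinct objects both functions return 1. If store_then_reload_bytes is passed the same address twice, on a platform with 32-bit IEEE 754 floats, it returns 0, because the bytes of 0.0f overwrite the stored 1. The first function makes no such guarantee when its arguments overlap, which is the kind of assumption that code reinterpreting memory as different types can trip over.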
