Ticket #2169 (closed bug: fixed)

Opened 5 years ago

Last modified 5 years ago

Compilation fails first time without giving an error, later succeeds without changing code

Reported by: nccb
Owned by:
Priority: normal
Milestone: 6.10.1
Component: Compiler
Version: 6.8.2
Keywords:
Cc:
Operating System: Linux
Architecture: x86_64 (amd64)
Type of failure:
Difficulty: Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description

One particular module in our Haskell project causes GHC to behave oddly. We compile with ghc --make Main, and compilation of this module often fails on the first attempt. No errors are reported, only a lot of warnings of the form "Defined but not used: ..." (we are *not* using -Werror anywhere). All of these warnings concern unused pattern bindings (e.g. foo x y = x gives an unused warning about y) -- even though we use neither -fwarn-unused-matches nor -Wall anywhere.

If you then try to execute the same command several times, it usually succeeds on around the fourth attempt, without altering anything whatsoever. This bug is persistent -- it happens often, usually from a clean start (of compiling the whole project).

The command used for GHC is:

ghc $(GHC_OPTS) -o tock$(EXEEXT) --make Main -odir obj -hidir obj

where:

GHC_OPTS = \
	-fglasgow-exts \
	-fwarn-deprecations \
	-fwarn-duplicate-exports \
	-fwarn-incomplete-record-updates \
	-fwarn-missing-fields \
	-fwarn-missing-methods \
	-fwarn-missing-signatures \
	-fwarn-overlapping-patterns \
	-fwarn-simple-patterns \
	-fwarn-type-defaults \
	-fwarn-unused-binds \
	-fwarn-unused-imports \
	-ibackends -ichecks -icommon -idata -iflow -ifrontends -ipass -itransformations \
	-v -dcore-lint -XUndecidableInstances -fwarn-tabs -fwarn-monomorphism-restriction

(The -v and -dcore-lint flags were added to try and track down the bug). Output from the failed compilation is as follows:

compile: input file flow/FlowGraph.hs
*** Checking old interface for main:FlowGraph:
[42 of 51] Compiling FlowGraph        ( flow/FlowGraph.hs, obj/FlowGraph.o )
*** Parser:
*** Renamer/typechecker:

flow/FlowGraph.hs:274:10: Warning: Defined but not used: `m'

flow/FlowGraph.hs:322:26: Warning: Defined but not used: `m'

flow/FlowGraph.hs:334:40: Warning: Defined but not used: `m'

flow/FlowGraph.hs:339:36: Warning: Defined but not used: `m'

flow/FlowGraph.hs:339:38: Warning: Defined but not used: `rep'

flow/FlowGraph.hs:357:31:
    Warning: Defined but not used: `outStartA'

flow/FlowGraph.hs:375:27: Warning: Defined but not used: `m'

flow/FlowGraph.hs:383:26: Warning: Defined but not used: `str'

flow/FlowGraph.hs:383:31: Warning: Defined but not used: `route'

flow/FlowGraph.hs:391:27: Warning: Defined but not used: `m'

flow/FlowGraph.hs:398:26: Warning: Defined but not used: `m'

flow/FlowGraph.hs:415:49: Warning: Defined but not used: `m'

flow/FlowGraph.hs:419:19: Warning: Defined but not used: `pId'

flow/FlowGraph.hs:419:24: Warning: Defined but not used: `nStart'

flow/FlowGraph.hs:419:32: Warning: Defined but not used: `nEnd'

flow/FlowGraph.hs:419:46: Warning: Defined but not used: `m'

flow/FlowGraph.hs:430:19: Warning: Defined but not used: `pId'

flow/FlowGraph.hs:430:24: Warning: Defined but not used: `nStart'

flow/FlowGraph.hs:430:32: Warning: Defined but not used: `nEnd'

flow/FlowGraph.hs:443:19: Warning: Defined but not used: `pId'

flow/FlowGraph.hs:443:24: Warning: Defined but not used: `nStart'

flow/FlowGraph.hs:443:32: Warning: Defined but not used: `nEnd'

flow/FlowGraph.hs:509:46: Warning: Defined but not used: `m'

flow/FlowGraph.hs:557:20: Warning: Defined but not used: `m'
*** Deleting temp files:
Deleting: /tmp/ghc2747_0/ghc2747_0.s
Warning: deleting non-existent /tmp/ghc2747_0/ghc2747_0.s
Upsweep partially successful.
*** Deleting temp files:
Deleting: 
link(batch): upsweep (partially) failed OR
   Main.main not exported; not linking.
*** Deleting temp files:
Deleting: /tmp/ghc2747_0/ghc2747_1.hscpp /tmp/ghc2747_0/ghc2747_0.hscpp
*** Deleting temp dirs:
Deleting: /tmp/ghc2747_0
make[1]: *** [tock] Error 1
make[1]: Leaving directory `/home/neil/work/tock/branches/ghc-bug'
make: *** [all] Error 2

When the compilation later succeeds the output is as follows:

compile: input file flow/FlowGraph.hs
*** Checking old interface for main:FlowGraph:
[42 of 51] Compiling FlowGraph        ( flow/FlowGraph.hs, obj/FlowGraph.o )
*** Parser:
*** Renamer/typechecker:
*** Desugar:
    Result size = 11538
*** Core Linted result of Desugar:
*** Simplify:
    Result size = 8648
*** Core Linted result of Simplifier phase 0, iteration 1 out of 4:
    Result size = 7809
*** Core Linted result of Simplifier phase 0, iteration 2 out of 4:
    Result size = 7797
*** Core Linted result of Simplifier phase 0, iteration 3 out of 4:
    Result size = 7797
*** Core Linted result of Simplify phase 0 done:
*** Tidy Core:
    Result size = 7959
*** Core Linted result of Tidy Core:
writeBinIface: 37 Names
writeBinIface: 167 dict entries
*** CorePrep:
    Result size = 9571
*** Core Linted result of CorePrep:
*** Stg2Stg:
*** CodeGen:
*** CodeOutput:
*** Assembler:
gcc -march=athlon64 -Wa,--noexecstack -Iflow -c /tmp/ghc3185_0/ghc3185_0.s -o obj/FlowGraph.o
*** Deleting temp files:
Deleting: /tmp/ghc3185_0/ghc3185_0.s

The code in question is from http://offog.org/darcs/tock/ (version as of 22/3/08 4pm) -- I don't think it would be easy to isolate the problem. The module in question (flow/FlowGraph.hs) can be found there.

Attachments

ghc-bug-tock.tar.bz2 (223.4 KB) - added by nccb 5 years ago.
Tarball of the sources to reproduce the problem

Change History

Changed 5 years ago by ajd

I darcs get'd the mentioned source tree and couldn't reproduce the bug (albeit on 32-bit Linux). I did get a type error on the "frontends/ParseOccam" module, though...

Changed 5 years ago by igloo

  • difficulty set to Unknown
  • milestone set to 6.8.3

Thanks for the report!

You are actually using -Werror in the generated data/OrdAST.hs. This shouldn't cause compilation of FlowGraph to fail, but I suspect it is the culprit nevertheless. You should be able to confirm that by removing the -Werror and seeing whether that fixes it.

Can you please make a tarball of just Haskell source files, so that we only need to run a ghc --make ... command to reproduce the problem?

It would also make our lives a lot easier if you could remove dependencies on non-boot-libraries, and make the example as small as possible. The easiest way to do both of these is to replace function definitions with undefined and then remove unused functions, checking that the problem hasn't fixed itself as you go.

Thanks

Ian

Changed 5 years ago by nccb

Sorry about that - I had forgotten about the auto-generated modules having custom errors. I think you have spotted it; FlowGraph is compiled directly after OrdAST, and it seems that sometimes the per-module settings from OrdAST have "bled through" into FlowGraph (we are using -fwarn-unused-matches in OrdAST, sorry for not noticing that in the original report). What seems especially odd is that FlowGraph can fail a second time, when OrdAST has already been compiled (some problem with temporary files?). I'll try and trim it down into a neater testcase.

Changed 5 years ago by nccb

Tarball of the sources to reproduce the problem

Changed 5 years ago by nccb

I'm having real trouble slimming down the testcase. If I remove modules from the build process, the error goes away. If I remove some functions, it also disappears. Here is what I am fairly certain of:

* Options in an OPTIONS_GHC pragma in data/OrdAST.hs are being applied to flow/FlowGraph.hs, as long as it is the next module in the build process for --make.

* The error occurs even though flow/FlowGraph.hs doesn't even import data/OrdAST.hs

* The error always occurs if you add a function (and thus change the automatically-determined export list) in a module they both depend on, such as common/Utils.hs, and then recompile

* The error can occur even when data/OrdAST.hs is not recompiled (presumably it is still scanned for the module dependencies)

* The error seems to disappear when data/OrdAST.hs exports a function; the error only occurs when data/OrdAST.hs exports nothing at all, or exports only type-class instances (for types not in its view).

* Removing module imports from Main that are normally compiled before data/OrdAST.hs and flow/FlowGraph.hs, and that are directly imported by neither, can make the error go away.

* The error seems to only occur if both modules import *qualified* AST, but as different things (e.g. import qualified AST, versus import qualified AST as A). I think every other module in the code imports AST as A, so that may be a bit of a confounding factor too.

* If I move all the modules from their subdirectories into the main directory, the problem disappears

I had suspected the problem was to do with the orphan instance declarations in data/OrdAST.hs, but the error seems to occur even with the instance declarations removed. My best guess now is that the problem involves some combination of the qualified imports, the module exporting nothing, and the OPTIONS_GHC pragma, and that it is very specific to the build order.

I've attached the tarball. Extract the files and run "sh compile.sh". If that doesn't produce the error immediately, let the compilation finish and then in common/Utils.hs, uncomment/comment (it's the toggling that does it) the two "foo" lines near the top of the file. Then re-run compile.sh. On my machine this will then produce the problem every time. If you re-run compile.sh until compilation succeeds, you can then toggle the lines in common/Utils.hs again to produce the problem again.
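
For readers unfamiliar with per-module options, here is a minimal sketch of the kind of pragma being discussed. The file is a hypothetical stand-in for data/OrdAST.hs, not the real module, and foo is the illustrative function from the description. An OPTIONS_GHC pragma is meant to affect only the file that contains it, so the unused-match warning for y should appear for this file alone; the symptom in this ticket is that the same warnings showed up for flow/FlowGraph.hs, which carries no such pragma.

```haskell
{-# OPTIONS_GHC -fwarn-unused-matches #-}
-- Hypothetical stand-in for a generated module such as data/OrdAST.hs.
-- The pragma above enables unused-match warnings for this file only.
-- (Adding -Werror to the pragma would turn the warning below into a failure.)

-- 'y' is bound but never used, so GHC should warn:
--   Defined but not used: `y'
foo :: Int -> Int -> Int
foo x y = x

main :: IO ()
main = print (foo 1 2)
```

Compiling a second file without the pragma, containing the same definition, should produce no such warning unless the flag is also given on the command line.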

Changed 5 years ago by igloo

Thanks for the tarball, and your investigations!

Unfortunately, using a Debian-compiled ghc:

$ ghc --version                   
The Glorious Glasgow Haskell Compilation System, version 6.8.2

the first time, and after the first and second toggles, I just get this:

[41 of 51] Compiling OrdAST           ( data/OrdAST.hs, obj/OrdAST.o )
[42 of 51] Compiling FlowGraph        ( flow/FlowGraph.hs, obj/FlowGraph.o )
[43 of 51] Compiling UsageCheckUtils  ( checks/UsageCheckUtils.hs, obj/UsageCheckUtils.o )

This is a really irritating bug. I've seen it before, but never been able to get a reproducible testcase.

If you unpack the tarball in a different directory, with different length, can you still reproduce it?

Do you remember where your GHC binary came from?

Changed 5 years ago by nccb

If I extract the tarball to a fresh new directory, I get the bug first-time. I'm not quite sure what you mean by "different length". My GHC is installed via Gentoo's package manager. I seem to remember it works by downloading a binary bootstrap of the latest GHC, then compiling the latest GHC with it for my machine in specific. It should therefore be a native x86_64 build. I don't have any particularly crazy flags set for it (not sure if the CFLAGS would be relevant anyway, but they are -O2 -march=athlon64).

$ ghc -v       
Glasgow Haskell Compiler, Version 6.8.2, for Haskell 98, stage 2 booted by GHC version 6.8.2
Using package config file: /usr/lib64/ghc-6.8.2/package.conf
wired-in package base mapped to base-3.0.1.0
wired-in package rts mapped to rts-1.0
wired-in package haskell98 mapped to haskell98-1.0.1.0
wired-in package template-haskell mapped to template-haskell-2.2.0.0
wired-in package ndp not found.
Hsc static flags: -static

Changed 5 years ago by simonmar

I can't reproduce it either (on x86).

I looked at the code in GHC and can't see any obvious bugs in this area. The way that the bug seems to be hard to reproduce indicates that it is based on something fragile - perhaps some runtime mutation with unsafePerformIO, or some strange RTS bug. But it looks like it might only be reproducible on x86_64 - Ian, do you have a 64-bit box to test it on? I don't have easy access to one right now.

Changed 5 years ago by igloo

My tests higher up were on amd64/Linux (Debian). Unfortunately, Duncan's amd64/Linux (Gentoo) box is broken at the moment, so he can't try it there.

Changed 5 years ago by igloo

  • milestone changed from 6.8.3 to _|_

We really can't do anything about this without a way to reproduce it, so I'm putting it in the _|_ milestone.

Changed 5 years ago by igloo

  • milestone changed from _|_ to 6.10.1

Here's main/HeaderInfo.hs:getOptionsFromFile with some extra debugging prints:

getOptionsFromFile :: DynFlags
                   -> FilePath            -- input file
                   -> IO [Located String] -- options, if any
getOptionsFromFile dflags filename
    = Control.Exception.bracket
          (openBinaryFile filename ReadMode)
              (hClose)
              (\handle ->
                   do buf <- hGetStringBufferBlock handle blockSize
                      opts <- loop handle buf
                      if showSDoc (ppr opts) == "[[-, X, C, P, P], [-, f, v, e, c, t, o, r, i, s, e]]"
                          then do print ("B4", filename)
                                  print ("B5", lexemeToString buf (len buf))
                                  putStrLn "B6"
                          else return ()
                      return opts)
    where blockSize = 1024
          loop handle buf
              | len buf == 0 = return []
              | otherwise
              = do
                 -- print ("B1", lexemeToString buf (len buf))
                 case getOptions' dflags buf filename of
                  (Nothing, opts) ->
                      do print ("B2", showSDoc (ppr opts))
                         return opts
                  (Just buf', opts) -> do print ("B3", showSDoc (ppr opts))
                                          nextBlock <- hGetStringBufferBlock handle blockSize
                                          newBuf <- appendStringBuffers buf' nextBlock
                                          if len newBuf == len buf
                                             then return opts
                                             else do opts' <- loop handle newBuf
                                                     return (opts++opts')

and here's some output while compiling dph:

V2
("B3","[[-, X, C, P, P]]")
("B2","[[-, f, v, e, c, t, o, r, i, s, e]]")
("B4","./Data/Array/Parallel/Lifted/PArray.hs")
("B5","{-# LANGUAGE CPP #-}{-# OPTIONS -fvectorise #-}\nmodule Data.Array.Parallel.Prelude.Int (\n  P.Int, (+), (-), (*), div, mod, intSquareRoot, enumFromToP, intSumP, \n  (==), (/=), (<=), (<), (>=), (>)\n) where\n\nimport Data.Array.Parallel.Prelude.Base\nimport Data.Array.Parallel.Prelude.Base.Int\n\nimport qualified Prelude as P\nimport Prelude (Int)\n\ninfixl 7 *\ninfixl 6 +, -\ninfix 4 ==, /=, <, <=, >, >=\n\n(==), (/=), (<), (<=), (>), (>=) :: Int -> Int -> P.Bool\n(==) = eq\n(/=) = neq\n(<) = lt\n(<=) = le\n(>) = gt\n(>=) = ge\n\n(*) :: Int -> Int -> Int\n(*) = mult\n\n(+) :: Int -> Int -> Int\n(+) = plus\n\n(-) :: Int -> Int -> Int\n(-) = minus\n\ndiv:: Int -> Int -> Int\ndiv = intDiv\n\nmod:: Int -> Int -> Int\nmod = intMod\n?1\NUL\STX+\NUL\NUL\NULw\NUL\SOH\NUL\NUL\NUL\NUL\NUL\NUL!\NUL\STX+\NUL\NUL\t\NUL0\NUL\STX+\NUL\NULIz\NUL\SOH\NUL\NUL\NUL\NUL\n\NUL!\NUL\STX+\NUL\NUL\NUL\NUL]\NUL\NUL\NUL\NUL\NUL!\NUL!\NUL\STX+\NUL\NUL!\NUL\NUL\NUL\NUL\NUL\NUL\NUL@\NUL]\NUL\NUL\NUL\NUL\NULQ\NUL!\NUL\STX+\NUL\NUL9\NUL!\NUL\STX+\NUL\NUL\NUL\674\NUL\NUL\NUL\NUL\NULx\NUL!\NUL\STX+\NUL\NULi\NUL!\NUL\STX+\NUL\NUL\NUL\738\NUL\NUL\NUL\NUL\NULk#\NUL\STX+\NUL\NUL\EOT\NUL\NUL\NUL\NUL\NUL\NUL\NUL{\STX\NUL\NUL\NUL\NUL\NUL\NUL|\STX\NUL\NUL\NUL\NUL\NUL\NUL|1\NUL\STX+\NUL\NULI\NUL+\NUL\STX+\NUL\NUL\NUL\NUL!\NUL\STX+\NUL\NUL\NUL\802\NUL\NUL\NUL\NUL\NULk#\NUL\STX+\NUL\NUL\EOT\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\STX\NUL\NUL\NUL\NUL\NUL\NUL\NUL\738\NUL\NUL\NUL\NUL\NULk#\NUL\STX+\NUL\NUL\EOT\NUL\NUL\NUL\NUL\NUL\NUL\NUL|\STX\NUL\NUL\NUL\NUL\NUL\NUL}\STX\NUL\NUL\NUL\NUL\NUL\NUL\NUL1\NUL\STX+\NUL\NUL\NULy\NUL\SOH\NUL\NUL\NUL\NUL\NUL\NUL!\NUL\STX+\NUL\NUL\NULh1\NUL\STX+\NUL\NUL9z\NUL\SOH\NUL\NUL\NUL\NUL\NUL\NUL!\NUL\STX+\NUL\NUL)J1\NUL\STX+\NUL\NUL\NULw\NUL\SOH\NUL\NUL\NUL\NUL\NUL\NUL!\NUL")
B6

So we've tried to read Data/Array/Parallel/Lifted/PArray.hs but instead we have got {-# LANGUAGE CPP #-} followed by the contents of Data/Array/Parallel/Prelude/Int.hs.

We don't know quite what's going on yet, but we're closer now, so I'm optimistically putting this bug back into the 6.10.1 milestone.

I think it's likely that #2240 is caused by the same thing.

Changed 5 years ago by simonmar

  • status changed from new to closed
  • resolution set to fixed

Thanks to Ian's investigations, I finally found the culprit.

Mon Jul  7 10:58:36 BST 2008  Simon Marlow <marlowsd@gmail.com>
  * FIX #1736, and probably #2169, #2240
  appendStringBuffer was completely bogus - the arguments to copyArray
  were the wrong way around, which meant that corruption was very likely
  to occur by overwriting the end of the buffer in the first argument.
  
  This definitely fixes #1736.  The other two bugs, #2169 and #2240 are
  harder to reproduce, but we can see how they could occur: in the case
  of #2169, the options parser is seeing the contents of an old buffer,
  and in the case of #2240, appendStringBuffer is corrupting an
  interface file in memory, since string buffers and interface files are
  both allocated in the pinned region of memory.
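
To make the argument-order point concrete, here is a small sketch of appending two byte buffers with copyArray from Foreign.Marshal.Array, which takes the destination first and the source second. This is not GHC's StringBuffer code: the Buf type and the helpers fromBytes, toBytes and appendBufs are hypothetical names invented for this illustration. Swapping the first two arguments of either copyArray call would write into one of the input buffers instead of the freshly allocated one, which is the kind of overwrite the patch description refers to.

```haskell
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrArray, withForeignPtr)
import Foreign.Marshal.Array (advancePtr, copyArray, peekArray, pokeArray)

-- Hypothetical simplified buffer: storage plus its length in bytes.
data Buf = Buf (ForeignPtr Word8) Int

-- Allocate a buffer and fill it with the given bytes.
fromBytes :: [Word8] -> IO Buf
fromBytes ws = do
  let n = length ws
  fp <- mallocForeignPtrArray n
  withForeignPtr fp $ \p -> pokeArray p ws
  return (Buf fp n)

-- Read the buffer contents back as a list.
toBytes :: Buf -> IO [Word8]
toBytes (Buf fp n) = withForeignPtr fp $ \p -> peekArray n p

-- Append two buffers into a new one.
-- copyArray takes DESTINATION first, then SOURCE, then the element count.
appendBufs :: Buf -> Buf -> IO Buf
appendBufs (Buf fp1 n1) (Buf fp2 n2) = do
  out <- mallocForeignPtrArray (n1 + n2)
  withForeignPtr out $ \dst ->
    withForeignPtr fp1 $ \src1 ->
      withForeignPtr fp2 $ \src2 -> do
        copyArray dst src1 n1                    -- first input at offset 0
        copyArray (dst `advancePtr` n1) src2 n2  -- second input after it
  return (Buf out (n1 + n2))

main :: IO ()
main = do
  a <- fromBytes [1, 2, 3]
  b <- fromBytes [4, 5]
  c <- appendBufs a b
  toBytes c >>= print
```

Neither input buffer is written to here, so a stale or neighbouring allocation cannot be clobbered by the append.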