Ticket #1843 (closed bug: fixed)

Opened 10 months ago

Last modified 5 months ago

ghc 6.8.1 broken on Mac OS X Leopard PPC

Reported by: guest
Assigned to:
Priority: high
Milestone: 6.8.2
Component: Compiler
Version: 6.8.1
Severity: critical
Keywords:
Cc: alfonso.acosta@gmail.com
Difficulty: Unknown
Test Case:
Architecture: powerpc
Operating System: MacOS X

Description

File fail.hs:

import Data.Word
import Data.Binary.Put
import qualified Data.ByteString.Lazy as B

main = B.putStrLn (runPut (putWord8 0x32))
nick@frost ~/P/growlnet> ghc -o fail fail.hs --make -O ; ./fail
Linking fail ...
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
fish: Job 1, “./fail” terminated by signal SIGSEGV (Address boundary error)

(note, 'fish' is the shell I use... http://fishshell.org/)

It also fails without -O, with -fasm, and with -fvia-C.

The GHC I'm using was built by scsibug, available at http://scsibug.com/

On #haskell, everyone else reports that this code works correctly for them (on different architectures).

-Nick Burlett nickburlett@mac.com

Attachments

ST__15.o (1.0 kB) - added by thorkilnaur on 11/25/07 04:14:03.
Smallest .o file producing the scattered message from the 10.5 Leopard ld
GHC_trac_1843.tar.gz (7.5 kB) - added by thorkilnaur on 11/26/07 23:02:18.
I3.hs and derived files that illustrate the problem

Change History

11/06/07 22:55:48 changed by scsibug

I've tried the code on 10.4/PPC and 10.5/x86 with the same build, no issues, but don't yet have 10.5/PPC to test with.

11/06/07 22:57:33 changed by guest

Additional info:

nick@frost ~/P/binary-0.4.1> ghc-pkg list
/usr/local/lib/ghc-6.8.1/package.conf:
    ALUT-2.1.0.0, Cabal-1.2.2.0, Crypto-4.0.3, GLUT-2.1.1.1,
    HUnit-1.2.0.0, OpenAL-1.3.1.1, OpenGL-2.2.1.1, QuickCheck-1.1.0.0,
    X11-1.2.3.1, array-0.1.0.0, base-3.0.0.0, binary-0.4.1,
    bytestring-0.9.0.1, cgi-3001.1.5.1, containers-0.1.0.0,
    directory-1.0.0.0, fgl-5.4.1.1, filepath-1.1.0.0, (ghc-6.8.1),
    haskell-src-1.0.1.1, haskell98-1.0.1.0, hpc-0.5.0.0, html-1.0.1.1,
    mtl-1.1.0.0, network-2.1.0.0, old-locale-1.0.0.0, old-time-1.0.0.0,
    packedstring-0.1.0.0, parallel-1.0.0.0, parsec-2.1.0.0,
    pretty-1.0.0.0, process-1.0.0.0, random-1.0.0.0, readline-1.0.1.0,
    regex-base-0.72.0.1, regex-compat-0.71.0.1, regex-posix-0.72.0.1,
    rts-1.0, stm-2.1.1.0, template-haskell-2.2.0.0, time-1.1.2.0,
    unix-2.2.0.0, xhtml-3000.0.2.1

11/07/07 10:52:26 changed by dons

  • difficulty set to Unknown.
  • summary changed from ghc 6.8.1 broken on Mac OS X Leopard to ghc 6.8.1 broken on Mac OS X Leopard PPC.

(follow-up: ↓ 9 ) 11/08/07 01:43:55 changed by ChrisKuklewicz

I attempted to make 6.8.1 on a G4 with OS X 10.5 and XCode 3.0.

The stage1 compiler seems to run, but then I got _many_ of those "unknown scattered relocation type 4" errors. The stage2 compiler that was installed simply segfaults when run.

If I tried to compile the extra src tarball at the same time then it would die in the parsec package during compilation. But that might be an unrelated error.

11/08/07 07:20:19 changed by igloo

  • milestone set to 6.8.2.

Thanks for the report!

11/08/07 07:27:52 changed by simonmar

  • priority changed from normal to high.

11/09/07 05:37:37 changed by ChrisKuklewicz

This *might* be related to bug #1845 on OS X 10.4 (Tiger) on ppc.

11/09/07 06:01:35 changed by guest

I don't have the above problem with OS X 10.4 (Tiger) on ppc with #1845, Christian

./fail
2

(but maybe other jumps are wrong on Leopard.)

Does anyone have a ghc-6.8.1 on ppc without #1845?

(in reply to: ↑ 4 ; follow-up: ↓ 10 ) 11/09/07 19:25:14 changed by mokus

Replying to ChrisKuklewicz:

I attempted to make 6.8.1 on a G4 with OS X 10.5 and XCode 3.0. The stage1 compiler seems to run, but then I got _many_ of those "unknown scattered relocation type 4" errors. The stage2 compiler that was installed simply segfaults when run. If I tried to compile the extra src tarball at the same time then it would die in the parsec package during compilation. But that might be an unrelated error.

I have had the same experience on my PowerBook G4 (building on Leopard using XCode 3 and ghc-6.6.1). Regarding the parsec build issue: based on the output when '-v' is added to the command-line that fails, the error seems to be related to the '-split-objs' option somehow making ld blow up. Removing '-O' from the command line apparently changes the output enough that it doesn't trigger the problem.

I have no hypothesis regarding the segfault issue, but I can confirm that when building either 6.8.1 or head, my stage1 builds pretend to work, but my stage2 compilers are deader than doornails. Anything I can do to provide more useful info?

(in reply to: ↑ 9 ) 11/09/07 20:06:41 changed by mokus

Replying to mokus:

I have no hypothesis regarding the segfault issue, but I can confirm that when building either 6.8.1 or head, my stage1 builds pretend to work, but my stage2 compilers are deader than doornails. Anything I can do to provide more useful info?

As I typed this, I realized I hadn't actually tested the stage1 ghc-inplace to confirm that it regularly produces vegetables. I have now done so. Simple 'hello world'-type stuff works, but just about nothing else does. Here's a pretty simple program that segfaults when run after being compiled by stage1/ghc-inplace:

{-
 -	"primes.hs"
 -}

module Main where

import System

primes = 2 : 3: 5 : (filter isPrime [7,9..])

-- x %= y: "x divides y"
x %= y = (y `rem` x) == 0

isPrime x
	| x <= 1		= False
	| isComposite x		= False
	| otherwise		= True
isComposite x = any (%= x) (takeWhile (\p -> p^2 <= x) primes)

main = do
	args <- getArgs
	let n = read (args !! 0)
	print (primes !! (n - 1))

Some interesting facts: "otool -r" knows of no relocations in the compiled file. There are 14 listed in the working version compiled with ghc-6.6.1. When I compiled with ghc-inplace, there were 8 of the infamous "unknown scattered relocation type 4" messages.

11/13/07 08:59:59 changed by ChrisKuklewicz

I installed the binary distribution prepared by Greg Heartsfield on OS X 10.5 on a G4 powerbook.

ghci starts (hooray). The compiler does not segfault.

I am compiling the simplest test.hs, which contains "main = return ()" or "main = undefined". I am compiling with "ghc -v5 -fasm test.hs" and "ghc -v5 -fvia-C test.hs". The same error occurs with both options.

The ghc compiler needs the linker (ld64) from gcc (the version from XCode 3.0) in the guise of the "/usr/libexec/gcc/powerpc-apple-darwin9/4.0.1/collect2" binary. This program is the one that prints the "unknown scattered relocation type 4" error.

A Google search for "unknown scattered relocation" turns up the Google cache copy of the source file /ld64/src/Readers/ObjectFileMachO.cpp.

So I would say the error is definitely due to the upgrading of gcc and the linker. If there is any diagnostic that I could do that would help, let me know at < haskell @at@ list .dot. mightyreason .dot. com >

11/21/07 09:56:46 changed by ChrisKuklewicz

The source code of ld64-77 used by OS X 10.5 shows that it produces the "unknown scattered relocation type 4" message, where the 4 is PPC_RELOC_HI16. Note that ld64-77 seems capable of writing scattered PPC_RELOC_HI16 relocations but not of reading them.

I am currently reading the binary ABI at http://developer.apple.com/documentation/DeveloperTools/Conceptual/MachORuntime/Reference/reference.html to see if it is going to enlighten me on this bug.

11/25/07 04:11:23 changed by thorkilnaur

On a PPC Mac OS X 10.5 Leopard, I can reproduce the basic problem:

thorkil-naurs-mac-mini:1843 thorkilnaur$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 6.8.1
thorkil-naurs-mac-mini:1843 thorkilnaur$ ghc --make Test1.hs
[1 of 1] Compiling Main             ( Test1.hs, Test1.o )
Linking Test1 ...
unknown scattered relocation type 4
unknown scattered relocation type 4
unknown scattered relocation type 4
thorkil-naurs-mac-mini:1843 thorkilnaur$ cat Test1.hs
main = putStr "Here I am for #1843 2007-Nov-18 10.32\n"
thorkil-naurs-mac-mini:1843 thorkilnaur$ 

This uses the binary distribution http://haskell.org/ghc/dist/6.8.1/maeder/ghc-6.8.1-powerpc-apple-darwin.tar.bz2 by Christian Maeder.

To figure out where these scattered messages come from, we do:

thorkil-naurs-mac-mini:1843 thorkilnaur$ rm Test1.o
thorkil-naurs-mac-mini:1843 thorkilnaur$ ghc --make Test1.hs -optl-t 2>&1 | awk '/scatter/{print l}{l=$0}'
/Users/thorkilnaur/tn/install/ghc-6.8.1/lib/ghc-6.8.1/lib/base-3.0.0.0/libHSbase-3.0.0.0.a(Conc__128.o)
/Users/thorkilnaur/tn/install/ghc-6.8.1/lib/ghc-6.8.1/lib/base-3.0.0.0/libHSbase-3.0.0.0.a(Conc__32.o)
/Users/thorkilnaur/tn/install/ghc-6.8.1/lib/ghc-6.8.1/lib/base-3.0.0.0/libHSbase-3.0.0.0.a(TopHandler__19.o)
thorkil-naurs-mac-mini:1843 thorkilnaur$ 

printing the line before each scattered message of the detailed (-optl-t) linker output. So it appears that the base library of this binary ghc-6.8.1 distribution contains compiled code that cannot be interpreted properly by the Leopard ld linker.
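The awk filter above can be sketched in Haskell as well, for anyone who wants to post-process the linker trace further. This is only an illustration: the function name `linesBeforeScatter` is made up here, and the program assumes the combined stdout/stderr of the link step arrives on standard input.

```haskell
import Data.List (isInfixOf)

-- | For every line containing "scatter", return the line printed just
-- before it. Mirrors awk '/scatter/{print l}{l=$0}', where l starts empty.
-- (Hypothetical helper, not part of GHC or of the ticket.)
linesBeforeScatter :: [String] -> [String]
linesBeforeScatter ls =
  [ prev | (prev, cur) <- zip ("" : ls) ls, "scatter" `isInfixOf` cur ]

main :: IO ()
main = interact (unlines . linesBeforeScatter . lines)
```

Saved under a hypothetical name such as Culprits.hs, it would be used as `ghc --make Test1.hs -optl-t 2>&1 | runghc Culprits.hs`.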

To get an easier grip on the situation, I have split libHSbase-3.0.0.0.a into separate .o files and checked each, using /usr/bin/ld -r, to see whether the scattered message is produced. 94 of the roughly 9800 files in this library have this problem. For example:

thorkil-naurs-mac-mini:ar thorkilnaur$ /usr/bin/ld -r Conc__128.o 
unknown scattered relocation type 4
thorkil-naurs-mac-mini:ar thorkilnaur$ 

If I try this on a PPC Mac OS X 10.4 Tiger, the problem disappears:

Thorkil-Naurs-Computer:~/tn/test/GHC/MacOSX/10.5Leopard/GHC6.8.1OnPPCOSX10.5Leopard/work/ar thorkilnaur$ /usr/bin/ld -r Conc__128.o 
Thorkil-Naurs-Computer:~/tn/test/GHC/MacOSX/10.5Leopard/GHC6.8.1OnPPCOSX10.5Leopard/work/ar thorkilnaur$

This indicates, as others have pointed out earlier, that some change between 10.4 and 10.5 has caused the interpretation of .o files to be changed, perhaps erroneously.

To get more information about what goes on, I have selected the smallest of the .o files with this problem (it is ST__15.o) and used the otool (suggested earlier by mokus, thanks) to dissect it:

thorkil-naurs-mac-mini:ar thorkilnaur$ /usr/bin/ld -r ST__15.o 
unknown scattered relocation type 4
thorkil-naurs-mac-mini:ar thorkilnaur$ otool -r ST__15.o 
ST__15.o:
Relocation information (__TEXT,__text) 9 entries
address  pcrel length extern type    scattered symbolnum/value
00000020 0     2      n/a    8       1         0x00000010
00000000 0     2      n/a    1       1         0x00000000
0000001c 1     2      1      3       0         7
00000018 0     2      n/a    5       1         0x0000002c
00000000 0     2      0      1       0         16777215
00000014 0     2      n/a    4       1         0x0000002c
0000002d 0     2      0      1       0         16777215
00000000 0     2      n/a    8       1         0x00000024
00000000 0     2      n/a    1       1         0x00000010
Relocation information (__DATA,__const) 2 entries
address  pcrel length extern type    scattered symbolnum/value
00000004 0     2      0      0       0         3
00000000 0     2      1      0       0         6
Relocation information (__DATA,__data) 6 entries
address  pcrel length extern type    scattered symbolnum/value
00000018 0     2      0      0       0         1
00000010 0     2      n/a    0       1         0x00000044
0000000c 0     2      1      0       0         8
00000008 0     2      1      0       0         9
00000004 0     2      1      0       0         10
00000000 0     2      1      0       0         5
thorkil-naurs-mac-mini:ar thorkilnaur$ otool -rv ST__15.o 
ST__15.o:
Relocation information (__TEXT,__text) 9 entries
address  pcrel length extern type    scattered symbolnum/value
00000020 False long   n/a    SECTDIF True      0x00000010
         False long   n/a    PAIR    True      0x00000000
0000001c True  long   True   BR24    False     _base_GHCziBase_zddmfail_info
00000018 False long   n/a    LO16    True      0x0000002c
         False long   False  PAIR    False     half = 0x0000
00000014 False long   n/a    HI16    True      0x0000002c
         False long   False  PAIR    False     half = 0x002d
00000000 False long   n/a    SECTDIF True      0x00000024
         False long   n/a    PAIR    True      0x00000010
Relocation information (__DATA,__const) 2 entries
address  pcrel length extern type    scattered symbolnum/value
00000004 False long   False  VANILLA False     3 (__DATA,__data)
00000000 False long   True   VANILLA False     _base_GHCziBase_zddmfail_closure
Relocation information (__DATA,__data) 6 entries
address  pcrel length extern type    scattered symbolnum/value
00000018 False long   False  VANILLA False     1 (__TEXT,__text)
00000010 False long   n/a    VANILLA True      0x00000044
0000000c False long   True   VANILLA False     _base_GHCziST_return_closure
00000008 False long   True   VANILLA False     _base_GHCziST_zgzg_closure
00000004 False long   True   VANILLA False     _base_GHCziST_zgzgze_closure
00000000 False long   True   VANILLA False     _base_GHCziBase_ZCDMonad_static_info
thorkil-naurs-mac-mini:ar thorkilnaur$ otool -tv ST__15.o 
ST__15.o:
(__TEXT,__text) section
_base_GHCziST_fail_info_dsp:
00000000	.long 0x00000014
00000004	.long 0x00050001
00000008	.long 0x00000000
0000000c	.long 0x000f0003
_base_GHCziST_fail_info:
00000010	or	r16,r15,r15
00000014	lis	r15,0x0
00000018	ori	r15,r15,0x2d
0000001c	b	0x0
00000020	.long 0x00000010
thorkil-naurs-mac-mini:ar thorkilnaur$ otool -tV ST__15.o 
ST__15.o:
(__TEXT,__text) section
_base_GHCziST_fail_info_dsp:
00000000	.long 0x00000014
00000004	.long 0x00050001
00000008	.long 0x00000000
0000000c	.long 0x000f0003
_base_GHCziST_fail_info:
00000010	or	r16,r15,r15
00000014	lis	r15,hi16(_base_GHCziST_zdf2_closure+0x1)
00000018	ori	r15,r15,lo16(_base_GHCziST_zdf2_closure+0x1)
0000001c	b	_base_GHCziBase_zddmfail_info
00000020	.long 0x00000010
thorkil-naurs-mac-mini:ar thorkilnaur$ 

The scattered relocation that ld complains about is presumably this one:

00000014 False long   n/a    HI16    True      0x0000002c

addressing the code line:

00000014	lis	r15,hi16(_base_GHCziST_zdf2_closure+0x1)

(Wild guess: Some pointer tagging connection here?) I have attached the ST__15.o file.
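To read the numeric `otool -r` listing without switching to `-rv`, the type column can be mapped to names. The sketch below covers only the numbers that can be matched up between the two listings above (numeric against symbolic); other PPC relocation types exist but are deliberately left out rather than guessed. The names `relocName` and `describe` are invented for this example.

```haskell
import Data.Maybe (fromMaybe)

-- | PPC Mach-O relocation type numbers, restricted to those that the
-- otool -r and otool -rv listings for ST__15.o allow us to pair up.
relocName :: Int -> Maybe String
relocName n = lookup n
  [ (0, "VANILLA")
  , (1, "PAIR")
  , (3, "BR24")
  , (4, "HI16")     -- the type the Leopard ld complains about
  , (5, "LO16")
  , (8, "SECTDIF")
  ]

-- | Printable name, with a fallback for numbers not seen in the listings.
describe :: Int -> String
describe n = fromMaybe ("type " ++ show n ++ " (not in the listings)") (relocName n)

main :: IO ()
main =
  -- the type column of the (__TEXT,__text) entries, top to bottom
  mapM_ (putStrLn . describe) [8, 1, 3, 5, 1, 4, 1, 8, 1]
```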

My best suggestion from this information would be to approach Apple, reporting an error, but I may be wrong here. So please, experts, state whether you agree with this.

If there is agreement that reporting the problem to Apple is a good idea, I would also need some help to do that.

But there is more: One of the things that I tried was building a recent HEAD (that would be 6.9.something) with the binary http://haskell.org/ghc/dist/6.6.1/ghc-6.6.1-powerpc-apple-darwin.tar.bz2 by Christian Maeder. In this process, an apparently working stage1/ghc-inplace was produced, but at some point, the scattered messages started to appear:

/usr/bin/ld -x -r -o dist/build/HSbase-3.0.o  dist/build/Data/Generics.o dist/build/Data/Generics/Aliases.o dist/build/Data/Generics/Basics.o dist/build/Data/Generics/Instances.o dist/build/Data/Generics/Schemes.o dist/build/Data/Generics/Text.o dist/build/Data/Generics/Twins.o dist/build/Foreign/Concurrent.o dist/build/GHC/Arr.o dist/build/GHC/Base.o dist/build/GHC/Conc.o dist/build/GHC/ConsoleHandler.o dist/build/GHC/Desugar.o dist/build/GHC/Dotnet.o dist/build/GHC/Enum.o dist/build/GHC/Environment.o dist/build/GHC/Err.o dist/build/GHC/Exception.o dist/build/GHC/Exts.o dist/build/GHC/Float.o dist/build/GHC/ForeignPtr.o dist/build/GHC/Handle.o dist/build/GHC/IO.o dist/build/GHC/IOBase.o dist/build/GHC/Int.o dist/build/GHC/List.o dist/build/GHC/Num.o dist/build/GHC/PArr.o dist/build/GHC/Pack.o dist/build/GHC/PrimopWrappers.o dist/build/GHC/Ptr.o dist/build/GHC/Read.o dist/build/GHC/Real.o dist/build/GHC/ST.o dist/build/GHC/STRef.o dist/build/GHC/Show.o dist/build/GHC/Stable.o dist/build/GHC/Storable.o dist/build/GHC/TopHandler.o dist/build/GHC/Unicode.o dist/build/GHC/Weak.o dist/build/GHC/Word.o dist/build/System/Timeout.o dist/build/Control/Applicative.o dist/build/Control/Arrow.o dist/build/Control/Category.o dist/build/Control/Concurrent.o dist/build/Control/Concurrent/Chan.o dist/build/Control/Concurrent/MVar.o dist/build/Control/Concurrent/QSem.o dist/build/Control/Concurrent/QSemN.o dist/build/Control/Concurrent/SampleVar.o dist/build/Control/Exception.o dist/build/Control/Monad.o dist/build/Control/Monad/Fix.o dist/build/Control/Monad/Instances.o dist/build/Control/Monad/ST.o dist/build/Control/Monad/ST/Lazy.o dist/build/Control/Monad/ST/Strict.o dist/build/Data/Bits.o dist/build/Data/Bool.o dist/build/Data/Char.o dist/build/Data/Complex.o dist/build/Data/Dynamic.o dist/build/Data/Either.o dist/build/Data/Eq.o dist/build/Data/Fixed.o dist/build/Data/Foldable.o dist/build/Data/Function.o dist/build/Data/HashTable.o dist/build/Data/IORef.o 
dist/build/Data/Int.o dist/build/Data/Ix.o dist/build/Data/List.o dist/build/Data/Maybe.o dist/build/Data/Monoid.o dist/build/Data/Ord.o dist/build/Data/Ratio.o dist/build/Data/STRef.o dist/build/Data/STRef/Lazy.o dist/build/Data/STRef/Strict.o dist/build/Data/String.o dist/build/Data/Traversable.o dist/build/Data/Tuple.o dist/build/Data/Typeable.o dist/build/Data/Unique.o dist/build/Data/Version.o dist/build/Data/Word.o dist/build/Debug/Trace.o dist/build/Foreign.o dist/build/Foreign/C.o dist/build/Foreign/C/Error.o dist/build/Foreign/C/String.o dist/build/Foreign/C/Types.o dist/build/Foreign/ForeignPtr.o dist/build/Foreign/Marshal.o dist/build/Foreign/Marshal/Alloc.o dist/build/Foreign/Marshal/Array.o dist/build/Foreign/Marshal/Error.o dist/build/Foreign/Marshal/Pool.o dist/build/Foreign/Marshal/Utils.o dist/build/Foreign/Ptr.o dist/build/Foreign/StablePtr.o dist/build/Foreign/Storable.o dist/build/Numeric.o dist/build/Prelude.o dist/build/System/Console/GetOpt.o dist/build/System/CPUTime.o dist/build/System/Environment.o dist/build/System/Exit.o dist/build/System/IO.o dist/build/System/IO/Error.o dist/build/System/IO/Unsafe.o dist/build/System/Info.o dist/build/System/Mem.o dist/build/System/Mem/StableName.o dist/build/System/Mem/Weak.o dist/build/System/Posix/Internals.o dist/build/System/Posix/Types.o dist/build/Text/ParserCombinators/ReadP.o dist/build/Text/ParserCombinators/ReadPrec.o dist/build/Text/Printf.o dist/build/Text/Read.o dist/build/Text/Read/Lex.o dist/build/Text/Show.o dist/build/Text/Show/Functions.o dist/build/Unsafe/Coerce.o `find dist/build -name "*_stub.o" -print` dist/build/cbits/PrelIOUtils.o dist/build/cbits/WCsubst.o dist/build/cbits/Win32Utils.o dist/build/cbits/consUtils.o dist/build/cbits/dirUtils.o dist/build/cbits/inputReady.o dist/build/cbits/lockFile.o dist/build/cbits/longlong.o dist/build/cbits/selectUtils.o
ar: creating archive dist/build/libHSbase-3.0.a
unknown scattered relocation type 4

It seems that this is the first time that the linker is being asked to link some code that has actually been produced by the stage1/ghc-inplace compiler.

Further, I found:

thorkil-naurs-mac-mini:base thorkilnaur$ /usr/bin/ld -x -r -o dist/build/HSbase-3.0.o dist/build/GHC/ForeignPtr.o
unknown scattered relocation type 4
thorkil-naurs-mac-mini:base thorkilnaur$ rm dist/build/GHC/ForeignPtr.o
thorkil-naurs-mac-mini:base thorkilnaur$ ../../compiler/stage1/ghc-inplace -package-name base-3.0 -hide-all-packages -i -idist/build/autogen -idist/build -i. -Idist/build -Iinclude -#include "HsBase.h" -odir dist/build -hidir dist/build -stubdir dist/build -package rts-1.0 -O -fglasgow-exts -package-name base -XCPP -idist/build  -Werror -H64m -Onot -fvia-C -O -c GHC/ForeignPtr.hs -o dist/build/GHC/ForeignPtr.o  -ohi dist/build/GHC/ForeignPtr.hi
thorkil-naurs-mac-mini:base thorkilnaur$ /usr/bin/ld -x -r -o dist/build/HSbase-3.0.o dist/build/GHC/ForeignPtr.o
thorkil-naurs-mac-mini:base thorkilnaur$

So it appears that -fvia-C helps. But notice that I am not saying that the native code generator is at fault, only that code provoking the problem is apparently not being generated by gcc in this case.

And so, using -fvia-C all the way through, I have been able to produce a working HEAD on and for PPC Mac OS X 10.5. mk/build.mk is:

HADDOCK_DOCS    = YES
SRC_CC_OPTS     = -Werror
SRC_HC_OPTS     = -Werror -H64m -Onot -fvia-C
GhcStage1HcOpts = -Onot -fvia-C
GhcStage2HcOpts = -Onot -fvia-C
GhcLibHcOpts    = -Onot -fvia-C
GhcLibWays      =
SplitObjs       = NO
NoFibWays       =
STRIP           = :
GhcBootLibs     = YES

Fast testing produces (where the Pp* tests should be ignored, they are related to work on #1337, sorry about that):

make fast stage=2 EXTRA_HC_OPTS=-fvia-C
...
OVERALL SUMMARY for test run started at Sun Nov 25 11:10:52 CET 2007
    2001 total tests, which gave rise to
    7595 test cases, of which
       1 caused framework failures
    5924 were skipped

    1586 expected passes
      75 expected failures
       0 unexpected passes
      10 unexpected failures

Unexpected failures:
   Pp004(normal)
   Pp005(normal)
   Pp006(normal)
   Pp007(normal)
   Pp_coverage(normal)
   currentDirectory001(normal)
   directory001(normal)
   ghci024(ghci)
   print026(ghci)
   recomp004(normal)

I haven't checked the unexpected results at all, but it doesn't look too bad, really.

It is important to notice that this compiler is limping in the sense that it can only reliably produce code that is accepted by the 10.5 linker with -fvia-C.

Best regards Thorkil

11/25/07 04:14:03 changed by thorkilnaur

  • attachment ST__15.o added.

Smallest .o file producing the scattered message from the 10.5 Leopard ld

11/26/07 02:00:59 changed by simonmar

Since gcc can apparently generate code that doesn't upset the linker, it would be worth investigating the difference between what gcc generates and what GHC's native code generator is generating for one of the problematic references.

11/26/07 09:29:01 changed by ChrisKuklewicz

Using mk/build.mk of

HADDOCK_DOCS    = NO
SRC_CC_OPTS     = -Werror
SRC_HC_OPTS     = -Werror -H128m -O2 -fvia-C
GhcStage1HcOpts = -O2 -fvia-C
GhcStage2HcOpts = -O2 -fvia-C
GhcLibHcOpts    = -O2 -fvia-C
GhcLibWays      = p
SplitObjs       = NO
NoFibWays       =
STRIP           = :
GhcBootLibs     = YES

the build fails on my G4 laptop with 10.5 during the stage2 build:

ranlib libHSghc.a
/usr/bin/ld -r -x -o HSghc.o stage2/basicTypes/BasicTypes.o <<snip>>
ld: scattered reloc r_address too large for inferred architecture ppc
make[2]: *** [HSghc.o] Error 1
make[1]: *** [stage2] Error 2
make: *** [bootstrap2] Error 2

So the linker ends with another scattered relocation problem.

11/26/07 22:59:05 changed by thorkilnaur

The following Haskell program I3.hs is a reduced version of Control/Monad/Instances.hs that demonstrates the problem:

{-# OPTIONS_NHC98 --prelude #-}

module I3 (Monad(..)) where

import Prelude

instance Monad ((->) r) where
        return = const
        f >>= k = \ r -> k (f r) r

The attached tar.gz archive includes I3.hs as well as:

I3.s
    assembler code generated by the GHC native code generator
I3.hc
    C code generated by GHC with -fvia-C
I3.raw_s
    unmangled assembler code generated by gcc from I3.hc
I3.mangled.s
    mangled assembler code
I3.asm.otool.out
    dissection of the .o file corresponding to I3.s
I3.C.otool.out
    dissection of the .o file corresponding to I3.mangled.s

The otool dissection of the .o file corresponding to I3.s generated by the GHC NCG indicates that the critical relocation happens in the first of these lines from I3.s:

        lis     r31, hi16(_r6F_closure+3)
        ori     r31, r31, lo16(_r6F_closure+3)
        stw     r31, -12(r25)

The corresponding lines from I3.mangled.s seem to be these:

        lis r2,ha16(_r6F_closure)
        la r2,lo16(_r6F_closure)(r2)
        addi r2,r2,3
        stw r2,-12(r25)

Best regards Thorkil

11/26/07 23:02:18 changed by thorkilnaur

  • attachment GHC_trac_1843.tar.gz added.

I3.hs and derived files that illustrate the problem

11/27/07 00:50:44 changed by thorkilnaur

Adding to the I3.hs story, I tried to change I3.s as follows:

thorkil-naurs-mac-mini:work thorkilnaur$ diff ../I3.s .
85c85
<       lis     r31, hi16(_r6F_closure+3)
---
>       lis     r31, hi16(_r6F_closure)
thorkil-naurs-mac-mini:work thorkilnaur$

Assembling this changed I3.s gave an I3.o that was acceptable to the linker.

Best regards Thorkil

11/27/07 01:46:58 changed by simonmar

Indeed hi16(_r6F_closure+3) is strange: the +3 can't make any difference to the result.
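The arithmetic behind this remark can be checked with a small sketch. It assumes, and this is an assumption not stated in the ticket, that closure symbols are 4-byte aligned, so the low two bits of the address are zero and adding a tag of at most 3 can never carry into the upper half-word. The addresses are invented for illustration, and `hi16`/`lo16` are modelled simply as the upper and lower 16 bits.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word32)

-- | Upper and lower 16 bits of a 32-bit value, as the assembler's
-- hi16() and lo16() operators are understood here.
hi16, lo16 :: Word32 -> Word32
hi16 x = x `shiftR` 16
lo16 x = x .&. 0xFFFF

-- | True when adding the tag k leaves the upper half-word unchanged.
tagKeepsHi16 :: Word32 -> Word32 -> Bool
tagKeepsHi16 s k = hi16 (s + k) == hi16 s

main :: IO ()
main = do
  -- made-up 4-byte-aligned addresses; the worst case ends in 0xFFFC
  print (all (`tagKeepsHi16` 3) [0x0000002C, 0x0001FFFC, 0x7FFFFFFC])  -- True
  -- an unaligned address, for contrast: here the carry does reach hi16
  print (tagKeepsHi16 0x0001FFFE 3)                                    -- False
```

Under that alignment assumption the +3 only matters for the lo16 half, which is consistent with the edited I3.s linking cleanly.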

11/27/07 09:47:56 changed by ChrisKuklewicz

Using -fvia-C and setting -O to -Onot did build ghc-6.8.1 as was reported above.

The exact stanza of mk/build.mk was

HADDOCK_DOCS    = NO
SRC_CC_OPTS     = -Werror
SRC_HC_OPTS     = -Werror -H64m -Onot -fvia-C
GhcStage1HcOpts = -Onot -fvia-C
GhcStage2HcOpts = -Onot -fvia-C
GhcLibHcOpts    = -Onot -fvia-C
GhcLibWays      =
SplitObjs       = NO
NoFibWays       =
STRIP           = :
GhcBootLibs     = YES

Simply recompiling ghc-6.8.1 with itself then failed because the configure script uses ghc without the -fvia-C option (it has the unknown...4 error when linking the utils/pwd/pwd program).

So I changed the "ghc" program shell script to add the -fvia-C options to every invocation. This has allowed ghc-6.8.1 to start compiling itself...

11/27/07 15:46:49 changed by ChrisKuklewicz

The "lis hi16/ori lo16" instructions seem to only be from compiler/nativeGen/MachCodeGen.hs line 1754

getRegister (CmmLit lit)
  = let rep = cmmLitRep lit
        imm = litToImm lit
        code dst = toOL [
              LIS dst (HI imm),
              OR dst dst (RIImm (LO imm))
          ]
    in return (Any rep code)

So that is probably the place to fix the code generation. The above is then rendered to text by PprMach.hs in the obvious way. Replacing HI/hi16 by HA/ha16 and changing OR to the right kind of "la" or "addi" might fix it.
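Why HI has to become HA at the same time as OR becomes an add can be seen from a small model of the instruction pairs. This is a sketch of the 32-bit arithmetic only, not of GHC's code generator: `ori` is modelled as a bitwise OR with a zero-extended immediate, `addi`/`la` as an addition with a sign-extended immediate, and all function names are made up for the example.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Int (Int16)
import Data.Word (Word16, Word32)

hi16, ha16, lo16 :: Word32 -> Word32
hi16 x = x `shiftR` 16
ha16 x = (x + 0x8000) `shiftR` 16   -- "adjusted" high half: rounds up when bit 15 is set
lo16 x = x .&. 0xFFFF

-- | Sign-extend a 16-bit immediate to 32 bits, as addi/la do.
signExt16 :: Word32 -> Word32
signExt16 w = fromIntegral (fromIntegral (fromIntegral w :: Word16) :: Int16)

-- | lis r,hi16(x) ; ori r,r,lo16(x)
lisOri :: Word32 -> Word32
lisOri x = (hi16 x `shiftL` 16) .|. lo16 x

-- | lis r,ha16(x) ; addi r,r,lo16(x)
lisAddi :: Word32 -> Word32
lisAddi x = (ha16 x `shiftL` 16) + signExt16 (lo16 x)

-- | The inconsistent mix: lis r,hi16(x) ; addi r,r,lo16(x)
lisHiAddi :: Word32 -> Word32
lisHiAddi x = (hi16 x `shiftL` 16) + signExt16 (lo16 x)

main :: IO ()
main = do
  let samples = [0x0000002D, 0x00018003, 0x7FFF8000, 0xFFFF8000]
  print (all (\x -> lisOri x == x && lisAddi x == x) samples)  -- True
  print (lisHiAddi 0x00018003 == 0x00018003)                   -- False
```

Both consistent pairs rebuild the full 32-bit value; only the mixed pair goes wrong, and only when bit 15 of the low half is set.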

(follow-up: ↓ 24 ) 11/27/07 16:52:59 changed by ChrisKuklewicz

Here is a patch to compiler/nativeGen/MachCodeGen.hs to use the desired instructions:

--- MachCodeGen-orig.hs	2007-11-28 00:09:25.000000000 +0000
+++ MachCodeGen-cek.hs	2007-11-28 00:08:48.000000000 +0000
@@ -1746,7 +1746,7 @@
 				 CmmStaticLit (CmmFloat f frep)]
             `consOL` (addr_code `snocOL` LD frep dst addr)
     return (Any frep code)
-
+{-
 getRegister (CmmLit lit)
   = let rep = cmmLitRep lit
         imm = litToImm lit
@@ -1755,6 +1755,16 @@
               OR dst dst (RIImm (LO imm))
           ]
     in return (Any rep code)
+-}
+
+getRegister (CmmLit lit)
+  = let rep = cmmLitRep lit
+        imm = litToImm lit
+        code dst = toOL [
+              LIS dst (HA imm),
+              ADD dst dst (RIImm (LO imm))
+          ]
+    in return (Any rep code)
 
 getRegister other = pprPanic "getRegister(ppc)" (pprExpr other)
     

More testing is needed, but I think I have a working stage2 compiler from the above without having to use -fvia-C.

11/29/07 13:45:36 changed by thorkilnaur

I have investigated further and now have definite evidence that the sequence

  lis r,hi16(s+k)
  ori r,r,lo16(s+k)

(where s is a symbol and k is a literal constant) used to bring the value s+k into register r is sometimes handled erroneously by (the Xcode 3.0 coming with) Mac OS X 10.5 Leopard, in contrast to (the Xcode 2 that comes with) 10.4 Tiger, which doesn't have this problem.

The evidence indicates that the object code is produced correctly by Leopard (or at least consistently with the object code produced by Tiger), but that the linker fails to process the relocation related to hi16(s+k) correctly, not only producing the confusing "unknown scattered relocation type 4" message but also, in fact, generating incorrect code. This explains the failure to run any amount of code that has been prepared in this manner.

I intend to report this problem to Apple. In the meantime, there are fortunately some work-arounds:

  1. Use the sequence
      lis r,ha16(s+k)
      la r,lo16(s+k)(r)
    
    suggested by ChrisKuklewicz above
  2. To retain the lis hi16 + ori lo16 combination, use
    t = s+k
      lis r,hi16(t)
      ori r,r,lo16(t)
    
    defining a new symbol t to hold the desired value for hi16.

Both of these methods work, as far as I can tell.

Best regards Thorkil

12/01/07 05:06:14 changed by fons

  • cc set to alfonso.acosta@gmail.com.

(in reply to: ↑ 21 ) 12/01/07 17:18:15 changed by igloo

Thanks for all the investigative work, everyone!

Replying to ChrisKuklewicz:

Here is a patch to compiler/nativeGen/MachCodeGen.hs to use the desired instructions:

...

More testing is needed, but I think I have a working stage2 compiler from the above without having to use -fvia-C.

6.8.2 is just around the corner, so do we think this patch is right? Should I apply it?

Thanks

Ian

(follow-up: ↓ 26 ) 12/02/07 11:20:15 changed by guest

I tried applying ChrisKuklewicz's patch to a fresh ghc 6.8.1 source tree (from http://haskell.org/ghc/dist/6.8.1/ghc-6.8.1-src.tar.bz2) along with the extra libs, and the linker segfaults every time it tries to link certain extra libs. So far, parsec, X11, and OpenGL cause it to die:

...
== make way=p -f GNUmakefile all;
== Finished recursively making `all' for ways: p  ...
Registering HGL-3.2.0.0...
Reading package info from "dist/inplace-pkg-config" ... done.
Saving old package config file... done.
Writing new package config file... done.
if ifBuildable/ifBuildable OpenGL; then \
	  cd OpenGL && \
	  make -r && \
	  setup/Setup register --inplace; \
	fi
../../compiler/stage1/ghc-inplace -package-name OpenGL-2.2.1.1 -hide-all-packages -split-objs -i -idist/build/autogen -idist/build -i. -Idist/build -Iinclude -optc-DCALLCONV=ccall -#include "HsOpenGL.h" -odir dist/build -hidir dist/build -stubdir dist/build -package base-3.0.0.0 -O -DCALLCONV=ccall -XCPP -XForeignFunctionInterface -idist/build  -H16m -O -O -Rghc-timing -fgenerics -c Graphics/Rendering/OpenGL/GL/Feedback.hs -o dist/build/Graphics/Rendering/OpenGL/GL/Feedback.o  -ohi dist/build/Graphics/Rendering/OpenGL/GL/Feedback.hi
collect2: ld terminated with signal 10 [Bus error]
<<ghc: 309647992 bytes, 58 GCs, 5755631/9450608 avg/max bytes residency (4 samples), 23M in use, 0.00 INIT (0.00 elapsed), 3.54 MUT (20.03 elapsed), 1.05 GC (1.40 elapsed) :ghc>>
make[2]: *** [dist/build/Graphics/Rendering/OpenGL/GL/Feedback.o] Error 1
make[1]: *** [make.library.OpenGL] Error 2
make: *** [stage1] Error 2

(in reply to: ↑ 25 ) 12/02/07 13:07:55 changed by igloo

Replying to guest:

I tried applying ChrisKuklewicz's patch to a fresh ghc 6.8.1 source tree (from http://haskell.org/ghc/dist/6.8.1/ghc-6.8.1-src.tar.bz2) along with the extra libs, and the linker segfaults every time it tries to link certain extra libs. So far, parsec, X11, and OpenGL cause it to die:

Thanks for the info! From comments 4 and 9 it sounds to me like that is a different problem, and not something caused by the patch. Does anyone disagree?

12/03/07 15:11:12 changed by thorkilnaur

I agree that the linker problem seems to be a separate issue. Using the ChrisKuklewicz patch, a validate on a PPC Mac OS X 10.5 Leopard (without the extra libraries) concluded:

OVERALL SUMMARY for test run started at Mon Dec  3 11:39:48 CET 2007
    2001 total tests, which gave rise to
    7595 test cases, of which
       1 caused framework failures
    5924 were skipped

    1570 expected passes
      75 expected failures
       0 unexpected passes
      26 unexpected failures

Unexpected failures:
   Records(normal)
   arith011(normal)
   arr016(normal)
   cg034(normal)
   cg044(normal)
   cg059(normal)
   currentDirectory001(normal)
   directory001(normal)
   drv012(normal)
   drv013(normal)
   drv020(normal)
   ds059(normal)
   exceptions002(normal)
   freeNames(normal)
   ghci024(ghci)
   karl1(normal)
   nbe(normal)
   rebindable5(normal)
   red-black(normal)
   simpl007(normal)
   simplrun004(optc)
   tc(normal)
   tc088(normal)
   text001(normal)
   tup001(normal)
   unicode002(normal)

Proceeding with a more realistic build (mk/build.mk is mk/build.mk.sample with BuildFlavour = perf) and extra libraries included, matters are brought to a halt at:

../../compiler/stage1/ghc-inplace -package-name parsec-2.0 -hide-all-packages -split-objs -i -idist/build/autogen -idist/build -i. -Idist/build -odir dist/build -hidir dist/build -stubdir dist/build -package base-3.0 -O -XExistentialQuantification -XPolymorphicComponents -idist/build  -H32m -O2  -c Text/ParserCombinators/Parsec/Token.hs -o dist/build/Text/ParserCombinators/Parsec/Token.o  -ohi dist/build/Text/ParserCombinators/Parsec/Token.hi
collect2: ld terminated with signal 10 [Bus error]
make[2]: *** [dist/build/Text/ParserCombinators/Parsec/Token.o] Error 1
make[1]: *** [make.library.parsec] Error 2
make: *** [stage1] Error 2

which has been reported earlier on the glasgow-haskell-users mailing list, but as far as I can tell, not reported as a trac ticket. I will prepare a suitable trac ticket shortly, unless someone is kind enough to tell me that the issue has been reported already.

Best regards Thorkil

12/05/07 01:39:05 changed by thorkilnaur

The collect2: ld terminated with signal 10 [Bus error] matter has been reported as #1958.

12/07/07 15:58:01 changed by igloo

  • status changed from new to closed.
  • resolution set to fixed.

I've applied the patch to HEAD and 6.8 branch, so this bug is now fixed I believe.

12/19/07 03:56:00 changed by thorkilnaur

For information (since we have applied a work-around): On 2007-Dec-09, I reported the linker problem to Apple:

Problem ID: 5637618
Title:	PPC Leopard (Xcode 3.0) linker ld reports "unknown scattered relocation type 4"
State:	  Open
Originated Date:	09-Dec-2007 04:34 AM

Best regards Thorkil

04/24/08 07:30:16 changed by thorkilnaur

Just for the curious, since we have applied a work-around: Apple has responded to the "Bug ID #: 5637618 (PPC Leopard (Xcode 3.0) linker ld reports "unknown scattered relocation type 4")", saying that this issue has been addressed in "the latest seed release of Xcode 3.1, build 9M2165".

Best regards Thorkil