Ticket #5004 (closed bug: fixed)

Opened 2 years ago

Last modified 2 years ago

loading stripped libHsghc-7.0.2.a fails

Reported by: pbrisbin Owned by: duncan
Priority: highest Milestone: 7.0.4
Component: Compiler Version: 7.0.2
Keywords: Cc: slyfox@…, juhp@…
Operating System: Linux Architecture: Unknown/Multiple
Type of failure: GHCi crash Difficulty:
Test Case: Blocked By:
Blocking: Related Tickets:

Description

Arch linux's ghc package was upgraded to 7.0.2 this weekend. I unregistered all cabal packages, removed ~/.cabal/packages/*/* and ~/.ghc in an effort to start fresh.

I successfully installed quite a few packages via cabal install.

You can find my current ghc-pkg list  here.

At cabal install yesod, I get the following:

...
Loading package MonadCatchIO-mtl-0.3.0.2 ... linking ... done.
Loading package Cabal-1.10.1.0 ... linking ... done.
Loading package ghc-binary-0.5.0.2 ... linking ... done.
Loading package bin-package-db-0.0.0.0 ... linking ... done.
Loading package hpc-0.5.0.6 ... linking ... done.
Loading package ghc-7.0.2 ... ghc: This ELF file contains no symtab
ghc: panic! (the 'impossible' happened)
  (GHC version 7.0.2 for x86_64-unknown-linux):
        loadArchive "/usr/lib/ghc-7.0.2/ghc-7.0.2/libHSghc-7.0.2.a": failed

Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

cabal: Error: some packages failed to install:
yesod-0.7.1 failed during the building phase. The exception was:
ExitFailure 1

I'm reporting here because ghc specifically asked me to (I almost always assume these things are my fault). Please let me know what additional info you would like in this report.

Thanks.

Attachments

build.mk (3.4 KB) - added by pbrisbin 2 years ago.
The build.mk used by Arch Linux
linker-partially-striped-objects-fix.dpatch (71.9 KB) - added by duncan 2 years ago.
patch for the ghc 7.0.x branch

Change History

  Changed 2 years ago by pbrisbin

  • summary changed from yesod-0.7.1 fails to build on ghc 7.0.2 to yesod-0.7.1 causes panic on Arch's ghc 7.0.2

  Changed 2 years ago by slyfox

  • cc slyfox@… added
  • failure changed from Compile-time crash to GHCi crash
  • architecture changed from x86_64 (amd64) to Unknown/Multiple

I confirm the same problem with Gentoo's package. We strip debug info from archives when installing a package.

Simpler testcase:

prefix@sf ~ $ ghci -package ghc
GHCi, version 7.0.2: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package array-0.3.0.2 ... linking ... done.
Loading package containers-0.4.0.0 ... linking ... done.
Loading package filepath-1.2.0.0 ... linking ... done.
Loading package old-locale-1.0.0.2 ... linking ... done.
Loading package old-time-1.0.0.6 ... linking ... done.
Loading package unix-2.4.2.0 ... linking ... done.
Loading package directory-1.1.0.0 ... linking ... done.
Loading package pretty-1.0.1.2 ... linking ... done.
Loading package process-1.0.1.5 ... linking ... done.
Loading package Cabal-1.10.1.0 ... linking ... done.
Loading package bytestring-0.9.1.10 ... linking ... done.
Loading package ghc-binary-0.5.0.2 ... linking ... done.
Loading package bin-package-db-0.0.0.0 ... linking ... done.
Loading package hpc-0.5.0.6 ... linking ... done.
Loading package template-haskell ... linking ... done.
Loading package ghc-7.0.2 ... ghc: /home/prefix/gentoo/usr/lib/ghc-7.0.2/ghc-7.0.2/libHSghc-7.0.2.a: no string tables, or too many
ghc: panic! (the 'impossible' happened)
  (GHC version 7.0.2 for i386-unknown-linux):
        loadArchive "/home/prefix/gentoo/usr/lib/ghc-7.0.2/ghc-7.0.2/libHSghc-7.0.2.a": failed

Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

Changed 2 years ago by pbrisbin

The build.mk used by Arch Linux

  Changed 2 years ago by pbrisbin

I've attached the build.mk used by Arch Linux, maybe something in there can be changed?

  Changed 2 years ago by igloo

  • priority changed from normal to highest
  • milestone set to 7.2.1

Thanks for the report, we'll take a look.

  Changed 2 years ago by guest

This is probably restating the obvious, but in case it's useful: I've also confirmed this issue exists on Arch Linux 64 and have also found that providing an unstripped version of libHSghc-7.0.2.a allows for successful linking.

  Changed 2 years ago by deteego

I can confirm that I also have this issue on i686 Arch Linux.

  Changed 2 years ago by mightybyte

I get the same error message when compiling snap 0.4.1 with ghc 7.0.2 on Arch Linux.

  Changed 2 years ago by juhpetersen

  • cc juhp@… added
  • summary changed from yesod-0.7.1 causes panic on Arch's ghc 7.0.2 to loading stripped libHsghc-7.0.2.a fails

Also happens with Fedora 15 ghc-7.0.2 - we have also been stripping all libraries and binaries.

  Changed 2 years ago by juhpetersen

Also happens with ghc-7.0.3.

follow-up: ↓ 11   Changed 2 years ago by juhpetersen

Seems there is no Hsghc-7.0.2.o file? Intentional?

in reply to: ↑ 10   Changed 2 years ago by igloo

Replying to juhpetersen:

Seems there is no Hsghc-7.0.2.o file? Intentional?

Yes.

  Changed 2 years ago by igloo

I looked into this, incidentally: As far as I can tell, we only use the debugging info in order to tell how many jump islands etc we're going to need later on, so that we can allocate all the space we'll need in a block before we start doing the relocations. So it looks like it ought to be fixable one way or another for 7.2.1.

In the meantime, the workaround is to not strip it.

  Changed 2 years ago by simonmar

See also #5098

  Changed 2 years ago by jaffachief

Happens when I try to compile yi 0.6.3.0 on Arch 64

  Changed 2 years ago by icarus127

I'm just curious how to compile 7.0.2 from the git repository so I can get my hands on an unstripped library. I cloned the git repo, but there are no tags for ghc versions and I can't find a commit that's obviously the 7.0.2 release. Building head gives me a lib for 7.1.2; will this work with 7.0.2? Thanks.

  Changed 2 years ago by igloo

Compiling just the ghc package separately is unlikely to work; you'll need to build and install the whole of ghc.

If you want a released version, it's easiest to use a source tarball from http://www.haskell.org/ghc/download

If you really want to get the code from version control, 7.0.* are in the darcs repos:  http://darcs.haskell.org/ghc-7.0/ghc/

follow-up: ↓ 19   Changed 2 years ago by twhitehead

Just finished running into this on Debian. Looks like the problem is that the keepCAFsForGHCi.o object file in the archive has no symbols but debug ones.

That is, taking the post-stripped archive from the packaging directory, we have

$ nm libHSghc-7.0.3-poststrip.a > /dev/null
nm: keepCAFsForGHCi.o: no symbols
$ ar -x libHSghc-7.0.3-poststrip.a keepCAFsForGHCi.o
$ nm keepCAFsForGHCi.o
nm: keepCAFsForGHCi.o: no symbols
$ nm -a keepCAFsForGHCi.o 
nm: keepCAFsForGHCi.o: no symbols

compared to the pre-stripped archive from the install directory, which gives

$ nm libHSghc-7.0.3.a > /dev/null
$ ar -x libHSghc-7.0.3.a keepCAFsForGHCi.o
$ nm keepCAFsForGHCi.o
$ nm -a keepCAFsForGHCi.o
0000000000000000 b .bss
0000000000000000 n .comment
0000000000000000 d .data
0000000000000000 n .note.GNU-stack
0000000000000000 t .text
0000000000000000 a keepCAFsForGHCi.c
$ 

(the -a flag causes nm to also list debug symbols).

Replacing just keepCAFsForGHCi.o in the post-stripped archive with that from the pre-stripped archive is enough to make GHC work again.

I see the associated source file seems to exist only to set the keepCAFs variable via a constructor-flagged function. As this variable seems to be local (there is no corresponding undefined symbol), I'm a bit confused as to what it is supposed to do.

In any event, even if it is still used, this can likely be worked around by simply adding a bogus non-static symbol as well, so that stripping no longer leaves the object with no symbols at all.

Cheers! -Tyson
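
The manual nm/ar check above can be approximated in a script. The following is a minimal sketch, assuming a GNU-style ar archive whose members are ELF objects. The helper names (ar_members, section_types, members_without_symtab) are made up for this illustration and are not part of GHC or binutils. It lists the archive members that contain no symbol table section (SHT_SYMTAB) at all.

```python
import struct

SHT_SYMTAB = 2  # ELF section type of a full symbol table


def ar_members(archive: bytes):
    """Yield (name, data) for each regular member of a GNU-style ar archive."""
    if archive[:8] != b"!<arch>\n":
        raise ValueError("not an ar archive")
    pos, longnames = 8, b""
    while pos + 60 <= len(archive):
        header = archive[pos:pos + 60]
        name = header[:16].decode("ascii").rstrip()
        size = int(header[48:58].decode("ascii"))
        data = archive[pos + 60:pos + 60 + size]
        pos += 60 + size + (size & 1)        # members are 2-byte aligned
        if name == "//":                     # GNU long-name table
            longnames = data
        elif name in ("/", "/SYM64/"):       # archive symbol index, skip it
            continue
        elif name.startswith("/"):           # "/<offset>" into the long-name table
            start = int(name[1:])
            stop = longnames.index(b"/\n", start)
            yield longnames[start:stop].decode("ascii"), data
        else:
            yield name.rstrip("/"), data


def section_types(obj: bytes):
    """Return the sh_type of every section header in an ELF image."""
    is64 = obj[4] == 2                       # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if obj[5] == 1 else ">"     # EI_DATA: 1 = little-endian
    if is64:
        (shoff,) = struct.unpack_from(endian + "Q", obj, 0x28)
        shentsize, shnum = struct.unpack_from(endian + "HH", obj, 0x3A)
    else:
        (shoff,) = struct.unpack_from(endian + "I", obj, 0x20)
        shentsize, shnum = struct.unpack_from(endian + "HH", obj, 0x2E)
    return [struct.unpack_from(endian + "I", obj, shoff + i * shentsize + 4)[0]
            for i in range(shnum)]


def members_without_symtab(archive: bytes):
    """Names of ELF members that have no SHT_SYMTAB section."""
    return [name for name, data in ar_members(archive)
            if data[:4] == b"\x7fELF" and SHT_SYMTAB not in section_types(data)]
```

It could be called as members_without_symtab(open("libHSghc-7.0.3-poststrip.a", "rb").read()). Given the nm output above, one would expect it to name keepCAFsForGHCi.o for the post-stripped archive, though this sketch has not been run against the real file.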

  Changed 2 years ago by igloo

  • owner set to duncan

Duncan said he might take a look at this.

in reply to: ↑ 17   Changed 2 years ago by duncan

Replying to twhitehead:

Just finished running into this on Debian. Looks like the problem is that the keepCAFsForGHCi.o object file in the archive has no symbols but debug ones.

Thanks, this turns out to be the core issue. It's fortunately nothing to do with debug symbols or jump islands.

So when linking we definitely need a symbol table and its corresponding string table. Using strip on a .o or .a file will remove both tables. In this situation neither the system linker nor the GHCi linker can do anything with it.

Linux distros of course do not use plain strip; they typically use strip --strip-unneeded. The strip --strip-unneeded command does not remove symbol tables; it just filters out the symbols not needed for relocation processing.

There is one important exception to the above, however: strip --strip-unneeded does remove the symbol table when the symbol table is empty (after the usual filtering).

That is exactly what is happening with keepCAFsForGHCi.o. The keepCAFsForGHCi.c has just one function, and it is conditionally compiled so that it appears only in the dynamic way. So keepCAFsForGHCi.dyn_o has the function in it, while the ordinary keepCAFsForGHCi.o is basically an empty object file. So when we run strip --strip-unneeded on libHSghc.a, the archive member keepCAFsForGHCi.o ends up with no symbol table.

So, the solution is probably to recognise when an object file exports no symbols and has no relocations requiring symbols. Thus for these object files we do not need their symbol table at all. So it should be just a matter of adjusting the order in which various things are validated and checked.

It is interesting to note that the system linker does not complain about object files with no symbol table, it just fails when it cannot find the target symbols that it needs. So it apparently just treats lack of a symbol table as no symbols. Perhaps we can do the same.
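
The stripping behaviour described in this comment can be sketched as a toy model. This is only an illustration of the rule as stated here (keep global symbols and symbols needed by relocations, drop the table if nothing is left). It is not how binutils implements strip, and the Symbol type, the strip_unneeded function and the some_function symbol are hypothetical. The keepCAFsForGHCi.o symbol names are taken from the nm -a output earlier in this ticket, on the assumption that all of them are local and not referenced by any relocation.

```python
from typing import List, NamedTuple, Optional


class Symbol(NamedTuple):
    name: str
    is_global: bool            # exported from the object file
    used_by_relocation: bool   # some relocation entry refers to it


def strip_unneeded(symtab: List[Symbol]) -> Optional[List[Symbol]]:
    """Toy model: keep needed symbols; drop the whole table if none remain."""
    kept = [s for s in symtab if s.is_global or s.used_by_relocation]
    return kept or None        # None stands for "no symbol table section"


# Symbols listed by `nm -a` for the unstripped keepCAFsForGHCi.o (assumed local).
keep_cafs = [Symbol(n, False, False) for n in
             (".bss", ".comment", ".data", ".note.GNU-stack", ".text",
              "keepCAFsForGHCi.c")]

# A hypothetical object that exports one function.
other = [Symbol(".text", False, False), Symbol("some_function", True, False)]

print(strip_unneeded(keep_cafs))   # the table disappears entirely
print(strip_unneeded(other))       # the exported symbol survives
```

In this model the empty object is the only case where the table vanishes, which matches why just one archive member tripped up the GHCi linker.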

Changed 2 years ago by duncan

patch for the ghc 7.0.x branch

  Changed 2 years ago by duncan

So I have a fix! I've pushed the patch for ghc-head; one for the 7.0.x branch is attached.

Thu May 12 17:43:59 BST 2011  Duncan Coutts <duncan@well-typed.com>
  * Make the GHCi linker handle partially stripped object files (#5004)
  When you use 'strip --strip-unneeded' on a ELF format .o or .a file, if
  the object file has no global/exported symbols then 'strip' ends up
  removing the symbol table entirely. Previously the GHCi linker assumed
  there would always be exactly one symbol table and exactly one string
  table. In fact, in ELF object files there is no such limitation, instead
  each section points to the other sections it needs, in particular
  relocation sections have a link to the symbol table section they use and
  symbol table sections have a link to the corresponding string table.
  So instead of assuming there will always be a global symbol and string
  table, all we have to do is validate and follow these links. Then, when
  we encounter an empty object file that has no symbols then we handle it
  correctly, because since it's empty we never process any relocations and
  so never have to follow any links to non-existant symbol tables.
  
  Also, in the case where an object is fully stripped, we can now detect
  this more reliably and emit a more helpful error message, e.g:
  
  libHSghc-7.1.20110509.a(DsMeta.o): relocation section #2 has no symbol table
  This object file has probably been fully striped. Such files cannot be linked.
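
The link-following idea in the patch description can be sketched outside the RTS. What follows is a minimal illustration in Python, not the C code from the patch: it reads only the sh_type and sh_link fields of each ELF section header, then checks that every relocation section links to a symbol table and that this symbol table links to a string table. The function names section_headers and check_links are invented here, and the second message text is made up for the sketch (the first mirrors the example message above).

```python
import struct

SHT_SYMTAB, SHT_STRTAB, SHT_RELA, SHT_REL = 2, 3, 4, 9


def section_headers(obj: bytes):
    """Return (sh_type, sh_link) for every section header in an ELF image."""
    if obj[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = obj[4] == 2                       # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if obj[5] == 1 else ">"     # EI_DATA: 1 = little-endian
    if is64:
        (shoff,) = struct.unpack_from(endian + "Q", obj, 0x28)
        shentsize, shnum = struct.unpack_from(endian + "HH", obj, 0x3A)
        link_off = 0x28                      # offset of sh_link in Elf64_Shdr
    else:
        (shoff,) = struct.unpack_from(endian + "I", obj, 0x20)
        shentsize, shnum = struct.unpack_from(endian + "HH", obj, 0x2E)
        link_off = 0x18                      # offset of sh_link in Elf32_Shdr
    headers = []
    for i in range(shnum):
        base = shoff + i * shentsize
        (sh_type,) = struct.unpack_from(endian + "I", obj, base + 4)
        (sh_link,) = struct.unpack_from(endian + "I", obj, base + link_off)
        headers.append((sh_type, sh_link))
    return headers


def check_links(obj: bytes):
    """Follow sh_link from each relocation section; return a list of problems."""
    secs = section_headers(obj)
    problems = []
    for idx, (sh_type, sh_link) in enumerate(secs):
        if sh_type not in (SHT_REL, SHT_RELA):
            continue                         # only relocations need symbols
        if sh_link >= len(secs) or secs[sh_link][0] != SHT_SYMTAB:
            problems.append(f"relocation section #{idx} has no symbol table")
            continue
        str_link = secs[sh_link][1]
        if str_link >= len(secs) or secs[str_link][0] != SHT_STRTAB:
            problems.append(f"symbol table #{sh_link} has no string table")
    return problems
```

An object with no relocation sections yields an empty list whether or not it has a symbol table, which is the case the patch is meant to accept. A hypothetical call would be check_links(open("keepCAFsForGHCi.o", "rb").read()).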

  Changed 2 years ago by twhitehead

Fabulous.

Nothing like a good clean fix that makes things even more robust.

So much better than my suggested awful hack.

Thanks very much Duncan!

  Changed 2 years ago by icarus127

I second that! Thanks Duncan :)

  Changed 2 years ago by simonmar

Patch looks fine as far as I can tell. Nice job!

  Changed 2 years ago by duncan

  • status changed from new to closed
  • resolution set to fixed

Now applied to the 7.0.x branch too.

  Changed 2 years ago by simonmar

  • milestone changed from 7.2.1 to 7.0.4