Ticket #6063 (patch bug)

Opened 13 months ago

Last modified 2 weeks ago

GHC's build-time ld-flag checks are problematic

Reported by: parcs Owned by: thoughtpolice
Priority: high Milestone: 7.8.1
Component: Compiler Version: 7.4.1
Keywords: Cc: bos@…, mail@…
Operating System: Linux Architecture: Unknown/Multiple
Type of failure: GHC doesn't work at all Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets: #4862

Description

If the linker that GHC was built against happened to recognize the --hash-size and --reduce-memory-overhead flags, but the current system linker does not, GHC will indiscriminately pass those flags to the current linker when attempting to link anything, and the linking will fail due to an 'unrecognized option' error.

I have experienced this behavior when I upgraded to GHC 7.4.1 on debian. The GHC package in question was built against bfd ld, which recognized those linker flags, but my system linker is gold, which does not recognized those flags. Therefore, I could not build anything until i reverted to bfd ld.

(Relevant commits: 3275b7bd2a803a3adc0b952b6fbfeb738fc15a74 and 9ccb59ed6e5edf73c876e87429e69e8848162497)

Attachments

Change History

  Changed 10 months ago by edsko

  • related set to 4862

  Changed 9 months ago by bos

  • cc bos@… added
  • failure changed from None/Unknown to GHC doesn't work at all
  • os changed from Unknown/Multiple to Linux

Still an issue with 7.6.

  Changed 8 months ago by igloo

  • difficulty set to Unknown
  • related changed from 4862 to #4862

follow-up: ↓ 5   Changed 7 months ago by igloo

  • milestone set to 7.8.1

The linker flags are detected when a bindist is installed, not when it is built, so they should match the machine that GHC is run on.

Should we check what flags the linker supports every time ghc is run?

in reply to: ↑ 4   Changed 7 months ago by parcs

Replying to igloo:

The linker flags are detected when a bindist is installed, not when it is built, so they should match the machine that GHC is run on. Should we check what flags the linker supports every time ghc is run?

Yeah, that sounds like a reasonable solution. The system linker may change after GHC is installed, and doing so shouldn't result in a non-functioning GHC installation.

  Changed 7 months ago by igloo

  • priority changed from normal to high

  Changed 2 months ago by ezyang

On the other hand, having to shell out to ld every time we run GHC is a bit of a bummer, as far as performance goes. A stop-gap may be to just re-run ld without the lfags if it fails the first time; in the good case, we still manage to only do one invocation.

  Changed 2 months ago by thoughtpolice

  • owner set to thoughtpolice

I'll take this over since I've had some complaints from friends about it (and have hit it myself in the past.)

I'll measure any compile time differences, but I imagine checking if the argument is supported through an extra invocation is negligible compared to the actual linking time. On windows, process creation can be expensive, but we'll only be using ld here anyway so we can short-cut it.

  Changed 4 weeks ago by nh2

  • cc mail@… added

Was this not fixed in #4862 9 months ago?

Doesn't seem to be in 7.6.2 though ...

  Changed 2 weeks ago by jlebar

Is there a way to work around this in the meantime, short of uninstalling gold? I tried running

# LD=ld.bfd cabal install ...

and that didn't work.

  Changed 2 weeks ago by nh2

@jlebar That's not so easy, but you could actually make it *use* gold. See here:  http://stackoverflow.com/questions/6952396/why-does-ghc-take-so-long-to-link/16105691#16105691. Might also contain some other relevant info for you (e.g. that -pgml is actually gcc, so you'd have to tell gcc which linker it should call, and I guess that's why your env variable doesn't work).

  Changed 2 weeks ago by thoughtpolice

I have a patch for this now. I decided to generalize it a little and added functionality to generally detect linker information at runtime, and distinguish between several cases. I have verified it works and am validating on my Mac OS X machine and my other Linux machines. I will set up a Windows machine as well.

I'd like some code review before I push this patch to the tree directly. If Ian or anyone else would like to give feedback, please do so. If there aren't any complaints I'll push this myself.

@nh2 - no, this was not fixed in that work. The issue is a little more complicated. Please see my commit message and patch which details the issue fairly clearly.

commit e1825358b318f847aca19a230fb0a4377830d767
Author: Austin Seipp <aseipp@pobox.com>
Date:   Sat May 4 16:07:22 2013 -0500

    Detect linker information at runtime. Fixes Trac #6063
    
    Previously, we did ./configure time checks to see if 'GNU ld' supported
    certain options. If it does, we bake those options into the link step.
    See Trac #5240. Ergo, this is a build-time configuration.
    
    Unfortunately, the linker we use at runtime can change for several
    reasons. One is that the user specifies -pgml 'foo'. The other is if
    /usr/bin/ld or whatnot literally *changes* from when GHC was built.
    Obviously, not every linker supports these options, like GNU gold,
    and that would lead to linking failure. This is Trac #6063.
    
    What this ultimately means is that we need to check at runtime what
    linker we're using. Always. This is actually a little bit complicated
    because we normally use the C compiler as our linker. Also, OS X does
    not support gold OR ld, Windows only has GNU ld, etc.
    
    Finally, this patch also unconditionally gives '--hash-size=31' and
    '--reduce-memory-overheads' to the system linker if it's GNU ld. These
    options have been supported for 8+ years from what I can see, and there
    are probably a lot of *other* reasons why GHC would not work with such
    an ancient binutils, all things considered.
    
    See Note [Run-time linker info] in SysTools for more details. There
    are plenty of comments as well in the surrounding code.
    
    Signed-off-by: Austin Seipp <aseipp@pobox.com>

  Changed 2 weeks ago by thoughtpolice

  • status changed from new to patch

  Changed 2 weeks ago by parcs

I notice two things:

1. From what I can tell, you don't need to spawn the linker process when os is OSDarwin or OSMinGW32.

2. Use hang to print the error message:

hang (text "Warning:") 9 $
    text "Couldn't ..." $$
    text "Make ..."

Changed 2 weeks ago by thoughtpolice

  Changed 2 weeks ago by thoughtpolice

Updated per review comments (thanks to int-e on freenode as well.)

Note: See TracTickets for help on using tickets.