Ticket #5824 (closed bug: fixed)

Opened 4 months ago

Last modified 3 months ago

ARM StgRun register clobber list is broken

Reported by: bgamari Owned by: simonmar
Priority: high Milestone: 7.4.2
Component: Runtime System Version: 7.4.1-rc2
Keywords: Cc: bgamari@…, karel.gardas@…, nomeata
Operating System: Unknown/Multiple Architecture: arm
Type of failure: None/Unknown Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

The ARM implementation of StgRun? does not claim that it clobbers r7-r12. As a result, the compiler will sometimes put the returned RegTable? in one of these registers, resulting in an invalid RegTable? to be returned. Hilarity ensues.

Attachments

0001-Fix-register-clobber-list-in-StgRun-for-ARM.patch Download (1.0 KB) - added by bgamari 4 months ago.
Patch
0001-ARM-StgRun-Ensure-r11-state-is-preserved.patch Download (1.2 KB) - added by bgamari 4 months ago.
Ensure r11 is preserved
0001-fix-ARM-s-StgCRun-clobbered-register-list-for-both-A.patch Download (1.6 KB) - added by kgardas 3 months ago.
The patch fixing StgCRun clobbered regs list for both ARM and Thumb modes
0002-fix-ARM-StgCRun-to-not-save-and-restore-r11-fp-regis.patch Download (1.1 KB) - added by kgardas 3 months ago.
the patch fixing ARM StgCRun to not save and restore r11/fp register twice

Change History

Changed 4 months ago by bgamari

Patch

  Changed 4 months ago by bgamari

  • status changed from new to patch

  Changed 4 months ago by simonmar

  • component changed from Compiler to Runtime System
  • priority changed from normal to high
  • difficulty set to Unknown
  • architecture changed from Unknown/Multiple to arm
  • milestone set to 7.4.2
  • owner set to simonmar

  Changed 4 months ago by dterei

So Simon seems to be handling this. I'm not that familiar with the code in question but here are some questions / comments.

1. I assume r7 isn't being saved as it stores R1 which is the return register? 2. Why isn't r11 (storing SpLim?) being saved?

What is up with ARM's register names? If I look in the code you wrote in MachRegs?.h it seems as if the registers have two names? e.g r11 also known as v8? r12 also as ip?

I think the patch looks fine.

  Changed 4 months ago by bgamari

Sorry for the confusion; I somehow convinced myself that you wrote that code.

Yes, ARM registers have two names: a numeric name (e.g. r12) and an APCS (e.g. ip). Honestly, excluded r7 from the clobber list because the compiler complained when I added it,

rts/StgCRun.c: In function ‘StgRun’:
rts/StgCRun.c:678:1:  error: r7 cannot be used in asm here

It seems that the reason for this is that the compiler uses r7 as the frame pointer register when in Thumb mode. Coq experienced the same issue ( https://bugs.launchpad.net/ubuntu/+source/coq/+bug/636229) but their solution seemed to be a bit excessive ( http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/maverick/coq/maverick/revision/20).

Concerning r11, I think you are correct. I'm not sure why it wasn't saved previously but it seems this omission has the potential to cause some very frustrating bugs. Additional patch forthcoming.

  Changed 4 months ago by bgamari

  • cc bgamari@… added

Changed 4 months ago by bgamari

Ensure r11 is preserved

  Changed 4 months ago by igloo

Both patches applied in 7.4 branch.

  Changed 4 months ago by igloo

  • cc karel.gardas@… added

With these patches, the Debian guys found (#5849) that the build failed  https://buildd.debian.org/status/fetch.php?pkg=ghc&arch=armel&ver=7.4.1-1&stamp=1328355092 :

"inplace/bin/ghc-stage1" -optc-Wall -optc-Wextra -optc-Wstrict-prototypes -optc-Wmissing-prototypes -optc-Wmissing-declarations -optc-Winline -optc-Waggregate-return -optc-Wpointer-arith -optc-Wmissing-noreturn -optc-Wnested-externs -optc-Wredundant-decls -optc-Iincludes -optc-Irts -optc-Irts/dist/build -optc-DCOMPILING_RTS -optc-DUSE_LIBFFI_FOR_ADJUSTORS -optc-fno-strict-aliasing -optc-fno-common -optc-g -optc-O0 -optc-DRtsWay=\"rts_debug\" -optc-w -optc-DDEBUG  -H32m -O -lffi -optl-pthread -optc-mlong-calls -Iincludes -Irts -Irts/dist/build -DCOMPILING_RTS -package-name rts  -dcmm-lint      -i -irts -irts/dist/build -irts/dist/build/autogen -Irts/dist/build -Irts/dist/build/autogen               -c rts/StgCRun.c -o rts/dist/build/StgCRun.debug_o
rts/StgCRun.c: In function 'StgRun':

rts/StgCRun.c:678:1:  error: fp cannot be used in asm here
make[2]: *** [rts/dist/build/StgCRun.debug_o] Error 1
make[1]: *** [all] Error 2
make[1]: Leaving directory `/build/buildd-ghc_7.4.1-1-armel-V2DeDU/ghc-7.4.1'
make: *** [build-stamp] Error 2

Perhaps the difference is whether the LLVM compiler is being used or not?

Regardless, we really need a solution that works for everyone.

  Changed 4 months ago by kgardas

The patch: commit 5a984f4388ef85d5c3af973b21a12c12b36c1ed4 Author: Ben Gamari <bgamari.foss@…> Date: Mon Jan 30 16:52:40 2012 -0500

ARM StgRun?: Ensure r11 state is preserved

looks wrong to me. I'm sorry to not review it more earlier, in fact ARM's r11 is the same reg as fp. So IMHO whole this patch might be reverted. Unfortunately I don't have debian here to check if it helps or not.

  Changed 4 months ago by bgamari

Bah, yes, you are right. Very obvious oversight on my part. Indeed this ought to be reverted. I'm still not certain why it compiled for me (or even why it failed to compile for the Debian folks), but the patch is certainly redundant.

follow-up: ↓ 11   Changed 4 months ago by igloo

  • cc nomeata added

Given the error, perhaps only r11 should be used? nomeata, are you able to test please?

in reply to: ↑ 10   Changed 4 months ago by nomeata

Replying to igloo:

Given the error, perhaps only r11 should be used? nomeata, are you able to test please?

Yes, but just to avoid round-trips after 10 hours of building due to misunderstandings: Can you provide a proper patch? I’ll start the build right away.

follow-up: ↓ 15   Changed 4 months ago by igloo

Rather than doing a full GHC build, I'd recommend trying to reproduce the error by compiling a standalone C file containing the inline asm, and then experiment to find out what is and isn't accepted in it.

follow-up: ↓ 14   Changed 4 months ago by kgardas

By the way, I've tested stock 7.4.1 release on Ubuntu 11.04 and there is no problem compiling even this bogus r11/fp statement. GCC is:

gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4)

in reply to: ↑ 13   Changed 4 months ago by nomeata

Replying to kgardas:

By the way, I've tested stock 7.4.1 release on Ubuntu 11.04 and there is no problem compiling even this bogus r11/fp statement. GCC is: {{{ gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) }}}

Hmm, so there might be a toolchain and/or processor dependence there. The full failing build log can be found at  https://buildd.debian.org/status/fetch.php?pkg=ghc&arch=armel&ver=7.4.1-1&stamp=1328355092

Interesting snippets are basically: we use gcc-4.6_4.6.2-12, llvm-3.0_3.0-5. The machine is a Marvell Feroceon CPU @ 1.2GHz on a Marvell DB-78x00-BP Development Board (ARM v5).

in reply to: ↑ 12   Changed 4 months ago by nomeata

Replying to igloo:

Rather than doing a full GHC build, I'd recommend trying to reproduce the error by compiling a standalone C file containing the inline asm, and then experiment to find out what is and isn't accepted in it.

I tried with this code:

#ifdef LEADING_UNDERSCORE
#define STG_RUN "_StgRun"
#define STG_RETURN "_StgReturn"
#else
#define STG_RUN "StgRun"
#define STG_RETURN "StgReturn"
#endif

#if defined(__thumb__)
#define THUMB_FUNC ".thumb\n\t.thumb_func\n\t"
#else
#define THUMB_FUNC
#endif

#define arm_HOST_ARCH_PRE_ARMv6

void *f(void) {
    void *basereg;
    void * r;
    __asm__ volatile (
        /*
         * save callee-saves registers on behalf of the STG code.
         */
        "stmfd sp!, {r4-r10, fp, ip, lr}\n\t"
#if !defined(arm_HOST_ARCH_PRE_ARMv6)
        "vstmdb sp!, {d8-d11}\n\t"
#endif
        /*
         * allocate some space for Stg machine's temporary storage.
         * Note: RESERVER_C_STACK_BYTES has to be a round number here or
         * the assembler can't assemble it.
         */
        "sub sp, sp, %3\n\t"
        /*
         * Set BaseReg
         */
        "mov r4, %2\n\t"
        /*
         * Jump to function argument.
         */
        "bx %1\n\t"

        ".global " STG_RETURN "\n\t"
        THUMB_FUNC
        ".type " STG_RETURN ", %%function\n"
        STG_RETURN ":\n\t"
        /*
         * Free the space we allocated
         */
        "add sp, sp, %3\n\t"
        /*
         * Return the new register table, taking it from Stg's R1 (ARM's R7).
         */
        "mov %0, r7\n\t"
        /*
         * restore callee-saves registers.
         */
#if !defined(arm_HOST_ARCH_PRE_ARMv6)
        "vldmia sp!, {d8-d11}\n\t"
#endif
        "ldmfd sp!, {r4-r10, fp, ip, lr}\n\t"
      : "=r" (r)
      : "r" (f), "r" (basereg), "i" (123)
      : 
    );
    return r;
}

and running gcc -c test.c -o test.o on an armel machine works fine. The test machine that I have access to is not the autobuilder box, but a also a “Marvell Feroceon CPU @ 1GHz on a Marvell DB-78x00-BP Development Board (ARM v5)”, so I guess setting arm_HOST_ARCH_PRE_ARMv6 is correct.

Not defining this yields:

$ gcc -c test.c -o test.o
/tmp/ccL5k8Fg.s: Assembler messages:
/tmp/ccL5k8Fg.s:29: Error: selected processor does not support ARM mode `vstmdb sp!,{d8-d11}'
/tmp/ccL5k8Fg.s:38: Error: selected processor does not support ARM mode `vldmia sp!,{d8-d11}'

So this does not give any insight into the problem.

  Changed 4 months ago by igloo

Perhaps the easiest thing is to set a build going on the machine you have access to, and see if that fails, then.

  Changed 4 months ago by nomeata

I hope you don’t mind using the ticket also as a experiment record. Maybe it helps some.

So this is the gcc call that ghc is using to compile the c file and which fails:

/usr/bin/gcc -fno-stack-protector -Wl,--hash-size=31 -Wl,--reduce-memory-overheads -x c rts/StgCRun.c -o rts/StgCRun.s -S -Wimplicit -O -D__GLASGOW_HASKELL__=704 -DTABLES_NEXT_TO_CODE -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Winline -Waggregate-return -Wpointer-arith -Wmissing-noreturn -Wnested-externs -Wredundant-decls -Iincludes -Irts -Irts/dist/build -DCOMPILING_RTS -DUSE_LIBFFI_FOR_ADJUSTORS -fno-strict-aliasing -fno-common -g -O0 -DRtsWay="rts_debug" -w -DDEBUG -mlong-calls -I includes -I rts -I rts/dist/build -I rts/dist/build -I rts/dist/build/autogen -I /home/nomeata/ghc-7.4.1/libraries/base/include -I /home/nomeata/ghc-7.4.1/rts/dist/build -I /home/nomeata/ghc-7.4.1/includes -I /home/nomeata/ghc-7.4.1/includes/dist-ghcconstants/header -I /home/nomeata/ghc-7.4.1/includes/dist-derivedconstants/header

Just reverting [5a984f4388ef85d5c3af973b21a12c12b36c1ed4] does not help. What does help is to remove both r11 and fp:

--- rts/StgCRun.c.orig  2012-02-12 22:02:56.000000000 +0000
+++ rts/StgCRun.c       2012-02-12 22:02:59.000000000 +0000
@@ -672,7 +672,7 @@
         "ldmfd sp!, {r4-r11, fp, ip, lr}\n\t"
       : "=r" (r)
       : "r" (f), "r" (basereg), "i" (RESERVED_C_STACK_BYTES)
-      : "%r4", "%r5", "%r6", "%r8", "%r9", "%r10", "%r11", "%fp", "%ip", "%lr"
+      : "%r4", "%r5", "%r6", "%r8", "%r9", "%r10",  "%ip", "%lr"
     );
     return r;
 }

But someone who actually knows what that means should tell me if it is a sensible thing to do.

  Changed 4 months ago by kgardas

Hi, Joachim (I hope nomeata is Joachim, if not please complain), could you be so kind and verify for us, that Debian is using pure ARM code? I.e. what `gcc -v' claims? Is it compiling by default in ARM mode or in Thumb mode? If the former, then I see the reason for this complain. In fact Ben already experience it himself on Ubuntu which is using Thumb mode by default and claims that r7 is used for frame pointer by compiler in this case. If Debian is using ARM mode, then fp is r11 and the compiler should complain about it in the same way. So if all this is true, then this should solve my question why Debian and Ubuntu compilers behaves in different way here. Now, we do have another issue. In ARM mode (on Debian presumably) we should mark R7 as clobbered IMHO. I'm not sure if we should mark R11 as clobbered in Thumb mode yet. Anyway, marking R11 as clobbered on Thumb is less important than marking R7 as clobbered on ARM as Thumb is not going to use R11 anyway due to its limited register space. I'm going to give this idea a try since on my Panda I do have gcc 4.4.1 (from older ubuntu), which is compiling in ARM mode by default so this should duplicate Debian's issue -- if my idea is right here. I will keep you posted.

  Changed 4 months ago by nomeata

Yes, I’m Joachim :-)

I see no mention of Thumb here, but I guess you can make more out of this;

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabi/4.6/lto-wrapper
Target: arm-linux-gnueabi
Configured with: ../src/configure -v --with-pkgversion='Debian 4.6.2-14' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc --disable-sjlj-exceptions --with-arch=armv4t --with-float=soft --enable-checking=release --build=arm-linux-gnueabi --host=arm-linux-gnueabi --target=arm-linux-gnueabi
Thread model: posix
gcc version 4.6.2 (Debian 4.6.2-14) 

  Changed 3 months ago by igloo

nomeata, do you have a failing build that you can access now, then? If so, are you able to cut down StgCRun.c and the gcc commandline to make a minimal, standalone testcase please?

  Changed 3 months ago by kgardas

The testcase is IMHO not needed. Just try to compile with -optc=-marm/-optc=-mthumb and you will see it too. On Ubuntu you do have -mthumb by default, so you need to give a try to -marm to see Debian's result. Anyway, I already do have a patch for this issue and I'm going to attach it here for your testing in a few minutes.

Changed 3 months ago by kgardas

The patch fixing StgCRun clobbered regs list for both ARM and Thumb modes

Changed 3 months ago by kgardas

the patch fixing ARM StgCRun to not save and restore r11/fp register twice

  Changed 3 months ago by kgardas

Joachim, Ben, could you be so kind and test my last two patches on your platforms? Here my Panda is just running testsuite to see if those are ok on Ubuntu 11.04. Thanks! Karel

follow-up: ↓ 24   Changed 3 months ago by kgardas

Hello, my last two patches passes testing here. I get usually expected set of failures. Joachim, how is it looking with testing on Debian? Thanks!

in reply to: ↑ 23   Changed 3 months ago by nomeata

Replying to kgardas:

Hello, my last two patches passes testing here. I get usually expected set of failures. Joachim, how is it looking with testing on Debian? Thanks!

I patched the file and compiling individually works fine. I continued the partial build to see if the resulting compiler does actually work, then I’ll report bug. But things look good.

  Changed 3 months ago by nomeata

Ok, built worked fine, so from my side you can go ahead and apply the patch. Thanks!

follow-up: ↓ 27   Changed 3 months ago by kgardas

Simon, could you be so kind and apply two attached patches provided by me (kgardas)? It looks like, they are working for both Ubuntu and Debian folks. Thanks!

in reply to: ↑ 26   Changed 3 months ago by nomeata

Replying to kgardas:

Simon, could you be so kind and apply two attached patches provided by me (kgardas)?

Preferably also to the branch that becomes 7.4.2, to avoid avoidable patching in the distribution packages, if that is compatible with your release policy.

  Changed 3 months ago by simonmar

  • status changed from patch to merge
  • milestone changed from 7.4.2 to 7.6.1

  Changed 3 months ago by simonmar

  • milestone changed from 7.6.1 to 7.4.2

oops, not sure what happened there

  Changed 3 months ago by pcapriotti

  • status changed from merge to closed
  • resolution set to fixed
Note: See TracTickets for help on using tickets.