Ticket #7192 (closed bug: fixed)
Bug in -fregs-graph with -fnew-codegen
| Reported by: | simonmar | Owned by: | benl |
|---|---|---|---|
| Priority: | highest | Milestone: | 7.8.1 |
| Component: | Compiler | Version: | 7.7 |
| Keywords: | | Cc: | benl, pho@… |
| Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |
| Type of failure: | None/Unknown | Difficulty: | Unknown |
| Test Case: | | Blocked By: | |
| Blocking: | #4258 | Related Tickets: | |
Description
This is triggered by running the test dph-diophantine-opt with -fnew-codegen. It segfaults for me on x86_64/Linux. I verified that adding -fno-regs-graph fixes it, as does -fllvm.
This failure is blocking the new code generator switchover. I don't want to break DPH, although perhaps we could disable -fregs-graph temporarily until this bug is fixed.
Here's the bit of code that is miscompiled. I think there are probably several instances of a similar sequence in the module, because I saw the failure happening in several different places.
Cmm code:
sT8r_info:
{offset
c1cnV:
_sT8r::I64 = R1;
_sT85::I64 = R2;
_sT82::I64 = R3;
_sT8m::I64 = R4;
_sT87::I64 = R5;
_sT8a::I64 = R6;
if (Sp - 96 < SpLim) goto c1co3; else goto c1cnZ;
c1cnZ:
Hp = Hp + 16;
if (Hp > HpLim) goto c1co2; else goto c1coR;
c1co2:
HpAlloc = 16;
goto c1co3;
c1co3:
R1 = _sT8r::I64;
I64[Sp - 8] = _sT8a::I64;
I64[Sp - 16] = _sT87::I64;
I64[Sp - 24] = _sT8m::I64;
I64[Sp - 32] = _sT82::I64;
I64[Sp - 40] = _sT85::I64;
Sp = Sp - 40;
call (stg_gc_fun)() args: 48, res: 0, upd: 8;
asm code:
0x0000000000438f38 <+0>: mov %rbx,%r10 # R1 saved in %r10
0x0000000000438f3b <+3>: mov %rdi,0x48(%rsp)
0x0000000000438f40 <+8>: mov %r8,0x40(%rsp)
0x0000000000438f45 <+13>: mov %r9,%rbx # %rbx clobbered
0x0000000000438f48 <+16>: lea -0x60(%rbp),%rax
0x0000000000438f4c <+20>: cmp %r15,%rax
0x0000000000438f4f <+23>: jb 0x438ff5 <sT8r_info+189>
...
0x0000000000438ff5 <+189>: mov %rbx,-0x8(%rbp)
0x0000000000438ff9 <+193>: mov %r8,%rax
0x0000000000438ffc <+196>: mov %rax,-0x10(%rbp)
0x0000000000439000 <+200>: mov %rdi,%rax
0x0000000000439003 <+203>: mov %rax,-0x18(%rbp)
0x0000000000439007 <+207>: mov %rsi,-0x20(%rbp)
0x000000000043900b <+211>: mov %r14,-0x28(%rbp)
0x000000000043900f <+215>: add $0xffffffffffffffd8,%rbp
0x0000000000439013 <+219>: jmpq *-0x8(%r13)
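The problem is that the contents of %rbx (aka R1) have been lost by the time of the jump: %rbx is overwritten with %r9 at +13, and the copy saved in %r10 at +0 is never moved back on this path. %rbx is supposed to contain the same value it had on entry to the proc, since the Cmm sets R1 = _sT8r before calling stg_gc_fun.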
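To make the clobber easier to see, here is a rough sketch that replays only the register-to-register moves from the listing above on a symbolic register file. It is not GHC code: the helper `run` and the dictionary layout are made up for illustration, and the R2-R6 mapping is read off the stores in the listing (the ticket itself only states that %rbx is R1).

```python
# Symbolic register file on entry: each machine register holds the name
# of the STG register value it carries. Mapping for R2..R6 is inferred
# from the stores in the asm listing (assumption for this sketch).
entry = {
    "rbx": "R1", "r14": "R2", "rsi": "R3",
    "rdi": "R4", "r8": "R5", "r9": "R6",
}

def run(moves, regs):
    """Apply (src, dst) register-to-register moves in order (hypothetical helper)."""
    regs = dict(regs)  # do not mutate the caller's register file
    for src, dst in moves:
        regs[dst] = regs[src]
    return regs

# Moves executed on the path entry -> stack check fails -> jump to stg_gc_fun.
# Stores to memory are omitted because they do not change any register.
graph_path = [
    ("rbx", "r10"),  # +0:   R1 saved in %r10
    ("r9", "rbx"),   # +13:  %rbx overwritten with R6
    ("r8", "rax"),   # +193: staging copy for the store at +196
    ("rdi", "rax"),  # +200: staging copy for the store at +203
]

at_jump = run(graph_path, entry)
print(at_jump["rbx"])  # value stg_gc_fun will treat as R1
print(at_jump["r10"])  # where the original R1 actually lives
```

With these moves, %rbx holds the value of R6 at the jump and the original R1 is stranded in %r10, which matches the annotations in the listing.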
BTW, I recently made a change that might be relevant: the registers R1-R8 can now be used by the register allocator. See commit f857f0741515b9ebf186beb38fe64448de355817.
The linear register allocator is generating much better code here:
leaq -96(%rbp),%rax
cmpq %r15,%rax
jb .Lc12k1
...
movq %r9,-8(%rbp)
movq %r8,-16(%rbp)
movq %rdi,-24(%rbp)
movq %rsi,-32(%rbp)
movq %r14,-40(%rbp)
addq $-40,%rbp
jmp *-8(%r13)
