Ticket #3279 (closed merge: fixed)

Opened 4 years ago

Last modified 4 years ago

Segmentation fault in reactive program

Reported by: Baughn
Owned by: igloo
Priority: high
Milestone: 6.10.4
Component: Runtime System
Version: 6.11
Keywords:
Cc: int-e@…
Operating System: Unknown/Multiple
Architecture: x86_64 (amd64)
Type of failure:
Difficulty: Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description

Trying to debug reactive, I triggered what appears to be a GHC bug. To clarify:

- Replacing "foo `unamb` bar `unamb` baz" with foldl1 unamb [foo,bar,baz] in one particular function causes a segmentation fault at runtime, in every combination of -threaded/non-threaded and GHC 6.10.3/6.11. Several other, seemingly unrelated changes also trigger it, quite unpredictably.
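
For readers wondering why the two spellings ought to behave the same, here is a minimal sketch. It assumes unamb is applied with Haskell's default fixity for backquoted functions (infixl 9), which I have not checked against the package; `grouped` is a hypothetical stand-in that only records how its arguments are nested and does not race anything.

```haskell
-- Hypothetical stand-in for unamb: it records how its arguments were grouped
-- instead of racing them against each other.
grouped :: String -> String -> String
grouped l r = "(" ++ l ++ " `unamb` " ++ r ++ ")"

-- Infix chain; under the default fixity (infixl 9) this nests to the left.
chained :: String
chained = "foo" `grouped` "bar" `grouped` "baz"

-- The foldl1 spelling from the report.
folded :: String
folded = foldl1 grouped ["foo", "bar", "baz"]

main :: IO ()
main = do
  putStrLn chained
  putStrLn folded
  print (chained == folded)
```

Both spellings produce the same left-nested grouping, so under that assumption the rewrite should not change the meaning of the program, only the code GHC generates for it.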

The hackage version of reactive does not exhibit this bug. I've attached a quite minimal patch to it that causes this, as well as a test program to trigger it.

I've been trying to debug this myself, with very little success. No level of core-lint or debug checks causes this code to trigger assertions instead of outright segfaults.

Incidentally, lazysmallcheck-0.3, a dependency of reactive, does not compile against 6.11 due to a name change in the Data.Generics interface. I've uploaded a fixed version to http://brage.info/~svein/lsc.tar.gz.

Attachments

reactive.patch (1.6 KB) - added by Baughn 4 years ago.
crash.hs (1.1 KB) - added by Baughn 4 years ago.
crash.2.hs (0.6 KB) - added by Baughn 4 years ago.
assert.patch (0.8 KB) - added by Baughn 4 years ago.

Change History

Changed 4 years ago by Baughn

  Changed 4 years ago by Baughn

Editing crash.hs to cut down on the number of imports, funnily enough, caused the program to progress further before crashing. It still crashes, though.

  Changed 4 years ago by int-e

Apparently stg_sel_ret_0_upd_info is called with an stg_dummy_ret_closure argument. Where does it come from? RaiseAsync.c?

> ghc --version
The Glorious Glasgow Haskell Compilation System, version 6.10.3
> gdb ./crash -ex 'break stg_sel_ret_0_upd_info' -ex 'run' -ex 'c 477' -ex 'i r'
GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
Breakpoint 1 at 0x8142e8c
Starting program: /var/home/bf3/t/reactive/crash 
[Thread debugging using libthread_db enabled]
[New Thread 0xb7ecb6c0 (LWP 10828)]
[Switching to Thread 0xb7ecb6c0 (LWP 10828)]

Breakpoint 1, 0x08142e8c in stg_sel_ret_0_upd_info ()
Current language:  auto; currently asm
Will ignore next 476 crossings of breakpoint 1.  Continuing.
<Imp NoBound 2.0,1.0>
<Imp NoBound 2.5,2.0>
<Imp NoBound 3.2,3.0>
<Imp NoBound 3.7,4.0>
"never-never"
<Imp NoBound 2.0,()>
<Imp NoBound 2.5,()>

Breakpoint 1, 0x08142e8c in stg_sel_ret_0_upd_info ()
eax            0xb7dcb4c8       -1210272568
ecx            0xb7d00000       -1211105280
edx            0xb7d01960       -1211098784
ebx            0x81819c8        135797192
esp            0xbffbe8ac       0xbffbe8ac
ebp            0xb7cf2fa4       0xb7cf2fa4
esi            0x817edc8        135785928
edi            0xb7dcc314       -1210268908
eip            0x8142e8c        0x8142e8c <stg_sel_ret_0_upd_info>
eflags         0x246    [ PF ZF IF ]
cs             0x73     115
ss             0x7b     123
ds             0x7b     123
es             0x7b     123
fs             0x0      0
gs             0x33     51
(gdb) x $esi & -4
0x817edc8 <stg_dummy_ret_closure>:      0x08142d4c
(gdb) ni 8
Program received signal SIGSEGV, Segmentation fault.

  Changed 4 years ago by Baughn

Probably something to do with that, yes.

I've traced the problem to the exception-handling code in Unamb.hs; if I comment out the killThreads in race everything works fine (if inefficiently), as no exceptions are ever thrown. Meanwhile, if I replace the call to restartingUnsafePerformIO with just unsafePerformIO, the program fails to crash, but also fails to work.

restartingUnsafePerformIO is a very dodgy piece of code. I'd focus any efforts on that.

Meanwhile, I still haven't managed to produce a simple test-case, even though I now know it's unamb. (Well, I suspected that all along, of course..)

  Changed 4 years ago by Baughn

Going by int-e's research, this bug may be 64-bit only; it fails to trigger on his 32-bit system using 6.11 (and possibly 6.10? I didn't ask.)

The attached patch turns it into an assertion failure, when compiled with -debug.

Changed 4 years ago by Baughn

  Changed 4 years ago by int-e

  • cc int-e@… added

I was testing ghc 6.11 with the wrong version of 'unamb' (the hackage one instead of the darcs head). Now I can reproduce the crash with ghc 6.11 as well.

  Changed 4 years ago by int-e

FTR: I've confirmed that the stg_dummy_ret_closure comes from RaiseAsync.c by adding an additional stg_dummy_ret_ra_closure and using that in RaiseAsync.c.

  Changed 4 years ago by simonmar

  • owner set to simonmar
  • difficulty set to Unknown
  • priority changed from normal to high
  • component changed from Compiler to Runtime System
  • milestone set to 6.12.1

  Changed 4 years ago by Baughn

There was a bug inside Unamb.hs. I think I'd best paste the relevant code..

race a b = block $ do

v <- newEmptyMVar let f x = forkIO $ putCatch x v ta <- f a tb <- f b

- We rely on killing the threads forked here in order to limit excessive work, but as you can see I'd forgotten to unblock exceptions first. Switching to a corrected implementation

race a b = block $ do

v <- newEmptyMVar let f x = forkIO $ putCatch (unblock x) v ta <- f a tb <- f b

removed that problem. It also removed the crash. However, I have a feeling it may still exist in potentia, as a race condition. At any rate, that's a place to start; exceptions thrown to blocked threads that go on to evaluate bottoms.
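
The reason the forgotten unblock matters is that a forked thread inherits the masking state of its parent. The sketch below illustrates this with the `mask_` and `forkIOWithUnmask` functions from later versions of base, which replaced `block` and `unblock`; the names `inheritedState` and `unmaskedState` are made up for this illustration and are not part of unamb.

```haskell
import Control.Concurrent (forkIO, forkIOWithUnmask)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (MaskingState (..), getMaskingState, mask_)

-- Masking state seen by a child forked with plain forkIO under mask_.
-- The child inherits the parent's masked state, so a killThread aimed at it
-- stays pending until it unmasks or reaches an interruptible operation.
inheritedState :: IO MaskingState
inheritedState = mask_ $ do
  v <- newEmptyMVar
  _ <- forkIO (getMaskingState >>= putMVar v)
  takeMVar v

-- Masking state seen by a child whose body is explicitly unmasked.
unmaskedState :: IO MaskingState
unmaskedState = mask_ $ do
  v <- newEmptyMVar
  _ <- forkIOWithUnmask $ \unmask -> unmask getMaskingState >>= putMVar v
  takeMVar v

main :: IO ()
main = do
  inheritedState >>= print
  unmaskedState >>= print
```

The first child reports a masked state and the second an unmasked one, which mirrors the difference between the two versions of race quoted here.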

  Changed 4 years ago by Baughn

Formatting messed up. Corrected, here:

Before:

race a b = block $ do
    v <- newEmptyMVar
    let f x = forkIO $ putCatch x v
    ta <- f a
    tb <- f b

After:

race a b = block $ do
    v <- newEmptyMVar
    let f x = forkIO $ putCatch (unblock x) v
    ta <- f a
    tb <- f b
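
For anyone trying this on a newer GHC, where `block` and `unblock` are no longer available, the following is a rough, self-contained sketch of the same shape using `mask` and `forkIOWithUnmask`. It is not the unamb implementation: `putCatch` here is a simplified, hypothetical version that swallows every exception, and `raceSketch` is a name invented for this example.

```haskell
import Control.Concurrent (forkIOWithUnmask, killThread, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException (..), catch, mask, onException)

-- Hypothetical, simplified putCatch: store the result in the MVar and
-- swallow any exception, including the ThreadKilled sent to the loser.
putCatch :: IO a -> MVar a -> IO ()
putCatch act v = (act >>= putMVar v) `catch` \(SomeException _) -> return ()

-- Run both actions in their own threads and return whichever result arrives
-- first. The children run their bodies unmasked so killThread can stop them.
-- Caveat of this sketch: if both actions throw, nothing fills the MVar.
raceSketch :: IO a -> IO a -> IO a
raceSketch a b = mask $ \restore -> do
  v <- newEmptyMVar
  let f x = forkIOWithUnmask $ \unmask -> putCatch (unmask x) v
  ta <- f a
  tb <- f b
  let cleanup = killThread ta >> killThread tb
  r <- restore (takeMVar v) `onException` cleanup
  cleanup
  return r

main :: IO ()
main = do
  r <- raceSketch (threadDelay 2000000 >> return "slow") (return "fast")
  putStrLn r
```

The important part is `unmask x`: without it the children would stay masked, just like the threads forked in the "Before" version, and the kill would only be delivered at an interruptible point.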

  Changed 4 years ago by simonmar

I don't have any luck reproducing this one either. (see #3288)

~/scratch/3288 > ./crash3279
<Imp NoBound 2.0,1.0>
<Imp NoBound 2.5,2.0>
<Imp NoBound 3.2,3.0>
<Imp NoBound 3.7,4.0>
"never-never"
<Imp NoBound 2.0,()>
<Imp NoBound 2.5,()>
<Imp NoBound 3.2,()>
<Imp NoBound 4.0,()>
crash3279: BothBottom
~/scratch/3288 > ./crash3279-2
<Imp NoBound 2.0,1.0>
<Imp NoBound 2.5,2.0>
<Imp NoBound 3.2,3.0>
<Imp NoBound 3.7,4.0>
"never-never"
<Imp NoBound 2.0,()>
<Imp NoBound 2.5,()>
<Imp NoBound 3.2,()>
<Imp NoBound 4.0,()>
crash3279-2: BothBottom

  Changed 4 years ago by Baughn

Yes, sorry. See my last message.

The last version of unamb that still reliably exhibits the bug is 0.2

  Changed 4 years ago by int-e

Here's a testcase for you. It does not crash, but it prints wrong results (112459785 instead of 2 here; Baughn tested it on a 64-bit machine and got 1114609665). I've also verified that your patch to unblock fixes this behaviour.

-- test for #3279

import System.IO.Unsafe
import GHC.Conc
import Control.Exception
import Prelude hiding (catch)

f :: Int
f = (1 +) . unsafePerformIO $ do
        error "foo" `catch` \(SomeException e) -> do
            myThreadId >>= flip throwTo e
            -- point X
            unblock $ return 1

main :: IO ()
main = do
    evaluate f `catch` \(SomeException e) -> return 0
    -- the evaluation of 'f' is now suspended at point X
    tid <- block $ forkIO (evaluate f >> return ())
    killThread tid
    -- now execute the 'unblock' above with a pending exception
    yield
    -- should print 1 + 1 = 2
    print f

  Changed 4 years ago by simonmar

  • owner changed from simonmar to igloo
  • type changed from bug to merge
  • milestone changed from 6.12.1 to 6.10.4

Thanks for the testcase!

Fixed

Tue Jun 16 08:24:55 PDT 2009  Simon Marlow <marlowsd@gmail.com>
  * Fix #3279, #3288: fix crash encountered when calling unblock inside unsafePerformIO

  Changed 4 years ago by igloo

  • status changed from new to closed
  • resolution set to fixed

Merged
