id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc	os	architecture	failure	difficulty	testcase	blockedby	blocking	related
1984	weird performance drop with -O2 on x86	guest	igloo	"Here's my program:
{{{
import Control.Concurrent
import Data.IORef

maker :: IORef Int -> IO ()
maker v = loop
    where
    loop = do
        x <- readIORef v
        writeIORef v $! x + 1
        forkIO (return ())
        loop

main :: IO ()
main = do
    v <- newIORef 0
    t <- forkIO (maker v)
    threadDelay 1000000
    killThread t
    x <- readIORef v
    print x
}}}

It's supposed to print the number of threads created in one second. Compiled with ghc -O2, it prints around 61104, and -O1 gives similar results. With no optimization, however, it prints around 612274, i.e. roughly ten times as many threads in the same time.
What's going on here?

More data points:

GHC 6.6.1 behaves similarly, but the numbers are a bit higher (~10% more iterations).

<dons> be sure to mention that results appear normal on amd64.
"	merge	closed	normal	6.8.3	Runtime System	6.8.2	fixed			Linux	x86		Unknown				
