Ticket #5859 (new bug)

Opened 16 months ago

Last modified 2 months ago

unsafeInterleaveIO duplicates computation when evaluated by multiple threads

Reported by: joeyadams Owned by: simonpj
Priority: high Milestone: 7.6.2
Component: libraries/base Version: 7.2.2
Keywords: Cc: pho@…, hackage.haskell.org@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

When the following code is compiled with -O1 or -O2, the interleaved computation (putStrLn "eval") is performed 1000 times, rather than once:

import Control.Concurrent
import Control.Exception (evaluate)
import Control.Monad
import System.IO.Unsafe

main :: IO ()
main = do
    x <- unsafeInterleaveIO $ putStrLn "eval"
    replicateM_ 1000 $ forkIO $ evaluate x >> return ()
    threadDelay 1000000

Taking a look at the source to unsafeInterleaveIO:

{-# INLINE unsafeInterleaveIO #-}
unsafeInterleaveIO :: IO a -> IO a
unsafeInterleaveIO m = unsafeDupableInterleaveIO (noDuplicate >> m)

-- We believe that INLINE on unsafeInterleaveIO is safe, because the
-- state from this IO thread is passed explicitly to the interleaved
-- IO, so it cannot be floated out and shared.

It seems the comment about INLINE is not true. If I define the following function:

interleave :: IO a -> IO a
interleave = unsafeInterleaveIO
{-# NOINLINE interleave #-}

and replace unsafeInterleaveIO with interleave, "eval" is printed only once. If I change NOINLINE to INLINE, or if I remove the pragma altogether, "eval" is printed 1000 times.

I believe unsafeInterleaveIO should guarantee that computations are not repeated. Otherwise, we end up with strangeness like this:

import Control.Applicative
import Control.Concurrent
import Control.Monad

main :: IO ()
main = do
    chan <- newChan :: IO (Chan Int)
    mapM_ (writeChan chan) [0..999]
    items <- take 10 <$> getChanContents chan
    replicateM_ 5 $ forkIO $ putStrLn $ "items = " ++ show items
    threadDelay 1000000

which prints:

items = [0,1,2,3,4,5,6,7,8,9]
items = [10,11,12,13,14,15,16,17,18,19]
items = [20,21,22,23,24,25,26,27,28,29]
items = [30,31,32,33,34,35,36,37,38,39]
items = [40,41,42,43,44,45,46,47,48,49]

For the time being, programs can work around this by using a NOINLINE wrapper:

getChanContents' :: Chan a -> IO [a]
getChanContents' = getChanContents
{-# NOINLINE getChanContents' #-}

I tested this on Linux 64-bit with GHC 7.2.2 and ghc-7.4.0.20120111, and on Windows 32-bit with GHC 7.0.3 and 7.2.2. All of these platforms and versions exhibit the same behavior. The bug goes away when the program is compiled with -O0, or when functions returning interleaved computations are marked NOINLINE (e.g. getChanContents').

Attachments

good.simpl Download (35.4 KB) - added by simonpj 15 months ago.
bad.simpl Download (43.1 KB) - added by simonpj 15 months ago.

Change History

Changed 16 months ago by PHO

  • cc pho@… added

Changed 15 months ago by simonmar

  • priority changed from normal to high
  • difficulty set to Unknown
  • milestone set to 7.4.2

Thanks for the report.

Changed 15 months ago by simonpj

This does indeed look bogus. I've done a little investigation, which I'm going to record here so I don't forget it.

Fundamentally the problem is the "state hack". (Compile with -fno-state-hack and the problem goes away.) Consider

   let x = factorial 1000 in replicate 10 (print x)

After a bit of inlining we get something like

   let x = factorial 1000 in replicate 10 (\s. wprint x s)

where s :: State# RealWorld is a state token. Now the float-in pass pushes the let-binding inwards, to give

   replicate 10 (\s.  wprint (factorial 1000) s)

and we are dead: we evaluate (factorial 1000) each time round the loop.

Why does float-in do this? Because the state hack says that the type of s means it is only applied once, which is patently false.

I think the solution is the narrow (once more) the scope of the state hack, so that it's only used in arity expansion. I know how to do this but have to set up nofib runs to compare before and after.

Thanks for the report. Hacks usually bite you in the end.

Simon

Changed 15 months ago by simonmar

  • owner set to simonpj

Changed 15 months ago by simonpj

I tried disabling the use of the state hack (isStateHackType, via isOneShotBndr) in

    simplCore/Simplify.lhs 
    simplCore/OccurAnal.lhs 
    simplCore/FloatIn.lhs

(the only other state hack uses are in coreSyn/CoreArity.lhs).

This resulted in perf/compiler/parsing001 failing slightly, and perf/should_run/T5536 failing a lot. I looked into the latter. The attached program has

     484268840 bytes allocated with the full state hack
    2085048512 bytes allocated with the state hack modified as above

Replacing either of the "if" expressions in the last few lines with just its "else" branch fixes the problem.

I've also attached the -ddump-simpl output.

Changed 15 months ago by simonpj

Changed 15 months ago by simonpj

Changed 13 months ago by simonpj

Cf #5943. Since unsafeDupableInterleaveIO is now NONINLINE, the symptoms of this ticket might have gone away.

Changed 12 months ago by igloo

  • milestone changed from 7.4.2 to 7.4.3

Changed 8 months ago by igloo

  • milestone changed from 7.4.3 to 7.6.2

Changed 2 months ago by liyang

  • cc hackage.haskell.org@… added
Note: See TracTickets for help on using tickets.