Ticket #5897 (closed bug: fixed)
GHC runtime task workers are not released with C FFI
|Reported by:||sanketr||Owned by:|
|Operating System:||Unknown/Multiple||Architecture:||x86_64 (amd64)|
|Type of failure:||None/Unknown||Difficulty:||Unknown|
|Test Case:||Blocked By:|
I have test code that calls into C through the FFI to collect data every n microseconds. The timer event in the Haskell code spawns one thread for each C FFI thread. Those C FFI threads call back into Haskell and coordinate with the calling GHC thread through an MVar. What I consistently see is that the number of runtime task workers increases with each iteration of the timer event. The attached test code reproduces the issue (please see the attached README for how to run it).
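For readers without the attachment, the shape of the pattern described above can be sketched roughly as follows. This is not the attached test code and is not expected to reproduce the bug. The names `Callback`, `mkCallback`, `invokeFromC` and `collectOnce` are hypothetical. A "dynamic" foreign import stands in for the C collector so that the sketch needs no C source, so the callback arrives on the calling OS thread rather than on a separate C-created thread as in the report.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, replicateM_)
import Foreign.C.Types (CInt (..))
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- Hypothetical callback type: the C side hands one sample back to Haskell.
type Callback = CInt -> IO ()

-- Wrap a Haskell closure as a C function pointer.
foreign import ccall "wrapper"
  mkCallback :: Callback -> IO (FunPtr Callback)

-- Assumption: stands in for the C collector. It leaves Haskell through a
-- safe foreign call and immediately re-enters it through the function pointer.
foreign import ccall safe "dynamic"
  invokeFromC :: FunPtr Callback -> Callback

-- One timer iteration: fork one Haskell thread per collector, let each
-- collector call back, and wait for every result on its own MVar.
collectOnce :: Int -> IO [CInt]
collectOnce n = do
  boxes <- forM [1 .. n] $ \i -> do
    box <- newEmptyMVar
    _ <- forkIO $ do
      cb <- mkCallback (putMVar box)   -- the callback fills the MVar
      invokeFromC cb (fromIntegral i)  -- safe call out, callback back in
      freeHaskellFunPtr cb             -- release the wrapper after the call
    return box
  mapM takeMVar boxes                  -- the caller blocks until all report

main :: IO ()
main = replicateM_ 2 (collectOnce 7 >>= print)  -- two iterations, 7 collectors
```

The report concerns the threaded runtime, so a build of this sketch would presumably use `ghc -threaded`. Whether the task count grows with this simplified variant is untested.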
I tested this on both Mac OS X and Red Hat Linux x86_64 with GHC 7.4.1, and was able to reproduce the issue reliably.
The end result is that if the number of C FFI threads is beyond a certain threshold (6 on my quad-core iMac), the number of runtime tasks seems to increase without bound. For example, here is sample RTS output from the attached test code, with 7 C FFI threads and 2 GHC threads, after two iterations of the call into C. Instead of 4 task workers there are 10. This was tested with "-N3 +RTS -s" on GHC 7.4.1 and Mac OS X 10.7.2 (quad-core iMac):
  Parallel GC work balance: nan (0 / 0, ideal 1)

                        MUT time (elapsed)    GC time (elapsed)
  Task  0 (worker) :    0.00s ( 0.42s)        0.00s ( 0.00s)
  Task  1 (worker) :    0.00s ( 0.00s)        0.00s ( 0.00s)
  Task  2 (worker) :    0.00s ( 0.93s)        0.00s ( 0.00s)
  Task  3 (worker) :    0.00s ( 0.93s)        0.00s ( 0.00s)
  Task  4 (worker) :    0.00s ( 0.93s)        0.00s ( 0.00s)
  Task  5 (worker) :    0.00s ( 0.93s)        0.00s ( 0.00s)
  Task  6 (worker) :    0.00s ( 0.93s)        0.00s ( 0.00s)
  Task  7 (worker) :    0.06s ( 1.00s)        0.00s ( 0.00s)
  Task  8 (worker) :    0.00s ( 0.00s)        0.00s ( 0.00s)
  Task  9 (worker) :    0.06s ( 1.43s)        0.00s ( 0.00s)
  Task 10 (bound)  :    0.00s ( 0.00s)        0.00s ( 0.00s)
The culprit seems to be the MVar callback from the C FFI threads. If I remove the MVar callback, the number of task workers stays constant at 4.
If this is a bug in the GHC runtime and not in my code, it seems to be a serious one, because MVar callbacks are important for coordinating with C FFI threads. This bug may have been present in previous versions of GHC as well, but probably went undiscovered because it seems to require a certain number of C FFI threads before it appears.