Ticket #3989 (closed bug: duplicate)

Opened 3 years ago

Last modified 3 years ago

Parallel GC with more -N than physical processors gives dramatic slowdown (75 times)

Reported by: NeilMitchell
Owned by:
Priority: normal
Milestone:
Component: Runtime System
Version: 6.12.1
Keywords:
Cc:
Operating System: Windows
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Difficulty:
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description

Running hlint 1.6.21 (http://hackage.haskell.org/package/hlint) with GHC 6.12.1 on a Windows laptop with two cores, I get:

timer hlint src +RTS -N1 -qg = 1.344 seconds
timer hlint src +RTS -N2 -qg = 1.000 seconds
timer hlint src +RTS -N3 -qg = 0.984 seconds
timer hlint src +RTS -N4 -qg = 1.016 seconds
timer hlint src +RTS -N1 = 1.344 seconds
timer hlint src +RTS -N2 = 0.969 seconds
timer hlint src +RTS -N3 = 76.563 seconds

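At -N1, -qg has no effect (as expected).

At -N2, -qg makes a small difference: 1.000 seconds with it against 0.969 seconds without. I repeated the benchmarks many times, so the difference is real.

At -N3, -qg is essential: without it the run takes 76.563 seconds instead of 0.984 seconds.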

The conclusion seems to be that if you give the parallel garbage collector more threads than you have cores, it slows down dramatically. People often use -N with a higher number than their processors, since it nicely allows IO and computation to be interleaved. The GC should probably lower its own thread count below -N if it sees a lot of contention.
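As a workaround on the application side, the program itself can refuse to run with more capabilities than the machine has processors. The following is only a minimal sketch under some assumptions: it needs a GHC considerably newer than the 6.12.1 in this report (one whose base library provides setNumCapabilities and getNumProcessors), the program must be linked with -threaded, and clampCaps is a hypothetical helper name, not part of any library. Note that getNumProcessors reports the processors the OS exposes, which may be logical rather than physical cores.

```haskell
import Control.Concurrent (getNumCapabilities, setNumCapabilities)
import Control.Monad (when)
import GHC.Conc (getNumProcessors)

-- | Hypothetical helper: never use more capabilities than processors,
-- and never fewer than one.
clampCaps :: Int -> Int -> Int
clampCaps processors requested = max 1 (min processors requested)

main :: IO ()
main = do
  processors <- getNumProcessors    -- processors reported by the OS
  requested  <- getNumCapabilities  -- the value given with +RTS -N
  let chosen = clampCaps processors requested
  -- Only touch the RTS when -N oversubscribes the machine.
  when (chosen /= requested) (setNumCapabilities chosen)
  final <- getNumCapabilities
  putStrLn ("processors=" ++ show processors
         ++ " requested=" ++ show requested
         ++ " using=" ++ show final)
```

On the two-core laptop above, running such a program with +RTS -N3 would be expected to continue with two capabilities, so the parallel GC is never asked to run three threads on two cores.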

This is a performance regression from GHC 6.10.4, where HLint worked fine with +RTS -N3. I only caught this regression as my test suite started to take forever to run.

Change History

Changed 3 years ago by simonmar

  • status changed from new to closed
  • resolution set to duplicate

Thanks for the report; closing as duplicate of #3553 (parallel gc suffers badly if one thread is descheduled).

You should find that 6.12.2 is better, although you'll still get a slowdown if you use a -N value greater than the number of physical cores. There should never be a reason to do that: GHC will already overlap I/O with computation. You might well get a speedup by using more threads than cores, but using more capabilities than cores is not a good idea.
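To illustrate the difference between threads and capabilities: the sketch below forks 100 lightweight Haskell threads, far more than any sensible -N value, and leaves the number of capabilities at whatever the RTS was started with. It is an illustration only; fakeIO and runAll are hypothetical names, and threadDelay merely stands in for a blocking I/O call.

```haskell
import Control.Concurrent (forkIO, getNumCapabilities, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- | Hypothetical stand-in for an I/O-bound job: wait briefly, then
-- return a result.
fakeIO :: Int -> IO Int
fakeIO n = do
  threadDelay 10000  -- 10 ms, standing in for blocking I/O
  return (n * n)

-- | Fork one lightweight thread per job and collect the results in
-- job order. Each thread hands its result back through its own MVar.
runAll :: [Int] -> IO [Int]
runAll jobs = do
  boxes <- forM jobs $ \n -> do
    box <- newEmptyMVar
    _ <- forkIO (fakeIO n >>= putMVar box)
    return box
  mapM takeMVar boxes

main :: IO ()
main = do
  caps    <- getNumCapabilities
  results <- runAll [1 .. 100]
  putStrLn (show (length results) ++ " threads ran on "
         ++ show caps ++ " capabilities")
```

Because the waits overlap, the whole run should take far less than 100 separate 10 ms delays even at -N1, which is the sense in which extra threads can help while extra capabilities do not.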

You're probably right that we ought to reduce the number of capabilities or turn off parallel GC if we find that things are going badly. See #3729 (Allow modification of capabilities at runtime), to which I'll add a link to this ticket.
