Ticket #5991 (closed bug: fixed)

Opened 14 months ago

Last modified 7 months ago

regression: huge number of wakeups in xmonad

Reported by: dmwit Owned by:
Priority: high Milestone: 7.6.2
Component: Runtime System Version: 7.6.1
Keywords: Cc: gammelholte, dennis@…
Operating System: Linux Architecture: x86_64 (amd64)
Type of failure: Runtime performance bug Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

People using GHC 7.4 have been complaining that xmonad causes a huge number of wakeups (about 100 wakeups/s, according to one complaint, and 30% of all wakeups on the entire machine according to another). A workaround of making xmonad compile with -rtsopts and adding +RTS -V0 seems to hide this problem, but this is very inconvenient because it means the user can't use xmonad's own recompilation and restart mechanisms.

Change History

Changed 13 months ago by simonmar

  • status changed from new to infoneeded
  • difficulty set to Unknown
  • milestone set to 7.6.1

Is xmonad normally compiled with -threaded? Does it make a difference?

Changed 13 months ago by dmwit

xmonad is usually not compiled with -threaded; compiling with -threaded does seem to be another valid workaround.

16:42 < dylan> dmwit: -threaded also fixes the wakeups problem
16:42 < dmwit> dylan: Awesome, I'll say so on the bug.
16:42 < dmwit> Perhaps we should just be passing -threaded when we build in xmonad.
16:43 < dylan> ultimately the RTS should not be doing what it is doing though
16:43 < dmwit> Yes, and also -threaded is a bit scary with X code.
16:43 < dylan> I'm going back to -V0 for the time being
16:44 < dylan> the amusing bit
16:44 < dylan> either either solution
16:45 < dylan> there are still several hundred SIGVTALRM/rt_sigreturn pairs before it settles down into a poll()

Changed 13 months ago by simonmar

  • priority changed from normal to high
  • status changed from infoneeded to new
  • component changed from Compiler to Runtime System
  • milestone changed from 7.6.1 to 7.4.2

fix coming...

Changed 13 months ago by simonmar

  • owner set to simonmar

Changed 13 months ago by marlowsd@…

commit dd24d6bc37879c6b32a3d5ac4ee765e59e13501c

Author: Simon Marlow <marlowsd@gmail.com>
Date:   Wed Apr 11 10:12:14 2012 +0100

    Disable the timer signal while blocked in select() (#5991)
    
    The threaded RTS had a fix for this a long time ago (#1623) but this
    patch applies a similar fix to the non-threaded RTS.

 rts/posix/Select.c |   17 +++++++++++++----
 1 files changed, 13 insertions(+), 4 deletions(-)

Changed 13 months ago by simonmar

  • status changed from new to merge

Please also merge 9d26519c20194abc3832578a55fbe31327519eeb, as it fixes a regression in threadDelay in the non-threaded RTS that I noticed while investigating this bug.

Changed 13 months ago by pcapriotti

  • status changed from merge to closed
  • resolution set to fixed

Merged as 27c2f292d23983fac0e1f333a43424264c2e290b and 6aee474574e0aff10602a6fd17d37826b04a996d.

The first commit (threadDelay resolution change) broke threadDelay on 7.4 because of a broken unit conversion macro, which was fixed in the HEAD, so I merged that as well (a73d0a21c875ff539c2240e431eaf87173a92bd3).

Changed 11 months ago by nomeata

I have applied all three patches to the GHC-7.4.1 package in Debian, but from my testing it seems that the bug is really fixed, I still see much of this when stracing xmonad:

--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a)                      = -1 EINTR (Interrupted system call)
poll([{fd=3, events=POLLIN}], 1, -1)    = ? ERESTART_RESTARTBLOCK (To be restarted)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a)                      = -1 EINTR (Interrupted system call)
poll([{fd=3, events=POLLIN}], 1, -1)    = ? ERESTART_RESTARTBLOCK (To be restarted)

Did I miss something? Did someone confirm with GHC-7.4.2 that this bug is really fixed?

Changed 11 months ago by simonmar

I just tried with 7.4.2 and a simple threadDelay example and the bug looks fixed to me.

Is your xmonad compiled with -threaded or not? If it is, you can expect to see the SIGVTALRM signals for 0.3 seconds until the idle GC occurs, then they should stop.

Changed 9 months ago by gammelholte

  • status changed from closed to new
  • cc gammelholte added
  • version changed from 7.4.1 to 7.4.2
  • architecture changed from Unknown/Multiple to x86_64 (amd64)
  • owner simonmar deleted
  • resolution fixed deleted

Hello,

the bug is still there in 7.4.2---as noted by several users at  http://www.reddit.com/r/xmonad/comments/yp86g/help_with_xmonad_wakeups_bug/

That is, running vanilla xmonad on 7.4.2 leads to ~100 wakeups/s. Recompiling manually with -threaded fixes the problem, though.

Changed 9 months ago by simonmar

Does xmonad have a background thread doing a threadDelay? I think that might prevent the RTS from turning off its timer signal. If that's the case I know how to fix it.

Changed 9 months ago by gammelholte

How do I check if there's a background thread doing a threadDelay?

Thanks

Changed 9 months ago by dmwit

The xmonad code base itself does not call threadDelay. However, the xmonad-contrib packages do include three calls in various places:

* The XMonad.Prompt hierarchy use threadDelay after a prompt autocompletion to wait for windows to finish drawing themselves. It is not called in a background loop as far as I can tell; I expect it to be called about once per keypress. * XMonad.Util.Timer provides a timer. Possibly some configs use this module and create a timer which renews itself every time it fires; however, I think this is pretty uncommon. Many other modules import this one, and may have loops, I didn't check. * XMonad.Util.Run provides a function to spawn a process, wait a bit, and then terminate it. I very much don't expect this to be called in a background loop.

So, perhaps the next question is: does this problem occur with a stock xmonad config (i.e. nothing more than "import XMonad; main = defaultConfig")? If so, then we can confidently say there are no background loops calling threadDelay.

Changed 9 months ago by def

  • cc dennis@… added

This problem also occurs with a stock xmonad config. I tracked it down even further and have a minimal example now:

import Graphics.X11.Xlib

main = do
  dpy <- openDisplay ""
  allocaXEvent $ \xevent ->
    nextEvent dpy xevent

100 wakeups per second using GHC 7.4.2. The same program in C does not have this problem (compile with gcc -lX11 -o min-x11 min-x11.c):

#include <X11/Xlib.h>

int main(int argc, char **argv) {
  Display* dpy = XOpenDisplay(NULL);
  XEvent xevent;
  XNextEvent(dpy, &xevent);
}

The problem is related to blocking in a foreign call (in this case waiting on the socket /tmp/.X11-unix/X0). When the nextEvent' from  http://www.haskell.org/haskellwiki/X_window_programming_in_Haskell is used instead of nextEvent the problem disappears. It waits using Haskell's threadWaitRead instead of doing it inside the foreign call.

I have also found another instance of this problem when using GTK (it's also /tmp/.X11-unix/X0):

import Graphics.UI.Gtk

main = do
  initGUI
  mainGUI

Again no problems with the C version (compile with gcc $(pkg-config --cflags --libs gtk+-2.0) -o min-gtk min-gtk.c):

#include <gtk/gtk.h>

int main(int argc, char **argv) {
  gtk_init(&argc, &argv);
  gtk_main();
}

I could upload the strace outputs of the C and Haskell versions which show what they do differently. Tell me if there's anything I can do.

Changed 8 months ago by def

  • version changed from 7.4.2 to 7.6.1

I'm still seeing the same behaviour in GHC 7.6.1 as tested with the min-x11 examples from above.

Changed 8 months ago by marlowsd@…

commit 0b79d5cd4687dacf7efd430df7fba9d9a5a5ce32

Author: Simon Marlow <marlowsd@gmail.com>
Date:   Fri Sep 21 15:49:22 2012 +0100

    Another overhaul of the recent_activity / idle GC handling (#5991)
    
    Improvements:
    
     - we now turn off the timer signal in the non-threaded RTS after
       idleGCDelay.  This should make the xmonad users on #5991 happy.
    
     - we now turn off the timer signal after idleGCDelay even if the
       idle GC is disabled with +RTS -I0.
    
     - we now do *not* turn off the timer when profiling.
    
     - more comments to explain the meaning of the various ACTIVITY_*
       values

 includes/rts/Flags.h |    1 +
 rts/RtsFlags.c       |   14 ++++++++++++--
 rts/Schedule.c       |   16 ++++++++++++----
 rts/Schedule.h       |   28 +++++++++++++++++++++++-----
 rts/Timer.c          |   29 +++++++++++++++--------------
 5 files changed, 63 insertions(+), 25 deletions(-)

Changed 8 months ago by marlowsd@…

commit 673b6f50eca6f53cfb13b00e587c403c716baba1

Author: Simon Marlow <marlowsd@gmail.com>
Date:   Fri Sep 21 15:26:46 2012 +0100

    Revert "Disable the timer signal while blocked in select() (#5991)"
    
    This reverts commit dd24d6bc37879c6b32a3d5ac4ee765e59e13501c.
    
    This attempt to fix the problem was misguided: the program might be
    stuck in a foreign call rather than awaitEvent(), and then the timer
    signal will never get disabled.  The only way to turn off the timer
    signal in this case is in the timer interrupt handler itself.

 rts/posix/Select.c |   17 ++++-------------
 1 files changed, 4 insertions(+), 13 deletions(-)

Changed 8 months ago by simonmar

  • status changed from new to merge
  • milestone changed from 7.4.2 to 7.6.2

Changed 7 months ago by pcapriotti

  • status changed from merge to closed
  • resolution set to fixed

Merged as f9077d98b9e58b0eca518c8ae370b0f5c4df8800, 809e693303fc2e6bf949c13fbcd66c3f058d3539.

Note: See TracTickets for help on using tickets.