Ticket #716 (closed bug: fixed)

Opened 7 years ago

Last modified 5 years ago

Unloading a dll generated by GHC doesn't free all resources

Reported by: lennart@… Owned by: igloo
Priority: normal Milestone: 6.8.1
Component: Runtime System Version: 6.4.1
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

There seem to be resource leaks when loading and unloading a DLL. If you do it in a loop, strange things can happen.

One thing I noticed is that several places (Ticker.c, IOManager.c) create threads, but I could not find the place where these threads die. How are they supposed to die when shutdownHaskell() is called?

Attachments

notimer.patch (34.5 KB) - added by lennart@… 7 years ago.
Patch to disable time slicing by specifying a negative slice.
dlls.tar.gz (1.4 KB) - added by igloo 6 years ago.

Change History

Changed 7 years ago by lennart@…

After some more examination I've seen that the threads are supposed to exit. BUT, it doesn't really work because of a bad race condition.

Take stopTicker(). It sends an event to the tick thread that tells it to exit. But then it never waits for the thread to actually exit! This means that by the time the tick thread gets scheduled to run, the Haskell DLL might have been unloaded from memory. And it seems that this really happens, albeit rarely.

For each occurrence of _beginthreadex() in the code there really needs to be a corresponding call that waits for the thread to shut down when exiting. You can't rely on thread scheduling to run things in the right order.
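
To make the "signal, then wait" pattern concrete, here is a minimal sketch. It is not the RTS code: it uses POSIX threads as a stand-in for the Windows calls (pthread_create() plays the role of _beginthreadex(), and pthread_join() plays the role of waiting on the thread handle), and the Ticker struct, start_ticker(), stop_ticker() and the 10 ms interval are all hypothetical names and values chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical ticker state; names are illustrative, not taken from Ticker.c. */
typedef struct {
    pthread_t       thread;
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    bool            stop_requested;
    bool            exited;        /* set by the thread just before it returns */
    unsigned long   wakeups;       /* number of times the loop body ran */
} Ticker;

static void *ticker_main(void *arg)
{
    Ticker *t = arg;
    pthread_mutex_lock(&t->lock);
    while (!t->stop_requested) {
        /* Sleep for about 10 ms (assumed interval), or until stop is signalled. */
        struct timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_nsec += 10 * 1000 * 1000;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_nsec -= 1000000000L;
            deadline.tv_sec  += 1;
        }
        pthread_cond_timedwait(&t->wake, &t->lock, &deadline);
        t->wakeups++;
    }
    t->exited = true;
    pthread_mutex_unlock(&t->lock);
    return NULL;
}

/* Returns 0 on success, a pthread error code otherwise. */
int start_ticker(Ticker *t)
{
    t->stop_requested = false;
    t->exited = false;
    t->wakeups = 0;
    pthread_mutex_init(&t->lock, NULL);
    pthread_cond_init(&t->wake, NULL);
    int rc = pthread_create(&t->thread, NULL, ticker_main, t);
    if (rc != 0) {
        pthread_cond_destroy(&t->wake);
        pthread_mutex_destroy(&t->lock);
    }
    return rc;
}

/* Returns 0 on success, a pthread error code otherwise. */
int stop_ticker(Ticker *t)
{
    /* Step 1: tell the thread to exit (the part stopTicker() already did). */
    pthread_mutex_lock(&t->lock);
    t->stop_requested = true;
    pthread_cond_signal(&t->wake);
    pthread_mutex_unlock(&t->lock);

    /* Step 2: block until the thread has really finished. Only after this
     * returns is it safe to unload the code the thread was executing. */
    int rc = pthread_join(t->thread, NULL);
    if (rc == 0) {
        assert(t->exited);
        pthread_cond_destroy(&t->wake);
        pthread_mutex_destroy(&t->lock);
    }
    return rc;
}
```

Without step 2, stop_ticker() could return while ticker_main() is still waiting to be scheduled, which is the window in which the DLL can be unloaded underneath the thread.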

Changed 7 years ago by lennart@…

Patch to disable time slicing by specifying a negative slice.

Changed 7 years ago by lennart@…

Sometimes (very often?) it's not really necessary to have timer interrupts running in the Haskell rts. I've added a patch that allows you to turn them off by specifying a negative value to the -C flag.

Using this feature I can get around the access violation that sometimes occurs when unloading a DLL. A proper fix for the race condition is still needed, though.

Changed 7 years ago by simonmar

  • owner set to simonmar
  • component changed from Compiler to Runtime System
  • milestone set to 6.4.2

I have a better patch for timeslicing floating around (make the timer interval configurable) but it needs more testing.

In the meantime, we'll do something about this before 6.4.2.

Changed 7 years ago by simonmar

  • status changed from new to closed
  • resolution set to fixed

Now fixed, at least partially. Please test. I've addressed the timer thread issue.

I've run a test program that loads & unloads a DLL several hundred times without crashing (it used to crash before the patch).

Changed 7 years ago by simonmar

  • status changed from closed to reopened
  • resolution fixed deleted
  • milestone changed from 6.4.2 to 6.6

On second thoughts I'll leave the ticket open until it's fully fixed. I don't expect to fix it properly in 6.4.2, though - the best chance for a complete fix is in the HEAD with the threaded RTS. As far as I'm aware, in the HEAD's threaded RTS, all threads should shut down properly at exit except those currently blocked in foreign calls.

Changed 7 years ago by simonmar

  • milestone changed from 6.6 to 6.6.1

Punt to 6.6.1 - it may be fixed, but we need to investigate fully. Related to #804.

Changed 7 years ago by simonmar

  • owner changed from simonmar to igloo
  • priority changed from normal to high
  • status changed from reopened to new

Changed 6 years ago by igloo

Changed 6 years ago by igloo

  • priority changed from high to normal
  • milestone changed from 6.6.1 to 6.6.2

I've added a tarball (dlls.tar.gz) which contains code for a C program that loads a Haskell DLL, calls something from it and unloads it, in a loop. It now seems to run happily no matter how many times it iterates, but there seems to be a performance issue: the system time is quadratic in the number of iterations. I haven't looked into what's causing this (not even tested to see if the same thing happens with a C DLL).

Changed 6 years ago by simonmar

  • status changed from new to closed
  • resolution set to fixed

The only remaining issue related to this is #1663, as far as I know.

Changed 6 years ago by simonmar

  • milestone changed from 6.6.2 to 6.8.1

Changed 5 years ago by simonmar

  • architecture changed from Unknown to Unknown/Multiple

Changed 5 years ago by simonmar

  • os changed from Unknown to Unknown/Multiple