Ticket #1619 (new proposed-project)

Opened 14 months ago

Last modified 14 months ago

Tweak memory-reuse analysis tools for GHC compatibility

Reported by: rrnewton
Owned by:
Priority: not yet rated
Keywords:
Cc:
Topic: misc
Difficulty: unknown
Mentor: not-accepted

Description (last modified by rrnewton)

Some program instrumentation and analysis tools are language-agnostic. Pin and Valgrind use binary rewriting to instrument an x86 binary on the fly, so in theory they work just as well on a Haskell binary as on one compiled from C. Indeed, if you download Pin from pintool.org, you can use the included open source tools to immediately begin analyzing properties of Haskell workloads -- for example, the total instruction mix during execution.

The problem is that aggregate data for an entire execution is rather coarse. It is not correlated temporally with phases of program execution, and the measured phenomena are not tied back to anything in the Haskell source.

This could be improved. A simple example would be to measure memory-reuse distance (an architecture-independent characterization of locality) but to distinguish garbage collection from normal memory access. It would be quite nice to see a histogram of reuse-distances in which GC accesses appear as a separate layer (different color) from normal accesses.
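To make the idea concrete, here is a minimal sketch of a reuse-distance histogram with separate layers for GC and normal (mutator) accesses. It is not MICA's implementation: the names ReuseHistogram and Source, the power-of-two bucketing, and the saturating last bucket are all assumptions made for illustration. Reuse distance is taken here to be the number of distinct memory blocks touched since the previous access to the same block, and a first touch goes into a dedicated "cold" bucket. The caller is assumed to have already mapped addresses to blocks, for instance by dropping the low 6 bits for 64-byte cache lines.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <list>

// Which part of the program issued an access (hypothetical tagging).
enum class Source { Mutator = 0, GC = 1 };

constexpr std::size_t kBuckets = 8;                 // buckets 0..6: distances, 7: first touch
constexpr std::size_t kColdBucket = kBuckets - 1;

class ReuseHistogram {
public:
    // Record one access to a memory block and return the bucket it fell into.
    std::size_t access(std::uint64_t block, Source src) {
        std::size_t bucket = kColdBucket;
        auto it = std::find(lru_.begin(), lru_.end(), block);
        if (it != lru_.end()) {
            // Position in the LRU stack = distinct blocks touched since the last use.
            auto distance = static_cast<std::size_t>(std::distance(lru_.begin(), it));
            bucket = bucketFor(distance);
            lru_.erase(it);
        }
        lru_.push_front(block);
        ++counts_[static_cast<std::size_t>(src)][bucket];
        return bucket;
    }

    // Number of accesses from `src` that landed in `bucket`.
    std::size_t count(Source src, std::size_t bucket) const {
        assert(bucket < kBuckets);
        return counts_[static_cast<std::size_t>(src)][bucket];
    }

private:
    // Bucket 0 holds distance 0; bucket k holds [2^(k-1), 2^k); bucket 6 saturates.
    static std::size_t bucketFor(std::size_t distance) {
        std::size_t b = 0;
        while (distance > 0 && b < kColdBucket - 1) {
            distance >>= 1;
            ++b;
        }
        return b;
    }

    std::list<std::uint64_t> lru_;                              // most recent block first
    std::array<std::array<std::size_t, kBuckets>, 2> counts_{}; // [source][bucket]
};
```

The linear scan of the LRU list keeps the sketch short but costs time proportional to the number of distinct blocks per access; a real tool would need a faster structure. Plotting the two rows of counts as stacked layers would give the two-color histogram described above.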

How to go about this? Fortunately, the existing MICA pintool (v0.4) builds today and can already measure memory-reuse distances.

 http://boegel.kejo.be/ELIS/mica/

In fact, it already produces per-phase measurements, where phases are delimited by dynamic instruction counts (e.g. every 100M instructions). All that remains is to change the definition of a phase so that a new phase begins whenever GC starts or stops.
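The change in phase definition can be sketched as follows. This is not MICA code: GcPhaseTracker, onGcEnter, and onGcExit are hypothetical names, and the sketch assumes the instrumentation layer can deliver strictly paired enter/exit notifications for the collector. Instead of closing a phase after a fixed instruction count, the tracker closes it on each GC transition and records how many instructions ran in it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class PhaseKind { Mutator, GC };

struct Phase {
    PhaseKind kind;
    std::uint64_t instructions;  // dynamic instructions executed in this phase
};

class GcPhaseTracker {
public:
    // Called as instructions execute (per instruction, or per basic block with n > 1).
    void onInstructions(std::uint64_t n) { current_.instructions += n; }

    // Hypothetical hooks, assumed to fire on entry to and exit from the collector.
    void onGcEnter() {
        assert(current_.kind == PhaseKind::Mutator);  // assumes paired calls
        switchTo(PhaseKind::GC);
    }
    void onGcExit() {
        assert(current_.kind == PhaseKind::GC);       // assumes paired calls
        switchTo(PhaseKind::Mutator);
    }

    // All completed phases in order, plus the phase still in progress if non-empty.
    std::vector<Phase> finish() const {
        std::vector<Phase> out = done_;
        if (current_.instructions > 0) out.push_back(current_);
        return out;
    }

private:
    // Close the current phase (dropping empty ones) and start a new one.
    void switchTo(PhaseKind next) {
        if (current_.instructions > 0) done_.push_back(current_);
        current_ = Phase{next, 0};
    }

    Phase current_{PhaseKind::Mutator, 0};
    std::vector<Phase> done_;
};
```

Per-phase statistics such as reuse-distance histograms could then be keyed on the phase kind rather than on a phase index, which is what would let GC and mutator behavior be reported separately.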

How to do that? Well, Pin has existing methods for targeted instrumentation of specific C functions:

 http://www.cs.virginia.edu/kim/publicity/pin/docs/45467/Pin/html/group__RTN__BASIC__API.html#g8622a6ba858eb8d55df4e006eb165e57

By targeting appropriate functions in the GHC RTS, this analysis tool could probably work without requiring any GHC modification at all.

A longer-term goal would be to correlate events observed by the binary rewriting tool with those recorded by GHC's traceEvent.

Finally, as it turns out, this would NOT be the first time GHC and binary rewriting have crossed paths. Julian Seward worked on GHC before developing Valgrind:

 http://www.techrepublic.com/article/open-source-awards-2004-julian-seward-for-valgrind/5136747

Interested Mentors

Ryan Newton

Others??

Interested Students (Include enough identifying info to find/reach you!)

Change History

Changed 14 months ago by rrnewton

  • description modified

Changed 14 months ago by rrnewton

  • description modified