Ticket #1582 (new proposed-project)

Opened 3 years ago

Last modified 6 weeks ago

LLVM optimisation passes / tables next to code

Reported by: batterseapower Owned by:
Priority: good Keywords:
Cc: Topic: GHC
Difficulty: 1 person Summer Mentor: not-accepted

Description

Project 1: LLVM optimisation passes ~

1) Clearly identify issues with LLVM produced assembly code in the context of GHC

  • This can be done by examining how it compares to the native code generator on nofib benchmarks
  • You might be able to get some mileage from simply eyeballing the assembly and looking for "obvious" stupidity, like the ESP issue David spotted
  • The result of this part should be a simple set of Haskell test cases, the assembly they produced, what the assembly *should* be (roughly) and perhaps some notes on what might fix it

2) The second part would be to identify the lowest hanging fruit from those things identified in 1) and make changes to the LLVM output / write LLVM optimisations (apparently this is a joy to do, the LLVM framework is very well designed) to fix the issues

Separating the project into two parts like this means that we could get something out of the project even if the student is unable to make significant progress with LLVM itself in the GSoC timeframe. Having a clear description of the problems involved + simple benchmark programs would be a huge help to someone attempting part 2) in the future, or they could serve as the basis for feature requests to the LLVM developers.

Project 2: Tables next to code

My feeling is that this is the more challenging of the two projects, as it is likely to touch more of LLVM / GHC. However, it is likely to yield a solid 6% speedup. It seems there are two implementation options:

1) Postprocessor ala the Evil Mangler (a nice self contained project, but not the best long term solution)

2) Modify LLVM to support this feature

Other

Either project could include looking at the impact of using llvm for link time optimisation, and a particularly able student might be able to attempt both halves.

See also the thread at  http://www.haskell.org/pipermail/cvs-ghc/2010-February/052606.html and David's thesis  http://docs.google.com/viewer?url=http%3A%2F%2Fwww.cse.unsw.edu.au%2F~pls%2Fthesis%2Fdavidt-thesis.pdf

Change History

Changed 3 years ago by batterseapower

I should probably add that I (Max Bolingbroke,  http://www.cl.cam.ac.uk/~mb566/) am an interested mentor.

Changed 3 years ago by dons

  • priority changed from not yet rated to good

Changed 2 years ago by guest

Was this accepted as  http://socghop.appspot.com/gsoc/student_project/show/google/gsoc2010/haskell/t127230760615 ? How should this ticket be dealt with - closed as fixed?

Changed 2 years ago by dterei

Probably close. Tables next to close has already been fixed as has some of the stack problems. Work could still be done but not sure how much of a fun gsoc project it would be and a little undefined as there is not any obvious things to fix up now.

Changed 2 years ago by batterseapower

It might still be sensible convert the backend to use the llvm package. I now understand the issues with the LLVM backend a bit better and it is not clear to me how to improve the amount of optimisation we get - there is a *lot* of missed potential due to poor aliasing analysis, but I'm not sure the best way to fix it up.

Changed 2 years ago by dterei

Yeah I agree but I don't think this would make a good student project since none of us really have a clear idea of how to improve it. Potentially a good student (e.g more Phd student) who knew what they were getting into could do some good.

I'm also a little doubtful if the problem is poor aliasing analysis. Roman and I spent some time trying to improve the code generated by concentrating on giving LLVM better alias information. We got a slight improvement but nothing like we wanted. I did some testing after that where I used a custom alias analysis pass in LLVM that just always said nothing aliased and we still got the bad code. Our conclusion at the time was that there was some bad interplay between the instruction selection and register allocation. There is a bug in llvm for example that basically gives bad register allocation for large functions (there is no live range splitting yet):

 http://llvm.org/bugs/show_bug.cgi?id=1512

This is all based more on hunches then lots of evidence so good chance I could be wrong. I think adding some of our own optimisation passes to llvm might be the best approach and potentially a cool student project but all pretty vague.

Changed 2 years ago by batterseapower

I agree with you that improving the optimisation is not a good student project. It is too open ended. However, converting the backend to use the llvm package would be a reasonable project or part of a project (I'm still not quite sure why Alp found it so hard!).

As for the optimisation issue, I really thought it was an aliasing issue. The bad code I'm seeing is present even at the LLVM-IR level so I doubt that the backend register allocators at fault. The kind of issue I'm seeing is MutableByteArray?# objects that are passed on the stack are being assumed to alias with everything else on the stack, so if inner loops store to such an array LLVM cannot do even simple store-to-load forwarding, which is totally disabling loop optimisations. We probably have different codegen issues in mind, though.

Changed 2 years ago by dterei

Could you send me a small example of this please so I can put it on my todo list.

Changed 6 weeks ago by mathilde.andre

Hello,

I would like to know if this project is for Gogle Summer of Code 2013 ?

Thank you for your answer, Regards, Mathilde André

Note: See TracTickets for help on using tickets.