| | 519 | ===== An example where it worsens allocation ===== |
| | 520 | |
| | 521 | {{{ |
| | 522 | ------------------------------------------------------------------------------- |
| | 523 | Program ll-baseline ll-protect-ignore ll-lam10pin ll-it10pin ll-lam10pin-rec10 ll-it10pin-rec10 ll-lam10pin-rec10-stab ll-it10pin-rec10-stab |
| | 524 | ------------------------------------------------------------------------------- |
| | 525 | kahan 405812520 +0.0% +0.0% +0.0% +1.5% +1.5% +1.5% +1.5% |
| | 526 | }}} |
| | 527 | |
| | 528 | (Note: kahan is one of the programs Johan sees as a regression: he once had it not allocating at all, IIRC. It allocates plenty with 7.6 and HEAD, however.) |
| | 529 | |
| | 530 | Before the late lambda lift in imaginary/kahan (some compact array munging code Johan wrote), we have |
| | 531 | |
| | 532 | {{{ |
| | 533 | f ... = ... letrec inner ... = ... inner ... |
| | 534 | in |
| | 535 | letrec-no-escape outer ... = |
| | 536 | case ... of |
| | 537 | False -> terminate |
| | 538 | True -> case inner ... of ... -> ... outer ... |
| | 539 | in ... outer ... |
| | 540 | }}} |
| | 541 | |
| | 542 | Both letrecs are floated to the top in a SAT'd shape (explained below). |
| | 543 | |
| | 544 | {{{ |
| | 545 | poly_inner ... = letrec-no-escape inner ... = ... in inner ... |
| | 546 | |
| | 547 | poly_outer ... = letrec-no-escape outer ... = ... poly_inner ... in outer ... |
| | 548 | |
| | 549 | f ... = ... poly_outer ... |
| | 550 | }}} |
| | 551 | |
| | 552 | Consequently, the simplifier immediately (and silently) inlines them both. However, it happens in an unfortunate order, so we end up with |
| | 553 | |
| | 554 | {{{ |
| | 555 | f ... = ... let-no-escape outer ... = |
| | 556 | case ... of |
| | 557 | False -> terminate |
| | 558 | True -> letrec inner ... = ... inner ... |
| | 559 | in case inner ... of ... -> ... outer ... |
| | 560 | in ... outer ... |
| | 561 | }}} |
| | 562 | |
| | 563 | outer loops 2500000 times, and now allocates inner (size = 3 words) once per iteration. |
| | 564 | |
| | 565 | The SAT'd shape of each letrec (which triggers pre-inline-unconditionally, I'm suspecting) is due to some existing code in SetLevels. This is from a match for lvlBind (Rec pairs): |
| | 566 | |
| | 567 | {{{ |
| | 568 | -- ToDo: when enabling the floatLambda stuff, |
| | 569 | -- I think we want to stop doing this |
| | 570 | | isSingleton pairs && count isId abs_vars > 1 |
| | 571 | = do -- Special case for self recursion where there are |
| | 572 | -- several variables carried around: build a local loop: |
| | 573 | -- poly_f = \abs_vars. \lam_vars . letrec f = \lam_vars. rhs in f lam_vars |
| | 574 | -- This just makes the closures a bit smaller. If we don't do |
| | 575 | -- this, allocation rises significantly on some programs |
| | 576 | -- |
| | 577 | -- We could elaborate it for the case where there are several |
| | 578 | -- mutually functions, but it's quite a bit more complicated |
| | 579 | -- |
| | 580 | -- This all seems a bit ad hoc -- sigh |
| | 581 | }}} |
| | 582 | |
| | 583 | If floating a singly recursive binding with more than one abstracted value variable, this immediately does a SAT while marking it FloatMe. |
| | 584 | |
| | 585 | The comment is doubly confusing: |
| | 586 | |
| | 587 | * "when enabling the floatLambda stuff" seems contradictory to the guard count isId abs_vars > 1 |
| | 588 | * I'm not sure how this improves allocation, unless ... perhaps it is an alternative way to prevent the float from precluding specialization? |
| | 589 | * ie It's an alternative approach to the problem that we mitigated by instead stabilizing the binding for the sake of the .hi file? |
| | 590 | * this SAT didn't prevent our 20% allocation explosion in cryptharithm2 because the letrec there had exactly one abs_var, and so didn't use this lvlBind alternative |
| | 591 | |
| | 592 | Questions: |
| | 593 | |
| | 594 | 1. Is it worrisome that the chosen inlining ends up nesting a loop inside another, simultaneously spoiling the LNE? |
| | 595 | 1. If it can be refined to avoid this sort of problem, this SAT-based transform could potentially get the benefits of both the late lambda float and specialization |
| | 596 | * actually, might another (normal) FloatOut pass float inner (back) out of outer? |
| | 597 | |