| | 116 | |
| | 117 | Using `-flate-float-in-thunk-limit=10`, `-fprotect-last-arg`, and `-O1`, I tested the libraries and NoFib with four variants. |
| | 118 | |
| | 119 | * nn - do not float a binding that applies one of its free variables. |
| | 120 | * yn - do not float a binding that applies one of its free variables saturated or oversaturated. |
| | 121 | * ny - do not float a binding that applies one of its free variables undersaturated. |
| | 122 | * yy - do not restrict the application of a binding's free variables. |
| | 123 | |
| | 124 | Roughly, we expect that more floating means (barely) less allocation but worse runtime (by how much?) because some known calls become unknown calls. |
| | 125 | |
| | 126 | ==== Anchoring let-no-escapes ==== |
| | 127 | |
| | 128 | I discovered this while experimenting with the fast preservation variants above. |
| | 129 | |
| | 130 | TODO: Mitigate it. |
| | 131 | |
| | 132 | In fish (1.6%), hpg (~4.5%), and sphere (10.4%), allocation gets worse for ny and yy compared to nn and yn. The nn and yn variants do not change allocation compared to the baseline library (i.e., no LLF). |
| | 133 | |
| | 134 | The nn -> ny comparison runs counter to our rough expectation: floating more bindings (those that apply a free variable saturated or oversaturated) makes allocation worse, not better. So I investigated. |
| | 135 | |
| | 136 | The sphere program hammers `hPutStr`. Its extra allocation is mostly due to a regression in `GHC.IO.Encoding.UTF8`. Here's the situation. |
| | 137 | |
| | 138 | With the nn variant: |
| | 139 | |
| | 140 | {{{ |
| | 141 | outer a b c ... = |
| | 142 | let-no-escape f x = CTX[let-no-escape $j y = ... (f ...) ... in CTX2[$j]] |
| | 143 | in ... |
| | 144 | }}} |
| | 145 | |
| | 146 | In this case, `$j` is not floated because it applies `f`. With the ny variant, `$j` gets floated. |
| | 147 | |
| | 148 | {{{ |
| | 149 | poly_$j a b c ... f y = ... |
| | 150 | |
| | 151 | outer a b c ... = |
| | 152 | let f x = CTX[CTX2[poly_$j a b c ... f]] |
| | 153 | in ... |
| | 154 | }}} |
| | 155 | |
| | 156 | Thus `f` can no longer be a let-no-escape: it now occurs as an argument to `poly_$j`, so it escapes and must be allocated as a heap closure. |
| | 157 | |
| | 158 | This contributes to sphere's 1 megabyte of extra allocation for two reasons: |
| | 159 | |
| | 160 | * `outer` is entered about 60,000 times. |
| | 161 | * The RHS of `f` has 13 free variables, so its closure is rather large. |
| | 162 | |
| | 163 | 13 * 60,000 ~ 780,000. I suspect the rest of sphere's increase is due to a similar issue in `GHC.IO.Handle`. |
| | 164 | |
| | 165 | In hpg, the increase is principally due to `GHC.IO.Encoding.UTF8` again, with `GHC.IO.FD` as the second-largest contributor; there, the function `$wa17` again resembles the `outer` example above, but with fewer free variables and thus a smaller effect. |