| 74 | | === Idiom: non-recursive make === |
| 75 | | |
| 76 | | Build systems for large projects often use the technique commonly |
| 77 | | known as "recursive make", where there is a separate `Makefile` in |
| 78 | | each directory that is capable of building that part of the system. |
| 79 | | The `Makefile`s may share some common infrastructure and configuration |
| 80 | | by using GNU '''make''''s `include` directive; this is exactly what the |
| 81 | | previous GHC build system did. However, this design has a number of |
| 82 | | flaws, as described in Peter Miller's |
| 83 | | [http://miller.emu.id.au/pmiller/books/rmch/ Recursive Make Considered Harmful]. |
| 84 | | |
| 85 | | The GHC build system adopts the non-recursive '''make''' idiom. That is, we |
| 86 | | never invoke '''make''' from inside a `Makefile`, and the whole build system |
| 87 | | is effectively a single giant `Makefile`. |
| 88 | | |
| 89 | | This gives us the following advantages: |
| 90 | | |
| 91 | | * Specifying dependencies between different parts of the tree is |
| 92 | | easy. In this way, we can accurately specify many dependencies |
| 93 | | that we could not in the old recursive-make system. This makes it much more likely that when you say "make" |
| 94 | | after modifying parts of the tree or pulling new patches, |
| 95 | | the build system will bring everything up-to-date in the correct order, and leave you with a working |
| 96 | | system. |
| 97 | | |
| 98 | | * More parallelism: dependencies are more fine-grained, and there |
| 99 | | is no need to build separate parts of the system in sequence, so |
| 100 | | the overall effect is that we have more parallelism in the build. |
| 101 | | |
| 102 | | Doesn't this sacrifice modularity? No - we can still split the build |
| 103 | | system into separate files, using GNU '''make''''s `include`. |
| 104 | | |
| 105 | | Specific notes related to this idiom: |
| 106 | | |
| 107 | | * Individual directories usually have a `ghc.mk` file which |
| 108 | | contains the build instructions for that directory. |
| 109 | | |
| 110 | | * Other parts of the build system are in `mk/*.mk` and `rules/*.mk`. |
| 111 | | |
| 112 | | * The top-level `ghc.mk` file includes all the other `*.mk` files in |
| 113 | | the tree. The top-level `Makefile` invokes '''make''' on `ghc.mk` |
| 114 | | (this is the only recursive invocation of '''make'''; see the "phase |
| 115 | | ordering" idiom below). |
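| | | |
| | | As a rough sketch of this structure (the directories are ones mentioned on this page; the exact list and order in the real `ghc.mk` are assumptions), the top-level `ghc.mk` might begin like this: |
| | | |
| | | {{{ |
| | | # ghc.mk (top level): illustrative sketch only |
| | | include rts/ghc.mk |
| | | include compiler/ghc.mk |
| | | include ghc/ghc.mk |
| | | include libraries/base/ghc.mk |
| | | include utils/hsc2hs/ghc.mk |
| | | }}} |
| | | |
| | | Because every fragment is read into one '''make''' process, a rule in one `ghc.mk` can name a file built by another directory directly as a prerequisite. |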
| 116 | | |
| 117 | | === Idiom: stub makefiles === |
| 118 | | |
| 119 | | It's all very well having a single giant `Makefile` that knows how to |
| 120 | | build everything in the right order, but sometimes you want to build |
| 121 | | just part of the system. When working on GHC itself, we might want to |
| 122 | | build just the compiler, for example. In the recursive '''make''' system we |
| 123 | | would do `cd ghc` and then `make`. In the non-recursive system we can |
| 124 | | still achieve this by specifying the target with something like `make |
| 125 | | ghc/stage1/build/ghc`, but that's not so convenient. |
| 126 | | |
| 127 | | Our second idiom therefore supports the `cd ghc; make` idiom, just as |
| 128 | | with recursive make. To achieve this we put a tiny stub `Makefile` in each |
| 129 | | directory whose job it is to invoke the main `Makefile` specifying the |
| 130 | | appropriate target(s) for that directory. These stub `Makefiles` |
| 131 | | follow a simple pattern: |
| 132 | | |
| 133 | | {{{ |
| 134 | | dir = libraries/base |
| 135 | | TOP = ../.. |
| 136 | | include $(TOP)/mk/sub-makefile.mk |
| 137 | | }}} |
| 138 | | |
| 139 | | where `mk/sub-makefile.mk` knows how to recursively invoke the giant top-level '''make'''. |
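| | | |
| | | A heavily simplified sketch of what such a `mk/sub-makefile.mk` could contain is shown below. It assumes only the `dir` and `TOP` variables set by the stub, and target names of the form `all_`''directory'' and `clean_`''directory''; the real file may well differ: |
| | | |
| | | {{{ |
| | | # Hypothetical simplification of mk/sub-makefile.mk. |
| | | # Expects the including stub to have set $(dir) and $(TOP). |
| | | .PHONY: default all clean |
| | | default : all |
| | | # Recipes use the one-line "target : ; command" form, so no tab is needed. |
| | | all : ; $(MAKE) -C $(TOP) all_$(dir) |
| | | clean : ; $(MAKE) -C $(TOP) clean_$(dir) |
| | | }}} |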
| 140 | | |
| 141 | | === Idiom: standard targets (all, clean, etc.) === |
| 142 | | |
| 143 | | We want an `all` target that builds everything, but we also want a way to build individual components (say, everything in `rts/`). This is achieved by having a separate "all" target for each directory, named `all_`''directory''. For example in `rts/ghc.mk` we might have this: |
| 144 | | |
| 145 | | {{{ |
| 146 | | all : all_rts |
| 147 | | .PHONY: all_rts |
| 148 | | all_rts : ...dependencies... |
| 149 | | }}} |
| 150 | | When the top level '''make''' includes all these `ghc.mk` files, it will see that target `all` depends on `all_rts, all_ghc, ...etc...`; so `make all` will make all of these. But the individual targets are still available. In particular, you can say |
| 151 | | * `make all_rts` (anywhere) to build everything in the RTS directory |
| 152 | | * `make all` (anywhere) to build everything |
| 153 | | * `make`, with no explicit target, makes the default target in the current directory's stub `Makefile`, which in turn makes the target `all_`''dir'', where ''dir'' is the current directory. |
| 154 | | |
| 155 | | Other standard targets such as `clean`, `install`, and so on use the same technique. There are pre-canned macros to define your "all" and "clean" targets; take a look in `rules/all-target.mk` and `rules/clean-target.mk`. |
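| | | |
| | | To give a feel for what such a macro might look like, here is a minimal sketch. It is not a copy of `rules/all-target.mk`, and the library path in the example call is made up for illustration: |
| | | |
| | | {{{ |
| | | # Sketch of an "all" macro: $1 = directory, $2 = dependencies |
| | | define all-target |
| | | all : all_$1 |
| | | .PHONY: all_$1 |
| | | all_$1 : $2 |
| | | endef |
| | | |
| | | # Hypothetical use in rts/ghc.mk |
| | | $(eval $(call all-target,rts,rts/dist/build/libHSrts.a)) |
| | | }}} |
| | | |
| | | After the `eval`, '''make''' has seen the same three lines as in the hand-written `rts` example above, with `all_rts` depending on the given file. |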
| 156 | | |
| 157 | | === Idiom: stages === |
| 158 | | |
| 159 | | What do we use to compile GHC? GHC itself, of course. In a complete build we actually build GHC twice: once using the GHC version that is installed, and then again using the GHC we just built. To be clear about which GHC we are talking about, we number them: |
| 160 | | |
| 161 | | * '''Stage 0''' is the GHC you have installed. The "GHC you have installed" is also called "the bootstrap compiler". |
| 162 | | * '''Stage 1''' is the first GHC we build, using stage 0. Stage 1 is then used to build the packages. |
| 163 | | * '''Stage 2''' is the second GHC we build, using stage 1. This is the one we normally install when you say `make install`. |
| 164 | | * '''Stage 3''' is optional, but is sometimes built to test stage 2. |
| 165 | | |
| 166 | | Stage 1 supports neither interactive execution (GHCi) nor Template Haskell. The reason is that when running byte code we must dynamically link the packages, and only in stage 2 and later can we guarantee that the packages we dynamically link are compatible with those that GHC was built against (because they are the very same packages). |
| 167 | | |
| 168 | | |
| 169 | | === Idiom: distdir === |
| 170 | | |
| 171 | | Often we want to build a component multiple times in different ways. For example: |
| 172 | | |
| 173 | | * certain libraries (e.g. Cabal) are required by GHC, so we build them once with the |
| 174 | | bootstrapping compiler, and again with stage 1 once that is built. |
| 175 | | |
| 176 | | * GHC itself is built multiple times (stage 1, stage 2, maybe stage 3) |
| 177 | | |
| 178 | | * some tools (e.g. ghc-pkg) are also built once with the bootstrapping compiler, |
| 179 | | and then again using stage 1 later. |
| 180 | | |
| 181 | | In order to support multiple builds in a directory, we place all generated files in a subdirectory, called the "distdir". The distdir can be anything at all; for example in `compiler/` we name our distdirs after the stage (`stage1`, `stage2` etc.). When there is only a single build in a directory, by convention we usually call the distdir simply "dist". |
| 182 | | |
| 183 | | There is a related concept called ''ways'', which includes profiling and dynamic-linking. Multiple ways are currently part of the same "build" and use the same distdir, but in the future we might unify these concepts and give each way its own distdir. |
| 184 | | |
| 185 | | === Idiom: interaction with Cabal === |
| 186 | | |
| 187 | | Many of the components of the GHC build system are also Cabal |
| 188 | | packages, with package metadata defined in a `foo.cabal` file. For the |
| 189 | | GHC build system we need to extract that metadata and use it to build |
| 190 | | the package. This is done by the program `ghc-cabal` (in `utils/ghc-cabal` |
| 191 | | in the GHC source tree). This program reads `foo.cabal` and produces |
| 192 | | `package-data.mk` containing the package metadata in the form of |
| 193 | | makefile bindings that we can use directly. |
| 194 | | |
| 195 | | We adhere to the following rule: '''`ghc-cabal` generates only |
| 196 | | makefile variable bindings''', such as |
| 197 | | {{{ |
| 198 | | HS_SRCS = Foo.hs Bar.hs |
| 199 | | }}} |
| 200 | | `ghc-cabal` never generates makefile rules, macros, macro invocations, etc. |
| 201 | | All the makefile code is therefore contained in fixed, editable |
| 202 | | `.mk` files. |
| 203 | | |
| 204 | | === Idiom: variable names === |
| 205 | | |
| 206 | | Now that our build system is one giant `Makefile`, all our variables |
| 207 | | share the same namespace. Where previously we might have had a |
| 208 | | variable that contained a list of the Haskell source files called |
| 209 | | `HS_SRCS`, now we have one of these for each directory (and indeed each build, or distdir) in the source tree, |
| 210 | | so we have to give them all different names. |
| 211 | | |
| 212 | | The idiom that we use for distinguishing variable names is to prepend |
| 213 | | the directory name and the distdir to the variable. So for example the list of |
| 214 | | Haskell sources in the directory `utils/hsc2hs` would be in the |
| 215 | | variable `utils/hsc2hs_dist_HS_SRCS` ('''make''' doesn't mind slashes in variable |
| 216 | | names). The pattern is: ''directory''_''distdir''_''variable''. |
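| | | |
| | | The following fragment sketches how the naming pattern keeps per-directory and per-build lists apart. The file names are hypothetical; in the real build such values would come from the generated `package-data.mk` files: |
| | | |
| | | {{{ |
| | | # Hypothetical values, for illustration only |
| | | utils/hsc2hs_dist_HS_SRCS = Main.hs |
| | | compiler_stage1_HS_SRCS = Foo.hs |
| | | compiler_stage2_HS_SRCS = Foo.hs Bar.hs |
| | | |
| | | # Prints "Foo.hs Bar.hs": the stage 2 list does not clash with stage 1 |
| | | show : ; @echo $(compiler_stage2_HS_SRCS) |
| | | }}} |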
| 217 | | |
| 218 | | === Idiom: macros === |
| 219 | | The build system makes extensive use of GNU '''make''' '''macros'''. A macro is defined in |
| 220 | | GNU '''make''' using `define`, e.g. |
| 221 | | |
| 222 | | {{{ |
| 223 | | define build-package |
| 224 | | # args: $1 = directory, $2 = distdir |
| 225 | | ... makefile code to build a package ... |
| 226 | | endef |
| 227 | | }}} |
| 228 | | |
| 229 | | (for example, see `rules/build-package.mk`), and is invoked like this: |
| 230 | | |
| 231 | | |
| 232 | | {{{ |
| 233 | | $(eval $(call build-package,libraries/base,dist)) |
| 234 | | }}} |
| 235 | | |
| 236 | | (this invocation would be in `libraries/base/ghc.mk`). |
| 237 | | |
| 238 | | Note that `eval` works like this: its argument is expanded as normal, |
| 239 | | and then the result is interpreted by '''make''' as makefile code. This |
| 240 | | means the body of the `define` gets expanded ''twice''. Typically |
| 241 | | this means we need to use `$$` instead of `$` everywhere in the body of |
| 242 | | `define`. |
| 243 | | |
| 244 | | Now, the `build-package` macro may need to define '''local variables'''. |
| 245 | | There is no support for local variables in macros, but we can define |
| 246 | | variables which are guaranteed to not clash with other variables by |
| 247 | | preceding their names with a string that is unique to this macro call. |
| 248 | | A convenient unique string to use is ''directory''_''distdir''_; this is unique as long as we only call each macro with a given directory/build pair once. Most macros in |
| 249 | | the GHC build system take the directory and build as the first two |
| 250 | | arguments for exactly this reason. For example, here's an excerpt |
| 251 | | from the `build-prog` macro: |
| 252 | | |
| 253 | | {{{ |
| 254 | | define build-prog |
| 255 | | # $1 = dir |
| 256 | | # $2 = distdir |
| 257 | | # $3 = GHC stage to use (0 == bootstrapping compiler) |
| 258 | | |
| 259 | | $1_$2_INPLACE = $$(INPLACE_BIN)/$$($1_$2_PROG) |
| 260 | | ... |
| 261 | | }}} |
| 262 | | |
| 263 | | So if `build-prog` is called with `utils/hsc2hs` and `dist` for the |
| 264 | | first two arguments, after expansion '''make''' would see this: |
| 265 | | |
| 266 | | {{{ |
| 267 | | utils/hsc2hs_dist_INPLACE = $(INPLACE_BIN)/$(utils/hsc2hs_dist_PROG) |
| 268 | | }}} |
| 269 | | |
| 270 | | The idiom of `$$($1_$2_VAR)` is very common throughout the build |
| 271 | | system - get used to reading it! Note that the only time we use a |
| 272 | | single `$` in the body of `define` is to refer to the parameters `$1`, |
| 273 | | `$2`, and so on. |
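| | | |
| | | The double expansion can be tried out in a small standalone makefile. The macro name `show-inplace` and the value given to `PROG` are made up for this sketch; only the `$$($1_$2_VAR)` pattern is taken from the build system: |
| | | |
| | | {{{ |
| | | INPLACE_BIN = inplace/bin |
| | | |
| | | # show-inplace is a made-up macro name: $1 = dir, $2 = distdir |
| | | define show-inplace |
| | | $1_$2_PROG = hsc2hs |
| | | $1_$2_INPLACE = $$(INPLACE_BIN)/$$($1_$2_PROG) |
| | | endef |
| | | |
| | | $(eval $(call show-inplace,utils/hsc2hs,dist)) |
| | | |
| | | # Prints "inplace/bin/hsc2hs" |
| | | show : ; @echo $(utils/hsc2hs_dist_INPLACE) |
| | | }}} |
| | | |
| | | Had the second line of the body used a single `$`, the reference `$($1_$2_PROG)` would have been expanded during `call`, before the variable is defined, and so would have produced an empty string. |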
| 274 | | |
| 275 | | === Idiom: phase ordering === |
| 276 | | |
| 277 | | NB. you need to understand this section if either (a) you are modifying parts of the build system that include automatically-generated `Makefile` code, or (b) you need to understand why we have a top-level `Makefile` that recursively invokes '''make'''. |
| 278 | | |
| 279 | | The main hitch with non-recursive '''make''' arises when parts of the build |
| 280 | | system are automatically-generated. The automatically-generated parts |
| 281 | | of our build system fall into two main categories: |
| 282 | | |
| 283 | | * Dependencies: we use `ghc -M` to generate make-dependencies for |
| 284 | | Haskell source files, and similarly `gcc -M` to do the same for |
| 285 | | C files. The dependencies are normally generated into a file |
| 286 | | `.depend`, which is included as normal. |
| 287 | | |
| 288 | | * Makefile bindings generated from `.cabal` package descriptions. See |
| 289 | | "Idiom: interaction with Cabal". |
| 290 | | |
| 291 | | Now, we also want to be able to use `make` to build these files, since |
| 292 | | they have complex dependencies themselves. For example, in order to build |
| 293 | | `package-data.mk` we need to first build `ghc-cabal` etc.; similarly, |
| 294 | | a `.depend` file needs to be re-generated if any of the source files have changed. |
| 295 | | |
| 296 | | GNU '''make''' has a clever strategy for handling this kind of scenario. It |
| 297 | | first reads all the included Makefiles, and then tries to build each |
| 298 | | one if it is out-of-date, using the rules in the Makefiles themselves. |
| 299 | | When it has brought all the included Makefiles up-to-date, it restarts itself |
| 300 | | to read the newly-generated Makefiles. |
| 301 | | |
| 302 | | This works fine, unless there are dependencies ''between'' the |
| 303 | | Makefiles. For example in the GHC build, the `.depend` file for a |
| 304 | | package cannot be generated until `package-data.mk` has been generated |
| 305 | | and '''make''' has been restarted to read in its contents, because it is the |
| 306 | | `package-data.mk` file that tells us which modules are in the package. |
| 307 | | But '''make''' always makes '''all''' the included `Makefiles` before restarting - it |
| 308 | | doesn't know how to restart itself earlier when there is a dependency |
| 309 | | between included `Makefiles`. |
| 310 | | |
| 311 | | Consider the following Makefile, saved as `fail.mk`: |
| 312 | | |
| 313 | | {{{ |
| 314 | | all : |
| 315 | | |
| 316 | | include inc1.mk |
| 317 | | |
| 318 | | inc1.mk : Makefile |
| 319 | | echo "X = C" >$@ |
| 320 | | |
| 321 | | include inc2.mk |
| 322 | | |
| 323 | | inc2.mk : inc1.mk |
| 324 | | echo "Y = $(X)" >$@ |
| 325 | | }}} |
| 326 | | |
| 327 | | Now try it: |
| 328 | | |
| 329 | | {{{ |
| 330 | | $ make -f fail.mk |
| 331 | | fail.mk:3: inc1.mk: No such file or directory |
| 332 | | fail.mk:8: inc2.mk: No such file or directory |
| 333 | | echo "X = C" >inc1.mk |
| 334 | | echo "Y = " >inc2.mk |
| 335 | | make: Nothing to be done for `all'. |
| 336 | | }}} |
| 337 | | |
| 338 | | '''make''' built both `inc1.mk` and `inc2.mk` without restarting itself |
| 339 | | between the two (even though we added a dependency on `inc1.mk` from |
| 340 | | `inc2.mk`). |
| 341 | | |
| 342 | | The solution we adopt in the GHC build system is as follows. We have |
| 343 | | two Makefiles, the first a wrapper around the second. |
| 344 | | |
| 345 | | {{{ |
| 346 | | # top-level Makefile |
| 347 | | % : |
| 348 | | $(MAKE) -f inc.mk PHASE=0 just-makefiles |
| 349 | | $(MAKE) -f inc.mk $@ |
| 350 | | }}} |
| 351 | | |
| 352 | | {{{ |
| 353 | | # inc.mk |
| 354 | | |
| 355 | | include inc1.mk |
| 356 | | |
| 357 | | ifeq "$(PHASE)" "0" |
| 358 | | |
| 359 | | inc1.mk : inc.mk |
| 360 | | echo "X = C" >$@ |
| 361 | | |
| 362 | | else |
| 363 | | |
| 364 | | include inc2.mk |
| 365 | | |
| 366 | | inc2.mk : inc1.mk |
| 367 | | echo "Y = $(X)" >$@ |
| 368 | | |
| 369 | | endif |
| 370 | | |
| 371 | | just-makefiles: |
| 372 | | @: # do nothing |
| 373 | | |
| 374 | | clean : |
| 375 | | rm -f inc1.mk inc2.mk |
| 376 | | }}} |
| 377 | | Each time '''make''' is invoked, we recursively invoke '''make''' in several |
| 378 | | ''phases'': |
| 379 | | * '''Phase 0''': invoke `inc.mk` with `PHASE=0`. This brings `inc1.mk` |
| 380 | | up-to-date (and ''only'' `inc1.mk`). |
| 381 | | |
| 382 | | * '''Final phase''': invoke `inc.mk` again (with `PHASE` unset). Now we can be sure |
| 383 | | that `inc1.mk` is up-to-date and proceed to generate `inc2.mk`. |
| 384 | | If this changes `inc2.mk`, then '''make''' automatically re-invokes itself, |
| 385 | | repeating the final phase. |
| 386 | | We could instead have abandoned '''make''''s automatic re-invocation mechanism altogether, |
| 387 | | and used three explicit phases (0, 1, and final), but in practice it's very convenient to use the automatic |
| 388 | | re-invocation when there are no problematic dependencies. |
| 389 | | |
| 390 | | Note that the `inc1.mk` rule is ''only'' enabled in phase 0, so that if we accidentally call `inc.mk` without first performing phase 0, we will either get a failure (if `inc1.mk` doesn't exist), or otherwise '''make''' will not update `inc1.mk` if it is out-of-date. |
| 391 | | |
| 392 | | In the case of the GHC build system we need four such phases; see the |
| 393 | | comments in the top-level `ghc.mk` for details. |
| 394 | | |
| 395 | | This approach is not at all pretty, and |
| 396 | | re-invoking '''make''' every time is slow, but we don't know of a better |
| 397 | | workaround for this problem. |
| 398 | | |
| 399 | | |
| 400 | | |
| 401 | | |
| 402 | | === Idiom: no double-colon rules === |
| 403 | | |
| 404 | | '''Make''' has a special type of rule of the form `target :: prerequisites`, |
| 405 | | with the behaviour that all double-colon rules for a given target are |
| 406 | | executed if the target needs to be rebuilt. This style was popular |
| 407 | | for things like "all" and "clean" targets in the past, but it's not |
| 408 | | really necessary - see the "all" idiom above - and this means there's one fewer makeism you need to know about. |
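| | | |
| | | For comparison, here is a sketch of a `clean` target written both ways. The file patterns are placeholders rather than the real clean rules: |
| | | |
| | | {{{ |
| | | # Double-colon style (not used in the GHC build system): |
| | | #   clean :: ; rm -f rts/*.o |
| | | #   clean :: ; rm -f ghc/*.o |
| | | |
| | | # The same effect with ordinary rules and per-directory targets |
| | | clean : clean_rts clean_ghc |
| | | .PHONY: clean clean_rts clean_ghc |
| | | clean_rts : ; rm -f rts/*.o |
| | | clean_ghc : ; rm -f ghc/*.o |
| | | }}} |
| | | |
| | | The second form also gives each directory its own `clean_`''directory'' target that can be invoked separately. |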
| 409 | | |
| 410 | | === Idiom: the vanilla way === |
| 411 | | |
| 412 | | Libraries can be built in several different "ways", for example |
| 413 | | "profiling" and "dynamic" are two ways. Each way has a short tag |
| 414 | | associated with it; "p" and "dyn" are the tags for profiling and |
| 415 | | dynamic respectively. In previous GHC build systems, the "normal" way |
| 416 | | didn't have a name, it was just always built. Now we explicitly call |
| 417 | | it the "vanilla" way and use the tag "v" to refer to it. |
| 418 | | |
| 419 | | This means that the `GhcLibWays` variable, which lists the ways in |
| 420 | | which the libraries are built, must include "v" if you want the |
| 421 | | vanilla way to be built (this is included in the default setup, of |
| 422 | | course). |
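| | | |
| | | For instance, a setting that builds the libraries the vanilla and profiling ways might look like this (a sketch; which file holds the setting is not covered here): |
| | | |
| | | {{{ |
| | | # Build the libraries the vanilla ("v") and profiling ("p") ways |
| | | GhcLibWays = v p |
| | | }}} |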
| 423 | | |
| 424 | | === Idiom: whitespace === |
| 425 | | |
| 426 | | '''make''' has a rather ad-hoc approach to whitespace. Most of the time it ignores it, e.g. |
| 427 | | {{{ |
| 428 | | FOO = bar |
| 429 | | }}} |
| 430 | | sets `FOO` to `"bar"`, not `" bar"`. However, sometimes whitespace is significant, |
| 431 | | and calling macros is one example. For example, we used to have a call |
| 432 | | {{{ |
| 433 | | $(call all-target, $$($1_$2_INPLACE)) |
| 434 | | }}} |
| 435 | | and this passed `" $$($1_$2_INPLACE)"` as the argument to `all-target`. This in turn generated |
| 436 | | {{{ |
| 437 | | .PHONY: all_ inplace/bin/ghc-asm |
| 438 | | }}} |
| 439 | | which caused an infinite loop, as make continually thought that `ghc-asm` was out-of-date, rebuilt it, |
| 440 | | reinvoked make, and then thought it was out of date again. |
| 441 | | |
| 442 | | The moral of the story is, avoid whitespace unless you're sure it'll be OK! |
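| | | |
| | | The effect is easy to reproduce in a standalone makefile. The macro `show-arg` below is made up for this sketch and simply puts brackets around its argument: |
| | | |
| | | {{{ |
| | | # show-arg is a made-up macro that brackets its argument |
| | | show-arg = [$1] |
| | | |
| | | # Prints "[ x] [x]": the space after the comma becomes part of $1 |
| | | show : ; @echo '$(call show-arg, x)' '$(call show-arg,x)' |
| | | }}} |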
| 443 | | |
| 444 | | === Idiom: platform names === |
| 445 | | |
| 446 | | There are three platforms of interest when building GHC: |
| 447 | | |
| 448 | | * `$(BUILDPLATFORM)`: The ''build'' platform.[[br]] |
| 449 | | The platform on which we are doing this build. |
| 450 | | |
| 451 | | * `$(HOSTPLATFORM)`: The ''host'' platform.[[br]] |
| 452 | | The platform on which these binaries will run. |
| 453 | | |
| 454 | | * `$(TARGETPLATFORM)`: The ''target'' platform.[[br]] |
| 455 | | The platform for which this compiler will generate code. |
| 456 | | |
| 457 | | These platforms are set when running the |
| 458 | | {{{configure}}} script, using the |
| 459 | | {{{--build}}}, {{{--host}}}, and |
| 460 | | {{{--target}}} options. The {{{mk/project.mk}}} |
| 461 | | file, which is generated by `configure` from [http://darcs.haskell.org/mk/project.mk.in project.mk.in], defines several symbols related to the platform settings. |
| 462 | | |
| 463 | | We don't currently support build and host being different, because |
| 464 | | the build process creates binaries that are both run during the build, |
| 465 | | and also installed. |
| 466 | | |
| 467 | | If host and target are different, then we are building a |
| 468 | | cross-compiler. For GHC, this means a compiler |
| 469 | | which will generate intermediate .hc files to port to the target |
| 470 | | architecture for bootstrapping. The libraries and stage 2 compiler |
| 471 | | will be built as HC files for the target system (see [wiki:Building/Porting Porting GHC] for details). |
| 472 | | |
| 473 | | More details on when to use BUILD, HOST or TARGET can be found in |
| 474 | | the comments in [http://darcs.haskell.org/mk/project.mk.in project.mk.in]. |
| | 74 | * [wiki:Building/Architecture/Idiom/NonRecursiveMake Non-recursive make] |
| | 75 | * [wiki:Building/Architecture/Idiom/StubMakefiles Stub makefiles] |
| | 76 | * [wiki:Building/Architecture/Idiom/StandardTargets Standard targets (all, clean etc.)] |
| | 77 | * [wiki:Building/Architecture/Idiom/Stages Stages] |
| | 78 | * [wiki:Building/Architecture/Idiom/Distdir Distdir] |
| | 79 | * [wiki:Building/Architecture/Idiom/Cabal Interaction with Cabal] |
| | 80 | * [wiki:Building/Architecture/Idiom/VariableNames Variable names] |
| | 81 | * [wiki:Building/Architecture/Idiom/Macros Macros] |
| | 82 | * [wiki:Building/Architecture/Idiom/PhaseOrdering Phase ordering] |
| | 83 | * [wiki:Building/Architecture/Idiom/DoubleColon No double-colon rules] |
| | 84 | * [wiki:Building/Architecture/Idiom/VanillaWay The vanilla way] |
| | 85 | * [wiki:Building/Architecture/Idiom/Whitespace Whitespace] |
| | 86 | * [wiki:Building/Architecture/Idiom/PlatformNames Platform names (build, host, target)] |