Version 4 (modified by duncan, 5 years ago)

add descriptions to more of the Distribution.* modules

Guide to the Cabal source code

At first glance the Cabal code seems large and intimidating. This page is intended to give you a head start in understanding it.

Structure

All the Cabal modules live under Distribution.*

The modules can be roughly divided into two groups:

The declarative modules
They are mostly concerned with data structures like package descriptions. These modules live directly under Distribution.*. Much of the code in these modules consists of utility functions for handling the data types, along with functions for parsing and showing them.
The active modules
They are concerned with actually doing things like configuring, building and installing packages. These modules live under Distribution.Simple.*.

Declarative modules

A few really dull modules:

Distribution/GetOpt.hs
This should live under Compat/; it is just a bundled version of the standard GetOpt. Not very interesting.
Distribution/Setup.hs
This is a deprecated module here just so that old Setup.hs scripts do not break. Ignore it.
Distribution/Extension.hs
This is also a deprecated module here just so that old code does not break. The module got renamed to Language.Haskell.Extension.

Some simple data types:

Distribution/Version.hs
exports the Version type along with a parser and pretty printer. A version is something like "1.3.3". It also defines the VersionRange and Dependency data types. Version ranges are like ">= 1.2 && < 2". A dependency is a package name and a version range, like "foo >= 1.2 && < 2".
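To make the relationship between versions, version ranges and dependencies concrete, here is a minimal sketch. The types and the names withinRange, showVersion and fooDep are simplified stand-ins invented for illustration; they are assumptions, not the actual definitions in Distribution/Version.hs.

```haskell
-- Simplified, hypothetical stand-ins; NOT Cabal's actual definitions.
import Data.List (intercalate)

-- A version such as "1.3.3" is a list of numeric components.
newtype Version = Version [Int]
  deriving (Eq, Ord, Show)

-- A small subset of the range forms a .cabal file can express.
data VersionRange
  = AnyVersion
  | ThisVersion Version                         -- == v
  | OrLaterVersion Version                      -- >= v
  | EarlierVersion Version                      -- <  v
  | IntersectRanges VersionRange VersionRange   -- r1 && r2
  deriving (Eq, Show)

-- A dependency pairs a package name with a version range.
data Dependency = Dependency String VersionRange
  deriving (Eq, Show)

-- Does a concrete version satisfy a range?
-- Versions compare component by component (the derived list ordering).
withinRange :: Version -> VersionRange -> Bool
withinRange _ AnyVersion            = True
withinRange v (ThisVersion u)       = v == u
withinRange v (OrLaterVersion u)    = v >= u
withinRange v (EarlierVersion u)    = v < u
withinRange v (IntersectRanges a b) = withinRange v a && withinRange v b

-- Render a version in dotted form.
showVersion :: Version -> String
showVersion (Version ns) = intercalate "." (map show ns)

-- The dependency "foo >= 1.2 && < 2" from the text above.
fooDep :: Dependency
fooDep = Dependency "foo"
  (IntersectRanges (OrLaterVersion (Version [1, 2]))
                   (EarlierVersion (Version [2])))
```

Loaded into GHCi, withinRange (Version [1,3,3]) applied to the range inside fooDep holds, while Version [2,0] falls outside it. A PackageIdentifier, by contrast, would carry a single exact Version rather than a range.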
Distribution/Package.hs
defines a package identifier along with a parser and pretty printer for it. PackageIdentifiers consist of a name and an exact version (exact version as opposed to a dependency like above that uses a version range).
Distribution/Verbosity.hs
a simple Verbosity type with associated utilities. There are four standard verbosity levels: Silent, Normal, Verbose and Deafening, in increasing order. This is used for deciding what logging messages to print in the active parts.
Distribution/Compiler.hs
This has an enumeration of the various compilers that Cabal knows about. It also specifies the default compiler. Sadly you'll often see code that does case analysis on this compiler flavour enumeration like:
  case compilerFlavor comp of
    GHC -> GHC.getInstalledPackages verbosity packageDb progconf
    JHC -> JHC.getInstalledPackages verbosity packageDb progconf

Obviously it would be better to use the proper Compiler abstraction because that would keep all the compiler-specific code together. Unfortunately we cannot make this change yet without breaking the UserHooks API, which would break all custom Setup.hs files, so for the moment we just have to live with this deficiency. If you're interested, see ticket #50.
Distribution/System.hs
Cabal often needs to do slightly different things on specific platforms. You probably know about System.Info.os :: String, but using it is very inconvenient: it is a string, and different Haskell implementations do not agree on using the same strings for the same platforms! (In particular, see the controversy over "windows" vs "mingw32".) So, to make it more consistent and easier to use, we have an OS enumeration.
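As a rough illustration of why an enumeration is easier to work with than the raw string, the sketch below maps System.Info.os onto a small data type. The OS type, the classifyOS function and the set of recognised spellings are assumptions made for this example; the real definitions in Distribution/System.hs may differ.

```haskell
import qualified System.Info (os)

-- Hypothetical, simplified enumeration; Cabal's own OS type may differ.
data OS = Linux | Windows | OSX | OtherOS String
  deriving (Eq, Show)

-- Map an implementation-specific string onto one constructor, so the
-- rest of the code never has to compare strings. The spellings handled
-- here are illustrative, not an exhaustive list.
classifyOS :: String -> OS
classifyOS name = case name of
  "linux"   -> Linux
  "mingw32" -> Windows   -- reported by some implementations on Windows
  "windows" -> Windows   -- reported by others
  "darwin"  -> OSX
  other     -> OtherOS other

-- The platform this program is running on.
buildOS :: OS
buildOS = classifyOS System.Info.os
```

Client code can then pattern match on buildOS and let the compiler check the cases, instead of scattering string comparisons through the build logic.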
Distribution/License.hs
The .cabal file allows you to specify a license file. Of course you can use any license you like, but people often pick common open source licenses and it's useful if we can automatically recognise that (e.g. so we can display it on the hackage web pages). So you can also specify the license itself in the .cabal file, chosen from a short enumeration defined in this module. It includes the GPL, LGPL and BSD3 licenses.
Distribution/ParseUtils.hs
The .cabal file format is not trivial, especially with the introduction of configurations and the section syntax that goes with that. This module has a bunch of parsing functions that are used by the .cabal parser and a couple of others. It has the parsing framework code and also little parsers for many of the formats we get in various .cabal file fields, like module names, comma-separated lists etc.
Distribution/PackageDescription.hs
This is a big one. It defines the data structure for the .cabal file format. There are several parts to this structure. It has top-level information and then Library and Executable sections, each of which has associated BuildInfo data that is used to build the library or executable. To further complicate things, there is both a PackageDescription and a GenericPackageDescription. This distinction relates to Cabal configurations. When we initially read a .cabal file we get a GenericPackageDescription, which has all the conditional sections. Before actually building a package we have to decide on each conditional. Once we have done that we get a PackageDescription. It was done this way initially to avoid breaking too much stuff when the feature was introduced. It could probably do with being rationalised at some point to make it simpler. This module also has code to do some sanity checking on the package description. Some of the complexity in this module comes from having to be backwards compatible with old .cabal files, so there is code to translate them into the newer structure.
Distribution/Configuration.hs
This is about the cabal configurations feature. It has code for working with the tree of conditions and resolving or flattening conditions. This is used by finalizePackageDescription and flattenPackageDescription.
Distribution/InstalledPackageInfo.hs
The .cabal file format is for describing a package that is not yet installed. It has a lot of flexibility, like conditionals and dependency ranges. As such, that format is not at all suitable for describing a package that has already been built and installed. By the time we get to that stage we have resolved all conditionals and resolved dependency version constraints to exact versions of dependent packages. So this module defines the InstalledPackageInfo data structure that contains all the info we keep about an installed package. There is a parser and pretty printer. The textual format is rather simpler than the .cabal format; there are no sections, for example. This is the format that ghc-pkg understands.
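
The step from a GenericPackageDescription to a PackageDescription, described under Distribution/PackageDescription.hs and Distribution/Configuration.hs above, amounts to deciding every conditional in a tree and merging the chosen branches. The sketch below shows that idea on a toy tree. All of the names here (Condition, CondTree, FlagAssignment, evalCondition, resolve, example) are hypothetical and heavily simplified; this is not the code behind finalizePackageDescription, and treating unassigned flags as False is an assumption of the sketch.

```haskell
-- Hypothetical, heavily simplified types; NOT Cabal's real definitions.
import Data.Maybe (fromMaybe)

-- A condition over named flags.
data Condition
  = Lit Bool
  | Flag String
  | CNot Condition
  | CAnd Condition Condition
  | COr Condition Condition
  deriving (Eq, Show)

-- Unconditional data plus conditional branches:
-- (condition, "then" subtree, optional "else" subtree).
data CondTree a = CondTree a [(Condition, CondTree a, Maybe (CondTree a))]
  deriving (Eq, Show)

type FlagAssignment = [(String, Bool)]

-- Evaluate a condition. Flags with no assignment count as False
-- (an assumption made for this sketch).
evalCondition :: FlagAssignment -> Condition -> Bool
evalCondition _  (Lit b)    = b
evalCondition fs (Flag f)   = fromMaybe False (lookup f fs)
evalCondition fs (CNot c)   = not (evalCondition fs c)
evalCondition fs (CAnd a b) = evalCondition fs a && evalCondition fs b
evalCondition fs (COr a b)  = evalCondition fs a || evalCondition fs b

-- Decide every conditional and merge the selected branches into one
-- flat value, in the order they appear in the tree.
resolve :: Monoid a => FlagAssignment -> CondTree a -> a
resolve fs (CondTree x branches) = mconcat (x : map pick branches)
  where
    pick (c, t, e)
      | evalCondition fs c = resolve fs t
      | otherwise          = maybe mempty (resolve fs) e

-- A toy "build-depends" tree: always "base", plus "foo" if the flag
-- "split" is set and "bar" otherwise.
example :: CondTree [String]
example = CondTree ["base"]
  [ (Flag "split", CondTree ["foo"] [], Just (CondTree ["bar"] [])) ]
```

With this toy tree, resolve [("split", True)] example yields the list containing "base" and "foo", and resolve [] example yields "base" and "bar". The flat result plays the role of the PackageDescription, while the tree plays the role of the GenericPackageDescription.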

Active modules

Useful internal abstractions

Distribution/Simple/Program.hs
Distribution/Simple/Command.hs
Distribution/Simple/InstallDirs.hs
Distribution/Simple/Compiler.hs
Distribution/Simple/PreProcess.hs
Distribution/Simple/Utils.hs
Distribution/Simple/LocalBuildInfo.hs

Particular phases or actions within the build process

Distribution/Simple/Configure.hs
Distribution/Simple/Build.hs
Distribution/Simple/Install.hs
Distribution/Simple/Haddock.hs
Distribution/Simple/Register.hs
Distribution/Simple/SrcDist.hs

Compiler-specific modules

Distribution/Simple/GHC.hs
Distribution/Simple/Hugs.hs
Distribution/Simple/JHC.hs
Distribution/Simple/NHC.hs

Stuff related to the front end

Distribution/Simple/UserHooks.hs
Distribution/Simple/Setup.hs
Distribution/Simple/SetupWrapper.hs

Command line front ends

Distribution/Simple.hs
Distribution/Make.hs