Version 10 (modified by duncan, 13 months ago)
This page tries to clarify some ideas and a design for how cabal should handle sets of packages that people want to build.
Goal
Recently people have written various cabal extension/wrapper tools to address use cases that the main Cabal/cabal-install tool does not handle very well. These tools are each designed to address a few use cases, and the different tools overlap somewhat in what they cover.
The goal here is:
- to see if we can identify some more unified idea and explain what the various existing tools are trying to do in terms of that idea;
- then to come up with a design (mechanisms) that implements that idea; and
- then to design a user interface that covers the main use cases in a reasonably simple way, while remaining flexible enough to expose what the design can do.
Example use cases
Developer working on a single package
The developer is actively developing a package. The package is in some intermediate state, i.e. it doesn't make sense for other, not-in-development projects to depend on it (e.g. it might be buggy or have an unstable API at the moment). In addition to building the package, the developer might want to run tests and/or benchmarks.
Current workflow (assuming that cabal repl exists):
    cabal configure --enable-tests --enable-benchmarks
    cabal install --only-dependencies  # does this pull in test/benchmark deps?
    cabal build
    -or-
    cabal build && cabal test
    -or-
    cabal build && cabal bench
    -or-
    cabal repl
Problems with current workflow:
- Dependencies might be added or removed during development, forcing the developer to re-run cabal configure --enable-tests --enable-benchmarks and cabal install --only-dependencies.
- It's not possible to build the package without configuring it and installing its dependencies, so those steps should be implied by cabal build.
- It's not possible to run tests/benchmarks without first building the package, so building should be implied by running cabal test/bench.
Ideal workflow:
    cabal build
    -or-
    cabal test
    -or-
    cabal bench
    -or-
    cabal repl
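One way to read the ideal workflow is as a chain of implied steps, where each command first runs whatever it depends on. The following is a minimal Python sketch of that reading, not a description of how cabal is implemented. The step names and the `IMPLIES` table are hypothetical; in particular, the ordering of dependency installation before configuration and the prerequisites of `repl` are assumptions made for illustration.

```python
# Hypothetical model: each command lists the steps it implies.
# The names mirror the workflow above; the prerequisites of "repl" and the
# relative order of "install-dependencies" and "configure" are assumptions.
IMPLIES = {
    "install-dependencies": [],
    "configure": ["install-dependencies"],
    "build": ["configure"],
    "test": ["build"],
    "bench": ["build"],
    "repl": ["configure"],
}


def plan(command):
    """Return the steps to run for `command`, prerequisites first, no repeats."""
    ordered = []

    def visit(step):
        for prerequisite in IMPLIES[step]:
            visit(prerequisite)
        if step not in ordered:
            ordered.append(step)

    visit(command)
    return ordered


print(plan("test"))
```

With such a table, the user only ever names the final step they care about, and changes to the dependency list are picked up the next time any command is run.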
Developer working on multiple local packages
Like "Developer working on a single package", except the developer also has to change one or more dependencies in order to make the current package work, e.g. perhaps he/she needs to add a new function to one of the dependencies and use it in the package under development. The dependencies might not be owned by the developer, so he/she can't simply change them and make a new release. He/she needs to use the modified version until the changes have been accepted upstream.
Current workflow:
    cd dep
    cabal configure && cabal build
    cd ../pkg
    cabal configure --package-db=../dep/dist/package.conf.inplace
    # As in the previous use case for running tests, installing other dependencies, etc.
Problems with current workflow:
- Every time the package depended upon needs to be changed, the four steps above have to be repeated.
- Otherwise the same as in "Developer working on a single package"
Ideal workflow (UI up for discussion):
cabal build --package-root=$HOME/src # Looks for additional packages here
Similarly for cabal test, cabal bench, and cabal repl.
In-house developer team working on multiple packages with local hackage server
This is similar to the above "Developer working on multiple local packages" case but there is a team of programmers producing and consuming packages. Development versions of packages are shared using source control, and developers can publish versions of packages to the local hackage server. Developers work using a combination of packages from hackage.haskell.org, the local hackage server (which has some bug fix versions of 3rd party packages as well as packages published by the team), and packages checked out from source control. There are also testing servers which build from the local hackage server rather than from source repos.
Existing tools
cabal-dev
- http://hackage.haskell.org/package/cabal-dev
- https://github.com/creswick/cabal-dev/blob/master/README.md
cabal-dev provides what it calls a sandbox for source packages and installed packages. It has a command line interface that is very similar to that of cabal, so that it can be used as a drop-in replacement. It is implemented as a wrapper around the cabal command.
For source packages, the sandboxing means providing a local source package set that overrides the global package index. Tarballs can be added to this index. It provides a command cabal-dev add-source /path/to/source/code which generates an sdist snapshot of the given package and adds that tarball to the local source index.
For installed packages, the sandboxing means that packages are not registered into the user or global ghc package database. The global package db is still used, however, so it is recommended that the global package db contains only the ghc core libraries. This approach conflicts with using distribution packages for non-core libraries, because they are installed into the global db.
The user interface provides two ways to get a package into a sandbox: either add the source package to the sandbox, or install the package into the sandbox. In the latter case the source is not available if something needs to be rebuilt later (e.g. if a profiling version is needed).
Note that when source packages are added to the sandbox, it is a snapshot of the package, not a live link to another build tree. This is probably not by design, but a limitation of cabal that cabal-dev cannot easily fix.
The default install location for cabal-dev is the sandbox. This means it only works with packages that are prefix independent. Libraries or programs that use the Paths_pkgname module (e.g. to find data files) would expect to find those files in the sandbox too. This is ok for running inplace but not if applications will be installed to some system location in the end. This would be better if we had a reliable way to build prefix-independent packages (or fail for ones that are not prefix-independent).
When you do cabal-dev add-source to add a source package to the sandbox, we think that just makes that version available, and it does not mask all other versions of that package. One has to rely on the added source package having a higher version, and rely on the solver to pick the highest version (which it will if possible, but it will fall back to older versions if necessary).
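The behaviour described above can be illustrated with a small Python sketch. It is not the real solver: it only models "prefer the highest version that the constraints allow". The package versions and the `pick_version` helper are hypothetical.

```python
def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def pick_version(available, is_allowed):
    """Pick the highest available version that satisfies `is_allowed`.

    Returns None when no version is acceptable.
    """
    candidates = [v for v in available if is_allowed(parse_version(v))]
    return max(candidates, key=parse_version, default=None)


# Hypothetical package "foo": versions 1.0 and 2.0 come from the global index,
# 2.1 is a local snapshot made available with add-source. The snapshot is
# added alongside the other versions; it does not mask them.
versions = ["1.0", "2.0", "2.1"]

# With no constraint the added snapshot wins because it has the highest version.
print(pick_version(versions, lambda v: True))

# If something else requires foo < 2, the choice falls back to an older version.
print(pick_version(versions, lambda v: v < (2,)))
```

The second call shows why relying on version numbers is fragile: the locally added source is silently ignored as soon as a constraint excludes it.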
cabal-src
- http://hackage.haskell.org/package/cabal-src
- https://github.com/yesodweb/cabal-src/blob/master/README.md
cabal-src is intended to solve the problem that cabal does not know about the source versions of local packages, so it cannot use those source packages in its dependency planning. It has a command line interface that is very similar to that of cabal, so that it can be used as a drop-in replacement. It is implemented as a wrapper around the cabal command.
Ordinarily, if you cabal install in a local directory, cabal knows only about the packages that are already installed and the source packages available from hackage. It does not know about other local build trees. If you make a change to a package and install it, then go to build another local package, cabal will usually use the instance of the package that you just installed. However, this is not always possible: to use consistent versions of dependencies it is sometimes necessary to rebuild a package. This is where the problem occurs: if cabal cannot see that source package then it cannot rebuild from that source. This is the problem that cabal-src tries to address.
cabal-src addresses the problem by taking a snapshot of a source package and inserting it into cabal's local source index. It modifies the ~/.cabal/config file to tell cabal to look in this local index.
This is in a way similar to what cabal-dev add-source does, but it does it for the user's default environment rather than for a local sandbox.
cabal-meta
cabal-meta is intended to solve the same problem as cabal-src but it solves it in a different way. It has a command line interface that is very similar to that of cabal, so that it can be used as a drop-in replacement. It is implemented as a wrapper around the cabal-src or cabal-dev commands (which are themselves wrappers around the cabal command).
cabal-meta lets the user list the locations of local source build trees in a file. When the user runs cabal-meta install it compiles each of the packages and uses cabal-src to add the source package into the local source package index. In a sense, cabal-meta is a declarative approach compared to the imperative approach of cabal-src. Where cabal-src actively inserts source packages into the local source package index, with cabal-meta you declare what source packages should be used.
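The imperative/declarative distinction can be sketched in a few lines of Python. This is only an illustration: the index is modelled as a plain dictionary, and the function names, package names and paths are hypothetical rather than taken from either tool.

```python
def add_source(index, name, location):
    """Imperative step (in the style of cabal-src): insert one source package."""
    index[name] = location


def install_declared(index, declared):
    """Declarative step (in the style of cabal-meta): apply a whole declared list."""
    for name, location in declared:
        add_source(index, name, location)
    return index


# Hypothetical declaration of local build trees, as might be listed in a file.
DECLARED = [
    ("foo", "/home/user/src/foo"),
    ("bar", "/home/user/src/bar"),
]

index = {}
install_declared(index, DECLARED)
# Running it again yields the same index: the declaration is the source of truth.
install_declared(index, DECLARED)
print(index)
```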
Additionally, it allows per-package configuration flags to be specified in the local file. Another feature is that git repositories can be used as locations of source packages. These git repos are synced each time the user runs cabal-meta install.
virthualenv / hsenv
- http://hackage.haskell.org/package/virthualenv
- https://github.com/Paczesiowa/virthualenv/blob/master/README.md
virthualenv (recently renamed hsenv but not released at the time of writing) is a tool to create what it calls isolated Haskell environments.
It lets the user start shell sessions where the usual Haskell tools (ghc, cabal etc) only use and install packages from the isolated environment. It is implemented by setting environment variables (PATH and GHC_PACKAGE_PATH) such that the ghc and cabal commands will install packages only within the local environment.
This solves part of the same problem that cabal-dev solves, providing a sandbox for installed packages (but not for source packages as cabal-dev does).
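The environment-variable mechanism can be sketched as follows. This is a Python illustration under stated assumptions, not the tool's actual code: the directory layout (`bin` and `package.conf.d` under the environment directory) is hypothetical, and the real tool may put more than one entry in GHC_PACKAGE_PATH.

```python
import os


def isolated_env(env_dir, base_env):
    """Return a copy of `base_env` adjusted for an isolated environment.

    Assumed (hypothetical) layout: `<env_dir>/bin` holds executables and
    `<env_dir>/package.conf.d` is the environment's package database.
    """
    env = dict(base_env)
    bin_dir = os.path.join(env_dir, "bin")
    old_path = base_env.get("PATH", "")
    # Put the environment's bin directory first so its tools take precedence.
    env["PATH"] = os.pathsep.join([p for p in (bin_dir, old_path) if p])
    # Point ghc at the environment's own package database.
    env["GHC_PACKAGE_PATH"] = os.path.join(env_dir, "package.conf.d")
    return env


env = isolated_env("/tmp/myenv", {"PATH": "/usr/bin"})
print(env["PATH"])
print(env["GHC_PACKAGE_PATH"])
```

A shell session started with such an environment sees only the isolated package database, which is what gives the sandboxing effect for installed packages.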
cab
Package environments ideas
Roles: The package author and package builder are distinct roles (though in many use cases the same individual may fill both roles):
- The package author specifies information in the package .cabal file.
- The package environment is controlled by the person/agent doing the build.
A package environment consists of:
- source package set
- installed package store
- constraints for package versions and flags
- other build configuration (profiling, optimisation, C lib locations etc)
The source package set is a finite mapping of source package id to a location where we can find a source package. This includes references to source tarballs (remotely or locally) and references to local build trees.
The installed package store is a location where packages are installed, plus a package database where installed packages are registered.
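As a rough illustration of the source package set, the following Python sketch models it as a finite mapping from a package id to one of three kinds of location. All type names, package names, URLs and paths here are hypothetical and only serve to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class RemoteTarball:
    url: str


@dataclass(frozen=True)
class LocalTarball:
    path: str


@dataclass(frozen=True)
class BuildTree:
    directory: str


Location = Union[RemoteTarball, LocalTarball, BuildTree]
PackageId = Tuple[str, str]  # (package name, version)

# A finite mapping from source package id to where the source can be found.
source_packages: Dict[PackageId, Location] = {
    ("foo", "1.0"): RemoteTarball("https://example.org/foo-1.0.tar.gz"),
    ("bar", "0.3"): LocalTarball("/home/user/tarballs/bar-0.3.tar.gz"),
    ("baz", "2.1"): BuildTree("/home/user/src/baz"),
}


def locate(package_id: PackageId) -> Optional[Location]:
    """Look up where a source package lives, or None if it is not in the set."""
    return source_packages.get(package_id)


print(locate(("baz", "2.1")))
```

The point of the sum type is that a local build tree is a first-class entry in the set, on the same footing as a tarball.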
The constraints include:
- package version constraints, like "foo == 1.0"
- configuration flags for particular packages (that is the "flags" stuff in .cabal files)
- enabling/disabling of test suites or benchmarks
These constraints influence which source packages can be used, and the configuration influences what the dependencies of those packages are.
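A small Python sketch may make the two effects concrete: a version constraint narrows the set of usable source packages, while flags and test-suite settings change the dependencies of a package. The package description below is hypothetical and far simpler than a real .cabal file; only equality constraints are modelled.

```python
# Hypothetical description of one source package: base dependencies plus
# extra dependencies switched on by a flag or by enabling the test suite.
PACKAGE = {
    "depends": ["base"],
    "flag_depends": {"network": ["network"]},
    "test_depends": ["QuickCheck"],
}


def effective_dependencies(package, flags, tests_enabled):
    """Dependencies of `package` under the given flag and test settings."""
    deps = list(package["depends"])
    for flag, extra in package["flag_depends"].items():
        if flags.get(flag, False):
            deps.extend(extra)
    if tests_enabled:
        deps.extend(package["test_depends"])
    return deps


def satisfying(versions, required=None):
    """Versions allowed by an equality constraint such as foo == 1.0.

    `required=None` means the package is unconstrained.
    """
    return [v for v in versions if required is None or v == required]


print(satisfying(["0.9", "1.0", "1.1"], required="1.0"))
print(effective_dependencies(PACKAGE, {"network": True}, tests_enabled=False))
print(effective_dependencies(PACKAGE, {}, tests_enabled=True))
```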
Other build configuration includes most of the other command line flags that package builders can currently specify when they do cabal configure:
- enable/disable profiling
- dynamic or static linking
- where to look for C libraries
- optimisation settings
- etc
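To illustrate how such build configuration could be stored in a package environment and later turned into cabal configure arguments, here is a minimal Python sketch. The `BuildConfig` record and `configure_args` helper are hypothetical, only three of the settings listed above are covered, and the choice of which configure flag each setting maps to is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BuildConfig:
    """Hypothetical record of builder-controlled settings."""
    profiling: bool = False
    optimisation: bool = True
    extra_lib_dirs: List[str] = field(default_factory=list)


def configure_args(config: BuildConfig) -> List[str]:
    """Translate a BuildConfig into arguments for `cabal configure`."""
    args = []
    if config.profiling:
        args.append("--enable-library-profiling")
    args.append("--enable-optimization" if config.optimisation
                else "--disable-optimization")
    # One flag per directory in which to look for C libraries.
    for directory in config.extra_lib_dirs:
        args.append("--extra-lib-dirs=" + directory)
    return args


print(configure_args(BuildConfig(profiling=True, extra_lib_dirs=["/opt/lib"])))
```

Keeping these settings in the environment rather than on the command line means every package built in that environment gets the same configuration.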
Mechanisms for an implementation
The existing hackage index format gives us a source package set but has the limitation that it cannot refer to local build trees. It is relatively straightforward to generalise the existing hackage index format to make it possible to refer to local tarballs or directories.
TODO: what about references to repositories (darcs, git etc) ?
