Ticket #184 (reopened enhancement)

Opened 2 years ago

Last modified 11 months ago

cabal-install should report build results to hackage server

Reported by: duncan
Assigned to:
Priority: high
Milestone: HackageDB
Component: hackageDB website
Version: 1.2.2.0
Severity: normal
Keywords:
Cc:
Difficulty: normal
GHC Version: 6.4.2
Platform: Linux

Description

One way to get a lot more testing data on hackage packages is for cabal-install to report build successes and failures back to the hackage server. This information should be kept by hackage and used to work out on which platforms and configurations a package builds successfully and on which it does not. This should help developers discover problems more quickly and should also give useful information to users.

An important consideration is privacy. Users should always have the option to not report anything, and any reported information that is kept should not contain identifying information. This is particularly important for build logs, which may contain paths etc. It should be clear to users what information they are reporting so that they can decide for themselves whether it meets their privacy needs. Since cabal can be used to build private code, it is vital that it reports only on packages that were obtained from the hackage server. It is also vital that the information is sent to the correct hackage server. It's possible to set up private hackage server instances and it may be useful to collect build information "in house" too.
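A rough sketch of the two privacy rules above, in Python for brevity. The `BuiltPackage` type, the `origin_server` field and both function names are hypothetical and are not part of cabal-install; replacing only the home directory is an assumption about what a minimal log scrub might look like.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BuiltPackage:
    name: str
    version: str
    # Server the tarball was downloaded from; None for local or private code.
    origin_server: Optional[str]


def report_destination(pkg: BuiltPackage, opted_in: bool) -> Optional[str]:
    """Return the server a report may be sent to, or None if nothing is sent."""
    if not opted_in:
        return None
    # Packages with no origin server are never reported; all others are
    # reported only to the server they were obtained from.
    return pkg.origin_server


def scrub_paths(log: str, home: str) -> str:
    """Replace the user's home directory in a log excerpt with a placeholder."""
    return log.replace(home, "<home>")
```

A real implementation would probably need to scrub more than the home directory (user names, host names), which is why the build log is the hardest part to get right.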

So what information would be helpful?

  • build success or failure (qualified by what failed, build, docs, tests)
  • package name and version
  • hash of .cabal file just to make sure it's the same one we're all talking about and to detect local modifications.
  • precise versions of dependent packages that the package was built with.
  • os and arch strings
  • compiler flavour and version
  • versions of important build tools
  • In the case of a build failure, some part of the build log would be helpful. This is the most problematic part from a privacy point of view.
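The list above could map onto a record along these lines. This is only a sketch in Python: every field name is an assumption, the `Outcome` values are made up for illustration, and SHA-256 is an arbitrary choice of hash since the ticket does not name one.

```python
import hashlib
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Outcome(Enum):
    # Success, or the stage that failed.
    SUCCESS = "success"
    BUILD_FAILED = "build-failed"
    DOCS_FAILED = "docs-failed"
    TESTS_FAILED = "tests-failed"


@dataclass
class BuildReport:
    package: str
    version: str
    cabal_file_hash: str          # detects local modifications to the .cabal file
    dependencies: Dict[str, str]  # dependency name -> exact version used
    os: str
    arch: str
    compiler: str                 # compiler flavour, e.g. "ghc"
    compiler_version: str
    outcome: Outcome
    tool_versions: Dict[str, str] = field(default_factory=dict)
    log_excerpt: Optional[str] = None  # only for failures; privacy sensitive


def hash_cabal_file(contents: bytes) -> str:
    """Hex digest identifying the exact .cabal file that was built."""
    return hashlib.sha256(contents).hexdigest()
```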

This is quite a bit of raw data and we can expect to have many hundreds of such reports for popular packages. How can we distill useful information from this kind of data?

I expect we would want to do some statistical analysis where we look for common traits in the results. The data forms a multi-dimensional space of boolean values. By looking down the rows/columns of this space we should be able to identify trends, for example that a package always fails with ghc-6.8.x, or always fails on Windows. Excluding those obvious failures, we may then be able to say that it does work on Mac OS X (except with ghc-6.8.1) or that it does work with regex-posix-0.92 (except on Windows), etc. This should allow us to give a summary saying yes/no to various properties of the environment.
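To make "looking down the rows/columns" concrete, here is a minimal Python sketch. It assumes each report is reduced to a dictionary of environment properties plus a success flag; the function names and the yes/no/mixed labels are invented for illustration and this is far short of real statistical analysis.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# One report: environment properties (e.g. "os", "compiler") and whether it built.
Report = Tuple[Dict[str, str], bool]


def summarise(reports: Iterable[Report], dimension: str) -> Dict[str, str]:
    """For each value along one dimension, say whether builds always
    succeed ("yes"), always fail ("no"), or do both ("mixed")."""
    outcomes = defaultdict(set)
    for env, ok in reports:
        outcomes[env[dimension]].add(ok)
    return {
        value: "yes" if seen == {True} else "no" if seen == {False} else "mixed"
        for value, seen in outcomes.items()
    }


def excluding(reports: Iterable[Report], dimension: str, value: str) -> List[Report]:
    """Drop reports for a known-bad value so other trends become visible."""
    return [(env, ok) for env, ok in reports if env[dimension] != value]
```

Summarising by compiler first, then excluding a compiler that always fails and summarising by OS, gives the kind of "works on X (except with Y)" statement described above.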

For failure cases developers should be able to get access to the more detailed data including build logs.

Change History

(follow-up: ↓ 4 ) 01/20/08 09:13:57 changed by duncan

  • priority changed from normal to high.
  • difficulty changed from normal to project(> week).

There are various sub-tasks:

  • Make cabal install not fail overall just because one package fails. There should be a mode to carry on and build all remaining packages that did not depend on the failed package. This requires keeping a full dep graph while installing, not just a linear list of packages.
  • Define a build report data type
  • Generate a build report for each package
  • Collect build reports in a local file
  • Upload build reports to hackage at appropriate points. Perhaps new reports should be sent at the same time that the package index or tarballs are downloaded.
  • Implement a server side script to allow reports to be uploaded. We have to decide where and how reports are stored. The obvious choice is a log file per package-version; the alternative is some other kind of database and index. The considerations here are keeping the reports anonymous and what queries we will want to run on the reports.
  • Implement the statistical analysis necessary to distill the reports into useful information per-package. Present this info on each package page.
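The first sub-task, carrying on past a failed package, can be sketched as follows. This is Python rather than cabal-install's own code; `install_all` and the `build` callback are hypothetical names, and the dependency graph is assumed to be given as a map from each package to its direct dependencies.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, Set


def install_all(deps: Dict[str, Set[str]], build: Callable[[str], bool]) -> Dict[str, str]:
    """Build every package in dependency order, continuing past failures.

    `deps` maps each package to the set of packages it depends on directly.
    Returns a status per package: "ok", "failed" or "skipped".
    """
    status: Dict[str, str] = {}
    # static_order() yields dependencies before the packages that need them.
    for pkg in TopologicalSorter(deps).static_order():
        if any(status[d] != "ok" for d in deps.get(pkg, ())):
            # A direct dependency failed or was itself skipped, so this
            # package cannot be built; unrelated packages are unaffected.
            status[pkg] = "skipped"
        elif build(pkg):
            status[pkg] = "ok"
        else:
            status[pkg] = "failed"
    return status
```

With only a linear list of packages there is no way to tell which later packages are unaffected by a failure, which is why the full graph has to be kept.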

04/12/08 13:47:25 changed by duncan

  • status changed from new to closed.
  • resolution set to fixed.

Fixed:

Sat Apr 12 22:49:04 CEST 2008  Andres Loeh <mail@andres-loeh.de>
  * fix for #187 -- directory of Paths_packagename is included when looking for source files

04/12/08 13:53:14 changed by duncan

  • status changed from closed to reopened.
  • resolution deleted.

Oops, closed the wrong bug....

(in reply to: ↑ 1 ) 06/10/08 04:42:00 changed by duncan

Replying to duncan:

There are various sub-tasks:

  • Make cabal install not fail overall just because one package fails. There should be a mode to carry on and build all remaining packages that did not depend on the failed package. This requires keeping a full dep graph while installing, not just a linear list of packages.
  • Define a build report data type
  • Generate a build report for each package
  • Collect build reports in a local file

These are all now done. Build reports are logged into ~/.cabal/packages/$server/build-reports.log.
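For illustration only, collecting reports in a per-server file could look like the Python sketch below. The directory layout follows the path above, but the one-JSON-object-per-line format and all function names are assumptions; the actual format cabal-install writes is not described in this ticket.

```python
import json
from pathlib import Path
from typing import Dict, List


def report_log_path(cabal_dir: Path, server: str) -> Path:
    """Location of the report log for one server, under the cabal directory."""
    return cabal_dir / "packages" / server / "build-reports.log"


def append_report(cabal_dir: Path, server: str, report: Dict[str, str]) -> None:
    """Append one report as a single line (assumed format: JSON per line)."""
    path = report_log_path(cabal_dir, server)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(report, sort_keys=True) + "\n")


def read_reports(cabal_dir: Path, server: str) -> List[Dict[str, str]]:
    """Read back all reports for a server, e.g. before uploading them."""
    path = report_log_path(cabal_dir, server)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Keeping one log per server means reports for packages from a private hackage instance never end up in the file destined for the public one.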

  • Upload build reports to hackage at appropriate points. Perhaps new reports should be sent at the same time that the package index or tarballs are downloaded.
  • Implement a server side script to allow reports to be uploaded. We have to decide where and how reports are stored. The obvious choice is a log file per package-version; the alternative is some other kind of database and index. The considerations here are keeping the reports anonymous and what queries we will want to run on the reports.
  • Implement the statistical analysis necessary to distill the reports into useful information per-package. Present this info on each package page.

These are the remaining tasks.

08/12/08 11:22:54 changed by duncan

  • difficulty changed from project(> week) to normal.
  • component changed from cabal-install tool to hackageDB website.

Current cabal-install can now generate anonymous or detailed build reports and upload them to a compatible server. The remaining work is mostly server side.

08/15/08 08:18:46 changed by duncan

  • milestone set to HackageDB.