Ticket #609 (closed defect: fixed)

Opened 4 years ago

Last modified 3 years ago

ghc-pkg dump encoding

Reported by: igloo
Owned by:
Priority: normal
Milestone: Cabal-1.8
Component: Cabal library
Version: HEAD
Severity: normal
Keywords:
Cc:
Difficulty: unknown
GHC Version:
Platform:

Description

With ghc-6.12.0.20091120, FunctorSalad on IRC found:

when I try to run some Setup.hs (of any package, apparently), it compiles, but ./Setup configure crashes with:
Setup: fd:5: hGetContents: invalid argument (Invalid or incomplete multibyte or wide character)
*** glibc detected *** ./Setup: double free or corruption (!prev): 0x098953a0 ***

As far as I can tell, the problem is that ./Setup finds /usr/bin/ghc which is 6.10.4, and the output of ghc-pkg dump | hexdump -C includes:

00038060  63 6f 70 79 72 69 67 68  74 3a 20 a9 20 32 30 30  |copyright: . 200|

i.e. there's a Latin-1 copyright symbol (0xA9), which is not a valid UTF-8 sequence on its own.
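
The failure can be reproduced at the byte level. The following is a minimal sketch in Python, not the code Cabal or GHC runs; the byte string is taken from the hexdump line above. A lone 0xA9 is a UTF-8 continuation byte, so strict UTF-8 decoding rejects it, while Latin-1 decodes it as the copyright sign.

```python
# Bytes from the hexdump line above: "copyright: " + 0xA9 + " 200"
line = bytes.fromhex("636f707972696768743a20a920323030")

# Latin-1 maps every byte to a code point, so 0xA9 becomes U+00A9.
as_latin1 = line.decode("latin-1")

# Strict UTF-8 rejects a lone 0xA9 (a continuation byte with no lead byte).
try:
    line.decode("utf-8")
    utf8_ok = True
    bad_offset = None
except UnicodeDecodeError as err:
    utf8_ok = False
    bad_offset = err.start  # index of the offending byte within `line`
```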

Change History

  Changed 4 years ago by igloo

  • version changed from 1.6.0.1 to HEAD

  Changed 4 years ago by duncan

Gah!

We should agree between Cabal and ghc/ghc-pkg whether installed package info files are in Latin1 or UTF-8. I vote for them always being UTF-8 (not locale) and for all of ghc/ghc-pkg/Cabal validating.

  Changed 4 years ago by igloo

Whatever we decide should happen in the future, existing installations may have non-UTF-8 package databases.

  Changed 4 years ago by duncan

Replying to igloo:

Whatever we decide should happen in the future, existing installations may have non-UTF-8 package databases.

ghc and ghc-pkg do not have to deal with existing non-UTF-8 package dbs, because the only package dbs they read and write are those for ghc 6.12. Cabal may have to, if cabal-install is built with 6.12 but consumes the output of ghc-pkg-6.10 dump.
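
One way a tool could cope with this mixed case is to try strict UTF-8 first and fall back to Latin-1 for output from older ghc-pkg versions. The sketch below is in Python, and `decode_pkg_dump` is a hypothetical helper introduced only for illustration; it is not what Cabal implements.

```python
def decode_pkg_dump(raw: bytes) -> str:
    """Decode `ghc-pkg dump` output (hypothetical helper, for illustration).

    Assumption: new package dbs are UTF-8, old ones may be Latin-1.
    """
    try:
        # Strict decoding: raises on any byte sequence that is not valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 never fails, since every byte maps to one code point.
        return raw.decode("latin-1")
```

The fallback is a heuristic: Latin-1 text that happens to form valid UTF-8 sequences would be decoded as UTF-8, though that is unlikely for fields such as a copyright line.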

  Changed 4 years ago by guest

GHC now always uses UTF-8 for the output of ghc-pkg dump, and the input to ghc-pkg register and ghc-pkg update.

Wed Nov 25 06:17:30 PST 2009  Simon Marlow <marlowsd@gmail.com>
  * Use UTF-8 explicitly for InstalledPackageInfo
  So ghc-pkg register/update takes input in UTF-8, and ghc-pkg dump
  outputs in UTF-8.  Textual package config files in the package DB are
  assumed to be in UTF-8.

    M ./utils/ghc-pkg/Main.hs -7 +26

This patch will be merged into 6.12.1.

Also the double-free error reported above is fixed.

  Changed 3 years ago by duncan

  • status changed from new to closed
  • resolution set to fixed
  • milestone set to Cabal-1.8

Sun Nov 29 15:33:41 GMT 2009  Duncan Coutts <duncan@haskell.org>
  * Package registration files are always UTF8
  As is the output from ghc-pkg dump.