Ticket #4080 (closed bug: fixed)

Opened 3 years ago

Last modified 3 years ago

Use libcharset instead of nl_langinfo(CODESET) if possible.

Reported by: PHO
Owned by: igloo
Priority: high
Milestone: 7.0.1
Component: libraries/base
Version: 6.13
Keywords: iconv locale
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Runtime crash
Difficulty:
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description

nl_langinfo(CODESET) does not always return a standardized encoding name that GNU libiconv understands.

This affects at least NetBSD and OpenBSD: GHC.IO.Encoding.Iconv.localeEncoding is affected, so even ghc --version fails. Here is an example:

/* test1.c */
#include <stdio.h>
#include <locale.h>
#include <langinfo.h>

int main() {
    setlocale(LC_ALL, "");
    printf("nl_langinfo(CODESET) = \"%s\"\n", nl_langinfo(CODESET));
    return 0;
}

% gcc -o test1 test1.c
% LC_ALL=ja_JP.UTF-8 ./test1
nl_langinfo(CODESET) = "UTF-8"   // Good.
% iconv -f UTF-8 -t UTF-8 /dev/null && echo ok
ok
% LC_ALL=C ./test1
nl_langinfo(CODESET) = "646"     // Wtf? You mean ISO 646?
% iconv -f 646 -t UTF-8 /dev/null && echo ok
iconv: conversion from 646 unsupported
iconv: try 'iconv -l' to get the list of supported encodings
% uname -a
NetBSD netbsd 5.99.20 NetBSD 5.99.20 (ADJUSTED) #0: Mon Oct  5 15:05:08 JST 2009
  root@netbsd:/usr/obj/sys/arch/i386/compile/ADJUSTED i386
%
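
The same check can also be made from C instead of the iconv command line. The following is only a minimal sketch, not code from the attached patch: codeset_usable_by_iconv is a hypothetical helper that asks iconv_open() whether it accepts a given name as the source encoding for a conversion to UTF-8. On systems where iconv comes from GNU libiconv rather than libc, linking with -liconv may be required.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not part of GHC or libiconv):
 * returns 1 if iconv can convert from the codeset `name` to UTF-8,
 * and 0 if iconv_open() rejects the name. */
int codeset_usable_by_iconv(const char *name) {
    assert(name != NULL);

    /* iconv_open takes (tocode, fromcode) and returns (iconv_t)-1 on error. */
    iconv_t cd = iconv_open("UTF-8", name);
    if (cd == (iconv_t)-1)
        return 0;

    iconv_close(cd);
    return 1;
}
```

Calling it with the result of nl_langinfo(CODESET) after setlocale(LC_ALL, "") would be expected to mirror the transcript above: accepted for "UTF-8", rejected for "646" on the NetBSD system shown.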

So we should use libcharset, which ships with GNU libiconv, whenever it is available. See: http://www.haible.de/bruno/packages-libcharset.html

/* test2.c */
#include <stdio.h>
#include <locale.h>
#include <libcharset.h>

int main() {
    setlocale(LC_ALL, "");
    printf("locale_charset() = \"%s\"\n", locale_charset());
    return 0;
}

% gcc -o test2 test2.c -I/usr/pkg/include -L/usr/pkg/lib -lcharset
% LC_ALL=ja_JP.UTF-8 ./test2
locale_charset() = "UTF-8"    // Good.
% LC_ALL=C ./test2
locale_charset() = "ASCII"    // Good!
% iconv -f ASCII -t UTF-8 /dev/null && echo ok
ok
%
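
One way to express "use libcharset if possible" is a compile-time switch. The sketch below is an illustration under stated assumptions, not the contents of libcharset.patch: HAVE_LIBCHARSET is an assumed macro that a configure check would define when libcharset.h and -lcharset are available, and current_codeset is a hypothetical wrapper name.

```c
#include <assert.h>
#include <stddef.h>

#ifdef HAVE_LIBCHARSET           /* assumed to be set by a configure check */
#include <libcharset.h>
#else
#include <langinfo.h>
#endif

/* Hypothetical wrapper: returns the codeset name of the current locale.
 * Callers are expected to have called setlocale(LC_ALL, "") beforehand,
 * as test1.c and test2.c do. */
const char *current_codeset(void) {
#ifdef HAVE_LIBCHARSET
    /* Preferred: libcharset maps platform-specific names to canonical ones. */
    const char *name = locale_charset();
#else
    /* Fallback: the raw platform name, which iconv may not recognize. */
    const char *name = nl_langinfo(CODESET);
#endif
    /* Defensive check: a NULL name could not be passed on to iconv_open(). */
    assert(name != NULL);
    return name;
}
```

Built with -DHAVE_LIBCHARSET and -lcharset, this takes the locale_charset() path from test2.c; without the macro it behaves like test1.c.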

Attachments

libcharset.patch (33.8 KB) - added by PHO 3 years ago.

Change History

Changed 3 years ago by PHO

Changed 3 years ago by igloo

  • status changed from new to patch

Changed 3 years ago by igloo

  • priority changed from normal to high
  • milestone set to 6.14.1

Changed 3 years ago by simonmar

  • owner set to igloo

Patch looks fine to me - Ian could you go ahead and validate/push please?

Changed 3 years ago by igloo

  • status changed from patch to closed
  • resolution set to fixed

Applied.
