Ticket #5239 (new feature request)

Opened 2 years ago

Last modified 5 months ago

Em-dash for "--" with UnicodeSyntax.

Reported by: Eelis-
Owned by:
Priority: normal
Milestone: 7.6.2
Component: Compiler (Parser)
Version: 7.0.3
Keywords: unicode syntax extension
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Difficulty: Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description

It would be neat if the UnicodeSyntax extension supported the Unicode "—" EM DASH (U+2014) character as an alternative to the "--" character sequence that introduces a single-line comment.
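As a rough illustration of the requested behaviour, here is a toy sketch. It is not GHC's lexer and not the attached patch: `stripLineComment` and its `allowEmDash` flag are hypothetical names, and the function ignores string literals and operators such as `-->`, which a real lexer has to handle.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical helper (not part of GHC): drop a trailing line comment.
-- With allowEmDash = True, U+2014 EM DASH is accepted as a comment
-- introducer alongside the usual "--".
-- Simplification: string literals and operators like "-->" are ignored.
stripLineComment :: Bool -> String -> String
stripLineComment allowEmDash = go
  where
    go [] = []
    go s@(c:cs)
      | "--" `isPrefixOf` s          = []
      | allowEmDash && c == '\x2014' = []
      | otherwise                    = c : go cs
```

With the flag off, an em dash is kept as ordinary text, which mirrors the idea that the new comment form would only be active under the extension.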

One possible objection I can anticipate is that its use could be confusing when using a monospace font, but it seems unjust to let that hold back those of us who have liberated ourselves from monospace. :-)

Attachments

mdash.patch (5.1 KB) - added by porges 20 months ago.
patch

Change History

Changed 23 months ago by igloo

  • component changed from Compiler to Compiler (Parser)
  • milestone set to 7.4.1

Changed 20 months ago by porges

patch

Changed 20 months ago by porges

  • status changed from new to patch

I have added a tentative patch. It works fine, but if UnicodeSyntax is disabled, the lexer gives an error upon encountering the mdashes: "unexpected character '\n'" or something similar. I thought that this was because of the 'Unicode fix', which transforms mdash into \x7, but Alex always complains about \n, not about whatever the mdash gets transformed into.

I'm not sure how to fix that, so I'm attaching it here in the hopes that someone else knows.

I also added some extra checking to the 'not in scope' error (in RnEnv.lhs) that suggests that users might want to enable UnicodeSyntax if compilation fails because an mdash isn't in scope. A further extension would be for this to happen when any UnicodeSyntax character turns up here. (This can't be seen at the moment because of the aforementioned issue, but it works fine if only this part is enabled.)

Changed 20 months ago by porges

Figured out what was wrong with my patch. The '$mdash' declaration needs to be in the $symbol character class. After that change, all works as expected.

Changed 16 months ago by igloo

  • milestone changed from 7.4.1 to 7.6.1

Changed 10 months ago by simonpj

  • status changed from patch to new
  • difficulty set to Unknown

Dear porges

Sorry that we've been playing dead on this.

We don't have an opinion either way, but it's not entirely clear to us that everyone would welcome such a change; e.g. they might want to use em-dash in an operator.
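To make the operator concern concrete: the Haskell 2010 report lets any Unicode symbol or punctuation character appear in an operator name, and U+2014 is dash punctuation, so a definition like the following sketch should already be accepted without the patch (assuming a UTF-8 source file). The operator and the name `example` are made up for illustration.

```haskell
-- Hypothetical operator whose name is the single character U+2014 (EM DASH).
-- If an em dash introduced a comment, everything from the first em dash
-- on each of these lines would be swallowed.
infixl 6 —

(—) :: Int -> Int -> Int
a — b = a - b

example :: Int
example = 10 — 3
```

Code of this shape would change meaning, or stop compiling, in any module that turned on an em-dash comment syntax.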

Could you initiate a thread on glasgow-haskell-users to see if other Unicode-aware folk actively want the change? If so, we'll apply it. A final patch would be useful, and it should include documentation in section 7.3.1 of the user manual.

Thanks

Simon

Changed 8 months ago by igloo

  • milestone changed from 7.6.1 to 7.6.2

Changed 5 months ago by guest

I just want to speak out in support of this feature. I would prefer en-dashes over em-dashes though. (That would be consistent with the TeX convention of -- for en-dashes and --- for em-dashes.)
