Ticket #3079 (closed merge: fixed)

Opened 4 years ago

Last modified 4 years ago

LANGUAGE pragma fails if it falls on a 1024-byte boundary

Reported by: Deewiant
Owned by: igloo
Priority: normal
Milestone: 6.10.2
Component: Compiler (Parser)
Version: 6.10.1
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure:
Difficulty: Unknown
Test Case: read068
Blocked By:
Blocking:
Related Tickets:

Description

Putting too much whitespace and/or comment text before a LANGUAGE pragma seems to make parsing it fail. In the following, the closing } of the pragma is byte 1025. Perhaps the whole pragma is expected to fit into the first 1024 bytes, or to lie wholly inside an aligned 1024-byte range?

-- xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
-- xxxxxxxxxxxxxxxxxxxxxxxx
-- xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
-- xxxxxxxxxxxxxxxxxxxxxxx
-- xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
--
--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
--
--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
--
--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

{-# LANGUAGE ScopedTypeVariables #-}
main = return ()

Removing absolutely anything before the start of the LANGUAGE pragma makes it work: one of the x characters, an entire line, any whitespace, whatever.

Change History

Changed 4 years ago by Deewiant

  • summary changed from LANGUAGE pragma fails if preceded by too many comments to LANGUAGE pragma fails if it falls on a 1024-byte boundary

I tried adding more comments before it, and that also made it work. Adding still more breaks it again as soon as the 2048th byte falls inside the {-# #-}, so this is indeed an alignment issue.

The error message, which I forgot to include in the report:

arst.hs:21:13:
    cannot parse LANGUAGE pragma: comma-separated list expected
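
To experiment with other offsets, a small generator can place the pragma at any byte position. This is only a sketch: the names pragma, padding and reproSource are made up for illustration, and it assumes pure ASCII source so that one character is one byte.

```haskell
-- Hypothetical helper names; nothing here comes from GHC itself.

-- | The pragma from the report.
pragma :: String
pragma = "{-# LANGUAGE ScopedTypeVariables #-}"

-- | Exactly n bytes of filler made of "--xxx" line comments.
-- Gaps too short to hold a comment (1 or 2 bytes) are filled with
-- spaces and a newline instead.
padding :: Int -> String
padding n
  | n <= 0    = ""
  | n < 3     = replicate (n - 1) ' ' ++ "\n"
  | n <= 80   = "--" ++ replicate (n - 3) 'x' ++ "\n"
  | otherwise = padding 80 ++ padding (n - 80)

-- | A module whose pragma has its closing '}' at the given 1-based
-- byte offset. Assumes ASCII only, so one Char is one byte.
reproSource :: Int -> String
reproSource endByte =
  padding (endByte - length pragma) ++ pragma ++ "\nmain = return ()\n"
```

For example, writeFile "Repro.hs" (reproSource 1025) produces a file with the same layout as the one in the description, and other arguments move the pragma relative to the 1024-byte and 2048-byte boundaries.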

Changed 4 years ago by Deewiant

The culprit is getOptions' in compiler/main/HeaderInfo.hs. The function has a mechanism for indicating that it wants more input, but it isn't using it properly in this case. I'm not familiar enough with the system to fix it, but it seems to me that modifying the definition of parseLanguage (in the where clause) should suffice.

Note that the same function parses OPTIONS pragmas, which do work correctly even if they fall on a 1024-byte boundary.
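
As a toy model of this kind of failure (not the code in HeaderInfo.hs, and with hypothetical names throughout), the sketch below scans for LANGUAGE pragmas in two ways: once per chunk with nothing carried across chunk boundaries, and once over the chunks treated as a continuous stream. It assumes the pragma is written exactly as "{-# LANGUAGE " followed by a comma-separated list.

```haskell
import Data.List (isPrefixOf, stripPrefix, tails)

-- Toy model only: these names are hypothetical and this is not the
-- code from compiler/main/HeaderInfo.hs.

-- | Split input into fixed-size chunks, as a reader with a small
-- buffer would.
chunksOf :: Int -> String -> [String]
chunksOf n s
  | n <= 0 || null s = []
  | otherwise        = let (h, t) = splitAt n s in h : chunksOf n t

-- | The text before the first occurrence of a non-empty terminator,
-- or Nothing if the terminator never appears.
upTo :: String -> String -> Maybe String
upTo end = go
  where
    go [] = Nothing
    go r@(c:cs)
      | end `isPrefixOf` r = Just []
      | otherwise          = (c :) <$> go cs

-- | Extension names from every complete LANGUAGE pragma in the input.
languageExts :: String -> [String]
languageExts s =
  [ ext
  | t <- tails s
  , Just rest <- [stripPrefix "{-# LANGUAGE " t]
  , Just body <- [upTo "#-}" rest]
  , ext <- words (map commaToSpace body)
  ]
  where
    commaToSpace c = if c == ',' then ' ' else c

-- | Scan each chunk on its own, carrying nothing across boundaries.
scanPerChunk :: Int -> String -> [String]
scanPerChunk n = concatMap languageExts . chunksOf n

-- | Scan the chunks as one continuous stream.
scanStream :: Int -> String -> [String]
scanStream n = languageExts . concat . chunksOf n
```

With a pragma that starts at byte offset 1000, scanPerChunk 1024 finds nothing because neither chunk contains a complete pragma, while scanStream 1024 finds ScopedTypeVariables. Moving the pragma so that it lies entirely inside one chunk makes both scans agree.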

Changed 4 years ago by simonmar

  • owner set to simonmar
  • difficulty set to Unknown
  • milestone set to 6.10.2

Changed 4 years ago by simonmar

  • owner changed from simonmar to igloo
  • type changed from bug to merge

Fixed:

Thu Mar 12 14:11:03 GMT 2009  Simon Marlow <marlowsd@gmail.com>
  * FIX #3079, dodgy parsing of LANGUAGE pragmas
  I ended up rewriting this horrible bit of code, using (yikes) lazy I/O
  to slurp in the source file a chunk at a time.  The old code tried to
  read the file a chunk at a time, but failed with LANGUAGE pragmas
  because the parser for LANGUAGE has state and the state wasn't being
  saved between chunks.  We're still closing the Handle eagerly, so
  there shouldn't be any problems here.
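
The patch itself is not shown in this ticket. As a rough sketch of the general technique the message describes, and not the actual change, the code below reads a file lazily one chunk at a time and forces the parser's result before the Handle is closed. The names readChunk, lazyChunks and parseHeaderWith are hypothetical, and the parser is assumed to be a plain String -> String function.

```haskell
import Control.Exception (evaluate)
import System.IO
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical names; a sketch of the technique, not the GHC patch.

-- | Read up to n characters strictly from the handle.
readChunk :: Handle -> Int -> IO String
readChunk h n
  | n <= 0    = return ""
  | otherwise = do
      eof <- hIsEOF h
      if eof
        then return ""
        else do
          c  <- hGetChar h
          cs <- readChunk h (n - 1)
          return (c : cs)

-- | The file as a lazy list of chunks: each chunk is read only when
-- the list is demanded that far. Assumes n > 0.
lazyChunks :: Int -> Handle -> IO [String]
lazyChunks n h = unsafeInterleaveIO $ do
  eof <- hIsEOF h
  if eof
    then return []
    else do
      chunk <- readChunk h n
      rest  <- lazyChunks n h
      return (chunk : rest)

-- | Run a parser over the lazily read contents, forcing every
-- character of its result before withFile closes the handle.
parseHeaderWith :: Int -> (String -> String) -> FilePath -> IO String
parseHeaderWith chunkSize parse path =
  withFile path ReadMode $ \h -> do
    contents <- concat <$> lazyChunks chunkSize h
    let result = parse contents
    mapM_ evaluate result
    return result
```

Because the parser sees one continuous String, any state it keeps survives chunk boundaries without extra bookkeeping. Forcing the result inside withFile is what makes the eager close safe: nothing that escapes still depends on unread input.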

Changed 4 years ago by igloo

  • status changed from new to closed
  • testcase set to read068
  • resolution set to fixed

Merged, and test read068 added.
