Markdown literate programming proposal

Markdown has become quite popular for its unobtrusive readability. Its lightweight rendering has also made possible previewers that update as you type or as you save. GHC developers have commented that they dislike the LaTeX markup because it gets too much in the way. Many notes in the GHC source are now written in a markdown-like style, which doesn't help in converting the code to readable documentation (which was the point of lhs to begin with).

The proposal is to extend unlitting with markdown-style code fencing. Pure markdown only specifies indented code blocks, not fenced ones, but the popular GitHub extensions do specify fences (and so do many others). They even allow for syntax highlighting if the language's (short) name is given on the opening line of a code block.
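To make the idea concrete, here is a minimal sketch of what fence-based unlitting could look like. It is not the existing unlit.c behaviour and not a proposed final design: the name `unlitMarkdown` is hypothetical, and the sketch assumes that only fences tagged `hs` or `haskell` count as code and that all other lines are replaced by blank lines so that line numbers in error messages stay aligned with the source file.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- Three backticks, built with 'replicate' so this listing can itself
-- sit inside a fenced block.
fence :: String
fence = replicate 3 '`'

-- | Hypothetical sketch: keep the lines inside fences tagged "hs" or
-- "haskell" and blank out everything else (prose, fence lines, and the
-- contents of fences in other languages), preserving the line count.
unlitMarkdown :: String -> String
unlitMarkdown = unlines . go False . lines
  where
    go _ [] = []
    go inCode (l : ls)
      | inCode, isFence l = "" : go False ls  -- closing fence
      | inCode            = l  : go True ls   -- Haskell code line
      | isHaskellFence l  = "" : go True ls   -- opening Haskell fence
      | otherwise         = "" : go False ls  -- prose or other content

-- | Any line whose first non-space characters are a fence.
isFence :: String -> Bool
isFence l = fence `isPrefixOf` dropWhile isSpace l

-- | A fence whose language tag is "hs" or "haskell".
isHaskellFence :: String -> Bool
isHaskellFence l = isFence l && tag `elem` ["hs", "haskell"]
  where
    afterFence = drop 3 (dropWhile isSpace l)
    tag = takeWhile (not . isSpace) (dropWhile isSpace afterFence)
```

The sketch ignores tilde fences, longer backtick runs, and nested fences, all of which a real implementation would have to decide on.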

Unfortunately, markdown is incompatible with Bird-style literate programming (lines starting with '>' are considered block quotes in markdown) and with LaTeX-style literate programming. It seems, therefore, that a file should be interpreted either as markdown-lhs or as classic-lhs. One way of doing this is to introduce a new extension "mhs" (or "mdhs"; and then, of course, "mhs-boot" or "mdhs-boot"). This seems a rather heavy way of going about it.

There are two other ways of doing it. The first is to analyse the file not line by line but as a whole, to see whether it contains e.g. the strings "hs" or "haskell". This option seems the least demanding of the programmer, but it may sometimes produce unexpected results. The alternative is to require that a markdown file start with markdown metadata. As an aside, for Cabal developers, this looks a *lot* like cabal metadata; maybe there's room for future design here ;)
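The two detection strategies can be sketched roughly as follows. Everything here is illustrative: `LitStyle`, `detectByFence`, `detectByMetadata` and `isMetadataLine` are hypothetical names, and the sketch assumes that "markdown metadata" means a leading `Key: value` line, which is only one of several metadata conventions in use.

```haskell
import Data.Char (isAlphaNum, isSpace)
import Data.List (isPrefixOf)

-- | How a literate file should be unlitted (hypothetical type).
data LitStyle = MarkdownLit | ClassicLit
  deriving (Eq, Show)

-- | Option 1: look at the whole file and treat it as markdown-lhs if
-- any line opens a fence tagged "hs" or "haskell".
detectByFence :: String -> LitStyle
detectByFence src
  | any opensHaskellFence (lines src) = MarkdownLit
  | otherwise = ClassicLit
  where
    fence = replicate 3 '`'
    opensHaskellFence l =
      let s = dropWhile isSpace l
          tag = takeWhile (not . isSpace) (drop 3 s)
      in fence `isPrefixOf` s && tag `elem` ["hs", "haskell"]

-- | Option 2: treat the file as markdown-lhs only if its first line is
-- a "Key: value" metadata line (an assumption about the format).
detectByMetadata :: String -> LitStyle
detectByMetadata src =
  case lines src of
    (l : _) | isMetadataLine l -> MarkdownLit
    _ -> ClassicLit

-- | A non-empty key of letters, digits, '-', '_' or spaces, then ':'.
isMetadataLine :: String -> Bool
isMetadataLine l =
  case break (== ':') l of
    (key, ':' : _) -> not (null key) && all isKeyChar key
    _ -> False
  where
    isKeyChar c = isAlphaNum c || c `elem` "-_ "
```

Both checks show how misdetection can arise: a classic-lhs file that merely quotes a Haskell fence in its prose, or that opens with a line such as "Note: ...", would be classified as markdown by this sketch.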

A good way of implementing this could be to use (a somewhat modified version of) cpphs. The ability to do markdown unlitting with cpphs is being discussed separately with the maintainers of that package. An additional benefit could be that GHC no longer requires multiple stages of temporary files to do unlit and cpp if we decide to use cpphs internally. This would also simplify these stages for the GHC API. It also seems to help solve #4836; CPP-things would only be preserved inside fenced code blocks.

I am more than willing to implement this and my preference is for the last solution, but before I start, I thought I would gauge public opinion.

Questions:

Is there a special reason why unlit.c and cpp are still used as external calls from DriverPipeline?

Is it an acceptable side-effect of this proposal to "internalise" the unlit and cpp stages of the pipeline?

Any further ideas / suggestions / restrictions / recommendations?

Trac metadata

Trac field	Value
Version	7.5
Type	FeatureRequest
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
