Ticket #7120 (closed feature request: invalid)

Opened 10 months ago

Last modified 8 months ago

Markdown literate programming proposal

Reported by: holzensp
Owned by:
Priority: normal
Milestone:
Component: Compiler
Version: 7.5
Keywords:
Cc: ikke+ghc@…
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Difficulty:
Test Case:
Blocked By:
Blocking:
Related Tickets: #4836

Description

Markdown has become quite popular for its unobtrusive readability. Its lightweight rendering has also made possible previewers that update as you type or as you save. GHC developers have commented that they dislike the LaTeX markup because it gets too much in the way. Many notes in the GHC source are now written in a markdown-like style, which doesn't help in converting the code to readable documentation (which was the point of lhs to begin with).

The proposal is to extend unlitting with markdown-style code fencing. Pure markdown does not specify fenced code blocks (only indented ones), but the popular GitHub extensions do (and so do many others). They even allow for syntax highlighting if the language's (short) name is specified on the opening line of a code block.
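To make the proposal concrete, here is a minimal sketch of what fence-based unlitting could look like. It is not GHC's unlit and not a patch against cpphs; the names `fence`, `isHaskellOpen`, `isClose` and `unlitFenced` are hypothetical. It assumes that only fences opened with the language names hs or haskell contain code, that a closing fence is a bare fence on its own line, and that non-code lines should be blanked rather than dropped so that line numbers in error messages still match the source file.

```haskell
import Data.Char (isSpace)

-- Three backticks, built programmatically so this sketch can itself be fenced.
fence :: String
fence = replicate 3 '`'

-- Drop trailing whitespace from a line.
trimEnd :: String -> String
trimEnd = reverse . dropWhile isSpace . reverse

-- Assumption: only these two language tags open a Haskell block.
isHaskellOpen :: String -> Bool
isHaskellOpen l = trimEnd l `elem` [fence ++ "hs", fence ++ "haskell"]

-- Assumption: a closing fence is a bare fence on its own line.
isClose :: String -> Bool
isClose l = trimEnd l == fence

-- Keep lines inside Haskell fences; replace every other line
-- (prose, fence markers, blocks in other languages) with a blank line.
unlitFenced :: String -> String
unlitFenced = unlines . go False . lines
  where
    go _ [] = []
    go False (l:ls)
      | isHaskellOpen l = "" : go True ls
      | otherwise       = "" : go False ls
    go True (l:ls)
      | isClose l = "" : go False ls
      | otherwise = l : go True ls
```

Blanking instead of deleting is a design choice in this sketch; a real implementation might emit line pragmas instead.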

Unfortunately, markdown is incompatible with Bird-style literate programming (lines starting with '>' are considered block quotes in markdown) or LaTeX-style literate programming. It seems, therefore, that a file should either be interpreted as markdown-lhs, or as classic-lhs. One way of doing this is introducing a new extension "mhs" (or "mdhs"; and then, of course, "mhs-boot" or "mdhs-boot"). This seems a rather heavy way of going about it.

Two other ways of doing it: The first is to analyse the file not line by line but as a whole, to see whether it contains e.g. "hs" or "haskell" strings. This option seems the least demanding of the programmer, but it may sometimes produce unexpected results. The alternative is to require that a markdown file start with markdown meta-data. As an aside, for Cabal developers, this looks a *lot* like cabal metadata; maybe there's room for future design here ;)

A good way of implementing this could be to use (a somewhat modified version of) cpphs. The ability to do markdown unlitting with cpphs is being discussed separately with the maintainers of that package. An additional benefit could be that GHC no longer requires multiple states of temporary files to do unlit and cpp if we decide to use cpphs internally. This would also simplify these stages for the GHC API. It also seems to help solve #4836; CPP-things would only be preserved inside fenced code blocks.

I am more than willing to implement this and my preference is for the last solution, but before I start, I thought I would gauge public opinion.

Questions:

Is there a special reason why unlit.c and cpp are still used as external calls from DriverPipeline?

Is it an acceptable side-effect of this proposal to "internalise" the unlit and cpp stages of the pipeline?

Any further ideas / suggestions / restrictions / recommendations?

Change History

Changed 10 months ago by holzensp

For some reason the trac/wiki markup pruned some markup. I meant to say that the strings to look for in file analysis would be ```hs or ```haskell.
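A rough sketch of that whole-file analysis, under the assumption that a file counts as markdown-lhs as soon as any line consists of one of those two fence openers, and as classic-lhs otherwise. The type `LitStyle` and the function `detectStyle` are hypothetical names used only for illustration.

```haskell
import Data.Char (isSpace)

-- Three backticks, built programmatically so this sketch can itself be fenced.
fence :: String
fence = replicate 3 '`'

-- The two interpretations a literate file could get.
data LitStyle = MarkdownLhs | ClassicLhs
  deriving (Eq, Show)

-- Look at the file as a whole: one Haskell fence opener anywhere
-- is enough to treat the entire file as markdown-lhs.
detectStyle :: String -> LitStyle
detectStyle src
  | any isHaskellFence (lines src) = MarkdownLhs
  | otherwise                      = ClassicLhs
  where
    isHaskellFence l = trimEnd l `elem` [fence ++ "hs", fence ++ "haskell"]
    trimEnd = reverse . dropWhile isSpace . reverse
```

This also shows where unexpected results could come from: a classic-lhs file that happens to contain a line consisting of such a fence opener would be classified as markdown-lhs.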

Changed 10 months ago by nicolast

  • cc ikke+ghc@… added

This [1] might be somewhat related... It's using Sphinx and reStructuredText (which I prefer), encoding Haskell code in the document as literal code blocks using the standard ReST syntax, and a custom preprocessor script. I guess a similar approach could be used for Markdown documents?

[1]  http://blog.incubaid.com/2011/10/17/literate-programming-using-sphinx-and-haskell/

Changed 10 months ago by holzensp

I would be more than happy to take reStructuredText on board as an alternative as well. If we're going to be looking at multiple different markup options, though, we should work out specifically how to determine the content type. Does reStructuredText also have meta-data definitions at the start of a file? I haven't yet played around with the Python script on the page you linked, but does GHCi know to also search for .rst files when doing an import somewhere?

Changed 9 months ago by holzensp

  • status changed from new to closed
  • resolution set to invalid

Changed 8 months ago by Syzygies

I've been thinking about this for a while; I've been using my own literate preprocessor for years, which supports here documents and HTML markup, and I'm getting ready to upload it to Hackage. And I have experience with multiple HTML markup processors.

Markup processors do local work translating to HTML, but generally still require an external script which adds headers and footers; this is in any case the more flexible approach. And any HTML markup processor worth considering will preserve existing HTML. So what one wants here is a first pass that formats the Haskell code, and exposes bare literate comments to a subsequent markup processor. There is absolutely no reason to impose a choice of markup processors. Modular design says don't!

I don't like bird tracks (they look depressingly like 1980s email messages, and I'm old enough to have a backlog of unanswered messages from that era), so I use indentation: Flush is comments, indented is code. However, GHC supports external literate preprocessors, so one can do anything one likes here. Whatever one does, the literate preprocessor should be able to output markup-ready HTML source.
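The indentation convention can be sketched in a few lines. This is only an illustration of the rule "flush is comments, indented is code", not the preprocessor mentioned above; the name `unlitIndented` is hypothetical, and the sketch assumes that indented lines are passed through verbatim and that all other lines are blanked to preserve line numbers.

```haskell
import Data.Char (isSpace)

-- Lines starting with whitespace are code and are kept verbatim;
-- flush-left and empty lines are comments and become blank lines.
-- Assumption: the indentation is kept as-is; a real tool might
-- strip a fixed prefix instead.
unlitIndented :: String -> String
unlitIndented = unlines . map keep . lines
  where
    keep l@(c:_) | isSpace c = l
    keep _                   = ""
```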
