Ticket #4230 (new feature request)

Opened 3 years ago

Last modified 11 days ago

Template Haskell: less type checking in quotations?

Reported by: simonpj Owned by:
Priority: low Milestone: 7.6.2
Component: Compiler Version: 7.6.3
Keywords: Cc: illissius@…, jwlato@…, gmainlan@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Difficulty: Unknown
Test Case: Blocked By:
Blocking: #4124, #4125, #4128, #4135, #4170 Related Tickets:

Description (last modified by simonpj) (diff)

This ticket introduces two related Template Haskell design questions. It's inspired by several other tickets, for which these issues are the underlying cause


ISSUE 1

Consider this module

  module M where
   import ...
   f x = x
   $(th1 4)
   h y = k y y
   $(th2 10)

In our present setup we typecheck down to the first splice, run the splice, then typecheck to the second splice, run that, and so on.

But GHC's main structure is:

  • parse
  • rename (resolve lexical scopes)
  • typecheck

The above setup makes this impossible. We can't even rename the definition of h until we've run the splice $(th1 4) because that might be what brings k into scope. All this is a bit hard to explain to users.

Now suppose that instead we did the following.

  • Typecheck and run top-level splices in the renamer

That is, when the renamer finds a top-level splice:

  • rename it
  • typecheck it
  • run it
  • replace the splice by the result of running it
  • and then rename *that* as if that's what the user had written in the first place

But what about nested quotes? eg

     $(th1 [| f x |])

Well we can't typecheck that quote, because we don't have a type for f and x, because we are in the renamer. But we don't need to typecheck the quote to be able to typecheck and run the splice! Remember, we already have a staging restriction that only imported functions can be run in a top-level splice.

So the proposal would mean that we don't attempt to typecheck those nested quotes. Instead they'll be checked after the top-level splice is run, the result spliced in, and the whole shebang typechecked in the context of the program as a whole. This defers the error message, but not too muce, since typechecking the output of the splice is what will happen next.

This approach would have a number of rather compelling advantages:

  • It would allow us to show the program after renaming and splice expansion, but before typechecking. That's quite helpful. It makes it easier to say that we
    • rename the program, and then
    • typecheck the program
  • Remember, GHC is also a Haskell library, and the current story (a bit of renaming, a bit of typechecking, then a bit more renaming and a bit more typechecking) complicates the API.
  • Geoff Mainland's quasi-quotation mechanism is run in the renamer (because it can expand to patterns) so it would make all the TH stuff more consistent.
  • Even more exciting, we could support pattern splices, thus
         f $(g 4) = x+y
    
    because the splice $(g 4) would be typechecked and run, perhaps expanding to (x,y), say, by the renamer *before* it goes on to do scope-analysis on x+y. This is exactly why quasiquoting *can* do pattern splicing at the moment, and TH cannot.

    This would fix a long-standing infelicity in TH, namely the absence of pattern splices.
  • In the same way we could support local declaration splices (which are currently ruled out)
           f x = let $(g [| x |]) in ....
    
  • At the top level we could get rid of the top-to-bottom bias. We could allow, say
         f x = x+x
         $(th [| g |] [| f |])
         g y = y+1
    

One disadvantage is that it'd "bake in" the staging restriction that a splice can only (typecheck and) run functions imported from another module. Currently this is only an implementation restriction, which in principle might be lifted. On the other hand, I have no plans to lift it, and in practice people don't complain about it.


ISSUE 2

Consider this:

  f :: Q Type -> Q [Dec]
  f t = [d| data T = MkT $t; 
            g (MkT x) = g+1 |]

This function f returns a declaration splice, declaring T and g. You'll see that the constructor MkT takes an argument whose type is passed (as a quotation) to f -- a type splice.

The difficulty is that we can't typecheck the declaration of 'g' until we know what $t is; and we won't know that until f is called. So

  • we can't really typecheck the declaration quote at all

In the case of term splices in a quotation we can simply give them type (forall a. a), which never gets in the way. But there is no equivalent for *type* splices. An analogous problem occurs in other declaration splices, for example

  f t = [d| instance C $t where ... |]

Here GHC's check that an instance decl is always of form

   instance C (T a b c)

falls over, again because we don't know what $t will be.

It's hard to see what to do about this.

  • We could get rid of type splices (but people like them)
  • We could refrain from typechecking quotations *at all* I have users asking me for exactly this: #4125, #4135
  • We could refrain from typechecking *declaration* quotes. But the problem still happens for terms
          [| let { f :: $t; f = e } in .. |]
    
  • We could refrain from typechecking any quotation that included a type splice. This is probably the most benign: it switches off the type checker just when it's problematic, and it's very predictable when that is. Awkward to implement though.

Do you have any other ideas?


DISCUSSION

The two issues are related of course, because both involve doing less typechecking of quotations.

That seems a pity, because it'd lose one of TH's advantages -- that of typechecking meta programs rather than just typechecking the output of the meta programs. And in the case of ISSUE 2, error messages might get delayed, perhpas to another module altogether.

However, we deliberately designed TH so that it is possible to get a type error when typechecking the result of a type-correct top-level splice. Reason: we use type Exp rather than (Exp t) as in MetaML. That gave us (a lot) more flexibility. Apart from anything else, declaration splices would be problematic in the MetaML approach.

Seen in that light, the proposals here just move a bit further in the direction of flexibility, at the cost of somewhat-increased danger of errors being reported later than is ideal.

Change History

Changed 3 years ago by Ashley Yakeley

What bad things would happen if quotations were not type-checked?

I found the early type-checking a bit surprising: my first expectation was that a quotation is just a convenient alternative to writing out all those Language.TH constructors. It's certainly possible to construct badly-typed programs using the latter.

Changed 3 years ago by igloo

  • blocking 4125 added

Changed 3 years ago by igloo

  • blocking 4170 added

Changed 3 years ago by igloo

  • blocking 4135 added

Changed 3 years ago by igloo

  • milestone set to 6.16.1

Changed 3 years ago by simonpj

Answering Ashley: type checking quotes leads to error messages when you write the quote, rather than later, when you splice that quoted code in. The author of the quote might be a library author, and the splicer a library client, so spotting errors earlier is a Good Thing. But it's not complete: by design TH does allow a library author to create badly-typed splices, unlike !MetaML which does not. Reason: it enables much more expressiveness. So it's a judgement call, not a black and white thing.

Changed 3 years ago by simonpj

  • blockedby 4128 added

(In #4128) This too related to the global discussion in #4230. If we did less kind-checking of the type quote, we wouldn't reject the quote.

Changed 3 years ago by simonpj

  • description modified (diff)

Changed 3 years ago by simonpj

  • blocking 4124 added

(In #4124) This too is related to #4230.

Changed 3 years ago by simonpj

  • blocking 4128 added
  • blockedby 4128 removed

Changed 3 years ago by Ashley Yakeley

Here's my opinion:

  • Type-checking quotations is surprising: the obvious mental model is that quotations are simply identical to explicit constructors, and type-checking only happens once all the splices and quotes are assembled.
  • Type-checking quotations is fiddly to implement, leading to a number of bugs and infelicities.
  • Type-checking quotations is incomplete: it doesn't provide a guarantee to the library consumer, since (uncheckable) explicit constructors are often used.

On the other side of the balance, type-checking quotations finds bugs at library compile time that would otherwise be found at consumer compile time.

I wouldn't miss it if it were taken out. That would be my preference. But if you can make it work without causing the various subtle bugs and surprises, I wouldn't complain.

Changed 2 years ago by illissius

  • cc illissius@… added

Changed 19 months ago by jwlato

  • cc jwlato@… added

Changed 16 months ago by igloo

  • priority changed from normal to low
  • milestone changed from 7.4.1 to 7.6.1

Changed 8 months ago by igloo

  • milestone changed from 7.6.1 to 7.6.2

Changed 12 days ago by carter

  • version changed from 6.12.3 to 7.6.3

One use case / example /motivation I encountered this week is the following:

To make the LLVM-Base library work with cabal versions 1.16 and cabal head, I wanted to write the following code

--- what i'd like to write, but can't because template haskell rejecting the branch that has the wrong api version
extractCLBI x=  
    $(if cabalVersion >= Version [1,17,0] [] 
        then [|  getComponentLocalBuildInfo 'x CLibName  |]
        else  [|      let   LocalBuildInfo  { libraryConfig = Just clbi }  = 'x in clbi |]
   )

--- horrible hack to support cabal versions both above and below 1.17
extractCLBI x=  
    $(if cabalVersion >= Version [1,17,0] [] 
        then  appE (appE  ( varE $ mkName "getComponentLocalBuildInfo") ( varE 'x) ) (conE ( mkName "CLibName")) 

        else  letE  
                [valD  (recP 
                            (mkName "LocalBuildInfo" ) 
                            [fieldPat (mkName "libraryConfig") 
                             (conP (mkName "Just")    [varP $ mkName "clbi"] ) ] ) 
                    (normalB $ varE 'x)   []    ] 
                 (varE $ mkName "clbi")  )

In this case, I need to explicitly write out the AST so that template haskell doesn't barf.

Would one near term solution be to

  1. do the harmonization between haskell-src-exts and TH (as suggested / discussed elsewhere)
  2. provide untypedExp, Dec, etc quasiquoters ?

this would be a pretty happy balance of allowing both the stronger and weaker typings, and might be a reasonable first step towards actually working on MetaHaskell?

Changed 11 days ago by simonpj

  • cc gmainlan@… added
  • difficulty set to Unknown

This ticket is more or less subsumed by  my TH blog post proposal. Geoff Mainland is working on implementing it right now.

Simon

Note: See TracTickets for help on using tickets.