GF Quick Reference Aarne Ranta April 4, 2006 % NOTE: this is a txt2tags file. % Create an html file from this file using: % txt2tags -thtml gf-reference.t2t %!style:../css/style.css %!target:html %!options: --toc This is a quick reference on GF grammars. It aims to cover all forms of expression available when writing grammars. It assumes basic knowledge of GF, which can be acquired from the [GF Tutorial http://www.grammaticalframework.org/doc/tutorial/gf-tutorial.html]. Help on GF commands is obtained on line by the help command (``help``), and help on invoking GF with (``gf -help``). ===A complete example=== This is a complete example of a GF grammar divided into three modules in files. The grammar recognizes the phrases //one pizza// and //two pizzas//. File ``Order.gf``: ``` abstract Order = { cat Order ; Item ; fun One, Two : Item -> Order ; Pizza : Item ; } ``` File ``OrderEng.gf`` (the top file): ``` --# -path=.:prelude concrete OrderEng of Order = open Res, Prelude in { flags startcat=Order ; lincat Order = SS ; Item = {s : Num => Str} ; lin One it = ss ("one" ++ it.s ! Sg) ; Two it = ss ("two" ++ it.s ! Pl) ; Pizza = regNoun "pizza" ; } ``` File ``Res.gf``: ``` resource Res = open Prelude in { param Num = Sg | Pl ; oper regNoun : Str -> {s : Num => Str} = \dog -> {s = table { Sg => dog ; _ => dog + "s" } } ; } ``` To use this example, do ``` % gf -- in shell: start GF > i OrderEng.gf -- in GF: import grammar > p "one pizza" -- parse string > l Two Pizza -- linearize tree ``` ===Modules and files=== One module per file. File named ``Foo.gf`` contains module named ``Foo``. Each module has the structure ``` moduletypename = Inherits ** -- optional open Opens in -- optional { Judgements } ``` Inherits are names of modules of the same type. Inheritance can be restricted: ``` Mo[f,g], -- inherit only f,g from Mo Lo-[f,g] -- inheris all but f,g from Lo ``` Opens are possible in ``concrete`` and ``resource``. They are names of modules of these two types, possibly qualified: ``` (M = Mo), -- refer to f as M.f or Mo.f (Lo = Lo) -- refer to f as Lo.f ``` Module types and judgements in them: ``` abstract A -- cat, fun, def, data concrete C of A -- lincat, lin, lindef, printname resource R -- param, oper interface I -- like resource, but can have oper f : T without definition instance J of I -- like resource, defines opers that I leaves undefined incomplete -- functor: concrete that opens concrete CI of A = one or more interfaces open I in ... concrete CJ of A = -- completion: concrete that CI with instantiates a functor by (I = J) instances of open interfaces ``` The forms ``param``, ``oper`` may appear in ``concrete`` as well, but are then not inherited to extensions. All modules can moreover have ``flags`` and comments. Comments have the forms ``` -- till the end of line {- any number of lines between -} --# used for compiler pragmas ``` A ``concrete`` can be opened like a ``resource``. It is translated as follows: ``` cat C ---> oper C : Type = lincat C = T T ** {lock_C : {}} fun f : G -> C ---> oper f : A* -> C* = \g -> lin f = t t g ** {lock_C = <>} ``` An ``abstract`` can be opened like an ``interface``. Any ``concrete`` of it then works as an ``instance``. ===Judgements=== ``` cat C -- declare category C cat C (x:A)(y:B x) -- dependent category C cat C A B -- same as C (x : A)(y : B) fun f : T -- declare function f of type T def f = t -- define f as t def f p q = t -- define f by pattern matching data C = f | g -- set f,g as constructors of C data f : A -> C -- same as fun f : A -> C; data C=f lincat C = T -- define lin.type of cat C lin f = t -- define lin. of fun f lin f x y = t -- same as lin f = \x y -> t lindef C = \s -> t -- default lin. of cat C printname fun f = s -- printname shown in menus printname cat C = s -- printname shown in menus printname f = s -- same as printname fun f = s param P = C | D Q R -- define parameter type P with constructors C : P, D : Q -> R -> P oper h : T = t -- define oper h of type T oper h = t -- omit type, if inferrable flags p=v -- set value of flag p ``` Judgements are terminated by semicolons (``;``). Subsequent judgments of the same form may share the keyword: ``` cat C ; D ; -- same as cat C ; cat D ; ``` Judgements can also share RHS: ``` fun f,g : A -- same as fun f : A ; g : A ``` ===Types=== Abstract syntax (in ``fun``): ``` C -- basic type, if cat C C a b -- basic type for dep. category (x : A) -> B -- dep. functions from A to B (_ : A) -> B -- nondep. functions from A to B (p,q : A) -> B -- same as (p : A)-> (q : A) -> B A -> B -- same as (_ : A) -> B Int -- predefined integer type Float -- predefined float type String -- predefined string type ``` Concrete syntax (in ``lincat``): ``` Str -- token lists P -- parameter type, if param P P => B -- table type, if P param. type {s : Str ; p : P}-- record type {s,t : Str} -- same as {s : Str ; t : Str} {a : A} **{b : B}-- record type extension, same as {a : A ; b : B} A * B * C -- tuple type, same as {p1 : A ; p2 : B ; p3 : C} Ints n -- type of n first integers ``` Resource (in ``oper``): all those of concrete, plus ``` Tok -- tokens (subtype of Str) A -> B -- functions from A to B Int -- integers Strs -- list of prefixes (for pre) PType -- parameter type Type -- any type ``` As parameter types, one can use any finite type: ``P`` defined in ``param P``, ``Ints n``, and record types of parameter types. ===Expressions=== Syntax trees = full function applications ``` f a b -- : C if fun f : A -> B -> C 1977 -- : Int 3.14 -- : Float "foo" -- : String ``` Higher-Order Abstract syntax (HOAS): functions as arguments: ``` F a (\x -> c) -- : C if a : A, c : C (x : B), fun F : A -> (B -> C) -> C ``` Tokens and token lists ``` "hello" -- : Tok, singleton Str "hello" ++ "world" -- : Str ["hello world"] -- : Str, same as "hello" ++ "world" "hello" + "world" -- : Tok, computes to "helloworld" [] -- : Str, empty list ``` Parameters ``` Sg -- atomic constructor VPres Sg P2 -- applied constructor {n = Sg ; p = P3} -- record of parameters ``` Tables ``` table { -- by full branches Sg => "mouse" ; Pl => "mice" } table { -- by pattern matching Pl => "mice" ; _ => "mouse" -- wildcard pattern } table { n => regn n "cat" -- variable pattern } table Num {...} -- table given with arg. type table ["ox"; "oxen"] -- table as course of values \\_ => "fish" -- same as table {_ => "fish"} \\p,q => t -- same as \\p => \\q => t t ! p -- select p from table t case e of {...} -- same as table {...} ! e ``` Records ``` {s = "Liz"; g = Fem} -- record in full form {s,t = "et"} -- same as {s = "et";t= "et"} {s = "Liz"} ** -- record extension: same as {g = Fem} {s = "Liz" ; g = Fem} -- tuple, same as {p1=a;p2=b;p3=c} ``` Functions ``` \x -> t -- lambda abstract \x,y -> t -- same as \x -> \y -> t \x,_ -> t -- binding not in t ``` Local definitions ``` let x : A = d in t -- let definition let x = d in t -- let defin, type inferred let x=d ; y=e in t -- same as let x=d in let y=e in t let {...} in t -- same as let ... in t t where {...} -- same as let ... in t ``` Free variation ``` variants {x ; y} -- both x and y possible variants {} -- nothing possible ``` Prefix-dependent choices ``` pre {"a" ; "an" / v} -- "an" before v, "a" otherw. strs {"a" ; "i" ;"o"}-- list of condition prefixes ``` Typed expression ``` -- same as t, to help type inference ``` Accessing bound variables in ``lin``: use fields ``$1, $2, $3,...``. Example: ``` fun F : (A : Set) -> (El A -> Prop) -> Prop ; lin F A B = {s = ["for all"] ++ A.s ++ B.$1 ++ B.s} ``` ===Pattern matching=== These patterns can be used in branches of ``table`` and ``case`` expressions. Patterns are matched in the order in which they appear in the grammar. ``` C -- atomic param constructor C p q -- param constr. applied to patterns x -- variable, matches anything _ -- wildcard, matches anything "foo" -- string 56 -- integer {s = p ; y = q} -- record, matches extensions too -- tuple, same as {p1=p ; p2=q} p | q -- disjunction, binds to first match x@p -- binds x to what p matches - p -- negation p + "s" -- sequence of two string patterns p* -- repetition of a string pattern ``` ===Sample library functions=== ``` -- lib/prelude/Predef.gf drop : Int -> Tok -> Tok -- drop prefix of length take : Int -> Tok -> Tok -- take prefix of length tk : Int -> Tok -> Tok -- drop suffix of length dp : Int -> Tok -> Tok -- take suffix of length occur : Tok -> Tok -> PBool -- test if substring occurs : Tok -> Tok -> PBool -- test if any char occurs show : (P:Type) -> P ->Tok -- param to string read : (P:Type) -> Tok-> P -- string to param toStr : (L:Type) -> L ->Str -- find "first" string -- lib/prelude/Prelude.gf param Bool = True | False oper SS : Type -- the type {s : Str} ss : Str -> SS -- construct SS cc2 : (_,_ : SS) -> SS -- concat SS's optStr : Str -> Str -- string or empty strOpt : Str -> Str -- empty or string bothWays : Str -> Str -> Str -- X++Y or Y++X init : Tok -> Tok -- all but last char last : Tok -> Tok -- last char prefixSS : Str -> SS -> SS postfixSS : Str -> SS -> SS infixSS : Str -> SS -> SS -> SS if_then_else : (A : Type) -> Bool -> A -> A -> A if_then_Str : Bool -> Str -> Str -> Str ``` ===Flags=== Flags can appear, with growing priority, - in files, judgement ``flags`` and without dash (``-``) - as flags to ``gf`` when invoked, with dash - as flags to various GF commands, with dash Some common flags used in grammars: ``` startcat=cat use this category as default lexer=literals int and string literals recognized lexer=code like program code lexer=text like text: spacing, capitals lexer=textlit text, unknowns as string lits unlexer=code like program code unlexer=codelit code, remove string lit quotes unlexer=text like text: punctuation, capitals unlexer=textlit text, remove string lit quotes unlexer=concat remove all spaces unlexer=bind remove spaces around "&+" optimize=all_subs best for almost any concrete optimize=values good for lexicon concrete optimize=all usually good for resource optimize=noexpand for resource, if =all too big ``` For the full set of values for ``FLAG``, use on-line ``h -FLAG``. ===File paths=== Colon-separated lists of directories searched in the given order: ``` --# -path=.:../abstract:../common:prelude ``` This can be (in order of growing preference), as first line in the top file, as flag to ``gf`` when invoked, or as flag to the ``i`` command. The prefix ``--#`` is used only in files. If the environment variabls ``GF_LIB_PATH`` is defined, its value is automatically prefixed to each directory to extend the original search path. ===Alternative grammar formats=== **Old GF** (before GF 2.0): all judgements in any kinds of modules, division into files uses ``include``s. A file ``Foo.gf`` is recognized as the old format if it lacks a module header. **Context-free** (file ``foo.cf``). The form of rules is e.g. ``` Fun. S ::= NP "is" AP ; ``` If ``Fun`` is omitted, it is generated automatically. Rules must be one per line. The RHS can be empty. **Extended BNF** (file ``foo.ebnf``). The form of rules is e.g. ``` S ::= (NP+ ("is" | "was") AP | V NP*) ; ``` where the RHS is a regular expression of categories and quoted tokens: ``"foo", CAT, T U, T|U, T*, T+, T?``, or empty. Rule labels are generated automatically. **Probabilistic grammars** (not a separate format). You can set the probability of a function ``f`` (in its value category) by ``` --# prob f 0.009 ``` These are put into a file given to GF using the ``probs=File`` flag on command line. This file can be the grammar file itself. **Example-based grammars** (file ``foo.gfe``). Expressions of the form ``` in Cat "example string" ``` are preprocessed by using a parser given by the flag ``` --# -resource=File ``` and the result is written to ``foo.gf``. ===References=== [GF Homepage http://www.grammaticalframework.org/] A. Ranta, Grammatical Framework: A Type-Theoretical Grammar Formalism. //The Journal of Functional Programming//, vol. 14:2. 2004, pp. 145-189.