language-c-0.3.1: Analysis and generation of C code





Abstract syntax of C source and header files.

The tree structure is based on the grammar in Appendix A of K&R. The abstract syntax simplifies the concrete syntax by merging similar concrete constructs into a single type of abstract tree structure: declarations are merged with structure declarations, parameter declarations and type names, and declarators are merged with abstract declarators.

With K&R we refer to ``The C Programming Language'', second edition, Brain W. Kernighan and Dennis M. Ritchie, Prentice Hall, 1988. The AST supports all of C99 and several GNU extensions


C translation units

data CTranslUnit Source

Complete C tranlsation unit (C99 6.9, K&R A10)

A complete C translation unit, for example representing a C header or source file. It consists of a list of external (i.e. toplevel) declarations.


CTranslUnit [CExtDecl] NodeInfo 

data CExtDecl Source

External C declaration (C99 6.9, K&R A10)

Either a toplevel declaration, function definition or external assembler.


data CFunDef Source

C function definition (C99 6.9.1, K&R A10.1)

A function definition is of the form CFunDef specifiers declarator decllist? stmt.

  • specifiers are the type and storage-class specifiers of the function. The only storage-class specifiers allowed are extern and static.
  • The declarator must be such that the declared identifier has function type. The return type shall be void or an object type other than array type.
  • The optional declaration list decllist is for old-style function declarations.
  • The statement stmt is a compound statement.

data CDecl Source

C declarations (K&R A8, C99 6.7), including structure declarations, parameter declarations and type names.

A declaration is of the form CDecl specifiers init-declarator-list, where the form of the declarator list's elements depends on the kind of declaration:

1) Toplevel declarations (K&R A8, C99 6.7 declaration)

  • C99 requires that there is at least one specifier, though this is merely a syntactic restriction
  • at most one storage class specifier is allowed per declaration
  • the elements of the non-empty init-declarator-list are of the form (Just declr, init?, Nothing). The declarator declr has to be present and non-abstract and the initialization expression is optional.

2) Structure declarations (K&R A8.3, C99 struct-declaration)

Those are the declarations of a structure's members.

  • do not allow storage specifiers
  • in strict C99, the list of declarators has to be non-empty
  • the elements of init-declarator-list are either of the form (Just declr, Nothing, size?), representing a member with optional bit-field size, or of the form (Nothing, Nothing, Just size), for unnamed bitfields. declr has to be non-abstract.
  • no member of a structure shall have incomplete type

3) Parameter declarations (K&R A8.6.3, C99 6.7.5 parameter-declaration)

  • init-declarator-list must contain at most one triple of the form (Just declr, Nothing, Nothing), i.e. consist of a single declarator, which is allowed to be abstract (i.e. unnamed).

4) Type names (A8.8, C99 6.7.6)

  • do not allow storage specifiers
  • init-declarator-list must contain at most one triple of the form (Just declr, Nothing, Nothing). where declr is an abstract declarator (i.e. doesn't contain a declared identifier)

data CStructUnion Source

C structure or union specifiers (K&R A8.3, C99

CStruct tag identifier struct-decls c-attrs represents a struct or union specifier (depending on tag).

  • either identifier or the declaration list struct-decls (or both) have to be present.

Example: in struct foo x;, the identifier is present, in struct { int y; } x the declaration list, and in struct foo { int y; } x; both of them.

  • c-attrs is a list of __attribute__ annotations associated with the struct or union specifier

data CStructTag Source

A tag to determine wheter we refer to a struct or union, see CStructUnion.



data CEnum Source

C enumeration specifier (K&R A8.4, C99

CEnum identifier enumerator-list attrs represent as enum specifier

  • Either the identifier or the enumerator-list (or both) have to be present.
  • If enumerator-list is present, it has to be non-empty.
  • The enumerator list is of the form (enumeration-constant, enumeration-value?), where the latter is an optional constant integral expression.
  • attrs is a list of __attribute__ annotations associated with the enumeration specifier


CEnum (Maybe Ident) (Maybe [(Ident, Maybe CExpr)]) [CAttr] NodeInfo 

Declaration attributes

data CDeclSpec Source

C declaration specifiers and qualifiers

Declaration specifiers include at most one storage-class specifier (C99 6.7.1), type specifiers (6.7.2) and type qualifiers (6.7.3).


CStorageSpec CStorageSpec

storage-class specifier or typedef

CTypeSpec CTypeSpec

type name

CTypeQual CTypeQual

type qualifier

partitionDeclSpecs :: [CDeclSpec] -> ([CStorageSpec], [CAttr], [CTypeQual], [CTypeSpec], Bool)Source

Seperate the declaration specifiers

Note that inline isn't actually a type qualifier, but a function specifier. __attribute__ of a declaration qualify declarations or declarators (but not types), and are therefore seperated as well.

data CStorageSpec Source

C storage class specifier (and typedefs) (K&R A8.1, C99 6.7.1)


CAuto NodeInfo


CRegister NodeInfo


CStatic NodeInfo


CExtern NodeInfo


CTypedef NodeInfo


CThread NodeInfo

GNUC thread local storage

data CTypeSpec Source

C type specifier (K&R A8.2, C99 6.7.2)

Type specifiers are either basic types such as char or int, struct, union or enum specifiers or typedef names.

As a GNU extension, a typeof expression also is a type specifier.

isSUEDef :: CTypeSpec -> BoolSource

returns True if the given typespec is a struct, union or enum definition

data CTypeQual Source

C type qualifiers (K&R A8.2, C99 6.7.3), function specifiers (C99 6.7.4), and attributes.

const, volatile and restrict type qualifiers and inline function specifier. Additionally, __attribute__ annotations for declarations and declarators.

data CAttr Source

__attribute__ annotations

Those are of the form CAttr attribute-name attribute-parameters, and serve as generic properties of some syntax tree elements.


CAttr Ident [CExpr] NodeInfo 


data CDeclr Source

C declarator (K&R A8.5, C99 6.7.5) and abstract declarator (K&R A8.8, C99 6.7.6)

A declarator declares a single object, function, or type. It is always associated with a declaration (CDecl), which specifies the declaration's type and the additional storage qualifiers and attributes, which apply to the declared object.

A declarator is of the form CDeclr name? indirections asm-name? attrs _, where name is the name of the declared object (missing for abstract declarators), declquals is a set of additional declaration specifiers, asm-name is the optional assembler name and attributes is a set of attrs is a set of __attribute__ annotations for the declared object.

indirections is a set of pointer, array and function declarators, which modify the type of the declared object as described below. If the declaration specifies the non-derived type T, and we have indirections = [D1, D2, ..., Dn] than the declared object has type (D1 indirect (D2 indirect ... (Dn indirect T))), where

  • (CPtrDeclr attrs) indirect T is attributed pointer to T
  • (CFunDeclr attrs) indirect T is attributed function returning T
  • (CArrayDeclr attrs) indirect T is attributed array of elemements of type T

Examples (simplified attributes):

  • x is an int
 int x;
 CDeclr "x" []
  • x is a restrict pointer to a const pointer to int
 const int * const * restrict x;
 CDeclr "x" [CPtrDeclr [restrict], CPtrDeclr [const]]
  • f is an function return a constant pointer to int
 int* const f();
 CDeclr "f" [CFunDeclr [],CPtrDeclr [const]]
  • f is a constant pointer to a function returning int
 int (* const f)(); ==>
 CDeclr "f" [CPtrDeclr [const], CFunDeclr []]

data CDerivedDeclr Source

Derived declarators, see CDeclr

Indirections are qualified using type-qualifiers and generic attributes, and additionally

  • The size of an array is either a constant expression, variable length (*) or missing; in the last case, the type of the array is incomplete. The qualifier static is allowed for function arguments only, indicating that the supplied argument is an array of at least the given size.
  • New style parameter lists have the form Right (declarations, isVariadic), old style parameter lists have the form Left (parameter-names)


CPtrDeclr [CTypeQual] NodeInfo

Pointer declarator CPtrDeclr tyquals declr

CArrDeclr [CTypeQual] CArrSize NodeInfo

Array declarator CArrDeclr declr tyquals size-expr?

CFunDeclr (Either [Ident] ([CDecl], Bool)) [CAttr] NodeInfo

Function declarator CFunDeclr declr (old-style-params | new-style-params) c-attrs

data CArrSize Source

Size of an array


CNoArrSize Bool
CUnknownSize isCompleteType
CArrSize Bool CExpr
CArrSize isStatic expr


data CInit Source

C initialization (K&R A8.7, C99 6.7.8)

Initializers are either assignment expressions or initializer lists (surrounded in curly braces), whose elements are themselves initializers, paired with an optional list of designators.


CInitExpr CExpr NodeInfo

assignment expression

CInitList CInitList NodeInfo

initialization list (see CInitList)

type CInitList = [([CDesignator], CInit)]Source

Initializer List

The members of an initializer list are of the form (designator-list,initializer). designator-list is allowed to be empty - in this case the initializer refers to the ''next'' member of the compound type (see C99 6.7.8).

Examples (simplified expressions and identifiers):

 -- { [0], [3] = 4, [2] = 5, 8 }
 let init1 = ([CArrDesig 0, CArrDesig 3], CInitExpr 4)
     init2 = ([CArrDesig 2], CInitExpr 5)
     init3 = ([], CInitExpr 8)
 in  CInitList [init1, init2, init3]
 -- { .s = { {2,3} , .a = { 1 } } }
 let init_1  = [ ([], CInitExpr 1) ]
     init_23 = zip (repeat []) [CInitExpr 2, CInitExpr 3]
     init_s_1 = ([], CInitList init_23)
     init_s_a = ([CMemberDesig "a"], CInitList init_1)
     init_s  = ((CMemberDesig "s"), CInitList [init_s_1,init_s_a])
 in  CInitList [init_s]

data CDesignator Source


A designator specifies a member of an object, either an element or range of an array, or the named member of a struct / union.


CArrDesig CExpr NodeInfo

array position designator

CMemberDesig Ident NodeInfo

member designator

CRangeDesig CExpr CExpr NodeInfo

array range designator CRangeDesig from to _ (GNU C)


data CStat Source

C statement (K&R A9, C99 6.8)


CLabel Ident CStat [CAttr] NodeInfo

An (attributed) label followed by a statement

CCase CExpr CStat NodeInfo

A statement of the form case expr : stmt

CCases CExpr CExpr CStat NodeInfo

A case range of the form case lower ... upper : stmt

CDefault CStat NodeInfo

The default case default : stmt

CExpr (Maybe CExpr) NodeInfo

A simple statement, that is in C: evaluating an expression with side-effects and discarding the result.

CCompound [Ident] [CBlockItem] NodeInfo

compound statement CCompound localLabels blockItems at

CIf CExpr CStat (Maybe CStat) NodeInfo

conditional statement CIf ifExpr thenStmt maybeElseStmt at

CSwitch CExpr CStat NodeInfo

switch statement CSwitch selectorExpr switchStmt, where switchStmt usually includes case, break and default statements

CWhile CExpr CStat Bool NodeInfo

while or do-while statement CWhile guard stmt isDoWhile at

CFor (Either (Maybe CExpr) CDecl) (Maybe CExpr) (Maybe CExpr) CStat NodeInfo

for statement CFor init expr-2 expr-3 stmt, where init is either a declaration or initializing expression

CGoto Ident NodeInfo

goto statement CGoto label

CGotoPtr CExpr NodeInfo

computed goto CGotoPtr labelExpr

CCont NodeInfo

continue statement

CBreak NodeInfo

break statement

CReturn (Maybe CExpr) NodeInfo

return statement CReturn returnExpr

CAsm CAsmStmt NodeInfo

assembly statement

data CBlockItem Source

C99 Block items

Things that may appear in compound statements: either statements, declarations or nested function definitions.


CBlockStmt CStat

A statement

CBlockDecl CDecl

A local declaration

CNestedFunDef CFunDef

A nested function (GNU C)

data CAsmStmt Source

GNU Assembler statement

 CAsmStatement type-qual? asm-expr out-ops in-ops clobbers _

is an inline assembler statement. The only type-qualifier (if any) allowed is volatile. asm-expr is the actual assembler epxression (a string), out-ops and in-ops are the input and output operands of the statement. clobbers is a list of registers which are clobbered when executing the assembler statement

data CAsmOperand Source

Assembler operand

CAsmOperand argName? constraintExpr arg specifies an operand for an assembler statement.


data CExpr Source

C expression (K&R A7)

  • these can be arbitrary expression, as the argument of sizeof can be arbitrary, even if appearing in a constant expression
  • GNU C extensions: alignof, __real, __imag, ({ stmt-expr }), && label and built-ins

data CBinaryOp Source

C binary operators (K&R A7.6-15)



remainder of division


shift left


shift right






less or equal


greater or equal




not equal


bitwise and


exclusive bitwise or


inclusive bitwise or


logical and


logical or

data CUnaryOp Source

C unary operator (K&R A7.3-4)



prefix increment operator


prefix decrement operator


postfix increment operator


postfix decrement operator


address operator


indirection operator


prefix plus


prefix minus


one's complement


logical negation

data CBuiltin Source

GNU Builtins, which cannot be typed in C99


data CStrLit Source

Attributed string literals


CStrLit CString NodeInfo 

liftStrLit :: CStrLit -> CConstSource

Lift a string literal to a C constant