> module Modules where

> import List (partition, nub)
> import Maybe (fromJust)

> -- implementation only
> import QualNames


Module system parameters
========================

The module system is parametrized on four type parameters:

    * The type of identifiers used in the source code of a program (#qn#).
      Those may be qualified or unqualified names.
    
    * The type of module names and name qualifiers (#m#).
      These are used to name modules, in import declarations, export lists,
      and qualified names.

    * The type of simple names (#n#).  Those are used
      to name entities in definitions or when exported.

    * The type of entities (#ent#).  These are the objects referred to by 
      identifiers in the program and are introduced by declarations 
      (functions, type constructors, value constructors, field selectors,
      classes, class methods).


The parameters are not completely arbitrary, in fact there is a close relation
between the first three.  To capture this, we use the class #QualNames#:

| class QualNames qn m n | qn -> m n, m n -> qn where
|   -- observers
|   isQual          :: qn -> Bool
|   getQualifier    :: qn -> m
|   getQualified    :: qn -> n
| 
|   -- constructors
|   mkUnqual        :: n -> qn 
|   mkQual          :: m -> n -> qn 

In order for the types to be suited for the purposes of the module system,
we require that they be related by the #QualNames# class.
This essentially forces #i# to be a type similar to the following:

| data Name m n = Unqualified n | Qualified m n


We also require that there be a special element in #m#, which we shall
call "Main", and a special element in #qn# which shall be denoted as "main".


For the type of entities, we require that there is a way to identify value
constructors,  and that we know the the ownership relation between
entities (e.g. a type constructor may "own" a number of value constructors).
Formally this is captured by the requirement that the type #ent#
be an instance of the following class:

| class Entity ent where
|   isCon   :: ent -> Bool
|   owns    :: ent -> [ent]


Finally, all four types should be in the #Eq# class, i.e. values of the
appropriate types should be comparable for equality. 


The abstract syntax
===================


A Haskell module consists of a name, an export specification, a number
of import declarations and a number of local declarations.  This 
suggests the following data structure:

> data Mod qn m n ent = Mod
>     { modName       :: m
>     , modExport     :: Maybe [ExpListEntry qn m]
>     , modImports    :: [Import m n]
>     , modDefines    :: Rel n ent
>     }                                                 

In the concrete syntax of Haskell, it is valid to just have the module body
(imports and declarations).  This is considered to be an abbreviation for 
a module with name "Main" (i.e. the special element of the #m# type),
and export specification exporting a single entity named "main"
(the special element of the #qn# type).  (sect 5.1. of report)

It is also valid to omit just the export specification (sect 5.2 of report),
but this is not considered to be an abbreviation.  Instead it is a special
kind of specification, one saying that only (and all) locally defined
entities are to be exported.  

> data ExpListEntry qn m    = Ent (EntSpec qn) | ModuleExp m

In an export list a programmer may write two kinds of things - either
the name of an entity with (perhaps) some subordinate names (here modeled
by an #EntSpec#), or the so called "module" export, which in concrete
Haskell syntax is written as #module M# (where #M# is a module alias).

An #EntSpec# consists of the name of an entity, with possibly the
names of some subordinate names.  An explicit list of subordinate names
is modeled with the #Some# component of the #SubSpec# data type,
while a request for all available subordinate names (the #(..)# notation
of Haskell) is modeled using #All#.

> data EntSpec j          = EntSpec j (Maybe (SubSpec j)) deriving Eq
> data SubSpec j          = All | Some [j]                deriving Eq

The #EntSpec# type is in fact shared between export lists and import
declarations, as they are structurally the same.  The only difference 
between the two is the types used to name entities (i.e. the #j# parameter
above).  In the case of an export list we use #qn# (the type of program
identifiers - qualified or not), while in the case of import declarations
we use #n# (the type of simple names).

We note that our abstract syntax is slightly more permissive than Haskell's
as one may add subordinate names to any name, while in Haskell
it is syntactically invalid to add subordinate names to lower case
identifiers.  

This brings us to the data structure used to model import declarations:

> data Import m n     = Import
>     { impQualified  :: Bool
>     , impFrom       :: m 
>     , impAs         :: m
>     , impSpec       :: (ImpMethod, [EntSpec n])
>     }                  
>
> data ImpMethod = Importing | Hiding

An import may be qualified (or not), contains the name of the module from which 
entities are to be imported, an alias to be used in qualified names for entities
which came through this import, and finally a specification of which entities 
are to be imported.  In concrete Haskell terms, the only required field is
the name of the source module, but the omissions of any of the other fields
are considered to be abbreviations as follows:

    * The value of #impQualified# is #False# unless the #qualified#
    keyword was present in the source code, in which case it is #True#.

    * If the #as# clause of a module is omitted, this field gets the same
    value as the #impFrom# field, i.e. the name of the source module
    is used to qualify names.

    * If the import specification is omitted, it is assumed to be
    #(Hiding, [])# (or #hide ()# in Haskell syntax), essentially saying that
    all entities are to be imported.
    

Semantics
=========

The abstract syntax above was nicely structured in the sense that bigger 
pieces were made out of smaller ones.  It would be nice to be able to
give meanings of the bigger pieces in terms of the meanings of the smaller
ones and this is the approach we take.  Each piece of the abstract syntax
is modeled as a relation transformer, i.e. given a relation it
produces a new relation. To put the whole idea in perspective
we first need a brief digression to relations.


Relations
---------

We use relations to model the association between names and entities.
This seems like a natural choice as in a single framework we may
represent undefined names (i.e. names which refer to no entities),
ambiguous names (names referring to more than one entity) and proper
names (referring to exactly one entity).  It is not enough to restrict
ourselves to just using proper names, as there are valid Haskell
programs containing ambiguous names in scope (report section 5.5.2),
of course there is the requirement that such names are not used.
Furthermore, one may use the information provided by the relation to
generate useful error messages.  We model relations as lists of pairs:

> type Rel a b = [(a,b)]

The exports of a module are modeled by a relation between simple names
and entities (i.e. #Rel n ent#), while the names in scope of a module
are modeled as a relation between identifiers and entities (i.e. #Rel qn ent#).
Ultimately our goal is for each module to come up with two relations:

    * the "export" relation of type #Rel n ent#
    * the "in scope" relation of type #Rel qn ent#


Functions used to compute both "exports" and "in scope" relations
-----------------------------------------------------------------

From the structure of a module we know what entities it defines.  Now
we need to consider how may a programmer refer to those entities
(i.e. what is the relation between names and entities defined in a module).
This is specified by the #locals# function, which computes a part
of the "in scope" relation of a module.  When an entity is defined in
a module one may refer to it by either using its simple name 
or by qualifying it with the name of the module 
(report section 5.5.1).  Formally this is:

| locals :: ...
|   => Mod qn m n ent 
|   -> Rel qn ent

> locals mod = [ (mkUnqual n, e) | (n,e) <- defEnts ]
>           ++ [ (mkQual m n, e) | (n,e) <- defEnts ]
>   where
>   m       = modName mod
>   defEnts = modDefines mod

This function is also used to give semantics to modules without an
export specifications.


As we saw above, there is a certain commonality between the structure of
export lists and import declarations, in the sense that they both made
use of the #EntSpec# data type.  It is nice that we (nearly) have this
commonality at the semantic level as well.  When occurring in an export
list and a non-hiding import declaration, #EntSpec#s have essentially the same
meaning.  A difference occurs in the #hiding# case of an import declaration,
in which case it was decided that ambiguities are not important and 
the semantics is slightly more permissive.

The design choice made in Haskell was that the exports of a module should
be unambiguous in the sense that all exported entities have distinct
(simple) names.  The one exception to this rule is that a type and
a data constructor may have the same name.  The reason for this is
practical - although two different entities have the same name, one
may determine from the context which entity is meant, so the ambiguity
is not crucial (as types and data constructors are never interchangeable).
The only situation in which one may not distinguish between such
entities from the context is in an export lists and import declarations.
So in Haskell a different disambiguating mechanism is used in such cases -
the only way to export (or import) a value constructor is as a subordinate 
name.  Unfortunately this is not the case in #hiding# imports, where
if a name refers to more than one entity, all of them are hidden (i.e.
not imported).

To model this asymmetry in Haskell, we add
an additional parameter to the meaning function of #EntSpec#s, namely a
predicate over entities.  As we shall see later this predicate would
either be #not . isCon# (i.e. ignore constructors), or #const True#
for the #hiding# imports case, where everything is treated in the same way.

| mEntSpec :: ...
|   => (ent -> Bool) -> EntSpec j
|   -> Rel j ent -> Rel j ent
 
> mEntSpec p (EntSpec x spec) r = xents ++ xsubs
>   where
>   xents       = [ (y,e) | (y,e) <- r, y == x, p e ]
>
>   xsubs = case spec of 
>       Nothing         -> []
>       Just All        -> allxsubs
>       Just (Some xs)  -> [ (y,e) | (y,e) <- allxsubs, y `elem` xs ]
>
>   allxsubs    = [ (y,e) | (_,xe) <- xents, s <- owns xe, (y,e) <- r, e == s ]

It is not important if the relation passed as a parameter to #mEntSpec# is
an "export" or an "in scope" one.  The idea is that we take all entities
which may be named #x# (#xents#), and then we add
all subordinate entities belonging to entities in #xent# and
satisfy #spec# (#xsubs#).
If there was no specification, then we add no entities,
if the #(..)# specification was used, then we add all the subordinate entities
in scope (#allxsubs#) and finally if an explicit list of entities was provided
we only consider elements of #allxsubs# which conform to the specification.


Export relations
----------------

To compute the exported entities of a module, we need to know what is the 
relation between names and entities in scope of the module (#inscp#).

| exports :: ...
|   => Mod qn m n ent
|   -> Rel qn ent -> Rel n ent

> exports m inscp = [ (getQualified qn, e) | (qn,e) <- exps ]
>   where
>   exps = case modExport m of
>     Nothing   -> locals m
>     Just es   -> concat [ mExpListEntry e inscp | e <- es ]

If the programmer omitted the export specification, we just export the 
locally defined entities using the #locals# function.
Alternatively if an explicit export list was present,  we take
the union of the meanings of all entries in the list.
Finally to obtain an "export" relation, we remove all qualifiers.
Note that after removing the qualifiers (or even before that)
we might end up with a relation containing ambiguous names.
This is not valid in Haskell (except for the special case mentioned in
the previous section), but we choose to handle this and all other error
related issues at a later stage.

There are two different kinds of entries which may occur in an export list are:

    * export entity by qualifier (the so called "module" export).  The
    meaning of such an export specification is described in the report 
    as follows (report 5.2 - item 5 in list):

The form "module M" abbreviates the set of all entities whose unqualified name, e, is in scope, and for which the qualified name M.e is also in scope and refers to the same entity as e.

    * export entity by name, i.e. when the programmer uses the name of an
    entity in the export list with perhaps some subordinate names.  We already
    have a function to handle such situations - #mEntSpec#.  Since a data
    constructors may not be exported on its own (report 5.2 - item 2 in list)
    we pass #not . isCon# as the additional parameter to #mEntSpec#.

| mExpListEntry :: ...
|   => ExpListEntry qn m   
|   -> Rel qn ent -> Rel qn ent

> mExpListEntry (ModuleExp m) inscp =
>               [ (x,ent) | (x,ent) <- unqs, (qual m x,ent) `elem` qs ]
>   where
>   (qs,unqs) = partition (isQual . fst) inscp
>
> mExpListEntry (Ent ispec) inscp = mEntSpec (not . isCon) ispec inscp

 
In scope relations
------------------

To compute what names are in scope within a module, we need to know the
exports of all imported modules.  The parameter #expsOf# is a function
mapping module names to their exports.  
Simply put, the entities in scope of a module are all locally
defined entities and the union of all entities imported through an import
declaration:

| inscope :: ...
|   => Module qn m n ent
|   -> (m -> Rel n ent) -> Rel qn ent 
  
> inscope m expsOf = imports ++ locals m 
>   where
>   imports = concat [ mImp imp (expsOf (impFrom imp)) | imp <- modImports m ]


The meaning of an import declaration is a relation transformer, which 
changes "export" relations to "in scope" ones:

| mImp :: ...
|   => Import m n
|   -> Rel n ent -> Rel qn ent
 
> mImp imp exps 
>   | impQualified imp  = qs
>   | otherwise         = unqs ++ qs
>   where
>   m           = impAs imp
>   qs          = [ (mkQual m n, e) | (n,e) <- incoming ]
>   unqs        = [ (mkUnqual n, e) | (n,e) <- incoming ]
>   incoming    = mImpSpec (impSpec imp) exps

We use the semantic function for import specifications to compute what entities
come in scope (#incoming#), and then we adjust their names appropriately.
In the case of a #qualified# import, we only consider qualified names (#qs#),
otherwise we also add unqualified names (#unqs#).


The meaning of import specifications is computed by #mImpSpec#, which is
essentially a "front end" to the #mEntSpec# function.
We have two cases to consider - "importing" and #hiding# specifications.
In the first case we take the union of the relations corresponding to 
the entity specifications.  We ignore data constructors, as those may
only be imported as subordinate names (#not . isCon#).
For #hiding# specifications we essentially take the complement of
the non-hiding case, except now we do not ignore data constructors.

| mImpSpec :: ...
|   => ImpSpec m n 
|   -> Rel n ent -> Rel n ent
 
> mImpSpec (method, specs) exps = case method of
>   Importing   -> ents (not . isCon)
>   Hiding      -> [ exp | exp <- exps, exp `notElem` ents (const True) ]
>   where
>   ents p = [ x | spec <- specs, x <- mEntSpec p spec exps ]


Iteration
---------

The reader might have noticed that to compute the exports of a module we
need to know what is in scope, and to compute what is in scope we need
to know certain exports.  If we allow for mutually recursive modules, our
equations become mutually recursive and care needs to be taken when
solving them.  The idea is that the "export" and "in scope" relations for a
group of mutually recursive modules are computed at the same time (as they all
depend on each other).  In fact one could process _all_ modules in a 
program together, which is less efficient from an implementation point
of view, but simplifies this presentation as we need not supply the code
for computing groups of mutually recursive modules and dependencies between
those (a standard graph algorithm, followed by a topological sort).

To represent the "in scope" and "exports" relations of a group of modules
we use lists.  Since relations may in general be ordered by inclusion,
we may also order lists (of equal length) of relations as follows:

> (<<=) :: (Eq a, Eq b) => [Rel a b] -> [Rel a b] -> Bool
> xss <<= yss = and (zipWith subEq xss yss)
>   where
>   xs `subEq` ys   = all (`elem` ys) xs 

Note that this order is different from the one usually used in Haskell and
is only well defined on finite lists.  Since Haskell programs only have
finitely many modules this is not a problem.  In fact the lists we deal
with all have the same length, namely the number of modules in the program,
so one could think of them as tuples of relations.

We not only have an ordering structure on the lists of relations, we also
have a lattice structure which is directly inherited from the lattice structure
on the relations (operations defined element-wise as in the ordering above).
The lattice we deal with is finite, as each module may only define finitely
many entities and each entity may only have finitely many names.
Names get associated with entities in two ways:

    * by local declaration, in which case an entity receives two names
    (qualified and unqualified)

    * by import declaration, in which case an entity receives either one or
    two names.

Since there are only finitely many local and import declarations,
an entity may only have finitely many names. 


| computeInsOuts :: ...         
|   => [Mod qn m n ent]       
|   -> [(Rel qn ent, Rel n ent)]

> computeInsOuts otherExps mods = inscps `zip` exps
>   where
>   inscps          = nub `map` computeIs exps 
>   exps            = lfpAfter nextExps (repeat [])
>
>   nextExps        = computeEs . computeIs 
>
>   computeEs is    = zipWith (\m -> nub . exports m) mods is
>   computeIs es    = (\m -> inscope m (toFun es)) `map` mods
> 
>   toFun es m = maybe (otherExps m) (es !!) $ lookup m mod_ixs
>   -- toFun es m = let Just pos = lookup m mod_ixs in es !! pos
>   mod_ixs     = (modName `map` mods) `zip` [0 .. ] 

> lfpAfter :: Eq a => (a -> a) -> (a -> a) 
> lfpAfter f x = let fx = f x in if fx == x then x else lfpAfter f fx

The semantic functions #exports# and #inscope# deal with only one module
at a time, but it is easy to generalize them to work on multiple modules.
Given a list of modules and a list of "in scope" relations we may 
apply #exports# in "parallel" to compute a list of "export" relations
(#computeEs#).

If we are given a list of "export" relations, we may convert it to
a function associating each module with its exports (#toFun#). This
is achieved with the aid of the association list #mod_ixs#, which 
relates modules to their positions in the list (tuple) of modules.
Now that we have a function relating modules to their exports,
we may apply #inscope# to compute the "in scope" relations of the modules
(#computeIs#).

We observe that both #exports# and #inscope# are monotonic with respect to the
relation ordering.  This follows essentially from the monotonicity of
 #mEntSpec# and the definition of "module exports" (the first case
of #mExpListEntry#).  But then #computeEs# and #computeIs# are also monotonic
when lists are ordered as described above, and hence their composition
is monotonic.  Since we work in a finite lattice this is enough to conclude
that the function #nextExps# has a least fixed point.

We may than define the exports of the argument list of modules as the least
fixed point of #nextExps#.  Intuitively, given a list of "export" relations
 #nextExps# will compute what should be in scope in each module, and then based
on that compute a better approximation of what should be exported.
Similarly, the "inscope" relations of the modules may be defined as the
least fixed point of #computeIs . computeEs#, but using the well known
fact $fix (f . g) = f (fix (g . f))$ we obtain the alternative definition
used above.