Changes between Version 13 and Version 14 of SafeHaskell
- Timestamp:
- 11/21/10 17:28:57 (2 years ago)
Legend:
- Unmodified
- Added
- Removed
- Modified
-
SafeHaskell
v13 v14 1 1 = Safe Haskell = 2 2 3 This is a proposal for a Haskell extension through which people can safely execute untrusted Haskell code, much the way web browsers currently run untrusted Java and !JavaScript, or the way the Spin and Singularity operating systems ran untrusted Modula-3 and C#/Sing#. 3 This is a proposal for a Haskell extension through which people can safely execute untrusted Haskell code, much the way web browsers currently run untrusted Java and !JavaScript, or the way the Spin and Singularity operating systems ran untrusted Modula-3 and C#/Sing#. 4 4 5 5 == Setup == 6 6 7 Safe Haskell assumes the following setup.8 * A server S wants to run code provided by untrusted (and perhaps malicious) clients X.9 * Clients X may send untrusted Haskell code to S ''in source form''10 * The server compiles this untrusted code, with the `-XSafe` flag.11 * If compilation succeeds, S can safely run the code, knowing that it cannot cause unsafe effects.12 * The server S trusts13 * GHC, its supporting tools 14 * Any Haskell modules that S chooses to compile without `-XSafe`15 * The server S does not trust client code. That is why it compiles itwith `-XSafe`.7 The proposal addresses security in the following scenario. 8 * An application A needs to incorporate a module M provided by an untrusted (and perhaps malicious) programmer. 9 * Module M is made available ''in source form'' to the user constructing A. 10 * The user constructing A compiles M with a new flag, `-XSafe`. 11 * If compilation succeeds, A can safely import M, knowing that M cannot cause effects that are not visible in the types of M's functions. 12 * The user constructing A must trust: 13 * GHC, its supporting tools, and 14 * Any Haskell modules of A compiled without `-XSafe`. 15 * The user does not trust M, which is why it compiles M with `-XSafe`. 16 16 17 More specifically, there are two parts to th isproposed extension:17 More specifically, there are two parts to the proposed extension: 18 18 19 1. A n option to GHC ({{{-XSafe}}}) that causes it to rejectany source code that might produce unsafe effects.19 1. A new GHC option (`-XSafe`) enabling a "Safe" dialect of Haskell in which GHC rejects any source code that might produce unsafe effects. 20 20 21 2. An option to GHC ({{{-XTrusted}}}) indicating that, even though a module might invoke unsafe functions internally, the set of exported symbols cannot be used in an unsafe way. 21 2. Another new GHC option (`-XTrustworthy`) indicating that, though a module may invoke unsafe functions internally, the set of exported symbols cannot be used in an unsafe way. The `-XTrustworthy` option makes a small extension to the syntax of import statements, adding a `safe` keyword: 22 23 impdecl -> `import` [`safe`] [`qualified`] modid [`as` modid] [impspec] 22 24 23 A module compiled with `-XSafe` can only import modules compiled with `-XTrusted` or `-XSafe`.25 Roughly speaking, import declarations are either safe or unsafe, while modules are classified as either trusted or untrusted. A safe import declaration causes compilation to fail if the imported module is not trusted. In the Safe dialect, all import declarations are implicitly safe. When compiling with `-XTrustworthy`, import declarations with the `safe` keyword are safe, and those without unsafe. A module can be trusted only if it was compiled with `-XSafe` or `-XTrustworthy`, but there are additional restrictions as discussed below under '''module trust'''. 24 26 25 27 == Safety Goal == 26 28 27 As long as no module compiled with {{{-XTrusted}}} contains a vulnerability, the goal of {{{-XSafe}}}is to guarantee the following properties:29 As long as no module compiled with `-XTrustworthy` contains a vulnerability, the safe dialect's goal is to guarantee the following properties: 28 30 29 * '''Referential transparency.''' Functions in a module compiled {{{-XSafe}}}must be deterministic. Moreover, evaluating them should have no side effects, and should not halt the program (except by throwing uncaught exceptions or looping forever).31 * '''Referential transparency.''' Functions in the Safe dialect must be deterministic. Moreover, evaluating them should have no side effects, and should not halt the program (except by throwing uncaught exceptions or looping forever). 30 32 31 * '''Constructor access control.''' Code in a module compiled {{{-XSafe}}} must not be able to examine or synthesize data values usingconstructors the module cannot import.33 * '''Constructor access control.''' Safe code must not be able to examine or synthesize data values using data constructors the module cannot import. 32 34 33 Note that {{{-XSafe}}} should not prevent use of the symbol {{{IO}}}. Authors of normal (trusted) code may wish to use {{{ {-# LANGUAGE Safe #-} }}} as a means of ensuring they do not accidentally invoke unsafe actions, directly or indirectly. 35 * '''Semantic consistency.''' Any expression that compiles both with and without the import of a Safe module must have the same meaning in both cases. (E.g., {{{1 + 1 == 3}}} must remain `False` when you add the import of a Safe module.) 34 36 35 Of course, if an untrusted module exports an {{{IO}}} action, that action may have arbitrary side effects. Compiling the module with {{{-XSafe}}} does not meaningfully restrict the effects of exported {{{IO}}} actions. Hence, an application importing an untrusted but safe module may safely invoke pure functions from the untrusted module, but must avoid executing {{{IO}}} actions from the module. 37 The Safe dialect is intended to be of use for both normal (trusted) and untrusted code. Authors of trusted modules may wish to include `{-# LANGUAGE Safe #-}` pragmas to ensure they do not accidentally invoke unsafe actions (directly or indirectly), or to allow other Safe code to import their modules. 38 39 Note that the safe dialect does not prevent use of the symbol `IO`. If an untrusted module exports an `IO` action, that action may have arbitrary side effects. Simply compiling a module with `-XSafe` does not meaningfully restrict the effects of exported `IO` actions. Hence, an application A importing an untrusted but safe module M may safely invoke pure functions from M, but must avoid executing `IO` actions unless some other mechanism has been used to ensure M can only construct `IO` actions that conform to A's security goal. (See `-ultrasafe` below for one such mechanism.) 40 41 == Module trust == 42 43 Recall that a safe import of a module M fails unless M is trusted. M is trusted when two conditions hold: 44 45 1. M is compiled with `-XSafe` or `-XTrustWorthy`, in which case we say M is ''trustable'', and 46 47 2. The modules on which M depends (transitively) for safety all reside in packages that the application trusts. 48 49 Determining trust requires two modifications to the way GHC manages modules. First, the interface file format must change to record whether each module is trustable, and if so what its trust dependencies are. Second, we need run-time options to specify which packages are trusted by an application. 50 51 We therefore change the interface file format to record for each module M whether or not M is trustable. If M is trustable, the interface file contains a list of ''trust dependencies'', which are (package, module) pairs on which the current application depends for safety. The trust dependencies of a module are determined as follows: 52 53 * ''If a module is compiled with `-XTrustworthy`'', then its trust dependencies always include itself, plus the union of trust dependencies of all modules imported with the `safe` keyword. 54 55 * ''If a module is compiled with `-XSafe` but not `-XTrustworthy`'', then its trust dependencies are the union trust dependencies of all modules it imports. 56 57 * If a module was compiled with neither `-XSafe` nor `-XTrustworthy`, or, regardless of these options, if a module contains a `{-# LANGUAGE Untrustworthy #-}` pragma, then the interface file marks the module as not trustable, so there are no trust dependencies. 58 59 Currently, in any given run of the compiler, GHC classifies each package as either exposed or hidden. To incorporate trust, we now allow a second bit specifying whether each package is trusted or untrusted. This bit will be controllable by two new options to `ghc-pkg`, `trust` and `distrust`, which are analogous to `expose` and `hide`. (The trust bit is meaningless while a package is hidden, but will stay in the package database for when packages are exposed.) 60 61 On the command line, several new options control which packages are trusted: 62 63 * `-trust` __P__ - exposes package __P__ (if it was hidden), and considers it a trusted package regardless of the contents of the package database. 64 65 * `-distrust` __P__ - exposes package __P__ (if it was hidden), and considers it an untrusted package, regardless of the contents of the package database. 66 67 * `-distrust-all-packages` - considers all packages untrusted unless they are explicitly trusted by subsequent command-line options. 68 69 * A convenience option `-ultrasafe` is equivalent to `-distrust-all-packages` `-XNoForeignFunctionInterface`. 70 71 None of these options can be specified or overwritten by `OPTIONS_GHC` pragmas in the Safe dialect. 36 72 37 73 == Threats == 38 74 39 The following aspects of Haskell can be used to violate the safety goal, and thus would need to be disallowed when {{{-XSafe}}} is in effect. ''Please add more examples to this list, as some are likely missing.''75 The following aspects of Haskell can be used to violate the safety goal, and thus need to be disallowed or modified for the Safe dialect. ''Please add more issues to this list, as some are likely missing.'' 40 76 41 77 * Some symbols in {{{GHC.Prim}}} can be used to do very unsafe things. At least one of these symbols, {{{realWorld#}}}, is magically exported by {{{GHC.Prim}}} even though it doesn't appear in the {{{GHC.Prim}}} module export list. ''Are there other such magic symbols in this or other modules?'' 42 78 43 * A number of functions can be used to violate safety. Many of these have names prefixed with {{{unsafe}}} (e.g., {{{unsafePerformIO}}}, {{{unsafeIterleaveIO}}}, {{{unsafeCoerce}}} {{{unsafeSTToIO}}}, ...). However, not all unsafe functions fit this pattern. For instance, {{{inlinePerformIO}}} and {{{fromForeignPtr}}} from the {{{bytestring}}}package are unsafe.79 * A number of functions can be used to violate safety. Many of these have names prefixed with `unsafe` (e.g., {{{unsafePerformIO}}}, {{{unsafeIterleaveIO}}}, {{{unsafeCoerce}}} {{{unsafeSTToIO}}}, ...). However, not all unsafe functions fit this pattern. For instance, {{{inlinePerformIO}}} and {{{fromForeignPtr}}} from the `bytestring` package are unsafe. 44 80 45 81 * Code that defines hand-crafted instances of {{{Typeable}}} can violate safety by causing {{{typeOf}}} to return identical results on two distinct types, then using {{{cast}}} to coerce between the two unsafely. ''Are there other classes? Perhaps {{{Data}}} should also be restricted? Simon says {{{Ix}}} doesn't need to be protected any more.'' … … 47 83 * Certain exposed constructors of otherwise mostly safe data types allow unsafe actions. For instance, the {{{PS}}} constructor of {{{Data.ByteString.ByteString}}} contains a pointer, offset, and length. Code that can see the pointer value can act in a non-deterministic way by depending on the address rather than value of a {{{ByteString}}}. Worse, code that can use {{{PS}}} to construct {{{ByteString}}}s can include bad lengths that will lead to stray pointer references. 48 84 49 * The {{{ForeignFunctionInterface}}} extension is mostly safe, but `foreign import`declarations that import a function with a non-`IO` type must be disallowed.85 * The {{{ForeignFunctionInterface}}} extension is mostly safe, but {{{foreign import}}} declarations that import a function with a non-`IO` type must be disallowed. 50 86 51 87 * {{{TemplateHaskell}}} is also particularly dangerous, as it can cause side effects even at compilation time. 52 88 53 * The {{{OverlappingInstances}}} extension may allow unsafe actions, because untrusted code can potentially redefine a type instance (by containing a more specific instance definition) in a way that changes the behaviour of code importing the untrusted module. To avoid that, overlapping instances declarations must come only from modules compiled with {{{-XTrusted}}} or modules compiled with {{{-XSafe}}}. It is not safe to allow an overlapping instance declaration for a given class in a modules compiled with {{{-XTrusted}}} and another overlapping instance declaration for the same class in a module compiled with {{{-XSafe}}}. It is also not safe to allow overlapping instances placed in different modules compiled with {{{-XSafe}}}. '''SLPJ: this may be undesirable, but does it violate the Safety Goal?'''89 * The {{{OverlappingInstances}}} extension can be used to violate semantic consistency, because untrusted code can potentially redefine a type instance (by containing a more specific instance definition) in a way that changes the behaviour of code importing the untrusted module. 54 90 55 * Likewise, {{{RULES}}} and {{{SPECIALIZE}}} pragmas can change the behavior of trusted code in unanticipated ways . '''SLPJ: same question'''91 * Likewise, {{{RULES}}} and {{{SPECIALIZE}}} pragmas can change the behavior of trusted code in unanticipated ways, violating semantic consistency. 56 92 57 * {{{OPTIONS_GHC}}} is dangerous in unfiltered form, as it could potentially expose packages with trusted but not trustworthy modules. {{{-XSafe}}} must be processed last after all other options. If previous options conflict with {{{-XSafe}}}, they must be overrided or compilation must fail.93 * {{{OPTIONS_GHC}}} is dangerous in unfiltered form, as it could potentially make packages trusted that the invoking user doesn't trust. 58 94 59 95 * The {{{StandaloneDeriving}}} extension can be used to violate constructor access control by defining instances of {{{Read}}} and {{{Show}}} to examine and construct data values with inaccessible constructors. … … 63 99 == Implementation details == 64 100 65 * An interface file should record whether a module is safe. When the module is safe, the interface file should additionally include a set of trusted modules on which the module depends. '''SLPJ:what is the function of the "set of trusted modules on which it depends"?''' There could be some option like {{{--trust-only}}} that restricts the set of packages from which trusted modules may be imported. Thus one could restrict what modules safe code imports in a way that is independent of whatever happens to be installed in a user's {{{~/.cabal}}} directory. 66 67 * A module compiled with {{{-XTrusted}}} should be marked safe; its set of trusted modules should contain itself and only itself. 68 69 * A module compiled with {{{-XSafe}}} should only be able to import modules that are marked safe. Its set of trusted modules should be the union of the trusted sets of all the modules it imports. 101 * The `-XSafe` option must be processed last after all other options. If previous options conflict with {{{-XSafe}}}, they must be overwritten or compilation must fail. Alternatively, maybe `-XSafe` and `{-# LANGUAGE Safe #-)` should always be processed before all other pragmas, especially `LANGUAGE` and `OPTIONS_GHC`, and any subsequent dangerous pragmas should cause compilation to fail. 70 102 71 103 * {{{GHC.Prim}}} will need to be made (or just kept) unsafe. 72 104 73 * {{{-XSafe}}} should disallow the {{{ FFI}}}, {{{TemplateHaskell}}}, {{{OverlappingInstances}}}, {{{StandaloneDeriving}}}, {{{GeneralizedNewtypeDeriving}}}, and {{{CPP}}} language extensions, as well as {{{RULES}}} and {{{SPECIALIZE}}} pragmas.105 * {{{-XSafe}}} should disallow the {{{TemplateHaskell}}}, {{{StandaloneDeriving}}}, {{{GeneralizedNewtypeDeriving}}}, and {{{CPP}}} language extensions, as well as {{{RULES}}} and {{{SPECIALIZE}}} pragmas. 74 106 75 * {{{OPTIONS_GHC}}} pragmas will have to be filtered. Some options, (e.g., -fno-warn-unused-do-bind) are totally fine, but many others are likely problematic (e.g., {{{-cpp}}}, which provides access to the local file system at compilation time, or {{{-F}}} which allows an arbitrary file to be executed, possibly even one named {{{/afs/}}}... and hence entirely under an attacker's control).107 * Overlapping instance declarations must either all reside in modules compiled without `-XSafe`, or else all reside in the same module. It violates semantic consistency to allow Safe code to change the instance definition associated with a particular type. 76 108 77 * Libraries will progressively need to be updated to export safe interfaces, which may require moving unsafe functions into separate modules, or adding new {{{ {-# LANGUAGE Safe #-} }}} modules that re-export a safe subset of symbols. Ideally, most modules in widely-used libraries would eventually contain either {{{ {-# LANGUAGE Safe -#} }}} or {{{ {-# LANGUAGE Trusted -#} }}} pragmas, except for internal modules or a few modules exporting unsafe symbols. Maybe haddock could add some indicator to make it obvious which modules are safe.109 * {{{OPTIONS_GHC}}} pragmas will have to be filtered. Some options, (e.g., `-fno-warn-unused-do-bind`) are totally fine, but many others are likely problematic (e.g., {{{-cpp}}}, which provides access to the local file system at compilation time, or {{{-F}}} which allows an arbitrary file to be executed, possibly even one named {{{/afs/}}}... and hence entirely under an attacker's control). 78 110 79 * The {{{-XTrusted}}} command-line option and corresponding pragma do not increase what a module can do--the only effect is to mark the module's interface file as safe. In particular, if both {{{-XTrusted}}} and {{{-XSafe}}} are supplied, then any unsafe actions will still cause a compilation error. A plausible use for both pragmas simultaneously is to prune the list of trusted modules--for instance if a module imports a bunch of trusted modules but does not use any of their trusted features, or only uses those features in a very limited way. If the code happens also to be safe, the programmer may want to add {{{-XSafe}}} to catch accidental unsafe actions.111 * Libraries will progressively need to be updated to export safe interfaces, which may require moving unsafe functions into separate modules, or adding new {{{ {-# LANGUAGE Safe #-} }}} modules that re-export a safe subset of symbols. Ideally, most modules in widely-used libraries would eventually contain either {{{ {-# LANGUAGE Safe -#} }}} or {{{ {-# LANGUAGE Trustworthy -#} }}} pragmas, except for internal modules or a few modules exporting unsafe symbols. Maybe haddock could add some indicator to make it obvious which modules are safe. 80 112 81 * It might be nice to add a third option, {{{ {-# LANGUAGE Unsafe -#} }}}, which prevents a module from compiling with {{{-XSafe}}} and disables the {{{-XTrusted}}} option so that the interface file is guaranteed not to be marked safe. This option could be used in seemingly safe modules that export constructors that would cause other modules to do unsafe things. (The {{{PS}}} constructor discussed above is an example of a dangerous constructor that could potentially be defined in a module that happily compiles with {{{-XSafe}}}.) This isn't strictly necessary, since one could always just enable another extension such as {{{FFI}}}, but having a specific {{{Unsafe}}} language option seems cleaner. 113 * The {{{-XTrustworthy}}} command-line option and corresponding pragma do not increase what a module can do--the only effect is to mark the module's interface file as trustable and add the module to its own trust dependency set. If both {{{-XTrustworthy}}} and {{{-XSafe}}} are supplied, then the language is still restricted to the Safe dialect and any unsafe actions will still cause a compilation error. A plausible use for both pragmas simultaneously is to prune the list of trusted modules--for instance, if a module imports a bunch of trusted modules but does not use any of their trusted features, or only uses those features in a very limited way. If the code happens also to be safe, the programmer may want to add {{{-XSafe}}} to catch accidental unsafe actions. 114 115 * The option `{-# LANGUAGE Untrustworthy -#}` is not incompatible with `{-# LANGUAGE Safe -#}`. The former causes the interface file to be marked not trustable, while the latter causes the source code to be confined to the safe dialect. `Untrustworthy` should be used in seemingly safe modules that export constructors that would cause other modules to do unsafe things. (The `PS` constructor discussed above is an example of a dangerous constructor that could potentially be defined in a module that happily compiles with `-XSafe`.) 82 116 83 117 == References ==
