smuggler2: GHC Source Plugin that helps to minimise imports and generate explicit exports

[ compiler-plugin, development, library, mpl, refactoring ]

Usage

Add smuggler2 to the build dependencies of your project. Then add the following to ghc-options: -fplugin=Smuggler2.Plugin. See the README https://hackage.haskell.org/package/smuggler2 for more details and options.



Flags

Manual Flags

Name      Description                                     Default
debug     Enable debugging support                        Disabled
threaded  Build with support for multithreaded execution  Enabled

Use -f <flag> to enable a flag, or -f -<flag> to disable that flag.


Versions 0.3.2.1, 0.3.2.2, 0.3.3.2, 0.3.4.1, 0.3.4.2, 0.3.5.1, 0.3.5.2, 0.3.6.1, 0.3.6.2
Change log CHANGELOG.md
Dependencies base (>=4.9 && <4.16), containers (>=0.6.0 && <0.7), directory (>=1.3.3 && <1.4), filepath (>=1.4.2 && <1.5), ghc (>=8.6.5 && <8.11), ghc-boot (>=8.6.5 && <8.11), ghc-exactprint (>=0.6.3 && <0.7), ghc-paths (>=0.1.0 && <0.2), syb (>=0.7.1 && <0.8), typed-process (>=0.2.6 && <0.3) [details]
License MPL-2.0
Copyright 2020 jrp2014, Dmitrii Kovanikov
Author jrp2014, Dmitrii Kovanikov, Veronika Romashkina
Maintainer jrp2014
Category Development, Refactoring, Compiler Plugin
Home page https://github.com/jrp2014/smuggler2
Bug tracker https://github.com/jrp2014/smuggler2/issues
Source repo head: git clone https://github.com/jrp2014/smuggler2
Uploaded by jrp at 2020-06-09T21:41:52Z
Distributions
Executables ghc-smuggler2
Downloads 1370 total (25 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Status Docs uploaded by user
Build status unknown [no reports yet]

Readme for smuggler2-0.3.4.1


smuggler2


Smuggler2 is a Haskell GHC Source Plugin that automatically

  • rewrites module imports to produce a minimal set. This may make code easier to read because the provenance of imported names is explicit.

  • adds or replaces explicit exports to produce a maximalist set for hand pruning. All values, types and classes defined in a module are exported (excluding those that are imported). It does not check whether an exported name is used elsewhere in your package. Limiting exports may make it easier for ghc to optimise some code.

While writing code, it may be convenient to import a complete module (by not specifying what is to be imported from it) and then get Smuggler2 to limit the import to include only the names that are used.
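
As a hand-written illustration (not actual plugin output, whose layout may differ), suppose a module was developed with the open imports shown in the comment below. After a Smuggler2 run one would expect import lines that name only what is used, roughly as follows. The function names are made up for the example.

```haskell
-- Before (convenient while writing code):
--
--   import Data.Char
--   import Data.List
--   import Data.Maybe
--
-- After (the kind of explicit import list the plugin aims for):
import Data.Char (toUpper)
import Data.List (sortOn)
import Data.Maybe (mapMaybe)

-- | Upper-case the first character of a string.
capitalise :: String -> String
capitalise []       = []
capitalise (c : cs) = toUpper c : cs

-- | Sort strings by length, shortest first.
shortestFirst :: [String] -> [String]
shortestFirst = sortOn length

-- | Keep only the strictly positive numbers.
positives :: [Int] -> [Int]
positives = mapMaybe (\n -> if n > 0 then Just n else Nothing)
```

A reader of the rewritten module can see at a glance that toUpper comes from Data.Char without consulting the documentation of each imported module.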

How to use

Install smuggler2 using cabal install --lib smuggler2.

If you also want the ghc wrapper, install it using cabal install exe:smuggler2.

Adding Smuggler2 to your dependencies

Add smuggler2 to the dependencies of your project and to your compiler flags. For example, you could include in your project cabal file something like

flag smuggler2
  description: Rewrite sources to cleanup imports, and create explicit exports
  default:     False
  manual:      True

common smuggler-options
  if flag(smuggler2)
    ghc-options: -fplugin=Smuggler2.Plugin
    build-depends: smuggler2 >= 0.3 && < 0.4

and then import: smuggler-options in the appropriate library or executable sections.

The use of the flag allows you to build with or without source processing. For example,

$ cabal build -fsmuggler2

using the example above.

You might use this approach to refine your imports or get a starting point for your exports, but not rewrite them every time you compile. The use of a flag means that you can also exclude smuggler2 dependencies from your final builds.

Alternatively, using a local version

If you have installed smuggler2 from a local copy of this repository, you may need to add -package smuggler2 to your ghc-options if you did not install using the --lib flag to cabal install.

Or use a ghc wrapper

The smuggler2 package provides an executable ghc-smuggler2 that calls ghc with the -fplugin=Smuggler2.Plugin argument (followed by any others that you supply). This allows you to run the plugin over your sources without modifying your .cabal file:

$ cabal build --with-compiler=ghc-smuggler2

or just

$ cabal build -w ghc-smuggler2

Options

Smuggler2 has several (case-insensitive) options, which can be set by adding -fplugin-opt=Smuggler2.Plugin:<option> flags to your ghc-options:

  • NoImportProcessing - do no import processing

  • PreserveInstanceImports - remove unused imports, but preserve a library import stub, such as import Mod (), to import only instances of typeclasses from it. (The default.)

  • MinimiseImports - remove unused imports, including any that may be needed only to import typeclass instances. This may, therefore, stop the module from compiling.

  • NoExportProcessing - do no export processing

  • AddExplicitExports - add an explicit list of all available exports (excluding those that are imported) if there is no existing export list. (The default.) You may want to edit it to keep specific values, types or classes local to the module. At present, a typeclass and its class methods are exported individually. You may want to replace those exports with an abbreviation such as C(..).

  • ReplaceExports - replace any existing module export list with one containing all available exports (which, again, you can, of course, then prune to your requirements).
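
To illustrate the note on AddExplicitExports, here is a small hypothetical module (Shapes, HasArea, Circle and describe are invented for this sketch). The comment shows the general shape of an export list that names a class and its method separately; the exact layout the plugin emits may differ. The live header shows the hand-edited abbreviation.

```haskell
-- An export list that names the class and its method separately
-- would look something like:
--
--   module Shapes (HasArea, area, Circle (..), describe) where
--
-- Abbreviated by hand, the class and its methods become one entry:
module Shapes
  ( HasArea (..)
  , Circle (..)
  , describe
  ) where

-- | Things that have an area.
class HasArea a where
  area :: a -> Double

-- | A circle, given by its radius.
newtype Circle = Circle Double

instance HasArea Circle where
  area (Circle r) = pi * r * r

-- | Render the area of any shape as text.
describe :: HasArea a => a -> String
describe x = "area = " ++ show (area x)
```

Both export lists expose the same names; the abbreviated form simply keeps the class and its methods together, and picks up any methods added to the class later.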

Any other option value is used to generate a source file with a new extension of the option value (new in the following example) rather than replacing the original file.

    ghc-options: -fplugin=Smuggler2.Plugin -fplugin-opt=Smuggler2.Plugin:new

This will create output files with a .new suffix rather than overwriting the originals.
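
The option rules described above (case-insensitive names, two defaults, and any unrecognised value treated as a file extension) can be modelled in a few lines. This is a sketch of the documented behaviour only, not the plugin's own code; the Options record, its field names and parseOptions are hypothetical.

```haskell
import Data.Char (toLower)

data ImportAction = NoImportProcessing | PreserveInstanceImports | MinimiseImports
  deriving (Eq, Show)

data ExportAction = NoExportProcessing | AddExplicitExports | ReplaceExports
  deriving (Eq, Show)

-- | Hypothetical settings record for this sketch.
data Options = Options
  { importAction :: ImportAction
  , exportAction :: ExportAction
  , newExtension :: Maybe String  -- Nothing: overwrite the original file
  } deriving (Eq, Show)

-- | The defaults as documented in the option list.
defaultOptions :: Options
defaultOptions = Options PreserveInstanceImports AddExplicitExports Nothing

-- | Fold the option strings, left to right, into a settings record.
-- Known names are matched case-insensitively; anything else is taken
-- to be the extension for the output file.
parseOptions :: [String] -> Options
parseOptions = foldl step defaultOptions
  where
    step opts raw = case map toLower raw of
      "noimportprocessing"      -> opts { importAction = NoImportProcessing }
      "preserveinstanceimports" -> opts { importAction = PreserveInstanceImports }
      "minimiseimports"         -> opts { importAction = MinimiseImports }
      "noexportprocessing"      -> opts { exportAction = NoExportProcessing }
      "addexplicitexports"      -> opts { exportAction = AddExplicitExports }
      "replaceexports"          -> opts { exportAction = ReplaceExports }
      _                         -> opts { newExtension = Just raw }
```

Under this model, parseOptions ["MINIMISEIMPORTS", "new"] selects MinimiseImports, keeps the default export behaviour, and directs output to files with a new extension.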

Smuggler2 tries not to change files when there is no work to do. So you can just run ghcid as usual:

$ ghcid --command='cabal repl'

Caveats

Because cabal and ghc do not fully distinguish dependent packages from plugins, you will probably want to install your project's build dependencies into your local package db before enabling Smuggler2. Otherwise they will all be processed by it too as your project builds, which should do no harm but will increase your build time.

Smuggler2 is robust: it can chew through the Agda codebase of over 370 modules with complex interdependencies and is tripped up only by

  • a couple of ambiguous exports (are we trying to export something defined in the current module or something with the same name from an imported module?)
  • a couple of imports where both qualified and unqualified versions of the module are imported and there are references to both qualified and unqualified versions of the same names
  • some qualified record fields that are overlooked
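
For orientation, the second case has roughly the following shape. This is a hand-written example of the pattern described, not a reproduction of the Agda modules involved, and the function names are invented.

```haskell
import Data.List (sort)
import qualified Data.List

-- | Refers to the name through the unqualified import.
ascending :: [Int] -> [Int]
ascending = sort

-- | Refers to the same name through the qualified import.
descending :: [Int] -> [Int]
descending = reverse . Data.List.sort
```

Picking one spelling of the name throughout the module before running the plugin is probably the simplest way to sidestep the problem.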

But there are some caveats, most of which are easy enough to work around (while still offering the benefit of a great reduction in keyboard work):

  • Smuggler2 rewrites the existing imports, rather than attempting to prune them. (This is a more aggressive approach than smuggler, which focuses on removing redundant imports.) It has advantages and disadvantages. The advantage is that a minimal set of imports is generated in a reproducible format. So you can just import a library without specifying any specific imports and Smuggler2 will add an explicit list of things that are used from it. This can be a useful check and can better document your modules. The disadvantage is that imports may be reordered, comments and blank lines dropped, internal imports mixed with external, etc.

  • By default Smuggler2 does not remove imports completely because an import may be being used only to import instances of typeclasses. So it will leave stubs like

    import Mod ()
    

    that you may want to remove manually. Alternatively use the MinimiseImports option to remove them anyway, at the risk of producing code that fails to compile.

  • CPP files will not be processed correctly: the imports will be generated for current CPP settings and any CPP annotations in the import block will be discarded. This may be a particular problem if you are writing code for several generations of ghc and base, for example. Nevertheless, Smuggler2 will generate a new CPP preprocessed output file with a -cpp suffix. retrie solves this problem by generating all possible versions of the module (exponential in the number of #if directives), operating on each version individually, and splicing results back into the original file. A tour de force!

  • Smuggler2 depends on the current ghc compiler and base library to check whether an import is redundant. Different versions of the compiler may, of course, need slightly different imports, typically from base. The base library changelog provides some details of what was made available when.

  • Multiple separate import lines referring to the same library are not consolidated

  • Literate Haskell .lhs files will be processed into ordinary Haskell files with a -lhs suffix.

  • hiding clauses may not be properly analysed. So hiding things that are not used may not be spotted.

  • The test suite does not seem to run reliably on Windows. This is probably more of an issue with the way that the tests are run, than Smuggler2 itself.

  • Currently cabal does not have a particular way of specifying plugins that would allow cleaner separation of user code and plugin code. (See, for example, https://gitlab.haskell.org/ghc/ghc/issues/11244 and https://github.com/haskell/cabal/issues/2965.)
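
The caveat about import Mod () stubs can be seen with base alone. Text.Show.Functions exports no names, only an orphan Show instance for functions, so an empty import list is the whole point of importing it. The function showFunction below is invented for the example.

```haskell
-- Nothing is imported by name: the empty list brings in only the
-- orphan instance `Show (a -> b)` that this base module defines.
import Text.Show.Functions ()

-- | Compiles only because the instance above is in scope.
showFunction :: (Int -> Int) -> String
showFunction = show
```

Deleting the import line leaves no Show instance for Int -> Int and the module no longer compiles. This is the kind of breakage that the MinimiseImports option risks and that the default PreserveInstanceImports behaviour is meant to avoid.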

For contributors

Requirements:

  • ghc-8.6.5, ghc-8.8.3 and ghc-8.10.1: Smuggler2 will not compile with earlier versions.
  • The test golden values are for ghc-8.10.1 and ghc-8.8.3. Some of them fail on ghc-8.6.5 because it seems to need to import Data.Bool whereas later versions of GHC don't. The results compile on ghc-8.6.5 and later anyway, but the imports are not as minimal for later versions as they could be.
  • cabal >= 3.0 (ideally 3.2)

How to build

$ cabal update
$ cabal build

To build with debugging:

$ cabal build -fdebug

Currently this just adds an -fdump-minimal-imports parameter to GHC compilation.

How to run tests

There is a tasty-golden-based test suite that can be run by

$ cabal test smuggler-test --enable-tests

Further help can be found by

$ cabal run smuggler-test -- --help

(note the extra --)

For example, if you are running on ghc-8.6.5 you can

$ cabal run smuggler2-test -- --accept

to update the golden outputs to the current results of (failing) tests.

It is sometimes necessary to run cabal clean before running tests to ensure that old build artefacts do not lead to misleading results.

smuggler-test uses cabal exec ghc internally to run a test. The cabal command that is to be used to do that can be set using the CABAL environment variable. This may be helpful for certain workflows where cabal is not in the current path, or you want to add extra flags to the cabal command.

The test suite does not seem to run reliably on Windows.

Importing a test module from another test module in the same directory is likely to lead to race conditions: Tasty runs tests in parallel, and so will try to generate the same smuggler2 output both when the imported module is tested directly and when it is processed as a dependency of the importing module. Put the imported module in a subdirectory to avoid this issue, as the test harness only looks for tests in test\tests and not its subdirectories.

Implementation approach

smuggler2 uses the ghc-exactprint library to modify the source code. The documentation for the library is fairly spartan, and the library is not widely used, at least in publicly available code, so the use here can, no doubt, be optimised.

The library is needed because the annotated AST that GHC generates does not have enough information to reconstitute the original source. Some parts of the renamed syntax tree (for example, imports) are not found in the typechecked one. ghc-exactprint provides parsers that preserve this information, which is stored in a separate Anns Map used to generate properly formatted source text.

To make manipulation of GHC's AST and ghc-exactprint's Anns easier, ghc-exactprint provides a set of Transform functions. These are intended to facilitate making changes to the AST and adjusting the Anns to suit the changes.

These functions are said to be under heavy development. It is not entirely obvious how they are intended to be used or composed. The approach provided by retrie wraps an AST and Anns into a single type that seems to make AST transformations easier to compose and reduces the risk of the Anns and AST getting out of sync as it is being transformed, something with which the type system doesn't help you since the Anns are stored as a Map.
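
The idea of carrying the AST and its annotations in one value can be shown with a toy model. Everything here is hypothetical: the "AST" is just a list of module names, the annotations are a Map from module name to a comment, and the Annotated type is only loosely inspired by the approach described above; it is not the retrie or ghc-exactprint API. The sketch needs the containers package, which ships with GHC.

```haskell
import qualified Data.Map.Strict as Map

-- | Toy "AST": just a list of imported module names.
type Ast = [String]

-- | Toy annotations: a comment attached to a module name.
type Anns = Map.Map String String

-- | AST and annotations travel together in one value.
data Annotated = Annotated
  { astOf  :: Ast
  , annsOf :: Anns
  } deriving (Eq, Show)

-- | Remove an import and its annotation in one step, so the two
-- components cannot drift apart.
removeImport :: String -> Annotated -> Annotated
removeImport name (Annotated ast anns) =
  Annotated (filter (/= name) ast) (Map.delete name anns)

-- | Invariant: every annotation key refers to a node still in the AST.
inSync :: Annotated -> Bool
inSync (Annotated ast anns) = all (`elem` ast) (Map.keys anns)
```

If the list and the Map were passed around separately, nothing would stop a transformation from filtering one and forgetting the other; routing every change through functions like removeImport keeps the invariant checked by inSync.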

Imports

smuggler2 uses GHC to generate a set of minimal imports. It

  • parses the original file
  • dumps the minimal imports that GHC generates and parses them back in (to pick up the annotations needed for printing)
  • drops implicit imports (such as Prelude) and, optionally, imports that are for instances only
  • replaces the original imports with minimal ones
  • exactPrints the result back over the original file (or one with a different suffix, if that was specified as an option to smuggler2)

This round tripping is needed because the AST that ghc provides does not have enough information in it to reconstitute the source (which is why ghc-exactprint exists).
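
The filtering steps in the list above can be sketched on a toy representation of import lines. This is a simplified model under stated assumptions: the Import record and all function names are hypothetical, and the real plugin works on GHC's syntax tree and ghc-exactprint annotations rather than on strings.

```haskell
import Data.List (intercalate)

-- | Toy stand-in for one minimal import line (not a GHC type).
data Import = Import
  { importModule :: String
  , importItems  :: [String]  -- empty list: imported for instances only
  } deriving (Eq, Show)

-- | Drop imports that GHC supplies implicitly.
dropImplicit :: [Import] -> [Import]
dropImplicit = filter ((/= "Prelude") . importModule)

-- | Keep instance-only stubs when the flag is True, drop them otherwise.
dropInstanceOnly :: Bool -> [Import] -> [Import]
dropInstanceOnly keepStubs
  | keepStubs = id
  | otherwise = filter (not . null . importItems)

-- | Print one import line with its explicit item list.
render :: Import -> String
render (Import m is) = "import " ++ m ++ " (" ++ intercalate ", " is ++ ")"

-- | The filtering and printing steps, composed.
rewriteImports :: Bool -> [Import] -> [String]
rewriteImports keepStubs =
  map render . dropInstanceOnly keepStubs . dropImplicit
```

With keepStubs set to True an import with no items is rendered as import Mod (), matching the PreserveInstanceImports default; with False it disappears, as MinimiseImports would have it.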

Exports

Exports are simpler to deal with as GHC's exports_from_avail does the work.

Other projects

  • Smuggler2 is a rewrite of smuggler
  • retrie, a code modding tool that works with GHC 8.10.1
  • refact-global-hse, an ambitious import refactoring tool. This uses haskell-src-exts rather than ghc-exactprint and so may not work with current versions of GHC.
  • These blog posts contain some fragments on the topic of using ghc-exactprint to manipulate import lists: Terser import declarations and GHC API. (The site doesn't always seem to be up.)

Acknowledgements

Thanks to

  • Dmitrii Kovanikov and Veronika Romashkina who wrote smuggler
  • Alan Zimmerman and Matthew Pickering for ghc-exactprint
  • The ghc authors who have made the compiler internals available through an API.