# [configurator-ng](https://github.com/lpsmith/configurator-ng) ##What is this? This is a massively breaking revision of the application interface of [configurator]. The configuration file syntax is backward compatible, and mostly forward compatible as well. This fork is not (yet?) intended for widespread public consumption. Rather, this repo is being used as a stopgap measure in some of my own projects as well as a playground and laboratory for a new configurator-like package that may be released sometime in the future. ## Manifesto (Note that this section is mildly aspirational at a few points, and/or contains errors.) The application interface of `configurator` has numerous problems: * it makes it easy to introduce race conditions * it makes it difficult to write an application that is relatively robust to misconfiguration errors * does not scale well to moderately complex configuration scenarios * the configuration change notifications are particularly difficult to use beyond the most trivial of use cases. The aim of `configurator-ng` is to improve these issues, with the initial efforts focused on the first three. I hope to make more correct solutions easier, and less correct solutions harder, all wrapped up in a more expressive interface. ### Race conditions The interface of `configurator` basically is: data Config = Config (IORef (HashMap Text Value)) lookup :: Configured a => Config -> Text -> IO (Maybe a) The `IORef` is there to support configuration file reloading, which is often done automatically. So this results in the race condition: do key0 <- lookup config "key0" reload config {- in another thread -} key1 <- lookup config "key1" return (key0, key1) Thus, we have taken `key0` and `key1` from two versions of the configuration files, with a overall result that is not necessarily consistent with either version. There is a way to solve this race condition\*, though it is by no means convenient and it provides even less support for turning the result into configuration parameters: getMap :: Config -> IO (HashMap Text Value) This obtains a consistent\* snapshot of the configuration, from which you can pull out multiple values. But in addition to being less obvious and inconvenient, the fact that the `HashMap` returned is not an abstract type makes means that changing the representation breaks client code that uses this approach. `configurator-ng` makes the latter mode of use much more convenient by introducing `ConfigParser`s, a applicative/monadic high-level parsing interface to read configuration info from a single snapshot. See the module `Data.Configurator.Parser`. The basic ideas behind the revised interface is as follows: data ConfigCache = ConfigCache (IORef Config) readConfig :: ConfigCache -> IO Config runParser :: ConfigParser m => m a -> Config -> (Maybe a, [ConfigError]) (Here, `ConfigError` could be an error condition, or it might be more analogous to a warning or informational message; thus a parser can return a result *and* some `ConfigError`s.) Finally, we could define a `ConfigParser` to read from key1 and key2 by writing: getKeys :: ConfigParser m => m (Text, Int) getKeys = (,) <$> key "key0" <*> key "key1" (\*It's important to point out that `getMap` only avoids introducing additional race conditions; commonly used filesystems are racey software artifacts, so this is only consistent relative to filesystem reads. For a complete solution, one would have to take care in the precise filesystem calls used to manipulate the configuration file(s). Most popular text editors should be ok as far as the consistency of a single file, consistent reads of multiple files is trickier.) ### Configuration validation Another advantage of the `ConfigParser` interface is that it makes it easier and more convenient to validate a (sub-)configuration as an entirety, and thus also make more intelligent decisions about what to do in cases of misconfigurations. For example, one might want to continue running on the last known good configuration, and raise a big red flag in a monitoring solution. The goal is to provide mechanism, not policy. ### Greater Expressive Power Consider the following use case: you have an event processor, that watches several named sources for events. You might like your configuration file to look something like this: ~~~ event-sources { amazon-cloud { postgres { host = "cloudevents.mydomain.com" port = 5433 dbname = "eventdb" sslmode = "verify-full" sslcert = "${HOME}/credentials/pgclient.crt" sslkey = "${HOME}/credentials/pgclient.key" } heartbeat-interval = 15 heartbeat-timeout = 15 } chicago-service-center { postgres { host = "pgevents.customerdomain.com" port = 5433 dbname = "eventdb" sslmode = "verify-full" sslcert = "${HOME}/credentials/pgclient.crt" sslkey = "${HOME}/credentials/pgclient.key" } heartbeat-interval = 15 heartbeat-timeout = 15 } } ~~~ Now, `amazon-cloud` and `chicago-service-center` are names of the source useful for whatever purposes (logging, API endpoints, etc), that the event processor doesn't know about in advance. Since `configurator` is tied down to `HashMap`, the data structure offers no support for efficiently discovering these names. In order to fix this, `configurator-ng` moved to [`critbit`][critbit]. which allows us to efficiently iterate over these keys (in alphabetical order). So `configurator-ng` offers the following operator: subgroups :: ConfigParser m => Text -> m [Text] `subgroups` returns the non-empty value groupings of it's argument, so for example when evaluated in the context of the configuration above: subgroups "" ==> [ "event-sources" ] subgroups "event-sources" ==> [ "event-sources.amazon-cloud" , "event-sources.chicago-service-center" ] Another issue is that there's a lot of redundancy here, so maybe we'd like to refactor the configuration file into something like this: ~~~ event-sources { amazon-cloud { postgres.host = "cloudevents.mydomain.com" } chicago-service-center { postgres.host = "pgevents.customerdomain.com" } default { postgres { port = 5433 dbname = "eventdb" sslmode = "verify-full" sslcert = "${HOME}/credentials/pgclient.crt" sslkey = "${HOME}/credentials/pgclient.key" } heartbeat-interval = 15 heartbeat-timeout = 15 } } ~~~ So now the problem is that we want to turn this configuration into a list of `EventSource`s: ~~~ data EventSource = EventSource { name :: !Text, libpqConnParams :: [(Text,Value)], heartbeatInterval :: !Micro, heartbeatTimeout :: !Micro, } ~~~ Now, even ignoring the issue of the names mentioned above, handling this sort of customizable defaulting in `configurator` would be rather painful. But it's actually quite easy with `configurator-ng`: ~~~ {-# LANGUAGE ApplicativeDo, RecordWildCards #-} mapA :: Applicative f => (a -> f b) -> [a] -> f [b] mapA f = foldr (liftA2 (:)) (pure []) . map f eventSources :: ConfigParserA [EventSource] eventSources = do localConfig (subconfig "event-sources") $ do mapA eventSource . filter (/= "default") <$> subgroups "" eventSource :: Text -> ConfigParserA EventSource eventSource name = do localConfig (union (subconfig name ) (subconfig "default")) $ do libpqConnParams <- localConfig (subconfig "postgres") (subassocs "") heartbeatInterval <- key "heartbeat-interval" heartbeatTimeout <- key "heartbeat-timeout" pure $! EventSource{..} ~~~ This example uses the `ConfigParserA` variant of `ConfigParser`, so that the parser continues to run after encountering an error in order to generate more error messages. It also uses `localConfig` operator to run a subparser in a different configuration context. There are a few operators for modifying the configuration context: ~~~ localConfig :: ConfigParser m => ConfigTransform -> m a -> m a data ConfigTransform -- Conceptually, type ConfigTransform = Config -> Config instance Monoid ConfigTransform -- mempty is identity transformation -- mappend is composition of transformations -- | Left-biased union of two configurations union :: ConfigTransform -> ConfigTransform -> ConfigTransform -- | Restrict a configuration to a given group, and remove that group -- prefix from all key names. subconfig :: Text -> ConfigTransform -- | Add a group name as a prefix to all key names superconfig :: Text -> ConfigTransform ~~~ Note that these operators are implemented "symbolically", so that they run in sub-linear (Possibly `O(1)`?) time. Instead, the cost of these are paid on each `(key,value)` lookup. ### Syntactic extensions Datum comments have been implemented, not unlike Scheme and Clojure. The `configurator-ng` parser will ignore any binding preceded by a `#;` token; the binding following `#;` must be begin on the same line, and must be syntactically correct, but will otherwise be ignored. This is a significant convenience for use cases like the event source example above: for example one could disable `chicago-service-center` by putting `#;` before the name. One can also use this as a slightly restricted means of block comments, by writing `#; comment {` (the name doesn't matter) to begin the block comment, and a matching `}` to end the comment. Of course, the intervening bindings must be syntactically correct, so this isn't an exact substitute for block comments. Also, `configurator-ng` also allows group names to be inlined into other group and key names, separated by a dot character. For example, these configuration snippets are all equivalent: ~~~ foo { bar { x = "Hello" y = "World" } } foo.bar { x = "Hello" y = "World" } foo { bar.x = "Hello" bar.y = "World" } foo.bar.x = "Hello" foo.bar.y = "World" ~~~~ With the original `configurator`, only the first snippet is syntactically legal. Finally, `configurator-ng` supports scientific notation for numerical values, via the [`scientific`](https://hackage.haskell.org/package/scientific) package, which corresponds closely to typical floating point syntax. ### Configuration Change Subscriptions `Configurator`'s change notification system is also painful to use except in the most trivial of cases, not least because the callback is called for a single changed `(key,value)` pair at a time. Determining how that impacts a given configuration record (like `EventSource` above) is up to the user. Soon, configurator-ng will offer something along the lines of the following function: ~~~ subscribe :: ConfigParser m => ConfigCache -> m a -> (a -> IO ()) -> IO () ~~~ When the configuration files are reloaded, every subscribed `ConfigParser` is rerun, and the result is passed on to the callback. Now, of course, many callbacks won't want to be called unless their configuration actually changes. However, this is actually a reasonable thing to punt to the callback, because we can write a generic callback wrapper to handle this issue: ~~~ debounce :: (a -> a -> Bool) -> (a -> IO ()) -> IO (a -> IO ()) debounce notEq callback = do last_seen <- newIORef Nothing return $ \new -> do m_old <- readIORef last_seen if case m_old of Nothing -> True Just old -> notEq old new then do writeIORef last_seen (Just new) callback new else do return () ~~~ #### Optimizing subscribe It would be more efficient to run only those `ConfigParser`s that have the possibility of changing. If we design the `configurator-ng` interface carefully, we can determine all the keys that a parser depends on. We can then use this information to rerun only those parsers whose result might possibly change. (Though, `debounce` could still be useful, as `ConfigParser`s aren't guaranteed to be 1-1 functions.) However, once we have dependency tracking that works, there are further applications this could enable, such as: * providing tools to sysadmins to understand which parts of the configuration files affect which parts of the system. * finding values in configuration files that have no effect at all * more speculatively, using this information to generate sample configuration files. [configurator]: https://hackage.haskell.org/package/configurator [critbit]: https://hackage.haskell.org/package/critbit