The selda package

[Tags:library, mit]

This package provides an EDSL for writing portable, type-safe, high-level database code. Its feature set includes querying and modifying databases, automatic, in-process caching with consistency guarantees, and transaction support. See the package readme for a brief usage tutorial. To use this package you need at least one backend package, in addition to this package. There are currently two different backend packages: selda-sqlite and selda-postgresql.


[Skip to Readme]

Properties

Versions 0.1.0.0, 0.1.1.0, 0.1.1.1, 0.1.2.0
Change log ChangeLog.md
Dependencies base (>=4.8 && <5), exceptions (==0.8.*), hashable (>=1.1 && <1.3), mtl (>=2.0 && <2.3), psqueues (==0.2.*), text (>=1.0 && <1.3), time (>=1.6 && <1.9), transformers (>=0.4 && <0.6), unordered-containers (==0.2.*) [details]
License MIT
Author Anton Ekblad
Maintainer anton@ekblad.cc
Category Database
Home page https://github.com/valderman/selda
Source repository head: git clone https://github.com/valderman/selda.git
Uploaded Thu Apr 20 15:27:48 UTC 2017 by AntonEkblad
Distributions NixOS:0.1.2.0
Downloads 131 total (131 in the last 30 days)
Votes
0 []
Status Docs available [build log]
Last success reported on 2017-04-20 [all 1 reports]

Modules

[Index]

Flags

NameDescriptionDefaultType
localcacheEnable process-local cache support. Even when supported, caching is turned off by default until enabled by the application. When unsupported, the relevant APIs are still available, but the cache will act as if every update is a no-op and every lookup a cache miss.EnabledAutomatic
hasteAutomatically set when installing for the Haste compiler.DisabledAutomatic

Use -f <flag> to enable a flag, or -f -<flag> to disable that flag. More info

Downloads

Maintainer's Corner

For package maintainers and hackage trustees

Readme for selda

Readme for selda-0.1.1.1

What is Selda?

Build Status

Selda is an embedded domain-specific language for interacting with relational databases. It was inspired by LINQ and Opaleye.

Features

  • Monadic interface: no need to be a category theory wizard just to write a few database queries.
  • Portable: fully functional backends for SQLite and PostgreSQL.
  • Creating, dropping and querying tables using type-safe database schemas.
  • Typed query language with products, filtering, joins and aggregation.
  • Inserting, updating and deleting rows from tables.
  • Transaction support.
  • Configurable, automatic, consistent in-process caching of query results.
  • Lightweight and modular: non-essential features are optional or split into add-on packages.

Getting started

Install the selda package from Hackage, as well as at least one of the backends:

$ cabal update
$ cabal install selda selda-sqlite selda-postgresql

Then, read the tutorial. The API documentation will probably also come in handy.

Requirements

Selda requires GHC 7.10+, as well as SQLite 3.7.11+ or PostgreSQL 9+. To build the SQLite backend, you need a C compiler installed. To build the PostgreSQL backend, you need the libpq development libraries installed (libpq-dev on Debian-based Linux distributions).

<span id="tutorial"></span>

A brief tutorial

Defining a schema

To work productively with Selda, you will need to enable the TypeOperators and OverloadedStrings extensions.

Table schemas are defined as the product of one or more columns, stitched together using the ¤ operator. A table is parameterized over the types of its columns, with the column types separated by the :*: operator. This, by the way, is why you need TypeOperators.

people :: Table (Text :*: Int :*: Maybe Text)
people = table "people" $ primary "name" ¤ required "age" ¤ optional "pet"

addresses :: Table (Text :*: Text)
addresses = table "addresses" $ required "name" ¤ required "city"

Columns may be either required or optional. Although the SQL standard supports nullable primary keys, Selda primary keys are always required.

Running queries

Selda queries are run in the SeldaT monad transformer. Any MonadIO can be extended with database capabilities. Throughout this tutorial, we will simply use SeldaT on top of the plain IO monad. SeldaT is entered using a backend-specific withX function. For instance, the SQLite backend uses the withSQLite function:

main :: IO ()
main = withSQLite "my_database.sqlite" $ do
  people <- getAllPeople
  liftIO (print people)

getAllPeople :: SeldaT IO [Text :*: Int :*: Maybe Text]
getAllPeople = query (select people)

This will open the my_database.sqlite database for the duration of the computation. If the computation terminates normally, or if it raises an exception, the database is automatically closed.

Note the somewhat weird return type of getAllPeople. In Selda, queries are represented using inductive tuples: a list of values, separated by the :*: operator, but where each element can have a different type. You can think of them as tuples with a slightly different syntax. In this example, getAllPeople having a return type of [Text :*: Int :*: Maybe Text] means that it returns a list of "3-tuples", where the three elements have the types Text, Int and Maybe Text respectively.

You can pattern match on these values as you would on normal tuples:

firstOfThree :: (a :*: b :*: c) -> a
firstOfThree (a :*: b :*: c) = a

Since inductive tuples are inductively defined, you may also choose to pattern match on just the first few elements:

firstOfN :: (a :*: rest) -> a
firstOfN (a :*: _) = a

Throughout the rest of this tutorial, we will simply use inductive tuples as if they were "normal" tuples.

Creating and deleting databases

You can use a table definition to create the corresponding table in your database backend, as well as delete it.

setup :: SeldaT IO ()
setup = do
  createTable people
  createTable addresses

teardown :: SeldaT IO ()
teardown = do
  tryDropTable people
  tryDropTable addresses

Both creating and deleting tables comes in two variants: the try version which is a silent no-op when attempting to create a table that already exists or delete one that doesn't, and the "plain" version which raises an error.

Inserting data

Data insertion is done in batches. To insert a batch of rows, pass a list of rows where each row is an inductive tuple matching the type of the table. Optional values are encoded as Maybe values.

populate :: SeldaT IO ()
populate = do
  insert_ people
    [ "Link"      :*: 125 :*: Just "horse"
    , "Velvet"    :*: 19  :*: Nothing
    , "Kobayashi" :*: 23  :*: Just "dragon"
    , "Miyu"      :*: 10  :*: Nothing
    ]
  insert_ addresses
    [ "Link"      :*: "Kakariko"
    , "Kobayashi" :*: "Tokyo"
    , "Miyu"      :*: "Fuyukishi"
    ]

Insertions come in two variants: the "plain" version which reports back the number of inserted rows, and one appended with an underscore which returns (). Use the latter to explicitly indicate your intent to ignore the return value.

The following example inserts a few rows into a table with an auto-incrementing primary key:

people' :: Table (Int :*: Text :*: Int :*: Maybe Text)
people' = table "people_with_ids"
        $ autoPrimary "id"
        ¤ required "name"
        ¤ required "age"
        ¤ optional "pet"

populate' :: SeldaT IO ()
populate' = do
  insert_ people'
    [ def :*: "Link"      :*: 125 :*: Just "horse"
    , def :*: "Velvet"    :*: 19  :*: Nothing
    , def :*: "Kobayashi" :*: 23  :*: Just "dragon"
    , def :*: "Miyu"      :*: 10  :*: Nothing
    ]

Note the use of the def value for the id field. This indicates that the default value for the column should be used in lieu of any user-provided value. Since the id field is an auto-incrementing primary key, it will automatically be assigned a unique, increasing value. Thus, the resulting table would look like this:

id | name      | age | pet
-----------------------------
 0 | Link      | 125 | horse
 1 | Velvet    | 19  |
 2 | Kobayashi | 23  | dragon
 3 | Miyu      | 10  |

Also note that def can only be used for columns that have default values. Currently, only auto-incrementing primary keys can have defaults. Attempting to use def in any other context results in a runtime error.

Updating rows

To update a table, pass the table and two functions to the update function. The first is a mapping over table columns, specifying how to update each row. The second is a predicate over table columns. Only rows satisfying the predicate are updated.

age10Years :: SeldaT IO ()
age10Years = do
  update_ people (\(name :*: _ :*: _) -> name ./= "Link")
                 (\(name :*: age :*: pet) -> name :*: age + 10 :*: pet)

Note that you can use arithmetic, logic and other standard SQL operations on the columns in either function. Columns implement the appropriate numeric type classes. For operations with less malleable types -- logic and comparisons, for instance -- the standard Haskell operators are prefixed with a period (.).

Deleting rows

Deleting rows is quite similar to updating them. The only difference is that the delete operation takes a table and a predicate, specifying which rows to delete. The following example deletes all minors from the people table:

byeMinors :: SeldaT IO ()
byeMinors = delete_ people (\(_ :*: age :*: _) -> age .< 20)

Basic queries

Queries are written in the Query monad, in which you can query tables, restrict the result set, and perform inner, aggregate queries. Queries are executed in some Selda monad using the query function.

The following example uses the select operation to draw each row from the people table, and the restrict operation to remove out all rows except those having an age column with a value greater than 20.

grownups :: Query s (Col s Text)
grownups = do
  (name :*: age :*: _) <- select people
  restrict (age .> 20)
  return name

printGrownups :: SeldaT IO ()
printGrownups = do
  names <- query grownups
  liftIO (print names)

You may have noticed that in addition to the return type of a query, the Query type has an additional type parameter s. We'll cover this parameter in more detail when we get to aggregating queries, so for now you can just ignore it.

Products and joins

Of course, data can be drawn from multiple tables. The unfiltered result set is essentially the cartesian product of all queried tables. For this reason, restrict calls should be made as early as possible, to avoid creating an unnecessarily large result set.

Arbitrary Haskell values can be injected into queries. As injected values are passed as parameters to prepared statements under the hood, there is no need to escape data; SQL injection is impossible by construction.

The following example uses data from two tables to find all grown-ups who reside in Tokyo. Note the use of the text function, to convert a Haskell Text value into an SQL column literal, as well as the use of name .== name' to remove all elements from the result set where the name in the people table does not match the one in the addresses table.

grownupsIn :: Text -> Query s (Col s Text)
grownupsIn city = do
  (name :*: age :*: _) <- select people
  restrict (age .> 20)
  (name' :*: home) <- select addresses
  restrict (home .== text city .&& name .== name')
  return name

printGrownupsInTokyo :: SeldaT IO ()
printGrownupsInTokyo = do
  names <- query (grownupsIn "Tokyo")
  liftIO (print names)

Also note that this is slightly different from an SQL join. If, for instance, you wanted to get a list of all people and their addresses, you might do something like this:

allPeople :: Query s (Col s Text :*: Col s Text)
allPeople = do
  (people_name :*: _ :*: _) <- select people
  (addresses_name :*: city) <- select addresses
  restrict (people_name == addresses_name)
  return (people_name :*: city)

This will give you the list of everyone who has an address, resulting in the following result set:

name      | city
---------------------
Link      | Kakariko
Kobayashi | Tokyo
Miyu      | Fuyukishi

Note the absence of Velvet in this result set. Since there is no entry for Velvet in the addresses table, there can be no entry in the product table people × addresses where both people_name and addresses_name are equal to "Velvet". To produce a table like the above but with a NULL column for Velvet's address (or for anyone else who does not have an entry in the addresses table), you would have to use a join:

allPeople' :: Query s (Col s Text :*: Col s Maybe Text)
allPeople' = do
  (name :*: _ :*: _) <- select people
  (_ :*: city) <- leftJoin (\(name' :*: _) -> name .== name')
                           (select addresses)
  return (name :*: city)

This gives us the result table we want:

name      | city
---------------------
Link      | Kakariko
Velvet    |
Kobayashi | Tokyo
Miyu      | Fuyukishi

The leftJoin function left joins its query argument to the current result set for all rows matching its predicate argument. Note that all columns returned from the inner (or right) query are converted by leftJoin into nullable columns. As there may not be a right counter part for every element in the result set, SQL and Selda alike set any missing joined columns to NULL.

Aggregate queries, grouping and sorting

You can also perform queries that sum, count, or otherwise aggregate their result sets. This is done using the aggregate function. This is where the additional type parameter to Query comes into play. When used as an inner query, aggregate queries must not depend on any columns from the outer query. To enforce this, the aggregate function forces all operations to take place in the Query (Inner s) monad, if the outer query takes place in the Query s monad. This ensures that aggregate inner queries can only communicate with their outside query by returning some value.

Like in standard SQL, aggregate queries can be grouped by column name or by some arbitrary expression. An aggregate subquery must return at least one aggregate column, obtained using sum_, avg, count, or one of the other provided aggregate functions. Note that aggregate columns, having type Aggr s a, are different from normal columns of type Col s a. Since SQL does not allow aggregate functions in WHERE clauses, Selda prevents them from being used in arguments to restrict.

The following example uses an aggregate query to calculate how many home each person has, and order the result set with the most affluent homeowners at the top.

countHomes :: Query s (Col s Text :*: Col s Int)
countHomes = do
  (name :*: _ :*: _) <- select people
  (owner :*: homes) <- aggregate $ do
    (owner :*: city) <- select addresses
    owner' <- groupBy owner
    return (count city :*: owner')
  restrict (owner .== name)
  order homes descending
  return (owner :*: homes)

Note how groupBy returns an aggregate version of its argument, which can be returned from the aggregate query. In this example, returning owner instead of owner' wouldn't work since the former is a plain column and not an aggregate.

Transactions

All databases supported by Selda guarantee that each query is atomic: either the entire query is performed in one go, with no observable intermediate state, or the whole query fails without leaving a trace in the database. However, sometimes this guarantee is not enough. Consider, for instance, a money transfer from Alice's bank account to Bob's. This involves at least two queries: one to remove the money from Alice's account, and one to add the same amount to Bob's. Clearly, it would be bad if this operation were to be interrupted after withdrawing the money from Alice's account but before depositing it into Bob's.

The solution to this problem is transactions: a mechanism by which a list of queries gain the same atomicity guarantees as a single query always enjoys. Using transactions in Selda is super easy:

transferMoney :: Text -> Text -> Double -> SeldaT IO ()
transferMoney from to amount = do
  transaction $ do
    update_ accounts (\(owner :*: _) -> owner .== text from)
                     (\(owner :*: money) -> owner :*: money - float amount)
    update_ accounts (\(owner :*: _) -> owner .== text to)
                     (\(owner :*: money) -> owner :*: money + float amount)

This is all there is to it: pass the entire computation to the transaction function, and the whole computation is guaranteed to either execute atomically, or to fail without leaving a trace in the database. If an exception is raised during the computation, it will of course be rolled back.

Do be careful, however, to avoid performing IO within a query. While they will not affect the atomicity of the computation as far as the database is concerned, the computations themselves can obviously not be rolled back.

In-process caching

In many applications, read operations are orders of magnitude more common than write operations. For such applications, it is often useful to cache the results of a query, to avoid having the database perform the same, potentially heavy, query over and over even though we know we'll get the same result every time.

Selda supports automatic caching of query results out of the box. However, it is turned off by default. To enable caching, use the setLocalCache function.

main = withPostgreSQL connection_info $ do
  setLocalCache 1000
  ...

This will enable local caching of up to 1,000 different results. When that limit is reached, the least recently used result will be discarded, so the next request for that result will need to actually execute the query on the database backend. If caching was already enabled, changing the maximum number of cached results will discard the cache's previous contents. Setting the cache limit to 0 disables caching again.

To make sure that the cache is always consistent with the underlying database, Selda keeps track of which tables each query depends on. Whenever an insert, update, delete or drop is issued on a table t, all cached queries that depend on t will be discarded.

This guarantees consistency between cache and database, but only under the assumption that no other process will modify the database. If this assumption does not hold for your application, you should avoid using in-process caching. It is perfectly fine, however, to have multiple threads within the same application modifying the same database as long as they're all using Selda to do it, as the cache shared between all Selda computations running in the same process.

Generic tables and queries

Selda also supports building tables and queries from (almost) arbytrary data types, using the Database.Selda.Generic module. Re-implementing the ad hoc people and addresses tables from before in a more disciplined manner in this way is quite easy:

data Person = Person
  { personName :: Text
  , age        :: Int
  , pet        :: Maybe Int
  } deriving Generic

data Address = Address
  { addrName :: Text
  , city     :: Text
  } deriving Generic


people :: GenTable Person
people = genTable "people" [personName :- primaryGen]

addresses :: GenTable Address
addresses = genTable "addresses" [personName :- primaryGen]

This will declare two tables with the same structure as their ad hoc predecessors. Creating the tables is similarly easy:

create :: SeldaT IO ()
create = do
  createTable (gen people)
  createTable (gen addresses)

Note the use of the gen function here, to extract the underlying table of columns from the generic table.

With generic tables, you can use the table's datatype's record selectors together with the ! operator to access its columns in queries.

genericGrownups :: Query s (Col s Text)
genericGrownups = do
  person <- select (gen people)
  restrict (person ! age .> 20)
  return (person ! personName)

However, queries over generic tables aren't magic; they still consist of the same collections of columns as queries over non-generic tables.

genericGrownups2 :: Query s (Col s Text)
genericGrownups2 = do
  (name :*: age :*: _) <- select (gen people)
  restrict (age .> 20)
  return name

Finally, with generics it's also quite easy to re-assemble Haskell objects from the results of a query using the fromRel function.

getPeopleOfAge :: Int -> SeldaT IO [Person]
getPeopleOfAge yrs = do
  ps <- query $ do
    p <- select (gen people)
    restrict (p ! age .== yrs)
    return p
  return (map fromRel ps)

And with that, we conclude this tutorial. Hopefully it has been enough to get you comfortable started using Selda. For a more detailed API reference, please see Selda's Haddock documentation.

TODOs

Features that would be nice to have but are not yet implemented.

  • If/else.
  • Examples.
  • Foreign keys.
  • WHERE x IN (SELECT ...)
  • SELECT INTO.
  • Constraints other than primary key.
  • Database schema upgrades.
  • Stack build.
  • MySQL/MariaDB backend.