dataframe-persistent: Persistent database integration for the dataframe library

[ data, database, gpl, library ] [ Propose Tags ] [ Report a vulnerability ]

This package provides integration between the dataframe library and the Persistent database library, allowing you to load database entities into DataFrames and save DataFrames back to the database.


[Skip to Readme]

Downloads

Maintainer's Corner

Package maintainers

For package maintainers and hackage trustees

Candidates

  • No Candidates
Versions [RSS] 0.1.0.0
Change log CHANGELOG.md
Dependencies base (>=4 && <5), bytestring (>=0.11 && <0.13), containers (>=0.6.7 && <0.9), dataframe (>=0.3 && <0.4), persistent (>=2.14 && <3), template-haskell (>=2.0 && <3), text (>=2.0 && <3), time (>=1.12 && <2), transformers (>=0.5 && <0.7), vector (>=0.13 && <0.14) [details]
Tested with ghc ==9.4.8 || ==9.6.7 || ==9.8.4 || ==9.10.3 || ==9.12.2
License GPL-3.0-or-later
Copyright (c) 2024-2025 Michael Chavinda
Author Michael Chavinda, Junji Hashimoto
Maintainer mschavinda@gmail.com
Category Data, Database
Bug tracker https://github.com/mchav/dataframe/issues
Source repo head: git clone https://github.com/mchav/dataframe
Uploaded by mchav at 2025-10-31T21:28:39Z
Distributions
Downloads 4 total (4 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Docs available [build log]
Last success reported on 2025-10-31 [all 1 reports]

Readme for dataframe-persistent-0.1.0.0

[back to package description]

dataframe-persistent

Persistent database integration for the Haskell DataFrame library.

Overview

This package provides seamless integration between the dataframe library and the persistent database library, allowing you to:

  • Load database entities directly into DataFrames
  • Perform DataFrame operations on database data
  • Save DataFrame results back to the database
  • Work with type-safe database entities

Installation

Add to your package.yaml:

dependencies:
- dataframe ^>= 0.3
- dataframe-persistent ^>= 0.1
- persistent >= 2.14
- persistent-sqlite >= 2.13  # or your preferred backend

Or to your .cabal file:

build-depends:
  dataframe ^>= 0.3,
  dataframe-persistent ^>= 0.1,
  persistent >= 2.14,
  persistent-sqlite >= 2.13

Quick Start

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE UndecidableInstances #-}

import Control.Monad.IO.Class (liftIO)
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import qualified DataFrame as DF
import qualified DataFrame.Functions as F
import DataFrame.IO.Persistent
import DataFrame.IO.Persistent.TH
import qualified Data.Vector as V

-- Define your entities
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
TestUser
    name Text
    age Int
    active Bool
    deriving Show Eq
|]

-- Derive DataFrame instances
$(derivePersistentDataFrame ''TestUser)

-- Example usage
main :: IO ()
main = runSqlite "example.db" $ do
    -- Run migrations
    runMigration migrateAll
    
    -- Insert some test data
    _ <- insert $ TestUser "Alice" 25 True
    _ <- insert $ TestUser "Bob" 30 False
    _ <- insert $ TestUser "Charlie" 35 True
    
    -- Load from database
    allUsersDF <- fromPersistent @TestUser []
    liftIO $ putStrLn $ "Loaded " ++ show (nRows allUsersDF) ++ " users"
    
    -- Load with filters
    activeUsersDF <- fromPersistent @TestUser [TestUserActive ==. True]
    liftIO $ putStrLn $ "Active users: " ++ show (nRows activeUsersDF)
    
    -- Process with DataFrame operations
    let youngUsers = DF.filter @Int "age" (< 30) allUsersDF
        ages = V.toList $ DF.columnAsVector @Int "age" youngUsers
    liftIO $ putStrLn $ "Young user ages: " ++ show ages
    
    -- Custom configuration
    let config = defaultPersistentConfig 
                    { pcIdColumnName = "user_id"
                    , pcIncludeId = True
                    }
    customDF <- fromPersistentWith @TestUser config []
    liftIO $ putStrLn $ "Columns with custom config: " ++ show (DF.columnNames customDF)

Features

  • Type-safe conversions between Persistent entities and DataFrames
  • Template Haskell support for automatic instance generation
  • Configurable loading with batch size and column selection
  • Column name cleaning - removes table prefixes automatically (e.g., test_user_namename)
  • Type preservation - maintains proper types for Text, Int, Bool, Day, etc.
  • Empty DataFrame support - preserves column structure even with no data
  • Support for all Persistent backends (SQLite, PostgreSQL, MySQL, etc.)

Configuration Options

data PersistentConfig = PersistentConfig
    { pcBatchSize :: Int        -- Number of records to fetch at once (default: 10000)
    , pcIncludeId :: Bool       -- Whether to include entity ID as column (default: True)
    , pcIdColumnName :: Text    -- Name for the ID column (default: "id")
    }

Advanced Usage

Custom Field Extraction

You can also extract fields from individual entities:

let user = TestUser "Alice" 25 True
    columns = persistFieldsToColumns user
-- Result: [("name", SomeColumn ["Alice"]), ("age", SomeColumn [25]), ("active", SomeColumn [True])]

Working with Vector Data

-- Extract specific column data
let names = V.toList $ DF.columnAsVector @Text "name" df
    ages = V.toList $ DF.columnAsVector @Int "age" df
    activeFlags = V.toList $ DF.columnAsVector @Bool "active" df

Examples

For comprehensive examples and test cases, see:

Status

This package is actively maintained and tested. Current test coverage includes:

  • ✅ Entity loading with and without filters
  • ✅ Custom configuration options
  • ✅ DataFrame operations on Persistent data
  • ✅ Empty result set handling
  • ✅ Field extraction utilities
  • ✅ Multi-table relationships

Documentation

For detailed documentation, see:

License

GPL-3.0-or-later (same as the main dataframe package)