dirstream-1.0.1: Easily stream directory contents in constant memory

Safe HaskellSafe
LanguageHaskell98

Data.DirStream

Contents

Description

Use this module to stream directory contents lazily in constant memory in conjunction with pipes

Synopsis

Directory Traversals

There many possible recursion schemes for traversing directories. Rather than provide them all, I prefer that you learn to assemble your own recursion schemes, using the source code for descendentOf as a starting point.

childOf :: MonadSafe m => FilePath -> ListT m FilePath Source #

Select all immediate children of the given directory, ignoring "." and "..".

Returns zero children if the directory is not readable or (on Windows) if the directory is actually a reparse point.

descendentOf :: MonadSafe m => FilePath -> ListT m FilePath Source #

Select all recursive descendents of the given directory

Utilities

unixVisible :: FilePath -> Bool Source #

Determine if a file is visible according to Unix conventions, defined as the base name not beginning with a '.'

isDirectory :: FilePath -> IO Bool #

Check if a directory exists at the given path.

Symbolic links are resolved to their targets before checking their type.

This computation does not throw exceptions.

Tutorial

The following example shows a simple program that enumerates the contents of a single directory:

{-# LANGUAGE OverloadedStrings #-}

import Data.DirStream
import Pipes
import Pipes.Safe

main1 = runSafeT $ runEffect $
    for (every (childOf "/tmp")) (liftIO . print)
>>> main1
FilePath "/tmp"
FilePath "/tmp/dir1"
FilePath "/tmp/dir2"
FilePath "/tmp/fileE"

The childOf function streams the list of files in constant memory, allowing you to traverse very large directory lists.

You can use ListT to assemble more sophisticated traversals, such as the recursive descendentOf traversal, which has the following definition:

descendentOf :: F.FilePath -> ListT (SafeT IO) F.FilePath
descendentOf path = do
    child <- childOf path
    isDir <- liftIO $ isDirectory child
    if isDir
        then return child <|> descendentOf child
        else return child

These recursive traversals will promptly open and close nested directory streams as they traverse the directory tree:

main2 = runSafeT $ runEffect $
    for (every (descendentOf "/tmp")) (liftIO . print)
>>> main2
FilePath "/tmp"
FilePath "/tmp/dir1"
FilePath "/tmp/dir1/fileA"
FilePath "/tmp/dir1/fileB"
FilePath "/tmp/dir2"
FilePath "/tmp/dir2/fileC"
FilePath "/tmp/dir2/fileD"
FilePath "/tmp/fileE"

These traverals are lazy and will open the minimal number of directories necessary to satisfy downstream demand:

import qualified Pipes.Prelude as P

main3 = runSafeT $ runEffect $
    for (every (descendentOf "/tmp") >-> P.take 3) (liftIO . print)
>>> main3  -- This never opens the "/tmp/dir2" directory
FilePath "/tmp"
FilePath "/tmp/dir1"
FilePath "/tmp/dir1/fileA"