pipes-files-0.1.1: Fast traversal of directory trees using pipes

Safe HaskellNone
LanguageHaskell2010

Pipes.Files

Contents

Synopsis

Introduction

  • *find-conduit** is essentially a souped version of GNU find for Haskell, using a DSL to provide both ease of us, and extensive flexbility.

In its simplest form, let's compare some uses of find to find-conduit. Bear in mind that the result of the find function is a conduit, so you're expected to either sink it to a list, or operate on the file paths as they are yielded.

Basic comparison with GNU find

A typical find command:

find src -name '*.hs' -type f -print

Would in find-conduit be:

find "src" (glob "*.hs" <> regular) $$ mapM_C (liftIO . print)

The glob predicate matches the file basename against the globbing pattern, while the regular predicate matches plain files.

A more complicated example:

find . -size +100M -perm 644 -mtime 1

Now in find-conduit:

let megs = 1024 * 1024
    days = 86400
now <- liftIO getCurrentTime
find "." ( fileSize (> 100*megs)
        <> hasMode 0o644
        <> lastModified (> addUTCTime now (-(1*days)))
         )

Appending predicates like this expressing an "and" relationship. Use <|> to express "or". You can also negate any predicate:

find "." (not_ (hasMode 0o644))

By default, predicates, whether matching or not, will allow recursion into directories. In order to express that matching predicate should disallow recursion, use prune:

find "." (prune (depth (> 2)))

This is the same as using '-maxdepth 2' in find.

find "." (prune (filename_ (== "dist")))

This is the same as:

find . \( -name dist -prune \) -o -print

Performance

find-conduit strives to make file-finding a well performing operation. To this end, a composed Predicate will only call stat once per entry being considered; and if you prune a directory, it is not traversed at all.

By default, find calls stat for every file before it applies the predicate, in order to ensure that only one such call is needed. Sometimes, however, you know just from the FilePath that you don't want to consider a certain file, or you want to prune a directory tree.

To support these types of optimized queries, a variant of find is provided called findWithPreFilter. This takes two predicates: one that is applied to only the FilePath, before stat (or lstat) is called; and one that is applied to the full file information after the stat.

Other notes

See Cond for more details on the Monad used to build predicates.

Finding functions

sourceFindFiles :: (MonadIO m, MonadSafe m, IsFilePath f) => FindOptions -> f -> CondT (FileEntry f) m a -> Producer (FileEntry f, a) m () Source

Find file entries in a directory tree, recursively, applying the given recursion predicate to the search. This conduit yields pairs of type (FileEntry f, a), where is the return value from the predicate at each step.

find :: (MonadIO m, MonadSafe m) => FilePath -> CondT (FileEntry FilePath) m a -> Producer FilePath m () Source

Calls findFilePaths with the default set of finding options. Equivalent to findFilePaths defaultFindOptions.

findFilesIO :: IsFilePath f => FindOptions -> f -> CondT (FileEntry f) IO a -> IO () Source

Find file entries in a directory tree, recursively, applying the given recursion predicate to the search. This conduit yields pairs of type (FileEntry f, a), where is the return value from the predicate at each step.

findFilePaths :: (MonadIO m, MonadSafe m) => FindOptions -> FilePath -> CondT (FileEntry FilePath) m a -> Producer FilePath m () Source

A simpler version of findFiles, which yields only FilePath values, and ignores any values returned by the predicate action.

directoryFiles :: MonadIO m => FilePath -> TreeT m FilePath Source

Return all files within a directory tree, hierarchically.

test :: MonadIO m => CondT (FileEntry FilePath) m () -> FilePath -> m Bool Source

Test a file path using the same type of predicate that is accepted by findFiles.

ltest :: MonadIO m => CondT (FileEntry FilePath) m () -> FilePath -> m Bool Source

Test a file path using the same type of predicate that is accepted by findFiles, but do not follow symlinks.

File path predicates

glob :: (Monad m, IsString f, IsFilePath f, Monoid f) => String -> CondT (FileEntry f) m () Source

Find every entry whose filename part matching the given filename globbing expression. For example: glob "*.hs".

regex :: (Monad m, IsFilePath f) => String -> CondT (FileEntry f) m () Source

ignoreVcs :: (Monad m, IsString f, Eq f, IsFilePath f) => CondT (FileEntry f) m () Source

Return all entries, except for those within version-control metadata directories (and not including the version control directory itself either).

GNU find compatibility predicates

depth_ :: Monad m => CondT (FileEntry f) m () Source

prune_ :: Monad m => CondT a m () Source

maxdepth_ :: Monad m => Int -> CondT (FileEntry f) m () Source

mindepth_ :: Monad m => Int -> CondT (FileEntry f) m () Source

amin_ :: MonadIO m => Int -> CondT (FileEntry f) m () Source

atime_ :: MonadIO m => Int -> CondT (FileEntry f) m () Source

anewer_ :: (MonadIO m, IsFilePath f) => f -> CondT (FileEntry f) m () Source

gid_ :: MonadIO m => Int -> CondT (FileEntry f) m () Source

name_ :: (Monad m, IsFilePath f, Eq f) => f -> CondT (FileEntry f) m () Source

filename_ :: (Monad m, IsFilePath f) => (f -> Bool) -> CondT (FileEntry f) m () Source

pathname_ :: (Monad m, IsFilePath f) => (f -> Bool) -> CondT (FileEntry f) m () Source

File entry predicates (uses stat information)

Predicate combinators

(=~) :: (RegexMaker Regex CompOption ExecOption source, RegexContext Regex source1 target) => source1 -> source -> target

Types and type classes

data FileEntry f Source

Constructors

FileEntry 

Fields

entryPath :: !RawFilePath
 
entryDepth :: !Int
 
entryFindOptions :: !FindOptions
 
entryStatus :: !(Maybe FileStatus)

This is Nothing until we determine stat should be called.

Instances

Helper functions for library writers

genericFindFilePaths :: (MonadIO m, MonadSafe m, IsFilePath f) => FindOptions -> f -> CondT (FileEntry f) m a -> Producer f m () Source

A simpler version of findFiles, which yields only FilePath values, and ignores any values returned by the predicate action.

genericFind :: (MonadIO m, MonadSafe m, IsFilePath f) => f -> CondT (FileEntry f) m a -> Producer f m () Source

Calls findFilePaths with the default set of finding options. Equivalent to findFilePaths defaultFindOptions.

genericTest :: (MonadIO m, IsFilePath f) => CondT (FileEntry f) m () -> f -> m Bool Source

Test a file path using the same type of predicate that is accepted by findFiles.

genericLtest :: (MonadIO m, IsFilePath f) => CondT (FileEntry f) m () -> f -> m Bool Source

Test a file path using the same type of predicate that is accepted by findFiles, but do not follow symlinks.