squeeze-1.0.4.8: A file-packing application.

Safe HaskellNone
LanguageHaskell2010

Squeeze.Squeeze

Contents

Description

AUTHOR
Dr. Alistair Ward
DESCRIPTION
  • Returns combinations of the specified files, which fit into the available space, without wasting more than the specified ratio.
  • Any directory-names are treated as atomic units, rather than individual files.
  • Because of the exponential growth of possible combinations, an exact match for the available space is frequently found with a surprisingly small set of files.

Synopsis

Functions

findCombinations Source

Arguments

:: Interval FileSize

The closed interval of acceptible aggregate size, for file-combinations.

-> [FileSizeAndPath]

The list of file-names & sizes, ordered by decreasing size.

-> [FileCombination]

The resulting unordered list of suitable file-combinations.

  • Checks that the total aggregate FileSize, meets or exceeds minimumBytes.
  • Drops excessively large files, assuming that the file-list has been sorted by size, largest first.
  • Generates up to 2^n combinations of the n specified files; the algorithm is similar to subsequences, except that unproductive lines are immediately terminated. This is the performance bottle-neck, and though there may be simpler and faster algorithms, the key attribute is that it operates in constant space.
  • CAVEAT: assumes files have been sorted by decreasing size.
  • The algorithm is stable, in that it maintains the specified file-order within each combination; though the order in which the combinations are concatenated is rather arbitrary.

findBestFit Source

Arguments

:: Interval FileSize

The closed interval of acceptible sizes for file-combinations.

-> [FileSizeAndPath]

The input list of file-names & sizes.

-> [FileCombination]

A reduced list of increasingly suitable file-combinations.

Orders the files by decreasing size, calls findCombinations, calls risingFilter to select progressively better solutions.

distributeAndFindBestFit Source

Arguments

:: RealFrac ratio 
=> CommandOptions ratio 
-> [FileSizeAndPath]

The unordered list of files & sizes.

-> IO [FileCombination] 
  • Recursively bisects the task, distributing the sub-tasks to findBestFit, to utilise the available CPU-cores.
  • The task is bisected by removing the smallest file, then solving the remaining problem for two independent cases; that the selected file is excluded or the selected file is included, in the final solution. Selecting the smallest file rather than the largest, seems to balance the load of the sub-tasks more evenly. CAVEAT: no account has been taken of the possibility that the smallest file has size zero, which makes the sub-tasks identical.
  • Recombines the part solutions to finds the single monotonically increasing list of file-combinations matching the original criteria.
  • CAVEAT: whilst the ultimate solution is similar, regardless of the specified number of CPU-cores available, the path leading to it typically differs.

partitionEmptyFilesAndDistributeAndFindBestFit Source

Arguments

:: RealFrac ratio 
=> CommandOptions ratio 
-> [FileSizeAndPath]

The unordered list of files & sizes.

-> IO [FileCombination] 
  • Neither distributeAndFindBestFit nor findBestFit distinguish between empty & non-empty files, but empty files cause significant inefficiency in the former (where the same calculation is performed multiple times) & could be treated much more efficiently in the latter (since they're potentially a member of any other solution).
  • This function side-lines empty files, delegates the remaining problem to distributeAndFindBestFit (& consequently findBestFit), then prepends combinations of empty files to the resulting combinations of non-empty files.