hpdft: A tool for looking through PDF files using Haskell

A command-line PDF-to-text converter. It may take much longer than other similar tools, but it could yield better results.

This package can also serve as a library for working with text data in PDF files. You can write your own PDF-to-text converter for particular PDF files, making use of any metadata or special data structures they contain.


Properties

Versions 0.1.0.0, 0.1.0.1, 0.1.0.2, 0.1.0.3, 0.1.0.4, 0.1.0.5, 0.1.0.6, 0.1.1.1, 0.1.1.2, 0.1.1.3
Change log None available
Dependencies attoparsec (>=0.14.4 && <0.15), base (>=4.18.0 && <4.19), binary (>=0.8.9 && <0.9), bytestring (>=0.11.4 && <0.12), containers (>=0.6.7 && <0.7), directory (>=1.3.8 && <1.4), file-embed (>=0.0.15 && <0.1), hpdft, memory (>=0.18.0 && <0.19), optparse-applicative (>=0.18.1 && <0.19), parsec (>=3.0 && <3.2), regex-base (>=0.94.0 && <0.95), regex-compat (>=0.95.2 && <0.96), semigroups (>=0.20 && <0.21), text (>=2.0.2 && <2.1), utf8-string (>=1.0.2 && <1.1), zlib (>=0.6.3 && <0.7)
License MIT
Author Keiichiro Shikano
Maintainer k16.shikano@gmail.com
Category PDF
Home page https://github.com/k16shikano/hpdft
Uploaded by keiichiroShikano at 2023-08-22T05:02:42Z


Readme for hpdft-0.1.1.2

hpdft (Haskell PDF Tools)

hpdft is a PDF parsing tool. It can also be used as a command to grab text, metadata, and the outline (i.e. table of contents) from a PDF.

Command usage:

hpdft [-p|--page PAGE] [-r|--ref REF] [-g|--grep RegExp] [-R|--refs]
             [-T|--title] [-I|--info] [-O|--toc] [--trailer] FILE

Available options:
  -p,--page PAGE           Page number (nomble)
  -r,--ref REF             Object reference
  -g,--grep RegExp         grep PDF
  -R,--refs                Show object references in page order
  -T,--title               Show title (from metadata)
  -I,--info                Show PDF metainfo
  -O,--toc                 Show table of contents (from metadata)
  --trailer                Show the trailer of PDF
  FILE                     input pdf file
  -h,--help                Show this help text
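
As a rough illustration of how these options might be combined in practice, the sketch below wraps three of them in a small shell function. The function name pdf_summary and the file name sample.pdf are assumptions made for this example and are not part of hpdft. Only options listed in the help text above are used, each in a separate invocation, since the help text does not say whether they can be combined in one call.

```shell
# Sketch only: pdf_summary is a hypothetical helper, not part of hpdft.
# It prints the title, the metainfo, and the table of contents of one PDF,
# running hpdft once per option.
pdf_summary() {
  pdf="$1"
  hpdft --title "$pdf"
  hpdft --info "$pdf"
  hpdft --toc "$pdf"
}

# Example calls (sample.pdf is a placeholder file name):
#   pdf_summary sample.pdf
#   hpdft --page 3 sample.pdf          # select page 3
#   hpdft --grep 'Haskell' sample.pdf  # search with a regular expression
```

Keeping one option per invocation makes it easy to see which output comes from which flag.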

Install

Clone this repository (https://github.com/k16shikano/hpdft) and run cabal install in its root directory.

$ cabal install