Text.Format.Para
Description
A paragraph formatting utility. Provided with input text that is arbitrarily split amongst several strings, this utility will reformat the text into paragraphs which do not exceed the specified width. Paragraphs are delimited by blank lines in the input.
This function is roughly equivalent to the Unix fmt utility.
Features:
- An indentation/prefix text may be specified. This prefix is used on the first paragraph line and determines the standard indentation for all subsequent lines. If no indentation is specified, the blank indentation of the first line of the first paragraph becomes the default indentation for all paragraphs.
- Subsequent paragraphs may increase their indentation over the default as determined by the indentation level of their first line. Indentation values less than that of the primary paragraph are ignored.
- Paragraph text is reformatted to fit the paragraph layout.
- Extra whitespace is removed.
- "French spacing" is used: if the current word is capitalized and the previous word ended in a punctuation character, then two spaces are used between the words instead of a single space which is the default elsewhere.
- Avoids orphan words. The last line of a paragraph will usually be formatted to contain at least 2 words, pulling from the line above it.
- Recognizes lists of items, where each item starts with * or - or alphanumeric characters followed by a ) or . character. Uses list-oriented per-item indentation independent of paragraph indentation.
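Two of the rules above, French spacing and list-item recognition, can be sketched in a few lines. This is only an illustration of the rules as worded in this description, not the module's actual implementation; the names joinFrench and listMarker are hypothetical and are not exported by Text.Format.Para.

```haskell
import Data.Char (isAlphaNum, isPunctuation, isUpper)

-- | Hypothetical helper: join words with a single space, except that a
-- capitalized word following a word that ends in punctuation gets two.
joinFrench :: [String] -> String
joinFrench []       = ""
joinFrench (w : ws) = w ++ concat (zipWith sep (w : ws) ws)
  where
    -- 'prev' is the word already emitted, 'cur' the word being appended.
    sep prev cur
      | endsPunct prev && startsUpper cur = "  " ++ cur
      | otherwise                         = " " ++ cur
    endsPunct s   = not (null s) && isPunctuation (last s)
    startsUpper s = not (null s) && isUpper (head s)

-- | Hypothetical helper: return the list marker if the first token of a
-- line is "*", "-", or alphanumeric characters followed by ')' or '.'.
listMarker :: String -> Maybe String
listMarker line =
  case words line of
    (tok : _)
      | tok `elem` ["*", "-"] -> Just tok
      | isLabel tok           -> Just tok
    _ -> Nothing
  where
    -- A label is one or more alphanumerics, then exactly one ')' or '.'.
    isLabel tok = case span isAlphaNum tok of
      (_ : _, [c]) -> c `elem` ")."
      _            -> False
```

For instance, joinFrench (words "a test.   This is") collapses the extra whitespace and leaves two spaces after "test.", while listMarker "20a) This is a longer item" yields Just "20a)". Taken literally, the rule also treats a line whose first word ends in a period (such as "Hello. World") as a list item; how the real module resolves that case is not stated here.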
Documentation
Arguments
| :: Int | Width |
| -> Maybe String | Prefix (defines indent), Nothing means indent is taken from first input line |
| -> [String] | Text to format in arbitrarily-divided strings. Blank lines separate paragraphs. Paragraphs are indented the same as the first line if second argument is Nothing. |
| -> [String] | Formatted text |
The formatParas function accepts an arbitrarily-divided list of
Strings along with a width and an optional indentation/prefix, and
returns a list of Strings representing paragraphs formatted to
fit the specified width and indentation.
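To make the shape of this interface concrete, here is a minimal greedy-fill sketch that takes a width, a prefix, and arbitrarily-divided input. It is an assumption-laden simplification, not the module's algorithm: the name fillSketch is hypothetical, the prefix is a plain String rather than a Maybe String, and paragraph breaks, list items, French spacing, and orphan avoidance are all ignored.

```haskell
-- | Hypothetical sketch: treat all input chunks as one paragraph, put the
-- prefix on the first line, and indent later lines by the prefix's width.
fillSketch :: Int -> String -> [String] -> [String]
fillSketch width prefix chunks =
    zipWith (++) (prefix : repeat indent) (pack (words (unwords chunks)))
  where
    indent = replicate (length prefix) ' '
    room   = width - length prefix      -- columns left for the text itself

    pack []       = []
    pack (w : ws) = go w ws

    -- 'line' is the text accumulated so far for the current output line.
    go line [] = [line]
    go line (w : ws)
      | length line + 1 + length w <= room = go (line ++ " " ++ w) ws
      | otherwise                          = line : go w ws
```

With a width of 20 and the prefix "> ", the chunks "one two three" and "four five six seven" come out as "> one two three four" followed by "  five six seven". A word longer than the available room is placed on its own line rather than split.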
Example
The following shows example uses and output of the Para formatter.
Here is a simple program that takes 2 or more arguments.
- A width
- One or more filenames
The program will read the specified files and then use Para to format them with the specified width and display them on stdout.
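import Text.Format.Para
import System.Environment
import Data.List

main = do
  args <- getArgs
  -- First argument is the width; the remaining arguments are filenames.
  let width = head args
  bodies <- mapM readFile $ tail args
  -- Separate the files from each other, then format with a fixed prefix.
  putStrLn $ unlines $ formatParas (read width) (Just "Example: ") $
    intersperse "\n" bodies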
This program is usable in a similar manner to the Unix fmt
application. It also provides a convenient way to test and
experiment with the output of the Para module.
The following represents an example input file that demonstrates most of the capabilities of the Para formatter:
This is a test.
This is line 2. Note: double spacing (a.k.a. french spacing) between sentences, but
elsewhere only single spacing
is used; i.e. whitespace compression is performed.
This is the second paragraph. Note that all indentation is based on
the initial indentation string specified, although that string only
introduces the first paragraph on the sequence.
* Here is a list
* This is another list item. It is fairly long, so when it wraps the subline should be indented.
This is the third paragraph.
And it is indented.
It is followed by a command-line example:
$ ghc --make -o ptest ptest.hs
$ ./ptest
The fourth paragraph
is indented even more.
Birdtracks are verbatim, even if the line is long.
> main = do
> args <- getArgs
> putStrLn $ "Hello! Hi. Greetings. I think you said " ++ intercalate ", " args
The list can also be
numbered or use other
indicators:
1) Here is a list
2) Item #2
10) Item 10
20a) This is a longer item with a mixed representation of the item count.
4. Can use standard decimals as well for numbering elements.
5. And it doesn't really matter if all elements of the list are the same. Just as long as it's recognized as a list element.
a) But it does have to be at the same indentation?
b) Right. Multi-level lists are supported. Each list item is handled as a paragraph.
6. Level is based on initial character indentation.
And that's it!
If this example is saved to an input file and then processed with the test application above and a width of 80, the output might look like the following:
Example: This is a test. This is line 2. Note: double spacing (a.k.a. french
spacing) between sentences, but elsewhere only single spacing is used;
i.e. whitespace compression is performed.
This is the second paragraph. Note that all indentation is based on
the initial indentation string specified, although that string only
introduces the first paragraph on the sequence.
* Here is a list
* This is another list item. It is fairly long, so when it wraps
the subline should be indented.
This is the third paragraph. And it is indented. It is followed by
a command-line example:
$ ghc --make -o ptest ptest.hs
$ ./ptest
The fourth paragraph is indented even more.
Birdtracks are verbatim, even if the line is long.
> main = do
> args <- getArgs
> putStrLn $ "Hello! Hi. Greetings. I think you said " ++ intercalate ", " args
The list can also be numbered or use other indicators:
1) Here is a list
2) Item #2
10) Item 10
20a) This is a longer item with a mixed representation of the
item count.
4. Can use standard decimals as well for numbering elements.
5. And it doesn't really matter if all elements of the list are the
same. Just as long as it's recognized as a list element.
a) But it does have to be at the same indentation?
b) Right. Multi-level lists are supported. Each list item is
handled as a paragraph.
6. Level is based on initial character indentation.
And that's it!
If this same input file were run with a width of 50 instead, the output would look like the following:
Example: This is a test. This is line 2. Note:
double spacing (a.k.a. french spacing)
between sentences, but elsewhere only
single spacing is used; i.e. whitespace
compression is performed.
This is the second paragraph. Note that
all indentation is based on the initial
indentation string specified, although
that string only introduces the first
paragraph on the sequence.
* Here is a list
* This is another list item. It is
fairly long, so when it wraps the
subline should be indented.
This is the third paragraph. And it
is indented. It is followed by a
command-line example:
$ ghc --make -o ptest ptest.hs
$ ./ptest
The fourth paragraph is indented
even more.
Birdtracks are verbatim, even if the line
is long.
> main = do
> args <- getArgs
> putStrLn $ "Hello! Hi. Greetings. I think you said " ++ intercalate ", " args
The list can also be numbered or use
other indicators:
1) Here is a list
2) Item #2
10) Item 10
20a) This is a longer item with a
mixed representation of the
item count.
4. Can use standard decimals as well
for numbering elements.
5. And it doesn't really matter if
all elements of the list are the
same. Just as long as it's
recognized as a list element.
a) But it does have to be at the
same indentation?
b) Right. Multi-level lists are
supported. Each list item is
handled as a paragraph.
6. Level is based on initial
character indentation.
And that's it!