Copyright	(c) 2017 Johannes Hildén
License	BSD3
Maintainer	johannes@isomorphic.co
Stability	experimental
Portability	GHC
Safe Haskell	Safe
Language	Haskell2010

Math.DyckWord.Binary

Contents

Types
Creating and inspecting Dyck words
- Textual form
Ranking and unranking

Description

Background

In formal language theory, the Dyck language consists of all strings of evenly balanced left and right parentheses, brackets, or some other symbols, together with the empty word. Words in this language (named after German mathematician Walther von Dyck) are known as Dyck words, some examples of which are ()()(), (())((())), and ((()()))().

The type of Dyck language considered here is defined over a binary alphabet. We will take this alphabet to be the set Σ = {(, )} in the following examples. The binary Dyck language is the subset of Σ* (the Kleene closure of Σ) of all words that satisfy two conditions:

The number of left brackets must be the same as the number of right brackets.
Going from left to right, for each character read, the total number of right brackets visited must be less than or equal to the number of left brackets up to the current position.

E.g., (()(() and ())(())() are not Dyck words.

When regarded as a combinatorial class—with the size of a word defined as the number of bracket pairs it contains—the counting sequence associated with the Dyck language is the Catalan numbers.

$\begin{array}{ccl} \text{Size} & \text{Count} & \text{Words} \\ 0 & 1 & \epsilon \\ 1 & 1 & ⟨⟩ \\ 2 & 2 & ⟨⟩⟨⟩, \ ⟨⟨⟩⟩ \\ 3 & 5 & ⟨⟩⟨⟩⟨⟩, \ ⟨⟩⟨⟨⟩⟩, \ ⟨⟨⟩⟩⟨⟩, \ ⟨⟨⟩⟨⟩⟩, \ ⟨⟨⟨⟩⟩⟩ \\ 4 & 14 & ⟨⟩⟨⟩⟨⟩⟨⟩, \ ⟨⟩⟨⟩⟨⟨⟩⟩, \ ⟨⟩⟨⟨⟩⟩⟨⟩, \ ⟨⟩⟨⟨⟩⟨⟩⟩, \ ⟨⟩⟨⟨⟨⟩⟩⟩, \ ⟨⟨⟩⟩⟨⟩⟨⟩, \ ⟨⟨⟩⟩⟨⟨⟩⟩, \ ⟨⟨⟩⟨⟩⟩⟨⟩, \\ & & ⟨⟨⟨⟩⟩⟩⟨⟩, \ ⟨⟨⟩⟨⟩⟨⟩⟩, \ ⟨⟨⟩⟨⟨⟩⟩⟩, \ ⟨⟨⟨⟩⟩⟨⟩⟩, \ ⟨⟨⟨⟩⟨⟩⟩⟩, \ ⟨⟨⟨⟨⟩⟩⟩⟩ \\ 5 & 42 & ⟨⟩⟨⟩⟨⟩⟨⟩⟨⟩, \ ⟨⟩⟨⟩⟨⟩⟨⟨⟩⟩, \ ⟨⟩⟨⟩⟨⟨⟩⟩⟨⟩, \ ⟨⟩⟨⟩⟨⟨⟩⟨⟩⟩, \ ⟨⟩⟨⟩⟨⟨⟨⟩⟩⟩, \ \dots, \ ⟨⟨⟨⟨⟨⟩⟩⟩⟩⟩ \\ 6 & 132 & \dots \end{array}$

Synopsis

Types

type Size = Integer Source #

See size.

type Rank = Integer Source #

Represents the rank of a Dyck word.

data DyckWord Source #

Opaque Dyck word type. For functions to build Dyck words, see:

Conceptually, every non-empty Dyck word has the form (a)b, where a and b are Dyck words. The BNF grammar for this language is given by $\omega = \epsilon \mid ( \omega ) \omega$ .

Instances

Eq DyckWord Source #	Two Dyck words are considered equal when they have the same absolute rank. `>>>` `fromText' "010011000111" == fromText' "xyxxyyxxxyyy"`True
Methods (==) :: DyckWord -> DyckWord -> Bool # (/=) :: DyckWord -> DyckWord -> Bool #
Ord DyckWord Source #	Dyck words of the same size are ordered lexicographically, and a smaller Dyck word always gets precedence over a bigger one. `>>>` `fromText' "0011" < fromText' "0101"`True `>>>` `fromText' "0101" > fromText' "01"`True
Methods compare :: DyckWord -> DyckWord -> Ordering # (<) :: DyckWord -> DyckWord -> Bool # (<=) :: DyckWord -> DyckWord -> Bool # (>) :: DyckWord -> DyckWord -> Bool # (>=) :: DyckWord -> DyckWord -> Bool # max :: DyckWord -> DyckWord -> DyckWord # min :: DyckWord -> DyckWord -> DyckWord #
Show DyckWord Source #
Methods showsPrec :: Int -> DyckWord -> ShowS # show :: DyckWord -> String # showList :: [DyckWord] -> ShowS #
Monoid DyckWord Source #	The Dyck language forms a monoid under `concatWords`, with the `empty` word as identity.
Methods mempty :: DyckWord # mappend :: DyckWord -> DyckWord -> DyckWord # mconcat :: [DyckWord] -> DyckWord #

Creating and inspecting Dyck words

empty :: DyckWord Source #

The empty Dyck word.

size :: DyckWord -> Size Source #

The size of a Dyck word is the number of bracket pairs it contains, i.e., half of the length of the word's string representation. Inductively, it can be defined as

$\text{size}\;w = \begin{cases} 0 && \text{if} \; w = \epsilon \\ 1 + \text{size}\;u + \text{size}\;v && \text{if} \; w = (u)v. \end{cases}$

setAlphabet Source #

Arguments

:: Char	Left symbol
-> Char	Right symbol
-> DyckWord
-> DyckWord

Change the alphabet of a Dyck word. An alphabet has two characters, here referred to as the left and right symbol, respectively. For example:

>>> toText (setAlphabet 'x' 'o' (unrank 55))
xoxxoxxooo

concatWords :: DyckWord -> DyckWord -> DyckWord Source #

Concatenate two Dyck words. Corresponds to ordinary string concatenation.

$\begin{align} \epsilon +\!\!\!+\;w &= w \\ (u)v +\!\!\!+\;w &= (u)[ v +\!\!\!+\; w ] \end{align}$

If the two words have different alphabets, the concatenated word will inherit the first word's symbol set.

Textual form

fromText :: Text -> Either String DyckWord Source #

Parse a Text value to a DyckWord. The result is wrapped in a Maybe, so that the value becomes Nothing if parsing fails. The alphabet is determined by looking at the first and last characters of the input.

fromText' :: Text -> DyckWord Source #

Parse a Text value to a DyckWord. Throws an error if parsing fails. This is an unsafe version of fromText.

toText :: DyckWord -> Text Source #

Return a textual representation of a DyckWord.

Ranking and unranking

To rank a Dyck word is to determine its position in some ordered sequence of words. The dual of this—to produce the Dyck word corresponding to a position in said sequence—is called unranking. The order we are interested in here is known as shortlex, and it demands that a smaller Dyck word always gets precedence over a bigger one. When comparing words of the same size, normal lexicographical order applies.

Relative vs. absolute rank

In this library, we adopt the following (non-standard) terminology. The relative rank of a Dyck word w is its position in the sequence of those words with the same size as w. By contrast, the absolute rank means a word's position in the shortlex sequence of all Dyck words. For example: The (absolute) rank of (((()))) is 9, but the relative rank of the same word is 0, since it is the first word of size four.

$\begin{array}{lccc} \text{Word} & \text{Size} & \text{Rank} & \text{Relative rank} \\ \epsilon & 0 & 0 & 0 \\ ⟨⟩ & 1 & 1 & 0 \\ ⟨⟨⟩⟩ & 2 & 2 & 0 \\ ⟨⟩⟨⟩ & 2 & 3 & 1 \\ ⟨⟨⟨⟩⟩⟩ & 3 & 4 & 0 \\ ⟨⟨⟩⟨⟩⟩ & 3 & 5 & 1 \\ ⟨⟨⟩⟩⟨⟩ & 3 & 6 & 2 \\ ⟨⟩⟨⟨⟩⟩ & 3 & 7 & 3 \\ ⟨⟩⟨⟩⟨⟩ & 3 & 8 & 4 \\ ⟨⟨⟨⟨⟩⟩⟩⟩ & 4 & 9 & 0 \\ ⟨⟨⟨⟩⟨⟩⟩⟩ & 4 & 10 & 1 \\ \;\;\;\; \vdots & \vdots & \vdots & \vdots \end{array}$

If we let $r(w)$ denote the relative rank of a word $w$ , and $C_i$ the $i^{th}$ Catalan number, then the absolute rank $R(w)$ of $w$ is given by the formula $R(w) = r(w) + \sum_{i=0}^{s-1} C_i,$ where $s$ is the size of $w$ .

rank :: DyckWord -> Rank Source #

Return the absolute rank of a Dyck word. That is, its position in the (shortlex) ordered sequence of all Dyck words.

rankRelative :: DyckWord -> Rank Source #

Return the relative rank of a Dyck word.

unrank :: Rank -> DyckWord Source #

Return the n-th Dyck word in the (shortlex) ordered sequence of all Dyck words.

unrankRelative :: Size -> Rank -> Maybe DyckWord Source #

Return the n-th Dyck word, restricted to only words of a given size. Words are lexicographically ordered. The result is wrapped in a Maybe, and is equal to Nothing if the given rank is outside of the valid range.

unrankRelative' :: Size -> Rank -> DyckWord Source #

Unsafe version of unrankRelative. This function throws an error if the given rank is outside of the valid range.

wordsOfSize :: Size -> [DyckWord] Source #

Return a lexicographically ordered list with all Dyck words of a specific size.