dyckword-0.1.0.1: A library for working with binary Dyck words.

Math.DyckWord.Binary

Description

## Background

In formal language theory, the Dyck language consists of all strings of evenly balanced left and right parentheses, brackets, or some other symbols, together with the empty word. Words in this language (named after German mathematician Walther von Dyck) are known as Dyck words, some examples of which are ()()(), (())((())), and ((()()))().

The type of Dyck language considered here is defined over a binary alphabet. We will take this alphabet to be the set Σ = {(, )} in the following examples. The binary Dyck language is the subset of Σ* (the Kleene closure of Σ) of all words that satisfy two conditions:

1. The number of left brackets must be the same as the number of right brackets.
2. Going from left to right, for each character read, the total number of right brackets visited must be less than or equal to the number of left brackets up to the current position.

E.g., (()(() and ())(())() are not Dyck words.

When regarded as a combinatorial class—with the size of a word defined as the number of bracket pairs it contains—the counting sequence associated with the Dyck language is the Catalan numbers.

$\begin{array}{ccl} \text{Size} & \text{Count} & \text{Words} \\ 0 & 1 & \epsilon \\ 1 & 1 & ⟨⟩ \\ 2 & 2 & ⟨⟩⟨⟩, \ ⟨⟨⟩⟩ \\ 3 & 5 & ⟨⟩⟨⟩⟨⟩, \ ⟨⟩⟨⟨⟩⟩, \ ⟨⟨⟩⟩⟨⟩, \ ⟨⟨⟩⟨⟩⟩, \ ⟨⟨⟨⟩⟩⟩ \\ 4 & 14 & ⟨⟩⟨⟩⟨⟩⟨⟩, \ ⟨⟩⟨⟩⟨⟨⟩⟩, \ ⟨⟩⟨⟨⟩⟩⟨⟩, \ ⟨⟩⟨⟨⟩⟨⟩⟩, \ ⟨⟩⟨⟨⟨⟩⟩⟩, \ ⟨⟨⟩⟩⟨⟩⟨⟩, \ ⟨⟨⟩⟩⟨⟨⟩⟩, \ ⟨⟨⟩⟨⟩⟩⟨⟩, \\ & & ⟨⟨⟨⟩⟩⟩⟨⟩, \ ⟨⟨⟩⟨⟩⟨⟩⟩, \ ⟨⟨⟩⟨⟨⟩⟩⟩, \ ⟨⟨⟨⟩⟩⟨⟩⟩, \ ⟨⟨⟨⟩⟨⟩⟩⟩, \ ⟨⟨⟨⟨⟩⟩⟩⟩ \\ 5 & 42 & ⟨⟩⟨⟩⟨⟩⟨⟩⟨⟩, \ ⟨⟩⟨⟩⟨⟩⟨⟨⟩⟩, \ ⟨⟩⟨⟩⟨⟨⟩⟩⟨⟩, \ ⟨⟩⟨⟩⟨⟨⟩⟨⟩⟩, \ ⟨⟩⟨⟩⟨⟨⟨⟩⟩⟩, \ \dots, \ ⟨⟨⟨⟨⟨⟩⟩⟩⟩⟩ \\ 6 & 132 & \dots \end{array}$

Synopsis

# Types

type Size = Integer Source #

See size.

type Rank = Integer Source #

Represents the rank of a Dyck word.

data DyckWord Source #

Opaque Dyck word type. For functions to build Dyck words, see:

• empty,
• fromText,
• fromText',
• unrank,
• unrankRelative, and
• unrankRelative'.

Conceptually, every non-empty Dyck word has the form (a)b, where a and b are Dyck words. The BNF grammar for this language is given by $$\omega = \epsilon \mid ( \omega ) \omega$$.

Instances

 Source # Two Dyck words are considered equal when they have the same absolute rank. >>> fromText' "010011000111" == fromText' "xyxxyyxxxyyy" True  Methods Source # Dyck words of the same size are ordered lexicographically, and a smaller Dyck word always gets precedence over a bigger one.>>> fromText' "0011" < fromText' "0101" True >>> fromText' "0101" > fromText' "01" True  Methods(<) :: DyckWord -> DyckWord -> Bool #(>) :: DyckWord -> DyckWord -> Bool # Source # MethodsshowList :: [DyckWord] -> ShowS # Source # Methodsmconcat :: [DyckWord] -> DyckWord #

# Creating and inspecting Dyck words

The empty Dyck word.

The size of a Dyck word is the number of bracket pairs it contains, i.e., half of the length of the word's string representation. Inductively, it can be defined as

$\text{size}\;w = \begin{cases} 0 && \text{if} \; w = \epsilon \\ 1 + \text{size}\;u + \text{size}\;v && \text{if} \; w = (u)v. \end{cases}$

Arguments

 :: Char Left symbol -> Char Right symbol -> DyckWord -> DyckWord

Change the alphabet of a Dyck word. An alphabet has two characters, here referred to as the left and right symbol, respectively. For example:

>>> toText (setAlphabet 'x' 'o' (unrank 55))
xoxxoxxooo


Concatenate two Dyck words. Corresponds to ordinary string concatenation.

\begin{align} \epsilon +\!\!\!+\;w &= w \\ (u)v +\!\!\!+\;w &= (u)[ v +\!\!\!+\; w ] \end{align}

If the two words have different alphabets, the concatenated word will inherit the first word's symbol set.

## Textual form

Parse a Text value to a DyckWord. The result is wrapped in a Maybe, so that the value becomes Nothing if parsing fails. The alphabet is determined by looking at the first and last characters of the input.

Parse a Text value to a DyckWord. Throws an error if parsing fails. This is an unsafe version of fromText.

Return a textual representation of a DyckWord.

# Ranking and unranking

To rank a Dyck word is to determine its position in some ordered sequence of words. The dual of this—to produce the Dyck word corresponding to a position in said sequence—is called unranking. The order we are interested in here is known as shortlex, and it demands that a smaller Dyck word always gets precedence over a bigger one. When comparing words of the same size, normal lexicographical order applies.

### Relative vs. absolute rank

In this library, we adopt the following (non-standard) terminology. The relative rank of a Dyck word w is its position in the sequence of those words with the same size as w. By contrast, the absolute rank means a word's position in the shortlex sequence of all Dyck words. For example: The (absolute) rank of (((()))) is 9, but the relative rank of the same word is 0, since it is the first word of size four.

$\begin{array}{lccc} \text{Word} & \text{Size} & \text{Rank} & \text{Relative rank} \\ \epsilon & 0 & 0 & 0 \\ ⟨⟩ & 1 & 1 & 0 \\ ⟨⟨⟩⟩ & 2 & 2 & 0 \\ ⟨⟩⟨⟩ & 2 & 3 & 1 \\ ⟨⟨⟨⟩⟩⟩ & 3 & 4 & 0 \\ ⟨⟨⟩⟨⟩⟩ & 3 & 5 & 1 \\ ⟨⟨⟩⟩⟨⟩ & 3 & 6 & 2 \\ ⟨⟩⟨⟨⟩⟩ & 3 & 7 & 3 \\ ⟨⟩⟨⟩⟨⟩ & 3 & 8 & 4 \\ ⟨⟨⟨⟨⟩⟩⟩⟩ & 4 & 9 & 0 \\ ⟨⟨⟨⟩⟨⟩⟩⟩ & 4 & 10 & 1 \\ \;\;\;\; \vdots & \vdots & \vdots & \vdots \end{array}$

If we let $$r(w)$$ denote the relative rank of a word $$w$$, and $$C_i$$ the $$i^{th}$$ Catalan number, then the absolute rank $$R(w)$$ of $$w$$ is given by the formula $$R(w) = r(w) + \sum_{i=0}^{s-1} C_i,$$ where $$s$$ is the size of $$w$$.

Return the absolute rank of a Dyck word. That is, its position in the (shortlex) ordered sequence of all Dyck words.

Return the relative rank of a Dyck word.

Return the n-th Dyck word in the (shortlex) ordered sequence of all Dyck words.

Return the n-th Dyck word, restricted to only words of a given size. Words are lexicographically ordered. The result is wrapped in a Maybe, and is equal to Nothing if the given rank is outside of the valid range.

Unsafe version of unrankRelative. This function throws an error if the given rank is outside of the valid range.

Return a lexicographically ordered list with all Dyck words of a specific size.