Copyright | (c) OleksandrZhabenko 2020-2024 |
---|---|
License | MIT |
Maintainer | oleksandr.zhabenko@yahoo.com |
Stability | Experimental |
Safe Haskell | Safe-Inferred |
Language | Haskell2010 |
Formerly this was the module DobutokO.Poetry.Ukrainian.PrepareText from the dobutokO-poetry package.
In particular, this module can be used to prepare Ukrainian text
by applying the most essential grammar rules for AFTOVolio, to avoid misinterpretation
of the produced text. Attention is paid to the prepositions, pronouns, conjunctions
and particles that are most commonly connected (or not) in a significant way
with the text that follows.
Uses the information from:
https://uk.wikipedia.org/wiki/%D0%A1%D0%BF%D0%BE%D0%BB%D1%83%D1%87%D0%BD%D0%B8%D0%BA
and
https://uk.wikipedia.org/wiki/%D0%A7%D0%B0%D1%81%D1%82%D0%BA%D0%B0_(%D0%BC%D0%BE%D0%B2%D0%BE%D0%B7%D0%BD%D0%B0%D0%B2%D1%81%D1%82%D0%B2%D0%BE)
Uses arrays instead of vectors.
Synopsis
- complexWords :: [String] -> [String]
- participleConc :: [String] -> [String]
- splitLines :: [String] -> [String]
- splitLinesN :: Int -> [String] -> [String]
- auxiliary1 :: [String] -> [String]
- isPreposition :: String -> Bool
- isParticipleAppended :: String -> Bool
- isPrepended :: String -> Bool
- isConcatenated :: String -> Bool
- isSpC :: Char -> Bool
- isUkrainianL :: Char -> Bool
- concatenated2 :: [String] -> [String]
- jottedConv :: String -> String
- jottedCnv :: String -> String
- prepareText :: String -> [String]
- prepareTextN :: Int -> String -> [String]
- prepareTextN2 :: Int -> String -> [String]
- prepareTextN3 :: Int -> String -> [String]
- prepareTextNG :: (Char -> Bool) -> Int -> String -> [String]
- growLinesN :: Int -> [String] -> [String]
- prepareGrowTextMN :: Int -> Int -> String -> [String]
- prepareGrowTextMNG :: (Char -> Bool) -> Int -> Int -> String -> [String]
- tuneLinesN :: Int -> [String] -> [String]
- prepareTuneTextMN :: Int -> Int -> String -> [String]
- prepareTuneTextMNG :: (Char -> Bool) -> Int -> Int -> String -> [String]
- aux4 :: String -> Char
Basic functions
complexWords :: [String] -> [String] Source #
Concatenates complex words in Ukrainian so that they are not separated later by possible word-order rearrangements (because they are treated as a single word). This is needed to preserve basic grammar in phonetic languages.
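The idea of gluing a multi-word unit into one token can be sketched as follows. This is only an illustration, not the package's implementation: the name joinPairsSketch, the tiny samplePairs table, and joining by plain concatenation without a separator are all assumptions made for the example.

```haskell
-- Illustrative sketch only; NOT the implementation of complexWords.
-- Walks the word list and glues together adjacent pairs found in a table.
joinPairsSketch :: [String] -> [String]
joinPairsSketch (x:y:rest)
  | (x, y) `elem` samplePairs = (x ++ y) : joinPairsSketch rest
  | otherwise                 = x : joinPairsSketch (y : rest)
joinPairsSketch xs = xs

-- Hypothetical, deliberately tiny sample of two-word units
-- that should stay together (an assumption for this example).
samplePairs :: [(String, String)]
samplePairs = [("тому", "що"), ("так", "що")]
```

Once a pair is glued, any later rearrangement of the list moves it as a single element, which is the property the description above relies on.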
participleConc :: [String] -> [String] Source #
splitLines :: [String] -> [String] Source #
Since version 0.2.1.0 the function is recursive and is applied so that every returned element (String) contains no more than 7 words.
splitLinesN :: Int -> [String] -> [String] Source #
A generalized variant of splitLines with an arbitrary maximum number of words per line given as the first argument.
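A recursive splitter with a word limit might look like the sketch below. The documentation does not say where a long line is cut, so halving the line is an assumption, and splitLinesSketch is a hypothetical name rather than the package's function.

```haskell
-- Illustrative sketch; the halving strategy is an assumption and is not
-- taken from the package source.
splitLinesSketch :: Int -> [String] -> [String]
splitLinesSketch n = concatMap go
  where
    go line
      -- Lines within the limit (or a non-positive limit) are left as they are.
      | n < 1 || length ws <= n = [line]
      -- Otherwise cut the line roughly in half and recurse on both parts.
      | otherwise               = concatMap (go . unwords) [front, back]
      where
        ws            = words line
        (front, back) = splitAt (length ws `quot` 2) ws
```

Because a line is only split when it has at least two words, both halves are non-empty and strictly shorter, so the recursion terminates.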
auxiliary1 :: [String] -> [String] Source #
isPreposition :: String -> Bool Source #
isParticipleAppended :: String -> Bool Source #
isPrepended :: String -> Bool Source #
isConcatenated :: String -> Bool Source #
Since dobutokO-poetry version 0.16.3.0 the (||) operator has been changed to (&&). The idea is that these words are pronouns and "should" be treated (in the author's understanding) as independent words.
isUkrainianL :: Char -> Bool Source #
Taken from the mmsyn6ukr package, version 0.8.1.0, and slightly modified, so that the number of dependencies is reduced.
concatenated2 :: [String] -> [String] Source #
jottedConv :: String -> String Source #
The end-user functions
prepareText :: String -> [String] Source #
Converts a Ukrainian text into a list of String, each of which is ready to be
used by the functions of the modules for the phonetic languages approach.
It applies minimal grammar links and connections between the most commonly used Ukrainian
words that "should" be paired rather than dealt with separately,
to avoid misinterpretation and to preserve as much of the semantics as possible for the
"phonetic" language on the Ukrainian basis.
prepareTextN :: Int -> String -> [String] Source #
A generalized variant of prepareText with an arbitrary maximum number of words per line given as the first argument.
prepareTextN2 :: Int -> String -> [String] Source #
A generalized variant of prepareText with an arbitrary maximum number of words per line given as the first argument. The '_' is not filtered out.
prepareTextN3 :: Int -> String -> [String] Source #
A generalized variant of prepareText with an arbitrary maximum number of words per line given as the first argument. Neither '_' nor '=' is filtered out.
prepareTextNG :: (Char -> Bool) -> Int -> String -> [String] Source #
An even more generalized variant of prepareTextN with an arbitrary maximum number of words per line given as the second argument and the possibility to provide a custom function for filtering.
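The difference between the variants above comes down to which characters survive filtering. The sketch below illustrates only that step with hypothetical predicates. It assumes that a predicate returns True for characters to keep, which the documentation does not state, and the exact character sets are invented for the example.

```haskell
import Data.Char (isAlpha, isSpace)

-- Hypothetical predicates (assumption: True means "keep the character").
keepBasic, keepWithUnderscore, keepWithUnderscoreEq :: Char -> Bool
keepBasic c            = isAlpha c || isSpace c || c == '\''
keepWithUnderscore c   = keepBasic c || c == '_'
keepWithUnderscoreEq c = keepWithUnderscore c || c == '='

-- Illustrative stand-in for the filtering step only; it does none of the
-- grammar-related processing described for the real functions.
filterLinesSketch :: (Char -> Bool) -> String -> [String]
filterLinesSketch keep = filter (not . null) . map (filter keep) . lines
```

Passing a different predicate changes what is kept without touching the rest of the pipeline, which is the kind of flexibility a function-valued first argument provides.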
growLinesN :: Int -> [String] -> [String] Source #
Since: 0.2.0.0
Given a positive number and a list, tries to rearrange the list's Strings by concatenating several elements of the list so that the number of words in every new String in the resulting list is not greater than the Int argument. If some of the Strings have more words than that number, these Strings are preserved.
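One way to read this description is as a greedy merge of neighbouring lines. The sketch below is an interpretation under that assumption, not the package's code; growLinesSketch is a hypothetical name.

```haskell
-- Illustrative sketch; greedy left-to-right merging is an assumption.
growLinesSketch :: Int -> [String] -> [String]
growLinesSketch n = go
  where
    wc = length . words
    go (x:y:rest)
      -- Merge two neighbours while the combined word count fits the limit.
      | wc x + wc y <= n = go (unwords (words x ++ words y) : rest)
      -- Otherwise emit the first line unchanged and continue with the rest.
      | otherwise        = x : go (y : rest)
    go xs = xs
```

A line that already has more than n words never satisfies the merge condition, so it passes through unchanged, matching the last sentence of the description.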
prepareGrowTextMN Source #
:: Int | A maximum number of the words or their concatenations in the resulting list of |
-> Int | A number of words in every |
-> String | |
-> [String] |
Since: 0.2.0.0
Combines the prepareTextN and growLinesN functions. Applies the needed phonetic language preparations to the Ukrainian text and tries to 'grow' the resulting Strings in the list so that the number of words in each of them is no greater than the given first Int number.
prepareGrowTextMNG Source #
:: (Char -> Bool) | A predicate to filter the symbols during preparation. |
-> Int | A maximum number of the words or their concatenations in the resulting list of |
-> Int | A number of words in every |
-> String | |
-> [String] |
Since: 0.11.0.0
The generalized version of prepareGrowTextMN with the additional possibility to provide a custom function for filtering symbols.
prepareTuneTextMN Source #
:: Int | A number of the words or their concatenations in the resulting list of |
-> Int | A number of words in every |
-> String | |
-> [String] |
Since: 0.6.0.0
Combines the prepareTextN and tuneLinesN functions. Applies the needed phonetic language preparations to the Ukrainian text and splits the list of Strings so that the number of words in each of them (except the last one) is equal to the given first Int number.
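The "every line except the last has exactly n words" behaviour can be sketched as a regrouping of all the words. The sketch assumes that words may flow across the original line boundaries, which the description does not confirm; tuneLinesSketch is a hypothetical name and not the package's tuneLinesN.

```haskell
-- Illustrative sketch; regrouping across line boundaries is an assumption.
tuneLinesSketch :: Int -> [String] -> [String]
tuneLinesSketch n xs
  | n < 1     = xs  -- a non-positive size makes no sense; return the input
  | otherwise = map unwords (chunk (concatMap words xs))
  where
    -- Cut the flat word list into groups of n; the last may be shorter.
    chunk [] = []
    chunk ws = let (front, back) = splitAt n ws in front : chunk back
```

Unlike the 'grow' variant, which only sets an upper bound, this produces lines of uniform length with a possibly shorter remainder at the end.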
prepareTuneTextMNG Source #
:: (Char -> Bool) | A predicate to filter the symbols during preparation. |
-> Int | A number of the words or their concatenations in the resulting list of |
-> Int | A number of words in every |
-> String | |
-> [String] |
Since: 0.11.0.0
The generalized version of prepareTuneTextMN with the additional possibility to provide a custom function for filtering symbols.