Seonbi: SmartyPants for Korean language
=======================================

[![][releases-badge]][releases]
[![][hackage-badge]][hackage]
[![][dockerhub-badge]][dockerhub]
[![][ci-status-badge]][ci]

[![](https://dahlia.github.io/seonbi/showcase.svg)][demo web app]

(TL;DR: See the [demo web app].)

Seonbi (선비) is an HTML preprocessor that makes typographic adjustments to an
HTML document so that the result uses accurate punctuation according to modern
Korean orthography. (It's similar to what [SmartyPants] does for text written
in English.)

It also transforms `ko-Kore` text (國漢文混用; [Korean mixed script]) into
`ko-Hang` text (한글전용; Hangul-only script).

Seonbi provides a Haskell library, a CLI, and an HTTP API; any of them can
perform the following transformations:

- All hanja words (e.g., `漢字`) into corresponding hangul-only words
  (e.g., `한자`)
- Straight quotes and apostrophes (`"` & `'`) into curly quote HTML entities
  (`“`, `”`, `‘`, & `’`)
- Three consecutive periods (`...` or `。。。`) into an ellipsis entity (`…`)
- Classical (Chinese-style) stops (`。` & `、`) into modern (English-style)
  stops (`.` & `,`)
- Pairs of less-than and greater-than inequality symbols (`<` & `>`) into
  pairs of proper angle quotes (`〈` & `〉`)
- Pairs of two consecutive inequality symbols (`<<` & `>>`) into pairs of
  proper double angle quotes (`《` & `》`)
- A hyphen (`-`) or hangul vowel *eu* (`ㅡ`) surrounded by spaces, or two/three
  consecutive hyphens (`--` or `---`) into a proper em dash (`—`)
- A less-than inequality symbol followed by a hyphen or an equality symbol
  (`<-`, `<=`) into arrows to the left (`←`, `⇐`)
- A hyphen or an equality symbol followed by a greater-than inequality symbol
  (`->`, `=>`) into arrows to the right (`→`, `⇒`)
- A hyphen or an equality symbol wrapped by inequality symbols (`<->`, `<=>`)
  into bi-directional arrows (`↔`, `⇔`)

Each transformation can be turned on or off individually, and some
transformations have many options.
All transformations work with both plain text and rich text trees.
In a similar way to SmartyPants, it does not modify characters within several
sensitive HTML elements like `<pre>`/`<code>`/`<script>`/`<kbd>`.
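
To make the punctuation rules listed above more concrete, here is a minimal Python sketch that applies a few of them to plain text with ordered regular expressions. It is not Seonbi's implementation and does not use Seonbi's library, CLI, or HTTP API: the names `RULES` and `sketch_transform` are hypothetical, and the sketch deliberately omits hanja conversion, curly quotes, HTML parsing, and the skipping of sensitive elements.

```python
import re

# Hypothetical illustration only: an ordered list of (pattern, replacement)
# pairs. Order matters: ellipses are handled before stops, and arrows before
# angle quotes and dashes, so that longer sequences are not consumed by the
# shorter rules that follow them.
RULES = [
    (re.compile(r"\.\.\.|。。。"), "…"),          # three periods -> ellipsis
    (re.compile(r"<->"), "↔"),                    # bi-directional arrows
    (re.compile(r"<=>"), "⇔"),
    (re.compile(r"<-"), "←"),                     # arrows to the left
    (re.compile(r"<="), "⇐"),
    (re.compile(r"->"), "→"),                     # arrows to the right
    (re.compile(r"=>"), "⇒"),
    (re.compile(r"<<([^<>]+)>>"), r"《\1》"),      # double angle quotes
    (re.compile(r"<([^<>]+)>"), r"〈\1〉"),        # single angle quotes
    (re.compile(r"-{2,3}"), "—"),                 # two/three hyphens -> em dash
    (re.compile(r"(?<= )[-ㅡ](?= )"), "—"),       # spaced hyphen or eu -> em dash
    (re.compile(r"。"), "."),                     # classical stops -> modern
    (re.compile(r"、"), ","),
]


def sketch_transform(text: str) -> str:
    """Apply each rule in order to a plain-text string."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(sketch_transform("<<토지>> ... <토지>"))
    print(sketch_transform("가 - 나 -> 다"))
```

Because the rules run on raw strings, this sketch would also rewrite real HTML tags such as `<p>`; a tool that works on HTML has to operate on parsed text nodes instead, which is outside the scope of this example.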