inchworm: Simple parser combinators for lexical analysis.

This is a package candidate release! Here you can preview how this package release will appear once published to the main package index (which can be accomplished via the 'maintain' link below). Please note that once a package has been published to the main package index it cannot be undone! Please consult the package uploading documentation for more information.

[maintain]

Parser combinator framework specialized to lexical analysis. Tokens are specified via simple fold functions, and we include baked in source location handling. Comes with matchers for standard lexemes like integers, comments, and Haskell style strings with escape handling. No dependencies other than the Haskell base library. If you want to parse expressions instead of tokens then try try the parsec or attoparsec packages, which have more general purpose combinators.


[Skip to ReadMe]

Properties

Versions1.0.0.1, 1.0.1.1, 1.0.2.1, 1.0.2.2, 1.0.2.3, 1.0.2.4, 1.1.1.1, 1.1.1.2, 1.1.1.2
Change logChangelog.md
Dependenciesbase (>=4.8 && <4.13) [details]
LicenseMIT
AuthorThe Inchworm Development Team
MaintainerBen Lippmeier <benl@ouroborus.net>
CategoryParsing
Home pagehttps://github.com/discus-lang/inchworm
Source repositoryhead: git clone https://github.com/discus-lang/inchworm.git
UploadedWed Jan 2 03:41:53 UTC 2019 by BenLippmeier

Modules

Downloads

Maintainers' corner

For package maintainers and hackage trustees


Readme for inchworm-1.1.1.2

[back to package description]

Inchworm

Inchworm is a simple parser combinator framework specialized to lexical analysis. Tokens are specified via simple fold functions, and we include baked in source location handling.

Matchers for standard tokens like comments and strings are in the Text.Lexer.Inchworm.Char module.

No dependencies other than the Haskell base library.

If you want to parse expressions instead of performing lexical analysis then try the parsec or attoparsec packages, which have more general purpose combinators.

Minimal example

The following code demonstrates how to perform lexical analysis of a simple LISP-like language. We use two separate name classes, one for variables that start with a lower-case letter, and one for constructors that start with an upper case letter.

Integers are scanned using the scanInteger function from the Text.Lexer.Inchworm.Char module.

The result of scanStringIO contains the list of leftover input characters that could not be parsed. In a real lexer you should check that this is empty to ensure there has not been a lexical error.

import Text.Lexer.Inchworm.Char
import qualified Data.Char as Char

-- | A source token.
data Token 
        = KBra | KKet | KVar String | KCon String | KInt Integer
        deriving Show

-- | A thing with attached location information.
data Located a
        = Located FilePath (Range Location) a
        deriving Show

-- | Scanner for a lispy language.
scanner :: FilePath
        -> Scanner IO Location [Char] (Located Token)
scanner fileName
 = skip Char.isSpace
 $ alts [ fmap (stamp id)   $ accept '(' KBra
        , fmap (stamp id)   $ accept ')' KKet
        , fmap (stamp KInt) $ scanInteger 
        , fmap (stamp KVar)
          $ munchWord (\ix c -> if ix == 0 then Char.isLower c
                                           else Char.isAlpha c) 
        , fmap (stamp KCon) 
          $ munchWord (\ix c -> if ix == 0 then Char.isUpper c
                                           else Char.isAlpha c)
        ]
 where  -- Stamp a token with source location information.
        stamp k (range, t) 
          = Located fileName range (k t)

main :: IO ()
main 
 = do   let fileName = "Source.lispy"
        let source   = "(some (Lispy like) 26 Program 93 (for you))"
        let toks     = scanString source (scanner fileName)
        print toks