Copyright	© 2018–present Mark Karpov
License	BSD 3 clause
Maintainer	Mark Karpov <markkarpov92@gmail.com>
Stability	experimental
Portability	portable
Safe Haskell	Safe-Inferred
Language	GHC2021

GHC.SyntaxHighlighter

Description

This module decomposes a Text value containing Haskell source code into a list of Text chunks, each tagged with a Token.

This library uses GHC's own lexer, so the result is guaranteed to match what GHC itself would produce when lexing the same input.

Synopsis

data Token
  = KeywordTok
  | PragmaTok
  | SymbolTok
  | VariableTok
  | ConstructorTok
  | OperatorTok
  | CharTok
  | StringTok
  | IntegerTok
  | RationalTok
  | CommentTok
  | SpaceTok
  | OtherTok
data Loc = Loc !Int !Int !Int !Int
tokenizeHaskell :: Text -> Maybe [(Token, Text)]
tokenizeHaskellLoc :: Text -> Maybe [(Token, Loc)]

Documentation

data Token

Token types that are used as tags to mark spans of source code.

Constructors

KeywordTok	Keyword
PragmaTok	Pragmas
SymbolTok	Symbols (punctuation that is not an operator)
VariableTok	Variable name (term level)
ConstructorTok	Data/type constructor
OperatorTok	Operator
CharTok	Character
StringTok	String
IntegerTok	Integer
RationalTok	Rational number
CommentTok	Comment (including Haddocks)
SpaceTok	Space filling
OtherTok	Something else?
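
As an illustration of how these tags might be consumed, the sketch below maps each constructor to a CSS class name. The function name `tokenClass` and the class names themselves are hypothetical and not part of the library; only the `Token` constructors come from this module.

```haskell
import GHC.SyntaxHighlighter (Token (..))

-- | Hypothetical CSS class for each token tag. The class names are
-- illustrative only; the library does not define them.
tokenClass :: Token -> Maybe String
tokenClass tok = case tok of
  KeywordTok -> Just "kw"
  PragmaTok -> Just "pr"
  SymbolTok -> Just "sy"
  VariableTok -> Just "va"
  ConstructorTok -> Just "cr"
  OperatorTok -> Just "op"
  CharTok -> Just "ch"
  StringTok -> Just "st"
  IntegerTok -> Just "it"
  RationalTok -> Just "ra"
  CommentTok -> Just "co"
  SpaceTok -> Nothing -- whitespace needs no styling
  OtherTok -> Nothing -- unclassified input is left unstyled
```

Matching on every constructor explicitly, rather than using a wildcard, lets the compiler's incomplete-pattern warning flag the function if a new constructor is ever added.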

Instances

Bounded Token
Defined in GHC.SyntaxHighlighter. Methods: minBound :: Token; maxBound :: Token
Enum Token
Defined in GHC.SyntaxHighlighter. Methods: succ :: Token -> Token; pred :: Token -> Token; toEnum :: Int -> Token; fromEnum :: Token -> Int; enumFrom :: Token -> [Token]; enumFromThen :: Token -> Token -> [Token]; enumFromTo :: Token -> Token -> [Token]; enumFromThenTo :: Token -> Token -> Token -> [Token]
Show Token
Defined in GHC.SyntaxHighlighter. Methods: showsPrec :: Int -> Token -> ShowS; show :: Token -> String; showList :: [Token] -> ShowS
Eq Token
Defined in GHC.SyntaxHighlighter. Methods: (==) :: Token -> Token -> Bool; (/=) :: Token -> Token -> Bool
Ord Token
Defined in GHC.SyntaxHighlighter. Methods: compare :: Token -> Token -> Ordering; (<) :: Token -> Token -> Bool; (<=) :: Token -> Token -> Bool; (>) :: Token -> Token -> Bool; (>=) :: Token -> Token -> Bool; max :: Token -> Token -> Token; min :: Token -> Token -> Token
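
The Bounded, Enum, and Ord instances make it possible to enumerate every tag and to use Token as a map key. The following is a minimal sketch of that idea; `allTokens` and `tokenHistogram` are hypothetical helper names, and `Data.Map.Strict` comes from the containers package rather than from this library.

```haskell
import qualified Data.Map.Strict as Map
import GHC.SyntaxHighlighter (Token (..))

-- | Every token tag, listed via the Bounded and Enum instances.
allTokens :: [Token]
allTokens = [minBound .. maxBound]

-- | Count how often each tag occurs; tags that never occur map to 0.
-- The Ord instance is what allows Token to be used as a Map key.
tokenHistogram :: [(Token, a)] -> Map.Map Token Int
tokenHistogram toks = Map.unionWith (+) counted zeros
  where
    counted = Map.fromListWith (+) [(t, 1) | (t, _) <- toks]
    zeros = Map.fromList [(t, 0) | t <- allTokens]
```

The second component of each pair is left polymorphic so the same function accepts the output of either tokenizer once it has been unwrapped from Maybe.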

data Loc

The start and end positions of a span. The arguments of the data constructor are, in order:

Line number of start position of a span
Column number of start position of a span
Line number of end position of a span
Column number of end position of a span

Since: 0.0.2.0

Constructors

Loc !Int !Int !Int !Int
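
Because the constructor is positional, pattern matching is the natural way to read the four fields. The helpers below are a hypothetical sketch (`isSingleLine` and `spannedLines` are not part of the library) and deliberately avoid assuming whether line and column numbering starts at 0 or 1, since that is not stated here.

```haskell
import GHC.SyntaxHighlighter (Loc (..))

-- | True when the span starts and ends on the same line.
isSingleLine :: Loc -> Bool
isSingleLine (Loc startLine _ endLine _) = startLine == endLine

-- | Number of source lines between the start and end line, inclusive.
spannedLines :: Loc -> Int
spannedLines (Loc startLine _ endLine _) = endLine - startLine + 1
```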

Instances

Show Loc
Defined in GHC.SyntaxHighlighter. Methods: showsPrec :: Int -> Loc -> ShowS; show :: Loc -> String; showList :: [Loc] -> ShowS
Eq Loc
Defined in GHC.SyntaxHighlighter. Methods: (==) :: Loc -> Loc -> Bool; (/=) :: Loc -> Loc -> Bool
Ord Loc
Defined in GHC.SyntaxHighlighter. Methods: compare :: Loc -> Loc -> Ordering; (<) :: Loc -> Loc -> Bool; (<=) :: Loc -> Loc -> Bool; (>) :: Loc -> Loc -> Bool; (>=) :: Loc -> Loc -> Bool; max :: Loc -> Loc -> Loc; min :: Loc -> Loc -> Loc

tokenizeHaskell :: Text -> Maybe [(Token, Text)]

Tokenize Haskell source code. If the code cannot be lexed, return Nothing. Otherwise return the original input split into chunks, each tagged with a Token. In practice Nothing is rarely, if ever, returned, because the lexer appears capable of interpreting almost any text as a stream of GHC tokens.

The input does not need to form a valid Haskell program. As long as the lexer can decompose it, which is the case most of the time, the result is returned in Just.
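
A minimal usage sketch follows, assuming (as the description above suggests) that the returned chunks together cover the original input. It removes comments by dropping every chunk tagged CommentTok. The names `stripComments` and `stripCommentsOrKeep` are hypothetical helpers, not library functions.

```haskell
import Data.Maybe (fromMaybe)
import Data.Text (Text)
import qualified Data.Text as T
import GHC.SyntaxHighlighter (Token (..), tokenizeHaskell)

-- | Drop every chunk tagged CommentTok and glue the rest back together.
-- Returns Nothing when the lexer rejects the input.
stripComments :: Text -> Maybe Text
stripComments input = do
  chunks <- tokenizeHaskell input
  pure (T.concat [chunk | (tok, chunk) <- chunks, tok /= CommentTok])

-- | Same as above, but fall back to the unmodified input on failure.
stripCommentsOrKeep :: Text -> Text
stripCommentsOrKeep input = fromMaybe input (stripComments input)
```

Handling the Nothing case explicitly, even though it is rare, keeps callers from silently losing input when the lexer does give up.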

tokenizeHaskellLoc :: Text -> Maybe [(Token, Loc)]

Similar to tokenizeHaskell, but instead of Text chunks it provides the locations of the corresponding spans in the given input stream.

Since: 0.0.2.0
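
The location-based variant is convenient when the caller wants positions rather than text, for example to report where comments occur or to select the tokens on one line. The sketch below shows both; `tokensOnLine` and `commentLocs` are hypothetical names introduced for illustration.

```haskell
import Data.Text (Text)
import GHC.SyntaxHighlighter (Loc (..), Token (..), tokenizeHaskellLoc)

-- | Keep only the tokens whose span includes the given line number.
tokensOnLine :: Int -> [(Token, Loc)] -> [(Token, Loc)]
tokensOnLine line toks =
  [ (tok, loc)
  | (tok, loc@(Loc startLine _ endLine _)) <- toks
  , startLine <= line
  , line <= endLine
  ]

-- | Locations of all comments in the input, or Nothing if lexing fails.
commentLocs :: Text -> Maybe [Loc]
commentLocs input = do
  toks <- tokenizeHaskellLoc input
  pure [loc | (CommentTok, loc) <- toks]
```

The line number passed to `tokensOnLine` must use the same numbering convention as the Loc values produced by the library.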