Copyright | © Herbert Valerio Riedel 2015-2018 |
---|---|
License | GPL-2.0-or-later |
Safe Haskell | Safe |
Language | Haskell2010 |
Event-stream oriented YAML parsing and serializing API
Synopsis
- parseEvents :: ByteString -> EvStream
- writeEvents :: Encoding -> [Event] -> ByteString
- writeEventsText :: [Event] -> Text
- type EvStream = [Either (Pos, String) EvPos]
- data Event
- = StreamStart
- | StreamEnd
- | DocumentStart !Directives
- | DocumentEnd !Bool
- | Comment !Text
- | Alias !Anchor
- | Scalar !(Maybe Anchor) !Tag !ScalarStyle !Text
- | SequenceStart !(Maybe Anchor) !Tag !NodeStyle
- | SequenceEnd
- | MappingStart !(Maybe Anchor) !Tag !NodeStyle
- | MappingEnd
- data EvPos = EvPos {}
- data Directives
- data ScalarStyle
- = Plain
- | SingleQuoted
- | DoubleQuoted
- | Literal !Chomp !IndentOfs
- | Folded !Chomp !IndentOfs
- data NodeStyle
- data Chomp
- data IndentOfs
- data Tag
- untagged :: Tag
- isUntagged :: Tag -> Bool
- tagToText :: Tag -> Maybe Text
- mkTag :: String -> Tag
- type Anchor = Text
- data Pos = Pos {
- posByteOffset :: !Int
- posCharOffset :: !Int
- posLine :: !Int
- posColumn :: !Int
Tutorial
Data.YAML module provides us with API which allow us to interact with YAML data at the cost of some presentation details.
In contrast, this module provide us with API which gives us access to a other significant details like ScalarStyle
s, NodeStyle
s, Comment
s, etc.
Parsing YAML Documents
Suppose you want to parse this YAML Document while preserving its format and comments
# Home runs hr: 65 # Runs Batted In rbi: 147
then you might want to use the function parseEvents
.
The following is a reference implementation of a function using parseEvents
.
It takes a YAML document as input and prints the parsed YAML Event
s.
import Data.YAML.Event import qualified Data.ByteString.Lazy as BS.L printEvents :: BS.L.ByteString -> IO () printEvents input = forM_ (parseEvents
input) $ ev -> case ev of Left _ -> error "Failed to parse" Right event -> print (eEvent
event)
When we pass the above mentioned YAML document to the function printEvents it outputs the following:
StreamStart DocumentStart NoDirEndMarker MappingStart Nothing Nothing Block Comment " Home runs" Scalar Nothing Nothing Plain "hr" Scalar Nothing Nothing Plain "65" Comment " Runs Batted In" Scalar Nothing Nothing Plain "rbi" Scalar Nothing Nothing Plain "147" MappingEnd DocumentEnd False StreamEnd
Notice that now we have all the necessary details in the form of Event
s.
We can now write simple functions to work with this data without losing any more details.
parseEvents :: ByteString -> EvStream Source #
Parse YAML Event
s from a lazy ByteString
.
The parsed Events allow us to round-trip at the event-level while preserving many features and presentation details like
Comment
s,ScalarStyle
,NodeStyle
, Anchor
s, Directives
marker along with YAML document version,
Chomp
ing Indicator,Indentation Indicator (IndentOfs
) ,ordering, etc.
It does not preserve non-content white spaces.
The input ByteString
is expected to have a YAML 1.2 stream
using the UTF-8, UTF-16 (LE or BE), or UTF-32 (LE or BE) encodings
(which will be auto-detected).
Serializing Events to YAML Character Stream
Now, suppose we want to generate back the YAML document after playing with the Event-stream,
then you might want to use writeEvents
.
The following function takes a YAML document as a input and dumps it back to STDOUT after a round-trip.
import Data.YAML.Event import qualified Data.YAML.Token as YT import qualified Data.ByteString.Lazy as BS.L yaml2yaml :: BS.L.ByteString -> IO () yaml2yaml input = case sequence $ parseEvents input of Left _ -> error "Parsing Failure!" Right events -> do BS.L.hPutStr stdout (writeEvents YT.UTF8 (map eEvent events)) hFlush stdout
Let this be the sample document passed to the above function
# This is aDirectives
Marker --- # AllComment
s are preserved date : 2019-07-12 bill-to : #Anchor
represents a map node &id001 address: lines: # This a BlockScalar
withKeep
chomping Indicator andIndentAuto
Indentant indicator |+ # Extra Indentation (non-content white space) will not be preserved Vijay IIT Hyderabad # Trailing newlines are a preserved here as they are a part of theScalar
node country : India ship-to : # This is anAlias
*id001 # Key is aScalar
and Value is a Sequence Other Details: total: $ 3000 #Tag
s are also preserved Online Payment: !!bool True product: - Item1 # This comment is inside a Sequence - Item2 ... #DocumentEnd
True #StreamEnd
This function outputs the following
# This is aDirectives
Marker --- # AllComment
s are preserved date: 2019-07-12 bill-to: #Anchor
represents a map node &id001 address: lines: # This a BlockScalar
withKeep
chomping Indicator andIndentAuto
Indentant indicator # Extra Indentation (non-content white space) will not be preserved |+ Vijay IIT Hyderabad # Trailing newlines are a preserved here as they are a part of theScalar
node country: India ship-to: # This is anAlias
*id001 # Key is aScalar
and Value is a Sequence Other Details: total: $ 3000 #Tag
s are also preserved Online Payment: !!bool True product: - Item1 # This comment is inside a Sequence - Item2 ... #DocumentEnd
True #StreamEnd
writeEvents :: Encoding -> [Event] -> ByteString Source #
Serialise Event
s using specified UTF encoding to a lazy ByteString
NOTE: This function is only well-defined for valid Event
streams
Since: 0.2.0.0
writeEventsText :: [Event] -> Text Source #
How to comment your yaml document for best results
Round-tripping at event-level will preserve all the comments and their relative position in the YAML-document but still, we lose some information like the exact indentation and the position at which the comments were present previously. This information sometimes can be quiet important for human-perception of comments. Below are some guildlines, so that you can avoid ambiguities.
1) Always try to start your comment in a newline. This step will avoid most of the ambiguities.
2) Comments automaticly get indented according to the level in which they are present. For example,
Input YAML-document
# Level 0 - a # Level 0 - - a # Level 1 - a - - a # Level 2 - a
After a round-trip looks like
# Level 0 - a # Level 0 - - a # Level 1 - a - - a # Level 2 - a
3) Comments immediately after a Scalar
node, start from a newline. So avoid commenting just after a scalar ends, as it may lead to some ambiguity. For example,
Input YAML-document
- scalar # After scalar - random : scalar # After scalar key: 1 # not after scalar - random : scalar key: 1 - random : # not after scalar scalar # not after scalar key: 1
After a round-trip looks like
- scalar # After scalar - random: scalar # After scalar key: 1 # not after scalar - random: scalar key: 1 - random: # not after scalar scalar # not after scalar key: 1
4) Similarly in flow-style, avoid commenting immediately after a comma (,
) seperator. Comments immediately after a comma (,
) seperator will start from a new line
Input YAML-document
{ # comment 0 Name: Vijay # comment 1 , # comment 2 age: 19, # comment 3 # comment 4 country: India # comment 5 }
After a round-trip looks like
{ # comment 0 Name: Vijay, # comment 1 # comment 2 age: 19, # comment 3 # comment 4 country: India, # comment 5 }
5) Avoid commenting in between syntatical elements. For example,
Input YAML-document
? # Complex key starts [ a, b ] # Complex key ends : # Complex Value starts ? # Complex key starts [ a, b ] # Complex key ends : # Simple value a # Complex value ends
After a round-trip looks like
? # Complex key starts [ a, b ] : # Complex key ends # Complex Value starts ? # Complex key starts [ a, b ] : # Complex key ends # Simple value a # Complex value ends
The above two YAML-documents, after parsing produce the same Event
-stream.
So, these are some limitation of this Format-preserving YAML processor.
Event-stream Internals
type EvStream = [Either (Pos, String) EvPos] Source #
Event stream produced by parseEvents
A Left
value denotes parsing errors. The event stream ends
immediately once a Left
value is returned.
YAML Event Types
The events correspond to the ones from LibYAML
The grammar below defines well-formed streams of Event
s:
stream ::=StreamStart
document*StreamEnd
document ::=DocumentStart
nodeDocumentEnd
node ::=Alias
|Scalar
|Comment
| sequence | mapping sequence ::=SequenceStart
node*SequenceEnd
mapping ::=MappingStart
(node node)*MappingEnd
Since: 0.2.0
Instances
Event
with corresponding Pos in YAML stream
Since: 0.2.0
Instances
Eq EvPos Source # | |
Show EvPos Source # | |
Generic EvPos Source # | |
NFData EvPos Source # | Since: 0.2.0 |
Defined in Data.YAML.Event.Internal | |
type Rep EvPos Source # | |
Defined in Data.YAML.Event.Internal type Rep EvPos = D1 (MetaData "EvPos" "Data.YAML.Event.Internal" "HsYAML-0.2.0.0-Lc2Xi1g9LKMImTc3Ihoid7" False) (C1 (MetaCons "EvPos" PrefixI True) (S1 (MetaSel (Just "eEvent") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Event) :*: S1 (MetaSel (Just "ePos") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Pos))) |
data Directives Source #
Encodes document %YAML
directives and the directives end-marker
Since: 0.2.0
NoDirEndMarker | no directives and also no |
DirEndMarkerNoVersion |
|
DirEndMarkerVersion !Word |
|
Instances
data ScalarStyle Source #
Scalar
-specific node style
This can be considered a more granular superset of NodeStyle
.
See also scalarNodeStyle
.
Since: 0.2.0
Instances
Node style
Since: 0.2.0
Since: 0.2.0
Strip | Remove all trailing line breaks and shows the presence of |
Clip | Keep first trailing line break; this also the default behavior used if no explicit chomping indicator is specified. |
Keep | Keep all trailing line breaks and shows the presence of |
Block Indentation Indicator
IndentAuto
is the special case for auto Block Indentation Indicator
Since: 0.2.0
IndentAuto | |
IndentOfs1 | |
IndentOfs2 | |
IndentOfs3 | |
IndentOfs4 | |
IndentOfs5 | |
IndentOfs6 | |
IndentOfs7 | |
IndentOfs8 | |
IndentOfs9 |
Instances
YAML Tags
Position in parsed YAML source
NOTE: if posCharOffset
is negative the Pos
value doesn't refer to a proper location; this may be emitted in corner cases when no proper location can be inferred.
Pos | |
|
Instances
Eq Pos Source # | |
Show Pos Source # | |
Generic Pos Source # | |
NFData Pos Source # | Since: 0.2.0 |
Defined in Data.YAML.Event.Internal | |
type Rep Pos Source # | |
Defined in Data.YAML.Event.Internal type Rep Pos = D1 (MetaData "Pos" "Data.YAML.Event.Internal" "HsYAML-0.2.0.0-Lc2Xi1g9LKMImTc3Ihoid7" False) (C1 (MetaCons "Pos" PrefixI True) ((S1 (MetaSel (Just "posByteOffset") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int) :*: S1 (MetaSel (Just "posCharOffset") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int)) :*: (S1 (MetaSel (Just "posLine") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int) :*: S1 (MetaSel (Just "posColumn") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int)))) |