| Copyright | © Herbert Valerio Riedel 2015-2018 |
|---|---|
| License | GPL-2.0-or-later |
| Safe Haskell | Safe |
| Language | Haskell2010 |
Data.YAML.Event
Description
Event-stream oriented YAML parsing and serializing API
Synopsis
- parseEvents :: ByteString -> EvStream
- writeEvents :: Encoding -> [Event] -> ByteString
- writeEventsText :: [Event] -> Text
- type EvStream = [Either (Pos, String) EvPos]
- data Event
- = StreamStart
- | StreamEnd
- | DocumentStart !Directives
- | DocumentEnd !Bool
- | Comment !Text
- | Alias !Anchor
- | Scalar !(Maybe Anchor) !Tag !ScalarStyle !Text
- | SequenceStart !(Maybe Anchor) !Tag !NodeStyle
- | SequenceEnd
- | MappingStart !(Maybe Anchor) !Tag !NodeStyle
- | MappingEnd
- data EvPos = EvPos {}
- data Directives
- data ScalarStyle
- = Plain
- | SingleQuoted
- | DoubleQuoted
- | Literal !Chomp !IndentOfs
- | Folded !Chomp !IndentOfs
- data NodeStyle
- data Chomp
- data IndentOfs
- data Tag
- untagged :: Tag
- isUntagged :: Tag -> Bool
- tagToText :: Tag -> Maybe Text
- mkTag :: String -> Tag
- type Anchor = Text
- data Pos = Pos {
- posByteOffset :: !Int
- posCharOffset :: !Int
- posLine :: !Int
- posColumn :: !Int
- }
Tutorial
The Data.YAML module provides an API for interacting with YAML data at the cost of some presentation details.
In contrast, this module provides an API that gives access to other significant details such as ScalarStyles, NodeStyles, Comments, etc.
Parsing YAML Documents
Suppose you want to parse this YAML Document while preserving its format and comments
# Home runs
hr: 65
# Runs Batted In
rbi: 147
then you might want to use the function parseEvents.
The following is a reference implementation of a function using parseEvents.
It takes a YAML document as input and prints the parsed YAML Events.
import Data.YAML.Event
import qualified Data.ByteString.Lazy as BS.L
import Control.Monad (forM_)
-- Print every parsed Event, aborting on the first parse error.
printEvents :: BS.L.ByteString -> IO ()
printEvents input =
  forM_ (parseEvents input) $ \ev -> case ev of
    Left _ -> error "Failed to parse"
    Right event -> print (eEvent event)
When we pass the above mentioned YAML document to the function printEvents it outputs the following:
StreamStart
DocumentStart NoDirEndMarker
MappingStart Nothing Nothing Block
Comment " Home runs"
Scalar Nothing Nothing Plain "hr"
Scalar Nothing Nothing Plain "65"
Comment " Runs Batted In"
Scalar Nothing Nothing Plain "rbi"
Scalar Nothing Nothing Plain "147"
MappingEnd
DocumentEnd False
StreamEnd
Notice that now we have all the necessary details in the form of Events.
We can now write simple functions to work with this data without losing any more details.
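As an illustration of such a simple function, the following sketch collects the text of every comment in a document. The name collectComments is hypothetical and not part of the library; it assumes only parseEvents, eEvent and the Comment constructor listed in the synopsis.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.YAML.Event
import qualified Data.ByteString.Lazy as BS.L
import Data.Text (Text)

-- | Hypothetical helper: collect the text of every comment in a YAML
-- stream, in document order. On a parse error the parser's message is
-- returned instead.
collectComments :: BS.L.ByteString -> Either String [Text]
collectComments input =
  case sequence (parseEvents input) of
    Left (_pos, msg) -> Left msg
    Right evs        -> Right [ c | Comment c <- map eEvent evs ]
```

Note that sequence forces the whole event stream before any comment is returned, so this sketch trades laziness for simpler error handling.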
parseEvents :: ByteString -> EvStream
Parse YAML Events from a lazy ByteString.
The parsed Events allow us to round-trip at the event level while preserving many features and presentation details such as
Comments, ScalarStyle, NodeStyle, Anchors, the Directives marker along with the YAML document version,
Chomping Indicator, Indentation Indicator (IndentOfs), ordering, etc.
It does not preserve non-content white spaces.
The input ByteString is expected to have a YAML 1.2 stream
using the UTF-8, UTF-16 (LE or BE), or UTF-32 (LE or BE) encodings
(which will be auto-detected).
Serializing Events to YAML Character Stream
Now suppose we want to generate the YAML document again after working with the Event stream;
then you might want to use writeEvents.
The following function takes a YAML document as input and dumps it back to STDOUT after a round-trip.
import Data.YAML.Event
import qualified Data.YAML.Token as YT
import qualified Data.ByteString.Lazy as BS.L
import System.IO (hFlush, stdout)
-- Parse the input into Events and serialise them straight back to STDOUT.
yaml2yaml :: BS.L.ByteString -> IO ()
yaml2yaml input = case sequence $ parseEvents input of
  Left _ -> error "Parsing Failure!"
  Right events -> do
    BS.L.hPutStr stdout (writeEvents YT.UTF8 (map eEvent events))
    hFlush stdout
Let this be the sample document passed to the above function
# This is a Directives Marker
---
# All Comments are preserved
date : 2019-07-12
bill-to : # Anchor represents a map node
  &id001
  address:
    lines: # This a Block Scalar with Keep chomping Indicator and IndentAuto Indentant indicator
      |+ # Extra Indentation (non-content white space) will not be preserved
            Vijay
            IIT Hyderabad

    # Trailing newlines are a preserved here as they are a part of the Scalar node
    country : India
ship-to : # This is an Alias
  *id001
# Key is a Scalar and Value is a Sequence
Other Details:
  total: $ 3000
  # Tags are also preserved
  Online Payment: !!bool True
  product:
    - Item1
    # This comment is inside a Sequence
    - Item2
...
# DocumentEnd True
# StreamEnd
This function outputs the following
# This is a Directives Marker
---
# All Comments are preserved
date: 2019-07-12
bill-to: # Anchor represents a map node
&id001
  address:
    lines: # This a Block Scalar with Keep chomping Indicator and IndentAuto Indentant indicator
    # Extra Indentation (non-content white space) will not be preserved
    |+
      Vijay
      IIT Hyderabad

    # Trailing newlines are a preserved here as they are a part of the Scalar node
    country: India
ship-to: # This is an Alias
*id001
# Key is a Scalar and Value is a Sequence
Other Details:
  total: $ 3000
  # Tags are also preserved
  Online Payment: !!bool True
  product:
  - Item1
  # This comment is inside a Sequence
  - Item2
...
# DocumentEnd True
# StreamEnd
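A round-trip does not have to be the identity. As a minimal sketch of working with the Event stream before writing it back, the hypothetical functions below (isComment and stripComments are illustrative names, not library API) drop every Comment event and serialise the rest as UTF-8.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.YAML.Event
import qualified Data.YAML.Token as YT
import qualified Data.ByteString.Lazy as BS.L

-- | True for Comment events only.
isComment :: Event -> Bool
isComment (Comment _) = True
isComment _           = False

-- | Hypothetical helper: parse a document, drop all comments, and
-- serialise the remaining events. A parse error is returned as-is.
stripComments :: BS.L.ByteString -> Either (Pos, String) BS.L.ByteString
stripComments input = do
  evs <- sequence (parseEvents input)
  pure (writeEvents YT.UTF8 (filter (not . isComment) (map eEvent evs)))
```

Removing Comment events should keep the stream valid for writeEvents, since the remaining events still form balanced documents, sequences and mappings.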
writeEvents :: Encoding -> [Event] -> ByteString
Serialise Events using specified UTF encoding to a lazy ByteString
NOTE: This function is only well-defined for valid Event streams
Since: 0.2.0.0
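Because serialisation is only well-defined for valid streams, it can help to see a small valid stream built by hand. The sketch below mirrors the shape of the parsed output shown in the tutorial; sampleEvents and rendered are hypothetical names, and the assumption that writeEventsText returns lazy Text (Data.Text.Lazy) should be checked against the installed version.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.YAML.Event
import qualified Data.Text.Lazy as T.L

-- | A hand-built stream: one document holding a single-entry block mapping.
sampleEvents :: [Event]
sampleEvents =
  [ StreamStart
  , DocumentStart NoDirEndMarker
  , MappingStart Nothing untagged Block
  , Scalar Nothing untagged Plain "hr"
  , Scalar Nothing untagged Plain "65"
  , MappingEnd
  , DocumentEnd False
  , StreamEnd
  ]

-- | Assumption: writeEventsText yields lazy Text.
rendered :: T.L.Text
rendered = writeEventsText sampleEvents
```

Every start event has its matching end event here, which is what makes the stream safe to pass to the serialiser.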
writeEventsText :: [Event] -> Text
How to comment your yaml document for best results
Round-tripping at the event level preserves all comments and their relative position in the YAML document, but some information is still lost, such as the exact indentation and the exact position at which each comment originally appeared. This information can sometimes be quite important for how humans perceive comments. Below are some guidelines to help you avoid ambiguities.
1) Always try to start your comment on a new line. This avoids most of the ambiguities.
2) Comments are automatically indented according to the level at which they appear. For example,
Input YAML-document
# Level 0
- a
# Level 0
- - a
# Level 1
  - a
  - - a
# Level 2
    - a
After a round-trip looks like
# Level 0
- a
# Level 0
- - a
  # Level 1
  - a
  - - a
    # Level 2
    - a
3) Comments immediately after a Scalar node start on a new line. So avoid commenting right after a scalar ends, as it may lead to some ambiguity. For example,
Input YAML-document
- scalar # After scalar
- random : scalar # After scalar
  key: 1
# not after scalar
- random : scalar
  key: 1
- random : # not after scalar
    scalar
  # not after scalar
  key: 1
After a round-trip looks like
- scalar
# After scalar
- random: scalar
  # After scalar
  key: 1
# not after scalar
- random: scalar
  key: 1
- random: # not after scalar
    scalar
  # not after scalar
  key: 1
4) Similarly, in flow style, avoid commenting immediately after a comma (,) separator. Comments immediately after a comma (,) separator will start on a new line.
Input YAML-document
{
# comment 0
Name: Vijay # comment 1
,
# comment 2
age: 19, # comment 3
# comment 4
country: India # comment 5
}
After a round-trip looks like
{
# comment 0
Name: Vijay,
# comment 1
# comment 2
age: 19,
# comment 3
# comment 4
country: India,
# comment 5
}
5) Avoid commenting in between syntactical elements. For example,
Input YAML-document
? # Complex key starts
  [
    a,
    b
  ]
# Complex key ends
: # Complex Value starts
  ? # Complex key starts
    [
      a,
      b
    ]
  # Complex key ends
  : # Simple value
    a
# Complex value ends
After a round-trip looks like
? # Complex key starts
  [
    a,
    b
  ]
: # Complex key ends
  # Complex Value starts
  ? # Complex key starts
    [
      a,
      b
    ]
  : # Complex key ends
    # Simple value
    a
# Complex value ends
The above two YAML documents produce the same Event stream after parsing.
So, these are some limitations of this format-preserving YAML processor.
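Whether two documents collapse to the same Event stream can be checked directly. The following is a minimal sketch with hypothetical helper names (eventsOf, sameEvents); it assumes Event has an Eq instance and compares only the events, ignoring their positions.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.YAML.Event
import qualified Data.ByteString.Lazy as BS.L

-- | Parse a document into plain Events, discarding positions.
-- Returns Nothing on a parse error.
eventsOf :: BS.L.ByteString -> Maybe [Event]
eventsOf input = case sequence (parseEvents input) of
  Left _    -> Nothing
  Right evs -> Just (map eEvent evs)

-- | True when both inputs parse successfully to identical Event lists.
sameEvents :: BS.L.ByteString -> BS.L.ByteString -> Bool
sameEvents a b = case (eventsOf a, eventsOf b) of
  (Just xs, Just ys) -> xs == ys
  _                  -> False
```

A check like this is a cheap way to confirm that a reformatted or re-commented document has not changed at the event level.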
Event-stream Internals
type EvStream = [Either (Pos, String) EvPos]
Event stream produced by parseEvents
A Left value denotes parsing errors. The event stream ends
immediately once a Left value is returned.
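Since the stream stops at the first Left, a consumer can separate the events that parsed successfully from the error that ended the stream. A minimal sketch follows; splitAtError is a hypothetical name and not part of the library.

```haskell
import Data.YAML.Event

-- | Hypothetical helper: return the events seen before the first error,
-- together with that error if there was one. Anything after a Left is
-- ignored.
splitAtError :: EvStream -> ([EvPos], Maybe (Pos, String))
splitAtError []               = ([], Nothing)
splitAtError (Left err : _)   = ([], Just err)
splitAtError (Right ev : evs) =
  let (ok, err) = splitAtError evs
  in  (ev : ok, err)
```

Unlike sequence, this keeps the successfully parsed prefix, which can be useful for reporting how far parsing got.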
YAML Event Types
The events correspond to the ones from LibYAML
The grammar below defines well-formed streams of Events:
stream   ::= StreamStart document* StreamEnd
document ::= DocumentStart node DocumentEnd
node     ::= Alias
           | Scalar
           | Comment
           | sequence
           | mapping
sequence ::= SequenceStart node* SequenceEnd
mapping  ::= MappingStart (node node)* MappingEnd
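The grammar can be turned into a small recursive checker. The sketch below is illustrative only and makes one simplifying assumption that goes beyond the grammar as written: Comment events are filtered out first, because comments may appear between the key and value events of a mapping. All function names are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.YAML.Event

-- | Consume exactly one node, returning the remaining events on success.
node :: [Event] -> Maybe [Event]
node (Alias _ : rest)         = Just rest
node (Scalar{} : rest)        = Just rest
node (SequenceStart{} : rest) = sequenceBody rest
node (MappingStart{} : rest)  = mappingBody rest
node _                        = Nothing

-- | node* SequenceEnd
sequenceBody :: [Event] -> Maybe [Event]
sequenceBody (SequenceEnd : rest) = Just rest
sequenceBody evs                  = node evs >>= sequenceBody

-- | (node node)* MappingEnd
mappingBody :: [Event] -> Maybe [Event]
mappingBody (MappingEnd : rest) = Just rest
mappingBody evs                 = node evs >>= node >>= mappingBody

-- | document* StreamEnd
documents :: [Event] -> Bool
documents [StreamEnd]              = True
documents (DocumentStart _ : evs) = case node evs of
  Just (DocumentEnd _ : rest) -> documents rest
  _                           -> False
documents _                        = False

-- | Check a stream against the grammar, ignoring Comment events
-- (an assumption of this sketch, not a statement about the library).
isWellFormed :: [Event] -> Bool
isWellFormed evs = case filter (not . isComment) evs of
  StreamStart : rest -> documents rest
  _                  -> False
  where
    isComment (Comment _) = True
    isComment _           = False
```

A check of this kind could be run before calling writeEvents on a hand-built or modified stream.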
Since: 0.2.0
Constructors
Instances
Event with corresponding Pos in YAML stream
Since: 0.2.0
Instances
| Instance | Details |
|---|---|
| Eq EvPos | |
| Show EvPos | |
| Generic EvPos | |
| NFData EvPos | Since: 0.2.0. Defined in Data.YAML.Event.Internal |
| type Rep EvPos | Defined in Data.YAML.Event.Internal. type Rep EvPos = D1 (MetaData "EvPos" "Data.YAML.Event.Internal" "HsYAML-0.2.1.0-3n04Fs6nDN3CMPVb0vyYIF" False) (C1 (MetaCons "EvPos" PrefixI True) (S1 (MetaSel (Just "eEvent") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Event) :*: S1 (MetaSel (Just "ePos") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Pos))) |
data Directives
Encodes document %YAML directives and the directives end-marker
Since: 0.2.0
Constructors
| Constructor | Description |
|---|---|
| NoDirEndMarker | no directives and also no --- marker |
| DirEndMarkerNoVersion | |
| DirEndMarkerVersion !Word | |
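A short sketch of inspecting a Directives value follows. Both function names are hypothetical, and yamlVersion rests on an assumption that should be verified against the library: that the Word field of DirEndMarkerVersion holds the minor version of a %YAML 1.x directive.

```haskell
import Data.YAML.Event

-- | True when the document started with an explicit directives end-marker.
hasDirEndMarker :: Directives -> Bool
hasDirEndMarker NoDirEndMarker = False
hasDirEndMarker _              = True

-- | Assumption: the Word is the minor version of a "%YAML 1.x" directive.
yamlVersion :: Directives -> Maybe String
yamlVersion (DirEndMarkerVersion minor) = Just ("1." ++ show minor)
yamlVersion _                           = Nothing
```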
Instances
| Instance | Details |
|---|---|
| Eq Directives | Defined in Data.YAML.Event.Internal |
| Show Directives | Defined in Data.YAML.Event.Internal. Methods: showsPrec :: Int -> Directives -> ShowS; show :: Directives -> String; showList :: [Directives] -> ShowS |
| Generic Directives | Defined in Data.YAML.Event.Internal. Associated Types: type Rep Directives :: Type -> Type |
| NFData Directives | Since: 0.2.0. Defined in Data.YAML.Event.Internal. Methods: rnf :: Directives -> () |
| type Rep Directives | Defined in Data.YAML.Event.Internal. type Rep Directives = D1 (MetaData "Directives" "Data.YAML.Event.Internal" "HsYAML-0.2.1.0-3n04Fs6nDN3CMPVb0vyYIF" False) (C1 (MetaCons "NoDirEndMarker" PrefixI False) (U1 :: Type -> Type) :+: (C1 (MetaCons "DirEndMarkerNoVersion" PrefixI False) (U1 :: Type -> Type) :+: C1 (MetaCons "DirEndMarkerVersion" PrefixI False) (S1 (MetaSel (Nothing :: Maybe Symbol) NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Word)))) |
data ScalarStyle
Scalar-specific node style
This can be considered a more granular superset of NodeStyle.
See also scalarNodeStyle.
Since: 0.2.0
Constructors
| Plain | |
| SingleQuoted | |
| DoubleQuoted | |
| Literal !Chomp !IndentOfs | |
| Folded !Chomp !IndentOfs |
Instances
Node style
Since: 0.2.0
Instances
| Instance | Details |
|---|---|
| Eq NodeStyle | |
| Ord NodeStyle | |
| Show NodeStyle | |
| Generic NodeStyle | |
| NFData NodeStyle | Since: 0.2.0. Defined in Data.YAML.Event.Internal |
| type Rep NodeStyle | |
Since: 0.2.0
Constructors
| Constructor | Description |
|---|---|
| Strip | Remove all trailing line breaks; denotes the presence of the "-" chomping indicator. |
| Clip | Keep the first trailing line break; this is also the default behavior used if no explicit chomping indicator is specified. |
| Keep | Keep all trailing line breaks; denotes the presence of the "+" chomping indicator. |
Block Indentation Indicator
IndentAuto is the special case for auto Block Indentation Indicator
Since: 0.2.0
Constructors
| IndentAuto | |
| IndentOfs1 | |
| IndentOfs2 | |
| IndentOfs3 | |
| IndentOfs4 | |
| IndentOfs5 | |
| IndentOfs6 | |
| IndentOfs7 | |
| IndentOfs8 | |
| IndentOfs9 |
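To make the relationship between these constructors and YAML syntax concrete, the sketch below renders the header of a block scalar from its ScalarStyle. The function names are hypothetical; the indicator characters ("|", ">", "-", "+", and the digits 1 to 9) come from the YAML 1.2 block scalar syntax, and placing the indentation indicator before the chomping indicator is just one of the two orders YAML allows.

```haskell
import Data.YAML.Event

-- | Chomping indicator as written in a block scalar header.
chompIndicator :: Chomp -> String
chompIndicator Strip = "-"
chompIndicator Clip  = ""   -- default behaviour, no explicit indicator
chompIndicator Keep  = "+"

-- | Indentation indicator; IndentAuto means none is written.
indentIndicator :: IndentOfs -> String
indentIndicator IndentAuto = ""
indentIndicator IndentOfs1 = "1"
indentIndicator IndentOfs2 = "2"
indentIndicator IndentOfs3 = "3"
indentIndicator IndentOfs4 = "4"
indentIndicator IndentOfs5 = "5"
indentIndicator IndentOfs6 = "6"
indentIndicator IndentOfs7 = "7"
indentIndicator IndentOfs8 = "8"
indentIndicator IndentOfs9 = "9"

-- | Header for block scalars; Nothing for plain and quoted styles.
blockHeader :: ScalarStyle -> Maybe String
blockHeader (Literal c i) = Just ("|" ++ indentIndicator i ++ chompIndicator c)
blockHeader (Folded c i)  = Just (">" ++ indentIndicator i ++ chompIndicator c)
blockHeader _             = Nothing
```

For instance, the "|+" header in the round-trip example above corresponds to Literal Keep IndentAuto.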
Instances
YAML Tags
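The tag functions listed in the synopsis (untagged, isUntagged, tagToText, mkTag) can be combined as in the following sketch. describeTag is a hypothetical helper, and the example assumes that mkTag stores the given string unchanged so that tagToText returns it as strict Text.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.YAML.Event
import Data.Maybe (fromMaybe)
import Data.Text (Text)

-- | Hypothetical helper: a printable form of a tag, with a placeholder
-- for nodes that carry no tag.
describeTag :: Tag -> Text
describeTag t = fromMaybe "<untagged>" (tagToText t)

-- | An explicit tag built from its full URI form.
boolTag :: Tag
boolTag = mkTag "tag:yaml.org,2002:bool"
```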
Position in parsed YAML source
See also prettyPosWithSource.
NOTE: if posCharOffset is negative the Pos value doesn't refer to a proper location; this may be emitted in corner cases when no proper location can be inferred.
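Code that reports positions may want to guard against this case. A minimal sketch, using hypothetical helper names and only the Pos fields listed in the synopsis:

```haskell
import Data.YAML.Event

-- | False when the Pos does not refer to a proper location.
isProperPos :: Pos -> Bool
isProperPos p = posCharOffset p >= 0

-- | Render a position as "line:column", or a placeholder when the
-- parser could not infer a location.
showPos :: Pos -> String
showPos p
  | isProperPos p = show (posLine p) ++ ":" ++ show (posColumn p)
  | otherwise     = "<unknown location>"
```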
Constructors
| Pos | |
Fields
| |
Instances
| Instance | Details |
|---|---|
| Eq Pos | |
| Show Pos | |
| Generic Pos | |
| NFData Pos | Since: 0.2.0. Defined in Data.YAML.Pos |
| type Rep Pos | Defined in Data.YAML.Pos. type Rep Pos = D1 (MetaData "Pos" "Data.YAML.Pos" "HsYAML-0.2.1.0-3n04Fs6nDN3CMPVb0vyYIF" False) (C1 (MetaCons "Pos" PrefixI True) ((S1 (MetaSel (Just "posByteOffset") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int) :*: S1 (MetaSel (Just "posCharOffset") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int)) :*: (S1 (MetaSel (Just "posLine") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int) :*: S1 (MetaSel (Just "posColumn") NoSourceUnpackedness SourceStrict DecidedStrict) (Rec0 Int)))) |