karps-0.2.0.0: Haskell bindings for Spark Dataframes and Datasets

Safe Haskell: None
Language: Haskell2010

Spark.Core.Internal.OpStructures

Contents

Description

A description of the operations that can be performed on nodes and columns.

Synopsis

Documentation

type SqlFunctionName = Text Source #

The name of a SQL function.

It is one of the predefined SQL functions available in Spark.

type UdafClassName = Text Source #

The classpath of a UDAF.

type OperatorName = Text Source #

The name of an operator defined in Karps.

data HdfsPath Source #

A path in the Hadoop File System (HDFS).

These paths are usually not created by the user directly.

Constructors

HdfsPath Text 

data DataInputStamp Source #

A stamp that defines some notion of uniqueness of the data source.

The general contract is that:

- stamps can be extracted fast (no need to scan the whole dataset)
- if the data gets changed, the stamp will change

Stamps are used to perform aggressive operation caching, so it is better to conservatively update stamps if one is unsure about the freshness of the dataset. For regular files, stamps are computed using the file system time stamps.
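As a sketch of this contract, a stamp for a regular file can be derived from its modification time. The `fileStamp` helper below is illustrative and not part of the Karps API; the real constructor wraps `Text`, simplified here to `String` to keep the example self-contained.

```haskell
import System.Directory (getModificationTime)

-- Stand-in for the real newtype, which wraps Text.
newtype DataInputStamp = DataInputStamp String
  deriving (Eq, Show)

-- Hypothetical helper: stamp a regular file by its modification time.
-- Fast to extract (no scan of the data), and changes whenever the file
-- is rewritten, as the contract above requires.
fileStamp :: FilePath -> IO DataInputStamp
fileStamp path = DataInputStamp . show <$> getModificationTime path
```

Note that an mtime-based stamp is conservative: it may change even when the bytes do not, which is the safe direction for cache invalidation.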

Constructors

DataInputStamp Text 

data TransformInvariant Source #

The invariant respected by a transform.

Depending on the value of the invariant, different optimizations may be available.

Constructors

Opaque

This operator has no special property. It may depend on the partitioning layout, the number of partitions, the order of elements in the partitions, etc. This sort of operator is unwelcome in Karps...

PartitioningInvariant

This operator respects the canonical partition order, but may not have the same number of elements. For example, this could be a flatMap on an RDD (filter, etc.). This operator can be used locally with the signature a -> [a]

DirectPartitioningInvariant

The strongest invariant. It respects the canonical partition order and it outputs the same number of elements. This is typically a map. This operator can be used locally with the signature a -> a
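The two partitioning-friendly invariants correspond to the local signatures mentioned above. A minimal sketch of what they mean on a single partition (illustrative, not the Karps API):

```haskell
-- DirectPartitioningInvariant: shape a -> a, e.g. a map. Preserves both
-- the canonical partition order and the number of elements.
applyDirect :: (a -> a) -> [a] -> [a]
applyDirect = map

-- PartitioningInvariant: shape a -> [a], e.g. a filter or flatMap.
-- Preserves the canonical order but may change the element count.
applyPartitioning :: (a -> [a]) -> [a] -> [a]
applyPartitioning = concatMap
```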

data Locality Source #

The dynamic value of locality. There is still a tag on it, but it can be easily dropped.

Constructors

Local

The data associated with this node is local. It can be materialized and accessed by the user.

Distributed

The data associated with this node is distributed or not accessible locally. It cannot be accessed by the user.

PHYSICAL OPERATORS

data StandardOperator Source #

An operator defined by default in the release of Karps. All other physical operators can be converted to standard operators.

data ScalaStaticFunctionApplication Source #

A Scala method of a singleton object.

data ColOp Source #

The different kinds of column operations that are understood by the backend.

These operations describe the physical operations on columns as supported by Spark SQL. They can operate column -> column, column -> row, or row -> row. Of course, not all operators are valid for each configuration.

Constructors

ColExtraction !FieldPath

A projection onto a single column. An extraction is always direct.

ColFunction !SqlFunctionName !(Vector ColOp)

A function of other columns. In this case, the other columns may matter. TODO(kps): indicate whether this function is partition-invariant; it should be the case most of the time.

ColLit !DataType !Value

A constant defined for each element. The type should be the same as that of the column. A literal is always direct.

ColStruct !(Vector TransformField)

A structure.
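To make the shape of these constructors concrete, here is a simplified, self-contained mirror of ColOp (field paths as `[String]`, the literal case specialized to `Int`, structs omitted; none of this is the actual Karps API) building a column expression such as plus(x, 1):

```haskell
-- Simplified mirror of the ColOp constructors documented above.
data ColOp
  = ColExtraction [String]      -- projection onto a single field path
  | ColFunction String [ColOp]  -- a SQL function of other columns
  | ColLitInt Int               -- stand-in for ColLit !DataType !Value
  deriving (Eq, Show)

-- The column expression x + 1, assembled as a tree of column operations:
plusOne :: ColOp
plusOne = ColFunction "plus" [ColExtraction ["x"], ColLitInt 1]
```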

Instances

Eq ColOp Source # 

Methods

(==) :: ColOp -> ColOp -> Bool #

(/=) :: ColOp -> ColOp -> Bool #

Show ColOp Source # 

Methods

showsPrec :: Int -> ColOp -> ShowS #

show :: ColOp -> String #

showList :: [ColOp] -> ShowS #

data UdafApplication Source #

When applying a UDAF, determines whether it should perform only the algebraic portion of the UDAF (initialize + update + merge), or also the final, non-algebraic step.

Constructors

Algebraic 
Complete 
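A classic illustration, assuming an averaging UDAF: the Algebraic application runs only initialize/update/merge over an intermediate buffer (which can be merged across partitions), while Complete also runs the final, non-algebraic evaluation. The names below are illustrative, not the Karps API.

```haskell
-- Intermediate buffer for an averaging UDAF: (running sum, running count).
type AvgBuffer = (Double, Double)

initialize :: AvgBuffer
initialize = (0, 0)

update :: AvgBuffer -> Double -> AvgBuffer
update (s, c) x = (s + x, c + 1)

-- merge is associative and commutative, so partial buffers can be
-- combined in any order across partitions: the algebraic portion.
merge :: AvgBuffer -> AvgBuffer -> AvgBuffer
merge (s1, c1) (s2, c2) = (s1 + s2, c1 + c2)

-- The final, non-algebraic step performed only by a Complete application.
evaluate :: AvgBuffer -> Double
evaluate (s, c) = s / c
```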

data AggField Source #

A field in the resulting aggregation transform.

Constructors

AggField 

Fields

data SemiGroupOperator Source #

The representation of a semi-group law in Spark.

This is the basic law used in universal aggregators. It is a function on observables that must respect the following laws:

f :: X -> X -> X

- commutative
- associative

A neutral element is not required for the semi-group laws. However, if used in the context of a universal aggregator, such an element implicitly exists and corresponds to the empty dataset.
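The two laws can be written down directly. The checkers below test them at given sample points (a property-based test would quantify over all inputs); they are a sketch, not part of this module.

```haskell
-- A semi-group operator f :: X -> X -> X must satisfy, for all x, y, z:
--   commutativity:  f x y == f y x
--   associativity:  f (f x y) z == f x (f y z)
commutative :: Eq a => (a -> a -> a) -> a -> a -> Bool
commutative f x y = f x y == f y x

associative :: Eq a => (a -> a -> a) -> a -> a -> a -> Bool
associative f x y z = f (f x y) z == f x (f y z)
```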

Constructors

OpaqueSemiGroupLaw !StandardOperator

A standard operator that happens to respect the semi-group laws.

UdafSemiGroupOperator !UdafClassName

The merging portion of a UDAF.

ColumnSemiGroupLaw !SqlFunctionName

A SQL operator that happens to respect the semi-group laws.

DATASET OPERATORS

OBSERVABLE OPERATORS

AGGREGATION OPERATORS

data Pointer Source #

A pointer to a node that is assumed to be already computed.

data NodeOp Source #

Constructors

NodeLocalOp StandardOperator

An operation between local nodes: [Observable] -> Observable

NodeLocalLit !DataType !Value

An observable literal

NodeBroadcastJoin

A special join that broadcasts a value along a dataset.

NodeOpaqueAggregator StandardOperator

Some aggregator that does not respect any particular invariant.

NodeGroupedReduction !AggOp 
NodeReduction !AggTransform 
NodeAggregatorReduction UniversalAggregatorOp

A universal aggregator.

NodeAggregatorLocalReduction UniversalAggregatorOp 
NodeStructuredTransform !ColOp

A structured transform, performed either on a local node or a distributed node.

NodeDistributedLit !DataType !(Vector Value)

A distributed dataset (with no partition information)

NodeDistributedOp StandardOperator

An opaque distributed operator.

NodePointer Pointer 

Instances

makeOperator :: Text -> SQLType a -> StandardOperator Source #

Makes a standard operator with no extra value.