Safe Haskell | None |
---|---|
Language | Haskell2010 |
- type FinalResult = Either NodeComputationFailure NodeComputationSuccess
- inputSourcesRead :: ComputeGraph -> [HdfsPath]
- prepareComputation :: ComputeGraph -> SparkStatePure (Try Computation)
- buildComputationGraph :: ComputeNode loc a -> Try ComputeGraph
- performGraphTransforms :: SparkSession -> ComputeGraph -> Try ComputeGraph
- getTargetNodes :: HasCallStack => Computation -> [UntypedLocalData]
- getObservables :: Computation -> [UntypedLocalData]
- insertSourceInfo :: ComputeGraph -> [(HdfsPath, DataInputStamp)] -> Try ComputeGraph
- updateCache :: ComputationID -> [(NodeId, NodePath, DataType, PossibleNodeStatus)] -> SparkStatePure ([(NodeId, Try Cell)], [(NodePath, NodeCacheStatus)])
Documentation
inputSourcesRead :: ComputeGraph -> [HdfsPath]
The list of file sources requested by the compute graph.
prepareComputation :: ComputeGraph -> SparkStatePure (Try Computation)
Given a context for the computation and a compute graph, builds a computation object.
buildComputationGraph :: ComputeNode loc a -> Try ComputeGraph
Builds the computation graph by expanding a single node until its transitive closure is reached.
It performs naming, node deduplication, and cycle detection.
TODO(kps) use the caching information to have a correct fringe
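The expansion described above can be sketched on a toy model. This is not the library's implementation: `ToyNode`, `ToyGraph`, and `buildToyGraph` are hypothetical stand-ins, nodes are assumed to be named already, and plain `Either String` plays the role of `Try`.

```haskell
import Control.Monad (foldM)
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Hypothetical stand-ins; the real ComputeNode and ComputeGraph are richer.
type Name = String
data ToyNode = ToyNode { nodeName :: Name, nodeParents :: [ToyNode] }

-- Each discovered node name mapped to the names of its parents.
type ToyGraph = M.Map Name [Name]

-- Depth-first expansion from a single node.
-- The Set holds the names on the current path (cycle detection);
-- the Map holds the nodes already finished (deduplication).
buildToyGraph :: ToyNode -> Either String ToyGraph
buildToyGraph = go S.empty M.empty
  where
    go path done n
      | nodeName n `S.member` path = Left ("cycle through " ++ nodeName n)
      | nodeName n `M.member` done = Right done
      | otherwise = do
          let path' = S.insert (nodeName n) path
          -- Expand every parent first, threading the finished nodes through.
          done' <- foldM (go path') done (nodeParents n)
          Right (M.insert (nodeName n) (map nodeName (nodeParents n)) done')
```

In this sketch a node shared by two children is visited once, and a node that reaches itself through its parents yields a `Left` instead of looping forever.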
performGraphTransforms :: SparkSession -> ComputeGraph -> Try ComputeGraph
Performs all the operations that are done on the compute graph:
- fulfilling autocache requests
- checking the cache/uncache pairs
- pruning observed successful computations
- deconstructing unions (in the future)
This could all be done on the server side at this point.
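Because the result type is `Try ComputeGraph`, one natural way to chain such steps is Kleisli composition, where the first failing step short-circuits the rest. The sketch below only illustrates that pattern: `ToyGraph`, `Pass`, and both passes are hypothetical, and `Try` is modeled as `Either String`, which may differ from the library's definition.

```haskell
import Control.Monad ((>=>))

-- Hypothetical stand-ins for Try and ComputeGraph.
type Try = Either String
newtype ToyGraph = ToyGraph [String] deriving (Eq, Show)

-- A pass either rewrites the graph or fails with a message.
type Pass = ToyGraph -> Try ToyGraph

-- Illustrative check: reject a graph that uncaches without ever caching.
checkCachePairs :: Pass
checkCachePairs g@(ToyGraph ops)
  | "uncache" `elem` ops && "cache" `notElem` ops = Left "uncache without cache"
  | otherwise = Right g

-- Illustrative pruning: drop entries already observed as successful.
pruneObserved :: Pass
pruneObserved (ToyGraph ops) = Right (ToyGraph (filter (/= "observed") ops))

-- Runs the passes left to right, stopping at the first Left.
runPasses :: Pass
runPasses = checkCachePairs >=> pruneObserved
```

Adding a further step in this style is a matter of appending another `>=>` stage, which keeps each transform independently testable.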
getTargetNodes :: HasCallStack => Computation -> [UntypedLocalData]
getObservables :: Computation -> [UntypedLocalData]
Retrieves all the observables from a computation.
insertSourceInfo :: ComputeGraph -> [(HdfsPath, DataInputStamp)] -> Try ComputeGraph
Exposed for debugging.
Inserts the source information into the graph.
Note: after this call, the node IDs may be different. The names and the paths are preserved.
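The note about changing IDs can be illustrated with a toy model in which a node's identifier is derived from its content while its name is not. That derivation is an assumption made for this sketch, not something stated here about the library; `ToyNode`, `toyId`, and `insertStamps` are all hypothetical.

```haskell
-- Hypothetical model: the ID depends on the node's content, the name does not.
data ToyNode = ToyNode
  { toyName  :: String        -- stable, user-visible name
  , toyPath  :: String        -- source path read by the node
  , toyStamp :: Maybe String  -- source stamp, absent before insertion
  } deriving (Eq, Show)

-- Stand-in for a content-derived identifier (assumed scheme).
toyId :: ToyNode -> String
toyId n = toyPath n ++ "@" ++ maybe "unstamped" id (toyStamp n)

-- Attach a stamp to every node whose path appears in the lookup list.
insertStamps :: [(String, String)] -> [ToyNode] -> [ToyNode]
insertStamps stamps = map attach
  where
    attach n = case lookup (toyPath n) stamps of
      Just s  -> n { toyStamp = Just s }
      Nothing -> n
```

Under this assumption, code that must refer to a node across the insertion should key on the name or path rather than on the ID.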
updateCache :: ComputationID -> [(NodeId, NodePath, DataType, PossibleNodeStatus)] -> SparkStatePure ([(NodeId, Try Cell)], [(NodePath, NodeCacheStatus)])
Updates the cache and returns the updates, if there are any.
The updates are split into final results and general status updates (scheduled, running, etc.).
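The split into final results and status updates can be sketched with `partitionEithers`. The types below are simplified, hypothetical stand-ins (`ToyStatus` for `PossibleNodeStatus`, `Either String String` for `Try Cell`), and whether the library also reports a status entry for finished nodes is not covered by this sketch.

```haskell
import Data.Either (partitionEithers)

-- Hypothetical stand-in for the library's status type.
data ToyStatus
  = Scheduled
  | Running
  | FinishedOk String     -- final: carries a result
  | FinishedError String  -- final: carries an error message
  deriving (Eq, Show)

type NodeName = String
type FinalResult = (NodeName, Either String String)
type StatusUpdate = (NodeName, ToyStatus)

-- Final statuses go to the Left, everything else to the Right.
classify :: (NodeName, ToyStatus) -> Either FinalResult StatusUpdate
classify (n, FinishedOk v)    = Left (n, Right v)
classify (n, FinishedError e) = Left (n, Left e)
classify (n, s)               = Right (n, s)

-- Splits a batch of updates into (final results, in-progress statuses).
splitUpdates :: [(NodeName, ToyStatus)] -> ([FinalResult], [StatusUpdate])
splitUpdates = partitionEithers . map classify
```

Both output lists keep the order in which the updates arrived, since `partitionEithers` preserves relative order on each side.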