karps-0.2.0.0: Haskell bindings for Spark Dataframes and Datasets

Safe Haskell: None
Language: Haskell2010

Spark.Core.Context

Description

This module defines session objects that act as entry points to Spark.

There are two ways to interact with Spark: using an explicit state object, or using the default state object (interactive session).

While the interactive session is the most convenient, it should be used only for quick experiments. Any complex code should use the SparkSession and SparkState objects.

Documentation

data SparkSessionConf

The configuration of a remote Spark session in Karps.

Constructors

SparkSessionConf 

Fields

  • confEndPoint :: !Text

    The URL of the end point.

  • confPort :: !Int

    The port used to configure the end point.

  • confPollingIntervalMillis :: !Int

    (internal) The polling interval, in milliseconds.

  • confRequestedSessionName :: !Text

    (optional) The requested name of the session. This name must obey the following rules:

      • it must consist only of alphanumeric characters, - and _: [a-zA-Z0-9-_]
      • if a session with this name already exists on the server, that session is reconnected to

    The default value is "" (a new random context name will be chosen).

  • confUseNodePrunning :: !Bool

    If enabled, attempts to prune the computation graph as much as possible.

    This option is useful in interactive sessions when long chains of computations are extracted. This forces the execution of only the missing parts. The algorithm is experimental, so disabling it is a safe option.

    Disabled by default.
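
The naming rule for confRequestedSessionName can be checked on the client before a configuration is built. The following is a minimal sketch, not part of Karps: isValidSessionName is a hypothetical helper, it works on String rather than Text so that it depends only on base, and it assumes that the empty default name is acceptable because the server then picks a random name.

```haskell
import Data.Char (isAlphaNum, isAscii)

-- | Hypothetical helper (not part of Karps): True when every character of
-- the name is in [a-zA-Z0-9-_]. The empty string passes, matching the
-- documented default of "" (the server then chooses a random name).
isValidSessionName :: String -> Bool
isValidSessionName = all allowed
  where
    -- isAlphaNum alone also accepts non-ASCII letters, so restrict to ASCII.
    allowed c = (isAscii c && isAlphaNum c) || c == '-' || c == '_'
```

A name that passes this check could then be set with an ordinary record update on defaultConf, changing only the confRequestedSessionName field. Whether the server applies further rules is not stated here, so the check should be treated as a first filter only.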

data SparkSession

A session in Spark. Encapsulates all the state needed to communicate with Spark and to perform some simple optimizations on the code.

type SparkState a = SparkStateT IO a

Represents the state of a session and accounts for the communication with the server.

defaultConf :: SparkSessionConf

The default configuration, for use when the Karps server is running locally.

executeCommand1 :: forall a. FromSQL a => LocalData a -> SparkState (Try a)

Executes a command:

  • performs the transforms and the optimizations in the pure state
  • sends the computation to the backend
  • waits for the terminal nodes to reach a final state
  • commits the final results to the state

If any failure is detected that is internal to Karps, it returns an error. If the error comes from an underlying library (HTTP stack, programming failure), an exception may be thrown instead.
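
Because failures can arrive either as a returned error or as a thrown exception, callers may want to fold both channels into one value. The sketch below shows one way to do that with base only. collectFailures is a hypothetical helper, and it assumes that Try a behaves like an Either whose error type has a Show instance; that assumption is not confirmed by this page.

```haskell
import Control.Exception (SomeException, try)

-- | Hypothetical helper (not part of Karps): folds a returned error and a
-- thrown exception into a single 'Either String'.
-- ASSUMPTION: the result is Either-shaped with a showable error, standing
-- in for Karps' 'Try a'.
collectFailures :: Show e => IO (Either e a) -> IO (Either String a)
collectFailures action = do
  outcome <- try action
  pure $ case outcome of
    -- Thrown by an underlying library or by a programming failure.
    Left ex           -> Left ("exception: " ++ show (ex :: SomeException))
    -- Returned as a value by the action itself.
    Right (Left err)  -> Left ("reported error: " ++ show err)
    Right (Right val) -> Right val
```

Under that assumption, the action passed in could be execStateDef applied to executeCommand1 node, which by the signatures on this page has type IO (Try a). Catching SomeException also intercepts asynchronous exceptions, so production code would probably want a narrower handler.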

createSparkSessionDef :: SparkSessionConf -> IO ()

Creates a Spark session that will be used as the default session.

If a session already exists, an exception will be thrown.

closeSparkSessionDef :: IO ()

Closes the default session. The default session is empty after this call completes.

NOTE: This does not currently release any resources! It is a stub implementation used in testing.
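
Since creating a second default session throws, it can help to pair every createSparkSessionDef with a closeSparkSessionDef even when the code in between fails. The following sketch uses bracket_ from Control.Exception for that. withSession is a hypothetical helper, and the open and close steps are passed in as plain IO actions so that the block compiles without the karps package.

```haskell
import Control.Exception (bracket_)

-- | Hypothetical helper (not part of Karps): runs 'body' between an open
-- and a close step. The close step runs even if 'body' throws; it does not
-- run if the open step itself throws.
--
-- Intended use with the functions documented on this page:
--   withSession (createSparkSessionDef defaultConf) closeSparkSessionDef body
withSession :: IO () -> IO () -> IO a -> IO a
withSession open close body = bracket_ open close body
```

This only guarantees that the default session slot is emptied again. Given the note above, it should not be relied on to free server-side resources.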

exec1Def :: FromSQL a => LocalData a -> IO a

Executes a command using the default Spark session.

This is the most unsafe way of running a command: it throws an exception if any error happens.

execStateDef :: SparkState a -> IO a

Runs the computation described in the state transform, using the default Spark session.

Will throw an exception if no session currently exists.