Rows and cells (Spark.Core.Row):

Row: the basic representation of one row of data, the standard type that comes out of the SQL engine in Spark. It is the basic data structure used to transport information, accessed through its field cells (TODO: rename to rowCells).
Cell: an element in a Row object. All objects manipulated by the Spark framework are assumed to be convertible to cells; this is usually handled by generic transforms. The available constructors are Empty, IntElement, DoubleElement, StringElement, BoolElement and RowArray. Generic conversions between Haskell values and cells are derived with GHC.Generics (Spark.Core.Internal.RowGenerics and RowGenericsFrom).
UnknownType: a type that is not known and that is not meant to be exposed to the user.

Utilities (Spark.Core.Internal.Utilities):

Pretty printing for Aeson values, with deterministic output: produces a bytestring rendering of a JSON value that is invariant to the insertion order of the keys (the keys are stored in alphabetical order). This ensures that all id computations are stable and reproducible on the server side. TODO(kps): use this everywhere JSON is converted.
myGroupBy, myGroupBy': group by. TODO: return a non-empty list instead.
missing: marks missing implementations in the code base.
failure, failure': the functions used to trigger exceptions for internal programming errors. Currently, all programming errors simply trigger an exception; all these impure functions carry an implicit call-stack argument.
forceRight: given a DataFrame or a LocalFrame, attempts to get the value, or throws the error. This function is not total.
strictList: forces the complete evaluation of a list to WHNF.
traceHint: (internal) prints a hint together with a value.
show': show, producing Text.

Names and paths (Spark.Core.StructuresInternal):

ComputationID: a unique identifier for a computation (a batch of nodes sent for execution to Spark).
FieldPath: a path to a nested field in a SQL structure. This structure ensures that proper escaping happens if required.
FieldName: the name of a field in a SQL structure. This structure ensures that proper escaping happens if required. TODO: prevent the constructor from being used directly; it should be checked first.
NodeId: the unique ID of a node. It is based on the parents of the node and all the relevant intrinsic values of the node.
NodePath: the user-defined path of the node in the hierarchical representation of the graph.
NodeName: the name of a node (without path information).
fieldName: a safe constructor for field names that fixes all the issues relevant to SQL escaping (TODO: proper implementation). A similar safe constructor exists for field paths.
unsafeFieldName: constructs the field name, but fails if the content is not correct.
catNodePath: the concatenated path; this is the inverse function of fieldPath.
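As an illustration of the Row and Cell types described at the start of this section, here is a minimal sketch of a row assembled by hand. The constructor and field names come from this package's exports; the exact field types (Int, Double, Text, ...) and the export location are assumptions.

    {-# LANGUAGE OverloadedStrings #-}
    import Spark.Core.Row  -- assumed export location of Row and Cell

    -- A hand-built row mixing the Cell constructors listed above.
    -- RowArray nests cells; Empty stands for a missing (null) value.
    exampleRow :: Row
    exampleRow = Row
      { cells = [ IntElement 42
                , DoubleElement 3.14
                , StringElement "spark"      -- assumed to take Text
                , BoolElement True
                , RowArray [IntElement 1, Empty]
                ]
      }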
prettyNodePath: a human-readable rendering of a node path. TODO: this one should probably be hidden.

SQL types (Spark.Core.Internal.TypesStructures):

SQLType: a tagged datatype that encodes the SQL types. This is the main type information that should be used by users; unSQLType gives the underlying data type.
Nullable: represents the choice between a strict and a nullable field. NullableDataType encodes the type of all the nullable data types.
StructType, StructField: the main structure of a dataframe or a dataset, and a field in such a structure.
DataType: all the data types supported by the Spark engine. A data type is either nullable (it may contain null values) or strict (all the values are present). There are a couple of differences with the algebraic data types in Haskell: Maybe (Maybe a) ~ Maybe a, which implies that arbitrary nesting of values is flattened to a top-level Nullable; similarly, [[]] ~ [].
StrictDataType: the data types that are guaranteed not to be null; evaluating them will return a value.

Node operations (Spark.Core.Internal.OpStructures):

NodeOp describes the operation performed by a node. Its cases cover: an operation between local nodes ([Observable] -> Observable); an observable literal; a special join that broadcasts a value along a dataset; an aggregator that does not respect any particular invariant; a universal aggregator; a structured transform, performed either on a local node or a distributed node; a distributed dataset literal (with no partition information); an opaque distributed operator; and a pointer to a node that is assumed to be already computed.

SemiGroupOperator: the representation of a semi-group law in Spark. This is the basic law used in universal aggregators. It is a function on observables, f :: X -> X -> X, that must be commutative and associative. A neutral element is not required by the semi-group laws; however, when used in the context of a universal aggregator, such an element implicitly exists and corresponds to the empty dataset. The law may be carried by a standard operator that happens to respect the semi-group laws (OpaqueSemiGroupLaw), by the merging portion of a UDAF (UdafSemiGroupOperator), or by a SQL operator that happens to respect the semi-group laws (ColumnSemiGroupLaw).

AggField: a field in the resulting aggregation transform.
UdafApplication: when applying a UDAF, determines whether it should only perform the algebraic portion of the UDAF (initialize + update + merge), or whether it also performs the final, non-algebraic step.
StructuredTransform, TransformField: the content of a structured transform, and a field in a structure.

ColOp: the different kinds of column operations that are understood by the backend. These operations describe the physical operations on columns as supported by Spark SQL. They can operate column -> column, column -> row, or row -> row; of course, not all operators are valid for each configuration. The cases are: a projection onto a single column (an extraction is always direct); a function of other columns (in this case, the other columns may matter; TODO(kps) record whether the function is partition invariant, which should be the case most of the time); a constant defined for each element (the type should be the same as for the column; a literal is always direct); and a structure.

ScalaStaticFunctionApplication: a Scala method of a singleton object.
StandardOperator: an operator defined by default in the release of Karps. All other physical operators can be converted to standard operators.
Locality: the dynamic value of locality. There is still a tag on it, but it can easily be dropped. Local means the data associated to the node is local: it can be materialized and accessed by the user. Distributed means the data associated to the node is distributed, or not accessible locally: it cannot be accessed by the user.
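To make the semi-group law described above concrete, the two required properties can be written as ordinary QuickCheck checks. This is a plain illustration of the algebraic contract, not part of the karps API.

    import Test.QuickCheck (quickCheck)

    -- The contract a merge function f :: X -> X -> X must satisfy to act as a
    -- semi-group law (illustrated here at X = Int).
    prop_commutative :: (Int -> Int -> Int) -> Int -> Int -> Bool
    prop_commutative f x y = f x y == f y x

    prop_associative :: (Int -> Int -> Int) -> Int -> Int -> Int -> Bool
    prop_associative f x y z = f (f x y) z == f x (f y z)

    main :: IO ()
    main = do
      quickCheck (prop_commutative (+))   -- (+) is a valid merge law
      quickCheck (prop_associative (+))
      quickCheck (prop_commutative max)   -- so is max
      quickCheck (prop_associative max)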
TransformInvariant: the invariant respected by a transform. Depending on the value of the invariant, different optimizations may be available. An Opaque operator has no special property: it may depend on the partitioning layout, the number of partitions, the order of elements in the partitions, etc. This sort of operator is unwelcome in Karps. A PartitioningInvariant operator respects the canonical partition order but may not preserve the number of elements; for example, a flatMap on an RDD (filter, etc.). Such an operator can be used locally with the signature a -> [a]. DirectPartitioningInvariant is the strongest invariant: it respects the canonical partition order and outputs the same number of elements (typically a map). Such an operator can be used locally with the signature a -> a.

DataInputStamp: a stamp that defines some notion of uniqueness of the data source. The general contract is that stamps can be extracted fast (no need to scan the whole dataset), and that if the data changes, the stamp changes. Stamps are used to perform aggressive operation caching, so it is better to conservatively update stamps if one is unsure about the freshness of the dataset. For regular files, stamps are computed using the file system time stamps.

HdfsPath: a path in the Hadoop File System (HDFS). These paths are usually not created by the user directly.
OperatorName: the name of an operator defined in Karps.
UdafClassName: the classpath of a UDAF.
SqlFunctionName: the name of a SQL function; one of the predefined SQL functions available in Spark.
makeOperator: makes a standard operator with no extra value.

Error handling (Spark.Core.Try):

Try: the common result of attempting to build something.
NodeError: an error associated to a particular node (an observable or a dataset).
tryError: returns an error object given a text clue. tryError' :: String -> Try a (tryError' = _error . T.pack) does the same from a String; TODO: remove this method. An internal helper converts a potentially errored object into a Try.

Operation functions (Spark.Core.Internal.OpFunctions):

simpleShowOp: a text representation of the operation that is appealing to humans.
hdfsPath: if the node is a reading operation, returns the HdfsPath of the source that is going to be read.
updateSourceStamp: updates the input stamp if possible. If the node cannot be updated, it is most likely a programming error and an error is returned.

Type functions (Spark.Core.Internal.TypesFunctions):

structTypeTuple: builds a type that is a tuple of all the given types. Following the Spark and SQL convention, the indexing starts at 1; a companion returns a DataType instead (the most common use case). Another helper notes that, unlike Spark and SQL, its indexing starts from 0.

Row utilities (Spark.Core.Internal.RowUtils):

jsonToCell: decodes a JSON value into a row. This operation requires a SQL type that describes the schema.
checkCell: given a datatype, ensures that the cell has the corresponding type.
rowArray: convenience constructor for an array of cells.

buildType (Spark.Core.Internal.TypesGenerics): the only function in this module that should matter for users. Given a type, it returns the SQL representation of that type.
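A hedged sketch of how buildType might be used to obtain the SQL representation of a Haskell type. SQLTypeable instances for Int, Double, Text, Bool, Maybe, lists and tuples are listed in this package; the import location and the rendering of the result are assumptions.

    import Spark.Core.Types (SQLType, buildType)  -- assumed re-export location

    -- The SQL type corresponding to a pair of an Int and an optional Double.
    -- Following the flattening rule above, Maybe (Maybe Double) would produce
    -- the same nullable field as Maybe Double.
    pairType :: SQLType (Int, Maybe Double)
    pairType = buildType

    main :: IO ()
    main = print pairType  -- a struct with a strict int field and a nullable double field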
StructureEdge: the edges in a compute DAG after name resolution (which is where most of the checks and computations are done). Parent edges are the direct parents of a node, the only ones required for defining computations; they are included in the id. Logical edges define logical dependencies between nodes to force a specific ordering of the nodes; they are also included in the id.
NodeEdge: the different kinds of edges in the compute DAG of nodes at the start of computations. Scope edges specify the scope of a node for naming; they are not included in the id.

Datasets and observables (Spark.Core.Internal.DatasetStructures):

UntypedLocalData: an observable whose type can only be inferred at runtime and that can fail to be computed at runtime. Any observable can be converted to an untyped observable. Untyped observables are more flexible and can be combined in arbitrary ways, but they will fail during the validation of the Spark computation graph. TODO(kps): rename to DynObservable.

DataFrame: the dataframe type. Any dataset can be converted to a dataframe. For Spark users: this is different from the definition of the dataframe in Spark, which is a dataset of rows. Because the support for single columns is more awkward in the case of rows, it is more natural to generalize datasets to contain cells. When communicating with Spark, though, single cells are wrapped into rows with a single field, as Spark does.

LocalData: a unit of data that can be accessed by the user. This is a typed unit of data: the type is guaranteed to be a proper type accessible by the Haskell compiler (instead of simply a Cell type, which represents types only accessible at runtime). TODO(kps): rename to Observable.

Dataset: a typed collection of distributed data. Most operations on datasets are type-checked by the Haskell compiler: the type tag associated to the dataset is guaranteed to be convertible to a proper Haskell type. In particular, building a Dataset of dynamic cells is guaranteed to never happen. If you want to do untyped operations and gain some flexibility, consider using untyped dataframes instead. Computations with datasets and observables are generally checked for correctness using the type system of Haskell.

ComputeNode: (internal) the main data structure that represents a data node in the computation graph. This data structure forms the backbone of computation graphs expressed with Spark operations. loc is a typed locality tag; a is the type of the data, as seen by the Haskell compiler (if erased, it is a Cell type). Its fields are:
 - the id of the node (non-strict, because it may be expensive to compute);
 - the operation associated to this node;
 - the type of the node;
 - the direct parents of the node (the order of the parents is important for the semantics of the operation);
 - a set of extra dependencies that can be added to force an order between the nodes (the order is not important, they are sorted by ID; TODO(kps) add this one to the id);
 - the locality of this node (TODO(kps) add this one to the id);
 - the name;
 - a set of nodes considered as the logical input for this node (this has no influence on the calculation of the id and is used for organization purposes only);
 - the path of this node in a computation flow. The path includes the node name. It is not strict because it may be expensive to compute; by default it only contains the name of the node (i.e. the node is attached to the root).

Node accessors (Spark.Core.Internal.DatasetFunctions):

nodeOp: (developer) the operation performed by this node.
nodeParents: the nodes this node depends on directly.
nodeLogicalParents: (developer) returns the logical parents of a node.
nodeLogicalDependencies: returns the logical dependencies of a node.
nodeName: the name of a node. TODO: this should be a NodePath.
nodePath: the path of a node, as resolved. This path includes information about the logical parents (after resolution).
nodeType: the type of the node. TODO: have nodeType' for dynamic types as well.
identity: the identity function. It returns a compute node with the same datatype and the same content as the previous node.
If the operation of the input has a side effect, that side effect is *not* reevaluated. identity is typically used when establishing an ordering between operations such as caching or side effects, along with logicalDependencies.

cache: caches the dataset. This function instructs Spark to cache the dataset with the default persistence level in Spark (MEMORY_AND_DISK). Note that the dataset has to be evaluated first for the caching to take effect, so it is usual to call count or another aggregator to force the caching to occur.

uncache: uncaches the dataset. This function instructs Spark to unmark the dataset as cached, so that the disk and the memory used by Spark can be reclaimed in the future. Unlike Spark, Karps is stricter with the uncaching operation: the argument of uncache must be a cached dataset, and once a dataset is uncached its cached version cannot be used again (it must be recomputed). Karps performs escape analysis and will refuse to run programs with caching issues.

autocache: automatically caches the dataset on a need basis, and performs deallocation when the dataset is no longer required. This function marks a dataset as eligible for the default caching level in Spark. The current implementation performs caching only if it can be established that the dataset is going to be involved in more than one shuffling or aggregation operation. If the dataset has no observable child, no uncaching operation is added: the autocache operation is then equivalent to unconditional caching.

union: returns the union of two datasets. In the context of streaming and differentiation, this union is biased towards the left: the left argument expresses the stream and the right argument expresses the increment.

asDF: converts to a dataframe and drops the type information. This always works.
asDS: attempts to convert a dataframe into a (typed) dataset. This will fail if the dataframe itself is a failure, or if the casting operation is not correct. This operation assumes that both field names and types are correct.
asLocalObservable: converts a local node to a local frame. This always works.
untyped: converts any node to an untyped node. A companion function removes the type information from an observable.
parents: adds parents to the node. It is assumed the parents are the unique set of nodes required by the operation defined in this node. If you want to set parents for the sake of organizing the computation, use logicalParents; if you want to add timing dependencies between nodes, use depends.
logicalParents: establishes a naming convention on this node: the path of this node will be determined as if the parents of this node were the list provided (and without any effect from the direct parents of this node). For this to work, the logical parents should split the nodes between internal nodes, logical parents, and the rest. In other words, for any ancestor of this node, and for any valid path to reach this ancestor, this path should include at least one node from the logical dependencies. This set can be a superset of the actual logical parents. The check is lazy (done during the analysis phase); an error, if any, will only be reported during analysis.
depends: sets the logical dependencies on this node. All the nodes given are guaranteed to be executed before the current node. If there are any failures, this node will also be treated as a failure (even if the parents are all successes).
dataframe: creates a dataframe from a list of cells and a datatype. It will fail if the content of the cells is not compatible with the data type.
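A hedged sketch of the caching discipline described above, using the names from this section (dataset, cache, count); the exact signatures and export locations are assumptions, and the snippet is illustrative rather than definitive.

    import Spark.Core.Dataset    -- assumed home of cache / uncache
    import Spark.Core.Functions  -- assumed home of dataset / count

    -- The observable that, once executed, forces the cached evaluation of ds.
    cachingSketch = count dsC
      where
        ds  = dataset ([1, 2, 3, 4] :: [Int])  -- literal dataset (assumed constructor)
        dsC = cache ds                         -- marked for MEMORY_AND_DISK caching
        -- uncache dsC would release it; the uncached handle must not be reused,
        -- and Karps checks the cache/uncache pairing during analysis.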
A number of internal helpers and conversions are also exported (marked "(internal)" in the API).

broadcastPair: a low-level operator that takes an observable and propagates it along the content of an existing dataset. Users are advised to use the column-based broadcast function instead.

iPackTupleObs: (developer API) takes a non-empty list of observables and puts them into a structure. The elements are named _0 ... _(n-1).

Graph structures (Spark.Core.Internal.DAGStructures):

GraphOperations / GraphVertexOperations: graph operations on types that are supposed to represent edges and vertices, respectively.
Graph: the representation of a graph. Throughout the project it is considered to be a DAG.
AdjacencyMap: the adjacency map of a graph. The node id corresponds to the start node; the pairs are the end node and the edge that reaches that node. There may be multiple edges leading to the same node.
VertexEdge: an edge, along with its end node.
Vertex: a vertex in a graph, parametrized by some payload.
Edge: an edge in a graph, parametrized by some payload.
VertexId: the unique ID of a vertex.

Graph functions (Spark.Core.Internal.DAGFunctions):

FilterOp: the different filter modes when pruning a graph. Keep keeps the current node; CutChildren keeps the current node but does not consider the children; Remove removes the current node and does not consider the children.
DagTry: a separate error type, to make the functions more general and easier to test.
buildVertexList: starts from a vertex and expands it to reach its whole transitive closure. It returns a list in lexicographic order of dependencies: the graph corresponding to this list of elements has one sink (the start element) and potentially multiple sources, and the nodes are ordered so that all the parents are visited before the node itself. A bounded variant builds the list of vertices up to a boundary.
buildGraph: builds a graph by expanding a start vertex.
buildGraphFromList: attempts to build a graph from a collection of vertices and edges. The collection may be invalid (cycles, etc.) and the vertices need not be in topological order. All the vertices referred to by edges must be present in the list of vertices.
graphSources: the sources of a DAG (nodes with no parent).
graphSinks: the sinks of a graph (nodes with no descendant).
reverseGraph: flips the edges of the graph (the result is also a DAG).
graphMapVertices: a generic transform over the graph that may account for potential failures in the process.
graphMapEdges / graphFlatMapEdges / graphMapVertices': (internal) map the edges, map the edges while possibly creating more or fewer of them, and map the vertices.
graphFilterVertices: given a graph, prunes out a subset of vertices. All the corresponding edges and the unreachable chunks of the graph are removed.
vertexMap: the map of vertices, by vertex id.
pruneLexicographic: given a list of elements with vertex/edge information and a start vertex, builds the graph from all the reachable vertices in the list. It returns the vertices in DAG traversal order. Note that this function is robust and silently drops missing vertices.

Pruning (Spark.Core.Internal.Pruning):

NodeCacheInfo: describes the last time a node was observed by the controller, and the state it was in. This information is used to do smart computation pruning, by assuming that the observables are kept by the Spark processes.
NodeCacheStatus: the status of a node being computed. On purpose, it does not store data: this is meant to be only the control plane of the computations. It assumes a compute graph, NOT a dependency DAG.

ComputeDag (Spark.Core.Internal.ComputeDag): a DAG of computation nodes. At a high level, it is a total function with a number of inputs and a number of outputs. Note about the edges: the edges flow along the path of dependencies; the inputs are the start points and the outputs are the end points of the graph.
Conversion functions are provided between compute DAGs and plain graphs.

Paths (Spark.Core.Internal.Paths):

PathEdge: the types of edges used for the calculation of paths: a same-level parent edge means the node should have the same prefix as its parent; an inner edge means the parent defines the scope of this node.

Column structures (Spark.Core.Internal.ColumnStructures):

The column reference is a tag that carries the origin information of a column at the type level. This is useful when creating columns; see ref and colRef. UnknownReference is a dummy data type that indicates the data reference is missing. GenericColumn is (dev) a column for which the type of the cells is unavailable (at the type level), but for which the origin is available at the type level; there is also (dev) a type for untyped column data.

DynColumn: an untyped column of data from a dataset or a dataframe. This column is untyped and may not be properly constructed; any error will be found during the analysis phase at runtime.
Column: a typed column of data from a dataset or a dataframe. The operations on this column are validated by Haskell's type inference.
GeneralizedColOp: a generalization of the column operation. This structure is useful to perform some extra operations not supported by the Spark engine: expressing joins with an observable, and keeping track of DAGs of column operations (not implemented yet).
ColumnData: the data structure that implements the notion of data columns. The type may either be a Cell or a proper type. The ref is a reference, potentially to the originating dataset, but it may be more general than that in order to perform type-safe tricks. Unlike Spark, columns are always attached to a reference dataset or dataframe: one cannot materialize a column out of thin air. In order to broadcast a value along a given column, the broadcast function is provided. TODO: try something like https://www.vidarholen.net/contents/junk/catbag.html

Some algebraic structures are shared between columns and observables (Spark.Core.Internal.AlgebraStructures).

Column functions (Spark.Core.Internal.ColumnFunctions):

colType: the type of a column.
untypedCol: converts a typed column to an untyped column.
dropColType: drops the type information, but keeps the reference.
castCol: casts a dynamic column to a statically typed column. One must supply the reference (which can be obtained from another column with colRef, or from a dataset) and a type (which can be built using the buildType function).
castCol': casts a dynamic column to a statically typed column, but does not attempt to enforce a single origin at the type level. This is useful when building a dataset from a dataframe: the origin information cannot be conveyed since it is not available in the first place.
broadcast: takes some local data (contained in an observable) and broadcasts it along a reference column.
colRef: a tag with the reference of a column. This is useful when casting dynamic columns to typed columns.
colFromObs: takes an observable and makes it available as a column of the same type; colFromObs' does the same for a dynamic observable.
applyCol1: a convenience function for applying one-argument typed functions to a dynamic column.
Internal helpers create new columns with empty data or with a dynamic type, and implement homogeneous operations between two columns.
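A hedged sketch of moving between typed and dynamic columns, using the names above (untypedCol, castCol, colRef, buildType). The argument order of castCol, the import locations and the shape of the result are assumptions.

    import Spark.Core.Column                 -- assumed export location
    import Spark.Core.Types (SQLType, buildType)

    -- From typed to dynamic and back: untypedCol forgets the static type, castCol
    -- restores it given a reference and an expected type.
    castSketch c =                           -- c is assumed to be a typed Int column
      let dynCol = untypedCol c              -- DynColumn: errors deferred to analysis time
      in castCol (colRef c) (buildType :: SQLType Int) dynCol  -- assumed argument order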
CanRename: the class of types that can be renamed. It is made generic because it covers two notions: the name of a compute node (which will eventually determine its compute path) and the name of a field (which may become an object path). This syntax tries to be convenient and will fail immediately for basic errors such as illegal characters. This could be revisited in the future, but it is a compromise on readability.

Projections (Spark.Core.Internal.Projections):

Projections express the extraction of one Spark object from another. DynamicColProjection is the class of projections that require some runtime introspection to confirm that the projection is valid; StaticColProjection is the class of static projections that are guaranteed to succeed by using the type system (from is the type of the dataset, which is also a typed dataset, and to is the type of the final column).

The projector operation is the general projection operation in Spark: it lets you extract columns from datasets or dataframes, or sub-observables from observables. Because of a Haskell limitation, a separate operator is used when projecting with strings. TODO(kps): put an example here.

unsafeStaticProjection: lets users define their own static projections. It throws an error if the type cannot be found, so it should be used with caution. String has to be used because of type-inference issues. Its arguments are the start type and the name of a field assumed to be found in the start type; this name only has to be valid for Spark purposes, not for the internal Haskell representation.
dynamicProjection: given a string that contains a name or a path, builds a dynamic column projection. A companion function converts a static projection into a dynamic one.
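A hedged sketch of the two projection styles just described. The operators // and /-, and the tuple projections _1 and _2, are names listed in this package; which operator handles typed versus string-based projections, and their result types, are assumptions.

    {-# LANGUAGE OverloadedStrings #-}
    import Spark.Core.Column     -- assumed export location of the projection operators
    import Spark.Core.Dataset
    import Spark.Core.Functions

    -- Typed projection with // and _1; string-based projection with /- on a dataframe.
    projectionSketch =
      let ds = dataset ([(1, 2.5), (3, 4.5)] :: [(Int, Double)])
          c1 = ds // _1        -- checked by the Haskell type system (assumed operator)
          df = asDF ds
          c2 = df /- "_2"      -- DynColumn; checked at analysis time (assumed operator)
      in (c1, c2)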
Tuples and packing (Spark.Core.Internal.FunctionsInternals):

TupleEquivalence: a class that expresses the fact that a certain (well-formed) type is equivalent to a tuple of points. It is useful for automatic conversions between tuples of columns and data structures.
Two related classes express the fact that some type a can be converted to a dataset of type b. One is meant to be extended by users to create converters associated to their own data types; the other is only inhabited by some internal types (lists, tuples, etc.).

asCol: represents a dataframe as a single column.
pack1: packs a single column into a dataframe.
pack': packs a number of columns into a single dataframe. This operation is checked for same origin and no duplication of columns. The function accepts columns, lists of columns and tuples of columns (both typed and untyped).
pack: packs a number of columns with the same references into a single dataset. The type of the dataset must be provided in order to have proper type inference. TODO: example.
struct': packs a number of columns into a single column (the struct construct). The columns must have different names, or an error is returned.
struct: packs a number of columns into a single structure, given a return type. The field names of the columns are discarded and replaced by the field names of the structure.
Two helpers take a function that operates on columns and project it onto the corresponding operation on observables (one typed, one untyped). They are not very smart at the moment and may reject or miss the more complex cases such as broadcasting, joins, etc.
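A hedged sketch of the packing functions above: projecting two columns out of a dataset and repacking them into a dataframe, and viewing the dataset as a single column. Function names come from this section; signatures and import locations are assumptions.

    import Spark.Core.Column
    import Spark.Core.Dataset
    import Spark.Core.Functions

    repackSketch =
      let ds = dataset ([(1, 2.5), (3, 4.5)] :: [(Int, Double)])
          df = pack' (ds // _1, ds // _2)  -- checked for same origin, no duplicated names
          c  = asCol ds                    -- the whole dataframe as one (struct) column
      in (df, c)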
Arithmetic (Spark.Core.Internal.Arithmetics, Spark.Core.Internal.ArithmeticsImpl):

A class captures the types that can be lifted to operations on Karps types. This is the class for operations on homogeneous types (the inputs and the output have the same underlying type). At its core, it takes a broadcasted operation that works on columns and makes that operation available on other shapes; its instances enumerate all the automatic conversions supported when lifting such an operation, and a helper performs an operation using a reference operation defined on columns. On top of this, generalizations of addition and negation are provided for the Karps types.

Joins and broadcasts (Spark.Core.Internal.Joins):

join: standard (inner) join on two sets of data, with join' as the untyped version. joinInner and joinInner' are the explicit (typed and untyped) inner joins.
joinObs, joinObs': broadcast an observable alongside a dataset to make it available as an extra column. The resulting dataframe has two columns: one column called values, and one column carrying the broadcasted value. Note: this is a low-level operation; users may want to use broadcastObs instead.

asDouble (Spark.Core.Internal.ObservableStandard): casts local data as a double.

Groups (Spark.Core.Internal.Groups):

LogicalGroupData: a group data type with no typing information.
GroupData: a dataset that has been partitioned according to some given field.
groupByKey: performs a logical group of data based on a key.
mapGroup: transforms the values in a group. A generalized value transform extends mapGroup to allow more complex transforms involving joins, groups, etc.
aggKey: given a group and an aggregation function, aggregates the data. Note: not all reduction functions may be used in this case; the analyzer will fail if the function is not universal.
A further operation creates a group by expanding a value into a potentially large collection. Note on performance: it is optimized to work at any scale and may not be the most efficient when the generated collections are small (a few elements). Another builds groups within groups: it allows groups to be constructed from each collection inside a group, and is usually not used directly by the user but as part of more complex pipelines that may involve multiple levels of nesting. A companion reduces a group of groups into a single group.
groupAsDS: returns the collapsed representation of a grouped dataset, discarding the group information. An internal check verifies that a group can be cast.

Aggregations (Spark.Core.Internal.AggregationFunctions):

UniversalAggregator: the universal aggregator, i.e. the invariant aggregator plus some extra laws to combine multiple outputs. It is useful for combining results over multiple passes; a real implementation in Spark also has an inner pass.
sumCol: the sum of all the elements in a column. If the data type is too small to represent the sum, the value returned is undefined.
count: the number of elements in a column.
collect: collects all the elements of a column into a list. NOTE: this list is sorted in the canonical ordering of the data type: however the data may be stored by Spark, the result will always be in the same order. This is a departure from Spark, which does not guarantee an ordering on the returned data. collect': see the documentation of collect.

Client and context (Spark.Core.Internal.Client, Spark.Core.Internal.ContextStructures):

RDDId: the ID of an RDD in Spark.
ComputeGraph: a graph of computations. This graph is a directed acyclic graph; each node is associated to a global path.
SparkState: represents the state of a session and accounts for the communication with the server.
SparkSession: a session in Spark. It encapsulates all the state needed to communicate with Spark and to perform some simple optimizations on the code.
SparkSessionConf: the configuration of a remote Spark session in Karps. Its fields are:
 - confEndPoint: the URL of the end point;
 - confPort: the port used to configure the end point;
 - confPollingIntervalMillis: (internal) the polling interval;
 - confRequestedSessionName: (optional) the requested name of the session. This name must obey a number of rules: it must consist of alphanumeric characters plus - and _ ([a-zA-Z0-9-_]), and if it already exists on the server, the session is reconnected to. The default value is "" (a new random context name will be chosen);
 - confUseNodePrunning: if enabled, attempts to prune the computation graph as much as possible. This option is useful in interactive sessions when long chains of computations are extracted; it forces the execution of only the missing parts. The algorithm is experimental, so disabling it is a safe option. Disabled by default.
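A hedged sketch of a hand-built session configuration. The record and field names (SparkSessionConf, confEndPoint, confPort, confPollingIntervalMillis, confRequestedSessionName, confUseNodePrunning) come from this section; the field types, the endpoint format and the import location are assumptions.

    {-# LANGUAGE OverloadedStrings #-}
    import Spark.Core.Context   -- assumed export location of SparkSessionConf

    localConf :: SparkSessionConf
    localConf = SparkSessionConf
      { confEndPoint              = "http://localhost"
      , confPort                  = 8081
      , confPollingIntervalMillis = 500
      , confRequestedSessionName  = ""     -- "" lets the server pick a random session name
      , confUseNodePrunning       = False  -- experimental graph pruning, disabled by default
      }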
Graph preparation (Spark.Core.Internal.ContextInternal):

prepareComputation: given a context for the computation and a graph of computation, builds a computation object. Related helpers are exposed for debugging.
insertSourceInfo: inserts the source information into the graph. Note: after that, the node IDs may be different; the names and the paths are kept, though.
inputSourcesRead: the list of file sources that are requested by the compute graph.
buildComputationGraph: builds the computation graph by expanding a single node until a transitive closure is reached. It performs the naming, node deduplication and cycle detection. TODO(kps): use the caching information to have a correct fringe.
performGraphTransforms: performs all the operations that are done on the compute graph: fulfilling autocache requests, checking the cache/uncache pairs, pruning of observed successful computations, and (in the future) deconstruction of the unions. This could all be done on the server side at this point.
getObservables: retrieves all the observables from a computation.
updateCache: updates the cache and returns the updates if there are any. The updates are split into final results and general update status (scheduled, running, etc.).

Running against a server (Spark.Core.Internal.ContextIOInternal):

createSparkSession: creates a new Spark session. This session is unique, and it will not try to reconnect to an existing session. createSparkSession' is a convenience function for simple cases that do not require monad stacks.
executeCommand1: executes a command. It performs the transforms and the optimizations in the pure state, sends the computation to the backend, waits for the terminal nodes to reach a final state, and commits the final results to the state. If any failure is detected that is internal to Karps, it returns an error; if the error comes from an underlying library (http stack, programming failure), an exception may be thrown instead. Several related entry points are exposed for debugging.
checkDataStamps: given a list of paths, checks each of these paths on the file system of the given Spark cluster to infer the status of these resources. The primary role of this function is to check how recent these resources are compared to some previous usage.

Interactive sessions (Spark.Core.Internal.ContextInteractive, Spark.Core.Context):

SparkInteractiveException: the exception thrown when a request cannot be completed in an interactive session.
createSparkSessionDef: creates a Spark session that will be used as the default session. If a session already exists, an exception is thrown.
exec1Def: executes a command using the default Spark session. This is the most unsafe way of running a command: it uses the default session and throws an exception if any error happens.
execStateDef: runs the computation described in the state transform, using the default Spark session. It throws an exception if no session currently exists.
closeSparkSessionDef: closes the default session; the default session is empty after this call completes. NOTE: this does not currently clear up the resources! It is a stub implementation used in testing.
defaultConf: the default configuration to use when the Karps server is being run locally.
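A hedged sketch of an interactive session against a locally running Karps server, using the names listed in this section (defaultConf, createSparkSessionDef, exec1Def, dataset, count, collect, asCol); exact signatures are assumptions.

    import Spark.Core.Context
    import Spark.Core.Dataset
    import Spark.Core.Functions

    main :: IO ()
    main = do
      createSparkSessionDef defaultConf         -- becomes the implicit default session
      let ds = dataset ([1, 2, 3, 4] :: [Int])
      n  <- exec1Def (count ds)                 -- run the graph, fetch one observable
      xs <- exec1Def (collect (asCol ds))       -- all elements, in canonical order
      print (n, xs)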
Input sources (Spark.IO.Inputs, Spark.IO.Internal.InputGeneric):

SourceDescription: a description of a data source, following Spark's reader API version 2. Each source consists of an input format (json, xml, etc.), an optional schema for the source, and a number of options specific to that source. Since this description is rather low-level, a number of wrappers are provided for the most popular sources that are already built into Spark.
DataFormat: the format of the source. This enumeration contains all the data formats that are natively supported by Spark, either for input or for output, and allows users to express their own format if requested.
InputOptionValue: the low-level option values accepted by the Spark reader API.
DataSchema: the schema policy with respect to a data source. It should either request Spark to infer the schema from the source, or try to match the source against a schema provided by the user.
SparkPath: a path to some data that can be read by Spark.
generic': generates a dataframe from a source description. This may trigger some calculations on the Spark side if schema inference is required.
genericWithSchema': generates a dataframe from a source description and assumes a given schema. This schema overrides whatever may have been given in the source description; if the source description specified that the schema must be checked or inferred, that instruction is overridden. While this is convenient, it may lead to runtime errors that are hard to understand if the data does not follow the given schema.
genericWithSchema: generates a dataframe from a source description and assumes a certain schema on the source.

JSON input (Spark.IO.Internal.Json):

JsonOptions: the options for the JSON input.
json', json: declare a source of data of the given data type. The source is not read at this point, it is just declared; it may be found to be invalid in subsequent computations.
jsonInfer (and related helpers): reads a source of data expected to be in the JSON format. The schema is not required and Spark will infer the schema of the source; however, all the data contained in the source may end up being read in the process.
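A hedged sketch of declaring JSON sources, using the names above (json, jsonInfer, buildType). With json the element type is supplied by the caller and nothing is read up front; jsonInfer asks Spark to infer the schema, which may scan the whole source. Argument order, return shapes and import locations are assumptions.

    {-# LANGUAGE OverloadedStrings #-}
    import Spark.IO.Inputs                       -- json / jsonInfer, as listed above
    import Spark.Core.Types (SQLType, buildType)

    -- Declare a JSON source with a known element type; nothing is read yet.
    declaredSource = json (buildType :: SQLType (Int, Double)) "/data/events.json"
    -- inferredSource = jsonInfer "/data/events.json"   -- schema inferred by Spark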
Output (Spark.IO.Internal.OutputCommon):

SaveMode: the mode used when saving the data (Overwrite, Append, Ignore, ErrorIfExists).
saveCol: inserts an action to store the given dataframe in the graph of computations. NOTE: because of some limitations in Spark, all the columns used when forming the buckets and the partitions must be present inside the column being written. These columns are appended to the column being written if they happen to be missing; the consequence is that more data may be written than expected. It returns true if the update was successful. The return type is subject to change.

Package overview:

Spark.Core: core functions and data structures to communicate with the Karps server. (c) Karps contributors, 2016. License: Apache-2.0. Maintainer: krapsh@yandex.com. Stability: experimental. Portability: POSIX.

The karps-0.2.0.0 package exposes the following modules.

Public modules: Spark.Core, Spark.Core.Context, Spark.Core.Column (column types and basic operations), Spark.Core.ColumnFunctions (column operations), Spark.Core.Dataset (dataset types and basic operations), Spark.Core.Functions, Spark.Core.Row, Spark.Core.StructuresInternal, Spark.Core.Try, Spark.Core.Types, Spark.IO.Inputs.

Internal modules: Spark.Core.Internal.{Utilities, TypesStructures, TypesStructuresRepr, TypesFunctions, TypesGenerics, OpStructures, OpFunctions, RowStructures, RowGenerics, RowGenericsFrom, RowUtils, LocatedBase, DatasetStructures, DatasetFunctions, LocalDataFunctions, DAGStructures, DAGFunctions, Pruning, ComputeDag, Paths, PathsUntyped, Caching, CachingUntyped, CanRename, ColumnStructures, ColumnStandard, ColumnFunctions, AlgebraStructures, Projections, FunctionsInternals, Arithmetics, ArithmeticsImpl, Joins, ObservableStandard, Groups, AggregationFunctions, Client, ContextStructures, ContextInternal, ContextIOInternal, ContextInteractive} and Spark.IO.Internal.{InputGeneric, Json, OutputCommon}.