Safe Haskell | None |
---|---|
Language | Haskell2010 |
This module provides composable batched marshalling.
Batching
Calls to Java methods via JNI are slow in general. Marshalling an array of primitive values can be as slow as marshalling a single value.
Because of this, reifying an iterator or a container is best done by accumulating multiple elements on the Java side before passing them to the Haskell side. Conversely, when reflecting an iterator or container, multiple Haskell values are put together before marshalling to the Java side.
Some Haskell values can be batched trivially into arrays of primitive values. Int32 can be batched in a Java int[], Double can be batched in a Java double[], etc. However, other types like Tuple2 Int32 Double would require more primitive arrays. Values of type Tuple2 Int32 Double are batched in a pair of Java arrays of type int[] and double[].
data Tuple2 a b = Tuple2 a b
More generally, the design aims to provide composable batchers. If one knows how to batch types a and b, one can also batch Tuple2 a b, [a], Vector a, etc.
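The composition idea can be illustrated without a JVM. The following is only a sketch under stated assumptions: ToyBatchable, ToyBatch, toBatch and fromBatch are hypothetical names that are not part of this module, and plain Haskell lists stand in for the Java arrays int[] and double[].

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Int (Int32)

data Tuple2 a b = Tuple2 a b
  deriving (Eq, Show)

-- Hypothetical stand-in for the module's classes: a batch here is an
-- ordinary Haskell value, not a reference to a Java object.
class ToyBatchable a where
  type ToyBatch a
  toBatch   :: [a] -> ToyBatch a
  fromBatch :: ToyBatch a -> [a]

-- A list models a Java int[].
instance ToyBatchable Int32 where
  type ToyBatch Int32 = [Int32]
  toBatch   = id
  fromBatch = id

-- A list models a Java double[].
instance ToyBatchable Double where
  type ToyBatch Double = [Double]
  toBatch   = id
  fromBatch = id

-- Knowing how to batch a and b is enough to batch Tuple2 a b:
-- the batch is a pair holding one batch per component.
instance (ToyBatchable a, ToyBatchable b) => ToyBatchable (Tuple2 a b) where
  type ToyBatch (Tuple2 a b) = (ToyBatch a, ToyBatch b)
  toBatch ts =
    ( toBatch [x | Tuple2 x _ <- ts]
    , toBatch [y | Tuple2 _ y <- ts]
    )
  fromBatch (xs, ys) = zipWith Tuple2 (fromBatch xs) (fromBatch ys)
```

In this toy model, toBatch [Tuple2 (1 :: Int32) (2.5 :: Double), Tuple2 2 3.5] yields ([1,2],[2.5,3.5]), and nesting Tuple2 inside Tuple2 yields nested pairs of lists with no extra instance.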
A reference to a batch of values in Java has the type J (Batch a), where a is the Haskell type of the elements in the batch. For example:
type instance Batch Int32 = 'Array ('Prim "int")
type instance Batch Double = 'Array ('Prim "double")
type instance Batch (Tuple2 a b) = 'Class "scala.Tuple2" <> '[Batch a, Batch b]
When defining batching for a new type, one needs to tell how batches are
represented in Java by adding a type instance to the type family Batch.
In addition, procedures for adding and extracting values from the batch
need to be specified on both the Haskell and the Java side.
On the Java side, batches are built using the interface
io.tweag.jvm.batching.BatchWriter. On the Haskell side, these
batches are read using reifyBatch.
class ( ... ) => BatchReify a where
  newBatchWriter
    :: proxy a
    -> IO (J ('Iface "io.tweag.jvm.batching.BatchWriter" <> [Interp a, Batch a]))
  reifyBatch :: J (Batch a) -> Int32 -> IO (V.Vector a)
newBatchWriter produces a Java object implementing the BatchWriter
interface, and reifyBatch reads a batch created in this fashion.
Conversely, batches can be read on the Java side using the interface
io.tweag.jvm.batching.BatchReader, and on the Haskell side these
batches can be created with reflectBatch.
class ( ... ) => BatchReflect a where
  newBatchReader
    :: proxy a
    -> IO (J ('Iface "io.tweag.jvm.batching.BatchReader" <> [Batch a, Interp a]))
  reflectBatch :: V.Vector a -> IO (J (Batch a))
newBatchReader produces a Java object implementing the BatchReader
interface, and reflectBatch creates these batches from vectors of
Haskell values.
The methods of BatchReify and BatchReflect offer default
implementations which marshal elements in the batch one at a time. Taking
advantage of batching requires defining the methods explicitly. The default
implementations are useful for cases where speed is not important, for
instance when the iterators to reflect or reify contain only one or a few
elements.
Vectors and ByteStrings are batched with the following scheme.
type instance Batch BS.ByteString = 'Class "io.tweag.jvm.batching.Tuple2" <> '[ 'Array ('Prim "byte") , 'Array ('Prim "int") ]
We use two arrays. One of the arrays contains the result of appending all of
the ByteStrings in the batch. The other array contains the offset of each
ByteString in the resulting array. See ArrayBatch.
Synopsis
- class (Interpretation a, SingI (Batch a)) => Batchable (a :: k) where
- type Batch a :: JType
- class Batchable a => BatchReify a where
- newBatchWriter :: proxy a -> IO (J ('Iface "io.tweag.jvm.batching.BatchWriter" <> [Interp a, Batch a]))
- reifyBatch :: J (Batch a) -> Int32 -> IO (Vector a)
- class Batchable a => BatchReflect a where
- newBatchReader :: proxy a -> IO (J ('Iface "io.tweag.jvm.batching.BatchReader" <> [Batch a, Interp a]))
- reflectBatch :: Vector a -> IO (J (Batch a))
- type ArrayBatch ty = 'Class "io.tweag.jvm.batching.Tuple2" <> '[ty, 'Array ('Prim "int")]
Documentation
class (Interpretation a, SingI (Batch a)) => Batchable (a :: k)
A class of types whose values can be marshaled in batches.
type Batch a :: JType
The type of Java batches for reifying and reflecting values of type a.
class Batchable a => BatchReify a where
A class for batching reification of values.
It has a method to create a batcher that creates batches in Java, and another method that reifies a batch into a vector of Haskell values.
The type of the batch used to appear as a class parameter, but we ran into https://ghc.haskell.org/trac/ghc/ticket/13582
Minimal complete definition: Nothing
newBatchWriter :: proxy a -> IO (J ('Iface "io.tweag.jvm.batching.BatchWriter" <> [Interp a, Batch a]))
Produces a batcher that aggregates elements of type Interp a (such as int)
and produces collections of type Batch a (such as int[]).
default newBatchWriter :: Batch a ~ 'Array (Interp a) => proxy a -> IO (J ('Iface "io.tweag.jvm.batching.BatchWriter" <> [Interp a, Batch a]))
reifyBatch :: J (Batch a) -> Int32 -> IO (Vector a)
Reifies the values in a batch of type Batch a.
Takes the batch and the number of elements it contains.
class Batchable a => BatchReflect a where
A class for batching reflection of values.
It has a method to create a batch reader that reads batches in Java, and another method that reflects a vector of Haskell values into a batch.
We considered having the type of the batch appear as a class parameter, but we ran into https://ghc.haskell.org/trac/ghc/ticket/13582
Minimal complete definition: Nothing
newBatchReader :: proxy a -> IO (J ('Iface "io.tweag.jvm.batching.BatchReader" <> [Batch a, Interp a]))
Produces a batch reader that receives collections of type Batch a
(such as int[]) and produces values of type Interp a (such as int).
default newBatchReader :: Batch a ~ 'Array (Interp a) => proxy a -> IO (J ('Iface "io.tweag.jvm.batching.BatchReader" <> [Batch a, Interp a]))
reflectBatch :: Vector a -> IO (J (Batch a))
Reflects the values in a vector to a batch of type Batch a.
Array batching
type ArrayBatch ty = 'Class "io.tweag.jvm.batching.Tuple2" <> '[ty, 'Array ('Prim "int")]
Batches of arrays of variable length.
The first component is an array or batch B containing the elements of all the arrays in the batch. The second component is an array of offsets F. The ith position in the offset array is the first position in B after the ith array of the batch.
Thus, the first array of the batch can be found in B between the indices 0 and F[0], the second array of the batch is between the indices F[0] and F[1], and so on.
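The layout described above can be sketched in pure Haskell, with lists standing in for the Java arrays B and F. This is only an illustration of the offset scheme: packArrays and unpackArrays are hypothetical helper names and are not part of this module.

```haskell
-- Flatten variable-length arrays into (B, F): B is the concatenation of
-- all the arrays, and F holds, for each array, the index in B just past
-- the end of that array.
packArrays :: [[a]] -> ([a], [Int])
packArrays xss = (concat xss, scanl1 (+) (map length xss))

-- Recover the arrays: the first one spans the indices 0 to F[0], and
-- every later one spans F[i-1] to F[i] (end index excluded).
unpackArrays :: ([a], [Int]) -> [[a]]
unpackArrays (b, f) = zipWith slice (0 : f) f
  where
    slice start end = take (end - start) (drop start b)
```

For instance, packArrays ["ab", "", "cde"] gives ("abcde", [2,2,5]): the empty array in the middle shows up as two equal consecutive offsets, and unpackArrays restores the original three arrays.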
Orphan instances
(Interpretation a, BatchReflect a) => Reflect (Vector a)
(Interpretation a, BatchReify a) => Reify (Vector a)
(Typeable (Dict (Reflect (Vector a))), Typeable (Dict (Interpretation a)), Typeable (Dict (BatchReflect a)), Static (Interpretation a), Static (BatchReflect a)) => Static (Reflect (Vector a))
closureDict :: Closure (Dict (Reflect (Vector a)))
(Typeable (Dict (Reify (Vector a))), Typeable (Dict (Interpretation a)), Typeable (Dict (BatchReify a)), Static (Interpretation a), Static (BatchReify a)) => Static (Reify (Vector a))
closureDict :: Closure (Dict (Reify (Vector a)))
Interpretation a => Interpretation (Vector a :: Type)
type Interp (Vector a) :: JType
(Typeable (Dict (Interpretation (Vector a))), Typeable (Dict (Interpretation a)), Static (Interpretation a)) => Static (Interpretation (Vector a))
closureDict :: Closure (Dict (Interpretation (Vector a)))