Data.Repa.Eval.Generic.Seq.Reduction

  foldAll
    Sequential reduction of all the elements in an array.
    Arguments:
      Function to get an element from the source.
      Binary associative combining function.
      Neutral starting value.
      Number of elements.

  foldInner
    Sequential reduction of a multidimensional array along the innermost dimension.
    Arguments:
      Function to write into the result buffer.
      Function to get an element from the source.
      Binary associative combination function.
      Neutral starting value.
      Total length of source.
      Inner dimension (length to fold over).

  foldRange
    Sequentially reduce values between the given indices.
    Arguments:
      Function to get an element from the source.
      Binary associative combining function.
      Neutral starting value.
      Starting index.
      Ending index.

Data.Repa.Eval.Generic.Seq.Chunked

  fillLinear
    Fill something sequentially.
    The array is filled linearly from start to finish.
    Arguments:
      Update function to write into result buffer.
      Function to get the value at a given index.
      Number of elements to fill.

  fillBlock2
    Fill a block in a rank-2 array, sequentially.
    Blockwise filling can be more cache-efficient than linear filling for rank-2 arrays.
    The block is filled in row major order from top to bottom.
    Arguments:
      Update function to write into result buffer.
      Function to get the value at an (x, y) index.
      Width of the whole array.
      x0, y0: lower left corner of block to fill.
      w0: width of block to fill.
      h0: height of block to fill.

Data.Repa.Eval.Gang

  Gang
    A Gang is a group of threads that execute arbitrary work requests.
    Fields:
      _gangThreads: Number of threads in the gang.
      _gangRequestVars: Workers listen for requests on these vars.
      _gangResultVars: Workers put their results in these vars.
      _gangBusy: Indicates that the gang is busy.

  Req
    The Req type encapsulates work requests for individual members of a gang.
    Constructors:
      ReqDo: Instruct the worker to run the given action.
      ReqShutdown: Tell the worker that we're shutting the gang down. The worker should signal that it has received the request by writing to its result var before returning to the caller (forkGang).

  gangSize
    O(1). Yield the number of threads in the Gang.

  forkGang
    Fork a Gang with the given number of threads (at least 1).

  gangWorker
    The worker thread of a Gang.
    The thread blocks on the MVar waiting for a work request.

  finaliseWorker
    Finaliser for worker threads. We want to shut down the corresponding thread when its MVar becomes unreachable. Without this, Repa programs can complain about "Blocked indefinitely on an MVar" because worker threads are still blocked on the request MVars when the program ends. Whether the finaliser is called or not is very racy. It happens about 1 in 10 runs for the repa-edgedetect benchmark, and less often with the others.
    We're relying on the comment in System.Mem.Weak that says "If there are no other threads to run, the runtime system will check for runnable finalizers before declaring the system to be deadlocked."
    If we were creating and destroying the gang cleanly we wouldn't need this, but theGang is created with a top-level unsafePerformIO. Hacks beget hacks beget hacks...

  gangIO
    Issue work requests for the Gang and wait until they complete.
    If the gang is already busy then print a warning to stderr and just run the actions sequentially in the requesting thread.

  seqIO
    Run an action on the gang sequentially.

  parIO
    Run an action on the gang in parallel.

  gangST
    Same as gangIO but in the ST monad.

Data.Repa.Eval.Generic.Par.Chunked

  fillChunked
    Fill something in parallel.
    The array is split into linear chunks, and each thread linearly fills one chunk.
    Arguments:
      Gang to run the operation on.
      Update function to write into result buffer.
      Function to get the value at a given index.
      Number of elements.

  fillChunkedIO
    Fill something in parallel, using a separate IO action for each thread.
    The array is split into linear chunks, and each thread linearly fills one chunk.
    Arguments:
      Gang to run the operation on.
      Update function to write into result buffer.
      Create a function to get the value at a given index. The first argument is the thread number, so you can do some per-thread initialisation.
      Number of elements.

Data.Repa.Eval.Generic.Par.Interleaved

  fillInterleaved
    Fill something in parallel, using a round-robin order.
    Threads handle elements in row major, round-robin order.
    Using this method helps even out unbalanced workloads.
    Arguments:
      Gang to run the operation on.
      Update function to write into result buffer.
      Function to get the value at a given index.
      Number of elements.

Data.Repa.Eval.Generic.Par.Reduction

  foldAll
    Parallel tree reduction of an array to a single value. Each thread takes an equally sized chunk of the data and computes a partial sum. The main thread then reduces the array of partial sums to the final result.
    We don't require that the initial value be a neutral element, so each thread computes a fold1 on its chunk of the data, and the seed element is only applied in the final reduction step.
    Arguments:
      Gang to run the operation on.
      Function to get an element from the source.
      Binary associative combining function.
      Starting value.
      Number of elements.

  foldInner
    Parallel reduction of a multidimensional array along the innermost dimension. Each output value is computed by a single thread, with the output values distributed evenly amongst the available threads.
    Arguments:
      Gang to run the operation on.
      Function to write into the result buffer.
      Function to get an element from the source.
      Binary associative combination operator.
      Neutral starting value.
      Total length of source.
      Inner dimension (length to fold over).

Data.Repa.Eval.Elt

  gtouch
    Generic version of touch.

  gzero
    Generic version of zero.

  gone
    Generic version of one.

  Elt
    Element types that can be used with the blockwise filling functions.
    This class is mainly used to define the touch method. This is used internally in the implementation of Repa to prevent let-bindings from being floated inappropriately by the GHC simplifier. Doing a seq sometimes isn't enough, because the GHC simplifier can erase these, and still move around the bindings.
    This class supports the generic deriving mechanism, use deriving instance Elt (TYPE).

  touch
    Place a demand on a value at a particular point in an IO computation.

  zero
    Generic zero value, helpful for debugging.

  one
    Generic one value, helpful for debugging.
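
As a rough illustration of the chunked reduction strategy documented for the parallel foldAll (a fold1 per chunk, with the seed applied only in the final step), the sketch below may help. The names chunkRanges, fold1Range and foldAllChunked are hypothetical and are not part of repa-eval. The sketch uses only base, processes the chunks one after another instead of on a Gang, and assumes the thread count is at least 1.

```haskell
module Main where

import Data.List (foldl')

-- Hypothetical helper: split the indices [0 .. len-1] into at most n
-- contiguous, non-empty half-open ranges. Assumes n >= 1.
chunkRanges :: Int -> Int -> [(Int, Int)]
chunkRanges n len =
  [ (start, end)
  | i <- [0 .. n - 1]
  , let start = i * size
        end   = min len (start + size)
  , start < end
  ]
  where
    size = (len + n - 1) `div` n   -- ceiling division

-- fold1 over a non-empty half-open range: the first element is the
-- initial accumulator, so no seed is needed for a chunk.
fold1Range :: (Int -> a) -> (a -> a -> a) -> (Int, Int) -> a
fold1Range get f (start, end) =
  foldl' (\acc i -> f acc (get i)) (get start) [start + 1 .. end - 1]

-- Reduce each chunk with fold1, then combine the partial results,
-- applying the seed only in this final step.
foldAllChunked :: Int -> (Int -> a) -> (a -> a -> a) -> a -> Int -> a
foldAllChunked threads get f seed len =
  foldl' f seed (map (fold1Range get f) (chunkRanges threads len))

main :: IO ()
main = print (foldAllChunked 4 (\i -> i + 1) (+) 0 10)   -- prints 55
```

Because the seed is combined exactly once, it does not have to be a neutral element for the result to be well defined, which matches the note in the foldAll description.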

Data.Repa.Eval.Generic.Seq.Cursored

  fillCursoredBlock2
    Fill a block in a rank-2 array, sequentially.
    Blockwise filling can be more cache-efficient than linear filling for rank-2 arrays.
    Using cursor functions can help to expose inter-element indexing computations to the GHC and LLVM optimisers.
    Coordinates given are of the filled edges of the block.
    The block is filled in row major order from top to bottom.
    We need the Elt constraint so that we can use its touch function to provide an order of evaluation amenable to the LLVM optimiser.
    You should compile your Haskell program with -fllvm -optlo-O3 to enable LLVM's Global Value Numbering optimisation.
    Arguments:
      Update function to write into result buffer.
      Make a cursor to a particular element from an (x, y) index.
      Shift the cursor by an (x, y) offset.
      Function to evaluate an element at the given index.
      Width of the whole array.
      x0, y0: lower left corner of block to fill.
      w0: width of block to fill.
      h0: height of block to fill.

Data.Repa.Eval.Generic.Par.Cursored

  fillBlock2
    Fill a block in a rank-2 array in parallel.
    Blockwise filling can be more cache-efficient than linear filling for rank-2 arrays.
    Coordinates given are of the filled edges of the block.
    We divide the block into columns, and give one column to each thread.
    Each column is filled in row major order from top to bottom.

  fillCursoredBlock2
    Fill a block in a rank-2 array in parallel.
    Blockwise filling can be more cache-efficient than linear filling for rank-2 arrays.
    Using cursor functions can help to expose inter-element indexing computations to the GHC and LLVM optimisers.
    Coordinates given are of the filled edges of the block.
    We divide the block into columns, and give one column to each thread.
    We need the Elt constraint so that we can use its touch function to provide an order of evaluation amenable to the LLVM optimiser.
    You should compile your Haskell program with -fllvm -optlo-O3 to enable LLVM's Global Value Numbering optimisation.

  Arguments of fillBlock2 (parallel):
      Update function to write into result buffer.
      Function to evaluate the element at an (x, y) index.
      Width of the whole array.
      x0, y0: lower left corner of block to fill.
      w0: width of block to fill.
      h0: height of block to fill.

  Arguments of fillCursoredBlock2 (parallel):
      Gang to run the operation on.
      Update function to write into result buffer.
      Make a cursor from an (x, y) index.
      Shift the cursor by an (x, y) offset.
      Function to evaluate the element at an index.
      Width of the whole array.
      x0, y0: lower left corner of block to fill.
      w0: width of block to fill.
      h0: height of block to fill.

Name index

  Package key: repae_18oIhJ7bAIOJXF0MOJ8L0y

  Modules:
    Data.Repa.Eval.Generic.Seq
    Data.Repa.Eval.Gang
    Data.Repa.Eval.Generic.Par
    Data.Repa.Eval.Elt
    Data.Repa.Eval.Generic.Seq.Reduction
    Data.Repa.Eval.Generic.Seq.Chunked
    Data.Repa.Eval.Generic.Par.Chunked
    Data.Repa.Eval.Generic.Par.Interleaved
    Data.Repa.Eval.Generic.Par.Reduction
    Data.Repa.Eval.Generic.Seq.Cursored
    Data.Repa.Eval.Generic.Par.Cursored

  Names:
    foldAll, foldInner, foldRange, fillLinear, fillBlock2, Gang, gangSize, forkGang, gangIO, gangST, fillChunked, fillChunkedIO, fillInterleaved, Elt, touch, zero, one, fillCursoredBlock2
    foldRangeInt, foldRangeFloat, foldRangeDouble, unboxInt, unboxFloat, unboxDouble
    _gangThreads, _gangRequestVars, _gangResultVars, _gangBusy, Req, ReqDo, ReqShutdown, gangWorker, finaliseWorker, seqIO, parIO
    gtouch, gzero, gone, GElt, DIM2

  External names:
    stderr (base, GHC.IO.Handle.FD)
    ST (GHC.ST)
    seq (ghc-prim, GHC.Prim)

  Instances:
    $fShowGang
    $fElt(,,,,,), $fElt(,,,,), $fElt(,,,), $fElt(,,), $fElt(,)
    $fEltWord64, $fEltWord32, $fEltWord16, $fEltWord8, $fEltWord
    $fEltInt64, $fEltInt32, $fEltInt16, $fEltInt8, $fEltInt
    $fEltDouble, $fEltFloat, $fEltChar, $fEltBool
    $fGEltK1, $fGEltM1, $fGElt:+:, $fGElt:*:, $fGEltU1
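
To make the block-filling descriptions more concrete, here is a minimal sketch of row major block filling into a flat buffer, in the spirit of the sequential fillBlock2. The function fillBlock2Sketch and its argument types are assumptions made for illustration and do not reproduce the library's actual signature. It uses only base, records writes in an IORef instead of a real buffer, and takes (x0, y0) as the first filled cell.

```haskell
module Main where

import Control.Monad (forM_)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical sketch: visit the block row by row, and within each row
-- from left to right, writing each value at its linear index.
fillBlock2Sketch
  :: (Int -> a -> IO ())   -- update function: write a value at a linear index
  -> (Int -> Int -> a)     -- get the value at an (x, y) index
  -> Int                   -- width of the whole array
  -> Int -> Int            -- x0, y0: first cell of the block
  -> Int -> Int            -- w0, h0: width and height of the block
  -> IO ()
fillBlock2Sketch write get width x0 y0 w0 h0 =
  forM_ [y0 .. y0 + h0 - 1] $ \y ->
    forM_ [x0 .. x0 + w0 - 1] $ \x ->
      write (x + y * width) (get x y)

main :: IO ()
main = do
  ref <- newIORef []
  -- Fill a 2x2 block starting at (1, 1) in an array of width 4.
  fillBlock2Sketch (\ix v -> modifyIORef' ref ((ix, v) :))
                   (\x y -> x * 10 + y) 4 1 1 2 2
  writes <- readIORef ref
  print (reverse writes)   -- prints [(5,11),(6,21),(9,12),(10,22)]
```

A parallel variant along the lines described for Data.Repa.Eval.Generic.Par.Cursored would split the range of x values into one column per thread and run the same loop on each column.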