d)c      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~                               !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abNone*23468;<=BHKMVWorkaround for slice-wise currined filling functions inlining issues. See comment to  for details.cccNone*23468;<=BHKMddNone*23468;<=BHKM  !"  !"  !"None*23468;<=BHKMefghijklefghijklNone*23468;<=BHKM8#$%&m'()*+n,-./01o2345678p9:;<=>?@qABCDEFGHIrJKLMNOPQRSs1#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRS*%&m#$)*+n'(./01o,-45678p23;<=>?@q9:CDEFGHIrABLMNOPQRSsJKNone*23468;<=BHKMUWell-known missed in Data.List.Split function.T$Number of chunks to split range on (n)Start of range End of rangeSplit index functionU List to splitNumber of chuncks (n)Exactly n even chunks of the initial listTUTUTUNone*23468;<=BHKMVGeneralizes right folds.&To be passed to fold combinators from Data.Yarr.Walk module.WGeneralizes left folds.&To be passed to fold combinators from Data.Yarr.Walk module.X^Generalizes both partially applied left and right folds, as well as walks with mutable state."To be passed to walk runners from Data.Yarr.Walk module.Y;Alias to frequently used get-write-from-to arguments combo.%To be passed as 1st parameter of all  ing functions from Data.Yarr.Eval module.ZCurried version of X. 
Identical to [ , indeed.[Abstracts interval works: Ys, Zs.To be passed to functions from Data.Yarr.Utils.Fork module or called directly.VGeneralized right reduceCurried result stateful walkWGeneralized left reduceCurried result stateful walkX Initial stateIndexing function5Curried result function -- walker, emits final stateYIndexing functionWriting function!Curried result function -- workerZ6Lower bound (start for left walks, end for right ones)Upper bound (end or start)Result[ Lower bound Upper boundResultVWXYZ[VWXYZ[None*23468;<=BHKM] Version of \ which discards results.\(Number of threads to parallelize work on2Per-thread work producer, passed thread number [0..threads-1]Results](Number of threads to parallelize work on2Per-thread work producer, passed thread number [0..threads-1]\]\]\]None*23468;<=BHKMtu^_`abcdefghivwxyz{|}~   !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRS^_`abcdefghik bcdefghi !"^_`a#$#$#$&%'('('(+*),-,-,-10/.232323876549:9:9:@?>=<;ABABABIHGFEDCJKJKJKSRQPONMLtu^_`abcdefghivwNone*23468;<=BHKM jGHC simplifier tends to float numeric comparsions as high in execution graph as possible, which in conjunction with loop unrolling sometimes leads to dramatic code bloat. I'm not sure -MS functions work at all, but strict versions defenitely keep comparsions unfloated.kMaybe sequential .lDefinetely sequential .mMaybe sequential .nDefinetely sequential .oMaybe sequential clamp.pDefinetely sequential clamp.q{Mainly used to fight against GHC simplifier, which gives no chance to LLVM to perform Global Value Numbering optimization. 
Copied from repa, see [http://hackage.haskell.org/packages/archive/repa/3.2.3.1/doc/html/Data-Array-Repa-Eval.htmlr8The function intented to be passed as 3rd parameter to  unrolled- functions in  class and .NIf your loading operation is strictly local by elements (in most cases), use s instead of this function.s Alias to (\_ -> return ()).jklmno Min bound Max boundValue to clampValue in boundsp Min bound Max boundValue to clampValue in boundsqrstuvwxyz{|}~ jklmnopqrsqr~}|{zysjklmnopxwvutjklmnopqrstuvwxyz{|}~None*23468;<=BHKMNone*23468;<=BHKMNone)*23468;<=BHKMFor internal use.TODO: implement for  and merge with  class8Class for column-major, regular composite array indices.0, (0, 0),  (0, 0, 0) size is , size (3, 5) == 15 '(1, 2, 3) `plus` (0, 0, 1) == (1, 2, 4) (1, 2) `minus` (1, 0) == (0, 2)  offset ==  SConverts linear, memory index of shaped array to shape index without bound checks. fromLinear (3, 4) 5 == (1, 1) Opposite to R, converts composite array index to linear, "memory" index without bounds checks. for  shapes. toLinear (5, 5) (3, 0) == 15PComponent-wise minimum, returns maximum legal index for all given array extents Component-wise maximum, used in Data.Yarr.Convolution implementation.%Standard left fold wothout unrolling.wThis one and 5 following functions shouldn't be called directly, they are intented to be passed as first argument to   and functions from Data.Yarr.Work module.2Standard right folding function without unrolling.LStandard fill without unrolling. To avoid premature optimization just type fill each time you want to  # array to manifest representation.MMainly for internal use. Abstracts top-left -- bottom-right pair of indices.X2D-unrolled filling to maximize profit from "Global value numbering" LLVM optimization.Example:  blurred <-   ( ! 
(dim2BlockFill   r)) delayedBlurred( Outer block Inner blockShavingsExtent of array Linear index Shape indexExtent of array Shape index Linear indexSeveral array extentsMaximum common shape index Unroll factorr or s:Result curried function to be passed to working functions Unroll factorr or s:Result curried function to be passed to working functions Unroll factorr or s:Result curried function to by passed to loading functionsBlock size by x. Use  - " values.Block size by yr or s=Result curried function to be passed to loading functions.*VWXYZ[qrs*[ZYXWVqrs None*23468;<=BHKM%Number of threads to fork the work on Lower bound Upper boundPer-slice interval worksJProducer of per-thread work, which returns piece of result for each slice%Number of threads to fork the work on)(lower bound, upper bound) for each slicePer-slice interval worksWProducer of per-thread work, which returns pieces of results: [(slice number, result)]%Number of threads to fork the work on Lower bound Upper bound Interval workProducer of per-thread workNone*23468;<=BHKM\Type level fixation of preferred work (load, fold, etc.) index type of the array load type. Parameters:l - load type indexsh - shape of arraysi - preferred work index, Int or sh itself+Internal implementation class. Generalizes linear-. and simple indexing and writing function in  and  classes.tClass for mutable arrays of vectors. The class doesn't need to define functions, it just gathers it's dependencies.!Counterpart for "simple" arrays: .Class for arrays which could be created. It combines a pair of representations: freezed and mutable (raw). This segregation is lifted from Boxed representation and, in the final, from GHC system of primitive arrays. 
Parameters:r - freezed array representation.mr$ - mutable, raw array representationl3 - load type index, common for both representationssh - shape of arraysa - element typeO(1)6 Creates and returns mutable array of the given shape.O(1)@ Freezes mutable array and returns array which could be indexed.O(1)1 Thaws freezed array and returns mutable version.Class for mutable arrays.Just like for 0, it's function are unsafe and require calling  after the last call.Minimum complete defenition:  or .#Counterpart for arrays of vectors: Shape, genuine monadic writing.Default implementation: #write tarr sh = linearWrite tarr $  ( tarr) sh@Fast (usually), linear indexing. Intented to be used internally.Default implementation: "linearWrite tarr i = write tarr $  ( tarr) iClass for arrays of vectors which could be indexed. The class doesn't need to define functions, it just gathers it's dependencies.!Counterpart for "simple" arrays: .(Class for arrays which could be indexed.It's functions are unsafe: you must call T after the last call. Fortunately, you will hardly ever need to call them manually.Minimum complete defenition:  or .#Counterpart for arrays of vectors:  Shape, genuine monadic indexing.In Yarr arrays are always g:-indexed and multidimensionally square. Maximum index is  (extent arr).Default implementation: !index arr sh = linearIndex arr $  ( arr) sh"Surrogate" linear index. For  arrays index == linearIndex.Default implementation:  linearIndex arr i = index arr $  ( arr) iClass for arrays of vectors. Paramenters:rH - (entire) representation. Associated array type for this class is  r sh (v e).slr - slice representationl - load typesh - shapev - vector typee - vectorD (not array) element type. Array element type is entire vector: (v e).!Counterpart for "simple" arrays: .O(1)M Array of vectors -> vector of arrays. Think about this function as shallow "# from Prelude. 
Slices are views of an underlying array.Example: %let css = slices coords xs = css  0 ys = css  1 This class generalizes  and . Paramenters:r - representation,l - load type,sh - shape,a - element type.#Counterpart for arrays of vectors: . Returns the extent an the array.(Calling this function on foreign array ($q) ensures it is still alive (GC haven't picked it). In other manifest representations, the function defined as  return (). o is lifted to top level in class hierarchy because in fact foreign representation is the heart of the library.O(1) Ensures that array "and all it's real manifest sourcesP are fully evaluated. This function is not for people, it is for GHC compiler.Default implementation: #force arr = arr `deepseq` return ()" "   None*23468;<=BHKMGeneral shape load type index. s with  load type index specialize  and  and leave  and  functions defined by default.Type-level distinction between  inear and ^aped arrays is aimed to avoid integral division operations while looping through composite (,  ) indices.BIntegral division is very expensive operation even on modern CPUs.Linear load type index. s with  load type index define  and  and leave  and  functions defined by default.This class extends  just like  extends V. It abstracts slice-wise loading from one array type to another in specified range.O(n)R Loads vectors from source to target in specified range, slice-wise, in parallel.O(n)^ Sequentially loads vector elements from source to target in specified range, slice by slice.Class abstracts separated in time and space loading A of one array type to another. Result of running functions with -Slices- infix is always identical3 to result of running corresponding function from  class.  and H are just about performance. If target representation is separate (ex. (% $) ), using  may be faster than ( because of per-thread memory locality. 
Parameters: r - source representationslr - source slice representationl - source load type tr - target representationtslr - target slice representationtl - target load typesh - shape of arraysv - source vector typev2 - target vector typee; - vector element type, common for source and target arraysO(n)H Entirely, slice-wise loads vectors from source to target in parallel.Example: J-- blurred and delayedBlurred are arrays of color components. loadSlicesP   delayedBlurred blurred O(n)B Sequentially loads vectors from source to target, slice by slice.YClass abstracts pair of arrays which could be loaded in just specified range of indices.4"Range" is a multidimensional segment: segment for  arrays, square for  arrays and cube for D. Thus, it is specified by pair of indices: "top-left" (minimum is ") and "bottom-right" (maximum is ( arr tarr) ) corners.O(n)F Loads elements from source to target in specified range in parallel.Example: 5let ext = extent convolved res <- new ext rangeLoadP  + convolved res (5, 5) (ext `minus` (5, 5)) O(n)F Sequentially loads elements from source to target in specified range.PThis class abstracts pair of array types, which could be loaded one to another. Parameters:r& - source representation. Instance of D class. Typically one of fused representations: &, (% &) or '(.l - source load typetr& - target representation. Instance of  class.tl - target load typesh - shape of arraysa - array element type#Counterpart for arrays of vectors: .TODO:T this class seems to be overengineered, normally it should have only 3 parameters:  Load l tl sh. But Convoluted ('(;) representation is tightly connected with it's load type.Used in fillA parameter function. There are two options for this type to be: sh itself or Int . Don't confuse this type with load type indexes: r and ls. 
There are 2 different meanings of word "index": data type index (haskell term) and array index (linear, shape).O(n)- Entirely loads source to target in parallel.dFirst parameter is used to parameterize loop unrolling to maximize performance. Default choice is ! -- vanilla not unrolled looping. Examples: tarr <-  ( arr) loadP   arr tarr loadP (   r) ( 2) arr tarr O(n) Sequential analog of # function. Loads source to target ly.Example: loadS (  s)  arr tarrYThere are 2 common ways to parameterize parallelism: a) to say "split this work between na threads" or b) to say "split this work between maximum reasonable number of threads", that is  capabilities . Since + function is monadic, we need always pass IO Intv as thread number parameter in order not to multiply number of functions in this module (there are already too many). Alias to . Alias to .O(n)A This function simplifies the most common way of loading arrays. Instead of  mTarget <-  (extent source)    source mTarget target <-  mTarget You can write just target <- compute (  ) source!Most common parallel use case of .  dComputeP =  (  )#Most common sequential use case of .  
dComputeS =  ( )/Determines maximum common range of 2 arrays -  ion of their s.(Fill function to work  on slices+Number of threads to parallelize loading onSource array of vectorsTarget array of vectorsTop-left)and bottom-right corners of range to loadFill function to work  on slicesSource array of vectorsTarget array of vectorsTop-left)and bottom-right corners of range to loadFill function to work  on slices+Number of threads to parallelize loading onSource array of vectorsTarget array of vectorsFill function to work  on slicesSource array of vectorsTarget array of vectorsFilling (real worker) function+Number of threads to parallelize loading on Source array Target array Top-left )and bottom-right corners of range to loadFilling (real worker) function Source array Target arrayTop-left)and bottom-right corners of range to loadFilling (real worker) function+Number of threads to parallelize loading on Source array Target arrayFilling (real worker) function Source array Target arrayLoading function Source array"Entirely loaded from the source, d manifest target arrayYY None*23468;<=BHKMLike , this class abstracts the pair array types, which should be fused one to another on maps and zips which accept index of element (several elements for zips) in array (arrays). Parameters:rA - source array representation. Determines result representation.l - source load typefr (fused repr) - result (fused) array representation. 
Result array isn't indeed presented in memory, finally it should be  d or  ed to  representation.fl - result, "shaped" load typesh - shape of arraysFAll functions are already defined, using non-injective versions from  class.*The class doesn't have vector counterpart.O(1)' Pure element mapping with array index.O(1)$ Monadic element mapping with index.O(1)% Pure zipping of 2 arrays with index.O(1)( Monadic zipping of 2 arrays with index.O(1)% Pure zipping of 3 arrays with index.O(1)( Monadic zipping of 3 arrays with index.O(1)W Generalized pure element zipping with index in arrays. Zipper function is wrapped in   for injectivity.O(1) Monadic version of  function.Like G, for mappings/zippings with array index. Used to define functions in .Minimum complete defenition: , ,  and .*The class doesn't have vector counterpart.XThis class abstracts pair of array types, which could be (preferably should be) mapped (fused)& one to another. Injective version of  class. Parameters:rD - source array representation. It determines result representation.fr (fused repr) - result (fused) array representation. Result array isn't indeed presented in memory, finally it should be  d or  ed to  representation.l0 - load type, common for source and fused arrayssh - shape of arraysEAll functions are already defined, using non-injective versions from  class.TThe class doesn't have vector counterpart, it's role play top-level functions from Data.Yarr.Repr.Separate module.O(1) Pure element mapping.Main basic "map" in Yarr.O(1) Monadic element mapping.O(1)6 Zipping 2 arrays of the same type indexes and shapes.Example: %let productArr = dzip2 (*) arr1 arr2 O(1) Monadic version of  function.O(1)6 Zipping 3 arrays of the same type indexes and shapes.O(1) Monadic version of  function.O(1)P Generalized element zipping with pure function. Zipper function is wrapped in   for injectivity.O(1) Monadic version of  function.&Generalized, non-injective version of . 
Used internally.Minimum complete defenition: , ,  and .TThe class doesn't have vector counterpart, it's role play top-level functions from Data.Yarr.Repr.Separate module.%Indexed mapping function Source arrayFused result array Indexed monadic mapping function Source arrayResult fused arrayIndexed zipping function1st source array2nd source arrayFused result array Indexed monadic zipping function1st source array2nd source arrayFused result arrayIndexed zipping function1st source array2nd source array3rd source arrayFused result array Indexed monadic zipping function1st source array2nd source array3rd source arrayFused result arrayAccepts index in array and returns wrapped zipper, which positionally accepts elements from source arrays and emits element for the result arrayBunch of source arraysResult fused arrayMonadic indexed zipper Source arraysResult fused array.......Element mapper function Source array Result arrayMonadic element mapper function Source array Result arrayPure element zipper function1st source array2nd source arrayFused result arrayMonadic element zipper function1st source array2nd source array Result arrayPure element zipper function1st source array2nd source array3rd source array Result arrayMonadic element zipper function1st source array2nd source array3rd source arrayFused result arrayeWrapped function positionally accepts elements from source arrays and emits element for fused array Source arrays Result arrayWrapped monadic zipper Source arrays Result array.......$% None*23468;<=BHKMIn opposite to .elayed (source) Delayed Target holds abstract writing function: (sh -> a -> IO ()). It may be used to perform arbitrarily tricky things, because no one obliges you to indeed write an element inside wrapped function.DDelayed representation is a wrapper for arbitrary indexing function. 
D  sh a instance holds linear getter ( (Int -> IO a)), and  D  sh a - shaped, "true"  (sh -> IO a) index, respectively.D?elayed arrays are most common recipients for fusion operations.3Load type preserving wrapping arbirtary array into elayed representation.3Wrap indexing function into delayed representation.Use this function carefully, don't implement through it something that has specialized implementation in the library (mapping, zipping, etc).<Suitable to obtain arrays of constant element, of indices (fromFunction sh  ), and so on.Wraps ( arr) into Delayed representation. Normally you shouldn't need to use this function. It may be dangerous for performance, because preferred $ing type of source array is ignored."Extent of arrayIndexing function Result arrayExtent of arrayLinear ndexing function Result array  None*23468;<=BHKM `SEparate meta array representation. Internally SEparate arrays hold vector of it's slices (so,  is just getter for them).Mostly useful for:Separate in memory manifest $'oreign arrays ("Unboxed" arrays in vector/repa libraries terms)./Element-wise vector array fusion (see group of  functions).  Group of  f-...-Elems-) functions is used internally to define  d-...-Elems- functions.O(1)) Injective element-wise fusion (mapping).Example: (let domainHSVImage = dmapElems (`; (* 360) (* 100) (* 100)) normedHSVImage  Also, used internally to define ) function.O(1) Monadic vesion of  function.O(1)? Generalized element-wise zipping of several arrays of vectors.O(1)F Generalized monadic element-wise zipping of several arrays of vectorsO(1)z Glues several arrays of the same type into one separate array of vectors. 
All source arrays must be of the same extent.Example: !let separateCoords = fromSlices (` xs ys zs)O(depends on mapper function)+ Maps slices of separate array "entirely".eThis function is useful when operation over slices is not element-wise (in that case you should use )): -let blurredImage = unsafeMapSlices blur imagetThe function is unsafe because it doesn't check that slice mapper translates extents uniformly (though it is pure).O(0)A Converts separate vector between vector types of the same arity.Example: -- floatPairs ::  ( $)  (   Float) let cs ::  ( $)  (*+$ Float) cs = convert floatPairs  . . . . ....Vector of mapper functionsSource array of vectors Fused array%Elemen-wise vector of monadic mappersSource array of vectors Result array....Vector of wrapped m-ary element-wise zippers"Vector of source arrays of vectorsFused result arrayVector of wrapped m-!ary element-wise monadic zippers"Vector of source arrays of vectors Result array!Slice mapper without restrictionsSource separate arrayResult separate array               None*23468;<=BHKMO(1) Function from repa.O(1)/ Function for in-place zipping vector elements. Always true: zipElems f arr ==  (  f) ( arr)Example: let s = zipElems ( ) coordsO(1)B Maps elements of vectors in array uniformly. Don't confuse with 3, which accepts a vector of mapper for each slice.$Typical use case -- type conversion: @let floatImage :: UArray F Dim2 Float floatImage = mapElems  word8Image O(1) Monadic version of  function. Don't confuse with .Example: let domained = mapElemsM (, 0.0 1.0) floatImage O(1)! Generalized zipping of 2 arrays.Main basic "zipWith" in Yarr.^Although sighature of this function has extremely big predicate, it is more permissible than C counterpart, because source arrays shouldn't be of the same type.Implemented by means of = function (source arrays are simply delayed before zipping).!O(1)V Generalized zipping of 3 arrays, which shouldn't be of the same representation type. 
6Function to produce result extent from source extent.gFunction to produce elements of result array. Passed a lookup function to get elements of the source.Source array itself Result array Unwrapped n-ary zipper functionSource array of vectors Result array Mapper function for all elementsSource array of vectorsFused array of vectors&Monadic mapper for all vector elementsSource array of vectorsFused array of vectors Pure zipping function1st source array2nd source arrayFused result array!Pure zipping function1st source array2nd source array3rd source array Result array"#$%&') !"#$%&') !"#$%&'  !"#$%&'None*23468;<=BHKM(Foreign Slice representation, view slice representation for )oreign arrays.9To understand Foreign Slices, suppose you have standard image array of  )  (   Word8) type.+It's layout in memory (with array indices): : r g b | r g b | r g b | ... (0, 0) (0, 1) (0, 2) ...  &let (VecList [reds, greens, blues]) = 6 image -- reds, greens, blues :: UArray FS Dim2 Word8 Now bluesC just indexes each third byte on the same underlying memory block: 8... b | ... b | ... b | ... (0, 0) (0, 1) (0, 2)... )6Foreign representation is the heart of Yarr framework.!Internally it holds raw pointer (~), which makes indexing foreign arrays not slower than GHC's built-in primitive arrays, but without freeze/thaw boilerplate.bForeign arrays are very permissible, for example you can easily use them as source and target of  ;ing operation simultaneously, achieving old good in-place C-style array modifying:  !  
(  arr) arr(Foreign arrays are intented to hold all H types and vectors of them (because there is a conditional instance of Storalbe class for s of s too).*O(1)* allocates zero-initialized foreign array.Needed because common ' function allocates array with garbage.+O(1)B Returns pointer to memory block used by the given foreign array.WMay be useful to reuse memory if you don't longer need the given array in the program: brandNewData <- , ext ( (toForeignPtr arr)) ,O(1)& Wraps foreign ptr into foreign array.oThe function is unsafe because it simply don't (and can't) check anything about correctness of produced array.()*+,    ()*+,)(*+,()*+,    -None*23468;<=BHKM      None*23468;<=BHKM-O(1).O(1)/O(1)0O(1)1O(1)2O(1)3O(1)4O(1) Version of 3C, accepts mutating function which additionaly accepts array index.5O(1)6O(1)7O(n). Walk with state, with non-indexed function (. group of fold combinators, 3).Example: . = walk (1  (:)) (return [])8O(n)* Walk with state, with indexed function (, , 4, etc).Example: res <- iwalk (& (\s i a -> ...)) foldZero sourceArray9O(n)0 Walk with state, in specified range of indices.:O(n)8 Run associative non-indexed stateful walk, in parallel.=Example -- associative image histogram filling in the test: Fhttps://github.com/leventov/yarr/blob/master/tests/lum-equalization.hs;O(n)4 Run associative indexed stateful walk, in parallel.<O(n)? 
Run associative stateful walk in specified range, in parallel.=O(n)R Walk with state, with non-indexed function, over each slice of array of vectors.>O(n)N Walk with state, with indexed function, over each slice of array of vectors.?O(n)V Walk with state, in specified range of indices, over each slice of array of vectors.@O(n)Y Run associative non-indexed stateful walk over slices of array of vectors, in parallel.AO(n)U Run associative indexed stateful walk over slices of array of vectors, in parallel.BO(n)a Run associative stateful walk in specified range, over slices of array of vectors, in parallel.- or curried Monadic left reduce2Result stateful walk to be passed to walk runners. or curried Pure left reduce2Result stateful walk to be passed to walk runners/ or curried Pure indexed left reduce2Result stateful walk to be passed to walk runners0 or curried Monadic right reduce2Result stateful walk to be passed to walk runners1 or curried Pure right reduce2Result stateful walk to be passed to walk runners2 or curried Pure indexed right reduce2Result stateful walk to be passed to walk runners3 or curried  . If mutating is associative,  is also acceptable.K(state -> array element -> (state has changed)) -- State mutating function2Result stateful walk to be passed to walk runners4 or curried  . If mutating is associative,  is also acceptable.Indexed state mutating function2Result stateful walk to be passed to walk runners567Stateful walking function7Monadic initial state (fold zero). Wrap pure state in . Source arrayFinal state (fold result)8Stateful walking function7Monadic initial state (fold zero). Wrap pure state in . Source arrayFinal state (fold result)9Stateful walking function7Monadic initial state (fold zero). Wrap pure state in . Source arrayTop-left,and bottom-right corners of range to walk inFinal state (fold result):(Number of threads to parallelize walk on%Associative stateful walking function(Monadic zero state. 
Wrap pure state in .*Associative monadic state joining function Source arrayGathered state (fold result);(Number of threads to parallelize walk on%Associative stateful walking function(Monadic zero state. Wrap pure state in .*Associative monadic state joining function Source arrayGathered state (fold result)<(Number of threads to parallelize walk on%Associative stateful walking function(Monadic zero state. Wrap pure state in .*Associative monadic state joining function Source arrayTop-left,and bottom-right corners of range to walk inGathered state (fold result)=$Stateful slice-wise walking function7Monadic initial state (fold zero). Wrap pure state in .Source array of vectors%Vector of final states (fold results)>$Stateful slice-wise walking function7Monadic initial state (fold zero). Wrap pure state in .Source array of vectors%Vector of final states (fold results)?$Stateful slice-wise walking function7Monadic initial state (fold zero). Wrap pure state in .Source array of vectorsTop-left,and bottom-right corners of range to walk in%Vector of final states (fold results)@(Number of threads to parallelize walk on$Stateful slice-wise walking function(Monadic zero state. Wrap pure state in .*Associative monadic state joining functionSource array of vectors$Vector of gathered per slice resultsA(Number of threads to parallelize walk on$Stateful slice-wise walking function(Monadic zero state. Wrap pure state in .*Associative monadic state joining functionSource array of vectors$Vector of gathered per slice resultsB(Number of threads to parallelize walk on$Stateful slice-wise walking function(Monadic zero state. Wrap pure state in .*Associative monadic state joining functionSource array of vectorsTop-left,and bottom-right corners of range to walk in$Vector of gathered per slice resultsVWX-./0123456789:;<=>?@AB./-1205634789:;<=>?@ABXWV-./0123456789:;<=>?@ABNone*23468;<=BHKMDYarr something to stderr.EYarr something as  message.CDE !CDECDE CDE! 
None*23468;<=BHKMFMutable Boxed is a wrapper for ".GG%oxed representation is a wrapper for # from  primitiver package. It may be used to operate with arrays of variable-lengths or multiconstructor ADTs, for example, lists.For /0% element types you would better use $oreign arrays.TODO:. test this representation at least one time...FG$%&'()*+,-./0123456 FG$%GFFG&'()*+,-/.$0123465%'None*23468;<=BHKMH ConVolution  / type is specialized to load convoluted arrays. It loads 7 with 8& and borders outside the center with 9 separately.qIt is even able to distribute quite expensive border loads evenly between available threads while parallel load.MElement-wise Loading convoluted arrays wasn't inlined propely with unrolled Ying (, ). However, with simple  performance was OK.For details see chttp://stackoverflow.com/questions/14748900/ghc-doesnt-perform-2-stage-partial-application-inliningALMOST SOLVED:9 you just need to support unrolled filling function with INLINE pragma, see  :https://github.com/leventov/yarr/blob/master/tests/blur.hs, ffill function.IDConvolution fused representation internally keeps 2 element getters:slow  border getd, which checks every index from applied stencil to lay inside extent of underlying source array.fast  center get(, which doesn't worry about bound checksand 7 .JCRetreives fast center get from convoluted array and wraps it into  elayed array.Remember that array indexing in Yarr is always zero-based, so indices in result array are shifted by top-level corner offset of given convoluted array.HI:;<=978J>?@ABCDEHI:;<=978J HIJ>?@ABDC:;<=978E1None*23468;<=BHKMFGHIJKLMFGHIJKLM2None*23468;<=BHKMKGeneralized static  stencil.O(Stencil values, packed in nested vectorsPGeneralized reduce functionQ Reduce zeroRGeneralized static  stencil.VGeneralized reduce functionW Reduce zeroX5QuasiQuoter for producing typical numeric convolving ? stencil, which effectively skips unnecessary multiplications. 
[dim1St| 1 4 6 4 1 |]Produces R  (  [\ acc a -> return (acc + a), \ acc a -> (return $ (acc + (4 * a))), \ acc a -> (return $ (acc + (6 * a))), \ acc a -> (return $ (acc + (4 * a))), \ acc a -> return (acc + a)]) (\ acc a reduce -> reduce acc a) (return 0) Y Most useful  stencil producer.Typing <[dim2St| 1 2 1 0 0 0 -1 -2 -1 |]  Results to K   (  [  [\ acc a -> return (acc + a), \ acc a -> (return $ (acc + (2 * a))), \ acc a -> return (acc + a)],  d [\ acc _ -> return acc, \ acc _ -> return acc, \ acc _ -> return acc],   [\ acc a -> return (acc - a), \ acc a -> (return $ (acc + (-2 * a))), \ acc a -> return (acc - a)]]) (\ acc a reduce -> reducej acc a) (return 0) ZCurried version of [5 with border get clamping indices out of bounds to 0 or ( source).[ Convolves  array with static stencil.\Clamps 6 index out of bounds to the nearest one inside bounds.] Defined as #dConvolveShDim2WithStaticStencil = ^ \Example: Jlet gradientX = dConvolveLinearDim2WithStaticStencil [Y]| -1 0 1 -2 0 2 -1 0 1 |] image ^ Convolves  array with #aped load type with static stencil._ Analog of ] to convolve arrays with inear load index.` Analog of ^ to convolve arrays with inear load index.NKLMNOPQRSTUVWXOYPQZConvolution stencil Source arrayFused convolved result array[(Source array -> Extent of this array -> Index (may be out of bounds) -> Result value): Border index (to treat indices near to bounds)Convolution stencil Source arrayFused convolved result array\]Convolution stencil Source arrayFused convolved result array^(Source array -> Extent of this array -> Index (may be out of bounds) -> Result value): Border index (to treat indices near to bounds)Convolution stencil Source arrayFused convolved result array_Convolution stencil Source arrayFused convolved result array`(Source array -> Extent of this array -> Index (may be out of bounds) -> Result value): Border index (to treat indices near to bounds)Convolution stencil Source arrayFused convolved result 
toList: O(n) Convert array to flat list. Multidimensional arrays are flattened in column-major order:

    [(elem at (0, .., 0, 1)), (elem at (0, .., 0, 2)), ...]

fromList: O(n) Loads a manifest array into memory, with elements from a flattened list.

Use this function as a last resort; there are plenty of ways to Load an array, from a Delayed array for example.

Arguments of fromList:
  Extent of array
  Flattened elements
  Result manifest array

Package: yarr-1.3.3

Modules: Data.Yarr.Repr.Foreign, Data.Yarr.Base, Data.Yarr, Data.Yarr.Utils.FixedVector, Data.Yarr.Utils.Split, Data.Yarr.Shape, Data.Yarr.Utils.Parallel, Data.Yarr.Utils.Primitive, Data.Yarr.Utils.Fork, Data.Yarr.Eval, Data.Yarr.Fusion, Data.Yarr.Repr.Delayed, Data.Yarr.Repr.Separate, Data.Yarr.Flow, Data.Yarr.Walk, Debug.Yarr, Data.Yarr.Repr.Boxed, Data.Yarr.Convolution, Data.Yarr.IO.List, Data.Yarr.Utils.FixedVector.InlinableArity, Data.Yarr.Utils.FixedVector.VecTuple, Data.Yarr.Utils.FixedVector.Arity, Data.Yarr.Utils.FixedVector.InlinableArityInstances, Data.Yarr.Utils.FixedVector.VecTupleInstances, Data.Yarr.WorkTypes, Data.Yarr.Utils.LowLevelFlow, Data.Yarr.Utils.Storable, Data.Yarr.Convolution.Repr
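The flattening order described for toList and fromList can be sketched without yarr at all. The following is a minimal illustration for a two-dimensional extent, assuming (as the listed element sequence suggests) that the last index component varies fastest; `indices2`, `flatten2` and `unflatten2` are hypothetical helper names, not library functions, and the "array" here is just an indexing function.

```haskell
-- All indices of a 2D extent, last component varying fastest
-- (assumption drawn from the element order listed in the toList docs).
indices2 :: (Int, Int) -> [(Int, Int)]
indices2 (h, w) = [ (y, x) | y <- [0 .. h - 1], x <- [0 .. w - 1] ]

-- Flatten an "array" given as an indexing function into a list.
flatten2 :: (Int, Int) -> ((Int, Int) -> a) -> [a]
flatten2 ext f = map f (indices2 ext)

-- Rebuild an indexing function from a flat list produced by flatten2.
-- List indexing is O(n) per lookup; this is for illustration only.
unflatten2 :: (Int, Int) -> [a] -> (Int, Int) -> a
unflatten2 (_, w) xs (y, x) = xs !! (y * w + x)

main :: IO ()
main = do
  let ext  = (2, 3)
      flat = flatten2 ext (\(y, x) -> 10 * y + x)
  print flat                          -- prints [0,1,2,10,11,12]
  print (unflatten2 ext flat (1, 2))  -- prints 12
```

The round trip works because both directions agree on the same linear position `y * w + x` for index `(y, x)`.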
Other yarr modules named in the interface: Data.Yarr.Walk.Internal, Data.Yarr.Convolution.Eval, Data.Yarr.Convolution.StaticStencils

Referenced packages: base (Data.Complex, Foreign.Storable, GHC.Base, GHC.Conc.Sync, GHC.Float, GHC.Real, GHC.Ptr, GHC.ForeignPtr, GHC.Err), ghc-prim (GHC.Classes), deepseq-1.3.0.2 (Control.DeepSeq), fixed-vector-0.1.2.1 (Data.Vector.Fixed, Data.Vector.Fixed.Internal), primitive-0.5.4.0 (Data.Primitive.Array)