Analogous to  . The type parameter r and its functional dependency are necessary since g must be a function of the form a -> ... -> c -> CodeGenFunction r d and we must ensure that the explicit r and the implicit r in g match.

This is an Applicative functor that registers what extensions are needed in order to run the contained instructions. You can escape from the functor by calling  and providing a generic implementation.

We use an applicative functor since with a monadic interface we would have to create the specialised code in every case in order to see which extensions were used in the course of creating the instructions.

We use only one (unparameterized) type for all extensions, since this is the simplest solution. Alternatively we could use a type parameter where class constraints show what extensions are needed. This would be just like exceptions that are explicit in the type signature, as in the control-monad-exception package. However, we would still need to lift all basic LLVM instructions to the new monad.

Declare that a certain plain LLVM instruction depends on a particular extension. This can be useful if you rely on the data layout of a certain architecture when doing a bitcast, or if you know that LLVM translates a certain generic operation to something especially optimal for the declared extension.

Create an intrinsic and register the needed extension. We cannot immediately check whether the signature matches or whether the right extension is given. However, when resolving intrinsics LLVM will not find the intrinsic if the extension is wrong, and it also checks the signature.
run generic specific generates the specific code if the required extensions are available on the host processor and generic otherwise.

Convenient variant of  : Only run the code with extended instructions if an additional condition is satisfied.

Only for debugging purposes.

This would also work for vectors, but LLVM-3.1 crashes when actually doing this.

An alternative to  where I try to persuade LLVM to use x86's LOOP instruction. Unfortunately it becomes even worse. LLVM developers say that x86 LOOP is actually slower than manual decrement, zero test and conditional branch.

This is a variant of  that may be more convenient, because you only need one lambda expression for both loop condition and loop body.

This construct starts new blocks, so be prepared when continuing after an  .

Branch-free variant of  that is faster if the enclosed block is very simple, say, if it contains at most two instructions. It can only be used as an alternative to  if the enclosed block is free of side effects.

If isJust = False, then fromJust is an undefTuple.
counterpart to  

counterpart to  with swapped arguments

counterpart to Data.Maybe.HT.toMaybe

If isRight, then fromLeft is an undefTuple. If not isRight, then fromRight is an undefTuple. I would prefer a union type, but it was temporarily removed in LLVM-2.8 and has not returned since then.

counterpart to  

construct an array out of single elements. You must assert that the length of the list matches the array size. This can be considered the inverse of  .

provide the elements of an array as a list of individual virtual registers. This can be considered the inverse of  .

The loop is unrolled, since  and  expect constant indices.

the upper two integers are set to zero, there is no instruction that converts to Int64

the upper two integers are ignored, there is no instruction that converts from Int64

MXCSR is not really supported by LLVM-2.6. LLVM does not know about the dependency of all floating point operations on this status register.

cumulative sum: (a,b,c,d) -> (a,a+b,a+b+c,a+b+c+d). I try to cleverly use horizontal add, but the generic version in the Vector module is better.

Attention: The rounding and fraction functions only work for floating point values with maximum magnitude of maxBound :: Int32. This way we save expensive handling of possibly rare cases.

The order of addition is chosen for maximum efficiency. We do not try to prevent cancellation.

The first result value is the sum of all vector elements from 0 to div n 2 + 1 and the second result value is the sum of vector elements from div n 2 to n-1. n must be at least D2.

Treat the vector as a concatenation of pairs and add all these pairs.
Useful for stereo signal processing. n must be at least D2.

Allows working on records of vectors as if they were vectors of records. This is a reasonable approach for records of different element types, since processor vectors can only be built from elements of the same type. But it also makes sense for, say, a chunked stereo signal. In this case we would work on Stereo (Value a).

Formerly we used a two-way dependency Vector <-> (Element, Size). Now we have only the dependency Vector -> (Element, Size). This means that we need some more type annotations, as in umul32to64/assemble. On the other hand we can allow multiple vector types with respect to the same element type. E.g. we can provide a vector type with pair elements where the pair elements are interleaved in the vector.

Manually assemble a vector of equal values. Better use ScalarOrVector.replicate.

construct a vector out of single elements. You must assert that the length of the list matches the vector size. This can be considered the inverse of  .

Manually implement vector shuffling using insertelement and extractelement. In contrast to LLVM's built-in instruction it supports distinct vector sizes, but it allows only one input vector (or a tuple of vectors, but we cannot shuffle between them). For more complex shuffling we recommend  and  .

Rotate one element towards the higher elements.

I don't want to call it rotateLeft or rotateRight, because there is no preferred layout for the vector elements. In Intel's instruction manual vector elements are indexed like the bits, that is from right to left. However, when working with Haskell list and enumeration syntax, the start index is left.

Implement the  method using the methods of the  class.

provide the elements of a vector as a list of individual virtual registers. This can be considered the inverse of  .

Like LLVM.Util.Loop.mapVector but the loop is unrolled, which is faster since it can be packed by the code generator.
Like LLVM.Util.Loop.mapVector but the loop is unrolled, which is faster since it can be packed by the code generator.

Ideally on ix86 with SSE41 this would be translated to dpps.

If the target vector type is a native type then the chop operation produces no actual machine instruction (nop). If the vector cannot be evenly divided into chunks the last chunk will be padded with undefined values.

The target size is determined by the type. If the chunk list provides more data, the excess data is dropped. If the chunk list provides too little data, the target vector is filled with undefined elements.

We partition a vector of size n into chunks of size m and add these chunks using vector additions. We do this by repeated halving of the vector, since this way we do not need assumptions about the native vector size.

We reduce the vector size only virtually, that is we maintain the vector size and fill with undefined values. This is reasonable since LLVM-2.5 and LLVM-2.6 do not allow shuffling between vectors of different size and because LLVM likes to do computations on Vector D2 Float in MMX registers on ix86 CPUs, which interacts badly with FPU usage. Since we fill the vector with undefined values, LLVM actually treats the vectors like vectors of smaller size.

Needs (log n) vector additions

On LLVM-2.6 and X86 this produces branch-free but even slower code than fractionSelect, since the comparison to booleans and back to a floating point number is translated literally to elementwise comparison, conversion to a 0 or -1 byte and then to a floating point number.

LLVM.select on boolean vectors cannot be translated to X86 code in LLVM-2.6, thus I code my own version that calls select on all elements. This is slow but works. When this issue is fixed, this function will be replaced by LLVM.select.

 implemented using  . This will need jumps.

 implemented using  . This will need jumps.
Another implementation of  , this time in terms of binary logical operations. The selecting integers must be (-1) for selecting an element from the first operand and 0 for selecting an element from the second operand. This leads to optimal code. On SSE41 this could be done with blendvps or blendvpd.

an alternative is using the  vector type

The fraction has the same sign as the argument. This is not particularly useful but fast on IEEE implementations.

increment (first operand) may be negative, phase must always be non-negative

both increment and phase must be non-negative

There are functions that are intended for processing scalars but formally have vector input and output. This function breaks a vector function down to a scalar function by accessing the lowest vector element.

An implementation of both  and  must ensure that haskellValue is compatible with Stored (Struct haskellValue) (which we want to call llvmStruct). That is, writing and reading llvmStruct by LLVM must be the same as accessing haskellValue by Storable methods. ToDo: In future we may also require a Storable constraint for llvmStruct.

We use a functional dependency in order to let type inference work nicely.

Adding the finalizer to a ForeignPtr seems to be the only way that guarantees execution of the finalizer (not too early and not never). However, the normal ForeignPtr finalizers must be independent of the Haskell runtime. In contrast to ForeignPtr finalizers, addFinalizer adds finalizers to boxes that are optimized away. Thus finalizers are run too early or not at all. Concurrent.ForeignPtr and using threaded execution is the only way to get finalizers in Haskell IO.
This and the following type classes are intended for arithmetic operations on wrappers around LLVM types. E.g. you might define a fixed point fraction type by

    newtype Fixed = Fixed Int32

and then use the same methods for floating point and fixed point arithmetic.

In contrast to the arithmetic methods in the llvm wrapper, in our methods the types of operands and result match. Advantage: Type inference determines most of the types automatically. Disadvantage: You cannot use constant values directly, but you have to convert them all to  .

both increment and phase must be non-negative

Isomorphic to ReaderT (CodeGenFunction r z) (ContT z (CodeGenFunction r)) a, where the reader provides the block for  and the continuation part manages the  .

counterpart to Data.Maybe.HT.toMaybe

Run an exception handler if the Maybe-action fails. The exception is propagated. That is, the handler is intended for a cleanup procedure.

Run the first action and if that fails run the second action. If both actions fail, then the composed action fails, too.

If the returned position is smaller than the array size, then the returned final state is  .

The entire purpose of this datatype is to mark a type as scalar, although it might also be interpreted as vector. This way you can write generic operations for vectors using the  class, and specialise them to scalar types with respect to the  class. From another perspective you can consider the  type constructor a marker where the  type function stops reducing nested vector types to scalar types.
efghijklmn efghijklmn efghijklmnefghijklmnNone opqrstuvwxy opqrstuvwxy uvwrstxopqyopqrstuvwxy !"#$%&'())*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh-iYjklmnopqrfstu-vwx y z { | } ~  `  f           Y              0f !"#$%&'()*+,-./012345f67kl8g9:;<j^Z[f=>r?@ABCDDEFFGH/IJKfLMNOPQNRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~      !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~      !"#$%&'()*+,-./0123456789:;</0=>?@ABCDEFGHIFJKH L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i jklmnopq r s t u v w x y z  { | } ~                                                                                                                                    GK      dI!"#$%&'()*+,-./0123456789F:F;<=>I?@ABCDEFGHIJllvm-extra-0.5LLVM.Extra.ArithmeticLLVM.Extra.ExtensionLLVM.Extra.ExtensionCheck.X86LLVM.Extra.Extension.X86LLVM.Extra.MonadLLVM.Extra.ControlLLVM.Extra.MaybeLLVM.Extra.EitherLLVM.Extra.ClassLLVM.Extra.ArrayLLVM.Extra.Multi.ValueLLVM.Extra.VectorLLVM.Extra.ScalarOrVectorLLVM.Extra.MemoryLLVM.Extra.ForeignPtrLLVM.Extra.Multi.VectorLLVM.Extra.MaybeContinuationLLVM.Extra.ScalarLLVM.Extra.Multi.ClassLLVM.Extra.Extension.X86AutoLLVM.Extra.ArithmeticPrivateLLVM.Extra.MaybePrivate Data.Maybe fromMaybeLLVM.Extra.EitherPrivateMemoryCScalarT llvm-tf-3.0.1LLVM.Core.InstructionsCmpEQCmpNECmpGTCmpGECmpLTCmpLE CmpPredicateCallArgsResult Subtargetwrap intrinsic intrinsicAttrrunrunWhen runUnsafewithwith2with3sse1sse2sse3ssse3sse41sse42avxavx2fma3fma4amd3dnow amd3dnowaaesminssminpsmaxssmaxpsminsdminpdmaxsdmaxpdhaddpshaddpdroundpsroundpddppddppschainliftR2liftR3incdecadvanceArrayElementPtrfcmpcmpandorSelectselect arrayLoop arrayLoop2arrayLoopWithExitarrayLoop2WithExitfixedLengthLoop whileLoop loopWithExitwhileLoopShared ifThenElseifThenselectTraversable ifThenSelectConsisJustfromJustfor alternativefromBooltoBooljust getIsNothinglift2sequencetraverseliftM2isRightfromLeft fromRightmapLeftmapRight getIsLeftMakeValueTuple 
ValueTuple valueTupleOfZero zeroTuple Undefined undefTuplezeroTuplePointedundefTuplePointedvalueTupleOfFunctorphisTraversableaddPhisFoldablesizeassemble extractAllmapnothingvalueOfundefzipzip3unzipunzip3leftrightcmpsscmppscmpsdcmppdcmpps256cmppd256pcmpgtbpcmpgtwpcmpgtdpcmpgtqpcmpugtbpcmpugtwpcmpugtdpcmpugtqpminsbpmaxsbpminswpmaxswpminsdpmaxsdpminubpmaxubpminuwpmaxuwpminudpmaxudpabsbpabswpabsdpmuludqpmuldqpmulldcvtps2dqcvtpd2dqcvtdq2pscvtdq2pdldmxcsrstmxcsr withMXCSRroundssroundsdabsssabssdabspsabspdRealminmaxabssignumtruncatefractionfloor Arithmeticsum sumToPairsumInterleavedToPaircumulate dotProductmul Canonical ConstructConstantSimpleElementSize shuffleMatchextractinsertconstant replicate insertChunkiterateshuffle sizeInTuplerotateUp rotateDownreverseshiftUp shiftDownshiftUpMultiZeroshiftDownMultiZeroshuffleMatchTraversableshuffleMatchAccessshuffleMatchPlain1shuffleMatchPlain2insertTraversableextractTraversablemodify mapChunks zipChunksWithchopconcat cumulate1signedFraction umul32to64TranscendentalConstantconstPiRationalConstantconstFromRationalIntegerConstantconstFromInteger PseudoModulescale scaleConst ReplicatereplicateConstFraction addToPhaseincPhase replicateOf FirstClassStoredRecordStructloadstore decomposecomposeelement loadRecord storeRecorddecomposeRecord composeRecordcastStorablePtr loadNewtype storeNewtypedecomposeNewtypecomposeNewtypenewInitnewParamnewVectorundefPrimitiveshuffleMatchPrimitiveextractPrimitiveinsertPrimitivedissectshuffleMatchGen extractGen insertGenTranscendentalpisinlogexpcospow Algebraicsqrt fromRational'Fieldfdiv fromInteger' PseudoRingAdditivezeroaddsubnegonesquareidiviremresolvewithBooltoMaybeliftguardbindonFaildeconsliftMunliftMunliftM2unliftM3unliftM4unliftM5AddrunAddUndefgetUndefswitchLLVM.Core.CodeGen FunctionArgsbuildIntrinsic targetNamenamecheck$fCallArgsCodeGenFunction$fCallArgs(->) subtargetV8Word32V8Word16V8Int32V8Int16V8FloatV4Word64V4Word32V4Int64V4Int32V4FloatV4DoubleV32Word8V32Int8V2Word64V2Int64V2DoubleV16Word8 
V16Word16V16Int8V16Int16MMXpavgusbpf2idpfaccpfaddpfcmpeqpfcmpgepfcmpgtpfmaxpfminpfmulpfrcppfrcpit1pfrcpit2pfrsqrtpfrsqit1pfsubpfsubrpi2fdpmulhrwpf2iwpfnaccpfpnaccpi2fwaddsssubssmulssdivsssqrtsssqrtpsrcpssrcppsrsqrtssrsqrtpscomieqcomiltcomilecomigtcomigecominequcomiequcomiltucomileucomigtucomigeucomineqcvtss2si cvtss2si64 cvttss2si cvttss2si64cvtsi2ss cvtsi642sscvtps2pi cvttps2picvtpi2psstoreupssfencemovmskpsaddsdsubsdmulsddivsdsqrtsdsqrtpdcomisdeqcomisdltcomisdlecomisdgtcomisdge comisdneq ucomisdeq ucomisdlt ucomisdle ucomisdgt ucomisdge ucomisdneq paddsb128 paddsw128 paddusb128 paddusw128 psubsb128 psubsw128 psubusb128 psubusw128 pmulhuw128 pmulhw128 pmuludq128 pmaddwd128pavgb128pavgw128 pmaxub128 pmaxsw128 pminub128 pminsw128 psadbw128psllw128pslld128psllq128psrlw128psrld128psrlq128psraw128psrad128 psllwi128 pslldi128 psllqi128 psrlwi128 psrldi128 psrlqi128 psrawi128 psradi128 pslldqi128 psrldqi128pslldqi128_byteshiftpsrldqi128_byteshift cvttpd2dqcvtpd2ps cvttps2dqcvtps2pdcvtsd2si cvtsd2si64 cvttsd2si cvttsd2si64cvtsi2sd cvtsi642sdcvtsd2sscvtss2sdcvtpd2pi cvttpd2picvtpi2pdstoreupdstoredqu storelv4si packsswb128 packssdw128 packuswb128movmskpd pmovmskb128 maskmovdquclflushlfencemfenceaddsubpsaddsubpdhsubpshsubpdlddqumonitormwaitphaddw phaddw128phaddd phaddd128phaddsw phaddsw128phsubw phsubw128phsubd phsubd128phsubsw phsubsw128 pmaddubsw pmaddubsw128pmulhrsw pmulhrsw128pshufb pshufb128pshufwpsignb psignb128psignw psignw128psignd psignd128pabsb128pabsw128pabsd128 pmovsxbd128 pmovsxbq128 pmovsxbw128 pmovsxdq128 pmovsxwd128 pmovsxwq128 pmovzxbd128 pmovzxbq128 pmovzxbw128 pmovzxdq128 pmovzxwd128 pmovzxwq128 phminposuw128 pmaxsb128 pmaxsd128 pmaxud128 pmaxuw128 pminsb128 pminsd128 pminud128 pminuw128 aesimc128 aesenc128 aesenclast128 aesdec128 aesdeclast128aeskeygenassist128 packusdw128 pmuldq128 extractps128 insertps128 pblendvb128 pblendw128blendpdblendpsblendvpdblendvps mpsadbw128movntdqa ptestz128 ptestc128 ptestnzc128crc32qicrc32hicrc32sicrc32di pcmpistrm128 
pcmpistri128 pcmpistria128 pcmpistric128 pcmpistrio128 pcmpistris128 pcmpistriz128 pcmpestrm128 pcmpestri128 pcmpestria128 pcmpestric128 pcmpestrio128 pcmpestris128 pcmpestriz128 addsubpd256 addsubps256maxpd256maxps256minpd256minps256 sqrtpd256 sqrtps256 rsqrtps256rcpps256 roundpd256 roundps256 haddpd256 hsubps256 hsubpd256 haddps256 vpermilvarpd vpermilvarpsvpermilvarpd256vpermilvarps256vperm2f128_pd256vperm2f128_ps256vperm2f128_si256 blendpd256 blendps256 blendvpd256 blendvps256dpps256vextractf128_pd256vextractf128_ps256vextractf128_si256vinsertf128_pd256vinsertf128_ps256vinsertf128_si256 cvtdq2pd256 cvtdq2ps256 cvtpd2ps256 cvtps2dq256 cvtps2pd256 cvttpd2dq256 cvtpd2dq256 cvttps2dq256vtestzpdvtestcpd vtestnzcpdvtestzpsvtestcps vtestnzcps vtestzpd256 vtestcpd256 vtestnzcpd256 vtestzps256 vtestcps256 vtestnzcps256 ptestz256 ptestc256 ptestnzc256 movmskpd256 movmskps256vzeroall vzeroupper vbroadcastssvbroadcastsd256vbroadcastss256vbroadcastf128_pd256vbroadcastf128_ps256lddqu256 storeupd256 storeups256 storedqu256 movntdq256 movntpd256 movntps256 maskloadpd maskloadps maskloadpd256 maskloadps256 maskstorepd maskstorepsmaskstorepd256maskstoreps256 paddsb256 paddsw256 paddusb256 paddusw256 psubsb256 psubsw256 psubusb256 psubusw256 pmulhuw256 pmulhw256 pmuludq256 pmuldq256 pmaddwd256pavgb256pavgw256 psadbw256 pmaxub256 pmaxuw256 pmaxud256 pmaxsb256 pmaxsw256 pmaxsd256 pminub256 pminuw256 pminud256 pminsb256 pminsw256 pminsd256psllw256pslld256psllq256psrlw256psrld256psrlq256psraw256psrad256 psllwi256 pslldi256 psllqi256 psrlwi256 psrldi256 psrlqi256 psrawi256 psradi256 pslldqi256 psrldqi256pslldqi256_byteshiftpsrldqi256_byteshift packsswb256 packssdw256 packuswb256 packusdw256pabsb256pabsw256pabsd256 phaddw256 phaddd256 phaddsw256 phsubw256 phsubd256 phsubsw256 pmaddubsw256 psignb256 psignw256 psignd256 pmulhrsw256 pmovsxbd256 pmovsxbq256 pmovsxbw256 pmovsxdq256 pmovsxwd256 pmovsxwq256 pmovzxbd256 pmovzxbq256 pmovzxbw256 pmovzxdq256 pmovzxwd256 pmovzxwq256 pblendvb256 
pblendw256 pblendd128 pblendd256vbroadcastss_psvbroadcastsd_pd256vbroadcastss_ps256vbroadcastsi256pbroadcastb128pbroadcastb256pbroadcastw128pbroadcastw256pbroadcastd128pbroadcastd256pbroadcastq128pbroadcastq256 permvarsi256 permvarsf256 permti256extract128i256 insert128i256 maskloadd maskloadq maskloadd256 maskloadq256 maskstored maskstoreq maskstored256 maskstoreq256psllv4sipsllv8sipsllv2dipsllv4dipsrlv4sipsrlv8sipsrlv2dipsrlv4dipsrav4sipsrav8si pmovmskb256 pshufb256 mpsadbw256 movntdqa256vfmaddssvfmaddsdvfmaddpsvfmaddpd vfmaddps256 vfmaddpd256vfmsubssvfmsubsdvfmsubpsvfmsubpd vfmsubps256 vfmsubpd256 vfnmaddss vfnmaddsd vfnmaddps vfnmaddpd vfnmaddps256 vfnmaddpd256 vfnmsubss vfnmsubsd vfnmsubps vfnmsubpd vfnmsubps256 vfnmsubpd256 vfmaddsubps vfmaddsubpdvfmaddsubps256vfmaddsubpd256 vfmsubaddps vfmsubaddpdvfmsubaddps256vfmsubaddpd256 signumGen cmpSelect_arrayLoopWithExitDecLoop _whileLoop _emitCode $fSelect(,,) $fSelect(,) $fSelect() $fSelectValuebasemaybe$fPhiT $fFunctorT Data.Eithereither$fMakeValueTupleVector$fMakeValueTupleStablePtr$fMakeValueTuplePtr$fMakeValueTuple()$fMakeValueTupleWord64$fMakeValueTupleWord32$fMakeValueTupleWord16$fMakeValueTupleWord8$fMakeValueTupleInt64$fMakeValueTupleInt32$fMakeValueTupleInt16$fMakeValueTupleInt8$fMakeValueTupleBool$fMakeValueTupleDouble$fMakeValueTupleFloat$fMakeValueTupleEither$fMakeValueTupleMaybe$fMakeValueTuple(,,)$fMakeValueTuple(,) $fZero(,,) $fZero(,)$fZeroConstValue $fZeroValue$fZero() $fUndefinedT $fUndefinedT0$fUndefined(,,)$fUndefined(,)$fUndefinedConstValue$fUndefinedValue $fUndefined() insertvalue extractvalue _cumulate1s switchFPPred pcmpuFromPcmp valueUnitmask _mapByFolddotProductPartialreduceSumInterleavedfractionGeneric _floorSelect_fractionSelect selectLogicalMaskableMask replicateCore iterateCore mapChunks2zipChunks2With withRound sumPartialchopCore getLowestPair_reduceAddInterleaved sumGenericsumToPairGeneric_cumulateSimplecumulateGeneric cumulateFrom1inttofp 
signumLogicalsignumIntGenericsignumWordGenericsignumFloatGeneric floorGenericmakeMask minGeneric maxGeneric absGenericabsAuto floorLogicalfractionLogicalorder $fRealWord64 $fRealWord32 $fRealWord16 $fRealWord8 $fRealInt64 $fRealInt32 $fRealInt16 $fRealInt8 $fRealDouble $fRealFloat$fArithmeticWord32$fArithmeticWord64$fArithmeticWord16$fArithmeticWord8$fArithmeticInt64$fArithmeticInt32$fArithmeticInt16$fArithmeticInt8$fArithmeticDouble$fArithmeticFloat$fMaskableDouble$fMaskableFloat$fMaskableWord64$fMaskableWord32$fMaskableWord16$fMaskableWord8$fMaskableInt64$fMaskableInt32$fMaskableInt16$fMaskableInt8$fCanonicaln(,,)$fCanonicaln(,)$fCanonicalnValue$fSimpleConstant$fUndefinedConstant $fPhiConstant$fTraversableConstant$fFoldableConstant$fApplicativeConstant$fFunctorConstant$fC(,,) $fSimple(,,)$fC(,) $fSimple(,)$fCValue $fSimpleValue runScalar fractionGen singletonmapAuto zipAutoWith$fTranscendentalConstantVector$fTranscendentalConstantDouble$fTranscendentalConstantFloat$fRationalConstantVector$fRationalConstantDouble$fRationalConstantFloat$fIntegerConstantVector$fIntegerConstantDouble$fIntegerConstantFloat$fIntegerConstantInt64$fIntegerConstantInt32$fIntegerConstantInt16$fIntegerConstantInt8$fIntegerConstantWord64$fIntegerConstantWord32$fIntegerConstantWord16$fIntegerConstantWord8$fPseudoModuleVector$fPseudoModuleDouble$fPseudoModuleFloat$fPseudoModuleInt64$fPseudoModuleInt32$fPseudoModuleInt16$fPseudoModuleInt8$fPseudoModuleWord64$fPseudoModuleWord32$fPseudoModuleWord16$fPseudoModuleWord8 $fRealVector $fRealFP128$fReplicateVector$fReplicateWord64$fReplicateWord32$fReplicateWord16$fReplicateWord8$fReplicateInt64$fReplicateInt32$fReplicateInt16$fReplicateInt8$fReplicateBool$fReplicateFP128$fReplicateDouble$fReplicateFloat$fFractionVector$fFractionDouble$fFractionFloat ConvertStructdecomposeField composeField StoredStruct fromStorable toStorable loadElement storeElementextractElement 
insertElementpairtriplefields$fC()$fConvertStructsi()$fConvertStructsi(,)$fFirstClassStruct$fFirstClassStablePtr$fFirstClassPtr$fFirstClassArray$fFirstClassVector$fFirstClassBool$fFirstClassWord64$fFirstClassWord32$fFirstClassWord16$fFirstClassWord8$fFirstClassInt64$fFirstClassInt32$fFirstClassInt16$fFirstClassInt8$fFirstClassDouble$fFirstClassFloat$fCT$fCT0$fApplicativeElement$fFunctorElementImporterderefStartParamPtr derefStartPtr $fCDouble$fCFloatValue_inc_dec valueTypeNamecallIntrinsic1callIntrinsic2 addReadNone$fTranscendentalValue$fAlgebraicValue$fFractionValue $fRealValue$fRationalConstantValue$fRationalConstantConstValue$fFieldConstValue $fFieldValue$fIntegerConstantValue$fIntegerConstantConstValue$fPseudoModuleConstValue$fPseudoModuleValue$fPseudoRingConstValue$fPseudoRingValue$fAdditive(,,) $fAdditive(,)$fAdditiveConstValue$fAdditiveValueNothingJust $fMonadIOT$fMonadT$fApplicativeT$fTranscendentalT $fAlgebraicT $fFractionT$fRealT$fPseudoModuleT$fFieldT $fPseudoRingT $fAdditiveT$fRationalConstantT$fIntegerConstantT$fZeroT