-- Hoogle documentation, generated by Haddock
-- See Hoogle, http://www.haskell.org/hoogle/


-- | Numeric type with asymptotically faster multiplications.
--   
--   This numeric type internally reorders multiplications to achieve
--   asymptotically faster multiplication of large numbers of small
--   integers in particular. See the module docs for more detail.
@package fast-mult
@version 0.1.0.2

module Data.FastMult.Internal

-- | <a>FastMult</a> is a numeric type that can be used anywhere a
--   <tt>Num a</tt> is required. It represents a standard integer using
--   three components, which multiplied together give the stored number:
--   
--   <ol>
--   <li>The number's sign</li>
--   <li>An unsigned machine word.</li>
--   <li>A (possibly empty) list of <a>BigNat</a>s, which are the internal
--   type for <a>Integer</a>s which are too large to fit in a machine
--   word.</li>
--   </ol>
--   
--   Each <a>BigNat</a> in the list has a scale. Its scale is the log base
--   2 of the number of machine words needed to store it, rounded up,
--   minus 1.
--   
--   Note that we never store <a>BigNat</a>s that are only one machine word
--   long in this list. Instead we convert them to an ordinary unsigned
--   machine word and multiply them by item 2 in the list above. Only if
--   the result overflows do we place it in this <a>BigNat</a> list.
--   
--   Here are a few examples of "MachineWords: Scale":
--   
--   <ul>
--   <li>2: 0</li>
--   <li>3: 1</li>
--   <li>4: 1</li>
--   <li>5: 2</li>
--   <li>6..8: 2</li>
--   <li>9..16: 3</li>
--   <li>17..32: 4</li>
--   </ul>
--   
--   etc.
--   
--   Note this "scale" has the very nice property that multiplying two
--   <a>BigNat</a>s of scale <tt>x</tt> always results in a <a>BigNat</a>
--   of scale <tt>x+1</tt>.
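--   
--   The table and the property above can be checked with a small helper.
--   This is only a sketch: <tt>scaleOfWords</tt> is a hypothetical name,
--   not part of this package's API. Two numbers of scale <tt>x</tt> each
--   occupy more than <tt>2^x</tt> and at most <tt>2^(x+1)</tt> words, so
--   their product occupies more than <tt>2^(x+1)</tt> and at most
--   <tt>2^(x+2)</tt> words, which is scale <tt>x+1</tt>.
--   
--   <pre>
--   -- Hypothetical helper: scale of a number occupying w machine words
--   -- (w &gt;= 2), i.e. ceiling (logBase 2 w) - 1, in integer arithmetic.
--   scaleOfWords :: Int -&gt; Int
--   scaleOfWords w = length (takeWhile (&lt; w) (iterate (* 2) 2))
--   
--   </pre>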
--   
--   The list of <a>BigNat</a>s only ever contains one <a>BigNat</a> of
--   each "scale". As the size of <a>BigNat</a>s increases exponentially
--   with scale, this list should always be relatively small. The
--   <a>BigNat</a> list is always sorted as well, smallest to largest.
--   
--   When we multiply two <a>FastMult</a>s, we merge the <a>BigNat</a>
--   lists. This is basically a simple merge of sorted lists, but with one
--   significant change. Recall that the <a>BigNat</a> list cannot contain
--   two <a>BigNat</a>s of the same scale. So if we find that a
--   <a>BigNat</a> in the left hand list of the multiplication has the same
--   scale as a <a>BigNat</a> in the right hand list, we multiply these two
--   <a>BigNat</a>s to create a <a>BigNat</a> one "scale" larger. We then
--   continue the merge, including this new <a>BigNat</a>.
--   
--   As a result, we only ever multiply numbers of the same "scale", that
--   is, no more than double the length of one another.
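--   
--   The merge described above can be sketched on a simplified model. This
--   is not the package's implementation: the <tt>Scaled</tt> type and both
--   function names are hypothetical, plain <a>Integer</a>s stand in for
--   <a>BigNat</a>s, and the scale is carried as a label that is assumed to
--   grow by one on each same-scale multiplication. For brevity it merges
--   by repeated insertion rather than in a single pass.
--   
--   <pre>
--   -- Hypothetical model: (scale, value), sorted by scale, scales unique.
--   type Scaled = (Int, Integer)
--   
--   -- Insert one element, multiplying whenever two scales collide and
--   -- carrying the product on as an element one scale larger.
--   insertScaled :: Scaled -&gt; [Scaled] -&gt; [Scaled]
--   insertScaled s [] = [s]
--   insertScaled s@(k, v) rest@((k', v') : more)
--     | k &lt; k'    = s : rest
--     | k == k'   = insertScaled (k + 1, v * v') more
--     | otherwise = (k', v') : insertScaled s more
--   
--   -- Merge two such lists by inserting the elements of one into the other.
--   mergeScaled :: [Scaled] -&gt; [Scaled] -&gt; [Scaled]
--   mergeScaled xs ys = foldr insertScaled ys xs
--   
--   </pre>
--   
--   The product of all values is unchanged by the merge; only the
--   grouping of the multiplications differs.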
--   
--   Why do we do this? Well, an ordinary product, say <tt>product
--   [1..1000000]</tt>, towards the end of the list involves
--   multiplications of a very large number by a machine word. These take
--   <tt>O(n)</tt> time. So the whole product takes <tt>O(n^2)</tt> time.
--   
--   If we instead did the following:
--   
--   <pre>
--   product x y = product x mid * product mid y
--     where mid = (x + y) <a>div</a> 2
--   
--   (suitable base case here)
--   
--   </pre>
--   
--   We find that this runs a lot faster. The reason is that with this
--   approach we're minimising products involving very large numbers, and
--   importantly, multiplying two numbers of length <tt>n</tt> doesn't take
--   <tt>O(n^2)</tt> but more like <tt>O(n*log(n))</tt> time. For this
--   reason it's better to do a few multiplications of large numbers by
--   large numbers, instead of lots of multiplications of large numbers by
--   small numbers.
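--   
--   A runnable version of the divide-and-conquer product above might look
--   like this. It is a sketch only: the name <tt>productRange</tt> and the
--   half-open range convention are assumptions made for illustration.
--   
--   <pre>
--   -- Product of all integers in the half-open range [x, y).
--   productRange :: Integer -&gt; Integer -&gt; Integer
--   productRange x y
--     | y &lt;= x     = 1   -- empty range
--     | y == x + 1 = x   -- single element
--     | otherwise  = productRange x mid * productRange mid y
--     where mid = div (x + y) 2
--   
--   </pre>
--   
--   With this convention, <tt>productRange 1 1000001</tt> computes the same
--   value as <tt>product [1..1000000]</tt>.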
--   
--   But to do this I've had to redefine product. What if you don't want to
--   change the algorithm, but just want to use one that's already been
--   written, perhaps inefficiently? Well, this is where <a>FastMult</a> is
--   useful. Instead of making the algorithm smarter, <a>FastMult</a> just
--   makes numbers smarter. The numbers themselves reorder the
--   multiplications so you don't have to.
--   
--   As well as having the advantage of speeding up existing algorithms,
--   <a>FastMult</a> dynamically behaves differently based on what numbers
--   it's actually multiplying, and always maintains the invariant that
--   multiplications will not be performed between two numbers where one is
--   more than twice the size of the other.
--   
--   At this point I haven't mentioned the meaning of the <a>FastMult</a>
--   type parameter <tt>n</tt>. <a>FastMult</a> can also add parallelism
--   to your multiplication algorithms. However, sparking new GHC threads
--   has a cost, so we only want to do it for large multiplications.
--   Multiplications of <tt>scale &gt; n</tt> will spark a new thread, so
--   <tt>n = 0</tt> will spark new threads for any multiplication involving
--   at least 3 machine words. This is probably too small, so you can
--   experiment with different numbers. Note that <tt>n</tt> represents the
--   scale, not the size, so for example setting <tt>n=4</tt> will only
--   spark threads for multiplications involving at least 33 machine words.
--   
--   However, how well parallelism works (or whether it works at all) has
--   not been tested yet.
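--   
--   As an illustration, a thread-sparking variant could be selected by
--   picking a small type-level <tt>n</tt>. This is a sketch: the synonym
--   <tt>FastMultPar</tt>, the function <tt>factorialPar</tt> and the
--   threshold <tt>4</tt> are arbitrary choices for the example. The
--   <tt>DataKinds</tt> extension is needed to write the numeric literal in
--   a type, and sparks can only run in parallel when the program is built
--   with GHC's threaded runtime.
--   
--   <pre>
--   {-# LANGUAGE DataKinds #-}
--   import Data.FastMult (FastMult)
--   
--   -- Hypothetical synonym: spark for multiplications of scale &gt; 4.
--   type FastMultPar = FastMult 4
--   
--   -- Factorial computed through FastMultPar, converted back at the end.
--   factorialPar :: Integer -&gt; Integer
--   factorialPar k = toInteger (product (map fromInteger [1 .. k]) :: FastMultPar)
--   
--   </pre>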
--   
--   We include an ordinary machine word in the type as an optimisation for
--   single machine word numbers. This is because multiplying
--   <a>BigNat</a>s involves calling GMP using a C call, which is a large
--   overhead for small multiplications.
--   
--   To use <a>FastMult</a>, all you have to do is import its type, not
--   its implementation. If you're not interested in parallelism, just
--   import <a>FastMultSeq</a>.
--   
--   For example, just compare in GHCi:
--   
--   <pre>
--   product [1..100000]
--   
--   </pre>
--   
--   and:
--   
--   <pre>
--   product [1::FastMultSeq..100000]
--   
--   </pre>
--   
--   and you should find the latter completes much faster.
--   
--   Converting to and from <a>Integer</a>s can be done with the
--   <a>toInteger</a> and <a>fromInteger</a> class methods from
--   <a>Integral</a> and <a>Num</a> respectively.
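--   
--   For instance, a small round trip could be written as below. This is a
--   sketch that relies only on the <a>Num</a> and <a>Integral</a> instances
--   listed for this type; <tt>roundTrip</tt> is a name made up here.
--   
--   <pre>
--   import Data.FastMult (FastMultSeq)
--   
--   -- Convert in, multiply as FastMultSeq, convert back out.
--   roundTrip :: Integer -&gt; Integer -&gt; Integer
--   roundTrip a b = toInteger (fromInteger a * fromInteger b :: FastMultSeq)
--   
--   </pre>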
data FastMult (n :: Nat)
[FastMult] :: KnownNat n => {-# UNPACK #-} !Sign -> {-# UNPACK #-} !Word -> !(List (BigNatWithScale n)) -> FastMult n

-- | A type synonym for a fully sequential <a>FastMult</a>. The parameter
--   is supposed to be <tt>WORD_MAX</tt>, but I couldn't find that defined.
--   What matters is that anything of scale up to <tt>0xFFFFFFFF</tt> will
--   be sequential, which is everything.
type FastMultSeq = FastMult 4294967295

-- | <a>simplify</a> returns a <a>FastMult</a> the same as its argument
--   but "simplified".
--   
--   To explain this, consider the following for <tt>x :: FastMult</tt>:
--   
--   <pre>
--   f x = (show x, x + 1)
--   
--   </pre>
--   
--   It will multiply out <tt>x</tt> twice, once for the addition, and once
--   for <a>show</a>. Note that the list of <a>BigNat</a>s in <tt>x</tt>
--   is generally short, as only one <a>BigNat</a> is stored for each
--   scale, and the sizes of scales increase exponentially, but there
--   may be some multiplications required nevertheless. A better way to
--   write this is as follows:
--   
--   <pre>
--   f x = let y = simplify x in (show y, y + 1)
--   
--   </pre>
--   
--   This will ensure that <tt>x</tt> is multiplied out only once.
--   
--   Unfortunately using <a>simplify</a> stops your algorithms from being
--   generic, so it might be better to define simplify as <a>id</a> with a
--   rewrite rule. I'll think about this.
simplify :: KnownNat n => FastMult n -> FastMult n
instance GHC.Show.Show Data.FastMult.Internal.Sign
instance GHC.Show.Show (Data.FastMult.Internal.BigNatWithScale n)
instance GHC.TypeLits.KnownNat n => GHC.Classes.Eq (Data.FastMult.Internal.FastMult n)
instance GHC.TypeLits.KnownNat n => GHC.Classes.Ord (Data.FastMult.Internal.FastMult n)
instance GHC.TypeLits.KnownNat n => GHC.Enum.Enum (Data.FastMult.Internal.FastMult n)
instance GHC.TypeLits.KnownNat n => GHC.Num.Num (Data.FastMult.Internal.FastMult n)
instance GHC.TypeLits.KnownNat n => GHC.Real.Real (Data.FastMult.Internal.FastMult n)
instance GHC.TypeLits.KnownNat n => GHC.Real.Integral (Data.FastMult.Internal.FastMult n)
instance GHC.TypeLits.KnownNat n => GHC.Show.Show (Data.FastMult.Internal.FastMult n)
instance GHC.TypeLits.KnownNat n => GHC.Read.Read (Data.FastMult.Internal.FastMult n)

module Data.FastMult

-- | <a>FastMult</a> is a numeric type that can be used anywhere a
--   <tt>Num a</tt> is required. It represents a standard integer using
--   three components, which multiplied together give the stored number:
--   
--   <ol>
--   <li>The number's sign</li>
--   <li>An unsigned machine word.</li>
--   <li>A (possibly empty) list of <a>BigNat</a>s, which are the internal
--   type for <a>Integer</a>s which are too large to fit in a machine
--   word.</li>
--   </ol>
--   
--   Each <a>BigNat</a> in the list has a scale. Its scale is the log base
--   2 of the number of machine words needed to store it, rounded up,
--   minus 1.
--   
--   Note that we never store <a>BigNat</a>s that are only one machine word
--   long in this list. Instead we convert them to an ordinary unsigned
--   machine word and multiply them by item 2 in the list above. Only if
--   the result overflows do we place it in this <a>BigNat</a> list.
--   
--   Here are a few examples of "MachineWords: Scale":
--   
--   <ul>
--   <li>2: 0</li>
--   <li>3: 1</li>
--   <li>4: 1</li>
--   <li>5: 2</li>
--   <li>6..8: 2</li>
--   <li>9..16: 3</li>
--   <li>17..32: 4</li>
--   </ul>
--   
--   etc.
--   
--   Note this "scale" has the very nice property that multiplying two
--   <a>BigNat</a>s of scale <tt>x</tt> always results in a <a>BigNat</a>
--   of scale <tt>x+1</tt>.
--   
--   The list of <a>BigNat</a>s only ever contains one <a>BigNat</a> of
--   each "scale". As the size of <a>BigNat</a>s increases exponentially
--   with scale, this list should always be relatively small. The
--   <a>BigNat</a> list is always sorted as well, smallest to largest.
--   
--   When we multiply two <a>FastMult</a>s, we merge the <a>BigNat</a>
--   lists. This is basically a simple merge of sorted lists, but with one
--   significant change. Recall that the <a>BigNat</a> list cannot contain
--   two <a>BigNat</a>s of the same scale. So if we find that a
--   <a>BigNat</a> in the left hand list of the multiplication has the same
--   scale as a <a>BigNat</a> in the right hand list, we multiply these two
--   <a>BigNat</a>s to create a <a>BigNat</a> one "scale" larger. We then
--   continue the merge, including this new <a>BigNat</a>.
--   
--   As a result, we only ever multiply numbers of the same "scale", that
--   is, no more than double the length of one another.
--   
--   Why do we do this? Well, an ordinary product, say <tt>product
--   [1..1000000]</tt>, towards the end of the list involves
--   multiplications of a very large number by a machine word. These take
--   <tt>O(n)</tt> time. So the whole product takes <tt>O(n^2)</tt> time.
--   
--   If we instead did the following:
--   
--   <pre>
--   product x y = product x mid * product mid y
--     where mid = (x + y) <a>div</a> 2
--   
--   (suitable base case here)
--   
--   </pre>
--   
--   We find that this runs a lot faster. The reason is that with this
--   approach we're minimising products involving very large numbers, and
--   importantly, multiplying two numbers of length <tt>n</tt> doesn't take
--   <tt>O(n^2)</tt> but more like <tt>O(n*log(n))</tt> time. For this
--   reason it's better to do a few multiplications of large numbers by
--   large numbers, instead of lots of multiplications of large numbers by
--   small numbers.
--   
--   But to do this I've had to redefine product. What if you don't want to
--   change the algorithm, but just want to use one that's already been
--   written, perhaps inefficiently? Well, this is where <a>FastMult</a> is
--   useful. Instead of making the algorithm smarter, <a>FastMult</a> just
--   makes numbers smarter. The numbers themselves reorder the
--   multiplications so you don't have to.
--   
--   As well as having the advantage of speeding up existing algorithms,
--   <a>FastMult</a> dynamically behaves differently based on what numbers
--   it's actually multiplying, and always maintains the invariant that
--   multiplications will not be performed between two numbers where one is
--   more than twice the size of the other.
--   
--   At this point I haven't mentioned the meaning of the <a>FastMult</a>
--   type parameter <tt>n</tt>. <a>FastMult</a> can also add parallelism
--   to your multiplication algorithms. However, sparking new GHC threads
--   has a cost, so we only want to do it for large multiplications.
--   Multiplications of <tt>scale &gt; n</tt> will spark a new thread, so
--   <tt>n = 0</tt> will spark new threads for any multiplication involving
--   at least 3 machine words. This is probably too small, so you can
--   experiment with different numbers. Note that <tt>n</tt> represents the
--   scale, not the size, so for example setting <tt>n=4</tt> will only
--   spark threads for multiplications involving at least 33 machine words.
--   
--   However, how well parallelism works (or whether it works at all) has
--   not been tested yet.
--   
--   We include an ordinary machine word in the type as an optimisation for
--   single machine word numbers. This is because multiplying
--   <a>BigNat</a>s involves calling GMP using a C call, which is a large
--   overhead for small multiplications.
--   
--   To use <a>FastMult</a>, all you have to do is import its type, not
--   its implementation. If you're not interested in parallelism, just
--   import <a>FastMultSeq</a>.
--   
--   For example, just compare in GHCi:
--   
--   <pre>
--   product [1..100000]
--   
--   </pre>
--   
--   and:
--   
--   <pre>
--   product [1::FastMultSeq..100000]
--   
--   </pre>
--   
--   and you should find the latter completes much faster.
--   
--   Converting to and from <a>Integer</a>s can be done with the
--   <a>toInteger</a> and <a>fromInteger</a> class methods from
--   <a>Integral</a> and <a>Num</a> respectively.
data FastMult (n :: Nat)

-- | A type synonym for a fully sequential <a>FastMult</a>. The parameter
--   is supposed to be <tt>WORD_MAX</tt>, but I couldn't find that defined.
--   What matters is that anything of scale up to <tt>0xFFFFFFFF</tt> will
--   be sequential, which is everything.
type FastMultSeq = FastMult 4294967295

-- | <a>simplify</a> returns a <a>FastMult</a> the same as its argument
--   but "simplified".
--   
--   To explain this, consider the following for <tt>x :: FastMult</tt>:
--   
--   <pre>
--   f x = (show x, x + 1)
--   
--   </pre>
--   
--   It will multiply out <tt>x</tt> twice, once for the addition, and once
--   for <a>show</a>. Note that the list of <a>BigNat</a>s in <tt>x</tt>
--   is generally short, as only one <a>BigNat</a> is stored for each
--   scale, and the sizes of scales increase exponentially, but there
--   may be some multiplications required nevertheless. A better way to
--   write this is as follows:
--   
--   <pre>
--   f x = let y = simplify x in (show y, y + 1)
--   
--   </pre>
--   
--   This will ensure that <tt>x</tt> is multiplied out only once.
--   
--   Unfortunately using <a>simplify</a> stops your algorithms from being
--   generic, so it might be better to define simplify as <a>id</a> with a
--   rewrite rule. I'll think about this.
simplify :: KnownNat n => FastMult n -> FastMult n