Copyright	(c) 2011 Daniel Fischer
License	MIT
Maintainer	Daniel Fischer <daniel.is.fischer@googlemail.com>
Stability	Provisional
Portability	Non-portable (GHC extensions)
Safe Haskell	None
Language	Haskell2010

Math.NumberTheory.Primes.Sieve

Contents

Limitations
Sieves and lists

Description

Prime generation using a sieve. Currently, an enhanced sieve of Eratosthenes is used, switching to an Atkin sieve is planned (if I get around to implementing it and it's not slower).

The sieve used is segmented, with a chunk size chosen to give good (enough) cache locality while still getting something substantial done per chunk. However, that means we must store data for primes up to the square root of where sieving is done, thus sieving primes up to n requires O(sqrt n/log n) space.

Synopsis

primes :: (Ord a, Num a) => [a]
sieveFrom :: Integer -> [Integer]
data PrimeSieve
primeSieve :: Integer -> PrimeSieve
psieveList :: [PrimeSieve]
psieveFrom :: Integer -> [PrimeSieve]
primeList :: forall a. Integral a => PrimeSieve -> [a]

Limitations

There are three factors limiting the range of these sieves.

Memory
Overflow
The internal representation of the state

An Eratosthenes type sieve needs to store the primes up to the square root of the currently sieved region, thus requires O(sqrt n/log n) space.We store 16 bytes of information per prime, thus a Gigabyte of memory takes you to about 1.6*10^18. The log doesn't change much in that range, so as a first approximation, doubling the storage increases the sieve range by a factor of four.

On a 64-bit system, this is (currently) the only limitation to be concerned with, but with more than four Terabyte of memory, the fact that the internal representation currently limits the sieve range to about 6.8*10^25 could become relevant. Overflow in array indexing doesn't become a concern before memory and internal representation would allow to sieve past 10^37.

On a 32-bit system, the internal representation imposes no additional limits, but overflow has to be reckoned with. On the one hand, the fact that arrays are Int-indexed restricts the size of the prime store, on the other hand, overflow in calculating the indices to cross off multiples is possible before running out of memory. The former limits the upper bound of the monolithic primeSieve to shortly above 8*10^9, the latter limits the range of the segmented sieves to about 1.7*10^18.

Sieves and lists

primes :: (Ord a, Num a) => [a] Source #

Ascending list of primes.

>>> take 10 primes
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

primes is a polymorphic list, so the results of computations are not retained in memory. Make it monomorphic to take advantages of memoization. Compare

>>> :set +s
>>> primes !! 1000000 :: Int
15485867
(5.32 secs, 6,945,267,496 bytes)
>>> primes !! 1000000 :: Int
15485867
(5.19 secs, 6,945,267,496 bytes)

against

>>> let primes' = primes :: [Int]
>>> primes' !! 1000000 :: Int
15485867
(5.29 secs, 6,945,269,856 bytes)
>>> primes' !! 1000000 :: Int
15485867
(0.02 secs, 336,232 bytes)

sieveFrom :: Integer -> [Integer] Source #

sieveFrom n creates the list of primes not less than n.

data PrimeSieve Source #

Compact store of primality flags.

primeSieve :: Integer -> PrimeSieve Source #

Sieve primes up to (and including) a bound (or 7, if bound is smaller). For small enough bounds, this is more efficient than using the segmented sieve.

Since arrays are Int-indexed, overflow occurs when the sieve size comes near maxBound :: Int, that corresponds to an upper bound near 15/8*maxBound. On 32-bit systems, that is often within memory limits, so don't give bounds larger than 8*10^9 there.

psieveList :: [PrimeSieve] Source #

List of primes in the form of a list of PrimeSieves, more compact than primes, thus it may be better to use psieveList >>= primeList than keeping the list of primes alive during the entire run.

psieveFrom :: Integer -> [PrimeSieve] Source #

psieveFrom n creates the list of PrimeSieves starting roughly at n. Due to the organisation of the sieve, the list may contain a few primes less than n. This form uses less memory than [Integer], hence it may be preferable to use this if it is to be reused.

primeList :: forall a. Integral a => PrimeSieve -> [a] Source #

Generate a list of primes for consumption from a PrimeSieve.