Ticket #1434 (closed task: fixed)

Opened 6 years ago

Last modified 3 years ago

Missing RULEs for truncate

Reported by: ghc@… Owned by: simonmar
Priority: high Milestone: 7.0.1
Component: libraries/base Version: 6.4.1
Keywords: rules Cc: ghc-bug@…, dons@…, alexey.skladnoy@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Difficulty: Unknown
Test Case: Blocked By:
Blocking: Related Tickets:

Description

I found that the rounding functions from RealFrac? class are considerably slower than the low level functions from GHC.Float. This is really a problem for me when doing signal processing, since for writing to a common audio file format or listening to a signal data has to be converted from Double to Int16.

$ ghci +RTS -M256m -c30 -RTS

   ___         ___ _
  / _ \ /\  /\/ __(_)
 / /_\// /_/ / /  | |      GHC Interactive, version 6.4.1, for Haskell 98.
/ /_\\/ __  / /___| |      http://www.haskell.org/ghc/
\____/\/ /_/\____/|_|      Type :? for help.

Loading package base-1.0 ... linking ... done.
Prelude> :set +s
Prelude> sum $ map round [0.1,0.32..100000] :: Int
1252489463
(6.50 secs, 241894764 bytes)
Prelude> sum $ map floor [0.1,0.32..100000] :: Int
1252262188
(6.07 secs, 240099200 bytes)
Prelude> sum $ map ceiling [0.1,0.32..100000] :: Int
1252716734
(6.13 secs, 243795404 bytes)
Prelude> sum $ map truncate [0.1,0.32..100000] :: Int
1252262188
(6.76 secs, 234572324 bytes)
Prelude> sum $ map GHC.Float.double2Int [0.1,0.32..100000] :: Int
1252262188
(1.38 secs, 66137016 bytes)

As far as I can judge, double2Int does the same like truncate. Instead of using the methods from RealFrac? I could simply use double2Int but I consider this a work-around.

In GHC-6.6.1 these examples end with a stack overflow, but if I shorten the list, the time relations remain the same.

Attachments

rules1434.dpatch Download (65.9 KB) - added by daniel.is.fischer 3 years ago.
rewrite rules for sized Int and Word types

Change History

Changed 6 years ago by ghc@…

In a more compact example using the GHC compiler and optimizations there are no performance differences:

{-# OPTIONS -O2 -Wall #-}
module Main where

import GHC.Float (double2Int)

main  :: IO ()
main  = print (sum (map double2Int [-0.32,-0.29..(100000::Double)]) :: Int)

main' :: IO ()
main' = print (sum (map truncate   [-0.32,-0.29..(100000::Double)]) :: Int)

However in a more complicated application I have, I generated a signal with round in 4 seconds and with a replacement based on double2Int in only one second. This may indicate an insufficient inlining depth, but increasing it didn't help. Seems like I have to extract a working example to demonstrate that.

Changed 6 years ago by simonpj

A repeatable small-ish example would indeed make it much easier for us to find out what is going on. Thanks

Simon

Changed 6 years ago by ghc@…

The following example is still certainly not the most compact one. The above examples let me assume that the difference between round and double2Int is eliminated by the optimizer. However, when I use Int16 instead of Int, this optimization seems to fail, again.

{-# OPTIONS -O2 #-}
{-  -package fps -package binary -}
module Main (main) where

import System.Time (getClockTime, diffClockTimes, tdSec, tdPicosec)
import System.Directory (removeFile)

import qualified Data.ByteString.Lazy as B
import qualified Data.Binary.Put as Bin

import Foreign (Int16)

import GHC.Float (double2Int)



type Sample = Int16    -- 'truncate' is slow
-- type Sample = Int     -- 'truncate' is as fast as double2Int


writeSignalMonoBinaryPut ::
   FilePath -> [Sample] -> IO ()
writeSignalMonoBinaryPut fileName =
   B.writeFile fileName .
   Bin.runPut .
   mapM_ (Bin.putWord16host . fromIntegral)



numToSample :: Double -> Sample
numToSample x =
   -- fromIntegral $ (truncate x :: Int)
   truncate x

doubleToSample :: Double -> Sample
doubleToSample x =
   fromIntegral $ double2Int x



{- * driver -}

measureTime :: String -> IO () -> IO ()
measureTime name act =
   do putStr (name++": ")
      timeA <- getClockTime
      act
      timeB <- getClockTime
      let td = diffClockTimes timeB timeA
      print (fromIntegral (tdSec td) +
             fromInteger (tdPicosec td) * 1e-12 :: Double)




exponential2 :: Double -> Double -> [Double]
exponential2 k = iterate (k*)


sawSignalDouble :: Double -> [Double]
sawSignalDouble halfLife =
   take 200000 (exponential2 halfLife 32767)

sawSignal16Trunc :: [Sample]
sawSignal16Trunc =
   map numToSample (sawSignalDouble 0.999)

sawSignal16LowLevel :: [Sample]
sawSignal16LowLevel =
   map doubleToSample (sawSignalDouble 0.999001)


tests :: [(String, FilePath, FilePath -> IO ())]
tests =
   ("saw double2Int", "saw-double2Int.sw", flip writeSignalMonoBinaryPut sawSignal16LowLevel) :
   ("saw round",      "saw-round.sw",      flip writeSignalMonoBinaryPut sawSignal16Trunc) :
   []


main :: IO ()
main =
   do mapM (\(label, fileName, action) ->
              measureTime label (action fileName))
           tests

      mapM_ (\(_,fileName,_) -> removeFile fileName)
           tests

For type Sample = Int16 I get

$ ghc-6.6.1 -package binary RoundTest.hs && a.out
saw double2Int: 8.7173e-2
saw round: 0.430843

For type Sample = Int I get

$ ghc-6.6.1 -package binary RoundTest.hs && a.out
saw double2Int: 8.8279e-2
saw round: 9.8028e-2

Changed 6 years ago by igloo

  • milestone set to 6.8

Changed 6 years ago by simonmar

I bet this is the lack of a specialised version of truncate for Int16. We have RULEs for truncate at Float->Int and Double->Int. We could presumably add RULEs for small integral types, of the form truncate = fromIntegral . double2Int (truncate is undefined outside the range of the target anyway).

Changed 5 years ago by dons

  • cc dons@… added
  • owner set to dons

I looked into this after the realToFrac issues last week. It's a similar story -- the noop rules are currently only set up for Int conversions.

So, for example, this program:

import Data.Word
import Data.Array.Vector

main = print . sumU
             . mapU (truncate::Double->Word16)
             $ (enumFromToFracU 0.1 (100000000 :: Double))

truncate stays as a call to properFraction.

Now, we can add some rules for this stuff:

{-# RULES
"truncate/Double->Int64"
    truncate = fromIntegral . double2Int :: Double -> Int64
"truncate/Double->Int32"
    truncate = fromIntegral . double2Int :: Double -> Int32
"truncate/Double->Int16"
    truncate = fromIntegral . double2Int :: Double -> Int16
"truncate/Double->Int8"
    truncate = fromIntegral . double2Int :: Double -> Int8
"truncate/Double->Word64"
    truncate = fromIntegral . double2Int :: Double -> Word64
"truncate/Double->Word32"
    truncate = fromIntegral . double2Int :: Double -> Word32
"truncate/Double->Word16"
    truncate = fromIntegral . double2Int :: Double -> Word16
"truncate/Double->Word8"
    truncate = fromIntegral . double2Int :: Double -> Word8
  #-}

And the running time drops from:

$ time ./henning     
28800
./henning  45.84s user 0.11s system 98% cpu 46.448 total

before, to:

$ time ./henning     
28800
./henning  0.39s user 0.00s system 97% cpu 0.398 total

after.

I think this is as good a time as any to do an audit of the Int/noop conversion rules in base, looking for others that also should work for integral types. I'll take care of this.

Changed 5 years ago by dons

  • keywords rules added
  • type changed from bug to task

While hunting around, I noticed we have nice RULES defined for the basic math and boolean ops,

"x# ># x#"  forall x#. x# >#  x# = False
"x# >=# x#" forall x#. x# >=# x# = True
"x# ==# x#" forall x#. x# ==# x# = True
"x# /=# x#" forall x#. x# /=# x# = False
"x# <# x#"  forall x#. x# <#  x# = False
"x# <=# x#" forall x#. x# <=# x# = True

For example, but I couldn't see rules for the Word# primitives.

However, if I write test cases,

x =  case int2Word# 7# of x# -> x# `gtWord#` x#

We do get rules firing,

2 RuleFired
    1 gtWord#
    1 int2Word#
M.x = False

Where are the Word# rules defined? Are they wired in?

Changed 5 years ago by igloo

  • milestone changed from 6.8 branch to 6.10.1

Changed 5 years ago by simonmar

  • architecture changed from Unknown to Unknown/Multiple

Changed 5 years ago by igloo

  • milestone changed from 6.10.1 to 6.10.2

Changed 4 years ago by igloo

  • milestone changed from 6.10.2 to 6.12.1

Changed 4 years ago by Khudyakov

  • cc alexey.skladnoy@… added

Changed 4 years ago by simonmar

  • failure set to Runtime performance bug
  • os changed from Linux to Unknown/Multiple

Changed 4 years ago by igloo

  • owner dons deleted
  • milestone changed from 6.12.1 to 6.12 branch

Changed 3 years ago by LouisWasserman

Is there anything going on here, except that we just need to add in all the rules for the various truncated types?

Changed 3 years ago by simonpj

I believe that's all it is. See the comment from Don Stewart 23 months ago, in this ticket. It would be great if you could sort this out, and perhaps look for other similar cases in the base library.

There is a collection of tiresome arithmetic-related bugs waiting for a generous person to fix them:  http://hackage.haskell.org/trac/ghc/wiki/Status/SLPJ-Tickets#Tiresomearithmeticthings.

Changed 3 years ago by isaacdupree

Does this RULE work on 32-bit, where Int is not as big as Int64? (I'm dubious)

"truncate/Double->Int64"

truncate = fromIntegral . double2Int
Double -> Int64

Changed 3 years ago by igloo

  • milestone changed from 6.12 branch to 6.12.3

Changed 3 years ago by simonmar

  • summary changed from Slow conversion from Double to Int to Missing RULEs for truncate

Changed 3 years ago by igloo

  • priority changed from normal to low
  • milestone changed from 6.12.3 to 6.14.1

Changed 3 years ago by daniel.is.fischer

rewrite rules for sized Int and Word types

Changed 3 years ago by daniel.is.fischer

  • status changed from new to patch

Patch with rewrite rules for the sized Int and Word types whose ranges are contained in Int's range. Unfortunately, we can't have a rule for Word here, that would need its own primop to improve much on the default implementation. It doesn't depend on #2271, but it will only help with truncate without it.

Please review and merge.

Changed 3 years ago by simonmar

  • owner set to simonmar
  • priority changed from low to high

Changed 3 years ago by simonmar

  • status changed from patch to merge

Applied, thanks!

Wed Oct 20 10:10:14 BST 2010  Daniel Fischer <daniel.is.fischer@web.de>
  * FIX #1434
  Rewrite rules for RealFrac methods with sized Int and Word targets.
  For all types whose range is contained in Int's range, there are now
  rewrite rules for properFraction, truncate, floor, ceiling and round
  from Double and Float, going through the specialised methods for Int.
  
  Unfortunately, we can't have a rewrite rule for Word.

Changed 3 years ago by igloo

  • status changed from merge to closed
  • resolution set to fixed

Merged

Note: See TracTickets for help on using tickets.