Ticket #5413 (closed feature request: fixed)

Opened 22 months ago

Last modified 22 months ago

Add population count primop

Reported by: tibbe
Owned by: simonmar
Priority: normal
Milestone:
Component: Compiler
Version: 7.2.1
Keywords:
Cc: johan.tibell@…
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Difficulty:
Test Case:
Blocked By:
Blocking:
Related Tickets:

Description

Modern CPUs have a POPCNT instruction for efficient population count (counting the set bits in a word). This instruction is useful when implementing data structures that rely on bit counting, such as bitmap-indexed sparse arrays.
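
As an illustration of the data-structure use case, here is a minimal sketch in C of a bitmap-indexed sparse array lookup. The `sparse_array` type and the `sparse_lookup` and `popcount32` names are hypothetical and not part of the proposal. The portable loop stands in for what a single POPCNT instruction would compute.

```c
#include <stdint.h>
#include <stddef.h>

/* Portable population count (clears one set bit per iteration).
   On CPUs with POPCNT, one instruction would replace this loop. */
static unsigned popcount32(uint32_t x)
{
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical sparse array: bit i of bitmap is set when logical
   slot i is occupied; occupied slots are stored densely in items. */
typedef struct {
    uint32_t bitmap;
    const int *items;
} sparse_array;

/* Returns a pointer to the element in logical slot `slot` (0..31),
   or NULL when that slot is empty. */
static const int *sparse_lookup(const sparse_array *a, unsigned slot)
{
    uint32_t bit = (uint32_t)1 << slot;
    if ((a->bitmap & bit) == 0)
        return NULL;
    /* The number of occupied slots below this one is the dense index. */
    return &a->items[popcount32(a->bitmap & (bit - 1))];
}
```

The lookup costs one mask and one population count regardless of how many slots are occupied, which is why a fast population count matters for this kind of structure.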

I propose we add the following set of primops:

popCnt8# :: Word# -> Word#
popCnt16# :: Word# -> Word#
popCnt32# :: Word# -> Word#
popCnt64# :: Word64# -> Word#
popCnt# :: Word# -> Word#

(We use Word# for all functions except the 64-bit version, as there are no Word8#, Word16# and Word32# types.)

Each primop compiles into either a single POPCNT instruction or a call to some fallback function, implemented in C.
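
A C fallback of the kind described above could look roughly like the following sketch. The function names are hypothetical, and it is an assumption here that the narrower variants count only the low 8, 16 or 32 bits of the word they receive. The bit-twiddling reduction is one common portable technique, not necessarily the one used in the patch.

```c
#include <stdint.h>

/* Count the set bits of a 64-bit value with a parallel (SWAR)
   reduction: sum adjacent bits, then pairs, then nibbles, and
   finally add up all bytes with a multiplication. */
static unsigned popcnt64_fallback(uint64_t x)
{
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (unsigned)((x * 0x0101010101010101ULL) >> 56);
}

/* Narrower variants take a full word and, by assumption, ignore
   everything above the low 8, 16 or 32 bits. */
static unsigned popcnt8_fallback(uint64_t x)
{
    return popcnt64_fallback(x & 0xFFULL);
}

static unsigned popcnt16_fallback(uint64_t x)
{
    return popcnt64_fallback(x & 0xFFFFULL);
}

static unsigned popcnt32_fallback(uint64_t x)
{
    return popcnt64_fallback(x & 0xFFFFFFFFULL);
}
```

Masking inside the fallback means callers can pass a full-width word without clearing the upper bits first.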

Attachments

Change History

  Changed 22 months ago by tibbe

  • status changed from new to patch

  Changed 22 months ago by tibbe

  • owner set to simonmar

I've implemented the primops. Optionally we might want to create a small static library in ghc-prim containing the C fallbacks, to avoid the overhead of dynamic linking for these "fat machine instructions".

Changed 22 months ago by simonmar

Shouldn't -msse4.2 imply -msse2?

Changed 22 months ago by tibbe

  • cc johan.tibell@… added

Replying to simonmar:

Shouldn't -msse4.2 imply -msse2?

Yes, and I think it does. Check the helpers in the native code generator.

  Changed 22 months ago by tibbe

  • status changed from patch to closed
  • resolution set to fixed

Fixed in 2d0438f329ac153f9e59155f405d27fac0c43d65
