Ticket #4086 (closed bug: fixed)

Opened 3 years ago

Last modified 3 years ago

Data.List 'nub' function is O(n^2)

Reported by: Pete Owned by: igloo
Priority: normal Milestone: 7.0.1
Component: libraries/base Version: 6.12.1
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Documentation bug Difficulty: Easy (less than 1 hour)
Test Case: Blocked By:
Blocking: Related Tickets:

Description (last modified by igloo) (diff)

I recently discovered that some Haskell code was running much slower than I would have expected. I eventually traced the problem to the 'nub' function in the Data.List, which ghc implements as follows:

nub l                   = nub' l []
  where
    nub' [] _           = []
    nub' (x:xs) ls
        | x `elem` ls   = nub' xs ls
        | otherwise     = x : nub' xs (x:ls)

This would seem to be O(n**2), because it accumulates the values it sees in a list. If it used a different data structure like a Set, it could be made O(n log n).

I'm not sure whether this should be considered a bug or not. The list nub returns is correct. On the other hand, when I call a library function in any programming language, I would normally expect it to use an algorithm that provides the best achievable asymptotic performance.

If you decide that you don't consider this to be a bug, can I suggest adding a note to the documentation, so people are aware that nub should only be used with short lists?

Change History

Changed 3 years ago by simonmar

  • difficulty set to Easy (less than 1 hour)
  • milestone set to 6.14.1
  • os changed from Linux to Unknown/Multiple
  • architecture changed from x86 to Unknown/Multiple
  • failure changed from Runtime performance bug to Documentation bug

Given nub's type, the best complexity it can have is O(n2). See #2717 for a proposal (currently abandoned) to add a different nub with O(n log n) complexity to Data.Set.

You're quite right that the docs should be improved here, so I'll leave the ticket open. If you want to consider adopting #2717, that would be a worthwhile step I think.

Changed 3 years ago by igloo

  • description modified (diff)

Changed 3 years ago by igloo

  • owner set to igloo

Changed 3 years ago by igloo

  • status changed from new to closed
  • resolution set to fixed

Docs tweaked.

Note: See TracTickets for help on using tickets.