hxt-8.5.4: A collection of tools for processing XML with Haskell.

Portability: non-portable
Stability: experimental
Maintainer: Uwe Schmidt (uwe@fh-wedel.de)

Data.Atom

Description

Unique Atoms generated from Strings and managed as flyweights

Data.Atom can be used for caching and storage optimisation of frequently used strings. An Atom is constructed from a String. For two equal strings the identical atom is returned.

This module can be used to optimize memory usage when working with strings or names. Many applications use data types like Map String SomeAttribute where a rather fixed set of keys is used. XML applications in particular often work with a limited set of element and attribute names. For these applications it is more memory efficient to work with types like Map Atom SomeAttribute and to convert the keys into atoms before operating on such a map.
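As an illustration of this pattern, here is a minimal sketch that counts names in a Map keyed by Atom. The helpers countNames and lookupCount are hypothetical and not part of hxt; the sketch relies only on newAtom and on Atom having an Ord instance, and it assumes the hxt and containers packages are available.

```haskell
import Data.Atom (Atom, newAtom)
import qualified Data.Map as M

-- Hypothetical helper: count how often each name occurs,
-- converting every key into an Atom before it enters the map.
countNames :: [String] -> M.Map Atom Int
countNames = foldr (\name -> M.insertWith (+) (newAtom name) 1) M.empty

-- Hypothetical helper: look up a name by converting it into an Atom first.
lookupCount :: String -> M.Map Atom Int -> Int
lookupCount name = M.findWithDefault 0 (newAtom name)
```

Callers keep working with plain Strings at the boundary; only the map itself stores atoms.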

Internally this module manages a map of atoms. The atoms are represented by ByteStrings. When a new atom is created from a string, the string is first converted into a UTF-8 Word8 sequence, which is packed into a ByteString. This ByteString is looked up in the table of atoms. If it is already there, the value in the map is used as the atom; otherwise the new ByteString is inserted into the map.

Of course, the implementation of this name cache uses unsafePerformIO and MVars to manage this kind of global state.
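The lookup-or-insert scheme can be sketched as follows. This is not the hxt implementation: it is a simplified illustration that stores plain Strings instead of UTF-8 ByteStrings, compares by value rather than by pointer, and uses the hypothetical names Sym, symTable, intern and tableSize. It needs only base and containers.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar, newMVar, readMVar)
import qualified Data.Map as M
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical stand-in for Atom: wraps the canonical copy of a String.
newtype Sym = Sym String
  deriving (Eq, Ord)

-- Mirrors the law show . newAtom == id (no quotes are added).
instance Show Sym where
  show (Sym s) = s

-- Global table mapping each string to its canonical copy.
-- NOINLINE ensures that only one table is ever created.
symTable :: MVar (M.Map String String)
symTable = unsafePerformIO (newMVar M.empty)
{-# NOINLINE symTable #-}

-- Look the string up; reuse the stored copy or insert a new one.
intern :: String -> Sym
intern s =
  -- Force the string completely before taking the lock, so that
  -- evaluating it can never re-enter the table.
  foldr seq () s `seq` unsafePerformIO (modifyMVar symTable step)
  where
    step tbl = case M.lookup s tbl of
      Just canonical -> return (tbl, Sym canonical)
      Nothing        -> let tbl' = M.insert s s tbl
                        in tbl' `seq` return (tbl', Sym s)
{-# NOINLINE intern #-}

-- Number of distinct strings interned so far; it never decreases.
tableSize :: IO Int
tableSize = M.size <$> readMVar symTable
```

Because intern hides the table update behind a pure type, an entry is added only when the resulting value is actually demanded.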

The following laws hold for atoms:

 s  ==       t => newAtom s  ==       newAtom t
 s `compare` t == newAtom s `compare` newAtom t
 show . newAtom == id

Equality tests for Atoms run in O(1); they are just pointer comparisons. The Ord comparisons have the same running time as ByteString comparisons. Internally the UTF-8 encodings are compared, but UTF-8 encoding preserves the total order.
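The claim about order preservation can be checked with a small, self-contained sketch. The encoder below is a hand-written illustration using only base, not the encoder used by hxt, and the names encodeChar, encodeUtf8 and orderPreserved are hypothetical.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode one character as a UTF-8 byte sequence (1 to 4 bytes).
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. lead 6, cont 0]
  | n < 0x10000 = [0xE0 .|. lead 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. lead 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Payload bits of the leading byte.
    lead :: Int -> Word8
    lead k = fromIntegral (n `shiftR` k)
    -- Continuation byte: 10xxxxxx with six payload bits.
    cont :: Int -> Word8
    cont k = 0x80 .|. fromIntegral ((n `shiftR` k) .&. 0x3F)

-- Encode a whole string.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeChar

-- Comparing two strings gives the same result as comparing their encodings.
orderPreserved :: String -> String -> Bool
orderPreserved s t = compare s t == compare (encodeUtf8 s) (encodeUtf8 t)
```

The property holds because longer encodings start with larger leading bytes, so code point order and byte order agree at the first position where two strings differ.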

Warning: The internal cache never shrinks during execution, so using it in an undisciplined way can lead to memory leaks.

Atom objects

newAtom :: String -> Atom

Creates an Atom from a String.

share :: String -> String

Insert a String into the atom cache and convert the atom back into a String.

Logically, share == id holds, but internally equal strings share the same memory.
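A minimal usage sketch, assuming the hxt package is available: the hypothetical helper shareNames passes every attribute name through share, so repeated names in a long list can refer to one shared String while the list keeps its type and contents.

```haskell
import Data.Atom (share)

-- Hypothetical helper: route every attribute name through the atom cache.
-- The result is equal to the input; only the memory layout may differ.
shareNames :: [(String, String)] -> [(String, String)]
shareNames attrs = [ (share name, value) | (name, value) <- attrs ]
```

Since the cache never shrinks, this is best reserved for names drawn from a bounded set, such as element and attribute names, rather than for arbitrary text content.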