Skip to content

Commit

Permalink
Improve module Haddocks
Browse files Browse the repository at this point in the history
* Make module Haddocks largely consistent across Set, Map, IntSet,
  IntMap, Seq.
* Make public module docs for each structure (e.g. Data.Map,
  Data.Map.Lazy, Data.Map.Strict) have similar structure and content.
  This means some duplication of text, but there is no way to avoid it.
* Trim down internal module Haddocks (e.g. Data.Map.Internal) to a
  structure description and references. There is no need to repeat
  everything that's in the public module.
* Add a few lines about the complexity of common operations on each set
  and map structure. In particular, union and intersection are mentioned
  with a clear description of 'm' and 'n', to reduce the chances that a
  reader assumes them to be tied to positions and not sizes.
  • Loading branch information
meooow25 committed Feb 25, 2025
1 parent 02b56b7 commit a38a597
Show file tree
Hide file tree
Showing 15 changed files with 217 additions and 209 deletions.
40 changes: 29 additions & 11 deletions containers/src/Data/IntMap.hs
Original file line number Diff line number Diff line change
Expand Up @@ -14,16 +14,30 @@
-- Maintainer : [email protected]
-- Portability : portable
--
-- The @'IntMap' v@ type represents a finite map (sometimes called a dictionary)
-- from key of type @Int@ to values of type @v@.
--
-- = Finite Int Maps (lazy interface)
--
-- This module re-exports the value lazy "Data.IntMap.Lazy" API.
--
-- This module is intended to be imported qualified, to avoid name
-- clashes with Prelude functions, e.g.
-- The @'IntMap' v@ type represents a finite map (sometimes called a dictionary)
-- from keys of type @Int@ to values of type @v@.
--
-- The functions in "Data.IntMap.Strict" are careful to force values before
-- installing them in an 'IntMap'. This is usually more efficient in cases where
-- laziness is not essential. The functions in this module do not do so.
--
-- > import Data.IntMap (IntMap)
-- > import qualified Data.IntMap as IntMap
-- For a walkthrough of the most commonly used functions see the
-- <https://haskell-containers.readthedocs.io/en/latest/map.html maps introduction>.
--
-- This module is intended to be imported qualified, to avoid name clashes with
-- Prelude functions, e.g.
--
-- > import Data.IntMap.Lazy (IntMap)
-- > import qualified Data.IntMap.Lazy as IntMap
--
-- Note that the implementation is generally /left-biased/. Functions that take
-- two maps as arguments and combine them, such as `union` and `intersection`,
-- prefer the values in the first argument to those in the second.
--
--
-- == Implementation
Expand Down Expand Up @@ -52,11 +66,11 @@
-- referring to the number of entries in the map and \(W\) referring to the
-- number of bits in an 'Int' (32 or 64).
--
-- Many operations have a worst-case complexity of \(O(\min(n,W))\).
-- This means that the operation can become linear in the number of
-- elements with a maximum of \(W\) -- the number of bits in an 'Int'
-- (32 or 64). These peculiar asymptotics are determined by the depth
-- of the Patricia trees:
-- Operations like 'lookup', 'insert', and 'delete' have a worst-case
-- complexity of \(O(\min(n,W))\). This means that the operation can become
-- linear in the number of elements with a maximum of \(W\) -- the number of
-- bits in an 'Int' (32 or 64). These peculiar asymptotics are determined by the
-- depth of the Patricia trees:
--
-- * even for an extremely unbalanced tree, the depth cannot be larger than
-- the number of elements \(n\),
Expand All @@ -74,6 +88,10 @@
-- The worst scenario are exponentially growing keys \(1,2,4,\ldots,2^n\),
-- for which complexity grows as fast as \(n\) but again is capped by \(W\).
--
-- Binary set operations like 'union' and 'intersection' take
-- \(O(\min(n, m \log \frac{2^W}{m}))\) time, where \(m\) and \(n\)
-- are the sizes of the smaller and larger input maps respectively.
--
-----------------------------------------------------------------------------

module Data.IntMap
Expand Down
26 changes: 23 additions & 3 deletions containers/src/Data/IntMap/Internal.hs
Original file line number Diff line number Diff line change
Expand Up @@ -37,10 +37,30 @@
-- Authors importing this module are expected to track development
-- closely.
--
-- = Description
--
-- This defines the data structures and core (hidden) manipulations
-- on representations.
-- = Finite Int Maps (lazy interface internals)
--
-- The @'IntMap' v@ type represents a finite map (sometimes called a dictionary)
-- from keys of type @Int@ to values of type @v@.
--
--
-- == Implementation
--
-- The implementation is based on /big-endian patricia trees/. This data
-- structure performs especially well on binary operations like 'union'
-- and 'intersection'. Additionally, benchmarks show that it is also
-- (much) faster on insertions and deletions when compared to a generic
-- size-balanced map implementation (see "Data.Map").
--
-- * Chris Okasaki and Andy Gill,
-- \"/Fast Mergeable Integer Maps/\",
-- Workshop on ML, September 1998, pages 77-86,
-- <https://web.archive.org/web/20150417234429/https://ittc.ku.edu/~andygill/papers/IntMap98.pdf>.
--
-- * D.R. Morrison,
-- \"/PATRICIA -- Practical Algorithm To Retrieve Information Coded In Alphanumeric/\",
-- Journal of the ACM, 15(4), October 1968, pages 514-534,
-- <https://doi.org/10.1145/321479.321481>.
--
-- @since 0.5.9
-----------------------------------------------------------------------------
Expand Down
16 changes: 10 additions & 6 deletions containers/src/Data/IntMap/Lazy.hs
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@
-- <https://haskell-containers.readthedocs.io/en/latest/map.html maps introduction>.
--
-- This module is intended to be imported qualified, to avoid name clashes with
-- Prelude functions:
-- Prelude functions, e.g.
--
-- > import Data.IntMap.Lazy (IntMap)
-- > import qualified Data.IntMap.Lazy as IntMap
Expand Down Expand Up @@ -64,11 +64,11 @@
-- referring to the number of entries in the map and \(W\) referring to the
-- number of bits in an 'Int' (32 or 64).
--
-- Many operations have a worst-case complexity of \(O(\min(n,W))\).
-- This means that the operation can become linear in the number of
-- elements with a maximum of \(W\) -- the number of bits in an 'Int'
-- (32 or 64). These peculiar asymptotics are determined by the depth
-- of the Patricia trees:
-- Operations like 'lookup', 'insert', and 'delete' have a worst-case
-- complexity of \(O(\min(n,W))\). This means that the operation can become
-- linear in the number of elements with a maximum of \(W\) -- the number of
-- bits in an 'Int' (32 or 64). These peculiar asymptotics are determined by the
-- depth of the Patricia trees:
--
-- * even for an extremely unbalanced tree, the depth cannot be larger than
-- the number of elements \(n\),
Expand All @@ -86,6 +86,10 @@
-- The worst scenario are exponentially growing keys \(1,2,4,\ldots,2^n\),
-- for which complexity grows as fast as \(n\) but again is capped by \(W\).
--
-- Binary set operations like 'union' and 'intersection' take
-- \(O(\min(n, m \log \frac{2^W}{m}))\) time, where \(m\) and \(n\)
-- are the sizes of the smaller and larger input maps respectively.
--
-- Benchmarks comparing "Data.IntMap.Lazy" with other dictionary
-- implementations can be found at https://github.com/haskell-perf/dictionaries.
--
Expand Down
16 changes: 10 additions & 6 deletions containers/src/Data/IntMap/Strict.hs
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@
-- <https://haskell-containers.readthedocs.io/en/latest/map.html maps introduction>.
--
-- This module is intended to be imported qualified, to avoid name clashes with
-- Prelude functions:
-- Prelude functions, e.g.
--
-- > import Data.IntMap.Strict (IntMap)
-- > import qualified Data.IntMap.Strict as IntMap
Expand Down Expand Up @@ -80,11 +80,11 @@
-- referring to the number of entries in the map and \(W\) referring to the
-- number of bits in an 'Int' (32 or 64).
--
-- Many operations have a worst-case complexity of \(O(\min(n,W))\).
-- This means that the operation can become linear in the number of
-- elements with a maximum of \(W\) -- the number of bits in an 'Int'
-- (32 or 64). These peculiar asymptotics are determined by the depth
-- of the Patricia trees:
-- Operations like 'lookup', 'insert', and 'delete' have a worst-case
-- complexity of \(O(\min(n,W))\). This means that the operation can become
-- linear in the number of elements with a maximum of \(W\) -- the number of
-- bits in an 'Int' (32 or 64). These peculiar asymptotics are determined by the
-- depth of the Patricia trees:
--
-- * even for an extremely unbalanced tree, the depth cannot be larger than
-- the number of elements \(n\),
Expand All @@ -102,6 +102,10 @@
-- The worst scenario are exponentially growing keys \(1,2,4,\ldots,2^n\),
-- for which complexity grows as fast as \(n\) but again is capped by \(W\).
--
-- Binary set operations like 'union' and 'intersection' take
-- \(O(\min(n, m \log \frac{2^W}{m}))\) time, where \(m\) and \(n\)
-- are the sizes of the smaller and larger input maps respectively.
--
-- Benchmarks comparing "Data.IntMap.Strict" with other dictionary
-- implementations can be found at https://github.com/haskell-perf/dictionaries.
--
Expand Down
35 changes: 1 addition & 34 deletions containers/src/Data/IntMap/Strict/Internal.hs
Original file line number Diff line number Diff line change
Expand Up @@ -28,44 +28,11 @@
-- closely.
--
--
-- = Description
-- = Finite Int Maps (strict interface internals)
--
-- The @'IntMap' v@ type represents a finite map (sometimes called a dictionary)
-- from key of type @Int@ to values of type @v@.
--
-- Each function in this module is careful to force values before installing
-- them in an 'IntMap'. This is usually more efficient when laziness is not
-- necessary. When laziness /is/ required, use the functions in
-- "Data.IntMap.Lazy".
--
-- In particular, the functions in this module obey the following law:
--
-- - If all values stored in all maps in the arguments are in WHNF, then all
-- values stored in all maps in the results will be in WHNF once those maps
-- are evaluated.
--
-- For a walkthrough of the most commonly used functions see the
-- <https://haskell-containers.readthedocs.io/en/latest/map.html maps introduction>.
--
-- This module is intended to be imported qualified, to avoid name clashes with
-- Prelude functions:
--
-- > import Data.IntMap.Strict (IntMap)
-- > import qualified Data.IntMap.Strict as IntMap
--
-- Note that the implementation is generally /left-biased/. Functions that take
-- two maps as arguments and combine them, such as `union` and `intersection`,
-- prefer the values in the first argument to those in the second.
--
--
-- == Warning
--
-- The 'IntMap' type is shared between the lazy and strict modules, meaning that
-- the same 'IntMap' value can be passed to functions in both modules. This
-- means that the 'Functor', 'Traversable' and 'Data.Data.Data' instances are
-- the same as for the "Data.IntMap.Lazy" module, so if they are used the
-- resulting map may contain suspended values (thunks).
--
--
-- == Implementation
--
Expand Down
27 changes: 16 additions & 11 deletions containers/src/Data/IntSet.hs
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@
--
-- The implementation is based on /big-endian patricia trees/. This data
-- structure performs especially well on binary operations like 'union'
-- and 'intersection'. However, my benchmarks show that it is also
-- and 'intersection'. Additionally, benchmarks show that it is also
-- (much) faster on insertions and deletions when compared to a generic
-- size-balanced set implementation (see "Data.Set").
--
Expand All @@ -57,16 +57,16 @@
--
-- == Performance information
--
-- Operation comments contain the operation time complexity in
-- The time complexity is given for each operation in
-- [big-O notation](http://en.wikipedia.org/wiki/Big_O_notation), with \(n\)
-- referring to the number of entries in the map and \(W\) referring to the
-- number of bits in an 'Int' (32 or 64).
--
-- Many operations have a worst-case complexity of \(O(\min(n,W))\).
-- This means that the operation can become linear in the number of
-- elements with a maximum of \(W\) -- the number of bits in an 'Int'
-- (32 or 64). These peculiar asymptotics are determined by the depth
-- of the Patricia trees:
-- Operations like 'member', 'insert', and 'delete' have a worst-case
-- complexity of \(O(\min(n,W))\). This means that the operation can become
-- linear in the number of elements with a maximum of \(W\) -- the number of
-- bits in an 'Int' (32 or 64). These peculiar asymptotics are determined by the
-- depth of the Patricia trees:
--
-- * even for an extremely unbalanced tree, the depth cannot be larger than
-- the number of elements \(n\),
Expand All @@ -79,10 +79,15 @@
-- the set is sufficiently "dense", this becomes \(O(\min(n, \log n))\) or
-- simply the familiar \(O(\log n)\), matching balanced binary trees.
--
-- The most performant scenario for 'IntSet' are keys from a contiguous subset,
-- in which case the complexity is proportional to \(\log n\), capped by \(W\).
-- The worst scenario are exponentially growing elements \(1,2,4,\ldots,2^n\),
-- for which complexity grows as fast as \(n\) but again is capped by \(W\).
-- The most performant scenario for 'IntSet' are elements from a contiguous
-- subset, in which case the complexity is proportional to \(\log n\), capped
-- by \(W\). The worst scenario are exponentially growing elements \(1,2,4,
-- \ldots,2^n\), for which complexity grows as fast as \(n\) but again is capped
-- by \(W\).
--
-- Binary set operations like 'union' and 'intersection' take
-- \(O(\min(n, m \log \frac{2^W}{m}))\) time, where \(m\) and \(n\)
-- are the sizes of the smaller and larger input sets respectively.
--
-----------------------------------------------------------------------------

Expand Down
12 changes: 4 additions & 8 deletions containers/src/Data/IntSet/Internal.hs
Original file line number Diff line number Diff line change
Expand Up @@ -34,22 +34,18 @@
-- Authors importing this module are expected to track development
-- closely.
--
-- = Description
--
-- An efficient implementation of integer sets.
-- = Finite Int Sets (internals)
--
-- These modules are intended to be imported qualified, to avoid name
-- clashes with Prelude functions, e.g.
--
-- > import Data.IntSet (IntSet)
-- > import qualified Data.IntSet as IntSet
-- The @'IntSet'@ type represents a set of elements of type @Int@. An @IntSet@
-- is strict in its elements.
--
--
-- == Implementation
--
-- The implementation is based on /big-endian patricia trees/. This data
-- structure performs especially well on binary operations like 'union'
-- and 'intersection'. However, my benchmarks show that it is also
-- and 'intersection'. Additionally, benchmarks show that it is also
-- (much) faster on insertions and deletions when compared to a generic
-- size-balanced set implementation (see "Data.Set").
--
Expand Down
61 changes: 49 additions & 12 deletions containers/src/Data/Map.hs
Original file line number Diff line number Diff line change
Expand Up @@ -14,16 +14,50 @@
-- Maintainer : [email protected]
-- Portability : portable
--
--
-- = Finite Maps (lazy interface)
--
-- This module re-exports the value lazy "Data.Map.Lazy" API.
--
-- The @'Map' k v@ type represents a finite map (sometimes called a dictionary)
-- from keys of type @k@ to values of type @v@. A 'Map' is strict in its keys but lazy
-- in its values.
--
-- This module re-exports the value lazy "Data.Map.Lazy" API.
-- The functions in "Data.Map.Strict" are careful to force values before
-- installing them in a 'Map'. This is usually more efficient in cases where
-- laziness is not essential. The functions in this module do not do so.
--
-- When deciding if this is the correct data structure to use, consider:
--
-- * If you are using 'Prelude.Int' keys, you will get much better performance for most
-- operations using "Data.IntMap.Lazy".
--
-- * If you don't care about ordering, consider using @Data.HashMap.Lazy@ from the
-- <https://hackage.haskell.org/package/unordered-containers unordered-containers>
-- package instead.
--
-- For a walkthrough of the most commonly used functions see the
-- <https://haskell-containers.readthedocs.io/en/latest/map.html maps introduction>.
--
-- This module is intended to be imported qualified, to avoid name
-- clashes with Prelude functions, e.g.
-- This module is intended to be imported qualified, to avoid name clashes with
-- Prelude functions, e.g.
--
-- > import qualified Data.Map as Map
-- > import Data.Map (Map)
-- > import qualified Data.Map as Map
--
-- Note that the implementation is generally /left-biased/. Functions that take
-- two maps as arguments and combine them, such as `union` and `intersection`,
-- prefer the values in the first argument to those in the second.
--
--
-- == Warning
--
-- The size of a 'Map' must not exceed @'Prelude.maxBound' :: 'Prelude.Int'@.
-- Violation of this condition is not detected and if the size limit is exceeded,
-- its behaviour is undefined.
--
--
-- == Implementation
--
-- The implementation of 'Map' is based on /size balanced/ binary trees (or
-- trees of /bounded balance/) as described by:
Expand All @@ -48,16 +82,19 @@
-- \"/Parallel Ordered Sets Using Join/\",
-- <https://arxiv.org/abs/1602.02120v4>.
--
-- Note that the implementation is /left-biased/ -- the elements of a
-- first argument are always preferred to the second, for example in
-- 'union' or 'insert'.
--
-- /Warning/: The size of the map must not exceed @maxBound::Int@. Violation of
-- this condition is not detected and if the size limit is exceeded, its
-- behaviour is undefined.
-- == Performance information
--
-- The time complexity is given for each operation in
-- [big-O notation](http://en.wikipedia.org/wiki/Big_O_notation), with \(n\)
-- referring to the number of entries in the map.
--
-- Operations like 'lookup', 'insert', and 'delete' take \(O(\log n)\) time.
--
-- Binary set operations like 'union' and 'intersection' take
-- \(O\bigl(m \log\bigl(\frac{n}{m}+1\bigr)\bigr)\) time, where \(m\) and \(n\)
-- are the sizes of the smaller and larger input maps respectively.
--
-- Operation comments contain the operation time complexity in
-- the Big-O notation (<http://en.wikipedia.org/wiki/Big_O_notation>).
-----------------------------------------------------------------------------

module Data.Map
Expand Down
Loading

0 comments on commit a38a597

Please sign in to comment.