Decision tree learning, used in statistics, data mining and machine learning, uses a decision tree as a predictive model which maps observations about an item to conclusions about the item's target value. In these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels.
In data mining, a decision tree describes data but not decisions; rather the resulting classification tree can be an input for decision making.
(https://en.wikipedia.org/wiki/Decision_tree_learning, Dec 6 2011)
- data DTree decider branch label
- data DTreeAlgebra decider branch label a = DTreeAlgebra {}
- buildDTree :: (Ord label, Ord branch, Decider decider attr branch) => DeciderGenerator attr decider -> (x -> attr) -> (x -> label) -> [x] -> DTree decider branch label
- foldD :: DTreeAlgebra dec branch label a -> DTree dec branch label -> a
- toDot :: (Show decider, Show branch, Show label) => DTree decider branch label -> String
- class Decider decider attr branch | decider -> attr branch where
- decide :: decider -> attr -> branch
- data Ord t => DecideOrd t = DecideOrd t
- data Eq t => DecideSet t = DecideSet [t]
- class AutoDecide attr decider | attr -> decider where
- autoDeciders :: [attr] -> [decider]
- genOrds :: Ord attr => [attr] -> [DecideOrd attr]
- genOrdsAvg :: Ord attr => (attr -> attr -> attr) -> [attr] -> [DecideOrd attr]
- genPair :: DeciderGenerator attra decidera -> DeciderGenerator attrb deciderb -> DeciderGenerator (attra, attrb) (Either decidera deciderb)
- genMany :: DeciderGenerator attr decider -> DeciderGenerator [attr] (Ixd decider)
- avgF :: Fractional a => a -> a -> a
- avgI :: Integral a => a -> a -> a
Decision Tree
data DTree decider branch label
A decision tree data structure that allows arbitrary numbers of children. It has been proven that a binary tree is equally expressive, but considering that decision trees are a 'white box' model, we do not want to limit ourselves to the binary case because other numbers of children may make more sense to humans.
Converting between binary and arbitrary-child trees is feasible, though probably not very interesting.
(Eq decider, Eq branch, Eq label) => Eq (DTree decider branch label)
(Show decider, Show branch, Show label) => Show (DTree decider branch label)
(Show decider, Show branch, Show label) => Convertible (DTree decider branch label) DisplayLatex
(Show decider, Show branch, Show label) => Convertible (DTree decider branch label) DisplayText
(Decider decider attr branch, Eq branch) => Classifier (DTree decider branch label) attr label
buildDTree :: (Ord label, Ord branch, Decider decider attr branch) => DeciderGenerator attr decider -> (x -> attr) -> (x -> label) -> [x] -> DTree decider branch label
Build a decision tree from a list of training items, given a decider generator, an attribute extractor, and a label extractor.
foldD :: DTreeAlgebra dec branch label a -> DTree dec branch label -> a
Fold over a DTree using a DTreeAlgebra.
toDot :: (Show decider, Show branch, Show label) => DTree decider branch label -> String
Render a decision tree to Graphviz Dot format.
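The constructors of DTree are not shown in this documentation, but its role as a white-box classifier can be illustrated with a hypothetical minimal analogue. The names Leaf, Node, classify, and exampleTree below are illustrative assumptions, not the library's API:

```haskell
-- Hypothetical minimal analogue of DTree (assumption; the real
-- constructors are not exported in this documentation). An inner
-- node pairs a decider with (branch, subtree) children; leaves
-- carry class labels.
data DTree decider branch label
  = Leaf label
  | Node decider [(branch, DTree decider branch label)]
  deriving Show

-- Walk the tree, using a decide function to pick a branch key at
-- each inner node; Nothing if a branch key has no child.
classify :: Eq branch
         => (decider -> attr -> branch)
         -> DTree decider branch label -> attr -> Maybe label
classify _   (Leaf l)      _ = Just l
classify dec (Node d kids) x = lookup (dec d x) kids >>= \t -> classify dec t x

-- A hand-built tree over one Int attribute: "attr <= 30?".
exampleTree :: DTree Int Bool String
exampleTree = Node 30 [(True, Leaf "young"), (False, Leaf "older")]

main :: IO ()
main = do
  print (classify (\pivot a -> a <= pivot) exampleTree 25)  -- Just "young"
  print (classify (\pivot a -> a <= pivot) exampleTree 40)  -- Just "older"
```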
Deciders
class Decider decider attr branch | decider -> attr branch where
decide :: decider -> attr -> branch

A decider defines the type and semantics of a split. For example, the split (attr <= 20) is created by DecideOrd 20.

For every possible value of type branch, an actual tree branch may be created. Allowing many distinct values in branch is a bad idea: too many of them may have little predictive value and exhaust the training database more quickly.

decider: the representation of the decider
attr: the data it needs
branch: the key that leads to a branch
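As a self-contained sketch of these semantics, the class and its DecideOrd instance can be reproduced directly, assuming the (attr <= pivot) reading with Bool branch keys described above:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}

-- Sketch of the Decider class and the DecideOrd instance,
-- assuming (<= pivot) semantics with Bool branch keys.
class Decider decider attr branch | decider -> attr branch where
  decide :: decider -> attr -> branch

newtype DecideOrd t = DecideOrd t  -- the pivot of the split

instance Ord attr => Decider (DecideOrd attr) attr Bool where
  decide (DecideOrd pivot) attr = attr <= pivot

main :: IO ()
main = do
  print (decide (DecideOrd (20 :: Int)) 15)  -- True:  15 <= 20
  print (decide (DecideOrd (20 :: Int)) 30)  -- False: 30 > 20
```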
data Ord t => DecideOrd t
Decide with Ord: DecideOrd pivot splits on (attr <= pivot), yielding a Bool branch key.
AutoDecide Double (DecideOrd Double)
AutoDecide Float (DecideOrd Float)
AutoDecide Int (DecideOrd Int)
AutoDecide Integer (DecideOrd Integer)
(Ord t, Read t) => Read (DecideOrd t)
(Ord t, Show t) => Show (DecideOrd t)
Ord attr => Decider (DecideOrd attr) attr Bool
Integral a => AutoDecide (Ratio a) (DecideOrd (Ratio a))
class AutoDecide attr decider | attr -> decider where
AutoDecide is used to generate possible splits based on actual attributes, in a straightforward fashion. Think of AutoDecide as a default implementation for Decider generation.
autoDeciders :: [attr] -> [decider]
AutoDecide Char (DecideSet Char)
AutoDecide Double (DecideOrd Double)
AutoDecide Float (DecideOrd Float)
AutoDecide Int (DecideOrd Int)
AutoDecide Integer (DecideOrd Integer)
AutoDecide [Char] (DecideSet [Char])
Integral a => AutoDecide (Ratio a) (DecideOrd (Ratio a))
(AutoDecide a xa, AutoDecide b xb) => AutoDecide (a, b) (Either xa xb)
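A sketch of how such an instance might generate splits directly from observed values. The one-pivot-per-distinct-value strategy is an assumption for illustration, not necessarily what the library's instances do:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
import Data.List (nub, sort)

class AutoDecide attr decider | attr -> decider where
  autoDeciders :: [attr] -> [decider]

newtype DecideOrd t = DecideOrd t deriving (Eq, Show)

-- Assumed strategy: one (<= pivot) decider per distinct observed value.
instance AutoDecide Int (DecideOrd Int) where
  autoDeciders = map DecideOrd . nub . sort

main :: IO ()
main = print (autoDeciders [3, 1, 3, 2 :: Int])
-- [DecideOrd 1,DecideOrd 2,DecideOrd 3]
```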
Composing Deciders
Though autoDeciders may provide good results, there may be useful deciders that it does not generate. These functions let you deviate from autoDeciders and compose very specific deciders.
genOrds :: Ord attr => [attr] -> [DecideOrd attr]
Decider generator implementation for any ordered data; considers all sensible (<= pivot) splits.
genOrdsAvg :: Ord attr => (attr -> attr -> attr) -> [attr] -> [DecideOrd attr]
Decider generator for any ordered data; considers all possible (<= pivot) splits.
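One plausible implementation of these two generators, under the assumptions that genOrds takes pivots from the distinct sorted values and genOrdsAvg takes them from averages of adjacent distinct values (so splits fall between data points); both assumptions are sketches, not the library's actual code:

```haskell
import Data.List (nub, sort)

newtype DecideOrd t = DecideOrd t deriving (Eq, Show)

-- Sketch of genOrds (assumption): one (<= pivot) decider per
-- distinct observed value.
genOrds :: Ord attr => [attr] -> [DecideOrd attr]
genOrds = map DecideOrd . nub . sort

-- Sketch of genOrdsAvg (assumption): pivots are averages of
-- adjacent distinct values, computed with the supplied function.
genOrdsAvg :: Ord attr => (attr -> attr -> attr) -> [attr] -> [DecideOrd attr]
genOrdsAvg avg xs = [DecideOrd (avg a b) | (a, b) <- zip vs (tail vs)]
  where vs = nub (sort xs)

-- avgF as documented below, for use with genOrdsAvg.
avgF :: Fractional a => a -> a -> a
avgF a b = (a + b) / 2

main :: IO ()
main = do
  print [p | DecideOrd p <- genOrds [2, 1, 4 :: Double]]         -- [1.0,2.0,4.0]
  print [p | DecideOrd p <- genOrdsAvg avgF [2, 1, 4 :: Double]] -- [1.5,3.0]
```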
genPair :: DeciderGenerator attra decidera -> DeciderGenerator attrb deciderb -> DeciderGenerator (attra, attrb) (Either decidera deciderb)
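The behaviour of genPair can be sketched under the assumption that DeciderGenerator attr decider is a synonym for [attr] -> [decider] (the synonym's definition is not shown in this documentation):

```haskell
-- Assumption: a DeciderGenerator maps observed attribute values
-- to candidate deciders.
type DeciderGenerator attr decider = [attr] -> [decider]

-- Sketch of genPair: run each generator on its own component of
-- the pair and tag the resulting deciders with Left/Right.
genPair :: DeciderGenerator attra decidera
        -> DeciderGenerator attrb deciderb
        -> DeciderGenerator (attra, attrb) (Either decidera deciderb)
genPair ga gb xs = map Left (ga (map fst xs)) ++ map Right (gb (map snd xs))

main :: IO ()
main = print (genPair id id [(1 :: Int, 'a'), (2, 'b')])
-- [Left 1,Left 2,Right 'a',Right 'b']
```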
avgF :: Fractional a => a -> a -> a
avgF a b = (a + b) / 2; to be used with genOrdsAvg.