|Function:||Decision tree generation|
|Input Type:||csv, discrete or continuous|
|Input from Agent:||Dizzy et al., Ultimate Discretizer, Data Selection Agent|
|Output Type:||Classification advice as xml (description of the n best trees), dot (Graphviz, to render a tree as PDF), xls (confusion matrix, boundaries file, train and test classification)|
|Output to Agent:||Advice, Ceres, Juno|
Moku is a decision tree-building algorithm.
To build decision trees with Moku, a historical dataset is required, containing input variables and one output variable. Typically, 70% of the historical data is used to create (train) the trees and the remaining 30% to test or validate them. The output variable must be categorical, whereas continuous input variables may be categorized (discretized) along the way.
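The 70/30 split described above can be sketched as follows. This is only an illustration in plain Python, not Moku's own code: the function name `split_train_validation`, the `seed` parameter and the example rows are assumptions made for this sketch.

```python
import random

def split_train_validation(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into a training and a validation part.

    Hypothetical helper: Moku's actual splitting procedure is not documented here.
    """
    shuffled = list(rows)                      # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)      # seeded shuffle for reproducibility
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Made-up historical cases with one categorical output variable ("label").
rows = [{"id": i, "label": "yes" if i % 2 else "no"} for i in range(10)]
train, validation = split_train_validation(rows)
print(len(train), len(validation))
```

Shuffling before cutting avoids a split that merely follows the order in which the historical cases were recorded.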
Tree performance is calculated from user-adjustable weighted parameters, including the classification accuracy on the training set (the 70%) and the validation set (the 30%), the average depth of the tree, the number of leaves (final decision nodes, without children) and the average predictability of all leaves. Moku generates x trees (typically 1000) but outputs only the best y trees (typically 10), both as machine-readable XML and as human-readable graphs (based on Graphviz's dot language).
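One way to picture the weighted performance measure and the selection of the best y out of x trees is a simple weighted sum, sketched below. The weights, the linear form of the score, the `TreeStats` fields and the example numbers are all assumptions for illustration; Moku's actual formula and defaults may differ.

```python
from dataclasses import dataclass

@dataclass
class TreeStats:
    name: str
    train_accuracy: float               # fraction correct on the training set
    validation_accuracy: float          # fraction correct on the validation set
    average_depth: float
    leaf_count: int
    average_leaf_predictability: float  # between 0 and 1

# Hypothetical weights: negative values penalise deep or bushy trees.
WEIGHTS = {
    "train_accuracy": 1.0,
    "validation_accuracy": 2.0,
    "average_depth": -0.05,
    "leaf_count": -0.01,
    "average_leaf_predictability": 0.5,
}

def score(tree, weights=WEIGHTS):
    """Weighted sum of the tree's parameters (assumed linear form)."""
    return sum(w * getattr(tree, attr) for attr, w in weights.items())

def best_trees(trees, keep=10, weights=WEIGHTS):
    """Return the `keep` highest-scoring trees, best first."""
    return sorted(trees, key=lambda t: score(t, weights), reverse=True)[:keep]

# Made-up candidates standing in for the x generated trees.
candidates = [
    TreeStats("a", 0.95, 0.80, 6.0, 40, 0.85),
    TreeStats("b", 0.90, 0.88, 4.0, 15, 0.90),
    TreeStats("c", 0.99, 0.70, 9.0, 80, 0.80),
]
top = best_trees(candidates, keep=2)
print([t.name for t in top])
```

With these example weights, tree "c" scores lowest despite its high training accuracy, because its weak validation accuracy and large size count against it.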
Further tree information supplied includes the confusion matrix (with sensitivity and specificity measures) and the classification of all train and test cases.
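For a two-class output variable, sensitivity and specificity follow directly from the four cells of the confusion matrix. The sketch below uses the standard definitions; the function names and the example labels are made up for illustration and do not reflect Moku's xls layout.

```python
def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one chosen positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1   # predicted positive, actually positive
            else:
                fp += 1   # predicted positive, actually negative
        else:
            if a == positive:
                fn += 1   # predicted negative, actually positive
            else:
                tn += 1   # predicted negative, actually negative
    return tp, fp, tn, fn

def sensitivity(tp, fn):
    """True positive rate: share of actual positives that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def specificity(tn, fp):
    """True negative rate: share of actual negatives that were found."""
    return tn / (tn + fp) if (tn + fp) else 0.0

# Made-up test cases: actual class versus the class a tree predicted.
actual    = ["yes", "yes", "yes", "no", "no",  "no", "no", "yes"]
predicted = ["yes", "no",  "yes", "no", "yes", "no", "no", "yes"]

tp, fp, tn, fn = confusion_counts(actual, predicted, positive="yes")
print(tp, fp, tn, fn)
print(sensitivity(tp, fn), specificity(tn, fp))
```

For an output variable with more than two categories, the same counts can be computed once per category by treating that category as the positive class.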