A few weeks ago I programmed in Mathematica the set-up of the waveform recognition problem described in Chapter 2, Section 2.6.2 of the book “Classification And Regression Trees” by Breiman et al. The problem formulation and the classification experiments are described in detail in this document: Waveform recognition with decision trees. The rest of this post is a short introduction to the problem.
We have three waveforms h1, h2, h3 that are piecewise linear functions shown on this plot:
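The post does not give formulas for the three waveforms. Here is a minimal Python sketch (not the Mathematica code behind this post), assuming the definitions from the book: h1 is a triangle of height 6 centered at position 11, and h2 and h3 are the same triangle centered at positions 15 and 7, all sampled at positions 1 to 21. The names `POSITIONS` and `WAVEFORMS` are made up for this illustration.

```python
def h1(m):
    """Triangle of height 6 centered at m = 11 (assumed book definition)."""
    return max(6 - abs(m - 11), 0)


def h2(m):
    """h1 shifted right by 4, so the peak is at m = 15."""
    return h1(m - 4)


def h3(m):
    """h1 shifted left by 4, so the peak is at m = 7."""
    return h1(m + 4)


# The 21 sampling positions correspond to the 21 columns of D.
POSITIONS = range(1, 22)
WAVEFORMS = {
    "h1": [h1(m) for m in POSITIONS],
    "h2": [h2(m) for m in POSITIONS],
    "h3": [h3(m) for m in POSITIONS],
}

for name, values in WAVEFORMS.items():
    print(name, values)
```

Each waveform is zero outside an interval of 11 positions, and any two of them overlap, which is part of what makes the mixtures hard to tell apart once noise is added.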
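We have a data array D with n rows and 21 columns. The rows of D are linear combinations of the form
v = u h_i + (1 − u) h_j + ξ,
in which h_i and h_j are two different base waveforms, u is a random number drawn uniformly from [0, 1], and the last term ξ is a noise vector whose components are drawn from the normal distribution with mean 0 and standard deviation 1.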
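To make the construction of D concrete, here is a minimal Python sketch (again, not the Mathematica code behind this post). It assumes the set-up from the book: the mixing weight u is uniform on [0, 1], and the three classes correspond to the waveform pairs (h1, h2), (h1, h3) and (h2, h3). The names `BASE`, `CLASSES`, `make_row` and `make_data` are made up for this illustration, and drawing the class labels uniformly at random is also an assumption.

```python
import random


def h1(m):
    # Triangle of height 6 centered at m = 11 (assumed book definition).
    return max(6 - abs(m - 11), 0)


# Base waveforms sampled at positions 1..21; h2 and h3 are shifted copies of h1.
BASE = {
    1: [h1(m) for m in range(1, 22)],
    2: [h1(m - 4) for m in range(1, 22)],
    3: [h1(m + 4) for m in range(1, 22)],
}

# Class label -> pair of base waveforms that are mixed (assumed class definitions).
CLASSES = {1: (1, 2), 2: (1, 3), 3: (2, 3)}


def make_row(label, rng):
    """One row: u*h_i + (1 - u)*h_j plus independent N(0, 1) noise per column."""
    i, j = CLASSES[label]
    u = rng.random()
    return [u * a + (1 - u) * b + rng.gauss(0, 1)
            for a, b in zip(BASE[i], BASE[j])]


def make_data(n, seed=0):
    """Return n rows with 21 columns each and the list of their class labels."""
    rng = random.Random(seed)
    labels = [rng.choice([1, 2, 3]) for _ in range(n)]
    rows = [make_row(label, rng) for label in labels]
    return rows, labels


D, labels = make_data(300)
print(len(D), len(D[0]))  # 300 21
```

Fixing the seed makes the generated array reproducible, which helps when comparing classifiers on the same data.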
This figure shows how the rows of D look and how they can be interpreted:
The blue points represent the vectors generated with the formula above. The dashed red lines show the corresponding “clean” waves, with the noise vector ξ removed. The plot labels give the class of each waveform combination.
The problem is:
Given D and a vector v generated with the equation above, we want to determine which base waveforms were used to generate v.
This is a classification problem, and to solve it we can construct classifiers using decision trees. Here is a short decision tree made over 300 rows of D:
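To illustrate how such a tree picks its tests, here is a minimal Python sketch of a single split search on waveform data. It is not the implementation of the package used in this post: it assumes the Gini impurity as the split criterion, thresholds placed halfway between consecutive column values, and the data set-up from the book (u uniform on [0, 1], classes given by the pairs (h1, h2), (h1, h3), (h2, h3)). All function names are made up for this illustration. A decision tree is grown by applying a step like this recursively to the two resulting subsets.

```python
import random
from collections import Counter


def h1(m):
    # Triangle of height 6 centered at m = 11 (assumed book definition).
    return max(6 - abs(m - 11), 0)


BASE = [[h1(m) for m in range(1, 22)],       # h1
        [h1(m - 4) for m in range(1, 22)],   # h2
        [h1(m + 4) for m in range(1, 22)]]   # h3
PAIRS = {1: (0, 1), 2: (0, 2), 3: (1, 2)}    # class label -> mixed waveforms


def make_data(n, seed=0):
    """Generate n noisy waveform mixtures and their class labels."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        label = rng.choice([1, 2, 3])
        i, j = PAIRS[label]
        u = rng.random()
        rows.append([u * a + (1 - u) * b + rng.gauss(0, 1)
                     for a, b in zip(BASE[i], BASE[j])])
        labels.append(label)
    return rows, labels


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (weighted Gini, column, threshold) of the best test column <= threshold."""
    n = len(rows)
    best = (gini(labels), None, None)
    for col in range(len(rows[0])):
        values = sorted(set(row[col] for row in rows))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[col] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[col] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[0]:
                best = (score, col, threshold)
    return best


rows, labels = make_data(300)
score, col, threshold = best_split(rows, labels)
print("split on column", col, "at", round(threshold, 3), "weighted Gini", round(score, 3))
```

The exhaustive search over all columns and thresholds is quadratic in the number of rows, which is fine for 300 rows but would need sorting-based bookkeeping for larger arrays.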
As mentioned above, a more detailed exposition is given here: Waveform recognition with decision trees. In that document the classification problem is solved using both (i) decision trees with different tuning parameters, and (ii) random forests specific to the problem. With the random forests it is possible to attain 78-85% successful recognition. The experiments in the document also illustrate how to use the functions of the decision trees and forests package of the MathematicaForPrediction project at GitHub.