I made the following enhancements to the function MosaicPlot, which I described (and announced the implementation of) in my previous blog post:
1. Tooltips with precise contingency statistics.
2. If the last data column is numerical, then MosaicPlot can use it as pre-computed contingency statistics.
3. Coloring of the rectangles according to a list of index->color rules.
The document “Mosaic plots for data visualization”, hosted at MathematicaForPrediction at GitHub, combines the information of this blog post and the previous one. The document also has Mathematica code examples of usage and a description of MosaicPlot's options.
Tooltips with precise contingency statistics
I already announced the tooltips functionality in my previous blog post: when the mouse hovers over a rectangle, MosaicPlot, using Tooltip, shows a table with the exact co-occurrence (contingency) values. Here is an example:
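To make the idea concrete, here is a minimal sketch of how exact co-occurrence values can be computed with Tally and attached to rectangles with Tooltip. It is not MosaicPlot's implementation: the records are hypothetical, and the rectangle layout (one bar per value combination, with height equal to the count) is an arbitrary choice made only for illustration.

```mathematica
(* Hypothetical categorical records: {hair color, eye color} *)
records = {{"Blond", "Blue"}, {"Blond", "Blue"}, {"Black", "Brown"}, {"Blond", "Brown"}};

(* Exact co-occurrence (contingency) counts for each distinct combination *)
counts = Tally[records];

(* One rectangle per combination; hovering over it shows the exact count.
   This bar layout is illustrative only, not the mosaic layout. *)
Graphics[
 MapIndexed[
  {Hue[First[#2]/Length[counts]],
    Tooltip[
     Rectangle[{First[#2] - 1, 0}, {First[#2], Last[#1]}],
     Row[{First[#1], ": ", Last[#1]}]]} &,
  counts]]
```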
Visualizing categorical columns + a numerical column
If the last data column is numerical, then MosaicPlot can use it as pre-computed contingency statistics. This functionality is specified with the option “ExpandLastColumn”->True.
In order to explain the functionality we are going to use the following interpretation. If the last column of the data is numerical, then we can treat the data as a contracted version of a longer list of records made only of the categorical columns. For example, consider the following table with observations of people’s hair and eye color:
The table above can be considered as a contracted version of this table:
Setting the option “ExpandLastColumn” to True gives a mosaic plot corresponding to the latter, observations-expanded table:
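The contraction interpretation above can be sketched in a few lines. The data below is hypothetical, and the expansion is written only to illustrate the interpretation; it is not a claim about how MosaicPlot handles the option internally. The final call is commented out because it assumes that the MosaicPlot package is loaded and that the data is passed as the first argument.

```mathematica
(* Hypothetical contracted data: {hair, eyes, number of observations} *)
contracted = {{"Blond", "Blue", 3}, {"Black", "Brown", 2}, {"Red", "Green", 1}};

(* Repeat each categorical record as many times as its count says *)
expanded = Flatten[ConstantArray[Most[#], Last[#]] & /@ contracted, 1];

Length[expanded]
(* 6 *)

(* Tallying the expanded records recovers the contracted table *)
(Append @@@ Tally[expanded]) === contracted
(* True *)

(* Assumed usage, with the package loaded:
   MosaicPlot[contracted, "ExpandLastColumn" -> True] *)
```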
The last data column (which is numerical) does not need to be made of integers:
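With non-integer values the "repeat each record" picture no longer applies literally, but the last column can still be read as a weight for each combination of categorical values. The sketch below uses hypothetical data and assumes that weights of repeated combinations are simply summed; it shows the interpretation, not MosaicPlot's actual code.

```mathematica
(* Hypothetical data with a real-valued last column *)
weighted = {{"Blond", "Blue", 1.5}, {"Blond", "Blue", 0.5}, {"Black", "Brown", 2.25}};

(* Sum the weights for each combination of categorical values *)
totals = GroupBy[weighted, Most -> Last, Total]
(* <|{"Blond", "Blue"} -> 2., {"Black", "Brown"} -> 2.25|> *)

(* Share of the total weight for each combination *)
fractions = Map[#/Total[totals] &, totals]
```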
Rectangle coloring
The rectangles can be colored using the option ColorRules, which specifies how the colors of the rectangles are determined from the indices of the data columns.
More precisely, the value of the option ColorRules should be a list of rules, {i1->c1,i2->c2,…}, matching the form
{(_Integer->(_RGBColor|_GrayLevel))..} .
If coloring is specified for only one column index, the value of ColorRules can be of the form
{_Integer->{(_RGBColor|_GrayLevel)..}} .
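The two forms above can be checked directly with MatchQ. The helper name colorRulesQ is hypothetical (it is not part of MosaicPlot); it only tests a value against the two patterns quoted above.

```mathematica
(* Hypothetical helper: True if spec matches either of the two forms above *)
colorRulesQ[spec_] :=
  MatchQ[spec, {(_Integer -> (_RGBColor | _GrayLevel)) ..}] ||
   MatchQ[spec, {_Integer -> {(_RGBColor | _GrayLevel) ..}}];

(* Named colors such as Red evaluate to RGBColor expressions *)
colorRulesQ[{1 -> Red, 2 -> GrayLevel[0.5]}]
(* True *)

colorRulesQ[{1 -> {Red, Blue}}]
(* True *)

colorRulesQ[{"hair" -> Red}]
(* False: the left-hand side must be an integer column index *)
```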
The colors are used with Blend in order to color the rectangles according to the order of the unique values of the specified data columns.
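As an illustration of blending by order, the sketch below assigns a blended color to each unique value of a single column. The column values and the two-color palette are hypothetical, and taking the unique values in order of first appearance (DeleteDuplicates) is an assumption; MosaicPlot may order them differently.

```mathematica
(* Hypothetical column values; unique values in order of first appearance *)
values = DeleteDuplicates[{"Blond", "Black", "Blond", "Red", "Brown"}];
n = Length[values];

(* Colors to blend between, as in a rule of the form 1 -> {Blue, Yellow} *)
palette = {Blue, Yellow};

(* The k-th unique value gets the blend at position (k - 1)/(n - 1) in [0, 1] *)
colors = Table[Blend[palette, If[n > 1, (i - 1)/(n - 1), 0]], {i, n}];

Thread[values -> colors]
```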
The default value for ColorRules is Automatic. When Automatic is given to ColorRules, MosaicPlot finds the data column with the largest number of unique values and colors them according to their order using ColorData[7,"ColorList"].
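The column selection described for Automatic can be sketched as follows. The data is hypothetical, and how ties between columns are resolved is not stated above, so the sketch simply takes whatever Ordering returns.

```mathematica
(* Hypothetical records: {hair, eyes, sex} *)
data = {{"Blond", "Blue", "Male"}, {"Black", "Brown", "Female"},
   {"Red", "Blue", "Female"}, {"Blond", "Brown", "Male"}};

(* Number of unique values in each column *)
uniqueCounts = Length /@ DeleteDuplicates /@ Transpose[data]
(* {3, 2, 2} *)

(* Index of the column with the most unique values *)
columnIndex = First[Ordering[uniqueCounts, -1]]
(* 1 *)

(* The color list named above, from which the rectangle colors are taken *)
colors = ColorData[7, "ColorList"];
```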
The grid of plots below shows mosaic plots of the same data with different values for the option ColorRules (given as plot labels).