Introduction
This blog post was mostly made as a response to a comment of my previous blog post “Directional quantile envelopes”, [1]:
This looks extremely useful and elegant – any plans on generalizing to 3 (or more) dimensions?
–Jan Liphardt
Since I did say in the previous blog post that the algorithm can be easily extended to 3D and higher dimensions, I decided to illustrate that with an example. The 3D implementation is really easy using Mathematica’s geometric computation functions (i) for derivation of rotation matrices and (ii) for regions manipulation. (The latter are new in version 10.)
The algorithm shown below is a slightly modified version of the algorithm in “Directional quantile envelopes”. Instead of the step that does the intersection of directional quantile lines with Reduce, we can use ImplicitRegion to specify the region enclosed by the directional quantile planes. We can use other built-in functions for region predicates and manipulation to derive, visualize, or utilize the directional quantile regions found by the algorithm. (All these are exemplified below.)
Data generation
Let us first generate some data.
npoints = 10000;
data = {RandomReal[{0, 2 Pi}, npoints],
RandomReal[{-Pi, Pi}, npoints],
RandomVariate[PoissonDistribution[4.8], npoints]};
data = MapThread[#3*{Cos[#1] Cos[#2], Sin[#1] Cos[#2], Sin[#2]} &, data];
Let us plot the data
Block[{qs = 12},
qs = Map[Quantile[#, Range[0, 1, 1/(qs - 1)]] &, Transpose[data]];
ListPointPlot3D[data, PlotRange -> All, PlotTheme -> "Detailed",
FaceGrids -> {{{0, 0, -1}, Most[qs]}, {{0, 1, 0},
qs[[{1, 3}]]}, {{-1, 0, 0}, Rest[qs]}}, ImageSize -> 700]
]
On the plot the grid lines are derived from the quantiles of x, y and z coordinates of the data set.
Algorithm application step-by-step
1. Standardize the data
1.1. This step is generally not needed and in this implementation would complicate things.
2. Create a set of uniform angles. (Here we generate vectors that define 3D directions.)
nqs = 20;
dirs = N@Flatten[
Table[{Cos[\[Theta]] Cos[\[Phi]], Sin[\[Theta]] Cos[\[Phi]],
Sin[\[Phi]]}, {\[Theta], 2 \[Pi]/(10 nqs), 2 \[Pi],
2 \[Pi]/nqs}, {\[Phi], -\[Pi], \[Pi], 2 \[Pi]/nqs}], 1];
Graphics3D[Map[Arrow[{{0, 0, 0}, #}] &, dirs]]
3. Rotate the data for each direction vector, a ∈ A. (Here we just generate the rotation matrices.)
In[672]:= rmats = RotationMatrix[{{1, 0, 0}, #}] & /@ dirs;
Length[rmats]
Out[673]= 420
4. Find the q-quantile of the z-coordinates of the data rotated at direction a. Denote that point with Z[a,q].
In[674]:= qs = {0.95};
qDirPoints =
Flatten[Map[Function[{m}, Quantile[(m.Transpose[data])[[3]], qs]], rmats]];
5. Find the points z[a,q] corresponding to Z[a,q] in the original coordinate system.
5.1. In this implementation this step is redundant, see the next step.
6. Find the quantile planes intersections
6.1. Instead of doing this step we simply use ImplicitRegion.
In[676]:= qRegion = ImplicitRegion[
MapThread[(#1.{x, y, z})[[3]] <= #2 &, {rmats, qDirPoints}], {x, y, z}];
In[677]:= Shallow[qRegion]
Out[677]//Shallow= ImplicitRegion[
LessEqual[<>] && LessEqual[<>] && LessEqual[<>] &&
LessEqual[<>] && LessEqual[<>] && LessEqual[<>] &&
LessEqual[<>] && LessEqual[<>] && LessEqual[<>] &&
LessEqual[<>] && <>, {x, y, z}]
Wrapping it in a function
Here is a definition of a function that wraps all of the algorithms steps in the previous section.
Clear[QuantileEnvelopeRegion]
QuantileEnvelopeRegion[points_?MatrixQ, quantile_?NumberQ, numberOfDirections_Integer] :=
Block[{nd = numberOfDirections, dirs, rmats, qDirPoints, qRegion},
dirs =
N@Flatten[
Table[{Cos[\[Theta]] Cos[\[Phi]], Sin[\[Theta]] Cos[\[Phi]],
Sin[\[Phi]]}, {\[Theta], 2 \[Pi]/(10 nd), 2 \[Pi],
2 \[Pi]/nd}, {\[Phi], -\[Pi], \[Pi], 2 \[Pi]/nd}], 1];
rmats = RotationMatrix[{{1, 0, 0}, #}] & /@ dirs;
qDirPoints =
Flatten[Map[
Function[{m}, Quantile[(m.Transpose[points])[[3]], quantile]],
rmats]];
qRegion =
ImplicitRegion[
MapThread[(#1.{x, y, z})[[3]] <= #2 &, {rmats, qDirPoints}], {x, y,
z}];
qRegion
] /; Dimensions[points][[2]] == 3 && 0 < quantile <= 1;
Visualizing
From the implicit region specification we can generate an approximation of the boundary surface using the built-in function BoundaryDiscretizeRegion.
qRegion = QuantileEnvelopeRegion[data, 0.95, 20];
qRegion2 = QuantileEnvelopeRegion[data, 0.8, 20];
qSurface = BoundaryDiscretizeRegion[qRegion];
qSurface2 = BoundaryDiscretizeRegion[qRegion2];
Now we can visualize the quantile surface together with the data points:
Block[{c = 3, opts, pntsgr},
opts = {PlotRange -> {{-c, c}, {0, c}, {-c, c}}, Boxed -> True, Axes -> True,
ImageSize -> {Automatic, 500}};
pntsgr = Graphics3D[Point[data]];
Grid[{{Show[{qSurface, pntsgr}, opts], Show[{qSurface2, pntsgr}, opts]}}]
]
Check
Now we can answer the question how many data points are inside the quantile directions regions. Again, this is easy with the built-in region functionalities of Mathematica version 10.
In[656]:= inPoints = Select[data, # \[Element] qRegion &];
Length[inPoints]/Length[data] // N
Out[657]= 0.5974
In[700]:= inPoints = Select[data, # \[Element] qRegion2 &];
Length[inPoints]/Length[data] // N
Out[701]= 0.1705
(In these code lines I kept “[\Element]” instead of replacing it with “∈” in order the lines to be copy and pasted. )
As we can see the algorithm would produce a region Subscript[R, q] which contains a much smaller number of points than the requested quantile, q. This property of the algorithm is discussed in [2]. (In the two dimensional space this property is less pronounced.)
Obviously, though, we would expect the bias of the algorithm to be monotonic with respect to the quantiles requested. Given a set of points P and quantiles u and v, 0<u<v<=1, for the respectively produced by the algorithm regions Ru and Rv we have |P ∩ Ru| < |P ∩ Rv| .
References
[1] Anton Antonov, Directional quantile envelopes, “Mathematica for prediction algorithms” blog at WordPress.com, 11/3/2014 .
[2] Linglong Kong, Ivan Mizera, “Quantile tomography: using quantiles with multivariate data”, 2013, arXiv:0805.0056v2 [stat.ME] URL: http://arxiv.org/abs/0805.0056 .
Pingback: Directional Quantile Envelopes – making sense of 2D and 3D point clouds | Biology, Data, and Devices
Pingback: Finding outliers in 2D and 3D numerical data | Mathematica for prediction algorithms
Pingback: Biology, Data, and Devices · Directional Quantile Envelopes – making sense of 2D and 3D point clouds