Quantile regression with B-splines

Last weekend I made two implementations of Quantile Regression (QR) calculation with B-spline bases. The first implementation is based on the Linear Programming (LP) formulation of the quantile minimization problem. The second implementation is a direct translation of the non-LP minimization formulation. (Mathematica’s functions LinearProgramming and Minimize are used respectively.)

The implementations are in the package QuantileRegression.m provided by the project MathematicaForPrediction at GitHub.

Because of the signature required by the QR implementation with B-splines and the returned results, I had to rename the previous QR implementation to QuantileRegressionFit and I named the new one (with the B-splines) QuantileRegression. I think the new names reflect better what the functions are doing.

Right now, to me the most interesting application of QR is detection of outliers in time series and reconstruction of conditional density functions. Below is a demonstration of using QR for outlier detection.

Consider this data:
Atlanta temperature 2006-2013 changed with y = y+Sqrt[x]

which was generated with the code:

tempData = WeatherData["Atlanta", "Temperature", {{2006, 1, 1}, {2013, 12, 31}, "Day"}];
tempPData = tempData;
tempPData[[All, 1]] = DateDifference[tempData[[1, 1]], #, "Day"] & /@ tempData[[All, 1]];
tempPData[[All, 1]] = tempPData[[All, 1, 1]];
tempPData[[All, 2]] = Sqrt[tempPData[[All, 1]]] + tempPData[[All, 2]];

(I took the temperature in Atlanta, GA, USA for the last 7 years and modified the data with the formula Yi = sqrt(Xi) + Yi .)

Here is a plot with five regression quantiles calculated using B-splines for the quantiles {0.02, 0.2, 0.4, 0.6, 0.8, 0.98}:
Regression quantiles for Atlanta temperature 2006-2013 changed with y = y+Sqrt[x]

Here is another plot with the 0.02 and 0.98 regression quantiles outlining the data:
Outline with regression quantiles for Atlanta temperature 2006-2013 changed with y = y+Sqrt[x]

The regression quantiles were computed with the command:

qFuncs = QuantileRegression[tempPData, 44, {0.02, 0.2, 0.4, 0.6, 0.8, 0.98}]

Note that in order to get satisfactory results, the only tuning value I had to choose is for the second parameter, which in this case specifies the number of B-spline basis knots. The second parameter can also be a list of knots. The option InterpolationOrder specifies the splines order. (By default InterpolationOrder->3.)

Update, 2015.07.04

This blog post was written with an older version of Mathematica, version 9. (The current is 10.1.)  Mathematica 10 uses different format for the WeatherData results — TimeSeries object are returned, which use QuantityArray and/or Quantity around the numerical values like temperature, wind speed, etc. 

Below is given code that runs with Mathematica 10.1. (The only change is the second line of the data loading part.)

1. Data load:

tempData = WeatherData["Atlanta", "Temperature", {{2006, 1, 1}, {2013, 12, 31}, "Day"}];
tempData = Transpose[{Normal[tempData["Dates"]], Normal[tempData["Values"]] /. Quantity[q_, _] :> q}];
tempPData = tempData;
tempPData[[All, 1]] = DateDifference[tempData[[1, 1]], #, "Day"] & /@ tempData[[All, 1]];
tempPData[[All, 1]] = tempPData[[All, 1, 1]];
tempPData[[All, 2]] = Sqrt[tempPData[[All, 1]]] + tempPData[[All, 2]];

2. Regression quantiles computation:

qFuncs = QuantileRegression[tempPData, 44, {0.02, 0.2, 0.4, 0.6, 0.8, 0.98}];

3. Plots:

gr1 = ListPlot[tempPData];
gr2 = Plot[
Evaluate[Through[qFuncs[x]]], {x, Min[tempPData[[All, 1]]],
Max[tempPData[[All, 1]]]}, PlotPoints -> 1200,
PerformanceGoal -> "Speed"];
jan1Pos = Position[DateList /@ tempData[[All, 1]], {_, 1, 1, ___}][[All, 1]];
Show[{gr1, gr2}, PlotRange -> All, Axes -> False, Frame -> True
, GridLines -> {jan1Pos, FindDivisions[{##}, 10] &},
GridLinesStyle -> Directive[GrayLevel[0.8], Dashed],
AspectRatio -> 1/3]

Advertisements

6 thoughts on “Quantile regression with B-splines

  1. Pingback: Find Fit for Non-linear data | Mathematica for prediction algorithms

  2. Pingback: Directional quantile envelopes | Mathematica for prediction algorithms

  3. Pingback: Implementation of smoothing splines function | DL-UAT

  4. I get an error at the line:
    tempPData[[All, 1]] = DateDifference[tempData[[1, 1]], #, “Day”] & /@ tempData[[All, 1]];

    Here’s the error message:
    Part::partd: Part specification TemporalData[TimeSeries,{{{6.28 \[Degree]C,15.11 \[Degree]C,12.06 \[Degree]C,8.5 \[Degree]C,8.83 \[Degree]C,4.17 \[Degree]C,2.39 \[Degree]C,5.67 \[Degree]C,15.5 \[Degree]C,11.67 \[Degree]C,14.5 \[Degree]C,10.44 \[Degree]C,10.22 \[Degree]C,3.44 \[Degree]C,3.33 \[Degree]C,9.33 \[Degree]C,12.94 \[Degree]C,5.44 \[Degree]C,5.06 \[Degree]C,7.94 \[Degree]C,14.44 \[Degree]C,13.56 \[Degree]C,8.56 \[Degree]C,11.28 \[Degree]C,8.44 \[Degree]C,5.11 \[Degree]C,6.22 \[Degree]C,8.78 \[Degree]C,14.11 \[Degree]C,9.22 \[Degree]C,7.72 \[Degree]C,6.61 \[Degree]C,7.33 \[Degree]C,12.67 \[Degree]C,10.61 \[Degree]C,2.72 \[Degree]C,2.11 \[Degree]C,4.06 \[Degree]C,2.94 \[Degree]C,4.94 \[Degree]C,2.44 \[Degree]C,5.5 \[Degree]C,1.06 \[Degree]C,0.28 \[Degree]C,3 \[Degree]C,8.5 \[Degree]C,12.56 \[Degree]C,12.06 \[Degree]C,4.78 \[Degree]C,-0.22 \[Degree]C,<>}},<>,{ValueDimensions->1,ResamplingMethod->{Interpolation,InterpolationOrder->1}}},True,10.][[All,1]] is longer than depth of object. >>

    • You get these errors because I wrote this blog post with an older version of Mathematica, version 9. (The current is 10.1.)  Mathematica 10 uses different format for the results of WeatherData – TimeSeries objects are returned, which use QuantityArray and/or Quantity around the numerical values like temperature, wind speed, etc. 

      I am going to update the blog post with code that runs on Mathematica 10.1.

      Thank you very much for letting me know this problem!

  5. Pingback: Finding local extrema in noisy data using Quantile Regression | Mathematica for prediction algorithms

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s