Data Smoothing - Maple Help

Data Smoothing

The Statistics package provides several functions for data smoothing, the process of extracting identifiable patterns from data while suppressing noise. This functionality includes algorithms that produce smoothed data (MovingAverage, MovingStatistic, ExponentialSmoothing) and algorithms that produce a curve estimating the distribution of the population (kernel density estimation).

1 Data Filters

The Statistics package includes several data filters for smoothing otherwise rough data, including moving average, moving median, moving statistic, a general linear filter, exponential smoothing, and weighted moving average.

1.1 Stock Prices

This example demonstrates the use of data filters in analyzing stock prices.

>	$restart:$ $with (Statistics):$

Consider the following function, which generates a sample stock path over N time periods. The stock has initial price S0, trend parameter r and fluctuation parameter σ.

>	$StockPath := proc (N :: posint, S0 :: realcons, r :: realcons, σ :: realcons) local h, C, R, S; h := \frac{1.}{N - 1}; C := evalf (\exp (r \cdot h - \frac{σ^{2} \cdot h}{2})); R := C \cdot \exp (σ \cdot \sqrt{h} \cdot RandomVariable (Normal (0, 1))); S := Sample (R, N + 1); S [1] := S0; return CumulativeProduct (S) end proc:$

Generate a sample stock path over 1000 time periods and plot it.

>	$S := StockPath (1000, 100., 0.15, 0.2):$ $LineChart (S, symbolsize = 4, thickness = 2)$
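
The construction used by StockPath can also be sketched outside Maple. The following Python sketch mirrors the steps of the procedure above (a constant growth factor per step, a normally distributed shock, and a cumulative product). The function name `stock_path` and the `seed` parameter are illustrative assumptions and are not part of the Statistics package.

```python
import math
import random


def stock_path(n, s0, r, sigma, seed=None):
    """Return n + 1 prices following the recipe of the StockPath procedure."""
    if n < 2:
        raise ValueError("n must be at least 2")
    rng = random.Random(seed)
    h = 1.0 / (n - 1)                            # time step
    c = math.exp(r * h - sigma ** 2 * h / 2.0)   # deterministic growth per step
    prices = [s0]                                # the first entry is the initial price
    for _ in range(n):
        shock = math.exp(sigma * math.sqrt(h) * rng.gauss(0.0, 1.0))
        prices.append(prices[-1] * c * shock)    # running (cumulative) product
    return prices


path = stock_path(1000, 100.0, 0.15, 0.2, seed=1)
print(len(path), path[0])
```

Because every factor is an exponential, the simulated prices stay positive for any choice of parameters.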

The data smoothing functions in the Statistics package provide a means to analyze the overall trend of the data while disregarding small fluctuations. Consider the moving average function, which calculates the average of the values in a window around each data point.

>	$T := MovingAverage (S, 20):$ $LineChart (T, symbolsize = 4, thickness = 2)$
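
To make the idea of a moving average concrete, here is a minimal Python sketch. It assumes a trailing window that is shortened at the start of the series; Maple's MovingAverage may position the window or treat the boundaries differently, so this illustrates the technique rather than that command. The name `moving_average` is hypothetical.

```python
def moving_average(data, window):
    """Average each point with up to window - 1 preceding points."""
    if window < 1:
        raise ValueError("window must be a positive integer")
    result = []
    running_sum = 0.0
    for i, value in enumerate(data):
        running_sum += value
        if i >= window:
            running_sum -= data[i - window]   # drop the value that left the window
        count = min(i + 1, window)            # the window is shorter near the start
        result.append(running_sum / count)
    return result


print(moving_average([1.0, 2.0, 3.0, 4.0, 5.0], 3))
```

Keeping a running sum means each output value costs a constant amount of work, regardless of the window size.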

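Exponential smoothing can also be applied. In this method each smoothed value is a weighted average of the current observation and the previous smoothed value, so the influence of older observations decays exponentially. This damps the rough edges that cyclic or irregular patterns generally cause in the data.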

>	$T := ExponentialSmoothing (S, 0.9):$ $LineChart (T, symbolsize = 4, thickness = 2)$
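
The recurrence behind simple exponential smoothing can be sketched in a few lines of Python. The sketch assumes the common convention in which the smoothing constant weights the newest observation and the first smoothed value equals the first observation; whether ExponentialSmoothing follows exactly this convention should be checked against its help page. The name `exponential_smoothing` is hypothetical.

```python
def exponential_smoothing(data, lam):
    """Simple exponential smoothing with smoothing constant lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie between 0 and 1")
    smoothed = []
    for value in data:
        if not smoothed:
            smoothed.append(float(value))     # start from the first observation
        else:
            smoothed.append(lam * value + (1.0 - lam) * smoothed[-1])
    return smoothed


print(exponential_smoothing([10.0, 20.0, 10.0, 20.0], 0.5))
```

Under this convention a constant close to 1 follows the data closely, while a constant close to 0 smooths heavily.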

1.2 Department Store Sales

This example demonstrates the use of data filters in analyzing sales at a department store.

>	$restart;$ $with (Statistics):$

Consider the following function, which randomly generates the times of N sales at a department store. The rate of sales is represented by the parameter r and the deviation in this rate by the parameter θ.

>	$SaleTimes := proc (N :: posint, r :: realcons, θ :: realcons) local R, S; R := r \cdot RandomVariable (Exponential (θ)); S := Sample (R, N); return CumulativeSum (S) end proc:$

Consider the first 100 sales with rate parameter 0.5 and deviation parameter 0.2.

>	$S := SaleTimes (100, 0.5, 0.2):$ $LineChart (S, thickness = 2, symbolsize = 4)$
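
The same construction (exponentially distributed gaps between sales, accumulated into sale times) can be sketched in Python. The sketch assumes that θ acts as the scale (mean) of the exponential distribution, which is how it is interpreted here and should be confirmed against the documentation of the Exponential distribution. The name `sale_times` and the `seed` parameter are hypothetical.

```python
import random


def sale_times(n, r, theta, seed=None):
    """Return n nondecreasing sale times built from exponential gaps."""
    rng = random.Random(seed)
    times = []
    clock = 0.0
    for _ in range(n):
        # expovariate takes a rate, so 1/theta gives a gap with mean theta
        # (assumed scale parameterization).
        gap = r * rng.expovariate(1.0 / theta)
        clock += gap                          # cumulative sum of the gaps
        times.append(clock)
    return times


times = sale_times(100, 0.5, 0.2, seed=1)
print(len(times), times[-1])
```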

The overall trend is readily apparent with the application of the moving average filter.

>	$T := MovingAverage (S, 20):$ $LineChart (T, symbolsize = 4, thickness = 2)$

2 Kernel Density Estimation

The Statistics package provides algorithms for computing, plotting and sampling from kernel density estimates. A kernel density estimate is a continuous probability distribution that approximates the distribution of the population from which a sample was drawn. It is constructed as a normalized sum of kernel functions, one centered at each data point.
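
In the usual textbook notation, which is an assumption here because the worksheet does not spell out the formula, the estimate built from data points $x_1, \dots, x_n$ with kernel $K$ and bandwidth $h > 0$ is

$$\hat{f}_h(x) = \frac{1}{n h} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$$

Each term is a copy of the kernel centered at $x_i$ and stretched by $h$. The factor $1/(n h)$ makes the sum integrate to 1 whenever $K$ itself integrates to 1.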

The following is an example of Maple's kernel density estimation routines in action.

>	$restart;$ $with (Statistics):$

Consider the following data sample, which appears to be bimodal: there are two distinct clusters of data, one in the range -1.2 to -0.8 and one in the range 0.7 to 0.9.

>	$A := Array ([- 1.18, - 1.12, - 1.06, - 1.02, - 0.84, 0.72, 0.78, 0.89]):$ $Z := Array ([0.]):$

By applying kernel density estimation, we can construct a smooth density function from the data. Since our data sample is relatively small, we can perform exact kernel density estimation. The exact method returns a probability density function that can then be evaluated at specific points.

>	$F := KernelDensity (A, bandwidth = 0.4, kernel = gaussian, method = exact):$ $evalf ([F (- 1.0), F (0.0), F (0.5), F (2.0)])$

$[0.5947413597, 0.08016057122, 0.2829169446, 0.004587682613]$

(2.1)
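
As a cross-check, the values in (2.1) agree closely with the textbook Gaussian kernel density estimate, that is, the average of normal densities with standard deviation equal to the bandwidth, one centered at each data point. The Python sketch below evaluates that formula for the same data and bandwidth. It illustrates the formula and is not a description of how KernelDensity is implemented; the name `gaussian_kde` is hypothetical.

```python
import math

DATA = [-1.18, -1.12, -1.06, -1.02, -0.84, 0.72, 0.78, 0.89]


def gaussian_kde(data, bandwidth):
    """Return the Gaussian kernel density estimate as a callable."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density


f = gaussian_kde(DATA, 0.4)
values = [f(x) for x in (-1.0, 0.0, 0.5, 2.0)]
print([round(v, 6) for v in values])
```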

We can convert the kernel density estimate to a distribution using one of the standard RandomVariable constructors.

>	$R := RandomVariable (Distribution (PDF = (x \to F (x)))):$

>	$evalf ([PDF (R, - 1.0), PDF (R, 0.0), PDF (R, 0.5), PDF (R, 2.0)])$

$[0.5947413597, 0.08016057122, 0.2829169446, 0.004587682613]$

(2.2)

>	$evalf ([CDF (R, - 1.0), CDF (R, 0.0), CDF (R, 0.5), CDF (R, 2.0)])$

$[0.3394631178, 0.6303924803, 0.7121675015, 0.9994260712]$

(2.3)
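
For a Gaussian kernel, the cumulative distribution function of the estimate has a closed form: it is the average of normal cumulative distribution functions, one per data point. The Python sketch below evaluates that closed form for the same data and bandwidth, and the results agree closely with (2.3). The function names are hypothetical, and the sketch does not describe how Maple computes CDF internally.

```python
import math

DATA = [-1.18, -1.12, -1.06, -1.02, -0.84, 0.72, 0.78, 0.89]


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gaussian_kde_cdf(data, bandwidth, x):
    """CDF of a Gaussian kernel density estimate evaluated at x."""
    return sum(normal_cdf((x - xi) / bandwidth) for xi in data) / len(data)


cdf_values = [gaussian_kde_cdf(DATA, 0.4, x) for x in (-1.0, 0.0, 0.5, 2.0)]
print([round(v, 6) for v in cdf_values])
```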

This probability density function can also be plotted, in this case alongside the cumulative distribution function.

>	$P1 := plot (PDF (R, x), x = - 2.5 .. 2.5, thickness = 3):$ $P2 := plot (CDF (R, x), x = - 2.5 .. 2.5, thickness = 3, color = blue):$ $plots [display] (P2, P1)$

With the KernelDensitySample function, new data points can be drawn quickly from the kernel density estimate of a data sample.

>	$S := KernelDensitySample (A, 100000, bandwidth = 0.4, kernel = gaussian):$ $P1 := Histogram (S, averageshifted = 1, binwidth = 0.1, range = - 2.5 .. 2.5):$ $P2 := plot (PDF (R, x), x = - 2.5 .. 2.5, thickness = 3, color = red):$ $plots [display] (P1, P2)$
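
Sampling from a Gaussian kernel density estimate needs no evaluation of the density itself: pick one of the original data points uniformly at random and add normally distributed noise whose standard deviation equals the bandwidth. The Python sketch below shows this standard technique. Whether KernelDensitySample works this way internally is not stated in this worksheet, and the name `kde_sample` and the `seed` parameter are hypothetical.

```python
import random

DATA = [-1.18, -1.12, -1.06, -1.02, -0.84, 0.72, 0.78, 0.89]


def kde_sample(data, size, bandwidth, seed=None):
    """Draw `size` values from the Gaussian kernel density estimate of `data`."""
    rng = random.Random(seed)
    # Pick a data point uniformly at random, then add Gaussian noise whose
    # standard deviation equals the bandwidth.
    return [rng.choice(data) + rng.gauss(0.0, bandwidth) for _ in range(size)]


sample = kde_sample(DATA, 20000, 0.4, seed=1)
print(sum(sample) / len(sample))
```

The mean of such a sample should lie close to the mean of the original data, since the added noise has mean zero.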

A kernel density estimate can be plotted directly using the KernelDensityPlot function. The following example shows the effect of different choices of bandwidth.

>	$P1 := KernelDensityPlot (A, bandwidth = 0.1, kernel = biweight, method = exact, color = turquoise, thickness = 2, range = - 2 .. 2):$ $P2 := KernelDensityPlot (A, bandwidth = 0.3, kernel = biweight, method = exact, color = blue, thickness = 2, range = - 2 .. 2):$ $P3 := KernelDensityPlot (A, bandwidth = 0.6, kernel = biweight, method = exact, color = navy, thickness = 2, range = - 2 .. 2):$ $plots [display] (P1, P2, P3)$

In most cases, only a few hundred samples are needed to roughly approximate the original probability distribution with a kernel density estimate.

>	$B := Sample (StudentT (2), 600):$ $P1 := Histogram (B, range = - 5 .. 5):$ $P2 := DensityPlot (StudentT (2), color = blue, thickness = 3, range = - 5 .. 5):$ $P3 := KernelDensityPlot (B, kernel = gaussian, method = piecewise, color = red, thickness = 3, range = - 5 .. 5):$ $plots [display] (P1, P2, P3)$
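
A numerical version of this comparison can be sketched in Python: draw 600 values from Student's t distribution with 2 degrees of freedom, build a Gaussian kernel density estimate, and measure the largest deviation from the exact density on a grid. The bandwidth of 0.4 is an arbitrary illustrative choice and is not the value Maple selects by default; all function names are hypothetical.

```python
import math
import random


def student_t2_sample(size, seed=None):
    """Draw from Student's t with 2 degrees of freedom as Z / sqrt(E), E ~ Exp(1)."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < size:
        u = rng.random()
        if u == 0.0:
            continue                      # avoid log(0); practically never happens
        e = -math.log(u)                  # Exp(1), i.e. chi-square(2) / 2
        draws.append(rng.gauss(0.0, 1.0) / math.sqrt(e))
    return draws


def t2_pdf(x):
    """Exact density of Student's t with 2 degrees of freedom."""
    return (2.0 + x * x) ** -1.5


def gaussian_kde(data, bandwidth):
    """Return the Gaussian kernel density estimate as a callable."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    return lambda x: norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)


sample = student_t2_sample(600, seed=1)
estimate = gaussian_kde(sample, 0.4)       # bandwidth 0.4 is an assumed, illustrative value
grid = [i / 2.0 for i in range(-10, 11)]   # -5.0, -4.5, ..., 5.0
max_error = max(abs(estimate(x) - t2_pdf(x)) for x in grid)
print(max_error)
```

The printed value is the largest absolute gap between the estimate and the exact density on the grid; it can be compared with the peak height of the exact density, which is about 0.35 at zero.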

Available Kernels

Kernel density estimation requires a kernel function: a normalized function, a copy of which is centered at each data point. Five standard kernel functions are available for kernel density estimation.
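
For orientation, the Python sketch below writes out the common textbook forms of these five kernels on a standardized variable u and checks numerically that each integrates to 1. The exact scaling Maple uses for each kernel is not given in this worksheet and may differ, so these formulas are assumptions; the helper `integrate` is hypothetical.

```python
import math

# Textbook forms of the five kernels on the standardized variable u.
KERNELS = {
    "gaussian": lambda u: math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi),
    "triangular": lambda u: max(0.0, 1.0 - abs(u)),
    "rectangular": lambda u: 0.5 if abs(u) <= 1.0 else 0.0,
    "biweight": lambda u: 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0,
    "epanechnikov": lambda u: 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0,
}


def integrate(func, lower, upper, steps=100000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))


areas = {name: integrate(kernel, -10.0, 10.0) for name, kernel in KERNELS.items()}
for name, area in areas.items():
    print(name, round(area, 4))
```

In these forms only the Gaussian kernel is positive on the whole real line; the other four vanish outside the interval from -1 to 1.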

2.1 Gaussian Kernel

The Gaussian kernel should be used with continuous data that is defined on the whole real line. It possesses the familiar bell shape and is based on the Gaussian probability density function.

>	$KernelDensityPlot (Z, kernel = gaussian, method = exact, thickness = 3);$ $KernelDensityPlot (A, kernel = gaussian, bandwidth = 0.4, method = exact, thickness = 3)$

2.2 Triangular Kernel

The triangular kernel is a piecewise linear function related to the triangular distribution. The resulting kernel density estimate is continuous but generally has visible corners.

>	$KernelDensityPlot (Z, kernel = triangular, method = exact, thickness = 3);$ $KernelDensityPlot (A, kernel = triangular, bandwidth = 0.4, method = exact, thickness = 3)$

2.3 Rectangular Kernel

The rectangular kernel is a piecewise function related to the uniform distribution. This kernel creates a kernel density estimate that resembles a staircase function.

>	$KernelDensityPlot (Z, kernel = rectangular, method = exact, thickness = 3);$ $KernelDensityPlot (A, kernel = rectangular, bandwidth = 0.4, method = exact, thickness = 3)$

2.4 Biweight Kernel

The biweight kernel is a smooth kernel that, unlike the Gaussian kernel, is defined on a finite interval. It should be used for bounded data that is smooth along the interval on which it is defined.

>	$KernelDensityPlot (Z, kernel = biweight, method = exact, thickness = 3);$ $KernelDensityPlot (A, kernel = biweight, bandwidth = 0.4, method = exact, thickness = 3)$

2.5 Epanechnikov Kernel

The Epanechnikov kernel is the standard kernel for kernel density estimation. It is optimal in the sense of minimizing the asymptotic mean integrated squared error, so it generally provides a close match to the underlying probability density function. The kernel itself is a rounded function similar to the biweight, except that it is not differentiable at the boundaries of its support.

>	$KernelDensityPlot (Z, kernel = epanechnikov, method = exact, thickness = 3);$ $KernelDensityPlot (A, kernel = epanechnikov, bandwidth = 0.4, method = exact, thickness = 3)$
