RousseeuwCrouxSn - Maple Help


Statistics[RousseeuwCrouxSn] - compute Rousseeuw and Croux' Sn


Calling Sequence

RousseeuwCrouxSn(A, ds_options)
RousseeuwCrouxSn(X, rv_options)

Parameters

A	-	data set or Matrix data set
X	-	algebraic; random variable or distribution
ds_options	-	(optional) equation(s) of the form option=value where option is one of correction, ignore, or weights; specify options for computing Rousseeuw and Croux' Sn statistic of a data set
rv_options	-	(optional) equation of the form numeric=value; specifies options for computing Rousseeuw and Croux' Sn statistic of a random variable

Description

•	The RousseeuwCrouxSn function computes a robust measure of the dispersion of the specified data set or random variable, as introduced by Rousseeuw and Croux in [2].

•	This statistic, referred to as $S_{n}$ in the remainder of this help page, is defined for a data set $A_{1}, A_{2}, ..., A_{n}$ as:

$S_{n} = LowMedian (HighMedian (|A_{i} - A_{j}|, i = 1 .. n), j = 1 .. n)$

where the $LowMedian$ of $n$ values is its $⌊\frac{n}{2} + \frac{1}{2}⌋$ th OrderStatistic and the $HighMedian$ is its $⌈\frac{n}{2} + \frac{1}{2}⌉$ th OrderStatistic. ( $HighMedian$ and $LowMedian$ are not Maple functions - they are only used here to define $S_{n}$ .)

•	$S_{n}$ is a robust statistic: it has a high breakdown point (the proportion of arbitrarily large observations it can handle before giving an arbitrarily large result). The breakdown point of $S_{n}$ is the maximum possible value, $\frac{1}{2}$ .

•	$S_{n}$ is a measure of dispersion, also called a measure of scale: if $S_{n} (X) = a$ , then for all real constants $α$ and $β$ , we have $S_{n} (α X + β) = |α| a$ .

•	The first parameter can be a data set, a distribution (see Statistics[Distribution]), a random variable, or an algebraic expression involving random variables (see Statistics[RandomVariable]). For a data set $A$ , RousseeuwCrouxSn computes $S_{n}$ as defined above. For a distribution or random variable $X$ , RousseeuwCrouxSn computes the asymptotic equivalent: the value that $S_{n}$ converges to as the size of a sample of $X$ grows.
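As an illustration only, the data-set definition above can be sketched in Python. This is not Maple code and not how the Statistics package implements the command. The helper names `low_median`, `high_median`, and `sn` are hypothetical, the evaluation is the naive quadratic one, and no correction factor, weights, or missing-data handling is applied.

```python
def low_median(values):
    """Order statistic number floor(n/2 + 1/2), counting from 1."""
    n = len(values)
    return sorted(values)[(n + 1) // 2 - 1]


def high_median(values):
    """Order statistic number ceil(n/2 + 1/2), counting from 1."""
    n = len(values)
    return sorted(values)[n // 2]


def sn(data):
    """Naive evaluation of S_n: LowMedian over j of HighMedian over i of |A_i - A_j|."""
    data = [float(a) for a in data]
    inner = [high_median([abs(a_i - a_j) for a_i in data]) for a_j in data]
    return low_median(inner)


s = [1, 5, 2, 2, 7, 4, 1, 6]
t = [1e100] * 3 + s[3:]          # three of eight values made huge
u = [1e100] * 4 + s[4:]          # half of the values made huge

print(sn(s))                          # 3.0
print(sn(t))                          # 6.0, still bounded
print(sn(u))                          # 1e+100, the estimate has broken down
print(sn([-2 * a + 3 for a in s]))    # 6.0, that is |-2| * S_n(s)
```

On the sample used in the Examples section this reproduces the values 3 and 6. Replacing half of the observations is enough to make the result arbitrarily large, consistent with a breakdown point of $\frac{1}{2}$ , and the last line checks the scale property $S_{n} (α X + β) = |α| a$ for $α = -2$ , $β = 3$ .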

Computation

•	By default, all computations involving random variables are performed symbolically (see option numeric below).

•	All computations involving data are performed in floating-point; therefore, all data provided must have type/realcons and all returned solutions are floating-point, even if the problem is specified with exact values.

•	For more information about computation in the Statistics package, see the Statistics[Computation] help page.

Data Set Options

•	The ds_options argument can contain one or more of the options shown below. More information for some options is available in the Statistics[DescriptiveStatistics] help page.

•	ignore=truefalse -- This option controls how missing data is handled by the RousseeuwCrouxSn command. Missing items are represented by undefined or Float(undefined). If ignore=false and A contains missing data, the RousseeuwCrouxSn command may return undefined. If ignore=true, all missing items in A are ignored. The default value is false.

•	weights=Vector -- Data weights. The number of elements in the weights array must be equal to the number of elements in the original data sample. By default all elements in A are assigned weight $1$ .

•	correction=samplesize or correction=none -- In [2], Rousseeuw and Croux define a correction factor $c_{n}$ for finite sample size as:

$c_{n} = \begin{cases} 0.743 & n = 2 \\ 1.851 & n = 3 \\ 0.954 & n = 4 \\ 1.351 & n = 5 \\ 0.993 & n = 6 \\ 1.198 & n = 7 \\ 1.005 & n = 8 \\ 1.131 & n = 9 \\ \frac{n}{n - 0.9} & n > 9 \text{ and } n \text{ odd} \\ 1 & n > 9 \text{ and } n \text{ even} \end{cases}$

If the option correction = samplesize is given, then this correction factor is applied before the result is returned. The default is correction = none, that is, no correction factor is applied.
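The correction factor tabulated above can be written as a small lookup. The following Python sketch is an illustration, not Maple code. The function name `sn_correction` is hypothetical, and raising an error for $n < 2$ is an assumption, since the table gives no value there.

```python
def sn_correction(n):
    """Finite-sample correction factor c_n, as tabulated on this page."""
    small = {2: 0.743, 3: 1.851, 4: 0.954, 5: 1.351,
             6: 0.993, 7: 1.198, 8: 1.005, 9: 1.131}
    if n < 2:
        # Assumption: the table starts at n = 2, so smaller n is rejected.
        raise ValueError("c_n is only tabulated for n >= 2")
    if n in small:
        return small[n]
    # For n > 9 the factor depends only on the parity of n.
    return n / (n - 0.9) if n % 2 == 1 else 1.0


# A sample of size 8 with uncorrected S_n = 3 gets the factor c_8 = 1.005.
print(sn_correction(8) * 3.0)   # approximately 3.015
```

With correction = samplesize, the uncorrected value is multiplied by $c_{n}$ ; for the eight-element sample in the Examples section this gives approximately 3.015.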

Random Variable Options

The rv_options argument can contain one or more of the options shown below. More information for some options is available in the Statistics[RandomVariables] help page.

•	numeric=truefalse -- By default, $S_{n}$ is computed using exact arithmetic. To compute $S_{n}$ numerically, specify the numeric or numeric = true option.

Examples

>	$with (Statistics) :$

Compute $S_{n}$ for a data sample.

>	$s ≔ ⟨1, 5, 2, 2, 7, 4, 1, 6⟩$

$s ≔ [\begin{array}{c} 1 \\ 5 \\ 2 \\ 2 \\ 7 \\ 4 \\ 1 \\ 6 \end{array}]$

(1)

>	$RousseeuwCrouxSn (s)$

$3.$

(2)

Employ Rousseeuw and Croux's finite sample size correction.

>	$RousseeuwCrouxSn (s,'correction = samplesize')$

$3.01500000000000$

(3)

Let's replace three of the values with very large values.

>	$t ≔ copy (s) :$

>	$t [1 .. 3] ≔ 10^{100} :$

>	$t$

$[\begin{array}{c} 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 \\ 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 \\ 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 \\ 2 \\ 7 \\ 4 \\ 1 \\ 6 \end{array}]$

(4)

>	$RousseeuwCrouxSn (t)$

$6.$

(5)

The value of $S_{n}$ stays bounded, because it has a high breakdown point.

Compute $S_{n}$ for a normal distribution.

>	$RousseeuwCrouxSn ('Normal' (3, 5),'numeric')$

$4.192525630$

(6)

The symbolic result is a rather complicated expression. It evaluates to the same floating-point number.

>	$RousseeuwCrouxSn ('Normal' (3, 5))$

$5 RootOf (\operatorname{erf} (\frac{\sqrt{2} \, \_Z}{2} + RootOf (2 \operatorname{erf} (\_Z) - 1)) + \operatorname{erf} (\frac{\sqrt{2} \, \_Z}{2} - RootOf (2 \operatorname{erf} (\_Z) - 1)) - 1)$

(7)

>	$evalf (%)$

$4.192525630$

(8)
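The RootOf expression in (7) can also be evaluated outside Maple as a cross-check. The Python sketch below is an illustration, not part of the Statistics package. It solves the inner equation $2 \operatorname{erf} (r) - 1 = 0$ and then the outer equation in $\_Z$ by plain bisection. The helper name `bisect` and the bracketing intervals are assumptions chosen for this example. One reading of the expression, based on [2], is that the inner root locates the upper quartile of the standard normal distribution and the outer root is the median absolute distance from that point.

```python
import math


def bisect(f, lo, hi, tol=1e-14):
    """Find a root of f in [lo, hi] by bisection; assumes f changes sign there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid < 0) == (f_lo < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


# Inner RootOf: r solves 2*erf(r) - 1 = 0.
r = bisect(lambda z: 2 * math.erf(z) - 1, 0.0, 5.0)


# Outer RootOf: z solves erf(sqrt(2)*z/2 + r) + erf(sqrt(2)*z/2 - r) - 1 = 0.
def outer(z):
    u = math.sqrt(2) * z / 2
    return math.erf(u + r) + math.erf(u - r) - 1


z = bisect(outer, 0.0, 10.0)

sigma = 5            # standard deviation of Normal(3, 5)
print(sigma * z)     # approximately 4.1925
```

The mean 3 does not appear in the computation, as expected for a measure of scale, and the standard deviation 5 enters only as the leading factor.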

Generate a random sample of size 1000000 from the same distribution and compute the sample's $S_{n}$ .

>	$A ≔ Sample ('Normal' (3, 5), 1000000) :$

>	$RousseeuwCrouxSn (A)$

$4.19120824538362$

(9)

Consider the following Matrix data set.

>	$M ≔ Matrix ([[3, 1130, 114694], [4, 1527, 127368], [3, 907, 88464], [2, 878, 96484], [4, 995, 128007]])$

$M ≔ [\begin{array}{c} 3 & 1130 & 114694 \\ 4 & 1527 & 127368 \\ 3 & 907 & 88464 \\ 2 & 878 & 96484 \\ 4 & 995 & 128007 \end{array}]$

(10)

We compute $S_{n}$ for each of the columns.

>	$RousseeuwCrouxSn (M)$

$[\begin{array}{c} 1. & 117. & 13313. \end{array}]$

(11)

References

[1] Stuart, Alan, and Ord, Keith. Kendall's Advanced Theory of Statistics. 6th ed. London: Edward Arnold, 1998. Vol. 1: Distribution Theory.

[2] Rousseeuw, Peter J., and Croux, Christophe. Alternatives to the Median Absolute Deviation. Journal of the American Statistical Association 88(424), 1993, pp.1273-1283.

Compatibility

•	The Statistics[RousseeuwCrouxSn] command was introduced in Maple 17.

•	For more information on Maple 17 changes, see Updates in Maple 17.
