RousseeuwCrouxQn - Maple Help

Statistics

RousseeuwCrouxQn

compute Rousseeuw and Croux' Qn

Calling Sequence

RousseeuwCrouxQn(A, ds_options)

RousseeuwCrouxQn(X, rv_options)

Parameters

A	-	data set or Matrix data set
X	-	algebraic; random variable or distribution
ds_options	-	(optional) equation(s) of the form option=value where option is one of correction, ignore, or weights; specify options for computing Rousseeuw and Croux' Qn statistic of a data set
rv_options	-	(optional) equation of the form numeric=value; specifies options for computing Rousseeuw and Croux' Qn statistic of a random variable

Description

•	The RousseeuwCrouxQn function computes a robust measure of the dispersion of the specified data set or random variable, as introduced by Rousseeuw and Croux in [2].

•	This statistic, referred to as $Q_{n}$ in the remainder of this help page, is defined for a sorted data set $A_{1} \leq A_{2} \leq \dots \leq A_{n}$ as:

$Q_{n} = OrderStatistic ([seq (seq (A_{i} - A_{j}, i = j + 1 .. n), j = 1 .. n - 1)], k)$

where $k = \binom{\lfloor \frac{n}{2} \rfloor + 1}{2}$.

•	$Q_{n}$ is a robust statistic: it has a high breakdown point (the proportion of arbitrarily large observations it can handle before giving an arbitrarily large result). The breakdown point of $Q_{n}$ is the maximum possible value, $\frac{1}{2}$ .

•	$Q_{n}$ is a measure of dispersion, also called a measure of scale: if $Q_{n} (X) = a$, then for all real constants $α$ and $β$, we have $Q_{n} (α X + β) = |α| a$.

•	The first parameter can be a data set, a distribution (see Statistics[Distribution]), a random variable, or an algebraic expression involving random variables (see Statistics[RandomVariable]). For a data set $A$, RousseeuwCrouxQn computes $Q_{n}$ as defined above. For a distribution or random variable $X$, RousseeuwCrouxQn computes the asymptotic equivalent: the value that $Q_{n}$ converges to for ever larger samples of $X$.
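The definition above can be checked directly on a small data set. The following is a minimal sketch and not the algorithm the command uses internally: QnDirect is a hypothetical helper name, and it builds all $n (n - 1) / 2$ pairwise differences, so it needs quadratic time and memory. The last line also illustrates the scale property $Q_{n} (α X + β) = |α| Q_{n} (X)$.

```maple
# Hypothetical helper: Q_n straight from the definition (exact arithmetic, O(n^2)).
QnDirect := proc(L::list)
    local A, n, d, k;
    A := sort(L);                       # A[1] <= A[2] <= ... <= A[n]
    n := nops(A);
    # all pairwise differences A[i] - A[j] with i > j, sorted ascending
    d := sort([seq(seq(A[i] - A[j], i = j + 1 .. n), j = 1 .. n - 1)]);
    k := binomial(floor(n/2) + 1, 2);   # k = 10 when n = 9
    return d[k];                        # the k-th smallest difference
end proc:

s := [1, 5, 2, 2, 7, 4, 1, 6, 9]:
QnDirect(s);                            # 2
QnDirect(map(x -> -3*x + 10, s));       # 6 = |-3| * 2
```

For the data used in the Examples section, the direct computation gives 2, which agrees with the floating-point result returned by RousseeuwCrouxQn.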

Computation

•	By default, all computations involving random variables are performed symbolically (see option numeric below).

•	All computations involving data are performed in floating-point; therefore, all data provided must have type/realcons and all returned solutions are floating-point, even if the problem is specified with exact values.

•	For more information about computation in the Statistics package, see the Statistics[Computation] help page.

Data Set Options

•	The ds_options argument can contain one or more of the options shown below. More information for some options is available in the Statistics[DescriptiveStatistics] help page.

•	ignore=truefalse -- This option controls how missing data is handled by the RousseeuwCrouxQn command. Missing items are represented by undefined or Float(undefined). If ignore=false and A contains missing data, the RousseeuwCrouxQn command may return undefined. If ignore=true, all missing items in A are ignored. The default value is false.

•	weights=Vector -- Data weights. The number of elements in the weights array must be equal to the number of elements in the original data sample. By default all elements in A are assigned weight $1$ .

•	correction=samplesize or correction=none -- In [2], Rousseeuw and Croux define a correction factor $d_{n}$ for finite sample size as:

$d_{n} = \begin{cases} 0.399 & n = 2 \\ 0.994 & n = 3 \\ 0.512 & n = 4 \\ 0.844 & n = 5 \\ 0.611 & n = 6 \\ 0.857 & n = 7 \\ 0.669 & n = 8 \\ 0.872 & n = 9 \\ \frac{n}{n + 1.4} & n > 9 \text{ and } n \text{ odd} \\ \frac{n}{n + 3.8} & n > 9 \text{ and } n \text{ even} \end{cases}$

If the option correction = samplesize is given, then this correction factor is applied before the result is returned. The default is correction = none, that is, no correction factor is applied.
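The table of factors can be written as a small procedure. This is an illustrative sketch only: the name FiniteSampleFactor is hypothetical and is not part of the Statistics package. It assumes that the corrected value is the product of the factor and the uncorrected statistic, which is consistent with results (2) and (3) in the Examples section.

```maple
# Hypothetical helper reproducing the finite-sample factor d_n tabulated above.
FiniteSampleFactor := proc(n::posint)
    local small;
    # tabulated factors for n = 2 .. 9
    small := [0.399, 0.994, 0.512, 0.844, 0.611, 0.857, 0.669, 0.872];
    if n < 2 then
        error "at least two observations are required";
    elif n <= 9 then
        return small[n - 1];
    elif type(n, odd) then
        return n/(n + 1.4);
    else
        return n/(n + 3.8);
    end if;
end proc:

FiniteSampleFactor(9) * 2;    # 0.872 * 2 = 1.744, the value in example (3)
FiniteSampleFactor(10);       # 10/13.8, approximately 0.7246
```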

Random Variable Options

The rv_options argument can contain one or more of the options shown below. More information for some options is available in the Statistics[RandomVariables] help page.

•	numeric=truefalse -- By default, $Q_{n}$ is computed using exact arithmetic. To compute $Q_{n}$ numerically, specify the numeric or numeric = true option.

Examples

>	$with (Statistics) :$

Compute $Q_{n}$ for a data sample.

>	$s ≔ ⟨1, 5, 2, 2, 7, 4, 1, 6, 9⟩$

$s ≔ [\begin{array}{c} 1 \\ 5 \\ 2 \\ 2 \\ 7 \\ 4 \\ 1 \\ 6 \\ 9 \end{array}]$

(1)

>	$RousseeuwCrouxQn (s)$

$2.$

(2)

Employ Rousseeuw and Croux's finite sample size correction.

>	$RousseeuwCrouxQn (s,'correction = samplesize')$

$1.74400000000000$

(3)

Let's replace four of the values with very large values.

>	$t ≔ copy (s) :$

>	$t [1 .. 4] ≔ 10^{100} :$

$t$

$[\begin{array}{c} 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 \\ 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 \\ 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 \\ 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 \\ 7 \\ 4 \\ 1 \\ 6 \\ 9 \end{array}]$

(4)

>	$RousseeuwCrouxQn (t)$

$3.$

(5)

The value of $Q_{n}$ stays bounded, because it has a high breakdown point.
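For contrast, a non-robust measure of scale can be evaluated on the same data. The sketch below uses Statistics[StandardDeviation] and rebuilds the data so that it is self-contained. No output is quoted for the standard deviation, but because four of the nine entries equal $10^{100}$, its value is necessarily of that order of magnitude.

```maple
with(Statistics):
# the contaminated data set t from above: four entries replaced by 10^100
t := Vector([10^100, 10^100, 10^100, 10^100, 7, 4, 1, 6, 9]):
StandardDeviation(t);     # dominated by the four huge entries
RousseeuwCrouxQn(t);      # remains small; 3. as in (5)
```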

Compute $Q_{n}$ for a normal distribution.

>	$RousseeuwCrouxQn ('Normal' (3, 5),'numeric')$

$2.25312055012086$

(6)

The symbolic result is an expression involving the inverse (see RootOf) of the error function (see erf). It evaluates to the same floating-point number.

>	$RousseeuwCrouxQn ('Normal' (3, 5))$

$10 RootOf (4 \operatorname{erf} (_Z) - 1)$

(7)

>	$evalf (%)$

$2.253120550$

(8)
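The closed form in (7) can be recovered by hand. The derivation below is a sketch that assumes the asymptotic characterization of $Q_{n}$ given in [2], which is not stated elsewhere on this page: the limit is the first quartile of $|X - X'|$, where $X'$ is an independent copy of $X$. The quartile arises because $k / \binom{n}{2} \to \frac{1}{4}$ as $n$ grows. For $X \sim Normal (μ, σ)$ with standard deviation $σ$:

```latex
\begin{aligned}
X - X' &\sim \mathrm{Normal}\left(0, \sqrt{2}\,\sigma\right), \\
P\left(|X - X'| \le q\right) &= \operatorname{erf}\left(\frac{q}{2\sigma}\right) = \frac{1}{4}, \\
q &= 2\sigma \operatorname{erf}^{-1}\left(\tfrac{1}{4}\right) \approx 2\sigma \cdot 0.2253 .
\end{aligned}
```

With $σ = 5$ this gives $q = 10 \operatorname{erf}^{-1} (\frac{1}{4}) \approx 2.2531$, which is the RootOf expression in (7) and the numeric value in (6) and (8). The location parameter $μ = 3$ drops out, as expected for a measure of scale.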

Generate a random sample of size 1000000 from the same distribution and compute the sample's $Q_{n}$ .

>	$A ≔ Sample ('Normal' (3, 5), 1000000) :$

>	$RousseeuwCrouxQn (A)$

$2.25266620862896$

(9)

Consider the following Matrix data set.

>	$M ≔ Matrix ([[3, 1130, 114694], [4, 1527, 127368], [3, 907, 88464], [2, 878, 96484], [4, 995, 128007]])$

$M ≔ [\begin{array}{c} 3 & 1130 & 114694 \\ 4 & 1527 & 127368 \\ 3 & 907 & 88464 \\ 2 & 878 & 96484 \\ 4 & 995 & 128007 \end{array}]$

(10)

We compute $Q_{n}$ for each of the columns.

>	$RousseeuwCrouxQn (M)$

$[\begin{array}{c} 1. & 117. & 12674. \end{array}]$

(11)

References

[1] Stuart, Alan, and Ord, Keith. Kendall's Advanced Theory of Statistics. 6th ed. London: Edward Arnold, 1998. Vol. 1: Distribution Theory.

[2] Rousseeuw, Peter J., and Croux, Christophe. Alternatives to the Median Absolute Deviation. Journal of the American Statistical Association 88(424), 1993, pp.1273-1283.

Compatibility

•	The Statistics[RousseeuwCrouxQn] command was introduced in Maple 18.

•	For more information on Maple 18 changes, see Updates in Maple 18.
