Descriptive Statistics, Data Summary and Related Commands
The Statistics package provides commands for computing descriptive statistics and related quantities, including location, dispersion, and shape statistics as well as moments and cumulants. The package also provides several data summary and tabulation commands. Most of these commands can handle weighted data and data with missing values. The available commands are listed below.
Available Commands
Floating Point Environment
Supplying Data
Data with Missing Values
Adding Weights to Data
Examples
Available Commands
Location Statistics
Decile            deciles
GeometricMean     geometric mean
HarmonicMean      harmonic mean
HodgesLehmann     Hodges-Lehmann statistic
MakeProcedure     generate a procedure for calculating statistical quantities
Mean              arithmetic mean
Median            median
Mode              mode
Percentile        percentiles
QuadraticMean     quadratic mean
Quantile          quantiles
Quartile          quartiles
TrimmedMean       trimmed mean
WinsorizedMean    winsorized mean
Dispersion Statistics
AbsoluteDeviation     average absolute deviation
InterquartileRange    interquartile range
MeanDeviation         average absolute deviation from the mean
MedianDeviation       median absolute deviation
Range                 range
RousseeuwCrouxQn      Rousseeuw and Croux' Qn
RousseeuwCrouxSn      Rousseeuw and Croux' Sn
StandardDeviation     standard deviation
Variance              variance
Variation             coefficient of variation
Shape Statistics
Kurtosis    kurtosis
Skewness    skewness
Moments and Cumulants
CentralMoment         central moments
Cumulant              cumulants
Moment                moments
StandardizedMoment    standardized moments
Data Summary
DataSummary         seven summary statistics
FivePointSummary    five-point summary
FrequencyTable      frequency table
Related Commands
AutoCorrelation                    autocorrelations
Correlation                        correlation/correlation matrix
Covariance                         covariance/covariance matrix
CrossCorrelation                   cross-correlations
ExpectedValue                      expected values
OrderStatistic                     order statistics
PCA, PrincipalComponentAnalysis    principal component analysis
StandardError                      standard error of the sampling distribution
Floating Point Environment
All computations involving data are performed in floating-point arithmetic. All data provided must therefore be of type realcons, and all returned results are floating-point numbers, even if the problem is specified with exact values.
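As a minimal sketch of this behavior, the following calls pass exact input to two of the commands listed above. The sample values are made up for illustration; the point is only that exact rationals and symbolic constants of type realcons are accepted and that the results come back as floats.

```maple
with(Statistics):

# Exact rational input: the result is returned as a floating-point number.
Mean([1, 2, 3/2]);

# Symbolic real constants such as Pi and sqrt(2) are of type realcons,
# so they are evaluated numerically before the statistic is computed.
Variance([1, Pi, sqrt(2)]);
```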
Supplying Data
Most of the commands above can accept one- and two-dimensional data sets. One-dimensional data sets can be supplied as a list, a Vector, a one-dimensional Array, or a DataSeries. Two-dimensional data sets can be supplied as a list of lists, a Matrix, a two-dimensional Array, or a DataFrame.
For more details on how two-dimensional data is handled, see the DataFrames in Statistics help page.
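The container types named above can be sketched as follows. The data values are illustrative only, and the Matrix call assumes that each column is treated as a separate data set, which is how recent Maple versions are understood to behave; check the help page for your version.

```maple
with(Statistics):

# The same one-dimensional data in three container types.
L  := [1.0, 2.0, 4.0, 8.0]:             # list
V  := Vector([1.0, 2.0, 4.0, 8.0]):     # Vector
Ar := Array([1.0, 2.0, 4.0, 8.0]):      # one-dimensional Array
Mean(L), Mean(V), Mean(Ar);

# Two-dimensional data as a Matrix with three observations of two variables.
# Assumption: the statistic is computed separately for each column.
M := Matrix([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]):
Mean(M);
```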
Data with Missing Values
Missing values are represented by undefined or Float(undefined). Because Float(undefined) propagates freely through most floating-point operations, most statistics computed for a data set with missing values yield undefined. The ignore option, which is available for most of the commands listed above, controls how missing data is handled. If ignore=true, all missing items in a data set are ignored. The default value of this option is false. For more details on a particular command, see the corresponding help page.
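The effect of the ignore option can be sketched with a small data set containing one missing value. The data are made up for illustration, and Mean is used only as a representative command.

```maple
with(Statistics):

# A data set with one missing value.
B := [1.0, 2.0, undefined, 4.0]:

# Default (ignore = false): the missing value propagates into the result.
Mean(B);

# ignore = true: the mean is computed from the three remaining values.
Mean(B, ignore = true);
```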
Adding Weights to Data
Weights can be added to data by supplying the optional argument weights=value, where value is a vector of numeric constants. The number of elements in the weights vector must equal the number of elements in the original data set. By default, every element in a data set is assigned weight 1. For more details on a particular command, see the corresponding help page.
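A minimal sketch of the weights option, using made-up data and weights: the third observation is given twice the weight of the other two, so the weighted mean is pulled toward it.

```maple
with(Statistics):

# Three observations and a weights Vector of the same length.
Y := <1.0, 2.0, 3.0>:
W := <1, 1, 2>:

# Unweighted mean: every observation has weight 1.
Mean(Y);

# Weighted mean: (1*1.0 + 1*2.0 + 2*3.0) / (1 + 1 + 2).
Mean(Y, weights = W);
```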
Examples
Generate a random sample drawn from the non-central Beta distribution.
with(Statistics):
X := RandomVariable(NonCentralBeta(3, 10, 2)):
A := Sample(X, 10^6):
Compute the five-point summary of the data sample.
FivePointSummary(A)
minimum = 0.00284705659174078
lowerhinge = 0.188221218565660
median = 0.271179303591104
upperhinge = 0.364759629644148
maximum = 0.855336805054773
Compute the mean, standard deviation, skewness, kurtosis, etc.
DataSummary(A)
mean = 0.282220601028456
standarddeviation = 0.125176906233528
skewness = 0.440322599667426
kurtosis = 2.84590882372508
minimum = 0.00284705659174078
maximum = 0.855336805054773
cumulativeweight = 1.000000 × 10^6
Estimate the mode.
Mode(A)
0.237418113813262
Compute the second moment about 0.3.
Moment(A, 2, origin = 0.3)
0.0159853492127288
Compute the mean, trimmed mean, and winsorized mean.
Mean(A), TrimmedMean(A, 1, 99), WinsorizedMean(A, 1, 99)
0.282220601028476, 0.280991773683514, 0.281923266592907
Compute a frequency table for A.
FrequencyTable(A, range = 0 .. 1, bins = 5)
0. .. 0.200000000000000                   283732.   28.37320000       283732.          28.37320000
0.200000000000000 .. 0.400000000000000    536939.   53.69390000       820671.          82.06710000
0.400000000000000 .. 0.600000000000000    168880.   16.88800000       989551.          98.95510000
0.600000000000000 .. 0.800000000000000    10420.    1.042000000       999971.          99.99710000
0.800000000000000 .. 1.                   29.       0.002900000000    1.000000 × 10^6  100.
See Also
Statistics
Statistics/Computation
Statistics/Distributions