Overview of the Statistics Package
The Statistics package is a collection of tools for mathematical statistics and data analysis. The package supports a wide range of common statistical tasks such as quantitative and graphical data analysis, simulation, and curve fitting.
In addition to standard data analysis tools, the Statistics package provides a wide range of symbolic and numeric tools for computing with random variables. The package supports over 35 major probability distributions and provides facilities for defining new distributions.
Much of the functionality in the Statistics package is accessible through the Context Panel. Context-sensitive functionality is available when selecting any data container (such as a Vector, list, or Array), known probability distributions (such as Normal(1,2)), or random variables.
Some related functionality regarding time series is available through the TimeSeriesAnalysis package.
For additional examples detailing the uses of the Statistics package, see the following example worksheets.
Data Smoothing
Estimation
Hypothesis Testing
Probability Distributions
Robust Statistics
Statistics with DataFrames
Each command in the Statistics package can be accessed by using either the long form or the short form of the command name in the command calling sequence.
The long form, Statistics:-command, is always available. The short form, command, can be used after the package has been loaded with with(Statistics).
Below is the list of primary topics. See also Statistics[Commands] for an alphabetical list of Statistics commands.
Inventory of Probability Distributions
Descriptive Statistics, Data Summary and Tabulation
Probability Calculations, Random Variables
Visualization
Simulation
Regression
Estimation
Data Manipulation
Data Smoothing
Hypothesis Testing and Inference
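The two calling forms can be illustrated with a minimal sketch. The data values are invented for the example, and Mean stands in for any command in the package.

```maple
# Long form: works without loading the package.
m1 := Statistics:-Mean([1, 2, 3, 4]);

# Short form: works once the package has been loaded.
with(Statistics):
m2 := Mean([1, 2, 3, 4]);
```

The long form is usually the safer choice inside procedures and library code, because it does not depend on which packages the caller has loaded.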
Inventory of Probability Distributions
The package provides over 35 continuous and discrete probability distributions, as well as tools for creating new distributions. Here is the list of relevant commands.
Bernoulli: Bernoulli distribution
Beta: beta distribution
Binomial: binomial distribution
Cauchy: Cauchy distribution
ChiSquare: chi-square distribution
DiscreteUniform: discrete uniform distribution
DiscreteValueMap: create a non-integer discrete distribution
Distribution: create new distribution
EmpiricalDistribution: empirical distribution
Erlang: Erlang distribution
Error: error (exponential power) distribution
Exponential: exponential distribution
FRatio: Fisher F-distribution
Gamma: gamma distribution
Geometric: geometric distribution
Gumbel: Gumbel distribution
Hypergeometric: hypergeometric distribution
InverseGaussian: inverse Gaussian (Wald) distribution
Laplace: Laplace distribution
Logistic: logistic distribution
LogNormal: lognormal distribution
Maxwell: Maxwell distribution
Moyal: Moyal distribution
NegativeBinomial: negative binomial (Pascal) distribution
NonCentralBeta: noncentral beta distribution
NonCentralChiSquare: noncentral chi-square distribution
NonCentralFRatio: noncentral F-distribution
NonCentralStudentT: noncentral t-distribution
Normal: normal (Gaussian) distribution
Pareto: Pareto distribution
Poisson: Poisson distribution
Power: power distribution
ProbabilityTable: probability table
Rayleigh: Rayleigh distribution
StudentT: Student's t-distribution
Triangular: triangular distribution
Uniform: uniform (rectangular) distribution
VonMises: von Mises distribution
Weibull: Weibull distribution
More information is available in the Statistics[Distributions] help page.
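As a rough illustration of using a built-in distribution and defining a new one, the sketch below works with a normal distribution with symbolic parameters and with a custom density on [0, 1]. The names mu, sigma, T, X, and Y, and the particular density, are chosen for the example only.

```maple
with(Statistics):

# A built-in distribution with symbolic parameters.
X := RandomVariable(Normal(mu, sigma)):
Mean(X);      # symbolic mean
Variance(X);  # symbolic variance

# A user-defined distribution specified through its density,
# which is nonzero only on the interval [0, 1].
T := Distribution(PDF = (t -> piecewise(t < 0, 0, t < 1, 6*t*(1 - t), 0))):
Y := RandomVariable(T):
Mean(Y);
```

Only the density is supplied for the custom distribution; other quantities are then derived from it by the package where possible.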
Descriptive Statistics, Data Summary and Tabulation
The package provides a wide range of functions for computing descriptive statistics, including location, dispersion, and shape statistics, moments and cumulants, and several data summary and tabulation commands. Here is the list of available commands.
AbsoluteDeviation: compute the average absolute deviation
AutoCorrelation: autocorrelations
CentralMoment: central moments
Correlation: correlation/correlation matrix
Covariance: covariance/covariance matrix
CrossCorrelation: cross-correlations
Cumulant: cumulants
DataSummary: seven summary statistics
Decile: deciles
ExpectedValue: compute expected values
FivePointSummary: five-point summary
FrequencyTable: frequency table
GeometricMean: geometric mean
HarmonicMean: harmonic mean
HodgesLehmann: Hodges-Lehmann statistic
InterquartileRange: interquartile range
Kurtosis: kurtosis
MakeProcedure: generate a procedure for calculating statistical quantities
Mean: arithmetic mean
MeanDeviation: average absolute deviation from the mean
Median: median
MedianDeviation: compute the median absolute deviation
Mode: mode
Moment: moments
OrderStatistic: order statistics
PCA: principal component analysis
Percentile: percentiles
PrincipalComponentAnalysis: principal component analysis (long form of PCA)
QuadraticMean: quadratic mean
Quantile: quantiles
Quartile: quartiles
Range: range
RousseeuwCrouxQn: Rousseeuw and Croux' Qn
RousseeuwCrouxSn: Rousseeuw and Croux' Sn
Skewness: skewness
StandardDeviation: standard deviation
StandardError: standard error of the sampling distribution
StandardizedMoment: standardized moments
TrimmedMean: trimmed mean
Variance: variance
Variation: coefficient of variation
WinsorizedMean: winsorized mean
More information is available in the Statistics[DescriptiveStatistics] help page.
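A short sketch of a few location, dispersion, and shape statistics applied to a small data set. The data values are made up for illustration.

```maple
with(Statistics):
data := [2.0, 3.5, 3.5, 4.0, 7.0, 9.5]:

Mean(data);                # location
Median(data);
StandardDeviation(data);   # dispersion
InterquartileRange(data);
Skewness(data);            # shape
FivePointSummary(data);    # five-point summary of the sample
```

The same commands also accept a Vector or Array in place of the list.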
Probability Calculations, Random Variables
The package provides tools for creating and manipulating random variables, as well as functions for computing their densities, moments, generating functions, and other quantities. Here is the list of available commands.
CDF: cumulative distribution function
CGF: cumulant generating function
CharacteristicFunction: characteristic function
CumulantGeneratingFunction: cumulant generating function (long form of CGF)
CumulativeDistributionFunction: cumulative distribution function (long form of CDF)
FailureRate: hazard (failure) rate
HazardRate: hazard (failure) rate (same as FailureRate)
InverseSurvivalFunction: inverse survival function
MGF: moment generating function
MillsRatio: Mills ratio
MomentGeneratingFunction: moment generating function (long form of MGF)
PDF: probability density function
Probability: compute the probability of an event
ProbabilityDensityFunction: probability density function (long form of PDF)
ProbabilityFunction: probability function
RandomVariable: create new random variable
Support: support set of a random variable
SurvivalFunction: survival function
More information is available in the Statistics[RandomVariables] help page.
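The following sketch shows how a random variable might be created and queried, using a standard normal distribution as the example. The variable names X and t are arbitrary.

```maple
with(Statistics):
X := RandomVariable(Normal(0, 1)):

PDF(X, t);                    # density as an expression in t
CDF(X, 1.96);                 # distribution function evaluated at a point
Probability(X > 1);           # exact, symbolic probability
Probability(X > 1, numeric);  # floating-point approximation
Quantile(X, 0.975);           # inverse of the distribution function
```

Passing a symbol such as t gives a symbolic result, while passing a floating-point value gives a numeric one.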
Visualization
The package provides various statistical plots, such as box plots, bar charts, histograms, probability plots, and scatter plots. Here is the list of available commands.
AgglomeratedPlot: generate agglomerated plots
AreaChart: create area charts from data
BarChart: create bar charts from data
Biplot: generate biplots
BoxPlot: create box plots from data
BubblePlot: generate bubble plots
ColumnGraph: create column graphs from data
Correlogram: create autocorrelation plot from data
CumulativeSumChart: generate cumulative sum charts
DensityPlot: plot the density of a random variable
ErrorPlot: generate error plots
FrequencyPlot: generate frequency plots
GridPlot: generate a grid of plots
HeatMap: generate heat maps
Histogram: generate histograms
KernelDensityPlot: plot the kernel density estimate of a data set
LineChart: generate line charts
NormalPlot: generate normal plots
ParetoChart: generate Pareto chart
PieChart: generate pie charts
PointPlot: generate point plots
ProbabilityPlot: generate probability plots
ProfileLikelihood: plot a profile of the likelihood function
ProfileLogLikelihood: plot a profile of the log likelihood function
QuantilePlot: generate quantile-quantile plots
ScatterPlot: generate scatter plots
ScatterPlot3D: generate 3D scatter plots
ScreePlot: generate scree plots for variance
SunflowerPlot: generate sunflower plots
SurfacePlot: generate surface plots
SymmetryPlot: generate symmetry plots
TreeMap: generate tree maps
VennDiagram: generate Venn diagrams
ViolinPlot: create violin plots from data
WeibullPlot: generate Weibull plots
More information is available in the Statistics[Visualization] help page.
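As a minimal sketch, three of the plotting commands can be applied to a simulated sample and its underlying random variable. The sample size and distribution are arbitrary choices for the example, and default plot options are assumed throughout.

```maple
with(Statistics):
X := RandomVariable(Normal(0, 1)):
S := Sample(X, 500):

Histogram(S);    # histogram of the simulated data
BoxPlot(S);      # box plot of the same data
DensityPlot(X);  # density of the random variable X
```

Histogram and BoxPlot take data, whereas DensityPlot takes the random variable itself.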
Simulation
The package provides optimized algorithms for simulating from all supported distributions, as well as tools for creating custom random number generators and for parametric and non-parametric bootstrap. Here is the list of available commands.
Bootstrap: compute bootstrap statistics
KernelDensitySample: sample a kernel density estimate
Sample: generate random sample
More information is available in the Statistics[Simulation] help page.
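A small sketch of simulation with Sample, comparing a sample statistic with its theoretical counterpart. The distribution parameters and sample size are illustrative assumptions.

```maple
with(Statistics):
X := RandomVariable(Normal(5, 2)):

S := Sample(X, 1000):  # 1000 simulated values of X
Mean(S);               # sample mean, varies from run to run
Mean(X);               # theoretical mean of the distribution
```

Because the sample is random, the sample mean should be close to, but generally not equal to, the theoretical mean.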
Regression
The package provides tools for fitting linear and nonlinear models to data points and performing regression analysis. Here is the list of available commands.
ExponentialFit: fit an exponential function to data
Fit: fit a model function to data
LeastTrimmedSquares: robust linear regression
LinearFit: fit a linear model function to data
LogarithmicFit: fit a logarithmic function to data
Lowess: produce lowess smoothed functions
NonlinearFit: fit a nonlinear model function to data
OneWayANOVA: generate a one-way ANOVA table
PolynomialFit: fit a polynomial to data
PowerFit: fit a power function to data
PredictiveLeastSquares: fit a predictive linear model function to data
RepeatedMedianEstimator: robust linear regression using the repeated median estimator
More information is available in the Statistics[Regression] help page.
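The sketch below fits three models to the same small data set. The data values, the variable x, and the parameter names a and b are invented for the example.

```maple
with(Statistics):
X := Vector([1, 2, 3, 4, 5], datatype = float):
Y := Vector([2.1, 3.9, 6.2, 7.8, 10.1], datatype = float):

# Model that is linear in its parameters: supply the basis functions 1 and x.
f := LinearFit([1, x], X, Y, x);

# General model: supply an expression containing the unknown parameters a and b.
Fit(a + b*x, X, Y, x);

# Polynomial of degree 2 in x.
PolynomialFit(2, X, Y, x);
```

LinearFit takes a list of basis functions rather than a parameterized expression, which is the main difference from Fit in this usage.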
Estimation
The package provides tools for manipulating likelihood functions, maximum likelihood estimation, kernel density estimation, and bootstrapping. Here is the list of available commands.
FisherInformation: Fisher information
Information: statistical information
KernelDensity: estimate the probability density of a data set
Likelihood: likelihood function
LikelihoodRatioStatistic: compute the likelihood ratio statistic
LogLikelihood: log likelihood function
MaximumLikelihoodEstimate: compute the maximum likelihood estimate
Score: statistical score
More information is available in the Statistics[Estimation] help page.
Data Manipulation
The package provides tools for manipulating statistical data. Here is the list of available commands.
Count: compute number/total weight of observations
CountMissing: compute number/total weight of missing values
CumulativeProduct: compute cumulative products
CumulativeSum: compute cumulative sums
Detrend: remove any trend from a set of data
Difference: compute lagged differences between elements
EvaluateToFloat: evaluate data using floating-point arithmetic
Excise: remove data items based on density
Join: join data samples
OrderByRank: order data items according to their ranks
Rank: rank data items according to their numeric values
Remove: remove data items satisfying a condition
RemoveInRange: remove data items which belong to the given range
RemoveNonNumeric: remove non-numeric values
Scale: center and/or scale a set of data
Select: select data items satisfying a condition
SelectInRange: select data items which belong to the given range
SelectNonNumeric: select non-numeric values
Shuffle: apply random permutation to a data sample
Sort: sort numeric data
SplitByColumn: split matrix data into submatrices
Tally: compute data frequencies
TallyInto: compute cumulative data frequencies
Trim: trim data set
Winsorize: winsorize data set
More information is available in the Statistics[DataManipulation] help page.
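A brief sketch of a few data manipulation commands applied to one small data set. The values are made up, and default options are assumed for every call.

```maple
with(Statistics):
data := [3, 1, 2, 3, 5, 1, 3]:

Sort(data);           # sort the numeric data
Rank(data);           # rank the items by their numeric values
Tally(data);          # frequency of each distinct value
CumulativeSum(data);  # running totals
```

Note that Sort here is the Statistics command, which is distinct from the top-level sort command.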
Data Smoothing
The package provides data smoothing functions, including moving averages, exponential smoothing, and linear filters. Here is the list of available commands.
ExponentialSmoothing: apply exponential smoothing to a data set
LinearFilter: apply linear filter to a data set
MovingAverage: compute moving averages for a data set
MovingMedian: compute moving medians for a data set
MovingStatistic: compute moving statistics for a data set
WeightedMovingAverage: compute weighted moving averages for a data set
More information is available in the Statistics[DataSmoothing] help page.
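As a minimal sketch, three of the smoothing commands can be applied to the same short series. The data values, the window size 3, and the smoothing constant 0.3 are arbitrary choices for illustration.

```maple
with(Statistics):
data := [4.0, 6.0, 5.0, 8.0, 7.0, 9.0, 12.0, 10.0]:

ma := MovingAverage(data, 3);     # moving averages with window size 3
MovingMedian(data, 3);            # moving medians with the same window
ExponentialSmoothing(data, 0.3);  # exponential smoothing with constant 0.3
```

A moving median is typically less sensitive to isolated outliers than a moving average over the same window.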
Hypothesis Testing and Inference
The package provides common tools for performing hypothesis testing and inference, including several parametric and non-parametric tests. Here is the list of available commands.
ChiSquareGoodnessOfFitTest: apply the chi-square test for goodness-of-fit
ChiSquareIndependenceTest: apply the chi-square test for independence in a matrix
ChiSquareSuitableModelTest: apply the chi-square suitable model test
OneSampleChiSquareTest: apply the one sample chi-square test for the population standard deviation
OneSampleTTest: apply the one sample t-test for the population mean
OneSampleZTest: apply the one sample z-test for the population mean
ShapiroWilkWTest: apply Shapiro and Wilk's W-test for normality
TwoSampleFTest: apply the two sample F-test for population variances
TwoSamplePairedTTest: apply the paired t-test for population means
TwoSampleTTest: apply the two sample t-test for population means
TwoSampleZTest: apply the two sample z-test for population means
More information is available in the Statistics[Tests] help page.
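A short sketch of a one sample t-test on a small, invented data set. The null value 5 and the confidence level are assumptions chosen for the example.

```maple
with(Statistics):
S := [4.8, 5.1, 5.3, 4.9, 5.0, 5.2, 4.7, 5.4]:

# Test the null hypothesis that the population mean equals 5.
result := OneSampleTTest(S, 5, confidence = 0.95);
```

The result is expected to report the test outcome along with related quantities such as the test statistic, the p-value, and a confidence interval for the mean.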
See Also
Data Smoothing Example Worksheet
Estimation Example Worksheet
Hypothesis Testing Example Worksheet
Probability Distributions Example Worksheet
Robust Statistics Example Worksheet
Statistics with DataFrames Example Worksheet
Statistics[Computation]
TimeSeriesAnalysis