Data Manipulation Commands
The Statistics package provides various functions for manipulating statistical data. These include sorting, searching, and data selection routines. The following is a list of available commands.
Count
compute number/total weight of observations
CountMissing
compute number/total weight of missing values
CumulativeProduct
compute cumulative products
CumulativeSum
compute cumulative sums
Detrend
remove any trend from a set of data
Difference
compute lagged differences between elements
EvaluateToFloat
evaluate data using floating-point arithmetic
Excise
remove data items based on density
Join
join data samples
OrderByRank
order data items according to their ranks
Rank
rank data items according to their numeric values
Remove
remove data items satisfying a condition
RemoveInRange
remove data items which belong to the given range
RemoveNonNumeric
remove non-numeric values
Scale
center and/or scale a set of data
Select
select data items satisfying a condition
SelectInRange
select data items which belong to the given range
SelectNonNumeric
select non-numeric values
Shuffle
apply random permutation to a data sample
Sort
sort numeric data
SplitByColumn
split matrix data into submatrices
Tally
compute data frequencies
TallyInto
compute cumulative data frequencies
Trim
trim data set
Winsorize
winsorize data set
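As a rough illustration of how a few of these commands are called, the following minimal sketch applies them to a small list. The data values are hypothetical and chosen only for demonstration; the calls assume the default options of each command.

```maple
with(Statistics):

# Small illustrative data set (hypothetical values, not from this help page)
data := [3, 1, 4, 1, 5, 9, 2, 6]:

Sort(data);                   # sort numeric data
CumulativeSum(data);          # compute cumulative sums
Tally(data);                  # compute data frequencies
SelectInRange(data, 2 .. 5);  # select items that belong to the range 2 .. 5
Count(data);                  # compute the number of observations
```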
The ArrayTools package provides a number of useful tools for manipulating rectangular arrays. Here is the list of available commands.
AddAlongDimension
add the elements of an Array
Alias
provide different view of rectangular Matrix, Vector, or Array
AllNonZero
true when the Array has no zero entries
AnyNonZeros
check for nonzero Array entries
Append
append element to Array
BlockCopy
copy a block of several segments of elements from one Matrix, Vector, or Array to another
CircularShift
shift Array data
ComplexAsFloat
provide real view of a complex Matrix, Vector, or Array
Concatenate
Array concatenation
Copy
copy portion of Matrix, Vector, or Array to another
DataTranspose
perform in-place data transpose
Diagonal
extract the diagonals from a Matrix or create a diagonal Matrix
Dimensions
size of an Array in each dimension
ElementDivide
element-wise division of Array entries
ElementMultiply
element-wise multiplication of Array entries
ElementPower
element-wise power of Array entries
Extend
extend Array with additional elements
Fill
fill portion of Matrix, Vector, or Array with specified value
FlipDimension
reverse order of elements in an Array
GeneralInnerProduct
compute a general inner product of two Arrays
GeneralOuterProduct
compute a general outer product of two Arrays
HasNonZero
true when the Array has a nonzero entry
HasZero
true when the Array has a zero entry
Insert
insert an element in an Array
IsEqual
compare Arrays for equality
IsZero
true when the Array has only zero entries
LowerTriangle
return the lower triangular region of a matrix
MultiplyAlongDimension
multiply rows of an array
NumElems
return the number of elements in an Array
Permute
permute dimensions of an Array
PermuteInverse
inverse permute dimensions of an Array
RandomArray
randomly generate scalars, Matrices, and Arrays of values drawn from a uniform or normal distribution
ReduceAlongDimension
reduce the elements of an Array by a function
RegularArray
generate an array of numbers with specified spacing in a given range
Remove
remove entries and shrink an Array
RemoveSingletonDimensions
remove singleton Array dimensions
Replicate
Array replication
Reshape
create a reshaped copy of a Matrix, Vector, or Array
Reverse
reverse a Matrix, Vector, or Array
ScanAlongDimension
accumulate the elements of an Array by a function
SearchArray
return the indices of nonzero elements of the given Array
Size
return the size of an Array in each dimension
SuggestedDatatype
suggest Array datatype for an operation on two Arrays
SuggestedOrder
suggest rtable order for an operation on two rtables
SuggestedSubtype
suggest rtable subtype for an operation on two rtables
UpperTriangle
compute the upper triangular Matrix
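A minimal sketch of how some of the ArrayTools commands might be combined is given below. The Array A is a hypothetical example, and the dimension arguments are chosen only for illustration.

```maple
with(ArrayTools):

# Hypothetical 1-D Array used only for illustration
A := Array([1, 2, 3, 4, 5, 6]):

M := Reshape(A, [2, 3]);   # reshaped copy with 2 rows and 3 columns
NumElems(M);               # number of elements in M
Size(M);                   # size of M in each dimension
FlipDimension(M, 2);       # reverse the order of elements along dimension 2
Concatenate(1, M, M);      # join two copies of M along dimension 1
```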
Examples
with(Statistics):
X := Sample(Normal(0, 1), 100)
X := [-1.07242412799827, -0.329077870547065, -0.617091936909790, 0.214466745245291, -0.0254281280423846,
      1.72882128417783, -1.63485675434390, 1.57117217175430, 0.170358421410408, 1.00647840875181,
      0.268712679158310, -1.46598044695878, -0.0357578538394966, -0.00650336346263664, -0.0571830022332381,
      -0.184600094624929, 0.203855402371112, -0.0988718240271234, -0.0826980970419822, -0.168119128121408,
      0.162021299867933, -0.675918709672617, -0.440797644048532, 0.896019985259550, 0.461022953380023,
      1.27797705227385, 0.234565961794581, -0.0362744912936830, -0.448848123512147, -1.16737308532471,
      -1.21794288394124, -1.15469196092625, 0.109659323283598, -0.619809380641282, 0.642151004644923,
      -0.170770918098244, -0.820052241027982, 1.34633519663234, -0.986038560158125, -0.965838793499848,
      -1.36462571980574, 1.65007871956408, 1.34804151352017, -1.02537392618659, 0.229788060863910,
      1.48493457968053, -1.30030905411587, -0.376131116236207, 0.0617841966672065, 0.801907829141039,
      -0.129975470550859, 0.112484530604133, 0.630360591794658, -0.0208959396841725, -0.830442822997399,
      -0.262652510141184, -0.700300261968008, 1.95556905861273, -0.259474165622608, -0.0820850625624299,
      1.21557945262500, -0.983131813759657, 1.93069064888935, 0.122735108586753, 0.432945421151540,
      -0.524443817862805, 0.224215551927832, -0.414225568098372, 0.0355261362296387, 1.98371209767999,
      2.21337958939036, 0.881229888626876, 0.886856450560421, -0.861665947308802, -0.906611303340893,
      -0.876730984349719, -0.217490459619989, -0.336155174289136, 0.549376576019130, -1.19754843404833,
      -0.927510339539523, -1.32995060542228, -1.13043603146978, 1.27187025067002, -1.47651546344906,
      1.61338249093609, -0.685342069367486, 1.08241290092064, -0.776734900479867, -0.555481438094644,
      1.07239184776825, 0.0713529776295548, -0.182122924489555, 0.628549543618577, -0.725850645010288,
      -0.413380338600023, 1.54241530115661, -0.220309427423809, -0.840816024752715, 0.385066012066646]
Select only values between -2 and 2.
Y := SelectInRange(X, -2 .. 2)
Y := [-1.07242412799827, -0.329077870547065, -0.617091936909790, 0.214466745245291, -0.0254281280423846,
      1.72882128417783, -1.63485675434390, 1.57117217175430, 0.170358421410408, 1.00647840875181,
      0.268712679158310, -1.46598044695878, -0.0357578538394966, -0.00650336346263664, -0.0571830022332381,
      -0.184600094624929, 0.203855402371112, -0.0988718240271234, -0.0826980970419822, -0.168119128121408,
      0.162021299867933, -0.675918709672617, -0.440797644048532, 0.896019985259550, 0.461022953380023,
      1.27797705227385, 0.234565961794581, -0.0362744912936830, -0.448848123512147, -1.16737308532471,
      -1.21794288394124, -1.15469196092625, 0.109659323283598, -0.619809380641282, 0.642151004644923,
      -0.170770918098244, -0.820052241027982, 1.34633519663234, -0.986038560158125, -0.965838793499848,
      -1.36462571980574, 1.65007871956408, 1.34804151352017, -1.02537392618659, 0.229788060863910,
      1.48493457968053, -1.30030905411587, -0.376131116236207, 0.0617841966672065, 0.801907829141039,
      -0.129975470550859, 0.112484530604133, 0.630360591794658, -0.0208959396841725, -0.830442822997399,
      -0.262652510141184, -0.700300261968008, 1.95556905861273, -0.259474165622608, -0.0820850625624299,
      1.21557945262500, -0.983131813759657, 1.93069064888935, 0.122735108586753, 0.432945421151540,
      -0.524443817862805, 0.224215551927832, -0.414225568098372, 0.0355261362296387, 1.98371209767999,
      0.881229888626876, 0.886856450560421, -0.861665947308802, -0.906611303340893, -0.876730984349719,
      -0.217490459619989, -0.336155174289136, 0.549376576019130, -1.19754843404833, -0.927510339539523,
      -1.32995060542228, -1.13043603146978, 1.27187025067002, -1.47651546344906, 1.61338249093609,
      -0.685342069367486, 1.08241290092064, -0.776734900479867, -0.555481438094644, 1.07239184776825,
      0.0713529776295548, -0.182122924489555, 0.628549543618577, -0.725850645010288, -0.413380338600023,
      1.54241530115661, -0.220309427423809, -0.840816024752715, 0.385066012066646]
Select only values between the 5th and 95th percentiles (trim).
Z := Trim(X, 5, 95)
Replace extreme points with the value of the 5th or the 95th percentile (whichever is closer).
W := Winsorize(X, 5, 95)
W := [-1.07242412799827, -0.329077870547065, -0.617091936909790, 0.214466745245291, -0.0254281280423846,
      1.72882128417783, -1.30030905411587, 1.57117217175430, 0.170358421410408, 1.00647840875181,
      0.268712679158310, -1.30030905411587, -0.0357578538394966, -0.00650336346263664, -0.0571830022332381,
      -0.184600094624929, 0.203855402371112, -0.0988718240271234, -0.0826980970419822, -0.168119128121408,
      0.162021299867933, -0.675918709672617, -0.440797644048532, 0.896019985259550, 0.461022953380023,
      1.27797705227385, 0.234565961794581, -0.0362744912936830, -0.448848123512147, -1.16737308532471,
      -1.21794288394124, -1.15469196092625, 0.109659323283598, -0.619809380641282, 0.642151004644923,
      -0.170770918098244, -0.820052241027982, 1.34633519663234, -0.986038560158125, -0.965838793499848,
      -1.30030905411587, 1.65007871956408, 1.34804151352017, -1.02537392618659, 0.229788060863910,
      1.48493457968053, -1.30030905411587, -0.376131116236207, 0.0617841966672065, 0.801907829141039,
      -0.129975470550859, 0.112484530604133, 0.630360591794658, -0.0208959396841725, -0.830442822997399,
      -0.262652510141184, -0.700300261968008, 1.72882128417783, -0.259474165622608, -0.0820850625624299,
      1.21557945262500, -0.983131813759657, 1.72882128417783, 0.122735108586753, 0.432945421151540,
      -0.524443817862805, 0.224215551927832, -0.414225568098372, 0.0355261362296387, 1.72882128417783,
      1.72882128417783, 0.881229888626876, 0.886856450560421, -0.861665947308802, -0.906611303340893,
      -0.876730984349719, -0.217490459619989, -0.336155174289136, 0.549376576019130, -1.19754843404833,
      -0.927510339539523, -1.30030905411587, -1.13043603146978, 1.27187025067002, -1.30030905411587,
      1.61338249093609, -0.685342069367486, 1.08241290092064, -0.776734900479867, -0.555481438094644,
      1.07239184776825, 0.0713529776295548, -0.182122924489555, 0.628549543618577, -0.725850645010288,
      -0.413380338600023, 1.54241530115661, -0.220309427423809, -0.840816024752715, 0.385066012066646]
W[1 .. 10]
[-1.07242412799827, -0.329077870547065, -0.617091936909790, 0.214466745245291, -0.0254281280423846,
 1.72882128417783, -1.30030905411587, 1.57117217175430, 0.170358421410408, 1.00647840875181]
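Sort a 2-D array according to the numeric values in the second column.
A := <<seq(i, i = 1 .. 10)> | <seq(sin(i), i = 1 .. 10)>>
A := [  1   sin(1)  ]
     [  2   sin(2)  ]
     [  3   sin(3)  ]
     [  4   sin(4)  ]
     [  5   sin(5)  ]
     [  6   sin(6)  ]
     [  7   sin(7)  ]
     [  8   sin(8)  ]
     [  9   sin(9)  ]
     [ 10   sin(10) ]
R := Rank(A[1 .. 10, 2])
R := [8, 9, 5, 2, 1, 4, 7, 10, 6, 3]
B := OrderByRank(A, R)
B := [  5   sin(5)  ]
     [  4   sin(4)  ]
     [ 10   sin(10) ]
     [  6   sin(6)  ]
     [  3   sin(3)  ]
     [  9   sin(9)  ]
     [  7   sin(7)  ]
     [  1   sin(1)  ]
     [  2   sin(2)  ]
     [  8   sin(8)  ]
evalf(B)
[  5.   -0.9589242747 ]
[  4.   -0.7568024953 ]
[ 10.   -0.5440211109 ]
[  6.   -0.2794154982 ]
[  3.    0.1411200081 ]
[  9.    0.4121184852 ]
[  7.    0.6569865987 ]
[  1.    0.8414709848 ]
[  2.    0.9092974268 ]
[  8.    0.9893582466 ]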
Handling non-numeric data.
A := [1, 4, 3, 4, 5, undefined]:
B := [4, undefined, infinity]:
C := [a, b, c, a, b, a]:
Join samples A, B, and C.
U := Join(A, B, C)
U := [1, 4, 3, 4, 5, undefined, 4, undefined, infinity, a, b, c, a, b, a]
Remove non-numeric values (keep missing values).
V := RemoveNonNumeric(U, exclude = undefined)
V := [1, 4, 3, 4, 5, undefined, 4, undefined]
Count the total number of values and the number of missing values in V.
Count(V)
8
CountMissing(V)
2
The same using weights.
W := [1/2, 1/3, 1/4, 1/5, 1/6, 1/7, 1/8, 1/9]:
CountMissing(V, weights = W)
0.253968253968254
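The weighted result can be checked by hand. This assumes that the weights are the fractions 1/2 through 1/9 and that CountMissing with the weights option returns the unnormalized total weight of the missing entries; both are inferences from the reported value rather than statements taken from the command's documentation. The missing entries of V are at positions 6 and 8, whose weights are 1/7 and 1/9:

```latex
\[
\frac{1}{7} + \frac{1}{9} = \frac{9 + 7}{63} = \frac{16}{63} \approx 0.253968253968254
\]
```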
See Also
Statistics
Statistics[Computation]
Statistics[DataSmoothing]