Statistics
LinearFit
fit a linear model function to data
Calling Sequence
Parameters
Description
Options
Notes
Examples
Compatibility
LinearFit(flst, X, Y, v, options)
LinearFit(flst, XY, v, options)
LinearFit(falg, X, Y, v, options)
LinearFit(falg, XY, v, options)
LinearFit(fop, X, Y, options)
LinearFit(fop, XY, options)
flst - list(algebraic) or Vector(algebraic); component functions in algebraic form
X - Vector or Matrix; values of independent variable(s)
Y - Vector; values of dependent variable
XY - Matrix; values of independent and dependent variables
v - name or list(names); name(s) of the independent variables in the component functions
falg - algebraic expression, linear in all its variables except the ones in v; model
fop - list(procedure) or Vector(procedure); component functions in operator form
options - (optional) equation(s) of the form option=value, where option is one of output, summarize, svdtolerance, or weights; specify options for the LinearFit command
The LinearFit command fits a model function that is linear in the model parameters to data by minimizing the least-squares error; it performs both simple and multiple linear regression. The model function can be given in algebraic form (in two variants) or in operator form, and the data for the independent and dependent variables can be specified together or separately. For more information about the input forms, see the Input Forms help page.
Consider the model y = f(x1, x2, ..., xn), where y is the dependent variable and f is the model function of n independent variables x1, x2, ..., xn. This function is a linear combination a1*f1 + a2*f2 + ... + am*fm of component functions fj(x1, x2, ..., xn), for j from 1 to m. Given k data points, where each data point is an (n+1)-tuple of numerical values for x1, x2, ..., xn, y, the LinearFit command finds values of the model parameters a1, a2, ..., am such that the sum of the squares of the k residuals is minimized. The ith residual is the value of y - f(x1, x2, ..., xn) evaluated at the ith data point.
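This formulation can be sketched numerically in Python with NumPy (an illustration of the mathematics only, not Maple's implementation; the data and component functions are taken from the quadratic example in the Examples section below):

```python
import numpy as np

# The component functions [1, t, t^2] become the columns of a design
# matrix A; the parameters minimize the sum of squared residuals
# ||A*p - y||^2 over the k = 6 data points.
t = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])      # independent variable
y = np.array([2.0, 3.0, 4.0, 3.5, 5.8, 7.0])      # dependent variable

A = np.column_stack([np.ones_like(t), t, t**2])   # one column per component f_j
params, *_ = np.linalg.lstsq(A, y, rcond=None)    # least-squares parameter values
```

Here `params` reproduces the coefficients of the fitted model 1.96 + 0.165*t + 0.110714*t^2 shown in the examples.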
In the first two calling sequences, the first parameter flst is a list or Vector of component functions in algebraic form. Each component is an algebraic expression in the independent variables x1,x2,...,xn.
In the second pair of calling sequences, the first parameter falg is an algebraic expression for f(x1, x2, ..., xn) itself, including the parameters a1, a2, ..., am.
In the last two calling sequences, the first parameter fop is a list or Vector of component functions in operator form. The jth component is a procedure having n input parameters representing the independent variables x1,x2,...,xn and returning the single value fj⁡x1,x2,...,xn.
The parameter X is a Matrix containing the values of the independent variables. Row i in the Matrix contains the n values for the ith data point while column j contains all values of the single variable xj. If there is only one independent variable, X can be either a Vector or a k-by-1 Matrix. The parameter Y is a Vector containing the k values of the dependent variable y. The parameter XY is a Matrix consisting of the n columns of X and, as last column, Y. For X, Y, and XY, one can also use lists or Arrays; for details, see the Input Forms help page.
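The relationship between the separate (X, Y) layout and the combined XY layout can be sketched as follows (a NumPy illustration with hypothetical data, one independent variable, k = 6 points):

```python
import numpy as np

# X is k-by-n (here n = 1); Y holds the k dependent values.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
Y = np.array([2.0, 3.0, 4.0, 3.5, 5.8, 7.0])

# XY is the n columns of X with Y appended as the last column ...
XY = np.column_stack([X, Y])

# ... so either layout can be recovered from the other by slicing.
X_back, Y_back = XY[:, :-1], XY[:, -1]
```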
The parameter v is a list of the independent variable names used in falg. If there is only one independent variable, then v can be a single name. The order of the names in the list must match exactly the order in which the independent variable values are placed in the columns of X.
By default, either the model function with the final parameter values or a Vector containing the parameter values is returned, depending on the input form. Additional results or a solution module that allows you to query for various settings and results can be obtained with the output option. For more information, see the Statistics/Regression/Solution help page.
Weights for the data points can be supplied through the weights option.
The options argument can contain one or more of the options shown below. These options are described in more detail on the Statistics/Regression/Options help page.
output = name or string -- Specify the form of the solution. The output option can take as a value the name solutionmodule, or one of the following names (or a list of these names): AtkinsonTstatistic, confidenceintervals, CookDstatistic, degreesoffreedom, externallystandardizedresiduals, internallystandardizedresiduals, leastsquaresfunction, leverages, parametervalues, parametervector, residuals, residualmeansquare, residualstandarddeviation, residualsumofsquares, rsquared, rsquaredadjusted, standarderrors, tprobability, tvalue, variancecovariancematrix. For more information, see the Statistics/Regression/Solution help page.
summarize = identical( true, false, embed ) -- Display a summary of the regression model.
svdtolerance = realcons(nonnegative) -- Set the tolerance that determines whether a singular-value decomposition is performed.
weights = Vector -- Provide weights for the data points.
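Several of the scalar quantities available through the output option are connected by standard least-squares formulas. The following NumPy sketch (an illustration of those formulas, not Maple's internal code) computes them for the quadratic example used in the Examples section:

```python
import numpy as np

t = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 3.0, 4.0, 3.5, 5.8, 7.0])
A = np.column_stack([np.ones_like(t), t, t**2])   # design matrix
k, m = A.shape                                    # data points, parameters

coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)    # parametervector
residuals = y - A @ coeffs                        # residuals
dof = k - m                                       # degreesoffreedom
rss = residuals @ residuals                       # residualsumofsquares
rms = rss / dof                                   # residualmeansquare
tss = np.sum((y - y.mean())**2)
rsquared = 1 - rss / tss                          # rsquared
rsq_adj = 1 - (1 - rsquared) * (k - 1) / dof      # rsquaredadjusted

cov = rms * np.linalg.inv(A.T @ A)                # variancecovariancematrix
std_errors = np.sqrt(np.diag(cov))                # standarderrors
t_values = coeffs / std_errors                    # tvalue
```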
The underlying computation is done in floating-point; therefore, all data points must have type realcons and all returned solutions are floating-point, even if the problem is specified with exact values. For more information about numeric computation in the Statistics package, see the Statistics/Computation help page.
The LinearFit command uses various methods implemented in a built-in library provided by the Numerical Algorithms Group (NAG). Normally, a method using QR decomposition is applied. If it is determined that the system does not have full rank, then a singular-value decomposition (SVD) is performed. The svdtolerance option allows you to specify when an SVD should be performed. See the Statistics/Regression/Options help page for additional details.
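The distinction between the QR route and the SVD fallback can be sketched in NumPy (an illustration of the general technique; Maple's NAG-based routines are not shown here):

```python
import numpy as np

# Full-rank case: solve via QR decomposition, as a QR-based
# least-squares method would.
t = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 3.0, 4.0, 3.5, 5.8, 7.0])
A = np.column_stack([np.ones_like(t), t, t**2])

Q, R = np.linalg.qr(A)                    # A = Q*R, Q has orthonormal columns
coeffs_qr = np.linalg.solve(R, Q.T @ y)   # back-substitution on triangular R

# Rank-deficient case: the third column duplicates the second, so R is
# singular and the plain QR route breaks down; an SVD-based solver still
# returns the minimum-norm least-squares solution.
B = np.column_stack([np.ones_like(t), t, t])
coeffs_svd, *_ = np.linalg.lstsq(B, y, rcond=None)
```

In the rank-deficient case the minimum-norm solution splits the coefficient equally between the two identical columns.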
To obtain more details as the least-squares problem is being solved, set infolevel[Statistics] to 2 or higher.
with(Statistics):
A simple example using the first form for the first argument, flst:
X := Vector([1, 2, 3, 4, 5, 6], datatype=float):
Y := Vector([2, 3, 4, 3.5, 5.8, 7], datatype=float):
LinearFit([1, t, t^2], X, Y, t)
1.96000000000000 + 0.164999999999999*t + 0.110714285714286*t^2
The summarize option returns a summary for the regression:
ls := LinearFit([1, t, t^2], X, Y, t, summarize=true):
Summary:
----------------
Model: 1.9600000+.16500000*t+.11071429*t^2
----------------
Coefficients:
              Estimate   Std. Error   t-value   P(>|t|)
Parameter 1   1.9600     1.1720       1.6724    0.1930
Parameter 2   0.1650     0.7667       0.2152    0.8434
Parameter 3   0.1107     0.1072       1.0325    0.3778
----------------
R-squared: 0.9252, Adjusted R-squared: 0.8753
ls
Here is the same example using the second form for the first argument, falg:
LinearFit(a + b*t + c*t^2, X, Y, t)
The summary can also be returned as an embedded table:
LinearFit([1, t, t^2], X, Y, t, summarize=embed)
Model:
    1.9600000 + 0.16500000*t + 0.11071429*t^2

Coefficients:
              Estimate   Standard Error   t-value    P(>|t|)
Parameter 1   1.96000    1.17199          1.67237    0.193045
Parameter 2   0.165000   0.766748         0.215194   0.843415
Parameter 3   0.110714   0.107226         1.03253    0.377769

R-squared: 0.925169, Adjusted R-squared: 0.875282

Residuals:
Residual Sum of Squares   Residual Mean Square   Residual Standard Error   Degrees of Freedom
1.28771                   0.429238               0.655163                  3

Five Point Summary:
Minimum     First Quartile   Median     Third Quartile   Maximum
-0.891429   -0.290357        0.155714   0.290595         0.548571
And finally using the third form, fop:
constant_function := t -> 1
linear_function := t -> t
quadratic_function := t -> t^2
LinearFit([constant_function, linear_function, quadratic_function], X, Y)
Use the output=solutionmodule option to see the full results.
m := LinearFit([1, t, t^2], X, Y, t, output=solutionmodule)
m := module() ... end module
m:-Results()
residualmeansquare = 0.429238095238095,
residualsumofsquares = 1.28771428571429,
residualstandarddeviation = 0.655162647926525,
degreesoffreedom = 3,
parametervalues = ...,
parametervector = ...,
leastsquaresfunction = 1.96000000000000 + 0.164999999999999*t + 0.110714285714286*t^2,
standarderrors = ...,
confidenceintervals = ...,
rsquared = 0.925169145624351,
rsquaredadjusted = 0.875281909373919,
residuals = ...,
leverages = ...,
variancecovariancematrix = ...,
internallystandardizedresiduals = ...,
externallystandardizedresiduals = ...,
CookDstatistic = ...,
AtkinsonTstatistic = ...,
tvalue = [1.67236839957606, 0.215194489556356, 1.03253056607013],
tprobability = [0.193045057943908, 0.843415034784358, 0.377768512636454]
Consider now an experiment in which the quantities x, y, and z influence a quantity w according to the approximate relationship
w = a*x + b*x^2/y + c*y*z
with unknown parameters a, b, and c. Six data points are given by the following matrix, with respective columns for x, y, z, and w.
ExperimentalData := <<1, 1, 1, 2, 2, 2> | <1, 2, 3, 1, 2, 3> | <1, 2, 3, 4, 5, 6> | <0.531, 0.341, 0.163, 0.641, 0.713, -0.040>>
We can find the fitted model function as follows:
LinearFit([x, x^2/y, y*z], ExperimentalData, [x, y, z])
0.823072918385878*x - 0.167910114211606*x^2/y - 0.0758022678386438*y*z
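The same fit can be reproduced with a general least-squares solver (a NumPy illustration of the computation, not Maple's internal code; the component functions x, x^2/y, and y*z become the columns of the design matrix):

```python
import numpy as np

# The six data points from ExperimentalData, one array per column.
x = np.array([1.0, 1.0, 1.0, 2.0, 2.0, 2.0])
y = np.array([1.0, 2.0, 3.0, 1.0, 2.0, 3.0])
z = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
w = np.array([0.531, 0.341, 0.163, 0.641, 0.713, -0.040])

# Design matrix with one column per component function.
A = np.column_stack([x, x**2 / y, y * z])
a, b, c = np.linalg.lstsq(A, w, rcond=None)[0]
```

The solver returns the same parameter values a, b, c as the LinearFit call above.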
Alternatively, if we have the input and output data separately, we can use the following calling sequence.
Input := ExperimentalData[.., ..3]
Output := ExperimentalData[.., 4]
LinearFit([x, x^2/y, y*z], Input, Output, [x, y, z])
We might want to know the residuals and the parameter values instead of just the model function.
LinearFit([x, x^2/y, y*z], ExperimentalData, [x, y, z], output=[parametervector, residuals])
The XY parameter was introduced in Maple 15.
For more information on Maple 15 changes, see Updates in Maple 15.
The falg parameter was introduced in Maple 17.
For more information on Maple 17 changes, see Updates in Maple 17.
The Statistics[LinearFit] command was updated in Maple 2016.
The summarize option was introduced in Maple 2016.
For more information on Maple 2016 changes, see Updates in Maple 2016.
See Also
CurveFitting
Statistics/Computation
Statistics/Fit
Statistics/Regression
Statistics/Regression/InputForms
Statistics/Regression/Options
Statistics/Regression/Solution