DataLab is a compact statistics package aimed at exploratory data analysis. Please visit the DataLab Web site for more information.



Simple Regression

Command: Math -> Simple Regression...

The command Math/Simple Regression... calculates a regression between a single descriptor variable and a single response variable. The following regression functions(1) can be parametrized:
straight line: $y = kx + d$
parabolic curve: $y = a + bx + cx^2$
reciprocal curve: $y = 1/(a + bx)$
hyperbolic curve: $y = a + b/x$
logarithmic curve: $y = a + b \ln(x)$
exponential curve: $y = a e^{bx}$
polynomial curve: $y = a_0 + \sum a_i x^i$
centered polynomial: $y = a_0 + \sum a_i (x - x_0)^i$
Hoerl function: $y = k_0 k_1^x x^{k_2}$
normal distribution
natural spline: $y = S_k(x) = a_{k,0} + a_{k,1}(x - x_k) + a_{k,2}(x - x_k)^2 + a_{k,3}(x - x_k)^3$ for $x \in [x_k, x_{k+1}]$ and $k = 0, 1, 2, \ldots, n-1$
smoothed spline: $y = S_k(x) = a_{k,0} + a_{k,1}(x - x_k) + a_{k,2}(x - x_k)^2 + a_{k,3}(x - x_k)^3$ for $x \in [x_k, x_{k+1}]$ and $k = 0, 1, 2, \ldots, n-1$, with smoothing factor $\gamma = 0.0 \ldots 1.0$
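
To illustrate what parametrizing one of these functions involves, the following Python sketch fits the straight line y = kx + d by ordinary least squares and then reuses that fit for the exponential curve by taking logarithms of y. This is not DataLab's implementation: the function names fit_line and fit_exponential and the sample data are hypothetical, and the logarithmic linearization is only one common approach (it minimizes the error in ln(y) rather than in y and requires all y values to be positive).

```python
import math


def fit_line(xs, ys):
    """Return (k, d) of the least-squares line y = k*x + d."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squares and sum of cross-products about the means.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical")
    k = sxy / sxx
    d = y_mean - k * x_mean
    return k, d


def fit_exponential(xs, ys):
    """Return (a, b) of y = a*exp(b*x), fitted as ln(y) = ln(a) + b*x."""
    b, ln_a = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(ln_a), b


# Made-up sample data lying exactly on the respective curves.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
k, d = fit_line(xs, [2.0 * x + 1.0 for x in xs])
a, b = fit_exponential(xs, [3.0 * math.exp(0.5 * x) for x in xs])
print("straight line: k =", k, " d =", d)
print("exponential:   a =", a, " b =", b)
```

The reciprocal, hyperbolic, and logarithmic curves can be reduced to a straight-line fit in a similar way by transforming x or y. The splines and the normal distribution need different methods.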

Hint: Data that span a small range at a large offset may lead to numeric instabilities and round-off errors when calculating higher-order polynomial fits (this is not a deficiency of DataLab but a general problem of limited numeric accuracy in digital computers). A possible remedy is to use centered polynomials with the x0 value set to the center of the x-values (DataLab does this automatically when the "Shifted Polyn." function is selected).
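
The effect described in the hint can be sketched in Python as follows. The example fits a centered parabola to data located near x = 1000000 by solving the normal equations, and prints the size of the fourth-power sum that enters these equations with and without centering. All names are hypothetical, the midpoint of the x-range is assumed as the "center", and the code says nothing about how DataLab solves the problem internally.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so that the inputs stay untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for j in range(col, n + 1):
                m[row][j] -= factor * m[col][j]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_centered_polynomial(xs, ys, degree):
    """Fit y = a0 + sum(a_i * (x - x0)**i); return (x0, [a0, a1, ...])."""
    x0 = (min(xs) + max(xs)) / 2.0  # assumption: center = midpoint of range
    us = [x - x0 for x in xs]
    # Normal equations built from power sums of the centered values.
    power_sums = [sum(u ** p for u in us) for p in range(2 * degree + 1)]
    a = [[power_sums[i + j] for j in range(degree + 1)]
         for i in range(degree + 1)]
    b = [sum(y * u ** i for u, y in zip(us, ys)) for i in range(degree + 1)]
    return x0, solve_linear_system(a, b)


# Made-up data: a small range (10 units) at a large offset (one million).
xs = [1000000.0 + i for i in range(11)]
true_coeffs = [2.0, 0.5, 0.25]
ys = [sum(c * (x - 1000005.0) ** i for i, c in enumerate(true_coeffs))
      for x in xs]

x0, coeffs = fit_centered_polynomial(xs, ys, 2)
raw_s4 = sum(x ** 4 for x in xs)
centered_s4 = sum((x - x0) ** 4 for x in xs)
print("x0 =", x0, " coefficients =", coeffs)
print("sum of x^4 without centering:", raw_s4)
print("sum of (x - x0)^4:           ", centered_s4)
```

Without centering, the fourth-power sum is on the order of 1e25, far beyond the roughly 16 significant digits of double-precision arithmetic, so the small differences between the data points are lost when the equations are solved. After centering, the same sum is a small number that is represented exactly.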

After clicking the command, the user first selects the input (descriptor) variable and then the response variable. A window is then displayed in which the user can choose among the above-mentioned curve types and calculate the corresponding parameters.

The regression results are displayed on five pages:

Regression: This page shows the plot of the data together with the regression curve. Data can be marked by drawing a rectangular window around them.
Residuals: This page shows the residuals plotted against the descriptor. Again, data can be marked by drawing a rectangular window around them.
Distribution of Residuals: This page shows the histogram of the residuals with the ideal normal distribution plotted on top of the histogram.
Details: The most important details on the regression function are shown here: the parameters of the curve, the Durbin-Watson test for serial correlation among the residuals, the Lilliefors test for normality of the residuals, and the ANOVA.
Calculate: On this page the user may calculate values of the dependent variable for particular values of the independent variable. Confidence intervals are calculated both for the mean and for individual values (at a level of confidence specified by the user).
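
For the straight-line case, two of the quantities mentioned above can be sketched in Python: the Durbin-Watson statistic of the residuals, and the half-widths of the confidence intervals for the mean and for an individual value. These are the standard textbook formulas and not necessarily the exact computation DataLab performs. The function name line_diagnostics and the sample data are hypothetical, and the Student t quantile t_crit has to be supplied by the caller for the desired level of confidence.

```python
import math


def line_diagnostics(xs, ys, x_new, t_crit):
    """Fit y = k*x + d and return diagnostics for the prediction at x_new.

    t_crit is the two-sided Student t quantile for n - 2 degrees of
    freedom at the chosen level of confidence (taken from a table).
    """
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    k = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    d = y_mean - k * x_mean

    residuals = [y - (k * x + d) for x, y in zip(xs, ys)]
    sse = sum(e * e for e in residuals)

    # Durbin-Watson statistic: ranges from 0 to 4; values near 2 suggest
    # no first-order serial correlation among the residuals.
    dw = sum((residuals[i] - residuals[i - 1]) ** 2
             for i in range(1, n)) / sse

    # Residual standard error with n - 2 degrees of freedom.
    s = math.sqrt(sse / (n - 2))
    spread = 1.0 / n + (x_new - x_mean) ** 2 / sxx
    return {
        "y_hat": k * x_new + d,
        "durbin_watson": dw,
        # Interval for the mean response at x_new.
        "half_width_mean": t_crit * s * math.sqrt(spread),
        # Interval for a single new observation at x_new (always wider).
        "half_width_individual": t_crit * s * math.sqrt(1.0 + spread),
    }


# Made-up sample data; 2.776 is the t quantile for 95 % and 4 degrees
# of freedom.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
result = line_diagnostics(xs, ys, x_new=3.5, t_crit=2.776)
for name, value in result.items():
    print(name, "=", value)
```

The interval for individual values is wider than the interval for the mean because it includes the scatter of a single observation in addition to the uncertainty of the fitted line.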

Several shortcut buttons to the left of the chart area support the following actions:

switching the axes: The currently selected variable can be switched by clicking the arrow buttons. The top buttons affect the dependent variable, the bottom buttons the independent variable.
selecting new variables: Another variable may be selected by using the eye dropper button. Clicking this button lets you pick a particular variable in the variable selection dialog.
removing marks: Marked data can be unmarked by clicking this button.
copying residuals: Clicking this button copies the residuals of the regression into the matrix clipboard.
saving the regression equation: The regression equation can be saved as a script file. Please note that the destination variable is undefined and is indicated by the placeholder "CX$$".



(1) Some of these functions are not available in the evaluation copy.


Last Update: 2012-Jul-28