```
use Statistics::Regression;

# Create regression object
my $reg = Statistics::Regression->new( "sample regression", [ "const", "someX", "someY" ] );

# Add data points
$reg->include( 2.0, [ 1.0, 3.0, -1.0 ] );
$reg->include( 1.0, [ 1.0, 5.0, 2.0 ] );
$reg->include( 20.0, [ 1.0, 31.0, 0.0 ] );
$reg->include( 15.0, [ 1.0, 11.0, 2.0 ] );
```

or

```
my %d;
$d{const} = 1.0; $d{someX} = 5.0; $d{someY} = 2.0; $d{ignored} = "anything else";
$reg->include( 3.0, \%d ); # names are picked off the Regression specification
```

Please note that *you* must provide the constant if you want one.

```
# Finally, print the result
$reg->print();
```

This prints the following:

```
****************************************************************
Regression sample regression
****************************************************************
Name                   Theta      StdErr     T-stat
[0=const]             0.2950      6.0512       0.05
[1=someX]             0.6723      0.3278       2.05
[2=someY]             1.0688      2.7954       0.38
R^2= 0.808, N= 4
****************************************************************
```

The hash input method has the advantage that you can fill the
observation hashes with all of your variables and reuse the same
code for every regression, changing the regression specification
in one and only one spot (the *new()* invocation). You do not
need to change the inputs in the *include()* statement.
For example,

```
my @obs; ## a global variable. observations are like: %oneobs= %{$obs[1]};

sub run_regression {
  # arguments: title, name of the dependent variable, list of regressor names
  my ( $title, $yname, $xnames ) = @_;
  my $reg = Statistics::Regression->new( $title, $xnames );
  # each observation hash supplies both the y value and the regressors
  foreach my $obshashptr (@obs) { $reg->include( $obshashptr->{$yname}, $obshashptr ); }
  $reg->print();
}

run_regression( "bivariate regression", "someY", [ "const", "someX" ] );
run_regression( "trivariate regression", "someY", [ "const", "someX", "someZ" ] );
```

Of course, you can use the subroutines to do the printing work yourself:

```
my @theta   = $reg->theta();
my @se      = $reg->standarderrors();
my $rsq     = $reg->rsq();
my $adjrsq  = $reg->adjrsq();
my $ybar    = $reg->ybar();    ## the average of the y vector
my $sst     = $reg->sst();     ## the sum-squares-total
my $sigmasq = $reg->sigmasq(); ## the variance of the residual
my $k       = $reg->k();       ## the number of variables
my $n       = $reg->n();       ## the number of observations
```

In addition, there are some other helper routines, and a
subroutine *linearcombination_variance()*. If you don’t know what
this is, don’t use it.

```
W. M. Gentleman, University of Waterloo, "Basic
Description For Large, Sparse Or Weighted Linear Least
Squares Problems (Algorithm AS 75)," Applied Statistics
(1974) Vol 23; No. 3
```

Gentleman’s algorithm is *the* statistical standard. Observations
can be inserted one at a time, at any point (WITH A WEIGHT!), and
each insertion takes only time quadratic in the number of
independent variables. The storage requirement is likewise of
quadratic order in the number of independent variables, not in
the number of observations. A practically unlimited number of
observations can therefore be processed easily!
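
To make the cost claims concrete, here is a minimal sketch of one-at-a-time weighted updating with quadratic storage. It is *not* Gentleman’s algorithm and not the module’s code: it simply accumulates the weighted normal equations and solves them by elimination, which is easier to read but numerically less robust than the square-root-free Givens updates of AS 75. The names *add_observation()* and *solve_theta()* are hypothetical helpers invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::Regression): fold one
# weighted observation into the running sums X'WX (k x k) and X'Wy (k).
# Work and storage are quadratic in k, independent of how many
# observations have already been seen.
sub add_observation {
    my ( $state, $y, $x, $w ) = @_;
    $w = 1.0 unless defined $w;    # the weight is optional in this sketch
    my $k = scalar @$x;
    for my $i ( 0 .. $k - 1 ) {
        $state->{xty}[$i] += $w * $x->[$i] * $y;
        for my $j ( 0 .. $k - 1 ) {
            $state->{xtx}[$i][$j] += $w * $x->[$i] * $x->[$j];
        }
    }
    $state->{n}++;
    return;
}

# Hypothetical helper: solve (X'WX) theta = X'Wy by Gauss-Jordan
# elimination with partial pivoting. The sums are copied, so more
# observations can still be added afterwards.
sub solve_theta {
    my ($state) = @_;
    my $k = scalar @{ $state->{xty} };
    my @m = map { [ @{ $state->{xtx}[$_] }, $state->{xty}[$_] ] } 0 .. $k - 1;
    for my $col ( 0 .. $k - 1 ) {
        my $piv = $col;
        for my $r ( $col + 1 .. $k - 1 ) {
            $piv = $r if abs( $m[$r][$col] ) > abs( $m[$piv][$col] );
        }
        @m[ $col, $piv ] = @m[ $piv, $col ];
        die "singular system\n" if abs( $m[$col][$col] ) < 1e-12;
        for my $r ( 0 .. $k - 1 ) {
            next if $r == $col;
            my $f = $m[$r][$col] / $m[$col][$col];
            $m[$r][$_] -= $f * $m[$col][$_] for $col .. $k;
        }
    }
    return map { $m[$_][$k] / $m[$_][$_] } 0 .. $k - 1;
}

# The four observations from the sample regression above.
my %state = ( xtx => [], xty => [], n => 0 );
add_observation( \%state, 2.0,  [ 1.0, 3.0,  -1.0 ] );
add_observation( \%state, 1.0,  [ 1.0, 5.0,  2.0 ] );
add_observation( \%state, 20.0, [ 1.0, 31.0, 0.0 ] );
add_observation( \%state, 15.0, [ 1.0, 11.0, 2.0 ] );

my @theta = solve_theta( \%state );
printf "%-6s %8.4f\n", $_, shift @{ [ $theta[ { const => 0, someX => 1, someY => 2 }->{$_} ] ] }
    for qw(const someX someY);
```

With these four observations the sketch reproduces the Theta column printed earlier (0.2950, 0.6723, 1.0688). Passing a fourth argument to *add_observation()* weights that observation; in this sketch a weight of 2 has the same effect on the sums as adding the observation twice.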