Class Notes, Chapter 11, first part

Reading 11.1-5

Where are we in our roadmap?

Probability theory
Probability distributions
Expectation
Sampling and sampling distributions
Estimation
Hypothesis testing

Next element added: Experimental design

First example: Simple linear regression
Assume a linear relationship between X and Y, the independent and dependent variables
X is the design part, Y is the random variable (slight change of notation)
The model is a linear relationship: Y = β₀ + β₁X
The model parameters are slope and intercept, β₁ and β₀
Empirical model: Y = β₀ + β₁X
Statistical model: Y = β₀ + β₁X + ε, where ε is called "noise", "random noise", "error", "random error", etc.
It is ε that is normally distributed, with mean μ = 0 and variance σ²
When sampling, it is the εᵢ that are independent and identically distributed
The population parameters that will be estimated are β₁ and β₀, and also the variance of ε, σ²
Hypotheses will concern β₁ and/or β₀
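A minimal sketch of the statistical model, to make the roles of the design and the noise concrete. The function name `simulate`, the parameter values, and the seed are illustrative assumptions, not taken from the text. The x values are fixed by the experimenter, and only the noise term is random.

```python
import random

def simulate(xs, beta0, beta1, sigma, seed=0):
    """Draw one y per design point x from Y = beta0 + beta1*x + epsilon.

    epsilon is normal with mean 0 and standard deviation sigma,
    drawn independently for each observation.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

xs = [1, 2, 3, 4, 5]  # the design: chosen, not random

# With sigma = 0 the statistical model collapses to the empirical line.
noise_free = simulate(xs, beta0=2.0, beta1=0.5, sigma=0.0)

# With sigma > 0 each y scatters vertically about that line.
ys = simulate(xs, beta0=2.0, beta1=0.5, sigma=1.0)
```

Rerunning with a different seed gives a different sample from the same population line, which is the sense in which the estimates of β₀ and β₁ have sampling distributions.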
Least Squares estimation of model parameters
Empirical approach to estimation
Parent model includes predictor variables
Used for "curve" fitting (Regression, ANOVA)
Find curve with minimal deviation from all data points, Y
Define best fit as minimizing the Sum of Squared Errors in Y
(that is, error is measured vertically to the fitted line)
(n.b. this is related to the mean, which minimizes SSE about itself)
xᵢ predictors, yᵢ outcomes
Y = β₁X + β₀ + ε (random variable)
yᵢ = β₁xᵢ + β₀ + εᵢ (one observation)
εᵢ = yᵢ − (β₁xᵢ + β₀) (error for one observation)
Least Squares estimation of β₁ and β₀ using the least squares normal equations:
(See pp 395 (bottom) - 396 in your text)
SSE = Σ εᵢ² = Σ [yᵢ − (β₁xᵢ + β₀)]²
Computational formulas for β₁-hat and β₀-hat are on p. 396; calculate β₁-hat first and use it to calculate β₀-hat, since β₀-hat = ȳ − β₁-hat·x̄
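For reference, the standard derivation of the normal equations can be sketched as follows. The notation here is the usual textbook form and may differ in detail from pp. 395-396; sums run over i = 1, ..., n.

```latex
\begin{align*}
SSE &= \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2 \\
\frac{\partial SSE}{\partial \beta_0} &= -2 \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right) = 0 \\
\frac{\partial SSE}{\partial \beta_1} &= -2 \sum_{i=1}^{n} x_i \left(y_i - \beta_0 - \beta_1 x_i\right) = 0 \\
\intertext{which rearrange to the normal equations}
n \hat\beta_0 + \hat\beta_1 \sum x_i &= \sum y_i \\
\hat\beta_0 \sum x_i + \hat\beta_1 \sum x_i^2 &= \sum x_i y_i \\
\intertext{with solution}
\hat\beta_1 &= \frac{\sum (x_i - \bar x)(y_i - \bar y)}{\sum (x_i - \bar x)^2}, \qquad
\hat\beta_0 = \bar y - \hat\beta_1 \bar x
\end{align*}
```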
Predicted Y: plug the parameter estimates into the equation, Y-hat = β₀-hat + β₁-hat X
Predicted ε's (yᵢ − yᵢ-hat), also called residuals
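The estimation, prediction, and residual steps above can be sketched in a few lines. This is a minimal illustration using the usual closed-form least squares estimators; the function name `fit_line` and the five data points are made up for the example.

```python
def fit_line(xs, ys):
    """Least squares estimates (b0, b1) for y = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx          # slope first
    b0 = y_bar - b1 * x_bar   # then intercept, which needs the slope
    return b0, b1

# Hypothetical data, chosen only so the arithmetic is easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
y_hat = [b0 + b1 * x for x in xs]                 # predicted Y
residuals = [y - yh for y, yh in zip(ys, y_hat)]  # predicted errors
sse = sum(e ** 2 for e in residuals)              # the quantity being minimized
```

For these data the slope is 0.6 and the intercept is 2.2, and the residuals sum to zero (up to rounding), which is what the first normal equation requires.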
Another example of Least Squares estimation:
Assume the model is Y = c. That is, a constant relationship between X and Y. We want to estimate c.
First step: set up error for a single observation
Second step: set up the sum of squared differences
Third step: take the partial with respect to the parameter, c
Fourth step: set equal to zero and solve
Y-hat = c-hat
yᵢ-hat = c-hat
etc.
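The four steps above, written out for the constant model (a sketch in standard notation, with sums over i = 1, ..., n):

```latex
\begin{align*}
\text{Step 1:}\quad & e_i = y_i - c \\
\text{Step 2:}\quad & SSE(c) = \sum_{i=1}^{n} (y_i - c)^2 \\
\text{Step 3:}\quad & \frac{\partial SSE}{\partial c} = -2 \sum_{i=1}^{n} (y_i - c) \\
\text{Step 4:}\quad & -2 \sum_{i=1}^{n} (y_i - \hat c) = 0
  \;\Longrightarrow\; \sum_{i=1}^{n} y_i = n \hat c
  \;\Longrightarrow\; \hat c = \bar y
\end{align*}
```

So the least squares estimate of a constant is the sample mean, which is the sense in which the mean minimizes the SSE about itself.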

General Linear Model, GLM

There is also a matrix version of this model: Y = Xβ + ε
Y is the data
X is the design, i.e. the values of the independent variables
β₀ is the intercept
β₁ is the slope
The estimation equation (from setting the partials of the squared error to zero), β-hat = (X'X)⁻¹X'Y,
produces the exact same normal equations and the exact same estimators.
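A small sketch of the matrix route, assuming the usual form in which X has a column of ones (for the intercept) next to the column of x values, so that the normal equations are (X'X) β-hat = X'Y. The function name and data are illustrative. It uses plain Python and the explicit inverse of a 2×2 matrix rather than a linear algebra library.

```python
def normal_equations_fit(xs, ys):
    """Solve (X'X) b = X'y for simple linear regression."""
    # Design matrix: one row per observation, columns [1, x].
    X = [[1.0, float(x)] for x in xs]
    # X'X is 2x2 and X'y has length 2.
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(2)]
           for i in range(2)]
    xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(2)]
    # Explicit 2x2 inverse: [[a, b], [c, d]]^-1 = [[d, -b], [-c, a]] / det.
    (a, b), (c, d) = xtx
    det = a * d - b * c
    b0 = (d * xty[0] - b * xty[1]) / det
    b1 = (-c * xty[0] + a * xty[1]) / det
    return b0, b1

# Hypothetical data, chosen only so the arithmetic is easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = normal_equations_fit(xs, ys)
```

Here X'X = [[5, 15], [15, 55]] and X'Y = [20, 66], giving intercept 2.2 and slope 0.6, the same values the scalar formulas give for these data.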
Also,

Interval estimation and tests of β₁ and β₀

Interval estimation (confidence intervals) for β₁ and β₀
p. 403, p. 406
t-tests for β₁ and β₀, often H₀: β₁ = 0
p. 404, p. 405
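A hedged sketch of the slope and intercept inference, using the usual formulas s² = SSE/(n − 2), SE(β₁-hat) = s/√Sxx, and SE(β₀-hat) = s·√(Σxᵢ²/(n·Sxx)); check these against the pages cited above, since the text's notation may differ. The function name and data are illustrative, and the critical value is passed in from a t table because Python's standard library has no t distribution.

```python
import math

def slope_inference(xs, ys, t_crit):
    """t statistic for H0: beta1 = 0 and confidence intervals for beta1, beta0.

    t_crit is the two-sided critical value t(alpha/2, n - 2), looked up
    in a t table by the caller.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s2 = sse / (n - 2)                      # estimate of sigma squared
    se_b1 = math.sqrt(s2 / s_xx)
    se_b0 = math.sqrt(s2 * sum(x * x for x in xs) / (n * s_xx))
    return {
        "t_b1": b1 / se_b1,                 # test statistic for H0: beta1 = 0
        "ci_b1": (b1 - t_crit * se_b1, b1 + t_crit * se_b1),
        "ci_b0": (b0 - t_crit * se_b0, b0 + t_crit * se_b0),
        "se_b1": se_b1,
        "se_b0": se_b0,
    }

# Hypothetical data; n = 5, so df = 3 and t(0.025, 3) = 3.182 from a t table.
result = slope_inference([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], t_crit=3.182)
```

For these data the t statistic is about 2.12, which is smaller than 3.182, so H₀: β₁ = 0 is not rejected at the 5% level. Equivalently, the 95% confidence interval for β₁ contains zero.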

Coefficient of Determination, R²
R² = 1 − (SSE/SST) = (SST − SSE)/SST = SSR/SST
R² is interpreted as the proportion of variance accounted for by the regression model.
It is often reported as a substitute for a test of the fit of a model.
However, a bloated model with lots of useless predictors can have a large R², because adding predictors never decreases R².
There are other tests of "best fit" that are used in stepwise regression.
A real test of model fit can be made as long as there are replications to produce a good estimate of noise variability; that test is more like the robust ANOVA. (As in section 11.9 in your text)
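The sums of squares behind R² can be computed directly. This is a minimal sketch for the simple linear regression case, with an illustrative function name and made-up data.

```python
def sums_of_squares(xs, ys):
    """Return (SST, SSE, SSR) for a least squares line fitted to (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)               # total
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # error
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)          # regression
    return sst, sse, ssr

# Hypothetical data, chosen only so the arithmetic is easy to follow.
sst, sse, ssr = sums_of_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r2 = 1 - sse / sst
```

For these data SST = 6, SSE = 2.4, and SSR = 3.6, so R² = 0.6 by any of the three equivalent forms.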


