Class Notes, Chapter 11, first part
Reading 11.1-5
Where are we in our roadmap?
- Probability theory
- Probability distributions
- Expectation
- Sampling and sampling distributions
- Estimation
- Hypothesis testing
Next element added: Experimental design
- First example: Simple linear regression
- Assume a linear relationship between X and Y, the independent and dependent
variables
- X is the design part, Y is the random variable (slight change of
notation)
- The model is a linear relationship: Y = β0 + β1X
- The model parameters are slope and intercept, β1 and β0
- Empirical model: Y = β0 + β1X
- Statistical model: Y = β0 + β1X + ε, where ε is called
"noise", "random noise", "error", "random error", etc.
- It is ε that is normally distributed, with mean μ = 0 and
variance σ²
- When sampling, it is the ε's that are independent and
identically distributed
- The population parameters that will be estimated are β1 and
β0, and also the variance of ε, σ²
- Hypotheses will concern β1 and/or β0
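
To make the statistical model concrete, here is a minimal Python sketch that generates data from Y = β0 + β1X + ε with independent, normally distributed ε. The function name `simulate_linear_model`, the design values, and the parameter values (β0 = 2, β1 = 0.5, σ = 1) are illustrative assumptions, not values from the text.

```python
import random


def simulate_linear_model(xs, beta0, beta1, sigma, seed=0):
    """Draw one y per design point x under Y = beta0 + beta1*X + eps."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    ys = []
    for x in xs:
        # eps ~ Normal(mean 0, sd sigma), drawn independently for each x
        eps = rng.gauss(0.0, sigma)
        ys.append(beta0 + beta1 * x + eps)
    return ys


# Hypothetical design: X is fixed by the experimenter, Y is random.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = simulate_linear_model(xs, beta0=2.0, beta1=0.5, sigma=1.0)
```

Setting `sigma=0.0` removes the noise and returns points exactly on the line, which is the empirical model without the ε term.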
Least Squares estimation of model parameters
- Empirical approach to estimation
- Parent model includes predictor variables
- Used for "curve" fitting (Regression, ANOVA)
- Find curve with minimal deviation from all data points, Y
- Define best fit as minimizing the Sum of Squared Errors in Y
- (that is, error is measured vertically to the fitted line)
- (n.b. this is related to the mean, which minimizes SSE about itself)
- xi predictors, yi outcomes
- Y = β1X + β0 + ε (random variable)
- yi = β1xi + β0 + εi (one observation)
- εi = yi - (β1xi + β0) (error for one observation)
- Least Squares estimation of β1 and β0 using the least squares normal
equations:
- (See pp 395 (bottom) - 396 in your text)
- SSE = Σ εi² = Σ (yi - (β1xi + β0))²
- Computational formulas for β1-hat and β0-hat are on p. 396; calculate
β1-hat first and use it to calculate β0-hat, since
β0-hat = y-bar - β1-hat · x-bar
- Predicted Y: plug the parameter estimates into the equation,
Y-hat = β0-hat + β1-hat X
- Predicted ε's, also called residuals: yi - yi-hat
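
The estimation steps above can be sketched in Python. This is a minimal sketch that assumes the standard closed-form solution of the normal equations, β1-hat = Sxy/Sxx and β0-hat = y-bar - β1-hat · x-bar; the function names and the five data points are made up for illustration and do not come from the text.

```python
def fit_line(xs, ys):
    """Least squares estimates (b0, b1) for y = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Corrected sums of squares and cross-products
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx           # slope first
    b0 = y_bar - b1 * x_bar    # then the intercept, which needs the slope
    return b0, b1


def predict(xs, b0, b1):
    """Predicted Y: plug the estimates into the equation."""
    return [b0 + b1 * x for x in xs]


def residuals(xs, ys, b0, b1):
    """Predicted errors: observed y minus predicted y."""
    return [y - y_hat for y, y_hat in zip(ys, predict(xs, b0, b1))]


# Hypothetical data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.0]
b0, b1 = fit_line(xs, ys)
res = residuals(xs, ys, b0, b1)
sse = sum(e ** 2 for e in res)  # the quantity that least squares minimizes
```

With an intercept in the model, the residuals from a least squares fit sum to zero (up to rounding), which is a quick check on the arithmetic.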
- Another example of Least Squares estimation:
- Assume the model is Y = c. That is, a constant relationship
between X and Y. We want to estimate c.
- First step: set up error for a single
observation
- Second step: set up the sum of squared
differences
- Third step: take the partial with respect to
the parameter, c
- Fourth step: set equal to zero and solve
- Y-hat = c-hat
- yi-hat = c-hat
- etc.
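
The four steps above, written out as a worked derivation. This is a sketch; the summation limits and the symbol \bar{y} for the sample mean are notation added here.

```latex
\begin{align*}
\epsilon_i &= y_i - c
  && \text{(error for a single observation)} \\
SSE(c) &= \sum_{i=1}^{n} (y_i - c)^2
  && \text{(sum of squared differences)} \\
\frac{\partial\, SSE}{\partial c} &= -2 \sum_{i=1}^{n} (y_i - c)
  && \text{(partial with respect to } c\text{)} \\
0 &= -2 \sum_{i=1}^{n} (y_i - \hat{c})
  \;\Longrightarrow\; \hat{c} = \frac{1}{n} \sum_{i=1}^{n} y_i = \bar{y}
  && \text{(set equal to zero and solve)}
\end{align*}
```

So the least squares estimate of a constant is the sample mean, which is the sense in which the mean minimizes SSE about itself.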
General Linear Model, GLM
- There is also a matrix version of this model: Y = Xβ + ε
- Y is the data
- X is the design, i.e. the values of the independent
variables (with a column of 1s for the intercept)
- β0 is the intercept
- β1 is the slope
- The estimation equation (from setting the partials of the squared
error to zero): X'Xβ-hat = X'Y, so β-hat = (X'X)⁻¹X'Y
- produces the exact same normal equations and the exact same
estimators.
- Also,
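
As a hedged illustration of the matrix version, the sketch below assumes the standard form in which X has a column of 1s and a column of x values, and the estimates solve X'Xb = X'y. For simple linear regression this is a 2×2 system, so it is solved directly here. The function names and the data are hypothetical.

```python
def normal_equations(xs, ys):
    """Build X'X (2x2) and X'y (length 2) for the design rows [1, x]."""
    X = [[1.0, x] for x in xs]  # column of 1s carries the intercept
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(2)]
           for i in range(2)]
    xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(2)]
    return xtx, xty


def solve_2x2(a, b):
    """Solve a 2x2 linear system a @ x = b using the explicit inverse."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("X'X is singular; the x values must not all be equal")
    x0 = (a[1][1] * b[0] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return x0, x1


# Hypothetical data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.0]
xtx, xty = normal_equations(xs, ys)
b0, b1 = solve_2x2(xtx, xty)  # intercept and slope estimates
```

The entries of X'X are n, Σx, and Σx², and the entries of X'y are Σy and Σxy, which are the same sums that appear in the scalar normal equations.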
Interval estimation and tests of β1 and β0
Interval estimation (confidence intervals) for β1 and β0
- p. 403, p. 406
t-tests for β1 and β0, often H0: β1 = 0
- p. 404, p. 405
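
A minimal sketch of these intervals and tests, assuming the usual textbook formulas: σ²-hat = SSE/(n - 2), se(β1-hat) = sqrt(σ²-hat/Sxx), and se(β0-hat) = sqrt(σ²-hat · (1/n + x-bar²/Sxx)). Check them against the pages cited above. The critical t value is passed in from a t table with n - 2 degrees of freedom; the function names and the data are hypothetical.

```python
import math


def regression_inference(xs, ys):
    """Estimates and standard errors for simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    sigma2_hat = sse / (n - 2)  # two parameters estimated, so n - 2 df
    se_b1 = math.sqrt(sigma2_hat / s_xx)
    se_b0 = math.sqrt(sigma2_hat * (1.0 / n + x_bar ** 2 / s_xx))
    return {"b0": b0, "b1": b1, "sse": sse,
            "se_b0": se_b0, "se_b1": se_b1, "df": n - 2}


def t_statistic(estimate, se, null_value=0.0):
    """Test statistic for H0: parameter = null_value."""
    return (estimate - null_value) / se


def confidence_interval(estimate, se, t_crit):
    """Two-sided interval: estimate +/- t_crit * se."""
    return estimate - t_crit * se, estimate + t_crit * se


# Hypothetical data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.0]
fit = regression_inference(xs, ys)
t0 = t_statistic(fit["b1"], fit["se_b1"])  # tests H0: beta1 = 0
# 3.182 is the tabled t value for a 95% interval with 3 degrees of freedom.
ci_b1 = confidence_interval(fit["b1"], fit["se_b1"], t_crit=3.182)
```

H0: β1 = 0 is rejected at the chosen level when |t0| exceeds the tabled value, which is the same as the confidence interval for β1 excluding zero.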
Coefficient of Determination, R²
R² = 1 - (SSE/SST) = (SST-SSE)/SST = SSR/SST
- R² is interpreted as the proportion of variance accounted for by the regression
model.
- It is often reported as a substitute for a test of the fit of a model
- However, a bloated model with lots of useless predictors can have a huge R²,
because adding predictors never lowers R².
- There are other tests of "best fit" that are used in stepwise regression.
- A real test of model fit can be made as long as there are replications to produce
a good estimate of noise variability; that test is more like the robust ANOVA. (As in
section 11.9 in your text)
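
A short sketch of the R² calculation for a simple linear regression, assuming the usual definitions SST = Σ(yi - y-bar)², SSE = Σ(yi - yi-hat)², and SSR = Σ(yi-hat - y-bar)². The function name and the data are hypothetical.

```python
def sums_of_squares(xs, ys):
    """Fit y = b0 + b1*x by least squares and return (sse, ssr, sst)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    y_hats = [b0 + b1 * x for x in xs]
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))  # unexplained
    ssr = sum((y_hat - y_bar) ** 2 for y_hat in y_hats)          # explained
    sst = sum((y - y_bar) ** 2 for y in ys)                      # total
    return sse, ssr, sst


# Hypothetical data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.0]
sse, ssr, sst = sums_of_squares(xs, ys)
r2 = 1.0 - sse / sst  # equals ssr / sst for a least squares fit with intercept
```

The identity SST = SSR + SSE holds for a least squares fit that includes an intercept, which is why the three expressions for R² agree.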