**ESTIMATING THE HETEROSKEDASTIC/AUTOCORRELATED LINEAR REGRESSION USING MLE**

The purpose of this session is to show you how to estimate and test the heteroskedastic and/or autocorrelated normal general linear models using MLE.

HETEROSKEDASTICITY:
Heteroskedasticity can be treated directly in the context of the normal MLE simply by specifying an equation that reflects the form of the heteroskedasticity in place of the constant variance term in the log likelihood function. This equation can take many different forms to match the type of heteroskedasticity. There could simply be a weight that reflects the variance of each disturbance (White's approach), or the heteroskedasticity could be some linear combination of independent variables that may (or may not) appear in the equation for the conditional mean. Other forms include dependent-variable heteroskedasticity, where the weighting term is some function of the dependent variable (logs, absolute values, etc.), ARCH, and GARCH. All of these are easily treated in the context of the normal MLE. The example below shows one form, but the program could easily be modified to cover the others.
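For reference, the per-observation log likelihood for the multiplicative form estimated below can be sketched as follows, modeling the variance as $\sigma_i^2 = \exp(z_i'\alpha)$, where $z_i$ may share variables with the conditional-mean regressors $x_i$:

```latex
\ln L_i = -\frac{1}{2}\ln(2\pi) - \frac{1}{2}\,z_i'\alpha
          - \frac{(y_i - x_i'\beta)^2}{2\exp(z_i'\alpha)}
```

Homoskedasticity is the special case in which every slope in $\alpha$ is zero, so the constant in $\alpha$ is simply $\ln\sigma^2$.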

/* This file demonstrates maximum likelihood estimation of normal models with heteroskedasticity of various forms. */

clear

/* The next line reads a data file from Greene, Chapter 11. Change the path to find the data. */

use "c:\users\Wood\Documents\My Teaching\Maximum Likelihood\Data\credit.dta", clear

/* Set the sample and generate a required variable to replicate Greene. */

gen id = _n

keep if id>=1 & id<=100   /* Stata does not chain inequalities; (1<=id<=100) would keep everything */

drop if avgexp==0

summarize

gen income2=income^2

/* First, let's estimate the regression using least squares and look at the residuals. This makes it clear that income is the source of the heteroskedasticity. */

regress avgexp age ownrent income income2

predict res, res

graph twoway (scatter res income), yline(0)

hettest

hettest income

hettest income income2

hettest , mtest(b)

whitetst   /* user-written command; install with ssc install whitetst */

imtest, white

/* Other
available tests in STATA */

/* Influence statistics */

predict dfits, dfits

list dfits

predict cooksd, cooksd

list cooksd

predict wd, welsch

list wd

predict covr, covratio

list covr

predict dincome, dfbeta(income)

list dincome

dfbeta

/* Ramsey's RESET test for omitted variables */

ovtest

/* Now, let's estimate the heteroskedastic regression the easy way. The following performs White's procedure for robust standard errors. */

regress avgexp age ownrent income income2, robust

/* You can also specify a weighted least squares procedure. */

regress avgexp age ownrent income income2 [aweight=income]

/* You can test linear hypotheses using a Wald procedure following STATA's canned estimation commands. Here are some examples. */

test income=0

test income=200

test income income2

test income-age=1

test income=age

test .5*income+2*age=1

/* Now let's introduce STATA's margins command to enhance the interpretation of the model. We will use this later. i. identifies a variable as a factor variable. */

regress avgexp age i.ownrent income income2

margins ownrent

marginsplot

/* Using margins to interpret an interaction effect. c. identifies a variable as a continuous variable. ## creates the interaction along with the main effects. */

regress avgexp i.ownrent##c.age income income2   /* ## already includes the main effects of ownrent and age */

margins ownrent, at(age=(20(1)55))

marginsplot, xdimension(at(age))

/* Now, let's develop our own procedure using the ml commands. Fit the normal maximum likelihood heteroskedastic regression model. We use the same procedure as last time. */

program define regressest

args lnf theta1 theta2

quietly replace `lnf' = -(1/2)*ln(2*_pi) - (1/2)*ln(exp(`theta2')) ///
	- (1/(2*exp(`theta2')))*($ML_y1-`theta1')^2

end
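In the program's notation, with `theta1` the conditional mean and exp(`theta2`) the variance, the line above codes the per-observation log likelihood

```latex
\ln L_i = -\tfrac{1}{2}\ln(2\pi) - \tfrac{1}{2}\,\theta_2
          - \frac{(y_i - \theta_1)^2}{2\exp(\theta_2)}
```

Note that `-(1/2)*ln(exp(theta2))` simplifies to $-\theta_2/2$; the exp() parameterization guarantees the implied variance is positive for any value of `theta2`.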

/* Now let's use the above function to estimate the model. */

ml model lf regressest (eq1:avgexp=age ownrent income income2) ///
	(eq2:avgexp=income), technique(bhhh)

ml maximize

/* Not close to the WLS estimates. */

/* Let's try putting in our own starting values. */

ml model lf regressest (eq1:avgexp=age ownrent income income2) ///
	(eq2:avgexp=income), technique(bfgs)

ml init -3.57 -3.81 254.82 -16.41 -261 1 8, copy

ml maximize

/* We can use a Wald statistic to test for heteroskedasticity. The square of the t-statistic on income in the second equation is a Wald test that the coefficient on income is zero. Note that after the ml procedure the test command does not allow testing hypotheses other than zero. */

test [eq2]income=0
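The statistic this command computes is, with $\hat{\alpha}$ the estimated coefficient on income in eq2:

```latex
W = \left(\frac{\hat{\alpha}}{\widehat{\operatorname{se}}(\hat{\alpha})}\right)^2 \;\sim\; \chi^2(1)
```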

/* We might also want to test the joint hypothesis that both income coefficients in the main equation are zero. */

test [eq1]income [eq1]income2

/* Store the estimates (including the log likelihood) from the ml model for additional hypothesis testing. */

est store A

/* Now let's do a likelihood ratio test for no heteroskedasticity using STATA's canned procedure for likelihood ratio tests. First, re-estimate the model with the restriction of homoskedasticity. */

ml model lf regressest (eq1:avgexp=age ownrent income income2) (eq2:avgexp=)

ml maximize

/* Now calculate the likelihood ratio test for no heteroskedasticity using STATA's canned procedure. */

lrtest A
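The statistic lrtest reports is

```latex
LR = 2\,(\ln L_{U} - \ln L_{R}) \;\sim\; \chi^2(q)
```

where $\ln L_U$ is the log likelihood stored in A, $\ln L_R$ comes from the restricted (homoskedastic) model just fit, and $q = 1$ here (the income slope in the variance equation).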

/* Now clear the program "regressest". */

program drop regressest

exit

/* This file demonstrates various methods for dealing with autocorrelation within STATA. */

clear

/* The next line reads a data file. Change the path to find the data. */

use "c:\users\Wood\Documents\My Teaching\Maximum Likelihood\Data\ostrom.dta", clear

/* Set the sample. */

gen id = _n

keep if id>=1 & id<=22

summarize us ussr

/* Now let's fit a regression the easy way, save and plot the residuals, and do a Wald test of whether the coefficient on USSR is zero. */

regress us ussr

predict res, res

test ussr=0

graph twoway (scatter res year), yline(0)

tsset year

/* Now let's compute robust (autocorrelation and heteroskedasticity consistent) standard errors using the Newey-West procedure. */

newey us ussr, lag(1)
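Using Bartlett kernel weights $w_l = 1 - l/(L+1)$, with lag(1) so $L = 1$ here, the Newey-West covariance estimator takes the familiar sandwich form:

```latex
\widehat{\operatorname{Var}}(b) = (X'X)^{-1}
\left[\sum_{t=1}^{T} e_t^2\, x_t x_t'
+ \sum_{l=1}^{L} w_l \sum_{t=l+1}^{T} e_t e_{t-l}\,(x_t x_{t-l}' + x_{t-l} x_t')\right]
(X'X)^{-1}
```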

/* Now let's do Prais-Winsten regression. This is an iterative procedure that jointly estimates the regression coefficients and the autocorrelation coefficient. */

prais us ussr
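For a single regressor, the Prais-Winsten GLS transformation underlying this command is

```latex
\begin{aligned}
y_t - \rho y_{t-1} &= (1-\rho)\beta_0 + \beta_1 (x_t - \rho x_{t-1}) + \varepsilon_t, & t = 2,\dots,T,\\
\sqrt{1-\rho^2}\; y_1 &= \sqrt{1-\rho^2}\,(\beta_0 + \beta_1 x_1) + \varepsilon_1^{*}, & t = 1.
\end{aligned}
```

Cochrane-Orcutt uses only the first line and discards $t = 1$, which is why it is less efficient in small samples.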

/* Now let's do a maximum likelihood procedure for Cochrane-Orcutt estimation. Note that unlike the prais procedure above, we are not fully efficient because we drop the first observation. It would take a bit more programming to retrieve the first observation using the Prais-Winsten transformation. */

gen uslag=us[_n-1]

gen ussrlag=ussr[_n-1]

drop if year==67

program define arregressest

args lnf theta1 theta2 theta3 theta4

quietly replace `lnf' = -(1/2)*ln(2*_pi) - (1/2)*ln(`theta4'^2) ///
	- (1/(2*`theta4'^2))*(($ML_y1-`theta1'-`theta2'*ussr) ///
	- `theta3'*(uslag-`theta1'-`theta2'*ussrlag))^2 ///
	+ (1/(2*_N))*ln(1-`theta3'^2)

end
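In the program's notation, with $\rho =$ `theta3`, $\sigma =$ `theta4`, and ussr as the regressor, each observation contributes

```latex
\ln L_t = -\tfrac{1}{2}\ln(2\pi) - \tfrac{1}{2}\ln\sigma^2
- \frac{\left[(y_t - \theta_1 - \theta_2\,\mathrm{ussr}_t) - \rho\,(y_{t-1} - \theta_1 - \theta_2\,\mathrm{ussr}_{t-1})\right]^2}{2\sigma^2}
+ \frac{1}{2N}\ln(1-\rho^2)
```

The last term spreads the $\tfrac{1}{2}\ln(1-\rho^2)$ Jacobian adjustment evenly over the $N$ observations.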

/* Now let's use the above function to estimate the model. */

ml model lf arregressest (b0:us=) (b1:) (rho:) (sigma:), technique(nr)

ml init 7 .9 .9 7, copy

ml maximize

/* Now clear the program "arregressest". */

program drop arregressest

exit