ESTIMATING THE HETEROSKEDASTIC/AUTOCORRELATED LINEAR REGRESSION USING MLE

The purpose of this session is to show you how to estimate and test the heteroskedastic and/or autocorrelated normal general linear models using MLE.

HETEROSKEDASTICITY: Heteroskedasticity can be treated directly in the context of the normal MLE simply by specifying an equation to reflect the form of the heteroskedasticity in place of the variance term in the log likelihood function. This equation can take many different forms to correspond with the type of heteroskedasticity. There could simply be a weight that reflects the variance of each disturbance (White's approach), or the heteroskedasticity could be some linear combination of independent variables that may (or may not) be in the equation for the conditional mean. Other forms include dependent variable heteroskedasticity, where the weighting term is some function of the dependent variable (logs, absolute values, etc.), ARCH, and GARCH. All of these are easily treated in the context of the normal MLE. The example below shows one form, but this program could easily be modified to include all of the other forms.

/* This file demonstrates maximum likelihood estimation of normal models with heteroskedasticity

of various forms. */

clear

/* The next line will read a data file from Greene, Chapter 11. Change the path to find data */

use "c:\users\Wood\Documents\My Teaching\Maximum Likelihood\Data\credit.dta", clear

/* Set the sample and generate a required variable to replicate Greene. */

gen id = _n

keep if (1<=id<=100)

drop if avgexp==0

summarize

gen income2=income^2

/* First, let's estimate the regression using least squares and look at the residuals.

This makes it clear that income is the source of the heteroskedasticity. */

regress avgexp age ownrent income income2

predict res, res

graph twoway (scatter res income), yline(0)

hettest

hettest income

hettest income income2

hettest , mtest(b)

whitetst

imtest, white

/* Other available tests in STATA */

/* Influence Statistics */

predict dfits, dfits

list dfits

predict cooksd, cooksd

list cooksd

predict dfits, dfits

list dfits

predict wd, welsch

list wd

predict covr, covratio

list covr

predict dincome, dfbeta(income)

list dincome

dfbeta

/* Ramsey's reset test for omitted variables */

ovtest

/* Now, let's estimate the heteroskedastic regression the easy way. The following

perform's White's procedure for robust standard errors. */

regress avgexp age ownrent income income2, robust

/* You can also specify a weighted least squares procedure. */

regress avgexp age ownrent income income2 [aweight=income]

/*You can test linear hypotheses using a Wald procedure following STATA's  canned

estimation commands. Here are some examples.*/

test income=0

test income=200

test income income2

test income-age=1

test income=age

test .5*income+2*age=1

/* Now let's introduce STATA's margins command to enhance the interpretation of the model. Will use this later.

i. identifies a variable as a factor variable. */

regress avgexp age i.ownrent income income2

margins ownrent

marginsplot

/* Using margins to interpret an interaction effect. c. identifies a variable as a continuous variable.

## creates the interaction. */

regress avgexp age i.ownrent i.ownrent##c.age income income2

margins ownrent, at(age=(20(1)55))

marginsplot, xdimension(at(age))

/* Now, let's develop our own procedure using the maximize command. Fit the Normal

Maximum Likelihood heteroskedastic regression model. We use the same procedure as last time.*/

program define regressest

args lnf theta1 theta2

quietly replace `lnf'=-(1/2)*ln(2*_pi)-(1/2)*ln(exp(`theta2'))-(1/(2*(exp(`theta2'))))*(\$ML_y1-`theta1')^2

end

/*Now lets use the above function to estimate the model.*/

ml model lf regressest (eq1:avgexp=age ownrent income income2) (eq2:avgexp=income), technique(bhhh)

ml maximize

/* Not close to the wls estimates */

/* Let's try putting in our own starting values */

ml model lf regressest (eq1:avgexp=age ownrent income income2) (eq2:avgexp=income), technique(bfgs)

ml init -3.57 -3.81 254.82 -16.41 -261 1 8 , copy

ml maximize

/* We can use a Wald statistic to test for heteroskedasticity.  The square of the t-statistic

on income in the second equation is a Wald test that the coefficient on income=0.

Note that the test command does not allow testing other hypotheses than zero when using

the ml procedure.  */

test [eq2]income=0

/*We might also want to test the joint hypothesis that both income coefficients in the main

equation are zero. */

test [eq1]income income2

/* Below save the log likelihood from the ml model for additional hypothesis testing.  */

est store A

/* Now let's do a likelihood ratio test for no heteroskedasticity using STATA's canned

procedure for likelihood ratio tests.  First, restimate the model with the restriction

of homoskedasticity.  */

ml model lf regressest (eq1:avgexp=age ownrent  income income2) (eq2:avgexp=)

ml maximize

/* Now calculate the likelihood ratio test for no heteroskedasticity using STATA's canned

procedure */

lrtest A

/*Now clear the program "regresset"*/

program drop regressest

exit

/* This file demonstrates various methods for dealing with autocorrelation within STATA. */

clear

/* The next line will read a data file. Change the path to find data */

use "c:\users\Wood\Documents\My Teaching\Maximum Likelihood\Data\ostrom.dta", clear

/* Set the Sample */

gen id = _n

keep if (1<=id<=22)

summarize us ussr

/* Now let's fit a regression the easy way, save and plot the residuals and do a Wald

test for whether the coefficient on USSR=0.*/

regress us ussr

predict res, res

test ussr=0

graph twoway (scatter res year), yline(0)

tsset year

/* Now let's compute robust (autocorrelation and heteroskedasticity consistent) standard

errors using the Newey-West procedure. */

newey us ussr, lag(1)

/* Now let's do Prais-Winston regression.  This is an iterative procedure that jointly

estimates the coefficients and autocorrelation coefficient.   */

prais us ussr

/*Now let's do a maximum likelihood procedure for Cochrane-Orcutt estimation. Note that

unlike the prais procedure above, we are not fully efficient since we are dropping the

first observation.  It would take a bit more programming to retrieve the first observation

using the Prais-Winston transformation. */

gen uslag=us[_n-1]

gen ussrlag=ussr[_n-1]

drop if(year==67)

program define arregressest

args lnf theta1 theta2 theta3 theta4

quietly replace `lnf'= /*

*/ -(1/2)*ln(2*_pi)-(1/2)*ln(`theta4'^2)-(1/(2*`theta4'^2))*((\$ML_y1-`theta1'-`theta2'*ussr) -`theta3'*(uslag-`theta1'-`theta2'*ussrlag))^2+(1/(2*_N))*ln(1-`theta3'^2)

end

/*Now lets use the above function to estimate the model.  */

ml model lf arregressest (b0:us=) (b1:) (rho:) (sigma:) , technique(nr)

ml init 7 .9 .9 7, copy

ml maximize

/* Now clear the program "arregresset."  */

program drop arregressest

exit