Chapter 6: Linear Regression with Multiple Regressors

Simple Omitted Variables Problem

Motivation: It is very difficult to identify the effect of one independent variable, X, on the dependent variable, Y, without considering the other factors.

Example: We can measure the effect of schooling more precisely if we compare people who have exactly the same characteristics except their schooling. (There have been many studies of twins.) If we find a difference in wage rates between two people who are identical except for schooling, we can conclude that the difference is due to the difference in schooling. Unfortunately, in social science it is impossible to find even two people who are exactly the same except for one factor. (Even twins differ in many ways.)

Question: Is there an omitted variables problem in the regression of test scores on the student-teacher ratio? If yes, what are the omitted variables? Please give examples.

Answers:
- Percentage of English learners
- School district
- Parents' education background
- IQ test score
- Time of day of the test
- Parking lot space per pupil

Omitted variable bias arises when both conditions hold:
- the omitted variable is correlated with the included regressor; and
- the omitted variable is a determinant of the dependent variable.

Omitted variable bias and the first least squares assumption: E(ui|Xi) ≠ 0. The bias does not vanish even in large samples, so the OLS estimator is inconsistent.

A formula for omitted variable bias:

Beta1_hat → Beta1 + corr(X,u) * (Var(u)/Var(X))^0.5   as the sample size grows.

Whether the bias is large or small in practice depends on the correlation between the regressor and the error term, corr(X,u). The direction of the bias in Beta1_hat depends on whether X and u are positively or negatively correlated.
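The formula can be checked with a small simulation. The sketch below is illustrative and not part of the original notes: the data-generating process, coefficient values, and variable names are assumptions, chosen so that an omitted variable W affects both X and Y; the slope from regressing Y on X alone is then compared with the bias the formula predicts.

```python
import numpy as np

# Minimal simulation sketch of omitted variable bias (illustrative setup).
rng = np.random.default_rng(0)
n = 100_000
beta1, gamma = 1.0, 2.0                   # true effect of X; effect of the omitted W

W = rng.normal(size=n)                    # omitted variable
X = 0.5 * W + rng.normal(size=n)          # X is correlated with W
Y = beta1 * X + gamma * W + rng.normal(size=n)

u = Y - beta1 * X                         # error term of the short regression: gamma*W + noise
beta1_hat = np.cov(X, Y, bias=True)[0, 1] / np.var(X)   # OLS slope of Y on X alone

# Bias predicted by the formula: corr(X,u) * (Var(u)/Var(X))^0.5
predicted_bias = np.corrcoef(X, u)[0, 1] * np.sqrt(np.var(u) / np.var(X))

print(f"beta1_hat          = {beta1_hat:.3f}")   # biased upward here, since corr(X,u) > 0
print(f"actual bias        = {beta1_hat - beta1:.3f}")
print(f"formula prediction = {predicted_bias:.3f}")
```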

Addressing Omitted Variable Bias by Dividing the Data into Groups

Example 1: We are interested in the effect of the student-teacher ratio on test scores, holding constant other factors, including the percentage of English learners. Once we hold the percentage of English learners constant, the difference in performance between districts with high and low student-teacher ratios is perhaps half (or less) of the overall estimate of 7.4 points.

Example 2: In rural areas of Uganda, non-farm income provides much-needed cash to farm households. Most educated men and women in rural areas have opportunities to hold regular jobs that pay a constant monthly wage throughout the year. People who are not fortunate enough to hold regular jobs are likely to earn non-farm income from small self-employed businesses such as making baskets or trading goods.

Suppose that we are interested in gender differences in non-farm income and want to test the hypothesis that women earn less non-farm income than men. A simple comparison of non-farm income between men and women does not provide a reliable test of this hypothesis if the characteristics of men and women are not similar. One major factor in non-farm income is education: if men are better educated than women and men earn more non-farm income than women, we cannot be sure whether the higher non-farm income is a result of gender or of education.

For example, let's use data from Uganda, collected by FASID in collaboration with Makerere University in 2003. The data cover 940 households; among them we find 648 people who earned some income from non-farm activities. The table reports average non-farm income in US$, average years of schooling, age, and the number of observations for men and women: men are slightly better educated and earn more non-farm income. To compare non-farm income between men and women while holding education constant, we create categories for education: no education (category 0), 1-4 years of schooling (1), 5-7 years of schooling (2), 8-11 years of schooling (3), and 12 or more years of schooling (4). The comparison is then made within each category, as sketched below.
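A minimal sketch of this grouped comparison follows. It is not from the original notes: the file name uganda_nonfarm.csv and the column names income, female, and schooling are hypothetical placeholders for the actual data.

```python
import pandas as pd

# Illustrative sketch: file and column names are assumptions, not the original data.
df = pd.read_csv("uganda_nonfarm.csv")

# Education categories as defined in the notes:
# 0: no schooling, 1: 1-4 years, 2: 5-7 years, 3: 8-11 years, 4: 12+ years.
bins = [-1, 0, 4, 7, 11, float("inf")]
df["educ_cat"] = pd.cut(df["schooling"], bins=bins, labels=[0, 1, 2, 3, 4])

# Mean non-farm income by gender within each education category:
# comparing men and women inside a category holds education (roughly) constant.
table = df.groupby(["educ_cat", "female"], observed=True)["income"].agg(["mean", "size"])
print(table)
```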

The Multiple Regression Model

The multiple regression model permits estimating the effect on Y of changing one variable (say X1) while holding the other regressors constant. The slope coefficient Beta_j on the regressor X_j is the partial effect on Y of X_j, holding the other factors fixed.

The OLS Estimator in Multiple Regression
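As a concrete illustration of the OLS estimator in multiple regression (not taken from the original notes; the data below are simulated with assumed coefficients), the estimates solve a least-squares problem, Beta_hat = (X'X)^(-1) X'y, and can be computed with a standard solver:

```python
import numpy as np

# Minimal sketch of the OLS estimator in multiple regression,
# computed via a least-squares solver (equivalent to (X'X)^(-1) X'y).
rng = np.random.default_rng(1)
n = 500
str_ = rng.uniform(14, 26, size=n)        # student-teacher ratio (simulated)
pct_el = rng.uniform(0, 80, size=n)       # percentage of English learners (simulated)
u = rng.normal(scale=10, size=n)
test_score = 700 - 1.5 * str_ - 0.6 * pct_el + u   # assumed coefficients, for illustration

X = np.column_stack([np.ones(n), str_, pct_el])    # regressor matrix with an intercept column
y = test_score

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # minimizes the sum of squared residuals
print("Beta0_hat, Beta1_hat, Beta2_hat =", np.round(beta_hat, 3))
```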

Measures of Fit in Multiple Regression

The R-squared: R-squared = 1 - SSR/TSS, and it always lies between 0 and 1. Usually, the higher the R-squared, the better the fit, but how high is "high"? There is no general answer. The R-squared tends to be higher with time series data; with cross-section data, an R-squared around 0.5 is considered a good fit, and sometimes a regression with an R-squared of only 0.2 is still valuable.

The problem with the R-squared: in multiple regression, the R-squared increases whenever a regressor is added, unless the estimated coefficient on the added regressor is exactly zero. (This can be shown theoretically.)

The adjusted R-squared: Adjusted R-squared = 1 - [(n-1)/(n-k-1)] * SSR/TSS.

Because of the degrees-of-freedom correction, (n-1)/(n-k-1) is always larger than 1, so the adjusted R-squared is always less than the R-squared. Adding a regressor has two opposite effects on the adjusted R-squared: on the one hand, the SSR falls, which increases the adjusted R-squared; on the other hand, the factor (n-1)/(n-k-1) increases. Whether the adjusted R-squared rises or falls depends on which effect is stronger. The adjusted R-squared can even be negative.

Heavy reliance on the R-squared or the adjusted R-squared can be a trap. Maximizing the adjusted R-squared is rarely the answer to any economically or statistically meaningful question. The decision about whether to include a variable in a multiple regression should be based on whether including that variable allows you to better estimate the causal effect of interest.
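The sketch below (illustrative, using simulated data rather than anything from the notes) computes both measures from SSR and TSS and shows what happens when an irrelevant regressor is added: the R-squared cannot fall, while the adjusted R-squared usually does.

```python
import numpy as np

def r2_and_adjusted_r2(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit of y on X (X includes the intercept column)."""
    n, k_plus_1 = X.shape
    k = k_plus_1 - 1                                  # number of regressors, excluding the intercept
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = np.sum((y - X @ beta_hat) ** 2)             # sum of squared residuals
    tss = np.sum((y - y.mean()) ** 2)                 # total sum of squares
    r2 = 1 - ssr / tss
    adj_r2 = 1 - (n - 1) / (n - k - 1) * ssr / tss
    return r2, adj_r2

# Illustrative data: y depends on x1 only; x2 is an irrelevant regressor.
rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                               # unrelated to y
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, x2])

print("y on x1:     R2, adj R2 =", r2_and_adjusted_r2(X_small, y))
print("y on x1, x2: R2, adj R2 =", r2_and_adjusted_r2(X_big, y))
# R2 never falls when x2 is added; adjusted R2 will often fall,
# rising only if the added regressor reduces SSR by enough.
```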

Perfect Multicollinearity

Definition: one of the regressors is an exact (perfect) linear function of the other regressors.

Why it is a problem: it asks an illogical question. Intuitively, the coefficient on a regressor in multiple regression is the effect of a change in that regressor while holding the other regressors constant; if that regressor is a perfect linear function of the others, it cannot be changed while the others are held fixed.

Baseline regression with the percentage of English learners (PctEL) and the student-teacher ratio (STR):
TestScore = Beta0 + Beta1*STR + Beta2*PctEL + error term

Example #1: Fraction of English learners (FracEL).
TestScore = Beta0 + Beta1*STR + Beta2*PctEL + Beta3*FracEL + error term
PctEL = 100*FracEL, so FracEL is a perfect linear function of PctEL.

Example #2: "Not very small class" (NVS), a binary variable that equals 1 if STR >= 12 and 0 otherwise.
TestScore = Beta0 + Beta1*STR + Beta2*PctEL + Beta3*NVS + error term
Since no district in the data has STR below 12, NVS = 1 for every observation, i.e. NVS = 1*(constant term).

Example #3: The percentage of English speakers (PctES).
TestScore = Beta0 + Beta1*STR + Beta2*PctEL + Beta3*PctES + error term
PctES = 100*(constant term) - PctEL.

Example #4 (dummy variable trap): The school districts can be divided into three categories: rural, suburban, and urban.
TestScore = Beta0 + Beta1*STR + Beta2*PctEL + Beta3*Rural + Beta4*Suburban + Beta5*Urban + error term
Rural + Suburban + Urban = 1 = constant term.
If Rural were excluded, the coefficient on Suburban would be the average difference between test scores in suburban and rural districts, holding constant the other variables in the regression.

Dummy variable trap: if there are G binary variables, each observation falls into one and only one category, there is an intercept in the regression, and all G binary variables are included as regressors, then the regression fails because of perfect multicollinearity. How to avoid it: exclude one of the binary variables, so that only G-1 of the G binary variables are included as regressors (see the sketch after this section).

Solution to perfect multicollinearity: modify the regressors to eliminate the problem. STATA, for example, automatically drops one of the collinear regressors.

Imperfect Multicollinearity

Definition: two or more of the regressors are highly (but not perfectly) correlated.
Problem: the coefficients on at least one of the individual regressors will be imprecisely estimated.
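A numerical sketch of the dummy variable trap described above (illustrative, not from the notes): with an intercept plus all three district dummies the regressor matrix loses full column rank, so X'X is singular; dropping one dummy restores full rank.

```python
import numpy as np

# Illustrative sketch of the dummy variable trap with three district categories.
category = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])   # 0 = rural, 1 = suburban, 2 = urban
rural = (category == 0).astype(float)
suburban = (category == 1).astype(float)
urban = (category == 2).astype(float)
intercept = np.ones(len(category))

# All G = 3 dummies plus the intercept: Rural + Suburban + Urban = intercept,
# so the columns of the regressor matrix are linearly dependent (perfect multicollinearity).
X_trap = np.column_stack([intercept, rural, suburban, urban])
print("rank with all 3 dummies:", np.linalg.matrix_rank(X_trap), "of", X_trap.shape[1], "columns")

# Dropping one dummy (here Rural, the omitted base category) removes the dependence.
X_ok = np.column_stack([intercept, suburban, urban])
print("rank with G-1 dummies:  ", np.linalg.matrix_rank(X_ok), "of", X_ok.shape[1], "columns")
```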
