Chapter 6: Linear Regression with Multiple Regressors

Simple Omitted Variables Problem

Motivation: It is very difficult to identify the effect of one independent variable, x, on the dependent variable, y, without considering the other factors.

Example: We can measure the effect of schooling more precisely if we compare people who have exactly the same characteristics except for schooling. (There have been many studies on twins.) If we find a difference in wage rates between two people who are identical except for their schooling, we can conclude that the difference is due to the difference in schooling. Unfortunately, in social science it is impossible to find even two persons who are exactly the same except for one factor. (Even twins differ in many ways.)

Question: Is there an omitted variables problem in the regression of test scores on the student-teacher ratio? If yes, which variables are omitted? Please give examples.

Answers:
Percentage of English learners
School district
Parents' education background
IQ test score
Time of day of the test
Parking lot space per pupil

Omitted variable bias

Omitted variable bias arises when both of the following hold:
The omitted variable is correlated with the included regressor;
The omitted variable is a determinant of the dependent variable.

Omitted variable bias and the first least squares assumption:
$E(u_i \mid X_i) \neq 0$;
The bias does not vanish even in large samples;
The estimator is inconsistent.

A formula for omitted variable bias

$$\hat{\beta}_1 \xrightarrow{p} \beta_1 + \operatorname{corr}(X, u)\left(\frac{\operatorname{Var}(u)}{\operatorname{Var}(X)}\right)^{0.5}$$

Whether the bias is large or small in practice depends on the correlation between the regressor and the error term, $\operatorname{corr}(X, u)$.
The direction of the bias in $\hat{\beta}_1$ depends on whether X and u are positively or negatively correlated.

Addressing omitted variable bias by dividing the data into groups

Example 1: We are interested in the effect of the student-teacher ratio on test scores, holding constant other factors, including the percentage of English learners. Once we hold the percentage of English learners constant, the difference in performance between districts with high and low student-teacher ratios is perhaps half (or less) of the overall estimate of 7.4 points.

Example 2: In rural areas of Uganda, non-farm income provides much needed cash to farm households. Most educated men and women in rural areas have opportunities to hold regular jobs that pay a constant monthly wage throughout the year. Other people, who are not fortunate enough to hold regular jobs, are likely to earn non-farm income from small self-employed businesses such as making baskets or trading goods.

Suppose that we are interested in gender differences in non-farm income and want to test the hypothesis that women earn less non-farm income than men. A simple comparison of non-farm income between men and women does not provide a reliable test of this hypothesis if the characteristics of men and women are not similar. One major factor in non-farm income is education. If men are better educated than women and men earn more non-farm income than women, we cannot be sure whether men's higher non-farm income is a result of gender or of education.

For example, let's use data from Uganda. The data were collected by FASID in collaboration with Makerere University in Uganda in 2003. The data come from 940 households. Among them, we find 648 people who earned some income from non-farm activities. The table reports average non-farm income in US$, average schooling years, age, and number of observations for men and women. Men are slightly better educated and earn more non-farm income. To compare non-farm income between men and women while holding education levels constant, we have created categories for education: no education (category 0), 1-4 years of schooling (1), 5-7 years of schooling (2), 8-11 years of schooling (3), and more than 12 years of schooling (4).

The Multiple Regression Model

The multiple regression model permits estimating the effect on Y of changing one variable ($X_1$) while holding the other regressors constant.

Beta: the slope coefficient of X, that is, the partial effect on Y of X, holding other factors fixed.
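The omitted variable bias formula and the "holding other factors fixed" interpretation can be checked with a small simulation. The sketch below is illustrative only: the data are simulated, the coefficient values (-1.0, -0.5, 0.6) are arbitrary assumptions, and the function names `ols_slope`, `residualize`, and `simulate` are hypothetical helpers, not part of the original slides. The coefficient on `x1` in the two-regressor model is obtained by first removing from `x1` the part explained by `x2` (the Frisch-Waugh approach), which is one way of making "holding `x2` constant" concrete.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residualize(x, z):
    """Residuals from regressing x on z (intercept included)."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    b = ols_slope(z, x)
    return [xi - mx - b * (zi - mz) for xi, zi in zip(x, z)]


def simulate(n=20000, beta1=-1.0, beta2=-0.5, gamma=0.6, seed=0):
    """Return (slope omitting x2, slope on x1 controlling for x2)."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    # x2 is correlated with x1 and also determines y: both OVB conditions hold.
    x2 = [gamma * a + rng.gauss(0, 1) for a in x1]
    y = [beta1 * a + beta2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

    short_slope = ols_slope(x1, y)       # x2 omitted, so it sits in the error term
    x1_given_x2 = residualize(x1, x2)    # variation in x1 not explained by x2
    long_slope = ols_slope(x1_given_x2, y)
    return short_slope, long_slope


if __name__ == "__main__":
    short_slope, long_slope = simulate()
    print("omitting x2:       ", round(short_slope, 3))
    print("controlling for x2:", round(long_slope, 3))
```

Under these assumed values the slope that omits `x2` should settle near beta1 + beta2 * gamma = -1.3 rather than -1.0, and the gap does not shrink as `n` grows, while the slope that controls for `x2` should be close to -1.0.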
The OLS Estimator in Multiple Regression

Measures of Fit in Multiple Regression

The "R-squared"

$R^2 = 1 - SSR/TSS$
$R^2$ ranges over $[0, 1]$.

Usually, the higher the R-squared, the better. But how high is high? There is no general answer. R-squared is generally higher with time-series data. With cross-section data, an R-squared of around 0.5 is considered a good fit, and sometimes even an R-squared of 0.2 is valuable.

The problem of "R-squared"

In multiple regression, the R-squared increases whenever a regressor is added, unless the estimated coefficient on the added regressor is exactly zero. (This result can be proved.)

The "Adjusted R-squared"

$$\bar{R}^2 = 1 - \frac{n-1}{n-k-1}\,\frac{SSR}{TSS}$$

The factor $(n-1)/(n-k-1)$ is a degrees-of-freedom correction. It is always larger than 1, so the adjusted R-squared is always less than the R-squared.
Adding a regressor has two opposite effects on the adjusted R-squared. On the one hand, the SSR falls, which increases the adjusted R-squared. On the other hand, the factor $(n-1)/(n-k-1)$ increases, which decreases it. Whether the adjusted R-squared increases or decreases depends on which of these two effects is stronger.
The adjusted R-squared can be negative.

Heavy reliance on the adjusted R-squared or the R-squared can be a trap.
Maximizing the adjusted R-squared is rarely the answer to any economically or statistically meaningful question.
The decision about whether to include a variable in a multiple regression should be based on whether including that variable allows you to better estimate the causal effect of interest.

Perfect Multicollinearity

Definition: One of the regressors is a perfect linear function of the other regressors.

Reason it is a problem: It asks an illogical question. In multiple regression, the coefficient on a regressor is the effect of a change in that regressor holding the other regressors constant. If a regressor is a perfect linear function of the others, it cannot change while they are held constant.

Baseline model, with the percentage of English learners (PctEL) and the student-teacher ratio (STR) as regressors and $u$ as the error term:
$TestScore = \beta_0 + \beta_1 STR + \beta_2 PctEL + u$

Example #1: Fraction of English learners (FracEL)
$TestScore = \beta_0 + \beta_1 STR + \beta_2 PctEL + \beta_3 FracEL + u$
$PctEL = 100 \times FracEL$

Example #2: "Not very small" (NVS), a binary variable that equals 1 if $STR \geq 12$ and equals 0 otherwise
$TestScore = \beta_0 + \beta_1 STR + \beta_2 PctEL + \beta_3 NVS + u$
$NVS = 1 \times$ constant term, because NVS equals 1 for every district in the data.

Example #3: The percentage of English speakers (PctES)
$TestScore = \beta_0 + \beta_1 STR + \beta_2 PctEL + \beta_3 PctES + u$
$PctES = 100 \times$ constant term $- PctEL$

Example #4 (Dummy variable trap): The school districts can be divided into three categories: rural, suburban, and urban.
$TestScore = \beta_0 + \beta_1 STR + \beta_2 PctEL + \beta_3 Rural + \beta_4 Suburban + \beta_5 Urban + u$
$Rural + Suburban + Urban = 1 =$ constant term
If Rural were excluded, the coefficient on Suburban would be the average difference between test scores in suburban and rural districts, holding constant the other variables in the regression.

Dummy variable trap: if there are G binary variables, each observation falls into one and only one category, there is an intercept in the regression, and all G binary variables are included as regressors, then the regression will fail because of perfect multicollinearity.
How to avoid it: exclude one of the binary variables, so that only G-1 of the G binary variables are included as regressors.

Solution: Modify the regressors to eliminate the problem. Stata automatically drops one of the offending regressors.

Imperfect Multicollinearity

Definition: Two or more of the regressors are highly correlated.

Problem: The coefficients on at least one individual
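Returning to the dummy variable trap (Example #4 of perfect multicollinearity), the failure can be seen directly in the rank of the regressor matrix. The sketch below is illustrative only: the six district categories are made up, and `matrix_rank` and `design` are hypothetical helpers written with the standard library, not something the slides provide. With an intercept and all three dummies the matrix has four columns but rank three, so the OLS coefficients are not uniquely determined; dropping one dummy restores full column rank.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix (list of rows) by Gauss-Jordan elimination."""
    m = [[float(v) for v in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        # Look for a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if abs(m[r][col]) > tol), None)
        if pivot is None:
            continue  # column is a linear combination of earlier columns
        m[rank], m[pivot] = m[pivot], m[rank]
        pv = m[rank][col]
        m[rank] = [v / pv for v in m[rank]]
        for r in range(len(m)):
            if r != rank and abs(m[r][col]) > tol:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


def design(categories, levels, intercept=True):
    """One row per observation: optional intercept, then one dummy per level."""
    return [
        ([1] if intercept else []) + [1 if c == lev else 0 for lev in levels]
        for c in categories
    ]


# Hypothetical district types, for illustration only.
districts = ["rural", "suburban", "urban", "rural", "urban", "suburban"]

# Intercept plus all G = 3 dummies: Rural + Suburban + Urban equals the constant.
full = design(districts, ["rural", "suburban", "urban"])
# Intercept plus G - 1 = 2 dummies, with rural as the omitted base category.
reduced = design(districts, ["suburban", "urban"])

if __name__ == "__main__":
    print("all dummies: columns =", len(full[0]), "rank =", matrix_rank(full))
    print("one dropped: columns =", len(reduced[0]), "rank =", matrix_rank(reduced))
```

An alternative that also avoids the trap is to keep all G dummies and drop the intercept (`intercept=False` in this sketch); the coefficients then measure category means rather than differences from a base category.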