Chapter Seventeen
Correlation and Regression
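Learning Objectives

1. Discuss the concepts of product moment correlation, partial correlation, and part correlation, and show how they provide a foundation for regression analysis.
2. Explain the nature and methods of bivariate regression analysis, and describe the general model, estimation of parameters, standardized regression coefficient, significance testing, prediction accuracy, residual analysis, and model cross-validation.
3. Explain the nature and methods of multiple regression analysis.
4. Describe specialized techniques used in multiple regression analysis, particularly stepwise regression, regression with dummy variables, and analysis of variance and covariance with regression.
5. Discuss nonmetric correlation and its measures.

Chapter Outline

- Product moment correlation
- Partial correlation
- Nonmetric correlation
- Regression analysis
- Bivariate regression
- Statistics associated with bivariate regression analysis
- Conducting bivariate regression analysis
- Multiple regression
- Statistics associated with multiple regression
- Conducting multiple regression analysis
- Stepwise regression
- Multicollinearity
- Relative importance of predictors
- Cross-validation
- Regression with dummy variables
- Analysis of variance and regression analysis
- Summary

Product Moment Correlation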
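The product moment correlation, r, is the most widely used statistic for summarizing the strength of association between two metric (interval or ratio scaled) variables, X and Y.
It is an index used to determine whether a linear relationship exists between X and Y.
Because it was originally proposed by Karl Pearson, it is also known as the Pearson correlation coefficient.
It is also referred to as the simple correlation, the bivariate correlation, or merely the correlation coefficient.

For a sample of n observations on X and Y, the product moment correlation, r, is calculated as

$$ r = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2 \, \sum_{i=1}^{n}(Y_i-\bar{Y})^2}} $$

Division of the numerator and denominator by (n - 1) gives

$$ r = \frac{\sum_{i=1}^{n}\dfrac{(X_i-\bar{X})(Y_i-\bar{Y})}{n-1}}{\sqrt{\sum_{i=1}^{n}\dfrac{(X_i-\bar{X})^2}{n-1} \, \sum_{i=1}^{n}\dfrac{(Y_i-\bar{Y})^2}{n-1}}} = \frac{COV_{xy}}{S_x S_y} $$

r varies between -1.0 and +1.0.
The correlation coefficient between two variables is the same regardless of their underlying units of measurement.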
Explaining Attitude Toward the City of Residence

Table 17.1

Respondent No.   Attitude Toward the City   Duration of Residence   Importance Attached to Weather
 1                6                         10                       3
 2                9                         12                      11
 3                8                         12                       4
 4                3                          4                       1
 5               10                         12                      11
 6                4                          6                       1
 7                5                          8                       7
 8                2                          2                       4
 9               11                         18                       8
10                9                          9                      10
11               10                         17                       8
12                2                          2                       5
The correlation coefficient may be calculated as follows:

$$ \bar{X} = (10+12+12+4+12+6+8+2+18+9+17+2)/12 = 9.333 $$

$$ \bar{Y} = (6+9+8+3+10+4+5+2+11+9+10+2)/12 = 6.583 $$

$$ \begin{aligned}
\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y}) ={}& (10-9.33)(6-6.58)+(12-9.33)(9-6.58) \\
&+(12-9.33)(8-6.58)+(4-9.33)(3-6.58) \\
&+(12-9.33)(10-6.58)+(6-9.33)(4-6.58) \\
&+(8-9.33)(5-6.58)+(2-9.33)(2-6.58) \\
&+(18-9.33)(11-6.58)+(9-9.33)(9-6.58) \\
&+(17-9.33)(10-6.58)+(2-9.33)(2-6.58) \\
={}& -0.3886+6.4614+3.7914+19.0814 \\
&+9.1314+8.5914+2.1014+33.5714 \\
&+38.3214-0.7986+26.2314+33.5714 \\
={}& 179.6668
\end{aligned} $$
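The hand calculation can be cross-checked with a few lines of code. The following is a minimal sketch, assuming the duration and attitude columns of Table 17.1 as input; the function name `product_moment_r` is illustrative and not part of the original slides.

```python
from math import sqrt

# Table 17.1: duration of residence (X) and attitude toward the city (Y)
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]


def product_moment_r(x, y):
    """Pearson product moment correlation of two equal-length samples."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross products of deviations (the numerator of r)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Sums of squared deviations (the two terms under the square root)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)


r = product_moment_r(duration, attitude)
print(f"r = {r:.4f}")
```

With unrounded means the cross-product sum is about 179.667, and r comes out near 0.936, in line with the Multiple R reported in Table 17.2.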
Decomposition of the Total Variation

$$ r^2 = \frac{\text{Explained variation}}{\text{Total variation}} = \frac{SS_x}{SS_y} = \frac{\text{Total variation} - \text{Error variation}}{\text{Total variation}} = \frac{SS_y - SS_{error}}{SS_y} $$
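[Figure: a nonlinear relationship for which r = 0]

Partial Correlation

A partial correlation coefficient measures the association between two variables after controlling for, or adjusting for, the effects of one or more additional variables.

Partial correlations have an order associated with them. The order indicates how many variables are being adjusted or controlled.
The simple correlation coefficient, r, has a zero order, as it does not control for any additional variables while measuring the association between two variables.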
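The formula for the partial correlation did not survive on this slide. The sketch below assumes the standard first-order form, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)), and applies it to Table 17.1 to relate attitude and duration while controlling for importance of weather. The helper names `pearson_r` and `partial_r` are illustrative.

```python
from math import sqrt

# Table 17.1 columns
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]       # Y
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]    # X
importance = [3, 11, 4, 1, 11, 1, 7, 4, 8, 10, 8, 5]     # Z (controlled)


def pearson_r(u, v):
    """Zero-order (simple) correlation between u and v."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)


def partial_r(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_xy_z = partial_r(duration, attitude, importance)
print(f"zero-order r = {pearson_r(duration, attitude):.4f}")
print(f"first-order partial r = {r_xy_z:.4f}")
```

For these data the partial correlation stays close to the simple correlation, which suggests that controlling for importance of weather changes the attitude-duration association very little.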
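Part Correlation Coefficient

The part correlation coefficient represents the correlation between Y and X when the linear effects of the other independent variables have been removed from X.
The part correlation coefficient, $r_{y(x.z)}$, is calculated as follows:

$$ r_{y(x.z)} = \frac{r_{xy} - r_{yz}\, r_{xz}}{\sqrt{1 - r_{xz}^2}} $$

The partial correlation coefficient is generally viewed as more important than the part correlation coefficient.

Nonmetric Correlation

The measures of nonmetric correlation are Spearman's rho, $\rho_s$, and Kendall's tau, $\tau$.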
Regression Analysis

Regression analysis examines associative relationships between a metric dependent variable and one or more independent variables in the following ways:

- Determine whether the independent variables explain a significant variation in the dependent variable: whether a relationship exists.
- Determine how much of the variation in the dependent variable can be explained by the independent variables: strength of the relationship.
- Determine the structure or form of the relationship: the mathematical equation relating the independent and dependent variables.
- Predict the values of the dependent variable.
- Control for other independent variables when evaluating the contributions of a specific variable or set of variables.

Regression analysis is concerned with the nature and degree of association between variables and does not imply or assume any causality.

Statistics Associated with Bivariate Regression Analysis

- Bivariate regression model. The basic regression equation is $Y_i = \beta_0 + \beta_1 X_i + e_i$, where Y = dependent or criterion variable, X = independent or predictor variable, $\beta_0$ = intercept of the line, $\beta_1$ = slope of the line, and $e_i$ is the error term associated with the ith observation.
- Coefficient of determination. The strength of association is measured by the coefficient of determination, $r^2$. It varies between 0 and 1 and signifies the proportion of the total variation in Y that is accounted for by the variation in X.
- Estimated or predicted value. The estimated or predicted value of $Y_i$ is $\hat{Y}_i = a + bx$, where $\hat{Y}_i$ is the predicted value of $Y_i$, and a and b are estimators of $\beta_0$ and $\beta_1$, respectively.
- Regression coefficient. The estimated parameter b is usually referred to as the non-standardized regression coefficient.
- Scattergram. A scatter diagram, or scattergram, is a plot of the values of two variables for all the cases or observations.
- Standard error of estimate. This statistic, SEE, is the standard deviation of the actual Y values from the predicted $\hat{Y}$ values.
- Standard error. The standard deviation of b, $SE_b$, is called the standard error.
- Standardized regression coefficient. Also termed the beta coefficient or beta weight, this is the slope obtained by the regression of Y on X when the data are standardized.
- Sum of squared errors. The distances of all the points from the regression line are squared and added together to arrive at the sum of squared errors, which is a measure of total error, $\sum e_j^2$.
- t statistic. A t statistic with n - 2 degrees of freedom can be used to test the null hypothesis that no linear relationship exists between X and Y, or $H_0: \beta_1 = 0$, where $t = b / SE_b$.

Conducting Bivariate Regression Analysis
Plot the Scatter Diagram

A scatter diagram, or scattergram, is a plot of the values of two variables for all the cases or observations.

The most commonly used technique for fitting a straight line to a scattergram is the least-squares procedure. In fitting the line, the least-squares procedure minimizes the sum of squared errors, $\sum e_j^2$.

Fig. 17.2 Conducting Bivariate Regression Analysis

1. Plot the Scatter Diagram
2. Formulate the General Model
3. Estimate the Parameters
4. Estimate Standardized Regression Coefficients
5. Test for Significance
6. Determine the Strength and Significance of Association
7. Check Prediction Accuracy
8. Examine the Residuals
9. Cross-Validate the Model
Formulate the Bivariate Regression Model

In the bivariate regression model, the general form of a straight line is:

$$ Y = \beta_0 + \beta_1 X $$

where
Y = dependent or criterion variable
X = independent or predictor variable
$\beta_0$ = intercept of the line
$\beta_1$ = slope of the line

The regression procedure adds an error term to account for the probabilistic or stochastic nature of the relationship:

$$ Y_i = \beta_0 + \beta_1 X_i + e_i $$

where $e_i$ is the error term associated with the ith observation.

Fig. 17.3 Plot of Attitude with Duration [scatter plot; horizontal axis: Duration of Residence, vertical axis: Attitude]

Fig. 17.4 Which Straight Line Is Best? [the same scatter plot with four candidate lines, Line 1 to Line 4]

Fig. 17.5 Bivariate Regression [the line $\beta_0 + \beta_1 X$ with observed values $Y_J$ and errors $e_J$ at points $X_1$ to $X_5$]
Estimate the Parameters

In most cases, $\beta_0$ and $\beta_1$ are unknown and are estimated from the sample observations using the equation

$$ \hat{Y}_i = a + b x_i $$

where $\hat{Y}_i$ is the estimated or predicted value of $Y_i$, and a and b are estimators of $\beta_0$ and $\beta_1$, respectively.

$$ b = \frac{COV_{xy}}{S_x^2} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(X_i-\bar{X})^2} = \frac{\sum_{i=1}^{n}X_iY_i - n\bar{X}\bar{Y}}{\sum_{i=1}^{n}X_i^2 - n\bar{X}^2} $$
The intercept, a, may then be calculated using:

$$ a = \bar{Y} - b\bar{X} $$

For the data in Table 17.1, the estimation of parameters may be illustrated as follows:

$$ \begin{aligned}
\sum_{i=1}^{12} X_iY_i ={}& (10)(6)+(12)(9)+(12)(8)+(4)(3)+(12)(10)+(6)(4) \\
&+(8)(5)+(2)(2)+(18)(11)+(9)(9)+(17)(10)+(2)(2) \\
={}& 917
\end{aligned} $$

$$ \begin{aligned}
\sum_{i=1}^{12} X_i^2 ={}& 10^2+12^2+12^2+4^2+12^2+6^2 \\
&+8^2+2^2+18^2+9^2+17^2+2^2 \\
={}& 1350
\end{aligned} $$

It may be recalled from earlier calculations of the simple correlation that:

$$ \bar{X} = 9.333, \qquad \bar{Y} = 6.583 $$

Given n = 12, b can be calculated as:

$$ b = \frac{917 - (12)(9.333)(6.583)}{1350 - (12)(9.333)^2} = 0.5897 $$

$$ a = \bar{Y} - b\bar{X} = 6.583 - (0.5897)(9.333) = 1.0793 $$
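The estimation steps can be sketched in code as follows. This is a minimal illustration that uses the computational form of the slope, (sum of XY - n X̄ Ȳ) / (sum of X² - n X̄²), on the Table 17.1 data; the function name `fit_line` is an assumption made for this example.

```python
# Table 17.1: duration of residence (X) and attitude toward the city (Y)
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]


def fit_line(x, y):
    """Least-squares intercept a and slope b for the line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # 917 for Table 17.1
    sum_x2 = sum(xi * xi for xi in x)               # 1350 for Table 17.1
    b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    a = y_bar - b * x_bar
    return a, b


a, b = fit_line(duration, attitude)
print(f"a = {a:.4f}, b = {b:.4f}")
```

Because the code keeps the means unrounded, the last digit can differ slightly from a hand calculation that uses 9.333 and 6.583.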
Estimate the Standardized Regression Coefficient

Standardization is the process by which the raw data are transformed into new variables that have a mean of 0 and a variance of 1 (Chapter 14).
When the data are standardized, the intercept assumes a value of 0.
The term beta coefficient or beta weight is used to denote the standardized regression coefficient.

$$ B_{yx} = B_{xy} = r_{xy} $$

There is a simple relationship between the standardized and non-standardized regression coefficients:

$$ B_{yx} = b_{yx}\,(S_x / S_y) $$
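A short numerical check of this relationship, assuming the Table 17.1 data: the sketch computes the non-standardized slope, rescales it by S_x / S_y, and compares the result with the simple correlation. Variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

# Table 17.1: duration of residence (X) and attitude toward the city (Y)
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]

n = len(duration)
x_bar = sum(duration) / n
y_bar = sum(attitude) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(duration, attitude))
s_xx = sum((x - x_bar) ** 2 for x in duration)
s_yy = sum((y - y_bar) ** 2 for y in attitude)

b = s_xy / s_xx                                   # non-standardized slope
beta = b * stdev(duration) / stdev(attitude)      # standardized slope
r = s_xy / sqrt(s_xx * s_yy)                      # simple correlation

print(f"b = {b:.5f}, beta = {beta:.5f}, r = {r:.5f}")
```

In the bivariate case the standardized coefficient and the correlation coincide, which matches the Beta column reported for Duration in Table 17.2.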
Test for Significance

The statistical significance of the linear relationship between X and Y may be tested by examining the hypotheses:

$$ H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0 $$

A two-tailed test is generally used. A t statistic with n - 2 degrees of freedom can be used, where

$$ t = \frac{b}{SE_b} $$

$SE_b$ denotes the standard deviation of b and is called the standard error.
Using a computer program, the regression of attitude on duration of residence, using the data shown in Table 17.1, yielded the results shown in Table 17.2. The intercept, a, equals 1.0793, and the slope, b, equals 0.5897. Therefore, the estimated equation is:

Attitude ($\hat{Y}$) = 1.0793 + 0.5897 (Duration of residence)

The standard error, or standard deviation of b, is estimated as 0.07008, and the value of the t statistic as t = 0.5897/0.07008 = 8.414, with n - 2 = 10 degrees of freedom.

From Table 4 in the Statistical Appendix, we see that the critical value of t with 10 degrees of freedom and $\alpha$ = 0.05 is 2.228 for a two-tailed test. Since the calculated value of t is larger than the critical value, the null hypothesis is rejected.
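The slides quote $SE_b$ from computer output without showing how it is obtained. The sketch below assumes the standard least-squares result $SE_b = SEE / \sqrt{\sum (X_i - \bar{X})^2}$, with SEE computed from the residuals on n - 2 degrees of freedom. The critical value 2.228 is taken from the text rather than computed.

```python
from math import sqrt

# Table 17.1: duration of residence (X) and attitude toward the city (Y)
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]

n = len(duration)
x_bar = sum(duration) / n
y_bar = sum(attitude) / n
s_xx = sum((x - x_bar) ** 2 for x in duration)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(duration, attitude))

b = s_xy / s_xx                 # slope
a = y_bar - b * x_bar           # intercept

# Residuals and standard error of estimate (n - 2 degrees of freedom)
residuals = [y - (a + b * x) for x, y in zip(duration, attitude)]
see = sqrt(sum(e * e for e in residuals) / (n - 2))

se_b = see / sqrt(s_xx)         # assumed standard formula for SE of the slope
t = b / se_b

T_CRITICAL = 2.228              # two-tailed, 10 df, alpha = 0.05 (from the text)
reject_h0 = abs(t) > T_CRITICAL

print(f"SE_b = {se_b:.5f}, t = {t:.3f}, reject H0: {reject_h0}")
```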
Determine the Strength and Significance of Association

The total variation, $SS_y$, may be decomposed into the variation accounted for by the regression line, $SS_{reg}$, and the error or residual variation, $SS_{error}$ or $SS_{res}$, as follows:

$$ SS_y = SS_{reg} + SS_{res} $$

where

$$ SS_y = \sum_{i=1}^{n}(Y_i-\bar{Y})^2 $$

$$ SS_{reg} = \sum_{i=1}^{n}(\hat{Y}_i-\bar{Y})^2 $$

$$ SS_{res} = \sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2 $$

Fig. 17.6 Decomposition of the Total Variation in Bivariate Regression [total variation $SS_y$ split into explained variation $SS_{reg}$ and residual variation $SS_{res}$ around the regression line]
The strength of association may then be calculated as follows:

$$ r^2 = \frac{SS_{reg}}{SS_y} = \frac{SS_y - SS_{res}}{SS_y} $$

To illustrate the calculations of $r^2$, let us consider again the effect of duration of residence on attitude toward the city. It may be recalled from earlier calculations of the simple correlation coefficient that:

$$ SS_y = \sum_{i=1}^{n}(Y_i-\bar{Y})^2 = 120.9168 $$
The predicted values ($\hat{Y}$) can be calculated using the regression equation:

Attitude ($\hat{Y}$) = 1.0793 + 0.5897 (Duration of residence)

For the first observation in Table 17.1, this value is: ($\hat{Y}$) = 1.0793 + 0.5897 x 10 = 6.9763.
For each successive observation, the predicted values are, in order, 8.1557, 8.1557, 3.4381, 8.1557, 4.6175, 5.7969, 2.2587, 11.6939, 6.3866, 11.1042, and 2.2587.

Therefore,

$$ \begin{aligned}
SS_{reg} = \sum_{i=1}^{n}(\hat{Y}_i-\bar{Y})^2 ={}& (6.9763-6.5833)^2+(8.1557-6.5833)^2 \\
&+(8.1557-6.5833)^2+(3.4381-6.5833)^2 \\
&+(8.1557-6.5833)^2+(4.6175-6.5833)^2 \\
&+(5.7969-6.5833)^2+(2.2587-6.5833)^2 \\
&+(11.6939-6.5833)^2+(6.3866-6.5833)^2 \\
&+(11.1042-6.5833)^2+(2.2587-6.5833)^2 \\
={}& 0.1544+2.4724+2.4724+9.8922+2.4724 \\
&+3.8643+0.6184+18.7021+26.1182 \\
&+0.0387+20.4385+18.7021 \\
={}& 105.9524
\end{aligned} $$

$$ \begin{aligned}
SS_{res} = \sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2 ={}& (6-6.9763)^2+(9-8.1557)^2+(8-8.1557)^2 \\
&+(3-3.4381)^2+(10-8.1557)^2+(4-4.6175)^2 \\
&+(5-5.7969)^2+(2-2.2587)^2+(11-11.6939)^2 \\
&+(9-6.3866)^2+(10-11.1042)^2+(2-2.2587)^2 \\
={}& 14.9644
\end{aligned} $$

It can be seen that $SS_y = SS_{reg} + SS_{res}$. Furthermore,

$$ r^2 = \frac{SS_{reg}}{SS_y} = \frac{105.9524}{120.9168} = 0.8762 $$
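The decomposition can be reproduced directly from the Table 17.1 data. The sketch below fits the line, computes the three sums of squares, and confirms that they add up. Variable names are illustrative, and unrounded coefficients are used, so the totals differ from the hand calculation in the later decimal places.

```python
# Table 17.1: duration of residence (X) and attitude toward the city (Y)
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]

n = len(duration)
x_bar = sum(duration) / n
y_bar = sum(attitude) / n

# Least-squares slope and intercept
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(duration, attitude))
     / sum((x - x_bar) ** 2 for x in duration))
a = y_bar - b * x_bar

predicted = [a + b * x for x in duration]

ss_y = sum((y - y_bar) ** 2 for y in attitude)                      # total
ss_reg = sum((y_hat - y_bar) ** 2 for y_hat in predicted)           # explained
ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(attitude, predicted))  # residual

r_squared = ss_reg / ss_y

print(f"SS_y = {ss_y:.4f}, SS_reg = {ss_reg:.4f}, SS_res = {ss_res:.4f}")
print(f"r^2 = {r_squared:.4f}")
```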
Another, equivalent test for examining the significance of the linear relationship between X and Y (significance of b) is the test for the significance of the coefficient of determination. The hypotheses in this case are:

$$ H_0: R^2_{pop} = 0 \qquad H_1: R^2_{pop} > 0 $$

The appropriate test statistic is the F statistic:

$$ F = \frac{SS_{reg}}{SS_{res}/(n-2)} $$

which has an F distribution with 1 and n - 2 degrees of freedom. The F test is a generalized form of the t test (see Chapter 15). If a random variable is t distributed with n degrees of freedom, then $t^2$ is F distributed with 1 and n degrees of freedom. Hence, the F test for testing the significance of the coefficient of determination is equivalent to testing the following hypotheses:

$$ H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0 $$

or

$$ H_0: \rho = 0 \qquad H_1: \rho \neq 0 $$
From Table 17.2, it can be seen that:

$$ r^2 = \frac{105.9522}{105.9522 + 14.9644} = 0.8762 $$

which is the same as the value calculated earlier. The value of the F statistic is:

$$ F = \frac{105.9522}{14.9644/10} = 70.8027 $$

with 1 and 10 degrees of freedom. The calculated F statistic exceeds the critical value of 4.96 determined from Table 5 in the Statistical Appendix. Therefore, the relationship is significant at $\alpha$ = 0.05, corroborating the results of the t test.
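The F calculation is simple enough to express as a small helper. The sketch below is written for k independent variables, using the k and n - k - 1 degrees of freedom stated later for multiple regression; with k = 1 it reduces to the bivariate formula. The function name `f_statistic` is illustrative, and the sums of squares are the printed values from Tables 17.2 and 17.3.

```python
def f_statistic(ss_reg, ss_res, n, k=1):
    """F = (SS_reg / k) / (SS_res / (n - k - 1)) for k independent variables."""
    return (ss_reg / k) / (ss_res / (n - k - 1))


# Bivariate regression of attitude on duration (Table 17.2)
f_bivariate = f_statistic(105.95222, 14.96444, n=12, k=1)

# Multiple regression on duration and importance (Table 17.3)
f_multiple = f_statistic(114.26425, 6.65241, n=12, k=2)

F_CRITICAL = 4.96   # 1 and 10 df, alpha = 0.05 (from the text)

print(f"F (bivariate) = {f_bivariate:.4f}, significant: {f_bivariate > F_CRITICAL}")
print(f"F (multiple)  = {f_multiple:.4f}")
print(f"t squared     = {8.414 ** 2:.4f}")   # close to the bivariate F
```

Squaring the reported t value of 8.414 gives a number very close to the bivariate F, which illustrates the equivalence of the two tests in the one-predictor case.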
Table 17.2 Bivariate Regression

Multiple R        0.93608
R2                0.87624
Adjusted R2       0.86387
Standard Error    1.22329

ANALYSIS OF VARIANCE
              df    Sum of Squares    Mean Square
Regression     1    105.95222         105.95222
Residual      10     14.96444           1.49644
F = 70.80266        Significance of F = 0.0000

VARIABLES IN THE EQUATION
Variable      b          SEb        Beta (β)    T        Significance of T
Duration      0.58972    0.07008    0.93608     8.414    0.0000
(Constant)    1.07932    0.74335                1.452    0.1772
Check Prediction Accuracy

To estimate the accuracy of predicted values, $\hat{Y}$, it is useful to calculate the standard error of estimate, SEE. This statistic is the standard deviation of the actual Y values from the predicted values.

$$ SEE = \sqrt{\frac{\sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2}{n-2}} $$

or

$$ SEE = \sqrt{\frac{SS_{res}}{n-2}} $$

or more generally, if there are k independent variables,

$$ SEE = \sqrt{\frac{SS_{res}}{n-k-1}} $$

For the data given in Table 17.2, the SEE is estimated as follows:

$$ SEE = \sqrt{14.9644/(12-2)} = 1.22329 $$

Assumptions

- The error term is normally distributed. For each fixed value of X, the distribution of Y is normal.
- The means of all these normal distributions of Y, given X, lie on a straight line with slope b.
- The mean of the error term is 0.
- The variance of the error term is constant. This variance does not depend on the values assumed by X.
- The error terms are uncorrelated. In other words, the observations have been drawn independently.

Multiple Regression

The general form of the multiple regression model is as follows:

$$ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_k X_k + e $$

which is estimated by the following equation:

$$ \hat{Y} = a + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_k X_k $$

As before, the coefficient a represents the intercept, but the b's are now the partial regression coefficients.

Statistics Associated with Multiple Regression

- Adjusted $R^2$. $R^2$, the coefficient of multiple determination, is adjusted for the number of independent variables and the sample size to account for diminishing returns. After the first few variables, the additional independent variables do not make much contribution.
- Coefficient of multiple determination. The strength of association in multiple regression is measured by the square of the multiple correlation coefficient, $R^2$, which is also called the coefficient of multiple determination.
- F test. The F test is used to test the null hypothesis that the coefficient of multiple determination in the population, $R^2_{pop}$, is zero. This is equivalent to testing the null hypothesis $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$. The test statistic has an F distribution with k and (n - k - 1) degrees of freedom.
- Partial F test. The significance of a partial regression coefficient, $\beta_i$, of $X_i$ may be tested using an incremental F statistic. The incremental F statistic is based on the increment in the explained sum of squares resulting from the addition of the independent variable $X_i$ to the regression equation after all the other independent variables have been included.
- Partial regression coefficient. The partial regression coefficient, $b_1$, denotes the change in the predicted value, $\hat{Y}$, per unit change in $X_1$ when the other independent variables, $X_2$ to $X_k$, are held constant.

Conducting Multiple Regression Analysis
Partial Regression Coefficients

To understand the meaning of a partial regression coefficient, let us consider a case in which there are two independent variables, so that:

$$ \hat{Y} = a + b_1 X_1 + b_2 X_2 $$

First, note that the relative magnitude of the partial regression coefficient of an independent variable is, in general, different from that of its bivariate regression coefficient.

The interpretation of the partial regression coefficient, $b_1$, is that it represents the expected change in Y when $X_1$ is changed by one unit but $X_2$ is held constant or otherwise controlled. Likewise, $b_2$ represents the expected change in Y for a unit change in $X_2$, when $X_1$ is held constant. Thus, calling $b_1$ and $b_2$ partial regression coefficients is appropriate.

It can also be seen that the combined effects of $X_1$ and $X_2$ on Y are additive. In other words, if $X_1$ and $X_2$ are each changed by one unit, the expected change in Y would be ($b_1 + b_2$).

Suppose one was to remove the effect of $X_2$ from $X_1$. This could be done by running a regression of $X_1$ on $X_2$. In other words, one would estimate the equation $\hat{X}_1 = a + bX_2$ and calculate the residual $X_r = (X_1 - \hat{X}_1)$. The partial regression coefficient, $b_1$, is then equal to the bivariate regression coefficient, $b_r$, obtained from the equation $\hat{Y} = a + b_r X_r$.
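The residual argument can be illustrated numerically. In this minimal sketch, duration plays the role of $X_1$, importance of weather the role of $X_2$, and attitude the role of Y, all from Table 17.1. The helper `simple_fit` is a hypothetical name for an ordinary bivariate least-squares fit.

```python
# Table 17.1 columns
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]       # Y
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]    # X1
importance = [3, 11, 4, 1, 11, 1, 7, 4, 8, 10, 8, 5]     # X2


def simple_fit(x, y):
    """Bivariate least-squares fit; returns (intercept, slope)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope


# Step 1: regress X1 on X2 and keep the residual Xr = X1 - X1-hat
a1, b1_on_2 = simple_fit(importance, duration)
x_r = [x1 - (a1 + b1_on_2 * x2) for x1, x2 in zip(duration, importance)]

# Step 2: regress Y on the residual; the slope b_r is the partial coefficient
_, b_r = simple_fit(x_r, attitude)

# For comparison: the plain bivariate slope of Y on X1
_, b_bivariate = simple_fit(duration, attitude)

print(f"b_r (partial coefficient of duration) = {b_r:.5f}")
print(f"bivariate slope of duration           = {b_bivariate:.5f}")
```

The slope on the residual comes out near 0.481, the DURATION coefficient in Table 17.3, and differs from the bivariate slope of about 0.590.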
Extension to the case of k variables is straightforward. The partial regression coefficient, $b_1$, represents the expected change in Y when $X_1$ is changed by one unit and $X_2$ through $X_k$ are held constant. It can also be interpreted as the bivariate regression coefficient, b, for the regression of Y on the residuals of $X_1$, when the effect of $X_2$ through $X_k$ has been removed from $X_1$.

The relationship of the standardized to the non-standardized coefficients remains the same as before:

$$ B_1 = b_1\,(S_{x1}/S_y) \qquad \dots \qquad B_k = b_k\,(S_{xk}/S_y) $$

The estimated regression equation is:

($\hat{Y}$) = 0.33732 + 0.48108 $X_1$ + 0.28865 $X_2$

or

Attitude = 0.33732 + 0.48108 (Duration) + 0.28865 (Importance)

Table 17.3 Multiple Regression

Multiple R        0.97210
R2                0.94498
Adjusted R2       0.93276
Standard Error    0.85974

ANALYSIS OF VARIANCE
              df    Sum of Squares    Mean Square
Regression     2    114.26425         57.13213
Residual       9      6.65241          0.73916
F = 77.29364        Significance of F = 0.0000

VARIABLES IN THE EQUATION
Variable      b          SEb        Beta (β)    T        Significance of T
IMPORTANCE    0.28865    0.08608    0.31382     3.353    0.0085
DURATION      0.48108    0.05895    0.76363     8.160    0.0000
(Constant)    0.33732    0.56736                0.595    0.5668
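The coefficients in Table 17.3 come from computer output. As a rough illustration of where they come from, the sketch below solves the two-predictor normal equations in closed form on the Table 17.1 data. This is one standard way to obtain least-squares estimates and is not necessarily the method used by the program that produced the table. The helper `cross_dev` is a hypothetical name.

```python
# Table 17.1 columns
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]       # Y
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]    # X1
importance = [3, 11, 4, 1, 11, 1, 7, 4, 8, 10, 8, 5]     # X2

n = len(attitude)
y_bar = sum(attitude) / n
x1_bar = sum(duration) / n
x2_bar = sum(importance) / n


def cross_dev(u, u_bar, v, v_bar):
    """Sum of cross products of deviations from the means."""
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))


s11 = cross_dev(duration, x1_bar, duration, x1_bar)
s22 = cross_dev(importance, x2_bar, importance, x2_bar)
s12 = cross_dev(duration, x1_bar, importance, x2_bar)
s1y = cross_dev(duration, x1_bar, attitude, y_bar)
s2y = cross_dev(importance, x2_bar, attitude, y_bar)
ss_y = cross_dev(attitude, y_bar, attitude, y_bar)

# Closed-form solution of the 2x2 normal equations (Cramer's rule)
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s2y * s11 - s12 * s1y) / det
a = y_bar - b1 * x1_bar - b2 * x2_bar

# Explained sum of squares and coefficient of multiple determination
ss_reg = b1 * s1y + b2 * s2y
r_squared = ss_reg / ss_y

print(f"a = {a:.5f}, b1 = {b1:.5f}, b2 = {b2:.5f}")
print(f"SS_reg = {ss_reg:.4f}, R^2 = {r_squared:.5f}")
```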
Strength of Association

$$ SS_y = SS_{reg} + SS_{res} $$

where

$$ SS_y = \sum_{i=1}^{n}(Y_i-\bar{Y})^2 $$

$$ SS_{reg} = \sum_{i=1}^{n}(\hat{Y}_i-\bar{Y})^2 $$

$$ SS_{res} = \sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2 $$

The strength of association is measured by the square of the multiple correlation coefficient, $R^2$, which is also called the coefficient of multiple determination.

$$ R^2 = \frac{SS_{reg}}{SS_y} $$

$R^2$ is adjusted for the number of independent variables and the sample size by using the following formula:

$$ \text{Adjusted } R^2 = R^2 - \frac{k(1-R^2)}{n-k-1} $$
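The adjustment can be checked against the two regression tables. The sketch below assumes the usual form of the formula, $R^2 - k(1 - R^2)/(n - k - 1)$, and feeds in the printed $R^2$ values. The function name `adjusted_r_squared` is illustrative.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjust R^2 for k independent variables and a sample of size n."""
    return r_squared - k * (1 - r_squared) / (n - k - 1)


# Bivariate regression (Table 17.2): one predictor, n = 12
adj_bivariate = adjusted_r_squared(0.87624, n=12, k=1)

# Multiple regression (Table 17.3): two predictors, n = 12
adj_multiple = adjusted_r_squared(0.94498, n=12, k=2)

print(f"Adjusted R^2 (Table 17.2 inputs) = {adj_bivariate:.5f}")
print(f"Adjusted R^2 (Table 17.3 inputs) = {adj_multiple:.5f}")
```

Both results agree with the Adjusted R2 entries in the tables to about four decimal places; the small remaining difference comes from starting with rounded $R^2$ values.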