Principal Component Analysis of a Set of Air-Pollution Data

[Note] The multivariate statistics exercises below are taken from Applied Multivariate Statistical Analysis (5th edition) by R. A. Johnson et al. The original book is: Richard A. Johnson and Dean W. Wichern. Applied Multivariate Statistical Analysis (5th Ed). Pearson Education, Inc. 2003. I used the photo-offset reprint issued by China Statistics Press in 2003. The first problem is Exercise 1.6 of the book (Exercise 6 of Chapter 1); the second is Exercise 8.12 (Exercise 12 of Chapter 8). The second problem uses the data of the first.

Exercise 1.6. The data in Table 1.5 are 42 measurements on air-pollution variables recorded at 12:00 noon in the Los Angeles area on different days.
(a) Plot the marginal dot diagrams for all the variables.
(b) Construct the x̄, Sn, and R arrays, and interpret the entries in R.

TABLE 1.5 AIR-POLLUTION DATA

Wind   Solar radiation   CO     NO     NO2    O3     HC
(x1)   (x2)              (x3)   (x4)   (x5)   (x6)   (x7)
8      98                7      2      12     8      2
7      107               4      3      9      5      3
7      103               4      3      5      6      3
10     88                5      2      8      15     4
6      91                4      2      8      10     3
8      90                5      2      12     12     4
9      84                7      4      12     15     5
5      72                6      4      21     14     4
7      82                5      1      11     11     3
8      64                5      2      13     9      4
6      71                5      4      10     3      3
6      91                4      2      12     7      3
7      72                7      4      18     10     3
10     70                4      2      11     7      3
10     72                4      1      8      10     3
9      77                4      1      9      10     3
8      76                4      1      7      7      3
8      71                5      3      16     4      4
9      67                4      2      13     2      3
9      69                3      3      9      5      3
10     62                5      3      14     4      4
9      88                4      2      7      6      3
8      80                4      2      13     11     4
5      30                3      3      5      2      3
6      83                5      1      10     23     4
8      84                3      2      7      6      3
6      78                4      2      11     11     3
8      79                2      1      7      10     3
6      62                4      3      9      8      3
10     37                3      1      7      2      3
8      71                4      1      10     7      3
7      52                4      1      12     8      4
5      48                6      5      8      4      3
6      75                4      1      10     24     3
10     35                4      1      6      9      2
8      85                4      1      9      10     2
5      86                3      1      6      12     2
5      86                7      2      13     18     2
7      79                7      4      9      25     3
7      79                5      2      8      6      2
6      68                6      2      11     14     3
8      40                4      3      6      5      2

Source: Data courtesy of Professor G. C. Tiao.

Exercise 8.12. Consider the air-pollution data listed in Table 1.5. Your job is to summarize these data in fewer than p = 7 dimensions if possible. Conduct a principal component analysis of the data using both the covariance matrix S and the correlation matrix R. What have you learned? Does it make any difference which matrix is chosen for analysis? Can the data be summarized in three or fewer dimensions? Can you interpret the principal components?

Partial solutions

2.1 Some summary statistics

Means (x̄) and standard deviations computed with Excel:

          Wind        Solar radiation   CO          NO          NO2         O3          HC
Average   7.5         73.857143         4.547619    2.1904762   10.047619   9.4047619   3.0952381
Stdev     1.5811388   17.335388         1.2337209   1.0873574   3.3709837   5.5658345   0.6917466

Covariance matrix S given by Excel (lower triangle):

                  Wind        Solar radiation   CO          NO          NO2         O3          HC
Wind              2.4404762
Solar radiation   -2.714286   293.36054
CO                -0.369048   3.8163265         1.4858277
NO                -0.452381   -1.353741         0.6575964   1.154195
NO2               -0.571429   6.6020408         2.2596372   1.0623583   11.092971
O3                -2.178571   30.057823         2.7545351   -0.791383   3.0521542   30.240933
HC                0.1666667   0.6088435         0.138322    0.1723356   1.0192744   0.5804989   0.4671202

Correlation matrix R given by Excel (lower triangle):

                  Wind        Solar radiation   CO          NO          NO2         O3          HC
Wind              1
Solar radiation   -0.101442   1
CO                -0.193803   0.1827934         1
NO                -0.269543   -0.073569         0.5021525   1
NO2               -0.109825   0.115732          0.5565838   0.2968981   1
O3                -0.253593   0.3191237         0.4109288   -0.133952   0.1666422   1
HC                0.1560979   0.0520104         0.1660323   0.2347043   0.4477678   0.1544506   1

The correlation matrix shows that CO is clearly correlated with NO and NO2, and that O3 is clearly correlated with Solar radiation and CO. The principal component analysis that follows groups CO with NO and NO2 in one component, O3 with Solar radiation in a second, and HC with Wind in a third. The correlation between HC and Wind is not high (0.156), but it is the only positive correlation Wind has with any other variable. After a varimax orthogonal rotation, HC joins CO, NO and NO2 in a single factor, because HC has a fairly high correlation with NO2 and its correlations with CO and NO are higher than those with the remaining variables.

2.2 Principal component analysis, part 1: data not standardized

The following results were produced by SPSS starting from the correlation matrix R. The raw data were not standardized. "Starting from R" means selecting Correlation Matrix under Analyze in the SPSS Factor Analysis: Extraction dialog. The correlation matrix reported by SPSS (Correlation Matrix) agrees with the one computed in Excel.

Correlation Matrix

                  WIND    Solar radiation   CO      NO      NO2     O3      HC
WIND              1.000   -.101             -.194   -.270   -.110   -.254   .156
Solar radiation   -.101   1.000             .183    -.074   .116    .319    .052
CO                -.194   .183              1.000   .502    .557    .411    .166
NO                -.270   -.074             .502    1.000   .297    -.134   .235
NO2               -.110   .116              .557    .297    1.000   .167    .448
O3                -.254   .319              .411    -.134   .167    1.000   .154
HC                .156    .052              .166    .235    .448    .154    1.000

The communalities (Communalities) are listed below. They range from 0.544 to 0.795, so they do not differ greatly. None of them reaches 0.8, however, which means that for every variable less than 80% of its variance is captured by the three principal components.

Communalities

                  Initial   Extraction
WIND              1.000     .737
Solar radiation   1.000     .544
CO                1.000     .725
NO                1.000     .795
NO2               1.000     .681
O3                1.000     .722
HC                1.000     .722

Extraction Method: Principal Component Analysis.

The eigenvalues and variance contributions (Total Variance Explained) are given in the next table. Extracting three principal components explains 70.384% of the total variance of the original 7 variables.

Total Variance Explained

            Initial Eigenvalues                       Extraction Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %      Total   % of Variance   Cumulative %
1           2.337   33.383          33.383            2.337   33.383          33.383
2           1.386   19.800          53.183            1.386   19.800          53.183
3           1.204   17.201          70.384            1.204   17.201          70.384
4           .727    10.387          80.771
5           .653    9.335           90.106
6           .537    7.667           97.773
7           .156    2.227           100.000

Extraction Method: Principal Component Analysis.
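The eigenvalue table above can be cross-checked outside SPSS. The following is a minimal sketch, assuming NumPy in place of SPSS and using the Excel correlation matrix from section 2.1 as input; the variable names (`lower`, `loadings`, `communalities`) are illustrative and do not come from the original report. It computes the eigenvalues of R, the percentage of variance per component, and the communalities of a three-component solution.

```python
import numpy as np

names = ["Wind", "Solar radiation", "CO", "NO", "NO2", "O3", "HC"]

# Lower triangle of the Excel correlation matrix R from section 2.1.
lower = [
    [1.0],
    [-0.101442, 1.0],
    [-0.193803, 0.1827934, 1.0],
    [-0.269543, -0.073569, 0.5021525, 1.0],
    [-0.109825, 0.115732, 0.5565838, 0.2968981, 1.0],
    [-0.253593, 0.3191237, 0.4109288, -0.133952, 0.1666422, 1.0],
    [0.1560979, 0.0520104, 0.1660323, 0.2347043, 0.4477678, 0.1544506, 1.0],
]
p = len(lower)
R = np.zeros((p, p))
for i, row in enumerate(lower):
    R[i, : i + 1] = row
R = R + R.T - np.eye(p)  # mirror the triangle; the diagonal was counted twice

# eigh is meant for symmetric matrices and returns ascending eigenvalues.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pct = 100 * eigvals / eigvals.sum()   # "% of Variance"
cum = np.cumsum(pct)                  # "Cumulative %"

# Loadings of the first three components and the resulting communalities.
loadings = eigvecs[:, :3] * np.sqrt(eigvals[:3])
communalities = (loadings ** 2).sum(axis=1)

for k in range(p):
    print(f"component {k + 1}: {eigvals[k]:.3f}  {pct[k]:.3f}  {cum[k]:.3f}")
for name, h in zip(names, communalities):
    print(f"{name:16s} communality {h:.3f}")
```

The signs of individual eigenvectors are arbitrary, so loadings may come out with flipped signs relative to SPSS; eigenvalues and communalities are unaffected by this.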

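[Figure: Scree Plot. Eigenvalue (vertical axis, ticks from 0.0 to 2.5) against Component Number (horizontal axis, 1 to 7).]

The component loading matrix (Component Matrix) is shown in the following table.

Component Matrix(a)

                  Component
                  1       2           3
WIND              -.362   .328        .706
Solar radiation   .314    -.620       .246
CO                .842    -8.03E-03   -.125
NO                .577    .512        -.447
NO2               .761    .235        .216
O3                .496    -.667       .175
HC                .488    .362        .594

Extraction Method: Principal Component Analysis.
a. 3 components extracted.

The table was copied from SPSS into Excel and the loadings were color-coded into groups. The Excel copy, with full precision, is:

                  Component
                  1          2          3
WIND              -0.36202   0.327809   0.706084
Solar radiation   0.31424    -0.61997   0.24631
CO                0.842417   -0.00803   -0.12466
NO                0.577243   0.511736   -0.44671
NO2               0.761294   0.235183   0.215682
O3                0.496126   -0.66749   0.175399
HC                0.488257   0.362466   0.593692

The principal components group the variables as follows:
First principal component, main related variables: CO, NO, NO2.
Second principal component, main related variables: Solar radiation, O3.
Third principal component, main related variables: Wind, HC.

In the component loading plot (Component Plot), the three groups of variables fall into the regions represented by the three different principal components.

The component scores are listed below. The last column gives a brief interpretation for a few typical samples. When interpreting a score, check the sign of the corresponding loadings in the component matrix: the loadings of Solar radiation and O3 on the second component are negative, so a strongly negative f2 indicates high values of those variables.

Case   f1         f2         f3         Remarks on typical samples
S1     0.61591    -0.8186    -0.38418
S2     0.03194    -0.36015   -0.26343
S3     -0.34752   -0.54481   -0.49701
S4     0.2425     -0.30293   1.80367    Sample 4: region with high Wind and HC levels
S5     -0.12729   -0.91941   -0.4042
S6     0.72612    -0.19278   1.21954
S7     2.03686    0.89982    1.4607     Samples 7 and 8: regions clearly associated with CO, NO and NO2 pollution
S8     2.57309    0.77732    -0.34124
S9     0.09802    -0.81736   0.30334
S10    0.50664    0.78803    0.88735
S11    0.3904     0.97744    -1.48345
S12    0.14485    -0.45848   -0.27016
S13    1.92477    0.88883    -0.66029
S14    -0.50662   0.63139    0.91242
S15    -0.89378   -0.17036   1.19632
S16    -0.66037   -0.39862   0.93758
S17    -0.87787   -0.3635    0.3701
S18    0.88733    1.5306     0.65731
S19    -0.42935   1.09253    0.48155
S20    -0.751     0.92424    0.11384
S21    0.42826    1.96133    1.18659    Sample 21: region with low Solar radiation and O3 levels
S22    -0.69373   -0.09747   0.51522
S23    0.41484    0.20681    1.21242
S24    -1.16263   1.39047    -2.12097
S25    0.86691    -1.70335   0.91799
S26    -0.91899   -0.13915   0.18106
S27    0.09994    -0.51948   -0.37202
S28    -1.32458   -0.6911    0.65186
S29    -0.10472   0.39184    -1.08681
S30    -1.8593    1.37933    0.6047
S31    -0.62672   -0.08347   0.47051
S32    -0.14264   0.64941    0.72066
S33    0.67421    1.56899    -2.63096   Sample 33: region with low Wind and HC levels
S34    0.24874    -1.95681   0.22088
S35    -1.71429   0.39216    -0.08554
S36    -0.80238   -1.13269   -0.0517
S37    -1.00653   -1.92662   -1.17569   Samples 37 and 38: regions with high Solar radiation and O3 levels
S38    1.29486    -1.77265   -1.32357
S39    1.68145    -1.04272   -0.66334
S40    -0.48079   -0.49683   -1.07633
S41    0.72122    -0.53042   -0.57934
S42    -1.17776   0.98919    -1.55538

2.3 Principal component analysis, part 2: data not standardized

The following results were produced by SPSS starting from the covariance matrix S. The raw data were not standardized. "Starting from S" means selecting Covariance Matrix under Analyze in the SPSS Factor Analysis: Extraction dialog.

The communalities (Communalities) are listed below. In the Raw columns, the Initial values are the variances of the original data. They differ from the diagonal of the Excel covariance matrix above because Excel reports population variances (divisor n) while SPSS reports sample variances (divisor n - 1). Taking the Initial value of Wind as an example, 2.4404762 × 42/41 = 2.5, or equivalently 2.5 × 41/42 = 2.4404762 (compare the covariance matrix in section 2.1). Each Rescaled value is the ratio of the Raw Extraction value to the Raw Initial value.

Communalities

                  Raw                     Rescaled
                  Initial   Extraction    Initial   Extraction
WIND              2.500     3.067E-02     1.000     1.227E-02
Solar radiation   300.516   300.134       1.000     .999
CO                1.522     6.017E-02     1.000     3.953E-02
NO                1.182     6.750E-03     1.000     5.709E-03
NO2               11.364    .179          1.000     1.575E-02
O3                30.979    3.846         1.000     .124
HC                .479      1.667E-03     1.000     3.484E-03

Extraction Method: Principal Component Analysis.

Column totals of the communalities:

                  Raw                      Rescaled
                  Initial     Extraction   Initial   Extraction
WIND              2.5         0.0306651    1         0.012266
Solar radiation   300.51568   300.13367    1         0.9987288
CO                1.5220674   0.0601666    1         0.0395295
NO                1.1823461   0.0067502    1         0.0057091
NO2               11.363531   0.1790059    1         0.0157527
O3                30.978513   3.8459428    1         0.1241487
HC                0.4785134   0.0016671    1         0.0034839
Total             348.54065   304.25786    7         1.1996188

The eigenvalues and variance contributions (Total Variance Explained) are given in the next table. The Raw block suggests that a single principal component explains 87.295% of the total variance of the original 7 variables, but after rescaling the figure drops to 17.137%. Both numbers follow from the communality totals: before rescaling the explained share is 304.25786/348.54065 × 100 = 87.295%; after rescaling it is 1.1996188/7 × 100 = 17.137%.

Total Variance Explained

                        Initial Eigenvalues(a)                      Extraction Sums of Squared Loadings
           Component    Total     % of Variance   Cumulative %      Total     % of Variance   Cumulative %
Raw        1            304.258   87.295          87.295            304.258   87.295          87.295
           2            28.276    8.113           95.408
           3            11.464    3.289           98.697
           4            2.524     .724            99.421
           5            1.280     .367            99.788
           6            .529      .152            99.940
           7            .210      6.014E-02       100.000
Rescaled   1            304.258   87.295          87.295            1.200     17.137          17.137
           2            28.276    8.113           95.408
           3            11.464    3.289           98.697
           4            2.524     .724            99.421
           5            1.280     .367            99.788
           6            .529      .152            99.940
           7            .210      6.014E-02       100.000

Extraction Method: Principal Component Analysis.
a. When analyzing a covariance matrix, the initial eigenvalues are the same across the raw and rescaled solution.

[Figure: Scree Plot. Eigenvalue (vertical axis, ticks from 0 to 400) against Component Number (horizontal axis, 1 to 7).]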
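The arithmetic in this section (the n versus n - 1 divisor, and the 87.295% versus 17.137% figures) can be reproduced from the raw data. The following is a minimal sketch, assuming NumPy in place of Excel and SPSS and taking Table 1.5 as input; names such as `flat`, `S_sample` and `rescaled` are illustrative. The rescaled loadings are obtained here by dividing the raw loadings by each variable's standard deviation, which matches the ratio rule stated for the communalities.

```python
import numpy as np

# Table 1.5, flattened row by row (three observations per line).
# Columns: Wind, Solar radiation, CO, NO, NO2, O3, HC.
flat = [
    8, 98, 7, 2, 12, 8, 2,    7, 107, 4, 3, 9, 5, 3,    7, 103, 4, 3, 5, 6, 3,
    10, 88, 5, 2, 8, 15, 4,   6, 91, 4, 2, 8, 10, 3,    8, 90, 5, 2, 12, 12, 4,
    9, 84, 7, 4, 12, 15, 5,   5, 72, 6, 4, 21, 14, 4,   7, 82, 5, 1, 11, 11, 3,
    8, 64, 5, 2, 13, 9, 4,    6, 71, 5, 4, 10, 3, 3,    6, 91, 4, 2, 12, 7, 3,
    7, 72, 7, 4, 18, 10, 3,   10, 70, 4, 2, 11, 7, 3,   10, 72, 4, 1, 8, 10, 3,
    9, 77, 4, 1, 9, 10, 3,    8, 76, 4, 1, 7, 7, 3,     8, 71, 5, 3, 16, 4, 4,
    9, 67, 4, 2, 13, 2, 3,    9, 69, 3, 3, 9, 5, 3,     10, 62, 5, 3, 14, 4, 4,
    9, 88, 4, 2, 7, 6, 3,     8, 80, 4, 2, 13, 11, 4,   5, 30, 3, 3, 5, 2, 3,
    6, 83, 5, 1, 10, 23, 4,   8, 84, 3, 2, 7, 6, 3,     6, 78, 4, 2, 11, 11, 3,
    8, 79, 2, 1, 7, 10, 3,    6, 62, 4, 3, 9, 8, 3,     10, 37, 3, 1, 7, 2, 3,
    8, 71, 4, 1, 10, 7, 3,    7, 52, 4, 1, 12, 8, 4,    5, 48, 6, 5, 8, 4, 3,
    6, 75, 4, 1, 10, 24, 3,   10, 35, 4, 1, 6, 9, 2,    8, 85, 4, 1, 9, 10, 2,
    5, 86, 3, 1, 6, 12, 2,    5, 86, 7, 2, 13, 18, 2,   7, 79, 7, 4, 9, 25, 3,
    7, 79, 5, 2, 8, 6, 2,     6, 68, 6, 2, 11, 14, 3,   8, 40, 4, 3, 6, 5, 2,
]
X = np.array(flat, dtype=float).reshape(-1, 7)
n, p = X.shape

S_sample = np.cov(X, rowvar=False)        # divisor n - 1, as in SPSS
S_pop = np.cov(X, rowvar=False, ddof=0)   # divisor n, as in the Excel matrix
print("Wind variance, sample vs population:", S_sample[0, 0], S_pop[0, 0])

# Principal components of the sample covariance matrix.
eigvals, eigvecs = np.linalg.eigh(S_sample)   # ascending order
lam1, v1 = eigvals[-1], eigvecs[:, -1]
if v1[1] < 0:                                 # fix the arbitrary sign
    v1 = -v1

raw_loadings = np.sqrt(lam1) * v1                        # "Raw" column
rescaled = raw_loadings / np.sqrt(np.diag(S_sample))     # "Rescaled" column

pct_raw = 100 * lam1 / eigvals.sum()              # share of total variance
pct_rescaled = 100 * np.sum(rescaled ** 2) / p    # share after rescaling
print(f"raw: {pct_raw:.3f}%  rescaled: {pct_rescaled:.3f}%")
```

The gap between the two percentages comes from Solar radiation alone: its variance (about 300.5) accounts for most of the total of about 348.5, so the first raw component is almost a copy of that single variable.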

The component loading matrix (Component Matrix) is shown in the following table. Because the variance of Solar radiation is very large, this variable completely dominates the first principal component.

Component Matrix(a)

                  Raw           Rescaled
                  Component 1   Component 1
WIND              -.175         -.111
Solar radiation   17.324        .999
CO                .245          .199
NO                -.082         -.076
NO2               .423          .126
O3                1.961         .352
HC                .041          .059

Extraction Method: Principal Component Analysis.
a. 1 components extracted.

2.4 Principal component analysis, part 3: data standardized

The following results were produced by SPSS starting from the covariance matrix S, with the raw data standardized first. All results are identical before and after rescaling, and they are identical to the results obtained from the correlation matrix R.

The communalities (Communalities) are listed below; the Raw and Rescaled values are the same.

Communalities

                  Raw                    Rescaled
                  Initial   Extraction   Initial   Extraction
WIND              1.000     .737         1.000     .737
Solar radiation   1.000     .544         1.000     .544
CO                1.000     .725         1.000     .725
NO                1.000     .795         1.000     .795
NO2               1.000     .681         1.000     .681
O3                1.000     .722         1.000     .722
HC                1.000     .722         1.000     .722

Extraction Method: Principal Component Analysis.

The eigenvalues and variance contributions (Total Variance Explained) are given in the next table. The Raw and Rescaled results are the same.

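Total Variance Explained

                        Initial Eigenvalues(a)                    Extraction Sums of Squared Loadings
           Component    Total   % of Variance   Cumulative %      Total   % of Variance   Cumulative %
Raw        1            2.337   33.383          33.383            2.337   33.383          33.383
           2            1.386   19.800          53.183            1.386   19.800          53.183
           3            1.204   17.201          70.384            1.204   17.201          70.384
           4            .727    10.387          80.771
           5            .653    9.335           90.106
           6            .537    7.667           97.773
           7            .156    2.227           100.000
Rescaled   1            2.337   33.383          33.383            2.337   33.383          33.383
           2            1.386   19.800          53.183            1.386   19.800          53.183
           3            1.204   17.201          70.384            1.204   17.201          70.384
           4            .727    10.387          80.771
           5            .653    9.335           90.106
           6            .537    7.667           97.773
           7            .156    2.227           100.000

Extraction Method: Principal Component Analysis.
a. When analyzing a covariance matrix, the initial eigenvalues are the same across the raw and rescaled solution.

[Figure: Scree Plot. Eigenvalue (vertical axis, ticks from 0.0 to 2.5) against Component Number (horizontal axis, 1 to 7).]

The component loading matrix (Component Matrix) is shown in the following table; it is the same before and after rescaling. Comparing it with section 2.3 shows that the relative importance of the first principal component is strongly affected by standardization: it explains 87.295% of the variance of the raw data but only 33.383% after standardization. The natural conclusion is that when variables are measured on very different ranges, or in units of different dimensions, they must be standardized before a covariance-based analysis. If they are not standardized, the principal component analysis should start from the correlation matrix instead.

Component Matrix(a)

                  Raw                      Rescaled
                  Component                Component
                  1       2       3        1       2       3
WIND              -.362   .328    .706     -.362   .328    .706
Solar radiation   .314    -.620   .246     .314    -.620   .246
CO                .842    -.008   -.125    .842    -.008   -.125
NO                .577    .512    -.447    .577    .512    -.447
NO2               .761    .235    .216     .761    .235    .216
O3                .496    -.667   .175     .496    -.667   .175
HC                .488    .362    .594     .488    .362    .594

Extraction Method: Principal Component Analysis.
a. 3 components extracted.

[Figure: Component Plot. Three-dimensional loading plot with axes Component 1, Component 2 and Component 3; plotted variables: wind, solar radiation, co, no, no2, o3, hc.]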
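The equivalence used in section 2.4 (the covariance matrix of standardized data is the correlation matrix of the raw data) can be checked numerically. The following is a minimal sketch, assuming NumPy in place of SPSS and taking Table 1.5 as input; the names `flat`, `Z`, `S_z` and `eig_z` are illustrative and do not come from the original report.

```python
import numpy as np

# Table 1.5, flattened row by row (three observations per line).
# Columns: Wind, Solar radiation, CO, NO, NO2, O3, HC.
flat = [
    8, 98, 7, 2, 12, 8, 2,    7, 107, 4, 3, 9, 5, 3,    7, 103, 4, 3, 5, 6, 3,
    10, 88, 5, 2, 8, 15, 4,   6, 91, 4, 2, 8, 10, 3,    8, 90, 5, 2, 12, 12, 4,
    9, 84, 7, 4, 12, 15, 5,   5, 72, 6, 4, 21, 14, 4,   7, 82, 5, 1, 11, 11, 3,
    8, 64, 5, 2, 13, 9, 4,    6, 71, 5, 4, 10, 3, 3,    6, 91, 4, 2, 12, 7, 3,
    7, 72, 7, 4, 18, 10, 3,   10, 70, 4, 2, 11, 7, 3,   10, 72, 4, 1, 8, 10, 3,
    9, 77, 4, 1, 9, 10, 3,    8, 76, 4, 1, 7, 7, 3,     8, 71, 5, 3, 16, 4, 4,
    9, 67, 4, 2, 13, 2, 3,    9, 69, 3, 3, 9, 5, 3,     10, 62, 5, 3, 14, 4, 4,
    9, 88, 4, 2, 7, 6, 3,     8, 80, 4, 2, 13, 11, 4,   5, 30, 3, 3, 5, 2, 3,
    6, 83, 5, 1, 10, 23, 4,   8, 84, 3, 2, 7, 6, 3,     6, 78, 4, 2, 11, 11, 3,
    8, 79, 2, 1, 7, 10, 3,    6, 62, 4, 3, 9, 8, 3,     10, 37, 3, 1, 7, 2, 3,
    8, 71, 4, 1, 10, 7, 3,    7, 52, 4, 1, 12, 8, 4,    5, 48, 6, 5, 8, 4, 3,
    6, 75, 4, 1, 10, 24, 3,   10, 35, 4, 1, 6, 9, 2,    8, 85, 4, 1, 9, 10, 2,
    5, 86, 3, 1, 6, 12, 2,    5, 86, 7, 2, 13, 18, 2,   7, 79, 7, 4, 9, 25, 3,
    7, 79, 5, 2, 8, 6, 2,     6, 68, 6, 2, 11, 14, 3,   8, 40, 4, 3, 6, 5, 2,
]
X = np.array(flat, dtype=float).reshape(-1, 7)

# Standardize each column: subtract the mean, divide by the sample std (n - 1).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

S_z = np.cov(Z, rowvar=False)        # covariance matrix of standardized data
R = np.corrcoef(X, rowvar=False)     # correlation matrix of the raw data
print("cov(Z) equals corr(X):", np.allclose(S_z, R))

# Identical matrices give identical eigenvalues, hence identical PCA results.
eig_z = np.sort(np.linalg.eigvalsh(S_z))[::-1]
eig_r = np.sort(np.linalg.eigvalsh(R))[::-1]
print("eigenvalues from cov(Z): ", np.round(eig_z, 3))
print("eigenvalues from corr(X):", np.round(eig_r, 3))
```

Because the two matrices coincide, standardizing first and then analyzing S is the same computation as analyzing R directly, which is why sections 2.2 and 2.4 report the same tables.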
