Data Analysis Lab Report: Analysis and Explanation

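Course: Data Analysis
Major: Information and Computing Science
Class:
Student ID:
Name:
School of Science, North University of China

Experiment 1: Using the SAS System

[Objective]
Become familiar with the SAS system, and become proficient at creating SAS data sets and using the essential SAS statements.

[Tasks]
1. Copy the contents of the SCORE data set into a temporary data set named test.

SCORE data set
Name      Sex  Math  Chinese  English
Alice     f    90    85       91
Tom       m    95    87       84
Jenny     f    93    90       83
Mike      m    80    85       80
Fred      m    84    85       89
Kate      f    97    83       82
Alex      m    92    90       91
Cook      m    75    78       76
Bennie    f    82    79       84
Hellen    f    85    74       84
Wincelet  f    90    82       87
Butt      m    77    81       79
Geoge     m    86    85       82
Tod       m    89    84       84
Chris     f    89    84       87
Janet     f    86    65       87

2. Split the records of SCORE into three data sets according to math: records with math of 90 or above go to good, records with math from 80 to 89 go to normal, and records with math below 80 go to bad.
3. Merge the good, normal, and bad data sets obtained in task 2.

[Equipment and Software Platform]
SAS

[Methods and Steps]
1:
    /* Create SCORE from in-stream data (list input, blank-delimited) */
    DATA SCORE;
      INPUT Name $ Sex $ Math Chinese English;
      CARDS;
    Alice f 90 85 91
    Tom m 95 87 84
    Jenny f 93 90 83
    Mike m 80 85 80
    Fred m 84 85 89
    Kate f 97 83 82
    Alex m 92 90 91
    Cook m 75 78 76
    Bennie f 82 79 84
    Hellen f 85 74 84
    Wincelet f 90 82 87
    Butt m 77 81 79
    Geoge m 86 85 82
    Tod m 89 84 84
    Chris f 89 84 87
    Janet f 86 65 87
    ;
    RUN;
    PROC PRINT DATA=SCORE;
    RUN;
    /* Copy SCORE into the temporary data set test */
    DATA test;
      SET SCORE;
    RUN;
2:
    /* Route each record to one of three data sets by its math score */
    DATA good normal bad;
      SET SCORE;
      SELECT;
        WHEN (Math >= 90) OUTPUT good;
        WHEN (Math >= 80 & Math < 90) OUTPUT normal;
        WHEN (Math < 80) OUTPUT bad;
      END;
    RUN;
    PROC PRINT DATA=good;
    RUN;
    PROC PRINT DATA=normal;
    RUN;
    PROC PRINT DATA=bad;
    RUN;
3:
    /* Concatenate the three data sets back into one */
    DATA All;
      SET good normal bad;
    RUN;
    PROC PRINT DATA=All;
    RUN;

[Results]
Result 1:
Result 2:
Result 3:

Experiment 2: Data Analysis of Listed Companies

[Objective]
Use SAS to carry out descriptive analysis and regression analysis of the experimental data, become familiar with data-analysis methods, and develop the ability to analyze and process real data.

[Tasks]
Table 2 lists, for a group of listed companies, the 2001 earnings per share (eps), the size of the tradable share float (scale), and the closing price on the last trading day of 2001 (price).

Table 2. Data for the listed companies
Code    Scale  EPS     Price
000096  8500   0.059   13.27
000099  6000   0.028   14.2
000150  12600  -0.003  7.12
000151  10500  0.026   10.08
000153  2500   0.056   22.75
000155  13000  -0.009  6.85
000156  3600   0.033   14.95
000157  10000  0.0612.65
000158  10000  0.018   8.38
000159  7000   0.008   12.15
000301  15365  0.04    7.31
000488  7700   0.101   13.26
000725  6000   0.044   12.33
000835  1338   0.0722.58
000869  3200   0.194   18.29
000877  7800   -0.084  12.55
000885  6000   -0.073  12.48
000890  16934  0.0319.12
000892  12000  0.0317.88
000897  14166  0.002   6.91
000900  21423  0.058   8.59
000901  4800   0.005   27.95
000902  6500   -0.031  10.92
000903  6000   0.109   11.79
000905  9500   0.046   9.29
000906  6650   0.007   14.47
000908  8988   0.006   8.28
000909  6000   0.002   9.99
000910  8000   0.036   8.9
000911  7280   0.067   9.01
000912  15000  0.1128.06
000913  8450   0.062   11.86
000915  4599   0.001   14.4
000916  34000  0.038   5.15
000917  11800  0.086   16.23
000918  6000   -0.045  10.12

1. For the stock price:
  1) compute the mean, variance, standard deviation, coefficient of variation, skewness, and kurtosis;
  2) compute the median, the upper and lower quartiles, the interquartile range, and the trimean;
  3) draw a histogram;
  4) draw a stem-and-leaf plot;
  5) test for normality (the W test);
  6) compute the covariance matrix and the Pearson correlation matrix;
  7) compute the Spearman correlation matrix;
  8) analyze the correlations among the indicators.
2. 1) Fit a linear regression of the stock price on scale and eps, and obtain the estimated regression parameters and the residuals.
  2) At significance level α = 0.05, test the significance of the regression and the significance of each predictor's effect on the response.
  3) Plot the residuals against the fitted values and draw a normal QQ plot of the residuals. Analyze these residuals and comment on them.

[Equipment and Software Platform]
SAS

[Methods and Steps]
    /* Read the Table 2 data: stock code, float size, EPS, closing price */
    data prices;
      input num scale eps price;
      cards;
    000096 8500 0.059 13.27
    000099 6000 0.028 14.2
    000150 12600 -0.003 7.12
    000151 10500 0.026 10.08
    000153 2500 0.056 22.75
    000155 13000 -0.009 6.85
    000156 3600 0.033 14.95
    000157 10000 0.0612.65
    000158 10000 0.018 8.38
    000159 7000 0.008 12.15
    000301 15365 0.04 7.31
    000488 7700 0.101 13.26
    000725 6000 0.044 12.33
    000835 1338 0.0722.58
    000869 3200 0.194 18.29
    000877 7800 -0.084 12.55
    000885 6000 -0.073 12.48
    000890 16934 0.0319.12
    000892 12000 0.0317.88
    000897 14166 0.002 6.91
    000900 21423 0.058 8.59
    000901 4800 0.005 27.95
    000902 6500 -0.031 10.92
    000903 6000 0.109 11.79
    000905 9500 0.046 9.29
    000906 6650 0.007 14.47
    000908 8988 0.006 8.28
    000909 6000 0.002 9.99
    000910 8000 0.036 8.9
    000911 7280 0.067 9.01
    000912 15000 0.1128.06
    000913 8450 0.062 11.86
    000915 4599 0.001 14.4
    000916 34000 0.038 5.15
    000917 11800 0.086 16.23
    000918 6000 -0.045 10.12
    ;
    run;
    proc print data=prices;
    run;
    /* Task 1.1: moments and coefficient of variation of price */
    proc means data=prices mean var std skewness kurtosis cv;
      var price;
      output out=result;
    run;
    /* Tasks 1.2, 1.4, 1.5: quantiles, stem-and-leaf plot, normality tests */
    proc univariate data=prices plot freq normal;
      var price;
      output out=result2;
    run;
    /* Task 1.3: histogram with a fitted normal curve */
    proc capability data=prices graphics noprint;
      histogram price / normal;
    run;
    /* Tasks 1.6 to 1.8: covariance, Pearson, and Spearman matrices.
       All three indicators are listed so the matrices are not 1x1. */
    proc corr data=prices pearson spearman cov nosimple;
      var scale eps price;
    run;
    /* Task 2: regression of price on scale and eps; p= and r= save
       the predicted values and residuals back into prices */
    proc reg data=prices;
      model price = scale eps / selection=backward noint p r;
      output out=prices p=p r=r;
    run;
    proc print data=prices;
    run;

[Results]
Results for question 2:

Experiment 3: Data Analysis of Seven Crime Rates in the 50 US States

[Objective]
Use SAS to carry out principal component analysis and factor analysis of the experimental data, become familiar with data-analysis methods, and develop the ability to analyze and process real data.

[Tasks]
Table 3 gives the rates of seven types of crime per 100,000 people in the 50 US states.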

The seven crimes are Murder, Rape, Robbery, Assault, Burglary, Larceny, and Auto (motor-vehicle crime).

Table 3. Rates of seven types of crime in the 50 US states
State           Murder  Rape  Robbery  Assault  Burglary  Larceny  Auto
Alabama         14.2    25.2  96.8     278.3    1135.5    1881.9   280.7
Alaska          10.8    51.6  96.8     284.0    1331.7    3369.8   753.3
Arizona         9.5     34.2  138.2    312.3    2346.1    4467.4   439.5
Arkansas        8.8     27.6  83.2     203.4    972.6     1862.1   183.4
California      11.5    49.4  287.0    358.0    2139.4    3499.8   663.5
Colorado        6.3     42.0  170.7    292.9    1935.2    3903.2   477.1
Connecticut     4.2     16.8  129.5    131.8    1346.0    2620.7   593.2
Delaware        6.0     24.9  157.0    194.2    1682.6    3678.4   467.0
Florida         10.2    39.6  187.9    449.1    1859.9    3840.5   351.4
Georgia         11.7    31.1  140.5    256.5    1351.1    2170.2   297.9
Hawaii          7.2     25.5  128.0    64.1     1911.5    3920.4   489.4
Idaho           5.5     19.4  39.6     172.5    1050.8    2599.6   237.6
Illinois        9.9     21.8  211.3    209.0    1085.0    2828.5   528.6
Indiana         7.4     26.5  123.2    153.5    1086.2    2498.7   377.4
Iowa            2.3     10.6  41.2     89.8     812.5     2685.1   219.9
Kansas          6.6     22.0  100.7    180.5    1270.4    2739.3   244.3
Kentucky        10.1    19.1  81.1     123.3    872.2     1662.1   245.4
Louisiana       15.5    30.9  142.9    335.5    1165.5    2469.9   337.7
Maine           2.4     13.5  38.7     170.0    1253.1    2350.7   246.9
Maryland        8.0     34.8  292.1    358.9    1400.0    3177.7   428.5
Massachusetts   3.1     20.8  169.1    231.6    1532.2    2311.3   1140.1
Michigan        9.3     38.9  261.9    274.6    1522.7    3159.0   545.5
Minnesota       2.7     19.5  85.9     85.8     1134.7    2559.3   343.1
Mississippi     14.3    19.6  65.7     189.1    915.6     1239.9   144.4
Missouri        9.6     28.3  189.0    233.5    1318.3    2424.2   378.4
Montana         5.4     16.7  39.2     156.8    804.9     2773.2   309.2
Nebraska        3.9     18.1  64.7     112.7    760.0     2316.1   249.1
Nevada          15.8    49.1  323.1    355.0    2453.1    4212.6   559.2
New Hampshire   3.2     10.7  23.2     76.0     1041.7    2343.9   293.4
New Jersey      5.6     21.0  180.4    185.1    1435.8    2774.5   511.5
New Mexico      8.8     39.1  109.6    343.4    1418.7    3008.6   259.5
New York        10.7    29.4  472.6    319.1    1728.0    2782.0   745.8
North Carolina  10.6    17.0  61.3     318.3    1154.1    2037.8   192.1
Ohio            7.8     27.3  190.5    181.1    1216.0    2696.8   400.4
North Dakota    0.9     9.0   13.3     43.8     446.1     1843.0   144.7
Oklahoma        8.6     29.2  73.8     205.0    1288.2    2228.1   326.8
Oregon          4.9     39.9  124.1    286.9    1636.4    3506.1   388.9
Pennsylvania    5.6     19.0  130.3    128.0    877.5     1624.1   333.2
Rhode Island    3.6     10.5  86.5     201.0    1489.5    2844.1   791.4
South Carolina  11.9    33.0  105.9    485.3    1613.6    2342.4   245.1
South Dakota    2.0     13.5  17.9     155.7    570.5     1704.4   147.5
Tennessee       10.1    29.7  145.8    203.9    1259.7    1776.5   314.0
Texas           13.3    33.8  152.4    208.2    1603.1    2988.7   397.6
Utah            3.5     20.3  68.8     147.3    1171.6    3004.6   334.5
Vermont         1.4     15.9  30.8     101.2    1348.2    2201.0   265.2
Virginia        9.0     23.3  92.1     165.7    986.2     2521.2   226.7
Washington      4.3     39.6  106.2    224.8    1605.6    3386.9   360.3
West Virginia   6.0     13.2  42.2     90.9     597.4     1341.7   163.3
Wisconsin       2.8     12.9  52.2     63.7     846.9     2614.2   220.7
Wyoming         5.4     21.9  39.7     173.9    811.6     2772.2   282.0

1. 1) Carry out principal component analysis using the sample covariance matrix and the sample correlation matrix separately. How do the two sets of results differ?
  2) Can the variation in the original data be captured by three or fewer principal components? Give a reasonable interpretation of the components selected.
  3) Compute the scores of the first sample principal component obtained from the sample correlation matrix, and rank them.
2. Starting from the sample correlation matrix, carry out a factor analysis.

[Equipment and Software Platform]
SAS

[Methods and Steps]
First copy the data above into Excel, then import it through SAS into the data set crime.
Principal component analysis from the sample covariance matrix:
    proc princomp data=work.crime covariance;
    run;
Principal component analysis from the sample correlation matrix:
    proc princomp data=work.crime;
    run;
Ranking by the first sample principal component:
    /* out= saves the component scores (prin1, prin2, ...) */
    proc princomp data=crime out=defen;
    run;
    proc sort data=defen;
      by prin1;
    run;
    proc print data=defen;
    run;
2. Program:
    proc factor data=work.crime score;
    run;

[Results]

Experiment 4: Data Analysis of the 1991 Average Monthly Income of Urban Residents in China's Provinces, Autonomous Regions, and Municipalities

[Objective]
Use SAS to carry out discriminant analysis and cluster analysis of the experimental data, become familiar with data-analysis methods, and develop the ability to analyze and process real data.

[Tasks]
The 1991 average monthly income of urban residents in each province, autonomous region, and municipality is given in the table below. The variables are:
x1: per-capita income available for living expenses (yuan/person);
x2: per-capita wages of employees of state-owned units (yuan/person);
x3: per-capita standard wages from state-owned units (yuan/person);
x4: per-capita wages from collectively owned units (yuan/person);
x5: per-capita standard wages of employees of collective units (yuan/person);
x6: per-capita bonuses and above-quota wages (yuan/person);
x7: per-capita allowances (yuan/person);
x8: per-capita other income that employees receive from their work units (yuan/person);
x9: income of self-employed workers (yuan/person).

Columns: province (region/city) name, type, x1, x2, x3, x4, x5, x6, x7, x8, x9
1170.03110.259.768.384.4926.816.4411.90.41*1141.5582.5850.9813.49.3321.312.369.211.051119.483.3353.39117.5217.311.79120.71194.53107.860.2415.68.883121.0111.80.161130.4686.2152.315.910.520.6112.149.610.471119.2985.4153.0213.18.4413.8716.478.380.51*1134.4698.6148.188.94.3421.4926.1213.64.561143.7999.9745.66.31.5618.6729.4911.83.821128.0574.9650.1313.99.6216.1410.1814.510211127.4193.5450.5710.55.8719.4121.212.60.9*1122.96101.469.76.33.8611.318.965.624.622102.4971.7247.729.426.9613.127.96.660.612106.1476.2746.199.656.279.65520.16.970.962104.9372.9944.613.79.019.43520.616.651.682103.3462.9942.9511.17.418.34210.196.452.68298.08969.4543.0411.47.9510.5916.57.691.082104.1272.2347.319.486.4313.1410.438.31.112108.4980.7947.526.063.4213.6916.538.372.852113.9975.650.885.213.8612.949.4926.771.272114.0684.3152.787.815.4410.8216.433.791.192108.880.4150.457.274.078.37118.985.950.832115.9688.2151.858.815.6313.9522.654.750.973128.4668.9143.4122.415.313.8812.429.011.413135.2473.1844.5423.915.222.389.66113.91.193162.5380.1145.9924.313.929.5410.9133.473111.7771.0743.6419.412.516.689.6987.020.633139.0979.0944.1918.510.520.2316.477.673.08312484.6644.0513.57.4719.1120.4910.31.76
To be classified: 211.311441.4433.211.248.7230.7714.911.1
To be classified: 175.93163.857.894.223.3717.8182.3215.70

1. 1) Determine which income type each of the two unclassified provinces/regions belongs to, and estimate the misclassification rate by resubstitution and by cross-validation.
  2) Carry out Bayes discrimination, and check the classification results by resubstitution and by cross-validation.
2. 1) Cluster the data with the single-linkage (shortest distance), complete-linkage (longest distance), and average-linkage methods, draw the dendrograms, and write out the result of dividing into 3 clusters.
  2) Cluster the data with the fast clustering method, and write out the result of dividing into 3 clusters.

[Equipment and Software Platform]
SAS

[Methods and Steps]
1: It was found that …
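The discriminant and cluster analyses requested in the Experiment 4 tasks could be set up along the following lines. This is only a minimal sketch under stated assumptions, not the report's own program: it assumes the classified records have already been loaded into a data set named income with the variables province, type, and x1-x9, and that the two unclassified records sit in a data set named newobs. The data set names income and newobs, the variable name province, the output data set names, and the macro name hclust are all hypothetical.

```sas
/* Assumed inputs (hypothetical names):
     work.income : classified records, variables province type x1-x9
     work.newobs : the two unclassified records, variables x1-x9      */

/* Task 1.1: discriminant analysis with equal priors.
   LIST/CROSSLIST print resubstitution and cross-validation results;
   TESTDATA=/TESTLIST classify the two unclassified records. */
proc discrim data=income testdata=newobs method=normal pool=yes
             list crossvalidate crosslist testlist;
  class type;
  var x1-x9;
  priors equal;
run;

/* Task 1.2: Bayes discrimination, priors proportional to group sizes */
proc discrim data=income testdata=newobs method=normal pool=yes
             list crossvalidate crosslist testlist;
  class type;
  var x1-x9;
  priors proportional;
run;

/* Task 2.1: hierarchical clustering by one linkage method, followed by
   a dendrogram and a cut into 3 clusters */
%macro hclust(method);
  proc cluster data=income method=&method outtree=tree_&method;
    var x1-x9;
    id province;
  run;
  proc tree data=tree_&method nclusters=3 out=clus_&method;
    id province;
  run;
  proc print data=clus_&method;
  run;
%mend hclust;

%hclust(single);    /* shortest distance */
%hclust(complete);  /* longest distance  */
%hclust(average);   /* group average     */

/* Task 2.2: fast (k-means type) clustering into 3 clusters */
proc fastclus data=income maxclusters=3 out=clus_fast;
  var x1-x9;
  id province;
run;
proc print data=clus_fast;
run;
```

Running the two PROC DISCRIM steps separately keeps the equal-prior and proportional-prior classifications apart, so their resubstitution and cross-validation error estimates can be compared side by side.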
