




I. Consider the training examples shown in the table for a binary classification problem.
1. What is the entropy of the whole training set with respect to the class attribute?
2. What are the information gains of a1 and a2 relative to these training examples?
3. For the continuous attribute a3, compute the information gain for every possible split.
4. According to information gain, which of a1, a2, and a3 gives the best split?
5. According to classification error rate, which of a1 and a2 is best?
6. According to the Gini index, which of a1 and a2 is best?

Answer 1: P(+) = 4/9 and P(−) = 5/9, so the entropy is −4/9 log2(4/9) − 5/9 log2(5/9) = 0.9911.
Answer 2: (probably not on the exam)
Answer 3:
Answer 4: According to information gain, a1 produces the best split.
Answer 5: For attribute a1: error rate = 2/9. For attribute a2: error rate = 4/9. Therefore, according to error rate, a1 produces the best split.
Answer 6:

II. Consider the following data set for a binary classification problem.
1. Compute the information gain of a and b. Which attribute would the decision tree induction algorithm choose?
2. Compute the Gini index of a and b. Which attribute would decision tree induction choose? (This answer is fine.)
3. Figure 4-13 shows that entropy and the Gini index are both monotonically increasing on [0, 0.5] and monotonically decreasing on [0.5, 1]. Is it possible for information gain and the gain in the Gini index to favor different attributes? Explain your reasoning.
Yes, even though these measures have a similar range and monotonic behavior, their respective gains, which are scaled differences of the measures, do not necessarily behave in the same way, as illustrated by the results in parts (a) and (b).

Bayesian classification
1. P(A = 1|−) = 2/5 = 0.4, P(B = 1|−) = 2/5 = 0.4, P(C = 1|−) = 1, P(A = 0|−) = 3/5 = 0.6, P(B = 0|−) = 3/5 = 0.6, P(C = 0|−) = 0; P(A = 1|+) = 3/5 = 0.6, P(B = 1|+) = 1/5 = 0.2, P(C = 1|+) = 2/5 = 0.4, P(A = 0|+) = 2/5 = 0.4, P(B = 0|+) = 4/5 = 0.8, P(C = 0|+) = 3/5 = 0.6.
2.
3. P(A = 0|+) = (2 + 2)/(5 + 4) = 4/9, P(A = 0|−) = (3 + 2)/(5 + 4) = 5/9, P(B = 1|+) = (1 + 2)/(5 + 4) = 3/9, P(B = 1|−) = (2 + 2)/(5 + 4) = 4/9, P(C = 0|+) = (3 + 2)/(5 + 4) = 5/9, P(C = 0|−) = (0 + 2)/(5 + 4) = 2/9.
4. Let P(A = 0, B = 1, C = 0) = K
5. When one of the conditional probabilities is zero, estimating the conditional probabilities with the m-estimate approach is better, because we do not want the entire expression to become zero.

1. P(A = 1|+) = 0.6, P(B = 1|+) = 0.4, P(C = 1|+) = 0.8, P(A = 1|−) = 0.4, P(B = 1|−) = 0.4, and P(C = 1|−) = 0.2.
2. Let R : (A = 1, B = 1, C = 1) be the test record. To determine its class, we need to compute P(+|R) and P(−|R). Using Bayes' theorem, P(+|R) = P(R|+)P(+)/P(R) and P(−|R) = P(R|−)P(−)/P(R). Since P(+) = P(−) = 0.5 and P(R) is constant, R can be classified by comparing P(R|+) and P(R|−). For this question,
P(R|+) = P(A = 1|+) × P(B = 1|+) × P(C = 1|+) = 0.192
P(R|−) = P(A = 1|−) × P(B = 1|−) × P(C = 1|−) = 0.032
Since P(R|+) is larger, the record is assigned to the (+) class.
3. P(A = 1) = 0.5, P(B = 1) = 0.4, and P(A = 1, B = 1) = P(A = 1) × P(B = 1) = 0.2. Therefore, A and B are independent.
4. P(A = 1) = 0.5, P(B = 0) = 0.6, and P(A = 1, B = 0) = P(A = 1) × P(B = 0) = 0.3. A and B are still independent.
5. Compare P(A = 1, B = 1|+) = 0.2 against P(A = 1|+) = 0.6 and P(B = 1|+) = 0.4. Since the product of P(A = 1|+) and P(B = 1|+), 0.6 × 0.4 = 0.24, is not the same as P(A = 1, B = 1|+), A and B are not conditionally independent given the class.

III. Use the similarity matrix in the table below to perform single-link and complete-link hierarchical clustering. Draw dendrograms to show the results; the dendrograms should clearly show the order in which the clusters are merged.

There are no apparent relationships between s1, s2, c1, and c2.
A2: Percentage of frequent itemsets = 16/32 = 50.0% (including the null set).
A4:False ala
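The entropy figure in Answer 1 of Part I can be checked numerically. The following is a minimal sketch, assuming only the class counts given there (4 positive, 5 negative); the function name `entropy` is an illustrative choice, not something from the original exercise.

```python
import math

def entropy(class_counts):
    """Shannon entropy (base 2) of a class distribution given as raw counts."""
    total = sum(class_counts)
    # Empty classes are skipped: the term 0 * log2(0) is taken to be 0.
    return -sum((c / total) * math.log2(c / total)
                for c in class_counts if c > 0)

# 4 positive and 5 negative training examples, as stated in Answer 1.
training_set_entropy = entropy([4, 5])
print(round(training_set_entropy, 4))  # 0.9911
```

The same function could be applied to the child nodes of a candidate split to obtain an information gain, but the per-attribute counts are not reproduced in this text, so that step is left out here.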
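The smoothed probabilities in item 3 of the Bayesian classification answers follow the m-estimate form (n_c + m·p)/(n + m). The sketch below assumes m = 4 and p = 1/2, values inferred from the arithmetic shown there (each numerator adds 2, each denominator adds 4) rather than stated in the text; `m_estimate` is a hypothetical helper name.

```python
from fractions import Fraction

def m_estimate(n_c, n, m, p):
    """m-estimate of a conditional probability: (n_c + m*p) / (n + m).

    n_c: number of class examples with the attribute value
    n:   number of examples in the class
    m:   equivalent sample size (assumed to be 4 here)
    p:   prior estimate of the probability (assumed to be 1/2 here)
    """
    return (n_c + m * p) / Fraction(n + m)

M, P = 4, Fraction(1, 2)  # assumptions inferred from the worked numbers

print(m_estimate(2, 5, M, P))  # P(A = 0|+) -> 4/9
print(m_estimate(0, 5, M, P))  # P(C = 0|-) -> 2/9
```

Note that the raw estimate of P(C = 0|−) is 0, while the m-estimate gives 2/9, which is why item 5 prefers this approach when a conditional probability would otherwise be zero.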
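The classification of the test record R = (A = 1, B = 1, C = 1) in the second Bayesian exercise can be reproduced as follows. This is a rough sketch that hard-codes the conditional probabilities and equal priors quoted in that answer; the dictionary layout and the names `likelihood` and `classify` are illustrative assumptions.

```python
import math

# P(attribute = 1 | class), as quoted in the answer.
COND_ONE = {
    "+": {"A": 0.6, "B": 0.4, "C": 0.8},
    "-": {"A": 0.4, "B": 0.4, "C": 0.2},
}
PRIOR = {"+": 0.5, "-": 0.5}

def likelihood(record, label):
    """P(record | label) under the naive Bayes independence assumption."""
    probs = COND_ONE[label]
    # For a binary attribute, P(attr = 0 | class) = 1 - P(attr = 1 | class).
    return math.prod(probs[a] if v == 1 else 1.0 - probs[a]
                     for a, v in record.items())

def classify(record):
    """Return the class with the larger P(record | class) * P(class)."""
    return max(PRIOR, key=lambda label: likelihood(record, label) * PRIOR[label])

R = {"A": 1, "B": 1, "C": 1}
print(round(likelihood(R, "+"), 3))  # 0.192
print(round(likelihood(R, "-"), 3))  # 0.032
print(classify(R))                   # +
```

Because the priors are equal and P(R) is the same for both classes, comparing the two likelihoods is enough to pick the class, which matches the reasoning in the answer.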