Grey Forecast Model for Accurate Recommendation in the Presence of Data Sparsity and Correlation

Uploaded by: 5***; IP location: Hubei; upload date: 2021-11-22; format: DOC; pages: 7


Document summary

Existing CF models: clustering CF models, Bayesian belief net (BN) CF models, Markov decision process based (MDP-based) CF models, and latent semantic CF models. Dimensionality reduction (SVD) is used to handle data sparsity, but it can discard key data. This paper instead uses the simplest method, the Cosine Distance measurement method. Principle: we do not directly use the exact value of the similarities, but rather rank the items according to their similarities.

Fields of application: finance [23], the integrated circuit industry [24], the market for air travel [25], and underground pressure for working surface [26].

Experimental data sets: MovieLens and EachMovie.

Paper structure: Section 2 describes traditional CF methods, item-based CF (ICF) methods, the existing problems, and the authors' contributions. Section 3 describes the proposed GF-model-based algorithm in detail. Section 4 describes the experimental study, including the data sets, evaluation metrics, methods, and analysis of the experiments, followed by the conclusions and future work.

2. The main work falls into two major parts: (1) similarity measurement and (2) rating prediction.

Similarity measurement:
1. In ICF methods, the similarity s(ix, iy) between the items ix and iy is determined by the users who have rated both items.
2. The most popular methods are Cosine Distance and Pearson correlation. Notation: let I be the set of all items rated by both the users ux and uy, and let U be the set of all users who have rated the items ix and iy.

Example: the item set I is {Bread, Milk}, and d is equal to the size of set I, so in this case d = 2. Cake (ix) and Milk (iy) are rated by both Alice and Lucy (user set U).

2.1 Similarity computation methods

2.1.1 Cosine Distance

Cosine Distance measures the similarity between two vectors. For UCF, the similarity between two users with the Cosine Distance method can be calculated as follows:

$$ s(u_x, u_y) = \frac{\sum_{i \in I} r_{u_x,i}\, r_{u_y,i}}{\sqrt{\sum_{i \in I} r_{u_x,i}^2}\,\sqrt{\sum_{i \in I} r_{u_y,i}^2}} $$

where r_{u_x,i} and r_{u_y,i} are the ratings of users ux and uy on item i. For ICF, the similarity between two items is:

$$ s(i_x, i_y) = \frac{\sum_{u \in U} r_{u,i_x}\, r_{u,i_y}}{\sqrt{\sum_{u \in U} r_{u,i_x}^2}\,\sqrt{\sum_{u \in U} r_{u,i_y}^2}} $$

where r_{u,i_x} and r_{u,i_y} are the ratings of user u on items ix and iy.

2.1.2 Pearson correlation coefficient

The similarity computation can remove rating bias by subtracting average ratings. The Pearson correlation coefficient improves the accuracy of the similarity computation to some extent. The similarity between two users is computed as follows:

$$ s(u_x, u_y) = \frac{\sum_{i \in I} (r_{u_x,i} - \bar{r}_{u_x})(r_{u_y,i} - \bar{r}_{u_y})}{\sqrt{\sum_{i \in I} (r_{u_x,i} - \bar{r}_{u_x})^2}\,\sqrt{\sum_{i \in I} (r_{u_y,i} - \bar{r}_{u_y})^2}} $$

where \bar{r}_{u} is the average of user u's ratings over all movies. For items, the computation is analogous, taken over the users in U.

2.2 Rating prediction

Idea: The k Nearest Neighbors (KNN) method [37] is usually used for prediction by weighting the sum of the ratings that similar users give to the target item or the ratings of the active user on similar items, depending on whether UCF or ICF is used.

2.2.1 User-based

Idea: UCF is based on the basic assumption that people who share similar past preferences will be interested in similar items.

Algorithm steps: first, the similarities between the users are computed using the similarity measurement methods introduced in Section 2.1; then, the prediction for the active user is determined by taking the weighted average of all the ratings of the similar users for a certain item [37] according to the formula in Eq. (5); finally, the items with the highest predicted ratings will be recommended to the user.

$$ p_{u_x,i} = \frac{\sum_{u_y \in U(u_x)} s(u_x, u_y)\, r_{u_y,i}}{\sum_{u_y \in U(u_x)} s(u_x, u_y)} $$

where U(ux) denotes the set of users similar to the user ux, and p_{u_x,i} is the prediction for the user ux on item i.

2.2.2 Item-based

Idea: the algorithm recommends items to users that are similar to the items that they have already consumed. Similarly, after calculating the similarities between the items, the prediction is:

$$ p_{u,i_x} = \frac{\sum_{i_y \in I(i_x)} s(i_x, i_y)\, r_{u,i_y}}{\sum_{i_y \in I(i_x)} s(i_x, i_y)} $$

where I(ix) denotes the set of similar items of item ix. Further, p_{u,i_x} denotes the prediction of user u on item ix.

2.3 Problem analysis

The analysis starts from data sparsity and data correlation. The similarity values themselves are not used; only their ordering is. Key point of the paper: we only rank the items according to the similarity. For both user similarity and item similarity, the method relies on the ranking derived from the computed similarities rather than on the computed values, since the similarity values themselves contain error.

Then, to generate the prediction of the active user u on item i, the k items most similar to item i that have been rated by the active user are selected. Finally, we use these items as the input to build a GF model and predict the rating of the active user u on item i. If the user u has rated fewer than k items, a fixed value will be used to complete the k ratings. Empirically, the fixed value can be the median value of the rating scale. For example, when the rating scale is 1 to 5, the number 3 is selected as the fixed value.

The proposed method provides the following three main contributions (advantages):
1. Overcoming data sparsity
2. Benefiting from data correlation
3. Obtaining accurate predictions

3. Proposed algorithm

Idea: use the ratings of similar users for a target item, or the ratings of the active user for similar items, to generate the prediction.
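The traditional item-based pipeline from Sections 2.1 and 2.2 (Cosine Distance over co-rating users, then a similarity-weighted average over the k nearest items) can be sketched as follows. This is a minimal illustration, not the paper's code: the function names, the nested-dictionary layout, and the toy ratings (which borrow the names Alice, Lucy, Cake, Milk, and Bread from the example above but use made-up scores, plus a made-up user Bob) are all assumptions.

```python
import math

# Hypothetical toy data: ratings[user][item] = score on a 1-5 scale.
ratings = {
    "Alice": {"Cake": 4, "Milk": 5, "Bread": 3},
    "Lucy": {"Cake": 2, "Milk": 3},
    "Bob": {"Milk": 4, "Bread": 5},
}


def cosine_item_similarity(ratings, ix, iy):
    """Cosine Distance similarity of items ix and iy over the users who rated both."""
    common = [u for u, r in ratings.items() if ix in r and iy in r]
    if not common:
        return 0.0  # no co-rating users: treat the items as unrelated
    dot = sum(ratings[u][ix] * ratings[u][iy] for u in common)
    norm_x = math.sqrt(sum(ratings[u][ix] ** 2 for u in common))
    norm_y = math.sqrt(sum(ratings[u][iy] ** 2 for u in common))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


def predict_icf(ratings, user, target, k):
    """Weighted average of the user's ratings on the k rated items most similar to target."""
    scored = [
        (cosine_item_similarity(ratings, target, item), score)
        for item, score in ratings[user].items()
        if item != target
    ]
    top_k = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    total_weight = sum(sim for sim, _ in top_k)
    if total_weight == 0:
        return None  # nothing similar enough to base a prediction on
    return sum(sim * score for sim, score in top_k) / total_weight


if __name__ == "__main__":
    print(cosine_item_similarity(ratings, "Cake", "Milk"))
    print(predict_icf(ratings, "Lucy", "Bread", k=2))
```

Note that this baseline multiplies ratings by the similarity values themselves, which is exactly what the proposed method avoids: it keeps only the ranking induced by the similarities.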

In this paper, the GF model is used for rating prediction. It involves two steps: rating preprocessing and rating prediction.

3.1 Rating preprocessing

Item-to-item similarity is used to generate the rating prediction. Algorithm steps: First, for simplicity, the Cosine Distance method is utilized to compute the similarity between two items. Then, an m × m similarity matrix is generated, where m is the number of items. If we want to predict the unrated entry of the user u on item i in the rating matrix, the k items most similar to item i that have been rated by the user u are selected. Note that when the user u has rated fewer than k items, the fixed value, treated as having the lowest similarity, is used to complete the k ratings. Finally, the k ratings are sorted in increasing order of their similarity to item i to produce a rating sequence. In the next step, the proposed algorithm inputs the rating sequence to the GF model and forecasts the rating that the user u will give to item i.

In other words, once the item similarities are computed, the items are ranked by similarity (descending). When the user has rated fewer than K nearest-neighbor items, the missing entries are filled with the lowest score. For example, a sequence that is (4, 3, 5) becomes (3, 3, 5, 4, 4, 3, 5) when K = 7.

Worked example: using cosine similarity (the similarity matrix), the items similar to i1 are, in descending order of similarity, 10, 6, 2, 8, 4, 9, 3, 7, 5. User u3 has rated only items 3, 4, 5, 7, and 9. In ascending order of similarity to i1 these are (5, 7, 3, 9, 4), with ratings (5, 4, 4, 3, 5). If k = 3, the selected items are 3, 9, 4, because they are the most similar items that u3 has rated, which gives the rating sequence (4, 3, 5). If K = 7, only five items have been rated, so the remaining two positions are filled with the lowest score, 3, which gives (3, 3, 5, 4, 4, 3, 5). If identical values occur, they are ordered randomly.

1. Ordering the ratings by similarity lets the item-to-item similarity ranking contribute effectively to the rating prediction.
2. Only the K most similar items are selected, so the prediction is more accurate.

3.2 Rating prediction

Why use the grey forecast model: grey system theory mainly focuses on model uncertainty and information insufficiency when analyzing and understanding systems via research on conditional analysis, prediction, and decision making. A recommender system can be considered as a grey system; further, with our algorithm, the GF model is used to yield the rating prediction. The GF model utilizes accumulated generation operations (AGO) to build differential equations, which benefit from the data correlations. Meanwhile, it has another significant characteristic: it requires less data, so it can overcome the data sparsity problem. The rating sequence generated in the rating preprocessing stage is the only input required for model construction and subsequent forecasting.

Step 1: Let the original rating sequence be

$$ x^{(0)} = \left( x^{(0)}(1),\, x^{(0)}(2),\, \ldots,\, x^{(0)}(k) \right) $$

where k is the number of nearest-neighbor items.

Step 2: x^{(1)} is generated from x^{(0)} by the following accumulation:

$$ x^{(1)}(j) = \sum_{i=1}^{j} x^{(0)}(i), \quad j = 1, 2, \ldots, k $$

This step is the most important one. For example, a user's original rating sequence obviously does not have a clear regularity. If AGO is applied to this sequence, the result has a clear growing tendency.

Step 3: The grey derivative and the background grey value are approximated by linear regression on a smooth discrete function. A grey differential model GM(1,1) is defined as follows:

$$ x^{(0)}(j) + a\, z^{(1)}(j) = b, \qquad z^{(1)}(j) = \tfrac{1}{2}\left( x^{(1)}(j) + x^{(1)}(j-1) \right) $$

where a and b are coefficients: a is the grey development coefficient, and b is …
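The preprocessing and GM(1,1) steps above can be sketched end to end as follows. Several parts are assumptions that the excerpt does not show: the function names, the closed-form least-squares fit used to estimate a and b, and the forecasting formula, which is the standard GM(1,1) time response x̂(1)(t) = (x(0)(1) − b/a)·e^(−a(t−1)) + b/a followed by differencing. In the wider grey-model literature b is usually called the grey action quantity. The item ranking and the ratings of user u3 are taken from the worked example in Section 3.1.

```python
import math


def build_rating_sequence(ranked_items, user_ratings, k, fill_value):
    """Return k ratings ordered by ascending similarity to the target item.

    ranked_items lists item ids by DESCENDING similarity to the target;
    missing entries are padded with fill_value at the low-similarity end.
    """
    rated = [user_ratings[j] for j in ranked_items if j in user_ratings][:k]
    rated.reverse()  # most similar rated item goes last
    return [fill_value] * (k - len(rated)) + rated


def ago(sequence):
    """Accumulated generation operation: running sums of the sequence."""
    total, out = 0.0, []
    for value in sequence:
        total += value
        out.append(total)
    return out


def gm11_forecast(x0):
    """Fit GM(1,1) to x0 by least squares and forecast the next value."""
    n = len(x0)
    x1 = ago(x0)
    z = [0.5 * (x1[j] + x1[j - 1]) for j in range(1, n)]  # background values
    y = x0[1:]
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zv * yv for zv, yv in zip(z, y))
    det = m * szz - sz * sz
    if det == 0:
        raise ValueError("need at least three ratings to fit GM(1,1)")
    # Regression of y on z for the model y = -a * z + b.
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    if abs(a) < 1e-12:
        return b  # no growth or decay: the forecast is the constant level
    c = x0[0] - b / a
    # Difference of the time response at positions n + 1 and n.
    return c * (math.exp(-a * n) - math.exp(-a * (n - 1)))


if __name__ == "__main__":
    ranked = [10, 6, 2, 8, 4, 9, 3, 7, 5]   # descending similarity to i1
    u3 = {5: 5, 7: 4, 3: 4, 9: 3, 4: 5}     # ratings of user u3
    seq = build_rating_sequence(ranked, u3, k=7, fill_value=3)
    print(seq)
    print(ago(seq))
    print(gm11_forecast(seq))
```

The forecast is not clipped to the rating scale here; whether and how to bound it is left open, since the excerpt does not say.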
