WikiLens Data Set

Summary: WikiLens was a generalized collaborative recommender system that allowed its community to define item types (e.g. beer) and categories (e.g. microbrews, pale ales, stouts), and then rate and get recommendations for items. It was taken offline in 2009 due to lack of system maintenance and support. This data set was extracted in February 2008.

Keywords: WikiLens, recommender system, item types, categories
Data format: TEXT
Data use: Information Processing, Classification

Detailed description:

Wikilens Data Sets

This directory contains a dump.txt.gz for this WikiLens instance. This file is the gzip-ed output of a mysqldump command with the 'latin1' charset, after suitable erasing of private data. The intent is for this dump to have all the data you could get by spidering the site.

The easiest way to see the data is to install MySQL, create a database, and load this dump file, e.g.

    zcat dump.txt.gz | mysql -uuser -ppassword -Dmy_database

Otherwise, the text dump is human-readable, so it is possible to write tools to parse it. I wouldn't recommend it.

The dump has the following tables:

category - Map items to categories
chefmoz  - Cache of Chefmoz import data for the Restaurant category. This is EMPTY because otherwise it is very large.
link     - Cache of wiki page links
logging  - Log of actions taken on the wiki. This is EMPTY for privacy.
member   - EMPTY.
nonempty - Cache of page ids of pages that have some content.
page     - Page data.
page_urn - Mapping of pages to URNs for ratings.
pref     - User preferences. This is EMPTY for privacy.
rating   - Ratings of URNs (often pages, mapped through page_urn). NOTE: rateepage is a URN id, not a page id.
recent   - Cache of page ids of pages recently changed.
session  - Cache of user sessions for the wiki. This is EMPTY for privacy.
urn      - URN (Uniform Resource Name) ids.
user     - EMPTY.
version  - Page data for every version of a page.

These tables are mostly the same as PhpWiki 1.3.9. The new tables are category, page_urn, rating, and logging.

* WARNING *

The easiest mistake to make while looking at the data is to join the rateepage field of the rating table with the id field of the page table. rateepage is a page id, right? NOT SO. The rating.rateepage field is actually the id of a URN, NOT a page. That field name has not been changed to something reflecting URN simply due to lack of time to do it correctly (including database migration upgrades). Look carefully at the example queries below to see how to use the various fields.

* WARNING *

Example queries

The words "item" and "ratee" (the object of a rating action) are used synonymously below. Similarly for "user" and "rater".

Here are some example queries to get data:

1. Select all ratings. Columns are
   - Ratee (item) page id
   - Ratee (item) page name (truncated to 25 characters)
   - Rater page id
   - Rater page name
   - URN id
   - Rating value
   - Rating timestamp

    select p.id, left(p.pagename, 25), r.raterpage,
           rp.pagename as rater, r.rateepage, r.ratingvalue as rat, r.tstamp
    from page p, page rp, rating r, page_urn pu, urn u
    where pu.pagename = p.pagename and pu.urn = u.urn and r.raterpage = rp.id
          and r.rateepage = u.id
    order by p.pagename

2. Select all ratings of an item called "Book_Foo"

    select p.id, left(p.pagename, 25), r.raterpage,
           rp.pagename as rater, r.rateepage, r.ratingvalue as rat
    from page p, page rp, rating r, page_urn pu, urn u
    where pu.pagename = p.pagename and pu.urn = u.urn and r.raterpage = rp.id
          and r.rateepage = u.id and p.pagename like 'Book_Foo'
    order by p.pagename

3. Select all ratings of a user called "User_Bar"

    select p.id, left(p.pagename, 25), r.raterpage,
           rp.pagename as rater, r.rateepage, r.ratingvalue as rat
    from page p, page rp, rating r, page_urn pu, urn u
    where pu.pagename = p.pagename and pu.urn = u.urn and r.raterpage = rp.id
          and r.rateepage = u.id and rp.pagename like 'User_Bar'
    order by rp.pagename

4. Select number of things in the "Book" category:

    select count(*) cnt
    from category c, page p
    where c.category = p.id and p.pagename = 'Book'

5. The number of items in any category

    select count(*) from category

6. The number of users

    select count(*) cnt
    from category c, page p
    where c.category = p.id and p.pagename = 'User'

7. The number of ratings

    select count(*) from rating

8. The number of ratings per month

    select left(tstamp, 6) as yearmonth, count(*)
    from rating
    group by yearmonth
    order by yearmonth asc

9. Pages per category for every category

    select pagename, count(*) cnt
    from category c, page p
    where c.category = p.id
    group by category
    order by cnt desc

10. Ratings per category for every category

    select left(cp.pagename, 30) cat, cou
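
The warning about rating.rateepage can be checked on a toy database. The sketch below is only an illustration, not part of the original dump tooling: it uses Python's sqlite3 module as a stand-in for MySQL, and the column types, the sample rows, the URN strings, and the function names (build_toy_db, correct_ratings, naive_ratings) are all assumptions made up for the example. Only the table and column names come from the example queries. The URN ids are deliberately chosen to differ from the page ids, so the mistaken join silently returns the wrong item.

```python
import sqlite3


def build_toy_db():
    """Build an in-memory miniature of the dump's tables (hypothetical rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (id INTEGER PRIMARY KEY, pagename TEXT);
        CREATE TABLE urn (id INTEGER PRIMARY KEY, urn TEXT);
        CREATE TABLE page_urn (pagename TEXT, urn TEXT);
        CREATE TABLE rating (raterpage INTEGER, rateepage INTEGER,
                             ratingvalue REAL, tstamp TEXT);

        INSERT INTO page VALUES (1, 'User_Bar'), (2, 'Book_Foo'), (3, 'Book_Baz');

        -- URN id 3 belongs to Book_Foo, whose page id is 2.
        INSERT INTO urn VALUES (1, 'urn:example:baz'), (3, 'urn:example:foo');
        INSERT INTO page_urn VALUES ('Book_Foo', 'urn:example:foo'),
                                    ('Book_Baz', 'urn:example:baz');

        -- User_Bar (page id 1) rates URN id 3, i.e. Book_Foo.
        INSERT INTO rating VALUES (1, 3, 4.5, '20080115120000');
        """
    )
    return conn


def correct_ratings(conn):
    """Resolve rateepage through urn and page_urn, as in the example queries."""
    return conn.execute(
        """
        SELECT p.id, p.pagename, rp.pagename AS rater, r.rateepage, r.ratingvalue
        FROM page p, page rp, rating r, page_urn pu, urn u
        WHERE pu.pagename = p.pagename AND pu.urn = u.urn
          AND r.raterpage = rp.id AND r.rateepage = u.id
        ORDER BY p.pagename
        """
    ).fetchall()


def naive_ratings(conn):
    """The mistaken join: treats rateepage as if it were a page id."""
    return conn.execute(
        "SELECT p.pagename FROM rating r, page p WHERE r.rateepage = p.id"
    ).fetchall()


if __name__ == "__main__":
    conn = build_toy_db()
    print("correct:", correct_ratings(conn))
    print("naive:  ", naive_ratings(conn))
```

In this toy data the correct join attributes the rating to Book_Foo, while the naive join attributes it to Book_Baz, which nobody rated. Against the real dump the same pattern applies, but the queries would be run in MySQL after loading dump.txt.gz.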