Foreign literature translation: English original

Investigating the Querying and Browsing Behavior of Advanced Search Engine Users

White, Ryen; Morris, Dan

ABSTRACT

One way to help all users of commercial Web search engines be more successful in their searches is to better understand what those users with greater search expertise are doing, and use this knowledge to benefit everyone. In this paper we study the interaction logs of advanced search engine users (and those not so advanced) to better understand how these user groups search. The results show that there are marked differences in the queries, result clicks, post-query browsing, and search success of users we classify as advanced (based on their use of query operators), relative to those classified as non-advanced. Our findings have implications for how advanced users should be supported during their searches, and how their interactions could be used to help searchers of all experience levels find more relevant information and learn improved searching strategies.

Keywords: Query syntax, advanced search features, expert searching.

INTRODUCTION

The formulation of query statements that both capture the salient aspects of information needs and are meaningful to Information Retrieval (IR) systems poses a challenge for many searchers [3]. Commercial Web search engines such as Google, Yahoo!, and Windows Live Search offer users the ability to improve the quality of their queries using query operators such as quotation marks, plus and minus signs, and modifiers that restrict the search to a particular site or type of file. These techniques can be useful in improving result precision yet, other than via log analyses (e.g., [15][27]), they have generally been overlooked by the research community in attempts to improve the quality of search results.

IR research has generally focused on alternative ways for users to specify their needs rather than increasing the uptake of advanced syntax. Research on practical techniques to supplement existing search technology and support users has been intensifying in recent years (e.g., [18][34]). However, it is challenging to implement such techniques at large scale with tolerable latencies.

Typical queries submitted to Web search engines take the form of a series of tokens separated by spaces. There is generally an implied Boolean AND operator between tokens that restricts search results to documents containing all query terms. De Lima and Pedersen [7] investigated the effect of parsing, phrase recognition, and expansion on Web search queries. They showed that the automatic recognition of phrases in queries can improve result precision in Web search. However, the value of advanced syntax for typical searchers has generally been limited, since most users do not know about advanced syntax or do not understand how to use it [15]. Since it appears operators can help retrieve relevant documents, further investigation of their use is warranted.

In this paper we explore the use of query operators in more detail and propose alternative applications that do not require all users to use advanced syntax explicitly. We hypothesize that searchers who use advanced query syntax demonstrate a degree of search expertise that the majority of the user population does not; an assertion supported by previous research [13]. Studying the behavior of these advanced search engine users may yield important insights about searching and result browsing from which others may benefit.

Through an experimental study and analysis, we offer potential answers for each of these questions. A relationship between the use of advanced syntax and any of these features could support the design of systems tailored to advanced search engine users, or the use of advanced users' interactions to help non-advanced users be more successful in their searches.

RELATED WORK

Factors such as lack of domain knowledge, poor understanding of the document collection being searched, and a poorly developed information need can all influence the quality of the queries that users submit to IR systems ([24], [28]). There has been a variety of research into different ways of helping users specify their information needs more effectively. Belkin et al. [4] experimented with providing additional space for users to type a more verbose description of their information needs. A similar approach was attempted by Kelly et al. [18], who used clarification forms to elicit additional information about the search context from users. These approaches have been shown to be effective in best-match retrieval systems, where longer queries generally lead to more relevant search results [4]. However, in Web search, where many of the systems are based on an extended Boolean retrieval model, longer queries may actually hurt retrieval performance, leading to a small number of potentially irrelevant results being retrieved. It is not sufficient simply to request more information from users; this information must be of better quality.

Relevance Feedback (RF) and interactive query expansion are popular techniques that have been used to improve the quality of information that users provide to IR systems regarding their information needs. In the case of RF, the user presents the system with examples of relevant information that are then used to formulate an improved query or retrieve a new set of documents. It has proven difficult to get users to use RF in the Web domain because of the difficulty of conveying the meaning and the benefit of RF to typical users. Query suggestions offered based on query logs have the potential to improve retrieval performance with limited user burden. This approach is limited to re-executing popular queries, and searchers often ignore the suggestions presented to them. In addition, neither of these techniques helps users learn to produce more effective queries.

Log-based analysis of users' interactions with the Excite and AltaVista search engines has shown that only 10-20% of queries contained any advanced syntax. This analysis can be a useful way of capturing characteristics of users interacting with IR systems. Research in user modeling and personalization has shown that gathering more information about users can improve the effectiveness of searches, but doing so requires more information about users than is typically available from interaction logs alone. Unless coupled with a qualitative technique, such as a post-session questionnaire [23], it can be difficult to associate interactions with user characteristics.

In our study we conjecture that, given the difficulty in locating advanced search features within the typical search interface and the potential problems in understanding the syntax, those users that do use advanced syntax regularly represent a distinct class of searchers who will exhibit other common search behaviors. In this paper we study other search characteristics of users of advanced syntax in an attempt to determine whether there is anything different about how these search engine users search, and whether their searches can be used to benefit those who do not make use of the advanced features of search engines. To do this we use interaction logs gathered from a large set of consenting users over a prolonged period.

In the next section we describe the data we use to study the behavior of the users who use advanced syntax, relative to those that do not use this syntax.

DATA

To perform this study we required a description of the querying and browsing behavior of many searchers, preferably over a period of time to allow patterns in user behavior to be analyzed. To obtain these data we mined the interaction logs of consenting Web users over a period of 13 weeks, from January to April 2006. When downloading a partner client-side application, the users were invited to consent to their interaction with Web pages being anonymously recorded (with a unique identifier assigned to each user) and used to improve the performance of future systems.

The information contained in these log entries included a unique identifier for the user, a timestamp for each page view, a unique browser window identifier (to resolve ambiguities in determining in which browser window a page was viewed), and the URL of the Web page visited. This provided us with sufficient data on querying behavior (from interaction with search engines) and browsing behavior (from interaction with the pages that follow a search) to more broadly investigate search behavior.

In addition to the data gathered during the course of this study, we also had relevance judgments of documents that users examined for 10,680 unique query statements present in the interaction logs. These judgments were assigned on a six-point scale by trained human judges at the time the data were collected. We use these judgments in this analysis to assess the relevance of sites users visited on their browse trail away from search result pages.

The privacy of our volunteers was maintained throughout the entire course of the study: no personal information was elicited about them, participants were assigned a unique anonymous identifier that could not be traced back to them, and we made no attempt to identify a particular user or study individual behavior in any way. All findings were aggregated over multiple users, and no information other than consent for logging was elicited.

DISCUSSION AND IMPLICATIONS

Our findings indicate significant differences in the querying, result-click, post-query navigation, and search success of those that use advanced syntax versus those that do not. Many of these findings mirror those already found in previous studies with groups of self-identified novices and experts.

There are several ways in which a commercial search engine system might benefit from a quantitative indication of searcher expertise. This might be yet another feature available to a ranking engine; i.e., it may be the case that expert searchers in some cases prefer different pages than novice searchers. The user interface to a search engine might be tailored to a user's expertise level; perhaps even more advanced features such as term weighting and query expansion suggestions could be presented to more experienced searchers while preserving the simplicity of the basic interface for novices. Result presentation might also be customized based on search skill level; future work might re-evaluate the benefits of content snippets, thumbnails, etc. in a manner that allows different outcomes for different expertise levels. Additionally, if browsing histories are available, the destinations of advanced searchers could be used as suggested results for queries, bypassing and potentially improving upon the traditional search process.

The use of the interaction of advanced search engine users to guide others with less expertise is an attractive proposition for the designers of search systems. In part, these searchers may have more post-query browsing expertise that allows them to overcome the shortcomings of search systems. Their interactions can be used to point users to places that advanced search engine users visit, or simply to train less experienced searchers how to search more effectively. However, if expert users are going to be used in this way, issues of data sparsity will need to be overcome. Our advanced users only accounted for 20.1% of the users whose interactions we studied. Whilst these may be amongst the most active users, it is unlikely that they will view documents that cover a large number of subject areas. However, rather than focusing on where they go (which is perhaps more appropriate for those with domain knowledge), advanced search engine users may use moves, tactics and strategies [2] that inexperienced users can learn from. Encouraging users to use advanced syntax helps them learn how to formulate better search queries; leveraging the searching style of expert searchers could help them learn more successful post-query interactions.

One potential limitation to the results we report is that in prior research, it has been shown that query operators do not significantly improve the effectiveness of Web search results [8], and that searchers may be able to perform just as well without them [27]. It could therefore be argued that the users who do not use query operators are in fact more advanced, since they do not waste time using potentially redundant syntax in their query statements. However, this seems unlikely given that those who use advanced syntax exhibited search behaviors typical of users with expertise [13], and are more successful in their searching. Nevertheless, in future work we will expand our definition of "advanced user" beyond attributes of the query to also include other interaction behaviors, some of which we have defined in this study, and other avenues of research such as eye-tracking [12].
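
The paper classifies users as advanced based on their use of query operators (quotation marks, plus and minus signs, and modifiers that restrict the search to a site or file type). A minimal Python sketch of such a classification is shown here for illustration only. The regular expressions, the function names `operators_in` and `classify_users`, the `(user_id, query)` log shape, and the `min_fraction` threshold are all assumptions made for this sketch and are not taken from the paper.

```python
import re
from collections import defaultdict

# Operators named in the paper: quotation marks, plus and minus signs,
# and modifiers restricting the search to a site or file type.
# The exact patterns are illustrative assumptions, not the authors' rules.
OPERATOR_PATTERNS = {
    "phrase": re.compile(r'"[^"]+"'),
    "plus": re.compile(r'(?:^|\s)\+\w'),
    "minus": re.compile(r'(?:^|\s)-\w'),
    "site": re.compile(r'(?:^|\s)site:\S+', re.IGNORECASE),
    "filetype": re.compile(r'(?:^|\s)filetype:\S+', re.IGNORECASE),
}


def operators_in(query):
    """Return the set of operator names detected in a query string."""
    return {name for name, pat in OPERATOR_PATTERNS.items() if pat.search(query)}


def classify_users(query_log, min_fraction=0.0):
    """Label each user 'advanced' or 'non-advanced'.

    query_log: iterable of (user_id, query) pairs (hypothetical log shape).
    A user is 'advanced' when the fraction of their queries containing at
    least one operator exceeds min_fraction (hypothetical threshold).
    """
    totals = defaultdict(int)
    with_ops = defaultdict(int)
    for user_id, query in query_log:
        totals[user_id] += 1
        if operators_in(query):
            with_ops[user_id] += 1
    return {
        user: "advanced" if with_ops[user] / totals[user] > min_fraction
        else "non-advanced"
        for user in totals
    }
```

With the default threshold, a user who issues at least one query containing an operator is labeled advanced. Hyphenated words such as "well-known" are not treated as a minus operator, because the pattern requires the sign to start a token.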