
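[…] the main research bottleneck in this field at present. In this paper, pedestrian detection at different resolutions […] the model can be trained automatically even when no vehicle annotations are available during training. On the Caltech Pedestrian Benchmark, our method […] for pedestrians taller than 30 pixels […]

[…] after being trained at a fixed resolution, a detector can be extended to all resolutions by resizing the detector and the image […] as the resolution decreases, detection performance drops sharply. For example, the best detector on the Caltech Pedestrian Benchmark […] rises to a 73% miss rate for pedestrians between 30 and 80 pixels tall. We evaluate our method on the Caltech Pedestrian Benchmark, and our results improve on state-of-the-art and other methods. For pedestrians taller than 30 pixels […]

[…] unreliable image features at low resolution mislead the trained detector and make accurate […] difficult. […] the first two strategies are both special cases of this strategy.

Here the resolution is split into two cases (as suggested in [14]): low resolution, for pedestrians between 30 and 80 pixels tall, and high resolution, for pedestrians taller than 80 pixels. Extending the idea to linear detectors based on other image features and to more resolution partitions is straightforward.

Resolution aware detection. To simplify the notation, we introduce a matrix-based representation for DPM. Given an image $I$ and the locations of its $m$ parts $L = (l_0, \dots, l_m)$, the HOG feature of the $i$-th part is denoted $\phi_a(I, l_i)$. If $n_h$ and $n_w$ are the numbers of cells along the height and width of the part, and $n_f$ is the dimension of the gradient histogram vector of one cell, this feature is an $n_h \times n_w \times n_f$ tensor. We reshape it into a matrix $\Phi_a(I, l_i)$ in which every column is the feature vector of one cell. The $\Phi_a(I, l_i)$ are further concatenated into a larger matrix $\Phi_a(I, L) = [\Phi_a(I, l_0), \Phi_a(I, l_1), \dots, \Phi_a(I, l_m)]$, whose number of columns is denoted $n_c$ […] The spatial features of the parts are denoted by the vector $\phi_s(I, L)$, and the spatial parameters by $w_s$. […]

\[ \mathrm{score}(I, L) = \mathrm{Tr}(W_a^T \Phi_a(I, L)) + w_s^T \phi_s(I, L) \tag{1} \]

$\mathrm{Tr}(\cdot)$ is the sum of all elements on the main diagonal of a matrix. Given the root location, the locations of all other parts are variable. The final score is $\max_{L^*} \mathrm{score}(I, L^*)$, where $L^*$ […] with the root location fixed […]

[…] an $n_d \times n_f$ matrix; they map the feature vectors of samples from different resolutions from the original $n_f$ […] subspace. The detection matrix is defined as an $n_d \times n_c$ matrix $W_a$, which has the same size as $P_H \Phi_a(I, L)$.

\[ \mathrm{score}(I, L) = \begin{cases} \mathrm{Tr}(W_a^T P_H \Phi_a(I, L)) + w_s^T \phi_s(I, L), & \text{High Resolution} \\ \mathrm{Tr}(W_a^T P_L \Phi_a(I, L)) + w_s^T \phi_s(I, L), & \text{Low Resolution} \end{cases} \tag{2} \]

[…] brings some […], because $W_a$, $w_s$, $P_H$ and $P_L$ are all unknown. In the following part, we give […] this multi-task […]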
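The matrix form of the score in expressions (1) and (2) can be illustrated with a small sketch. This is only a toy illustration in plain Python, not the authors' implementation: the helper names (`matmul`, `transpose`, `trace`, `mt_dpm_score`) and all numeric values are hypothetical, and the dimensions ($n_f = 3$, $n_d = 2$, $n_c = 4$) are chosen only to keep the example readable.

```python
# Toy sketch of score = Tr(W_a^T P Phi_a) + w_s^T phi_s (expression (2)).
# All names and numbers below are illustrative assumptions.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def trace(a):
    """Sum of the main diagonal of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

def mt_dpm_score(W_a, P, Phi_a, w_s, phi_s):
    """Appearance term in the shared subspace plus the spatial term.

    P is the resolution aware transformation (P_H or P_L).
    """
    mapped = matmul(P, Phi_a)                      # n_d x n_c mapped features
    appearance = trace(matmul(transpose(W_a), mapped))
    spatial = sum(w * f for w, f in zip(w_s, phi_s))
    return appearance + spatial

# n_f x n_c feature matrix: one column per HOG cell.
Phi_a = [[1.0, 0.0, 2.0, 1.0],
         [0.0, 1.0, 1.0, 0.0],
         [2.0, 1.0, 0.0, 1.0]]
P_H = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 1.0]]                            # n_d x n_f
W_a = [[0.5, -1.0, 0.0, 1.0],
       [1.0, 0.0, 0.5, -0.5]]                      # n_d x n_c
w_s = [0.1, -0.2]
phi_s = [1.0, 2.0]

score = mt_dpm_score(W_a, P_H, Phi_a, w_s, phi_s)

# The trace form is the elementwise inner product of W_a and P_H Phi_a.
mapped = matmul(P_H, Phi_a)
inner = sum(w * m for w_row, m_row in zip(W_a, mapped) for w, m in zip(w_row, m_row))
appearance_only = trace(matmul(transpose(W_a), mapped))
```

Writing the appearance term as a trace is what lets the later steps treat $W_a$ and $P_H$ symmetrically: the same scalar can be read as a linear function of either matrix when the other is held fixed.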

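Figure 2. Illustration of the feature mapping for different resolutions.

\[ \arg\min_{W_a, w_s} \tfrac{1}{2}\|W_a\|_F^2 + \tfrac{1}{2} w_s^T w_s + C \sum_n \max\big[0,\ 1 - y_n\big(\mathrm{Tr}(W_a^T \Phi_a(I_n, L_n^*)) + w_s^T \phi_s(L_n^*)\big)\big] \tag{3} \]

[…] the best part configuration, obtained by maximizing the detection score of $I_n$. During training, the part locations are treated as latent variables, so the optimization problem can be treated as a Latent-SVM […] pedestrian and non-pedestrian.

Because the spatial term $w_s$ applies directly to data of different resolutions […]. $f_{I_H}$ and $f_{I_L}$ have the same form, so we take one of them as an example:

\[ f_{I_H}(W_a, w_s, P_H) = \tfrac{1}{2}\|P_H^T W_a\|_F^2 + C \sum_{n=1}^{N_H} \max\big[0,\ 1 - y_n\big(\mathrm{Tr}(W_a^T P_H \Phi_a(I_n^H, L_n^*)) + w_s^T \phi_s(L_n^*)\big)\big] \tag{5} \]

Optimizing $W_a$ and $w_s$. When $P_H$ and $P_L$ are fixed, we can map the image features to a subspace in which the DPM detector can be learned. We denote $P_H P_H^T + P_L P_L^T$ by $A$ and $A^{1/2} W_a$ by $\tilde{W}_a$. For high resolution samples, we denote $A^{-1/2} P_H \Phi_a(I_n, L_n^*)$ by $\tilde{\Phi}_a(I_n, L_n^*)$; for low resolution samples, we denote $A^{-1/2} P_L \Phi_a(I_n, L_n^*)$ by $\tilde{\Phi}_a(I_n, L_n^*)$. Expression (4) can then be rewritten as:

\[ \arg\min_{\tilde{W}_a, w_s} \tfrac{1}{2}\|\tilde{W}_a\|_F^2 + \tfrac{1}{2} w_s^T w_s + C \sum_n \max\big[0,\ 1 - y_n\big(\mathrm{Tr}(\tilde{W}_a^T \tilde{\Phi}_a(I_n, L_n^*)) + w_s^T \phi_s(L_n^*)\big)\big] \tag{6} \]

[…] to obtain $W_a$.

Optimizing $P_H$ and $P_L$. […] can be decomposed into two independent subproblems: $\arg\min_{P_H} f_{I_H}(W_a, w_s, P_H)$ and […]. Given $W_a$ and $w_s$, for each training sample we first […] by maximizing expression (2) […] we denote […] by […], and then expression (4) […]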

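\[ \arg\min_{\tilde{P}_H} \tfrac{1}{2}\|\tilde{P}_H\|_F^2 + C \sum_{n=1}^{N_H} \max\big[0,\ 1 - y_n\big(\mathrm{Tr}(\tilde{P}_H^T \tilde{\Phi}_a(I_n^H, L_n^*)) + w_s^T \phi_s(L_n^*)\big)\big] \tag{7} \]

The only difference between expression (7) and a standard SVM is the additional term $w_s^T \phi_s(L_n^*)$. […]

[…] we compute the HOG features of these image patches and keep the first $n_d$ eigenvectors […] This gives a 31-dimensional HOG feature, i.e. $n_f = 31$. The dimension $n_d$ determines […] after the mapping […] In our experiments, the maximum number of loops of the coordinate descent procedure is set to 8. For the high resolution […] For both the high and the low resolution, every root filter in our detector contains $8 \times 4$ […]

[…] We use $g(p, v)$ to denote the relationship between a pedestrian and a vehicle. For a pedestrian detection […] the corresponding dimensions of the context feature are defined as $(\sigma(s), \Delta c_x, \Delta c_y, \Delta h, 1)$, and the remaining dimensions stay 0. If the spatial relationship of the two is "Far", or no vehicle appears in the detections, all dimensions of the pedestrian-vehicle feature are 0. Here $\Delta c_x = |c_{vx} - c_{px}|$, $\Delta c_y = c_{vy} - c_{py}$ and $\Delta h = h_v / h_p$, where $(c_{px}, c_{py})$ and $(c_{vx}, c_{vy})$ are the center coordinates of the pedestrian detection and the vehicle detection.

[…] for a detection result $p$, we further define the geometry context feature as $g(p) = (\sigma(s), c_y, h, c_y^2, h^2)$ […] are normalized by the height of the image.

[…] the sum of the context scores of all detected pedestrians. Suppose an image contains $n$ pedestrian detections $P = (p_1, p_2, \dots, p_n)$ and $m$ vehicle detections $V = (v_1, v_2, \dots, v_m)$. The context score is then defined as:

\[ S(P, V) = \sum_{i=1}^{n} \Big( w_p^T g(p_i) + \sum_{j=1}^{m} w_v^T g(p_i, v_j) \Big) \tag{8} \]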
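The pedestrian-vehicle feature $g(p, v)$ described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the dictionary keys (`score`, `cx`, `cy`, `h`), the function names and the order of the four relation slots are hypothetical, and the relation label is taken as an input because the text does not say how it is decided. The score normalization $\sigma(s) = 1/(1 + \exp(-2s))$ is the one given in the English original.

```python
import math

# Hypothetical slot order; "far" (or no vehicle) maps to the all-zero feature.
RELATIONS = ("above", "next_to", "below", "overlap")

def sigmoid_score(s):
    """sigma(s) = 1 / (1 + exp(-2 s)), mapping a raw detection score into [0, 1]."""
    return 1.0 / (1.0 + math.exp(-2.0 * s))

def pair_feature(ped, veh, relation):
    """Build g(p, v): five values written into the slot of the given relation."""
    g = [0.0] * (5 * len(RELATIONS))
    if veh is None or relation == "far":
        return g
    k = 5 * RELATIONS.index(relation)
    g[k:k + 5] = [
        sigmoid_score(ped["score"]),
        abs(veh["cx"] - ped["cx"]),   # delta c_x, absolute value for left-right symmetry
        veh["cy"] - ped["cy"],        # delta c_y
        veh["h"] / ped["h"],          # delta h, ratio of heights
        1.0,                          # constant entry, acts as a per-relation bias
    ]
    return g

ped = {"score": 0.0, "cx": 10.0, "cy": 20.0, "h": 50.0}
veh = {"cx": 4.0, "cy": 26.0, "h": 100.0}
g_below = pair_feature(ped, veh, "below")
g_far = pair_feature(ped, veh, "far")
```

Because each relation type owns its own block of five entries, a single linear weight vector $w_v$ can learn a different response for each spatial relationship.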

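In expression (8), $w_p$ and $w_v$ are the parameters of the geometry context and the pedestrian-vehicle context; they […]

\[ \arg\max_{t_{p_i},\, t_{v_j}} \sum_{i=1}^{n} t_{p_i} \Big( w_p^T g(p_i) + \sum_{j=1}^{m} t_{v_j}\, w_v^T g(p_i, v_j) \Big) \tag{9} \]

In this expression, $t_{p_i}$ and $t_{v_j}$ are binary values: 0 means a false detection and 1 means a true detection. […] In typical scenes, the number of vehicles is limited. For example, in the Caltech […]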

\[ \arg\max_{t_{p_i},\, t_{v_j}} \; [w_p, w_v] \Big[ \sum_{i=1}^{n} t_{p_i}\, g(p_i),\ \sum_{i=1}^{n} \sum_{j=1}^{m} t_{p_i} t_{v_j}\, g(p_i, v_j) \Big]^T \tag{10} \]

Expression (10) provides a natural way to carry out the learning. We use $w_c$ to denote $[w_p, w_v]$. Given the ground truth hypotheses of the pedestrian and vehicle locations, the standard structural SVM can […]

\[ \min_{w_c,\, \xi_k} \tfrac{1}{2}\|w_c\|^2 + C \sum_k \xi_k \quad \text{s.t. } \forall P', \forall V',\ S(P_k, V_k) - S(P', V') \ge L(P_k, P') - \xi_k \tag{11} \]

[…]

\[ \min_{w_c,\, \xi_k} \tfrac{1}{2}\|w_c\|^2 + C \sum_k \xi_k \quad \text{s.t. } \forall P',\ S(P_k, V_k^*) - S(P', V') \ge L(P_k, P') - \xi_k \tag{12} \]

In this expression, $V_k^*$ is a subset of $V_k$ that reflects the current vehicle […] by maximizing the overall context score […]

[…] conducted with the […] Benchmark as the test standard. Following the experimental protocol, we use set […] for testing. As suggested in [14], we use ROC curves or the mean miss rate to compare the methods. The related parameters […]

Experiments on the subspace dimension of MT-DPM. The dimension of the subspace used for the resolution aware mapping in MT-DPM reflects […] different resolutions […] and the mean miss rate is lowest when the dimension is 16. In the subsequent experiments, we set the subspace […]

Comparison with other detection methods. In this section, we compare our MT-DPM method with other multi-resolution pedestrian detection methods […] all detectors are applied to […]. These methods include: a DPM detector trained on the high resolution set; a DPM detector trained on the high resolution set, with the image resized to […] the original […] at detection time; a DPM detector trained on the low resolution set; […] training different detectors for different resolutions. Our MT-DPM, because it jointly considers the differences between resolutions […]

Figure 3 […] the more pronounced the improvement from the context model. For example, at 1 FPPI the context model reduces the miss rate by 3.2%.

Comparison with state-of-the-art methods. In this section, we compare our method with state-of-the-art methods […] PoseInv [30], HOGLbp [43], HikSVM [31], HOG [6], FtrMine [13], MultFtr [44], MultiFtr+CSS [44], Pls [37], MultiFtr+Motion [44], FPDW [11], FeatSynth […], ChnFtrs [12], MultiResC [33]. Our methods are denoted MT-DPM and MT-DPM+Context. Because of space limits, we only […] divided by resolution into high (taller than 80 pixels), […] the miss rate of our MT-DPM detector is nearly 6% lower than that of all other methods. Our context model reduces it by a further 3%. Since the ROC of [9] is not available, we did not […] on a standard PC, for both high and low resolution, the time to process one image is about […] state-of-the-art detection methods […]. We hope to […] in the future by exploring […] of objects […]

References

[1] Bar-Hillel, D. Levi, E. Krupka, and C. Goldberg. Part-based feature synthesis for human detection. ECCV, 2010.
[2] Barinova, V. Lempitsky, and P. Kholi. On detection of multiple object instances using Hough transforms. PAMI, 2012.
[3] Beleznai and H. Bischof. Fast human detection in crowded scenes by contour integration and local shape estimation. In CVPR. IEEE, 2009.
[4] R. Benenson, M. Mathias, R. Timofte, and L. Van Gool. Pedestrian detection at 100 frames per second. In CVPR. IEEE, 2012.
[5] S. Biswas, K. W. Bowyer, and P. J. Flynn. Multidimensional scaling for matching low-resolution face images. PAMI, 2012.
[6] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR. IEEE, 2005.
[7] N. Dalal, B. Triggs, and C. Schmid. Human detection using oriented histograms of flow and appearance. ECCV, 2006.
[8] Desai, D. Ramanan, and C. Fowlkes. Discriminative models for multi-class object layout. IJCV, 2011.
[9] Y. Ding and J. Xiao. Contextual boost for pedestrian detection. In […] IEEE, 2012.
[10] P. Dollár, R. Appel, and W. Kienzle. Crosstalk cascades for frame-rate pedestrian detection. In ECCV. Springer, 2012.
[11] P. Dollár, S. Belongie, and P. Perona. The fastest pedestrian detector in the west. BMVC, 2010.
[12] P. Dollár, Z. Tu, P. Perona, and S. Belongie. Integral channel features. In BMVC, 2009.
[13] P. Dollár, Z. Tu, H. Tao, and S. Belongie. Feature mining for image classification. In CVPR. IEEE, 2007.
[14] P. Dollár, C. Wojek, B. Schiele, and P. Perona. Pedestrian detection: An evaluation of the state of the art. TPAMI, 2012.
[15] Dubout and F. Fleuret. Exact acceleration of linear object detectors. ECCV, 2012.
[16] M. Enzweiler and D. Gavrila. Monocular pedestrian detection: Survey and experiments. TPAMI, 2009.
[17] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and Zisserman. The PASCAL VOC 2012 results.
[18] P. Felzenszwalb, R. Girshick, and D. McAllester. Cascade object detection with deformable part models. In CVPR. IEEE, 2010.
[19] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. TPAMI, 2010.
[20] Geiger, P. Lenz, and R. Urtasun. Are we ready for autonomous […] The KITTI vision benchmark suite. In CVPR. IEEE, 2012.
[21] Geronimo, A. Lopez, A. Sappa, and T. Graf. Survey of pedestrian detection for advanced driver assistance systems. PAMI, 2010.
[22] R. B. Girshick, P. F. Felzenszwalb, and D. McAllester. Discriminatively trained deformable part models, release 5. rbg/latent-release5/.
[23] D. Hoiem, Y. Chodpathumwan, and Q. Dai. Diagnosing error in object detectors. ECCV, 2012.
[24] D. Hoiem, A. Efros, and M. Hebert. Putting objects in […]. IJCV, […].
[25] C. Huang and R. Nevatia. High performance object detection by collaborative learning of joint ranking of granules features. In CVPR. IEEE, 2010.
[26] Khosla, T. Zhou, T. Malisiewicz, A. A. Efros, and A. Torralba. Undoing the damage of dataset bias. In ECCV. Springer, 2012.
[27] Kulis, K. Saenko, and T. Darrell. What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. In CVPR. IEEE, 2011.
[28] Z. Lei and S. Z. Li. Coupled spectral regression for matching heterogeneous faces. In CVPR. IEEE, 2009.
[29] C. Li, D. Parikh, and T. Chen. Automatic discovery of groups of objects for scene understanding. In CVPR. IEEE, 2012.
[30] Z. Lin and L. Davis. A pose-invariant descriptor for human detection and segmentation. ECCV, 2008.
[31] S. Maji, A. Berg, and J. Malik. Classification using intersection kernel support vector machines is efficient. In CVPR. IEEE, 2008.
[32] C. Papageorgiou and T. Poggio. A trainable system for object […].
[33] D. Park, D. Ramanan, and C. Fowlkes. Multiresolution models for object detection. ECCV, 2010.
[34] H. Pirsiavash and D. Ramanan. Steerable part models. In CVPR. IEEE, […].
[35] H. Pirsiavash, D. Ramanan, and C. Fowlkes. Bilinear classifiers for visual recognition. In NIPS, 2009.
[36] M. Sadeghi and A. Farhadi. Recognition using visual phrases. In CVPR, […].
[37] W. Schwartz, A. Kembhavi, D. Harwood, and L. Davis. Human detection using partial least squares analysis. In ICCV. IEEE, 2009.
[38] S. Tang, M. Andriluka, and B. Schiele. Detection and tracking of occluded people. In BMVC, 2012.
[39] Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun. Large margin methods for structured and interdependent output variables. JMLR, 2006.
[40] P. Viola, M. Jones, and D. Snow. Detecting pedestrians using patterns of motion and appearance. IJCV, 2005.
[41] S. Walk, N. Majer, K. Schindler, and B. Schiele. New features and insights for pedestrian detection. In CVPR. IEEE, 2010.
[42] M. Wang, W. Li, and X. Wang. Transferring a generic pedestrian detector towards specific scenes. In CVPR. IEEE, 2012.
[43] X. Wang, T. Han, and S. Yan. An HOG-LBP human detector with partial occlusion handling. In ICCV. IEEE, 2009.
[44] C. Wojek and B. Schiele. A performance evaluation of single and multi-feature people detection. DAGM, 2008.
[45] C. Wojek, S. Walk, and B. Schiele. Multi-cue onboard pedestrian […]. In CVPR. IEEE, 2009.
[46] Yan, Z. Lei, D. Yi, and S. Z. Li. Multi-pedestrian detection in crowded scenes: A global view. In CVPR. IEEE, 2012.

Robust Multi-Resolution Pedestrian Detection in Traffic Scenes

Junjie Yan   Xucong Zhang   Zhen Lei   Shengcai Liao
Stan Z. Li*
Center for Biometrics and Security Research & National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
*Stan Z. Li is the corresponding author.

[…] resolution is the major bottleneck for current pedestrian detection techniques [14, 23]. In this paper, we take pedestrian detection in different resolutions as different but related problems, and propose a Multi-Task model to jointly consider their commonness and differences. The model contains resolution aware transformations to map pedestrians in different resolutions to a common space, where a shared detector is constructed to distinguish pedestrians from background. For model learning, we present a coordinate descent procedure to learn the resolution aware transformations and the deformable part model (DPM) based detector iteratively. In traffic scenes, there are many false positives located around vehicles; therefore, we further build a context model to suppress them […]. The context model can be learned automatically even when the vehicle annotations are not available. Our method reduces the mean miss rate to 60% for pedestrians taller than 30 pixels on the Caltech Pedestrian Benchmark, which noticeably outperforms the previous state-of-the-art (71%).

Pedestrian detection has been a hot research topic in computer vision for decades, because of its importance in real applications such as driving assistance and surveillance. In recent years, especially due to the popularity of gradient features, the pedestrian detection field has achieved impressive progress in both effectiveness [6, 31, 43, 41, 19, 33] and efficiency [25, 11, 18, 4, 10]. The leading detectors […] benchmarks (e.g. INRIA [6]); however, they encounter difficulties for low resolution pedestrians (e.g. 30-80 pixels tall, Fig. 1) [14, 23]. Unfortunately, low resolution pedestrians are often very important in real applications. For example, driver assistance systems need […] the low resolution pedestrians to provide enough time for […].

Figure 1. Examples of multiple resolution pedestrian detection results of our method on the Caltech Pedestrian Benchmark [14].

Traditional pedestrian detectors usually follow the scale invariant assumption: a scale invariant feature based detector trained at a fixed resolution can be generalized to all resolutions by resizing the detector [40, 4], the image [6, 19], or both of them [11]. However, the finite sampling frequency of the sensor results in much information loss for low resolution pedestrians. The scale invariant assumption does not hold in the case of low resolution, which leads to a disastrous drop of detection performance as the resolution decreases. For example, the best detector achieves a 21% mean miss rate for pedestrians taller than 80 pixels on the Caltech Pedestrian Benchmark [14], while the miss rate increases to 73% for pedestrians 30-80 pixels high.

Our philosophy is that the relationship among different resolutions should be explored for robust multi-resolution pedestrian detection. For example, the low resolution samples contain a lot of noise that may mislead the detector in the training phase, and the information contained in high resolution samples can help to regularize it. We argue that for pedestrians in different resolutions, the differences exist in the features of local patches (e.g. the gradient histogram feature of a cell in HOG), while the global spatial structure stays the same (e.g. the part configuration). To this end, we propose to conduct resolution aware transformations to map the local features from different resolutions to a common subspace, where the differences of local features are reduced, and the detector is learned on the mapped features of samples from different resolutions, thus the structural commonness is preserved.
Particularly, we extend the popular deformable part model (DPM) [19] to multi-task DPM (MT-DPM), which aims to find an optimal combination of the DPM detector and the resolution aware transformations. We prove that when the resolution aware transformations are fixed, the multi-task problem can be transformed into a Latent-SVM optimization problem, and when the DPM detector in the mapped space is fixed, the problem is equivalent to a standard SVM problem. We divide the complex non-convex problem into these two sub-problems, and optimize them alternately.

In addition, we propose a new context model to improve the detection performance in traffic scenes. We observe that quite a large number of detections (33.19% for MT-DPM in our experiments) are around vehicles. Vehicle localization is much easier than pedestrian localization, which motivates us to employ the pedestrian-vehicle relationship as an additional cue to judge whether a detection is a false or true positive. We build an energy model to jointly encode the pedestrian-vehicle and geometry contexts, and infer the labels of detections by maximizing the energy function on the whole image. Since vehicle annotations are often not available in pedestrian benchmarks, we further present a method to learn the context model from ground truth pedestrian annotations and noisy vehicle detections.

We conduct experiments on the challenging Caltech Pedestrian Benchmark [14], and achieve significant improvement over previous state-of-the-art methods on all the 9 sub-experiments advised in [14]. For pedestrians taller than 30 pixels, our MT-DPM reduces the mean miss rate by 8% and our context model reduces it by a further 3% over the previous state-of-the-art performance.

The rest of the paper is organized as follows: Section 2 reviews the related work. The multi-task DPM detector and the pedestrian-vehicle context model are discussed in Section 3 and Section 4, respectively. Section 5 shows the experiments, and finally in Section 6 we conclude the paper.

Related Work

There is a long history of research on pedestrian detection. Most modern detectors are based on statistical learning and sliding-window scan, popularized by [32] and [40]. Large improvements came from robust features, such as [6, 12, 25, 3]. Some papers fused HOG with other features [43, 7, 45, 41] to improve the performance. Some papers focused on special problems in pedestrian detection, […] speed [25, 11, 18, 4, 10], and detector transfer in new scenes [42, 27]. We refer the reader to [21, 14] for detailed surveys on pedestrian detection.

Resolution related problems have attracted attention in recent evaluations. [16] found that the pedestrian detection

performance depends on the resolution of training samples […] with decreasing resolution. [23] observed a similar phenomenon in the general object detection task. However, very limited work has been proposed to tackle this problem. The most related work is [33], which utilized root and part filters for high resolution pedestrians, while only using the rigid root filter for low resolution pedestrians. [4] proposed to use a single model per detection scale, but that paper focuses on speedup.

Our pedestrian detector is built on the popular DPM (deformable part model) [19], which combines a rigid root filter and deformable part filters for detection. The DPM only performs well for high resolution objects, while our MT-DPM generalizes it to the low resolution case. The coordinate descent procedure in learning is motivated by the steerable part model [35, 34], which trained shared part bases to accelerate detection. Note that [34] learned shared filter bases, while our model learns a shared classifier, which results in a quite different formulation. [26] also proposed a multi-task model to handle dataset bias. The multi-task idea in this paper is motivated by works on face recognition across different domains, such as [28, 5].

Context has been used in pedestrian detection. [24, 33] captured the geometry constraint under the assumption that the camera is aligned with the ground plane. [9] took the appearance of nearby regions as the context. [8, 36, 29] captured the pair-wise spatial relationship in multi-class object detection. To the best of our knowledge, this is the first work to capture the pedestrian-vehicle relationship to improve pedestrian detection in traffic scenes.

There are two intuitive strategies to handle multi-resolution detection. One is to combine samples from different resolutions to train a single detector (Fig. 2(a)), and the other is to train independent detectors for different resolutions (Fig. 2(b)). However, neither of the two strategies is perfect. The first one considers the commonness between different resolutions, while their differences are ignored. Samples from different
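domains would increase the complexity of the detection boundary, which is probably beyond the ability of a single linear detector. On the contrary, the multi-resolution model takes pedestrian detection in different resolutions as independent problems, and the relationship among them is missed. The unreliable features of low resolution pedestrians can mislead the learned detector and make it difficult to generalize to novel test samples.

In this part, we present a multi-resolution detection method that considers the relationship of samples from different resolutions, including the commonness and the differences, which are captured simultaneously by a multi-task strategy. Considering the differences of different resolutions, we use the resolution aware transformations to map features from different resolutions to a common subspace, in which they have similar distributions. A shared detector is trained in the resolution-invariant subspace by samples from all resolutions, to capture the structural commonness. It is easy to see that the first two strategies are special cases of the multi-task strategy.

Figure 2. Different strategies for multi-resolution pedestrian detection.

Particularly, we extend the idea to the popular DPM detector [19] and propose a Multi-Task form of DPM. Here we consider the partition into two resolutions (low resolution: 30-80 pixels tall, and high resolution: taller than 80 pixels, as advised in [14]). Note that extending the strategy to other local feature based linear detectors and to more resolution partitions is straightforward.

Resolution Aware Detection

To simplify the notation, we introduce a matrix based representation for DPM. Given the image $I$ and the collection of $m$ part locations $L = (l_0, l_1, \dots, l_m)$, the HOG feature $\phi_a(I, l_i)$ of the $i$-th part is an $n_h \times n_w \times n_f$ dimensional tensor, where $n_h$, $n_w$ are the height and width of the part in HOG cells, and $n_f$ is the dimension of the gradient histogram feature vector of a cell. We reshape $\phi_a(I, l_i)$ to be a matrix $\Phi_a(I, l_i)$, where every column represents the features of a cell. $\Phi_a(I, l_i)$ is further concatenated to be a large matrix $\Phi_a(I, L) = [\Phi_a(I, l_0), \Phi_a(I, l_1), \dots, \Phi_a(I, l_m)]$. The column number of $\Phi_a(I, L)$ is denoted as $n_c$, which is the total number of cells in the parts and the root. A demonstration of the procedure is shown in Fig. 3. The appearance filters in the detector are concatenated to be an $n_f \times n_c$ matrix $W_a$. The spatial features of the parts are concatenated to be a vector $\phi_s(I, L)$, and the spatial prior parameter is denoted as $w_s$. With these notations, the detection model of DPM [19] can be written as:

\[ \mathrm{score}(I, L) = \mathrm{Tr}(W_a^T \Phi_a(I, L)) + w_s^T \phi_s(I, L) \tag{1} \]

where $\mathrm{Tr}(\cdot)$ is the trace operation, defined as the summation of the elements on the main diagonal of a matrix. Given the root location $l_0$, all the part locations are latent variables, and the final score is $\max_{L^*} \mathrm{score}(I, L^*)$, where $L^*$ is the best possible part configuration when the root location is fixed to be $l_0$. The problem can be solved effectively by dynamic programming [19]. Mixtures can be used to increase the flexibility, but we ignore them for simplicity of notation; adding mixtures to the formulations is straightforward.

In DPM, a pedestrian consists of parts, and every part consists of HOG cells. When the pedestrian resolution changes, the structure of the parts and the spatial relationship of the HOG cells stay the same. The only difference among resolutions lies in the feature vector of every cell, so the resolution aware transformations $P_L$ and $P_H$ are defined on it. $P_L$ and $P_H$ are of dimension $n_d \times n_f$, and they map the low and high resolution samples from the original $n_f$ dimensional feature space to the $n_d$ dimensional subspace. The features from different resolutions are mapped into the common subspace, so that they can share the same detector. We still denote the learned appearance parameters in the mapped resolution invariant subspace as $W_a$, which is an $n_d \times n_c$ matrix, of the same size as $P_H \Phi_a(I, L)$. The score of a collection of part locations $L$ in the MT-DPM is defined as:

\[ \mathrm{score}(I, L) = \begin{cases} \mathrm{Tr}(W_a^T P_H \Phi_a(I, L)) + w_s^T \phi_s(I, L), & \text{High Resolution} \\ \mathrm{Tr}(W_a^T P_L \Phi_a(I, L)) + w_s^T \phi_s(I, L), & \text{Low Resolution} \end{cases} \tag{2} \]

[…] pedestrians of different resolutions, but also brings challenges, since $W_a$, $w_s$, $P_H$ and $P_L$ are all unknown. In the following part, we present the objective function of the multi-task model for learning, and show its optimization.

Multi-Task Learning

[…] task DPM. Its matrix form can be written as:

\[ \arg\min_{W_a, w_s} \tfrac{1}{2}\|W_a\|_F^2 + \tfrac{1}{2} w_s^T w_s + C \sum_n \max\big[0,\ 1 - y_n\big(\mathrm{Tr}(W_a^T \Phi_a(I_n, L_n^*)) + w_s^T \phi_s(L_n^*)\big)\big] \tag{3} \]

where $\|\cdot\|_F$ is the Frobenius norm, and $\|W_a\|_F^2 = \mathrm{Tr}(W_a W_a^T)$. $y_n$ is 1 if $I_n(L_n)$ is a pedestrian, and $-1$ for background. The first two terms are used to regularize the detector parameters, and the last term is the hinge loss in DPM detection. $L_n^*$ is the optimized part configuration that maximizes the detection score of $I_n$. In the training phase, the part locations are taken as latent variables, and the problem can be optimized by the Latent-SVM.

For multi-task learning, the relationship between […]. In analogy to the […]:

\[ \arg\min_{W_a, w_s, P_H, P_L} \tfrac{1}{2} w_s^T w_s + f_{I_H}(W_a, w_s, P_H) + f_{I_L}(W_a, w_s, P_L) \tag{4} \]

where $I_H$ and $I_L$ denote the high and low resolution training sets. Since the spatial term $w_s$ is directly applied to the data from different resolutions, it can be regularized independently. $f_{I_H}$ and $f_{I_L}$ are used to consider the detection loss and to regularize the parameters $P_H$, $P_L$ and $W_a$. $f_{I_H}$ and $f_{I_L}$ are of the same form; here we take $f_{I_H}$ as an example. It can be written as:

\[ f_{I_H}(W_a, w_s, P_H) = \tfrac{1}{2}\|P_H^T W_a\|_F^2 + C \sum_{n=1}^{N_H} \max\big[0,\ 1 - y_n\big(\mathrm{Tr}(W_a^T P_H \Phi_a(I_n^H, L_n^*)) + w_s^T \phi_s(L_n^*)\big)\big] \tag{5} \]

where the regularization term $P_H^T W_a$ is an $n_f \times n_c$ dimensional matrix, of the same dimension as the original feature matrix. Since $P_H$ and $W_a$ are applied to the original appearance feature integrally in calculating the appearance score $\mathrm{Tr}((P_H^T W_a)^T \Phi_a(I, L))$, we take them as an ensemble and regularize them together. The second term is […] to the detection model in Eq. 2. The parameters $W_a$ and $w_s$ are shared between $f_{I_H}$ and $f_{I_L}$. Note that more partitions of resolutions can be handled naturally in Eq. 4.

In Eq. 4, we need to find an optimal combination of $W_a$, $w_s$, $P_H$ and $P_L$. However, Eq. 4 is not convex when all of them are optimized jointly. Fortunately, we show that given the two transformations, the problem can be transformed into a standard DPM problem, and given the DPM detector, it can be transformed into a standard SVM problem. We use a coordinate descent procedure to optimize the two sub-problems iteratively.

Optimize $W_a$ and $w_s$

When $P_H$ and $P_L$ are fixed, we can map the features to the common space, on which the DPM detector can be learned. We denote $P_H P_H^T + P_L P_L^T$ as $A$, and $A^{1/2} W_a$ as $\tilde{W}_a$. For high resolution samples we denote $A^{-1/2} P_H \Phi_a(I_n, L_n^*)$ as $\tilde{\Phi}_a(I_n, L_n^*)$, and for low resolution samples we denote $A^{-1/2} P_L \Phi_a(I_n, L_n^*)$ as $\tilde{\Phi}_a(I_n, L_n^*)$. Eq. 4 can be reformulated as:

\[ \arg\min_{\tilde{W}_a, w_s} \tfrac{1}{2}\|\tilde{W}_a\|_F^2 + \tfrac{1}{2} w_s^T w_s + C \sum_{n=1}^{N_H + N_L} \max\big[0,\ 1 - y_n\big(\mathrm{Tr}(\tilde{W}_a^T \tilde{\Phi}_a(I_n, L_n^*)) + w_s^T \phi_s(L_n^*)\big)\big] \tag{6} \]

where $N_H$ and $N_L$ are the numbers of high and low resolution training samples. Eq. 6 has the same form as the optimization problem in Eq. 3, so the Latent-SVM solver can be used here. Once the solution to Eq. 6 is obtained, $W_a$ is computed by $(P_H P_H^T + P_L P_L^T)^{-1/2}\, \tilde{W}_a$.

Optimize $P_H$ and $P_L$

When $W_a$ and $w_s$ are fixed, $P_H$ and $P_L$ are independent, thus the optimization problem can be divided into two subproblems: $\arg\min_{P_H} f_{I_H}(W_a, w_s, P_H)$ and $\arg\min_{P_L} f_{I_L}(W_a, w_s, P_L)$. Since they are of the same form, here we only give the details for optimizing $P_H$.

Given $W_a$ and $w_s$, we first infer the part locations $L_n^*$ of every training sample by finding a part configuration that maximizes Eq. 2. Denoting $W_a W_a^T$ as $A$, $A^{1/2} P_H$ as $\tilde{P}_H$, and $A^{-1/2} W_a \Phi_a(I_n^H, L_n^*)^T$ as $\tilde{\Phi}_a(I_n^H, L_n^*)$, the optimization of $P_H$ in Eq. 4 is equivalent to:

\[ \arg\min_{\tilde{P}_H} \tfrac{1}{2}\|\tilde{P}_H\|_F^2 + C \sum_{n=1}^{N_H} \max\big[0,\ 1 - y_n\big(\mathrm{Tr}(\tilde{P}_H^T \tilde{\Phi}_a(I_n^H, L_n^*)) + w_s^T \phi_s(L_n^*)\big)\big] \tag{7} \]

The only difference between Eq. 7 and a standard SVM is the additional term $w_s^T \phi_s(L_n^*)$. Since $w_s^T \phi_s(L_n^*)$ is a constant in the optimization, it can be taken as an additional […], and the problem can be solved by a standard SVM solver. After we get $\tilde{P}_H$, $P_H$ can then be computed by $(W_a W_a^T)^{-1/2}\, \tilde{P}_H$.

To start the loop of the coordinate descent procedure, one needs to give initial values for either $\{W_a, w_s\}$ or $\{P_H, P_L\}$. In our implementation, we calculate the PCA of HOG features from randomly generated high and low resolution patches, and use the first $n_d$ eigenvectors as the initial values of $P_H$ and $P_L$, respectively. We use the HOG features in [19] and abandon the last truncation term, thus $n_f = 31$ in our experiment. The dimension $n_d$ determines how much information is kept for sharing. We examine the effect of $n_d$ in the experiments. The solvers for the problems in Eq. 6 and Eq. 7 are based on [22].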
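The change of variables behind Eq. 6 can be checked in two lines. The following is a worked derivation rather than part of the original text; it assumes that $A = P_H P_H^T + P_L P_L^T$ is symmetric positive definite, so that $A^{1/2}$ and $A^{-1/2}$ exist and are symmetric (the text does not state this condition explicitly), and it uses the identity $\|X\|_F^2 = \mathrm{Tr}(X^T X)$.

\[
\begin{aligned}
\tfrac{1}{2}\|P_H^T W_a\|_F^2 + \tfrac{1}{2}\|P_L^T W_a\|_F^2
  &= \tfrac{1}{2}\mathrm{Tr}\big(W_a^T (P_H P_H^T + P_L P_L^T) W_a\big)
   = \tfrac{1}{2}\mathrm{Tr}\big(W_a^T A\, W_a\big)
   = \tfrac{1}{2}\|A^{1/2} W_a\|_F^2
   = \tfrac{1}{2}\|\tilde{W}_a\|_F^2, \\
\mathrm{Tr}\big(\tilde{W}_a^T \tilde{\Phi}_a(I_n, L_n^*)\big)
  &= \mathrm{Tr}\big(W_a^T A^{1/2} A^{-1/2} P_H \Phi_a(I_n, L_n^*)\big)
   = \mathrm{Tr}\big(W_a^T P_H \Phi_a(I_n, L_n^*)\big).
\end{aligned}
\]

The first line shows that the two regularizers of $f_{I_H}$ and $f_{I_L}$ collapse into a single Frobenius norm of $\tilde{W}_a$, and the second shows that the appearance score is unchanged (the same holds with $P_L$ for low resolution samples). The argument for Eq. 7 is symmetric, with $A = W_a W_a^T$ and the roles of $W_a$ and $P_H$ exchanged.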

The maximum number of loops of the coordinate descent procedure is set to 8. The bin size in HOG is set to 8 for the high resolution model, and […] for the low resolution model. The root filter contains 8×4 HOG cells for both the low and the high resolution detection models.

Pedestrian-Vehicle Context in Traffic Scenes

A lot of detections are located around vehicles in traffic scenes (33.19% for our MT-DPM detector on the Caltech Benchmark), as shown in Fig. 4. It is possible to use the pedestrian-vehicle relationship to infer whether a detection is a true or a false positive. For example, if we know the locations of the vehicles in Fig. 4, the detections above a vehicle and the detection at the wheel position of a vehicle can be safely removed. Fortunately, vehicles are easier to localize than pedestrians, as shown in previous work (e.g. Pascal VOC [17], KITTI [20]). Since it is difficult to capture the complex relationship by handcrafted rules, we build a context model and learn it automatically from data.

Figure 4. Examples of original detections, and the detections optimized by the context model.

[…] vehicles into five types, including: "Above", "Next-to", "Below", "Overlap" and "Far". We denote the feature of the pedestrian-vehicle context as $g(p, v)$. If a pedestrian detection $p$ and a vehicle detection $v$ have one of the first four relationships, the context features at the corresponding dimensions are defined as $(\sigma(s), \Delta c_x, \Delta c_y, \Delta h, 1)$, and the other dimensions remain 0. If the pedestrian detection and the vehicle detection are too far apart, or there is no vehicle, all the dimensions of the pedestrian-vehicle feature are 0. Here $\Delta c_x = |c_{vx} - c_{px}|$, $\Delta c_y = c_{vy} - c_{py}$, and $\Delta h = h_v / h_p$, where $(c_{vx}, c_{vy})$ and $(c_{px}, c_{py})$ are the center coordinates of the vehicle detection $v$ and the pedestrian detection $p$, respectively. $\sigma(s) = 1/(1 + \exp(-2s))$ is used to normalize the detection score to $[0, 1]$. For left-right symmetry, the absolute value is taken for $\Delta c_x$.

Moreover, as pointed out in [33], there is also a relationship between the coordinate and the scale […]

[…] where $w_p$ and $w_v$ are the parameters of the geometry context and the pedestrian-vehicle context, which ensure that the true detection $(P, V)$ has a larger context score than any other detection hypothesis. Given the original pedestrian and vehicle detections $P$ and $V$, whether each detection is a false positive or a true positive is decided by maximizing the context score:

\[ \arg\max_{t_{p_i},\, t_{v_j}} \sum_{i} t_{p_i} \Big( w_p^T g(p_i) + \sum_{j} t_{v_j}\, w_v^T g(p_i, v_j) \Big) \tag{9} \]

where $t_{p_i}$ and $t_{v_j}$ are binary values: 0 means a false positive and 1 means a true positive. Eq. 9 is an integer programming problem, but it becomes trivial when the labels of $V$ are fixed, since it is then equivalent to maximizing the term of every pedestrian independently. In typical traffic scenes, the number of vehicles is limited. For example, in the Caltech Pedestrian Benchmark, there are no more than 8 vehicles in an image, so that the problem can be solved by no more than $2^8$ trivial sub-problems, which can be very efficient in real applications.

For the linear property, Eq. 9 is equal to […]
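The enumeration argument for Eq. 9 can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `infer_labels` and the inputs `unary` and `pairwise` are hypothetical, standing for precomputed values of $w_p^T g(p_i)$ and $w_v^T g(p_i, v_j)$, and the numbers in the example are made up.

```python
from itertools import product

def infer_labels(unary, pairwise):
    """Maximize sum_i t_p[i] * (unary[i] + sum_j t_v[j] * pairwise[i][j]).

    unary[i]       stands for w_p^T g(p_i)       (hypothetical precomputed value)
    pairwise[i][j] stands for w_v^T g(p_i, v_j)  (hypothetical precomputed value)
    All 2^m vehicle labelings are enumerated; for a fixed labeling, each
    pedestrian is kept exactly when its own term is positive.
    """
    m = len(pairwise[0]) if pairwise else 0
    best_total, best_t_p, best_t_v = float("-inf"), None, None
    for t_v in product((0, 1), repeat=m):
        t_p, total = [], 0.0
        for u, row in zip(unary, pairwise):
            term = u + sum(t * w for t, w in zip(t_v, row))
            keep = 1 if term > 0 else 0
            t_p.append(keep)
            total += keep * term
        if total > best_total:
            best_total, best_t_p, best_t_v = total, t_p, list(t_v)
    return best_total, best_t_p, best_t_v

# Three pedestrian detections and two vehicle detections (made-up values).
unary = [0.5, -0.2, -1.0]
pairwise = [[0.1, -0.3],
            [0.6, 0.0],
            [-0.5, 0.2]]
total, t_p, t_v = infer_labels(unary, pairwise)
```

With at most 8 vehicles per image the outer loop runs at most 256 times, and each iteration is linear in the number of pedestrian-vehicle pairs, which is why the exhaustive search is cheap in practice.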
