




Deep Learning: Overview and Discussion (slide 1 of 52)

Outline
- Conception of deep learning
- Development history
- Deep learning frameworks
- Deep neural network architectures
- Convolutional neural networks
  - Introduction
  - Network structure
  - Training tricks
- Application in Aesthetic Image Evaluation
- Idea

Deep Learning (Hinton, 2006)
Deep learning is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data.
The advantage of deep learning is that features are extracted automatically instead of being engineered by hand.
- Computer vision
- Speech recognition
- Natural language processing

Development History
[Figure: timeline, 1940 to 2010s]
- 1943: MP model (W. S. McCulloch, W. Pitts)
- 1958: Single-layer perceptron (Rosenblatt)
- 1969: XOR problem (Marvin Minsky)
- 1986: BP algorithm (Geoffrey Hinton)
- 1989: CNN-LeNet (Yann LeCun)
- 1991: Gradient disappearance (vanishing gradient) problem
- 1995: SVM
- 1997: LSTM
- 2006: DBN (Hinton)
- 2011: ReLU
- 2012: Dropout, AlexNet
- 2015: BN, Faster R-CNN, Residual Net
Other names shown on the timeline: Hinton, LeCun, Bengio.

Deep Learning Frameworks

Deep neural network architectures
- Deep Belief Networks (DBN)
- Recurrent Neural Networks (RNN)
- Generative Adversarial Networks (GANs)
- Convolutional Neural Networks (CNN)
- Long Short-Term Memory (LSTM)

DBN (Deep Belief Network, 2006)
Hidden units and visible units:
- Each unit is binary (0 or 1).
- Every visible unit connects to all the hidden units.
- Every hidden unit connects to all the visible units.
- There are no visible-visible (v-v) or hidden-hidden (h-h) connections.
Hinton G E. Deep belief networks [J]. Scholarpedia, 2009, 4(6): 5947.
Fig 1. RBM (restricted Boltzmann machine) structure.
Fig 2. DBN (deep belief network) structure.
Idea? A DBN is composed of multiple layers of RBMs.
How do we train these additional layers?
Unsupervised greedy approach

RNN (Recurrent Neural Network, 2013)
What? An RNN processes sequence data. It remembers previous information and applies it to the calculation of the current output. That is, the nodes of the hidden layer are connected to each other, and the input of the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous time step.
Marhon S A, Cameron C J F, Kremer S C. Recurrent Neural Networks [M] // Handbook on Neural Information Processing. Springer Berlin Heidelberg, 2013: 29-65.
Applications?
- Machine translation
- Generating image descriptions
- Speech recognition
How to train?
BPTT (Backpropagation through time)

GANs (Generative Adversarial Networks, 2014)
GANs are inspired by the zero-sum game in game theory and consist of a pair of networks: a generator network and a discriminator network. The generator network generates a sample from a random vector; the discriminator network decides whether a given sample is natural or counterfeit. Both networks are trained together to improve their performance until they reach a point where counterfeit and real samples cannot be distinguished.
Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets [C] // Advances in Neural Information Processing Systems. 2014: 2672-2680.
Applications:
- Image editing
- Image-to-image translation
- Generating text
- Generating images based on text
- Combination with reinforcement learning
- And more…

Long Short-Term Memory (LSTM, 1997)

Neural Networks
[Figure: a neuron; a neural network]

Convolutional Neural Networks (CNN)
A convolutional neural network is a kind of feedforward neural network characterized by a simple structure, fewer training parameters, and strong adaptability. A CNN avoids complex image pre-processing (e.g., extracting hand-crafted features), so the original image can be used directly as input.
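The claim that a CNN needs fewer training parameters can be made concrete with a small count. The sketch below compares one fully connected layer with one convolution layer on a 28×28 grayscale image. The layer sizes (256 hidden units, 32 filters of 5×5) and the function names are illustrative assumptions, not figures taken from the slides.

```python
def dense_params(in_h, in_w, in_c, units):
    """Weights plus biases of a fully connected layer on a flattened image."""
    return in_h * in_w * in_c * units + units


def conv_params(k_h, k_w, in_c, filters):
    """Weights plus biases of a convolution layer.

    Each filter is shared across all image positions, so the count
    does not depend on the image size.
    """
    return k_h * k_w * in_c * filters + filters


if __name__ == "__main__":
    # Assumed sizes, chosen only for illustration.
    print("fully connected:", dense_params(28, 28, 1, 256))
    print("convolution:    ", conv_params(5, 5, 1, 32))
```

Under these assumptions the convolution layer has a few hundred weights where the fully connected layer has about two hundred thousand, because each kernel is reused at every position.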
Basic components: convolution layers, pooling layers, fully connected layers

Convolution layer
The convolution kernel slides over a 2-dimensional plane. At each position, every element of the kernel is multiplied by the element at the corresponding position of the image, and the products are summed. Moving the kernel across the image yields a new image made up of these sums, one per kernel position.
- Local receptive field
- Weight sharing
- Reduced number of parameters

Pooling layer
The pooling layer compresses the input feature map, which reduces the number of parameters in the training process and the degree of over-fitting of the model.
- Max-pooling: select the maximum value in the pooling window.
- Mean-pooling: calculate the average of all values in the pooling window.

Fully connected layer and Softmax layer
Each node of the fully connected layer is connected to all the nodes of the previous layer; it combines the features extracted by the earlier layers.
Fig 1. Fully connected layer.
Fig 2. Complete CNN structure.
Fig 3. Softmax layer.

Training and Testing
Training stage:
Forward propagation
- Take a sample (X, Yp) from the sample set and feed X into the network;
- Calculate the corresponding actual output Op.
Backpropagation
- Calculate the difference between the actual output Op and the corresponding ideal output Yp;
- Adjust the weight matrix by minimizing the error.
Testing stage:
Feed different images and labels into the trained convolutional neural network and compare the output with the actual value of each sample.
Before the training stage, the weights should be initialized with different small random numbers.

CNN Structure Evolution
[Figure: evolution of CNN structures. Starting points: BP (Hinton), Neocognitron (1980), LeNet (LeCun, 1989 and 1998). Historical breakthrough: AlexNet (2012) with ReLU, Dropout, GPU + big data, in the context of ImageNet ILSVRC (ImageNet Large Scale Visual Recognition Challenge). Later branches, 2013 to 2015:]
- Deeper networks: VGG16, VGG19, MSRA-Net
- Enhanced functionality of the convolution module: NIN, GoogLeNet, Inception V2 (BN, Batch Normalization), Inception V3, Inception V4
- From classification task to detection task: R-CNN, SPP-Net, Fast R-CNN, Faster R-CNN (RPN)
- New functional units and integration: FCN, FCN+CRF, STNet, CNN+RNN/LSTM
- ResNet

LeNet (LeCun, 1998)
LeNet is a convolutional neural network designed by Yann LeCun for handwritten digit recognition in 1998. It is one of the most representative experimental systems among early convolutional neural networks. LeNet includes convolution layers, pooling layers, and fully connected layers, which are the basic components of modern CNNs. LeNet is considered the beginning of the CNN.
Network structure: 3 convolution layers + 2 pooling layers + 1 fully connected layer + 1 output layer
Haykin S, Kosko B. Gradient Based Learning Applied to Document Recognition [D]. Wiley-IEEE Press, 2009.

AlexNet (Alex, 2012)
- Network structure: 5 convolution layers + 3 fully connected layers
- Nonlinear activation function: ReLU (rectified linear unit)
- Methods to prevent overfitting: Dropout, data augmentation
- Big data training: ImageNet, an image database on the order of millions of images
- Others: GPU, LRN (local response normalization) layer
Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks [C] // International Conference on Neural Information Processing Systems. Curran Associates Inc. 2012: 1097-1105.

Overfeat (2013)
Sermanet P, Eigen D, Zhang X, et al. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks [J]. Eprint Arxiv, 2013.

VGG-Net (Oxford University, 2014)
- Input: a fixed-size 224*224 RGB image
- Filters: a very small receptive field, 3*3, with stride 1
- Max-pooling: 2*2 pixel window, with stride 2
Fig 1. Architecture of VGG16.
Table 1: ConvNet configurations (shown in columns). The convolutional layer parameters are denoted as "conv<receptive field size>-<number of channels>".
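The convolution and max-pooling operations described on the slides above can be sketched in NumPy. This is a minimal single-channel illustration using the VGG settings as defaults (stride 1 for the convolution, a 2*2 window with stride 2 for pooling). The function names are hypothetical, and the loops favour readability over speed. As in most CNN libraries, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np


def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            window = image[top:top + kh, left:left + kw]
            # Element-wise product with the shared kernel, then sum.
            out[i, j] = np.sum(window * kernel)
    return out


def max_pool2d(fmap, size=2, stride=2):
    """Keep the maximum value inside each pooling window."""
    out_h = (fmap.shape[0] - size) // stride + 1
    out_w = (fmap.shape[1] - size) // stride + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            out[i, j] = fmap[top:top + size, left:left + size].max()
    return out


if __name__ == "__main__":
    image = np.arange(36, dtype=float).reshape(6, 6)
    kernel = np.ones((3, 3)) / 9.0        # assumed averaging kernel
    features = conv2d(image, kernel)      # 6x6 input -> 4x4 feature map
    pooled = max_pool2d(features)         # 4x4 feature map -> 2x2
    print(features.shape, pooled.shape)
```

The same 3*3 kernel is reused at every position, which is the weight sharing mentioned on the convolution slide; pooling then halves each spatial dimension.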
Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition [J]. Computer Science, 2014.
Why 3*3 filters?
- Stacked conv. layers have a large receptive field
- More non-linearity
- Fewer parameters to learn

Network-in-Network (NIN, Shuicheng Yan, 2013)
Network structure: 4 Mlpconv layers + global average pooling layer
Fig 1. Linear convolution vs. MLP convolution.
Fig 2. Fully connected layer vs. global average pooling layer.
Fig 3. NIN structure.
Min Lin et al., Network in Network, Arxiv 2013.
- Linear combination of multiple feature maps
- Cross-channel information integration
- Reduced the parameters
- Reduced the network
- Avoided over-fitting

GoogLeNet (Inception V1, 2014)
- Proposed the Inception architecture and optimized it
- Removed the fully connected layer
- Used auxiliary classifiers to accelerate network convergence
Fig 1. Inception module, naïve version.
Fig 2. Inception module with dimension reductions.
Fig 3. GoogLeNet network (22 layers).
Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 1-9.

Inception V2 (2015)
Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift [J]. arXiv preprint arXiv:1502.03167, 2015.

Inception V3 (2015)
Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the inception architecture for computer vision [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 2818-2826.

ResNet (Kaiming He, 2015)
A simple and clean framework for training "very" deep networks.
State-of-the-art performance for:
- Image classification
- Object detection
- Semantic segmentation
- and more
Fig 1. Shortcut connections.
Fig 2. ResNet structure (152 layers).
He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition [J]. 2015: 770-778.

FractalNet

Inception V4 (2016)
Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-resnet and the impact of residual connections on learning [J]. arXiv preprint arXiv:1602.07261, 2016.

Inception-ResNet
He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition [J]. 2015: 770-778.

Comparison

SqueezeNet
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size

Xception

R-CNN (2014)
R-CNN: Region proposals + CNN
- Region proposals: Selective Search [1]
- Resize the region proposals: warp all region proposals to the required size (227*227, the AlexNet input)
- Compute CNN features: extract a 4096-dimensional feature vector from each region proposal using AlexNet
- Classify: train a linear SVM classifier for each class
[1] Uijlings J R R, Sande K E A V D, Gevers T, et al. Selective Search for Object Recognition [J]. International Journal of Computer Vision, 2013, 104(2): 154-171.
[2] Girshick R, Donahue J, Darrell T, et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation [J]. 2014: 580-587.

SPP-Net (Spatial pyramid pooling network, 2015)
Fig 1. Top: a conventional CNN. Bottom: spatial pyramid pooling network structure.
Fig 2. A network structure with a spatial pyramid pooling layer.
Advantages:
- Computes the feature map of the entire image once, which saves much time.
- Outputs a fixed-length feature vector for inputs of arbitrary size.
- Extracts features at different scales and can express more spatial information.
The SPP-Net method computes a convolutional feature map for the entire input image and then classifies each object proposal using a feature vector extracted from the shared feature map.
He K, Zhang X, Ren S, et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2015, 37(9): 1904-1916.

Fast R-CNN (2015)
A Fast R-CNN network takes an entire image and a set of object proposals as input. The network processes the entire image with several convolutional (conv) and max pooling layers to produce a conv feature map. For each object proposal, a region of interest (RoI) pooling layer extracts a fixed-length feature vector from the feature map. Each feature vector is fed into a sequence of fully connected layers that finally branch into two sibling output layers.
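The key property of RoI pooling described above, a fixed-length output for proposals of any size, can be sketched as follows. This is a simplified single-channel illustration, not the reference implementation: the function name `roi_max_pool`, the 2×2 output grid, and the bin-edge rounding are assumptions, and the region is assumed to be non-empty.

```python
import numpy as np


def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one region of interest to a fixed-size grid.

    feature_map: 2-D array (H, W) of conv features.
    roi: (x0, y0, x1, y1) in feature-map cells, end-exclusive, non-empty.
    """
    x0, y0, x1, y1 = roi
    region = feature_map[y0:y1, x0:x1]
    out_h, out_w = output_size
    # Bin edges that split the region into out_h x out_w roughly equal cells.
    ys = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    xs = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.empty(output_size, dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Guard against empty bins when the region is smaller than the grid.
            y_end = max(ys[i + 1], ys[i] + 1)
            x_end = max(xs[j + 1], xs[j] + 1)
            pooled[i, j] = region[ys[i]:y_end, xs[j]:x_end].max()
    return pooled


if __name__ == "__main__":
    fmap = np.arange(48).reshape(6, 8)
    # Two proposals of different sizes give outputs of the same shape.
    print(roi_max_pool(fmap, (1, 1, 7, 5)))
    print(roi_max_pool(fmap, (0, 0, 3, 3)))
```

Because every proposal is reduced to the same grid, the flattened result can be fed to the fully connected layers regardless of the proposal's original size.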
Girshick R. Fast R-CNN [C] // Proceedings of the IEEE International Conference on Computer Vision. 2015: 1440-1448.

Faster R-CNN (2015)
Faster R-CNN = RPN + Fast R-CNN
A Region Proposal Network (RPN) takes an image (of any size) as input and outputs a set of rectangular object proposals, each with an objectness score.
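The slide does not say how the rectangular proposals are parameterized. In the Faster R-CNN paper they are predicted relative to reference boxes called anchors, placed at every position of the conv feature map. The sketch below only generates such anchors; the stride of 16 and the three scales and three aspect ratios are assumptions based on the defaults described in that paper, and `make_anchors` is a hypothetical helper name.

```python
import numpy as np


def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as an array of (x0, y0, x1, y1) in image pixels.

    One anchor per (position, scale, ratio); ratio is height / width.
    """
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            # Centre of this feature-map cell in image coordinates.
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for s in scales:
                for r in ratios:
                    # Keep the area at s*s while setting height/width to r.
                    w = s / np.sqrt(r)
                    h = s * np.sqrt(r)
                    anchors.append([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2])
    return np.array(anchors)


if __name__ == "__main__":
    boxes = make_anchors(2, 3)
    print(boxes.shape)  # 2 * 3 positions, 9 anchors each
```

In the full method, the RPN would score each anchor for objectness and regress box offsets; that part is omitted here.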
Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks [C] // Advances in Neural Information Processing Systems. 2015: 91-99.
Figure 1. Faster R-CNN is a single, unified network for object detection.
Figure 2. Region Proposal Network (RPN).

Training tricks
- Data augmentation
- Dropout
- ReLU
- Batch normalization

Data Augmentation
- rotation
- flip
- zoom
- shift
- scale
- contrast
- noise disturbance
- color
- ...

Dropout (2012)
Dropout consists of setting to zero the output of each hidden neuron with probability p. The neurons which are "dropped out" in this way do not contribute to the forward pass and do not participate in backpropagation.

ReLU (Rectified Linear Unit)
Advantages:
- Rectified (negative inputs map to zero)
- Simplified calculation
- Avoids the vanishing gradient problem

Batch Normalization (2015)
A normalization layer is inserted at the input of each layer of the network. For a layer with d-dimensional input x = (x^(1) ... x^(d)), each dimension is normalized:
    \hat{x}^{(k)} = \frac{x^{(k)} - \mathrm{E}[x^{(k)}]}{\sqrt{\mathrm{Var}[x^{(k)}]}}
Motivation: internal covariate shift.
Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift [J]. arXiv preprint arXiv:1502.03167, 2015.
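Three of the training tricks above (ReLU, Dropout, Batch Normalization) are small enough to sketch in NumPy. This is an illustration under stated assumptions, not the cited papers' code: the dropout uses "inverted" scaling by 1/(1-p) so that no rescaling is needed at test time, the batch normalization shows only the training-mode computation with a learnable scale `gamma` and shift `beta`, and `eps` is a small constant added for numerical stability.

```python
import numpy as np


def relu(x):
    """Rectified linear unit: max(0, x) element-wise."""
    return np.maximum(0.0, x)


def dropout(x, p, rng, training=True):
    """Zero each activation with probability p during training.

    Assumption: surviving activations are scaled by 1 / (1 - p)
    ("inverted dropout"); the slide does not specify a scaling scheme.
    """
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)


def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each of the d input dimensions over the batch axis."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch = rng.normal(loc=3.0, scale=2.0, size=(64, 4))
    hidden = relu(batch_norm(batch))
    print(dropout(hidden, p=0.5, rng=rng).shape)
```

After `batch_norm`, each column of the batch has approximately zero mean and unit variance, which is the per-dimension normalization the slide describes.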

Application in Aesthetic Image Evaluation
- Dong Z, Shen X, Li H, et al. Photo Quality Assessment with DCNN that Understands Image Well [M] // MultiMedia Modeling. Springer International Publishing, 2015: 524-535.
- Lu X, Lin Z, Jin H, et al. Rating image aesthetics using deep learning [J]. IEEE Transactions on Multimedia, 2015, 17(11): 2021-2034.
- Wang W, Zhao M, Wang L, et al. A multi-scene deep learning model for image aesthetic evaluation [J]. Signal Processing Image Communication, 2016, 47: 511-518.

Photo Quality Assessment with DCNN that Understands Image Well
[Figure: pipeline. Labels: DCNN_Aesth (well-trained network, two-class SVM classifier); DCNN_Aesth_SP (original images, segmented images, spatial pyramid); datasets: ImageNet, CUHK, AVA]
Dong Z, Shen X, Li H, et al. Photo Quality Assessment with DCNN that Understands Image Well [M] // MultiMedia Modeling. Springer International Publishing, 2015: 524-535.

Rating image aesthetics using deep learning
- Supports heterogeneous inputs, i.e., global and local views.
- All parameters in the DCNN are jointly trained.
- Enables the network to judge image aesthetics while simultaneously considering both the global and local views of an image.
Fig 1. Global views and local views of an image.
Fig 2. SCNN architecture.
Fig 3. DCNN architecture.
Lu X, Lin Z, Jin H, et al. Rating image aesthetics using deep learning [J]. IEEE Transactions on Multimedia, 2015, 17(11): 2021-2034.

A multi-scene deep learning model for image aesthetic evaluation
- Design a scene convolutional layer consisting of multi-group descriptors in the network.
- Design a pre-training procedure to initialize the model.
Fig 1. The architecture of the multi-scene deep learning model (MSDLM).
Fig 2. The overview of the proposed MSDLM.
Architecture of MSDLM: 4 convolutional layers + 1 scene convolutional layer + 3 fully connected layers
Wang W, Zhao M, Wang L, et al. A multi-scene deep learning model for image aesthetic evaluation [J]. Signal Processing Image Communication, 2016, 47: 511-518.

Example - Load the dataset

    import gzip
    import os
    import pickle
    from urllib.request import urlretrieve

    import numpy as np


    def load_dataset():
        url = '/data/mnist/mnist.pkl.gz'
        filename = 'E:/DeepLearning_Library/mnist.pkl.gz'
        # Download the archive only if it is not already on disk.
        if not os.path.exists(filename):
            print("Downloading MNIST dataset...")
            urlretrieve(url, filename)
        with gzip.open(filename, 'rb') as f:
            # encoding='latin1' lets Python 3 read a pickle written under Python 2.
            data = pickle.load(f, encoding='latin1')
        X_train, y_train = data[0]
        X_val, y_val = data[1]
        X_test, y_test = data[2]
        # Reshape flat vectors to (samples, channels, height, width).
        X_train = X_train.reshape((-1, 1, 28, 28))
        X_val = X_val.reshape((-1, 1, 28, 28))
        X_test = X_test.reshape((-1, 1, 28, 28))
        y_train = y_train.astype(np.uint8)
        y_val = y_val.astype(np.uint8)
        y_test = y_test.astype(np.uint8)
        return X_train, y_train, X_val, y_val, X_test, y_test

    import matplotlib.cm as cm
    import matplotlib.pyplot as plt

    X_train, y_train, X_val, y_val, X_test, y_test = load_dataset()
    # Show the first training digit.
    plt.imshow(X_train[0][0], cmap=cm.binary)

Example - Model

    import lasagne
    from lasagne import layers
    from lasagne.updates import nesterov_momentum
    # Assumption: NeuralNet is the wrapper class from nolearn.lasagne.
    from nolearn.lasagne import NeuralNet

    net1 = NeuralNet(
        layers=[('input', layers.InputLayer),
                ('conv2d1', layers.Conv2DLayer),
                ('maxpool1', layers.MaxPool2DLayer),
                ('conv2d2', layers.Conv2DLayer),
                ('maxpool2', layers.MaxPool2DLayer),
                ('dropout1', layers.DropoutLayer),
                ('dense', layers.DenseLayer),
                ('dropout2', layers.DropoutLayer),
                ('output', layers.DenseLayer),
                ],
        # input layer
        input_shape=(None, 1, 28, 28),
        # layer conv2d1
        conv2d1_num_filters=32,
        conv2d1_filter_size=(5, 5),
        conv2d1_nonlinearity=lasagne.nonlinearities.rectify,
        conv2d1_W=lasagne.init.GlorotUniform(),
        # layer maxpool1
        maxpool1_pool_size=(2, 2),
        # layer conv2d2
        conv2d2_num_filters=32,
        conv2d2_filter_size=(5, 5),
        conv2d2_nonlinearity=lasagne.nonlinearities.rectify,
        # layer maxpool2
        maxpool2_pool_size=(2, 2),
        # dropout1
        dropout1_p=0.5,
        # dense, i.e. fully connected layer
        dense_num_units=256,
        dense_nonlinearity=lasagne.nonlinearities.rectify,
        # dropout2
        dropout2_p=0.5,
        # output
        output_nonlinearity=lasagne.nonlinearities.softmax,
        output_num_units=10,
        # optimization method params
        update=nesterov_momentum,
        update_learning_rate=0.01,
        update_momentum=0.9,
        max_ep