Deep Learning Is Changing the World

- Applications: self-driving, surveillance detection, medical diagnostics, games, personal assistants, art
- Techniques: image recognition, speech recognition, natural language, generative models, reinforcement learning

[Figure: example image labels: cat, dog, honey badger.]
[Figure: image classifier with labels Cat, Dog, Raccoon and a loss; five numbered annotations (1-5).]
What drives deep learning:

1. Massive (labeled) data, e.g., 14M images
2. Advances in deep learning algorithms, languages, and frameworks
3. Computing power

Deep learning + systems advances: programming languages, optimization, computer architecture, parallel computing, and distributed systems.

E.g., image classification problem:

- MNIST: 60K samples, 10 categories
- ImageNet: 16M samples, 1000 categories
- Web Images: billions of images, open categories

Test error rate (%):

- LeNet: convolution, max-pooling, softmax, 1998
- AlexNet, 16.4%: ReLU, Dropout, 2012
- Inception, 6.7%: batch normalization, 2015
- ResNet, 3.57%: residual connections, 2015
- EfficientNet, 3.1%: NAS

Application areas: image recognition, speech recognition, natural language, reinforcement learning.

Performance (Op/Sec):

- ENIAC: 5 Kops
- Xeon E5: ~500 Gops
- TPUv1: 90 Tops
- V100: 125 Tops
- TPUv3: 360 Tops

[Figure: performance (Op/Sec) from 1970 to 2019 for CPU, GPU, and TPU (dedicated hardware) against Moore's law; annotated gains of 10^8x and 10^5x.]

Deep learning frameworks provide easier ways to leverage various libraries:

- Custom-purpose machine learning algorithms: Theano, DisBelief, Caffe
- Deep learning frameworks: MxNet, CNTK, PyTorch
- Language frontend: Swift for TensorFlow
- Compiler backend: TVM, TensorFlow XLA
- Algebra & linear libs: CPU, GPU
- Dense matmul engine: GPU, FPGA
- Special AI accelerators: TPU, GraphCore, other ASICs

Machine Learning Language and Compiler:

- A full-featured programming language for ML: expressive and flexible; control flow, recursion, sparsity
- Powerful compiler infrastructure: code optimization, sparsity optimization, hardware targeting
- SIMD → MIMD, sparsity support, control flow and dynamicity, associated memory

Layers of the system stack:

- End-to-end AI user experiences: model, algorithm, pipeline, experiment, life cycle management
- Programming interfaces: computation graph, (auto) gradient calculation
- IR, compiler infrastructure
- Hardware APIs (GPU, CPU, FPGA, ASIC)
- Resource management/scheduler
- Scalable network stack (RDMA, IB, NVLink)
- Deep learning runtime: optimizer, planner, executor
- Architecture (single node and cloud)
Course topics (class 3 to class 8):

- The broader AI systems ecosystem: new machine learning paradigms (RL), automated machine learning (AutoML), security and privacy, model inference, compression, and optimization
- Deep learning algorithms and frameworks: general AI algorithm support and evolution, deep neural network compiler architecture and optimization
- Deep learning job execution and optimization environment
- General resource management and scheduling system
- New hardware and the associated high-performance network and compute stack

Define the network structure:

- Fully connected: the last few layers
- Convolutional neural network: data with strong locality
- Recurrent neural network: data such as text and knowledge graphs
- Transformer neural network: e.g., text

(2) Start training

    # A recursive TreeBank model in a dozen lines of JPL code
    # Walk the tree, accumulating embedding vecs
    # Word embedding model is used at the leaf node to map word
    # index into high-dimensional semantic word representation.
    # Get semantic representations for left and right children.
    # A composition function is used to learn semantic
    # representation for phrase at the internal node.
    # Map tree embedding to sentiment
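The slide keeps only the comments of the recursive TreeBank model; the code itself is not in the source. The steps those comments describe can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the original JPL program: the sizes, the random weights, the `tanh` composition function, and the names `embed` and `sentiment` are all hypothetical.

```python
import math
import random

random.seed(0)
VOCAB, DIM, CLASSES = 10, 4, 2  # hypothetical sizes, chosen for illustration


def rand_matrix(rows, cols):
    """Small random weights; a trained model would learn these instead."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Matrix-vector product on plain lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


E = rand_matrix(VOCAB, DIM)      # word embedding table, one row per word index
W = rand_matrix(DIM, 2 * DIM)    # composition weights used at internal nodes
U = rand_matrix(CLASSES, DIM)    # maps a tree embedding to sentiment logits


def embed(tree):
    """Walk the tree and return its embedding vector.

    A leaf is an int word index; an internal node is a (left, right) tuple.
    """
    if isinstance(tree, int):
        return E[tree]                       # leaf: look up the word embedding
    left, right = tree
    children = embed(left) + embed(right)    # concatenate the child embeddings
    return [math.tanh(h) for h in matvec(W, children)]  # assumed composition


def sentiment(tree):
    """Map the tree embedding to a probability per sentiment class (softmax)."""
    logits = matvec(U, embed(tree))
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = sentiment(((1, 2), (3, (4, 5))))
print(probs)
```

The recursion depth follows the shape of each input tree, which is the kind of data-dependent control flow that a static graph of fixed layers does not express directly.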
More diverse structures, more complex dependencies, finer-grained computation patterns.

Layers of a deep learning framework:

- Front-end: language binding: Python, Lua, R, C++
- Graph definition (IR): e.g., nodes x, w, *, b, +, y
- Optimization: batching, cache, overlap
- Execution runtime: CPU, GPU, RDMA devices

Data-Flow Graph (DFG) as Intermediate Representation
[Figure: TensorFlow data-flow graph (DFG) with inputs x, y, z, operators *, +, Σ, and results a, b, c. Adding gradient backpropagation extends the graph with gradient nodes for x, y, z, a, b, and c; operators are lowered to CPU code and GPU code.]
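How a framework adds gradient backpropagation to a data-flow graph can be sketched with a few lines of plain Python. The wiring below (a = x * y, b = a + z, c = Σ b) is an assumption read off the node names in the figure, and the `Node`, `forward`, and `backward` names are hypothetical; real frameworks build explicit gradient operators in the graph rather than looping over Python lists.

```python
class Node:
    """One vertex of the data-flow graph: an input or the result of an operator."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op                # "input", "mul", "add", or "sum"
        self.inputs = list(inputs)
        self.value = value          # list of floats, filled in by forward()
        self.grad = None            # d(output)/d(this node), filled in by backward()


def forward(order):
    """Evaluate the nodes in topological order."""
    for n in order:
        if n.op == "mul":
            lhs, rhs = (i.value for i in n.inputs)
            n.value = [p * q for p, q in zip(lhs, rhs)]
        elif n.op == "add":
            lhs, rhs = (i.value for i in n.inputs)
            n.value = [p + q for p, q in zip(lhs, rhs)]
        elif n.op == "sum":
            n.value = [sum(n.inputs[0].value)]


def backward(order):
    """Propagate gradients from the last node back to the inputs."""
    for n in order:
        n.grad = [0.0] * len(n.value)
    order[-1].grad = [1.0]          # d(output)/d(output) = 1
    for n in reversed(order):
        if n.op == "mul":
            lhs, rhs = n.inputs
            for k, g in enumerate(n.grad):
                lhs.grad[k] += g * rhs.value[k]
                rhs.grad[k] += g * lhs.value[k]
        elif n.op == "add":
            for i in n.inputs:
                for k, g in enumerate(n.grad):
                    i.grad[k] += g
        elif n.op == "sum":
            src = n.inputs[0]
            for k in range(len(src.grad)):
                src.grad[k] += n.grad[0]


x = Node("input", value=[1.0, 2.0])
y = Node("input", value=[3.0, 4.0])
z = Node("input", value=[5.0, 6.0])
a = Node("mul", [x, y])
b = Node("add", [a, z])
c = Node("sum", [b])
order = [x, y, z, a, b, c]          # a valid topological order of the graph

forward(order)
backward(order)
print(c.value, x.grad, y.grad, z.grad)
# prints: [22.0] [3.0, 4.0] [1.0, 2.0] [1.0, 1.0]
```

Because the backward pass only needs each node's operator, inputs, and stored forward values, the same graph can serve as the intermediate representation for both evaluation and differentiation.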
Frameworks Architecture

- IDE: programming with VS Code, Jupyter Notebook
- Language: integrated with mainstream PL: PyTorch and TensorFlow inside Python
- Compiler, intermediate representation:
  - Basic data structure: Tensor
  - Basic computation: DAG
  - Advanced features: control flow
  - General IRs: MLIR
- Compiler, compilation:
  - Lexical analysis: token
  - Parsing: AST
  - Semantic analysis: symbolic AD
  - Code optimization
  - Code generation
- Compiler, optimization:
  - User controlled: mini-batch
  - Data parallelism and model parallelism
  - Loop nest analysis: pipeline parallelism, control flow
  - Dataflow analysis: arithmetic, fusion
  - Hardware-dependent optimizations: matrix computation, layout
  - Resource allocation and scheduler: memory, recomputation
- Runtimes:
  - Single node: CuDNN
  - Multi-node: parameter servers, Allreduce
  - Computation cluster resource management and job scheduler
- Hardware:
  - Hardware accelerators: CPU/GPU/ASIC/FPGA
  - Network accelerators: RDMA/IB/NVLink

Deep learning frameworks: MxNet, TensorFlow, CNTK, PyTorch. Language frontend: Swift for TensorFlow. Compiler backend: TVM, TensorFlow XLA.
Machine Learning Language and Compiler

    import "tensorflow/core/framework/to";

    // Syntactically similar to LLVM:
    func @testFunction(%arg0: i32) {
      %x = call @thingToCall(%arg0) : (i32) -> i32
      br ^bb1
    ^bb1:
      %y = addi %x, %x : i32
      return %y : i32
    }

Deep learning depends heavily on data scale and model scale.
Image:

- 2012, AlexNet: 8 layers, 1.4 GFLOP, 16% error
- 2015, ResNet: 152 layers, [..] GFLOP, [..]% error

Speech:

- 2014, DeepSpeech1: 80 GFLOP, 7,000 hrs of data, 8% error
- 2015, DeepSpeech2: 465 GFLOP, 12,000 hrs of data, 5% error

Faster training speeds up the development of deep learning models. Deploying deep learning models at scale requires faster and more efficient inference: inference performance → serving latency.

- Different architectures: CNN, RNN, Transformer, …
- High computation resource requirements: model size, …
- Different goals: throughput, accuracy, …

The system should be transparent to various user requirements and apply over heterogeneous hardware environments.

Scale-out, local efficiency, memory effectiveness:

- Hardware: SSD, CPU/GPU/FPGA, InfiniBand/NVLink
- Hyper-params: optimizer, mini-batch, learning rate
- Optimizations
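The three hyper-parameters named above (optimizer, mini-batch, learning rate) can be seen together in a small sketch. It assumes plain SGD as the optimizer and a one-variable linear model on synthetic data; the function name `sgd_linear_fit`, the default values, and the data are hypothetical and chosen only for illustration.

```python
import random


def sgd_linear_fit(data, learning_rate=0.05, batch_size=4, epochs=200, seed=0):
    """Fit y = w * x + b with plain mini-batch SGD on mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)                      # new mini-batch split every epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the mean squared error over this mini-batch.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            # Optimizer step: plain SGD scaled by the learning rate.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Synthetic, noise-free data drawn from y = 2x + 1 (hypothetical example data).
points = [(x / 10, 2 * (x / 10) + 1) for x in range(-10, 11)]
w, b = sgd_linear_fit(points)
print(round(w, 3), round(b, 3))
```

With these settings the fit should approach w = 2 and b = 1. A larger `batch_size` means fewer, smoother updates per epoch, while `learning_rate` trades convergence speed against stability, which is why these values are usually tuned together with the hardware configuration.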