What Every Technologist Should Know About AI and Deep Learning
Alex McDonald
Standards & Industry Associations, NetApp Inc.

The information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. NetApp makes no warranties, expressed or implied, on future functionality and timeline. The development, release, and timing of any features or functionality described for NetApp's products remain at the sole discretion of NetApp. NetApp's strategy and possible future developments, products and/or platform directions and functionality are all subject to change without notice. NetApp has no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein.
Why is AI & Deep Learning Important?
- AI and deep learning are disrupting every industry.
- For decades, AI was all about improving algorithms; now the focus is on putting AI to practical use.
- It is critical to leverage well-engineered systems.

This talk will:
- Take you on a broad, coherent tour of deep learning systems.
- Help you appreciate the role well-engineered systems play in AI disruption.
- Take you a step closer to being a unicorn: Systems + AI, something highly desirable and difficult to obtain.
Agenda
- AI Primer
- AI Stack Overview
- Deep Learning Process: Training, Inference
- Deep Learning Systems: Hardware, Software, Datasets and Dataflow
- Future of Systems
Background
AI or ML or DL?
- AI: a program that imitates human intelligence.
- ML: a program that learns with experience (i.e., data).
- DL: ML using more than one hidden layer of a neural network.
Deep Learning 101
Basic concepts and terminology:
- Neuron: a computational unit.
- DL model == the type & structure of the network.
- More layers => better capture of the features in the dataset, and (normally) better performance at the task.
- Parameters/weights: the values the network learns; they live in the layers of neurons.
- Training: building a model from a dataset. Forward propagation computes an output (e.g., labeling an image "mountain"), backpropagation adjusts the weights, and the cycle repeats.
- Epoch: one pass over the entire dataset.
- Batch size: the size of the chunk of data processed per step.
- Preprocessing/preparation: readying the data for training.
- Inference: using a trained model on new data; a single forward propagation pass.

State-of-the-art DL is large scale:
- 100s of layers, millions of parameters.
- 100s of GBs to TBs of data.
- Hours or days to train.

A minimal code sketch tying these terms together follows below.
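To make the terminology concrete, here is a minimal sketch in TensorFlow/Keras; the tiny model and the synthetic data are hypothetical stand-ins, not from the talk:

```python
# A minimal sketch tying the DL 101 terminology to code, using the Keras
# API bundled with TensorFlow. Model and data are hypothetical.
import numpy as np
import tensorflow as tf

# Synthetic dataset: 1000 samples, 32 features, 10 classes.
x = np.random.rand(1000, 32).astype("float32")
y = np.random.randint(0, 10, size=(1000,))

# Model == type & structure: layers of neurons whose weights are the parameters.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Training: forward propagation + backpropagation, repeated.
# epochs=5      -> five passes over the entire dataset.
# batch_size=32 -> weights are updated after each 32-sample chunk.
model.fit(x, y, epochs=5, batch_size=32)

# Inference: a single forward pass over new data.
new_data = np.random.rand(3, 32).astype("float32")
print(model.predict(new_data))
```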
AI Stack Overview

AI Stack Layers
The stack ranges from optimized hardware at the bottom up to AI PaaS and end-to-end solutions at the top.

Modern Compute:
- GPUs, TPUs, FPGAs.
- Optimized hardware that provides a tremendous speed-up for training, and sometimes for inference.
- More easily available for rent in the cloud.
Software:
- TensorFlow, PyTorch, Caffe2, MxNet, CNTK, Keras, Gluon.
- Libraries that implement the algorithms and provide an execution engine and programming APIs.
- Used to train and build sophisticated models, and to make predictions on new data with a trained model.
Platform:
- Laptops, cloud compute instances, H2O Deep Water, Spark DL pipelines.
- Hardware-accelerated platforms, supporting common software frameworks, that run the training and/or inference of deep neural networks.
- Typically optimized for a preferred software framework.
- Can be hosted on-premises or in the cloud.
- Also offered as a fully managed service (PaaS) by cloud vendors: Amazon SageMaker, Google Cloud ML, Azure ML.
API-based Service:
- Amazon Rekognition, Lex & Polly; Google Cloud API; Microsoft Cognitive Services.
- Allows query-based service access to generalizable, state-of-the-art AI models for common tasks. Example: send an image and get object tags as the result, send an mp3 and get the converted text as the result, and so on (see the sketch below).
- No dataset and no training of a model is required by the user.
- Per-call cost model.
- Integrated with cloud storage and/or bundled into end-to-end solutions and AI consultancy offerings like IBM Watson AI, ML & Cognitive consulting, Amazon's ML Solutions Lab, and Google's Advanced Solutions Lab.
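As an illustration of the per-call, no-training usage pattern, here is a hedged sketch calling Amazon Rekognition through boto3; the image file is hypothetical, and AWS credentials/region are assumed to be configured:

```python
# Hedged sketch of calling an API-based AI service (Amazon Rekognition via
# boto3). No dataset or model training is needed; each request is billed
# per call. "photo.jpg" is a hypothetical file.
import boto3

client = boto3.client("rekognition")  # assumes credentials/region configured

with open("photo.jpg", "rb") as f:
    image_bytes = f.read()

# Send an image, get object tags back.
response = client.detect_labels(Image={"Bytes": image_bytes}, MaxLabels=10)
for label in response["Labels"]:
    print(label["Name"], label["Confidence"])
```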
Deep Learning Process: Training and Inference

DL Process and Data Lifecycle
The DL lifecycle is very unlike traditional systems software development:
Gather Data -> Data Analysis, Transformation, Validation -> Model Training -> Model Evaluation, Validation, Tuning -> Model Serving, Monitoring

- Gathering and curating quality datasets, and making them accessible across the organization, is a major effort; diverse tools and flexible infrastructure are needed.
- Evaluation criteria are critical but hard to define, and comparing algorithms is not straightforward.
- Tracking artifacts such as dataset transformations, tuning history, model versions, and validation results matters more than tracking code.
- Debugging, interpretability, and fairness tooling is limited.
- There is tension/friction around data security and privacy, and with IT.
Deep Learning Training
Training: building a model from a dataset.
- Training is memory and compute bound: big datasets, complex math operations.
- It is highly parallelized/distributed, across cores and across machines (see the sketch after this list):
  - Partition the data, or the model, or both.
  - Scale up before scaling out.
  - Watch the communication-to-computation ratio.
  - Speed vs. accuracy tradeoff.
  - Federated learning.
- It leans on enhancements to data quality:
  - Augmentation, randomness.
  - Transformations.
  - Efficiently fitting data in memory.
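As a hedged sketch of data-parallel scale-up (one machine, multiple GPUs), TensorFlow's MirroredStrategy replicates the model and partitions each batch across devices; the tiny model and data are hypothetical placeholders:

```python
# Hedged sketch of data-parallel training across the GPUs of one machine
# (scale up) with tf.distribute.MirroredStrategy.
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()   # one replica per local GPU
print("Replicas:", strategy.num_replicas_in_sync)

with strategy.scope():                        # variables created here are mirrored
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(32,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

x = np.random.rand(1024, 32).astype("float32")
y = np.random.randint(0, 10, size=(1024,))
# Each global batch is split across replicas; gradients are combined with
# an all-reduce before the weights are updated.
model.fit(x, y, epochs=2, batch_size=256)
```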
Deep Learning Training (continued)
- Supervision: training relies on labeled data.
  - Transfer learning: start from a pre-trained model and train only a few layers (see the sketch below).
- Training involves a lot of hyperparameter tuning:
  - Examples: number of layers, number of neurons, batch size, ...
  - Multi-model training on the same dataset.
  - Trial-and-error search, which is easier to automate.
- Rise of AutoML:
  - Learns how to model the given data; no modeling/tuning expertise required.
  - Example: AmoebaNet beat ResNet at image classification.
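A hedged sketch of transfer learning with Keras follows; the choice of MobileNetV2 and the 5-class target task are assumptions for illustration:

```python
# Hedged sketch of transfer learning: reuse a pre-trained model and train
# only a few new layers on top of it.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False,      # drop the original classification head
    weights="imagenet",     # start from weights pre-trained on ImageNet
    pooling="avg",
)
base.trainable = False      # freeze the pre-trained layers

# Only the new head below is trained on the target task.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(5, activation="softmax"),  # hypothetical 5 classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(...) would now update just the new head's weights.
```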
Deep Learning Inference
Inference*: using a trained model on new data.
- Computationally simpler: a single forward pass.
- Typically a containerized RPC/web server with pre-installed DL software plus the neural network model (a minimal sketch follows below).
- Multiple inputs are batched for better throughput, but with much smaller batches than in training, to keep latency low.

*aka Model Serving, Deployment, Prediction
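Here is a minimal sketch of the web-server pattern using Flask; this is an assumption for illustration (production deployments often use TensorFlow Serving or similar), and "model.h5" is a hypothetical saved model file:

```python
# Minimal sketch of the "web server + pre-installed DL software + NN model"
# serving pattern. Flask and the file name are illustrative assumptions.
import numpy as np
import tensorflow as tf
from flask import Flask, jsonify, request

app = Flask(__name__)
model = tf.keras.models.load_model("model.h5")  # load once at startup

@app.route("/predict", methods=["POST"])
def predict():
    # Batch the request inputs for throughput; inference batches stay small
    # (unlike training batches) to keep latency low.
    inputs = np.array(request.get_json()["instances"], dtype="float32")
    outputs = model.predict(inputs)             # single forward pass
    return jsonify({"predictions": outputs.tolist()})

if __name__ == "__main__":
    app.run(port=8501)
```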
Deep Learning Inference (continued)
- DL models can be huge and may need hardware acceleration.
- On-device/edge inference is gaining traction:
  - Reasons: latency & privacy.
  - Needs special model optimizations (e.g., pruning) and on-device hardware.
- Portability and interoperability of models is important: train any way, deploy anywhere.
  - Example: ONNX is a step towards standardizing (see the sketch below).
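As a hedged illustration of the "train any way, deploy anywhere" idea, here is a sketch that exports a model to ONNX using PyTorch's built-in exporter; the tool choice and the tiny model are assumptions, not prescribed by the talk:

```python
# Hedged sketch of exporting a trained model to ONNX for portability.
# The tiny model is a hypothetical placeholder.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(32, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
)
model.eval()

dummy_input = torch.randn(1, 32)  # example input used to trace the graph
torch.onnx.export(model, dummy_input, "model.onnx")
# The .onnx file can then be served by any ONNX-compatible runtime.
```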
Deep Learning Systems: Hardware Acceleration and Software Frameworks

Role of CPU in AI
- CPUs are still used for ML training.
- CPUs are common for inference, including certain DL inference, but struggle to handle DL training.
- Data preprocessing is well suited to CPUs.
- Hybrid hardware, CPUs paired with other accelerators like GPUs, is common.
Hardware Acceleration for DL: GPU (Graphics Processing Unit)
- De facto hardware for AI training; also used for large-scale inference.
- GPU vs. CPU: many more cores, massive parallelization.
- Modern GPU architectures used for AI:
  - High-speed interconnect between CPU/GPUs (NVLink).
  - Bypass the CPU for communication (GPUDirect).
  - Efficient message passing (collective all-reduce).
- Available in the cloud (EC2 P* instances) and on-premises (DGX). A quick availability check is sketched below.

[Figure: CPU vs. GPU core layout. Image source: https://. Courtesy Daniel Whitenack.]
[Figure: data path without GPUDirect vs. with GPUDirect. Image source: https://]
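A quick, hedged way to confirm which accelerators a TensorFlow host actually exposes:

```python
# Hedged sketch: list the GPU accelerators TensorFlow can see on this host.
import tensorflow as tf

print(tf.config.list_physical_devices("GPU"))  # e.g. [PhysicalDevice(.../GPU:0)]
```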
Hardware Acceleration for DL: ASIC (Application-Specific Integrated Circuit)
- ASICs are designed to speed up DL operations, like Google's TPU (Tensor Processing Unit).
- High performance, but less flexible; economical only at large scale.
- Special optimizations in hardware, for example reduced precision and a matmul operator.
- Designs for inference differ from designs for training; for example, in 1st-generation TPUs, floating-point units were replaced by int8 units.

[Image source: /nips17/assets/slides/dean-nips17.pdf]
Hardware Acceleration for DL: FPGA (Field-Programmable Gate Array)
- Designed to be reconfigurable: flexibility to change as neural networks and new algorithms evolve.
- Offers much higher performance per watt than GPUs; cost-effective, and excels at inference.
- Reprogramming an FPGA is not easy (low-level languages).
- Available in the cloud (EC2 F1 instances).
Hardware Acceleration for On-device AI
- Primarily limited to inference only.
- Special SoC designs with reduced die space.
- Energy efficiency and memory efficiency are more critical.
- Special optimizations to support specific tasks only, for example speech-only or vision-only.
- Examples: Apple's Neural Engine, Huawei's NPU.
Software Frameworks
Frontend:
- Abstracts the mathematical and algorithmic implementation details of neural networks.
- Provides high-level building-block APIs to define neural network models over multiple backends.
- A high-level language library.

Backend:
- Hides hardware-specific programming APIs from the user.
- Optimizes and parallelizes the training and inference process to work efficiently on the hardware.
- Makes it easier to preprocess and prepare data for training.
- Supports multi-GPU, multi-node execution.

A small illustration of the frontend/backend split follows below.
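As a hedged sketch of the split: the high-level (frontend) code below is hardware-agnostic, and TensorFlow's backend places the same operation on CPU, GPU, or TPU without source changes:

```python
# Sketch of the frontend/backend split: the code is device-agnostic; the
# backend picks an optimized kernel for whatever hardware is present.
import tensorflow as tf

a = tf.random.normal((1024, 1024))
b = tf.random.normal((1024, 1024))
c = tf.matmul(a, b)   # backend selects the kernel and the device
print(c.device)       # e.g. ".../GPU:0" if a GPU is present, else a CPU
```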
Dataset & Data Flow (using TensorFlow as the reference)

Dataset Transformation: ImageNet Example
Raw data vs. TFRecords:
- Raw data is converted into a packed binary format for training called TFRecord (a one-time step); a sketch of the conversion follows below.
- The 1.2M image files are converted into 1024 TFRecords, with each TFRecord 100s of MB in size.

[Figure: two file-size histograms, frequency vs. size. Left: raw ImageNet data, sizes in KB. Right: TFRecord ImageNet data, sizes in MB, clustered around 126-161 MB.]
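Here is a hedged sketch of the one-time raw-data-to-TFRecord conversion step; the file names and the feature layout are hypothetical:

```python
# Hedged sketch of packing raw images into a TFRecord file (one-time step).
import tensorflow as tf

def image_example(image_bytes, label):
    # Pack one image + label into a tf.train.Example protocol buffer.
    feature = {
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

with tf.io.TFRecordWriter("train-0000.tfrecord") as writer:
    image_bytes = tf.io.read_file("mountain.jpg").numpy()  # hypothetical file
    writer.write(image_example(image_bytes, label=0).SerializeToString())
```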
TensorFlow Data Pipeline
- IO: read data (TFRecords) from persistent storage over the network.
- Prepare: use CPU cores to parse and preprocess the data; preprocessing includes shuffling, data transformations, batching, etc.
- Train: load the transformed data onto the accelerator devices (GPUs, TPUs) over PCIe/NVLink and execute the DL model.

[Figure: Read IO -> Prepare -> Train. TFRecords flow from storage over the network into host CPU/RAM, then over PCIe/NVLink to the GPUs.]

A sketch of this pipeline with the tf.data API follows below.
Compute Pipelining
- Without pipelining, the accelerator idles while the CPU prepares the next batch; with pipelining (using the prefetch API), the Prepare phase overlaps with the Train phase.

[Figure: execution timelines without pipelining vs. with pipelining (using the prefetch API). Image source: https:///guide/]

Parallelize the IO and Prepare Phases
- Parallelize Prepare; parallelize IO (see the sketch below).

[Figure: parallelized Prepare and parallelized IO timelines. Image source: https:///guide/]
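These knobs map onto the earlier pipeline sketch as follows (a hedged sketch; `parse` is the function defined above):

```python
# Hedged sketch: pipelining and parallelization applied to the tf.data
# pipeline from the previous sketch (reuses its parse() function).
import tensorflow as tf

files = tf.data.Dataset.list_files("train-*.tfrecord")
dataset = (
    files.interleave(                                     # parallelize IO
        tf.data.TFRecordDataset,
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    .map(parse, num_parallel_calls=tf.data.AUTOTUNE)      # parallelize Prepare
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)   # pipeline: overlap Prepare with Train
)
```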