Presentation Materials on Driving Innovation in Automotive Technology - Li Auto Autonomous Driving - 2024-04 - Autonomous Driving



The Convergence of Autonomous Driving and System 1 & 2 Thinking

Peng Jia

Li Auto, China

Contents

Li AD Overview

Li AD Technology Highlights

Li Auto's View on Autonomous Driving

Rule-driven L2: 2D/Mono3D
Data-driven L3: BEV/End2End
Knowledge-driven L4: VLM/World Model

[Diagram labels: Real World; Every Day Driving Scenarios; Unknown Scenarios; Expanded Driving Scenarios; Known Scenarios]

Li AD Framework

SYSTEM 1: Intuition & instinct
95%; Unconscious; Fast; Associative; Automatic pilot

SYSTEM 2: Rational thinking
Takes effort; Slow; Logical; Lazy; Indecisive

System 1 -- End-to-End Model for L3 AD

Fast end-to-end response to the surrounding environment.

System 2 -- Large Multimodal-Action Model

Explore and think logically in unknown environments. Modalities include language, vision, point clouds, CAN bus, and navigation, used to solve L4 unknown scenes.
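The split between a fast System 1 and a slower System 2 can be illustrated with a minimal sketch. The slides do not say how the two systems are arbitrated, so the confidence-threshold rule below is purely an assumption, and `dual_system_plan`, `fast_stub`, `slow_stub`, and the `familiarity` field are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Trajectory = Sequence[Tuple[float, float]]


@dataclass
class Plan:
    trajectory: Trajectory
    source: str  # "system1" (fast model) or "system2" (slow model)


def dual_system_plan(
    observation: dict,
    fast_model: Callable[[dict], Tuple[Trajectory, float]],
    slow_model: Callable[[dict], Optional[Trajectory]],
    confidence_threshold: float = 0.8,  # assumed value, not from the slides
) -> Plan:
    """Use the fast model by default; consult the slow model when it is unsure."""
    trajectory, confidence = fast_model(observation)
    if confidence >= confidence_threshold:
        return Plan(trajectory, "system1")
    slow_trajectory = slow_model(observation)
    if slow_trajectory is None:
        # The slow model produced nothing usable: keep the fast answer.
        return Plan(trajectory, "system1")
    return Plan(slow_trajectory, "system2")


# Hypothetical stand-ins for the two models.
def fast_stub(obs: dict) -> Tuple[Trajectory, float]:
    return [(0.0, 0.0), (1.0, 0.0)], obs["familiarity"]


def slow_stub(obs: dict) -> Optional[Trajectory]:
    return [(0.0, 0.0), (0.5, 0.0)]


known = dual_system_plan({"familiarity": 0.95}, fast_stub, slow_stub)
unknown = dual_system_plan({"familiarity": 0.30}, fast_stub, slow_stub)
print(known.source, unknown.source)
```

In this sketch the familiar scene stays on the fast path and the unfamiliar one is escalated; a real system would likely run both models concurrently rather than sequentially.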

System 1 & System 2 Training Loop

[Diagram labels: Perception; Decision & Planning; Control; Short-term Memory; Vehicle; L3 End-to-End Model; Sensors; L4 Multimodal LLM; Recognition; General Knowledge; Sim Reinforcement Learning Model; Evaluation Network; Generative World Model; Cloud]

Well-recognized works from the Li AD team

MUTR3D: 2021 world's 1st in camera-based 3D tracking
FUTR3D: industry-leading multi-sensor 3D detection model
DenseTNT: 1st place solution in the ICCV 2021 INTERPRET Challenge
DETR3D: 2021 world's 1st in camera-based 3D detection
HDMapNet: CVPR 2021 ADP3 Workshop (Best Paper Nomination)

CoRL 2021 DETR3D: /pdf/2110.06922.pdf
CVPR 2022 MUTR3D: /pdf/2205.00613.pdf
ICML 2023 VectorMapNet: /pdf/2206.08920.pdf
CVPR 2023 NPN: /pdf/2304.08481.pdf
CVPR 2023 FUTR3D: /pdf/2203.10642.pdf
ICRA 2022 HDMapNet: /pdf/2107.06307.pdf
CVPR 2023 ViP3D: /pdf/2208.01582.pdf

Architecture of AD Max 3.0

Sensors and inputs: Camera ×7; LiDAR; Radar; Navigation Map
Compute: NVIDIA DRIVE Orin ×2

[Diagram blocks: Safety Perception; Safety Planner; One Model (Multi-task Perception); Prediction & Planning Network; Shadow App 1; Shadow App 2; Shadow; Static BEV; Object BEV; Occupancy; MPC; ……; Spatio-Temporal Planner; End2End Traffic Signal Network]

Inference Perf Optimization

For Perception Pipeline: 111 ms / 9 fps => 48 ms / 21 fps

| Item | Optimization Action | Opt Type | Latency (percentage) |
|---|---|---|---|
| 1 | Apply MPS to avoid CUDA context switch overhead (system of multiple processes with GPU calls). | Pipeline | -9.91% |
| 2 | Remove unneeded CUDA calls (e.g., cudaWaitExternalSemaphore). | Pipeline | -5.60% |
| 3 | Enlarge CUDA_DEVICE_MAX_CONNECTIONS to resolve false dependency among CUDA streams. | Pipeline | -4.80% |
| 4 | Schedule model heads with different inference frequency. | Pipeline | -8.10% |
| 5 | Replace D2D with H2D copy in imaging streams to utilize GPU Copy Engine instead of CUDA core. | Pipeline | -3.25% |
| 6 | Optimize bevpoolv2 plugin by reducing warp divergence & fp16 (from Lidar_AI_Solution). | Model | -10.80% |
| 7 | Eliminate reformatting kernels due to unfused QAT nodes. | Model | -3.60% |
| 8 | Apply TRT MHA fast kernel to accelerate transformer blocks. | Model | -5.60% |
| 9 | Use Sparsity. | Model | -3.60% |
| 10 | Offload ROI-align to PVA. | Model | -1.80% |

Final perf improvement: -57.06%
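The numbers on this slide can be cross-checked with a few lines of arithmetic. The sketch below assumes that the per-item percentages are additive shares of the baseline latency (the slide does not state this explicitly) and compares their sum with the reduction implied by the 111 ms and 48 ms figures.

```python
# Per-item latency deltas in percent, copied from the optimization table.
pipeline_opts = [-9.91, -5.60, -4.80, -8.10, -3.25]
model_opts = [-10.80, -3.60, -5.60, -3.60, -1.80]

# Sum of all listed items, assuming they add up linearly.
total_pct = round(sum(pipeline_opts) + sum(model_opts), 2)


def fps(latency_ms: float) -> float:
    """Frames per second for a pipeline that takes latency_ms per frame."""
    return 1000.0 / latency_ms


before_ms, after_ms = 111.0, 48.0
# Reduction implied by the end-to-end latencies quoted on the slide.
measured_pct = (before_ms - after_ms) / before_ms * 100.0

print(f"sum of listed items: {total_pct:.2f}%")
print(f"reduction from 111 ms to 48 ms: -{measured_pct:.2f}%")
print(f"fps before: {fps(before_ms):.1f}, after: {fps(after_ms):.1f}")
```

The item sum reproduces the stated -57.06%, while the rounded end-to-end latencies imply roughly -56.8%; the small gap is consistent with 111 ms and 48 ms being rounded values.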

Li AD L4 Research - Drive-VLM

Pipeline: Sequence of Images -> DriveVLM (Vision Encoder, Feature Alignment) -> Scene Description -> Critical Objects Analysis -> Hierarchical Planning

Scene Description

Input:
Picture: <img>
Please indicate the scene condition.

Drive-VLM:
Weather: Sunny.
Time: Day.
Road Environment: Urban.
Alternative Lane: Own lane ahead impassable, occupied by parked vehicles, left lane passable, oncoming lane, proceed with caution.
Ego Lane Position: Do not apply.

Critical Objects Analysis

Input:
Please list the key objects.

Drive-VLM:
Critical objects: <Two construction workers>
Box: <(x1,y1),(x2,y2)>

Matching with 3D Perception Results

Input:
Matched objects: Two construction workers
The position and history trajectory of matched objects in BEV region: <…>
Unmatched objects: None
Describe the object and indicate its influence.

Drive-VLM:
Characteristic: Garbage cleaning in progress on the right lane ahead.
Influence: Blocking the route of our vehicle.

Hierarchical Planning

Input:
Ego-states and historical trajectory: <…>
Based on the analysis of scene and critical objects, determine the driving meta actions and decisions.

Input:
Based on the summarized analysis, please determine the meta actions and decisions.

Drive-VLM:
Meta Actions: ['Slow down', 'Stop', 'Reverse'].
Decisions: Drive slowly, stop in front of the construction workers, and then reverse.

Input:
Based on the above decisions and analysis, please indicate the 3 s future trajectory of the ego vehicle.

Drive-VLM:
Future trajectory: [(x1,y1),(x2,y2),…,(xn,yn)]

Dual System: Slow-Fast Collaboration

Traditional AV Pipeline: 3D Perception; Motion Prediction; Trajectory Planning

*Submitted to CVPR 24

https://openreview.net/forum?id=jL4YMzXYII
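The meta-action reply shown on this slide is free text, so a downstream planner would need to turn it into structured data. The following is a minimal parsing sketch, assuming the reply always follows the "Meta Actions: [...]" and "Decisions: ..." layout of the example; `PlanningOutput` and `parse_planning_output` are hypothetical names and not part of Drive-VLM.

```python
import ast
import re
from dataclasses import dataclass
from typing import List


@dataclass
class PlanningOutput:
    meta_actions: List[str]
    decision: str


def parse_planning_output(text: str) -> PlanningOutput:
    """Extract the meta-action list and the decision sentence from a reply."""
    actions_match = re.search(r"Meta\s*Actions:\s*(\[.*?\])", text, flags=re.S)
    decision_match = re.search(r"Decisions:\s*(.+)", text)
    if actions_match is None or decision_match is None:
        raise ValueError("reply does not contain both expected fields")
    # literal_eval only accepts Python literals, so arbitrary code is rejected.
    meta_actions = ast.literal_eval(actions_match.group(1))
    return PlanningOutput(
        meta_actions=[str(action) for action in meta_actions],
        decision=decision_match.group(1).strip(),
    )


reply = (
    "Meta Actions: ['Slow down', 'Stop', 'Reverse'].\n"
    "Decisions: Drive slowly, stop in front of the construction workers, "
    "and then reverse."
)
parsed = parse_planning_output(reply)
print(parsed.meta_actions)
print(parsed.decision)
```

Raising on a malformed reply, rather than returning an empty plan, keeps a parsing failure from being mistaken for a decision to take no action.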

LLM Deployed On NVIDIA DRIVE Orin

LLaMA2-3B (BS=1, Input_len=128, Output_len=128)

| Platform | Config | Context Latency (ms) | Decoder Perf (tokens/s) |
|---|---|---|---|
| Drive OS Linux, Orin | INT4 (GPTQ) | 52.5 | 65.6 |

LLaMA2-7B (BS=1, Input_len=128, Output_len=128)

| Platform | Config | Context Latency (ms) | Decoder Perf (tokens/s) |
|---|---|---|---|
| Drive OS Linux, Orin | INT4 (GPTQ) | 73.15 | 41.8 |
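These two figures per model can be combined into a rough end-to-end generation time. The sketch below reads the rows as 52.5 ms and 65.6 tokens/s for the 3B model and 73.15 ms and 41.8 tokens/s for the 7B model, and assumes the decoder rate is a steady per-token rate that applies to all 128 output tokens; neither assumption is confirmed by the slide.

```python
def estimated_generation_seconds(
    context_latency_ms: float,
    decode_tokens_per_s: float,
    output_len: int,
) -> float:
    """Context (prefill) time plus time to decode output_len tokens."""
    return context_latency_ms / 1000.0 + output_len / decode_tokens_per_s


OUTPUT_LEN = 128  # Output_len from the slide

llama2_3b_s = estimated_generation_seconds(52.5, 65.6, OUTPUT_LEN)
llama2_7b_s = estimated_generation_seconds(73.15, 41.8, OUTPUT_LEN)

print(f"3B: about {llama2_3b_s:.2f} s for {OUTPUT_LEN} tokens")
print(f"7B: about {llama2_7b_s:.2f} s for {OUTPUT_LEN} tokens")
```

Under these assumptions a full 128-token reply takes on the order of seconds, which illustrates why such a model would sit on the slow path rather than in the fast control loop.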

Li AD Sim Research - Street Gaussians

[Figure panels: Original scene; Street-gaussian swapping; Real log; Camera simulation; Origin scene; Linre scene; Unisim scene]

[Method diagram labels: Rendering Images; Decomposition; Semantic maps; Geometry model (Position μ, Rotation R, Opacity α, Scale S); Point-based Rendering; Background model; Composition; Dynamic appearance model (Time basis, SH basis); Optimizable Tracked boxes; Object model; Scene representation]

| Metric | 3DGS [16] | NSG [31] | MARS [51] | Ours |
|---|---|---|---|---|
| PSNR↑ | 29.95 | 30.23 | 31.37 | 34.54 |
| PSNR*↑ | 17.74 | 22.05 | 23.07 | 25.16 |
| SSIM↑ | 0.907 | 0.866 | 0.904 | 0.936 |
| LPIPS↓ | 0.140 | 0.331 | 0.246 | 0.091 |
| FPS↑ | 277 | 0.47 | 0.68 | 133 |

Table 1. Quantitative results on the Waymo [40] dataset. The rendering image resolution is 1066 × 1600. "PSNR*" denotes the PSNR of moving objects.

*Submitted to CVPR 24

https://openreview.net/forum?id=jL4YMzXYII
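For readers unfamiliar with the PSNR and PSNR* rows, the sketch below shows the standard PSNR definition, 10 · log10(MAX² / MSE), together with a masked variant restricted to selected pixels. Treating PSNR* as PSNR over a moving-object mask is an assumption based on the table caption; the exact evaluation protocol is not given on the slide, and the flat pixel lists are a simplification.

```python
import math
from typing import Optional, Sequence


def psnr(
    pred: Sequence[float],
    target: Sequence[float],
    max_val: float = 1.0,
    mask: Optional[Sequence[bool]] = None,
) -> float:
    """PSNR in dB over flat pixel sequences, optionally only where mask is True."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if mask is None:
        mask = [True] * len(pred)
    if len(mask) != len(pred):
        raise ValueError("mask must have the same length as the images")
    squared_errors = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    if not squared_errors:
        raise ValueError("mask selects no pixels")
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)


# Toy example: 16 pixels, with an error of 0.1 only on two "moving object" pixels.
target = [0.0] * 16
pred = [0.1, 0.1] + [0.0] * 14
moving = [True, True] + [False] * 14

full_psnr = psnr(pred, target)
moving_psnr = psnr(pred, target, mask=moving)
print(f"full image: {full_psnr:.2f} dB, moving pixels only: {moving_psnr:.2f} dB")
```

Because the error is concentrated on the masked pixels, the masked score is lower than the full-image score, which mirrors the gap between the PSNR and PSNR* rows in the table.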

Li AD Research - BEV-CLIP: Multimodal Data Retrieval

[Diagram labels: BEV Encoder; Weight matrix; Shared Cross-modal Prompt; example text "Peds crossing crosswalk, many cars……"; Language Encoder; Text embedding; LoRA; Weight matrix; KG Embedding; Knowledge graph; BEV Caption Generation Head; Contrastive loss; panels (a), (b), (c)]

Figure 2. Overall structure of BEV-CLIP.

(a) Processing of BEV and text features. The images from 6 surrounding cameras are encoded into a BEV feature by the BEV Encoder with frozen parameters. At the same time, the input text embedding is concatenated with the keyword-matched Knowledge Graph node embedding and fed into the Language Encoder with a LoRA branch for processing. (b) Shared cross-modal prompt (SCP), which aligns the BEV and linguistic features in the same hidden space. (c) Joint supervision of caption generation and retrieval tasks. ⊙ denotes dot product.
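The figure names a contrastive loss for the retrieval task but does not define it. The sketch below shows a CLIP-style symmetric contrastive (InfoNCE) loss over paired BEV and text embeddings as one plausible form; the choice of this loss, the temperature value, and the function names are assumptions, not details taken from BEV-CLIP.

```python
import math
from typing import List, Sequence


def _normalize(vec: Sequence[float]) -> List[float]:
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _diagonal_cross_entropy(logits: List[List[float]]) -> float:
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(
    bev_feats: Sequence[Sequence[float]],
    text_feats: Sequence[Sequence[float]],
    temperature: float = 0.07,  # assumed value
) -> float:
    """Average of BEV-to-text and text-to-BEV losses for paired embeddings."""
    bev = [_normalize(v) for v in bev_feats]
    txt = [_normalize(v) for v in text_feats]
    # Cosine similarity (dot product of unit vectors) scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(bv, tv)) / temperature for tv in txt]
        for bv in bev
    ]
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (_diagonal_cross_entropy(logits) + _diagonal_cross_entropy(logits_t))


# Toy batch: matched pairs give a low loss, mismatched pairs a high one.
bev_batch = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = symmetric_contrastive_loss(bev_batch, [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = symmetric_contrastive_loss(bev_batch, [[0.0, 1.0], [1.0, 0.0]])
print(f"aligned: {aligned_loss:.6f}, shuffled: {shuffled_loss:.3f}")
```

At retrieval time the same similarity matrix can be reused: ranking the columns of one row gives the best-matching text for a BEV sample, and ranking a column gives the best-matching scene for a text query.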
