HPC High-Performance Computing Best Practices
Technology innovation, transforming the future
July 4, 2018

Contents
- HPC introduction
- ANSYS HPC software configuration
- ANSYS HPC hardware selection
- Application cases

HPC introduction
- What is HPC
- Benefits of HPC
- How HPC computing works
- ANSYS HPC speedup results

What is high-performance computing (HPC)?
High-performance computing generally refers to the practice of aggregating computing resources to deliver far more computing power than a single workstation, in order to solve scientific and engineering problems.

Benefits of HPC
- Higher fidelity
- More complex assemblies
- More nonlinearity taken into account
- More design scenarios verified
- More optimization analyses

How HPC computing works
- ANSYS HPC: a large computational problem is decomposed into subproblems that can be solved in parallel, and these are distributed across multiple compute cores (CPU or GPU). This makes full use of the available computing resources and speeds up the solution.
- ANSYS HPC Parametric Pack: each set of parameter values of a parametric model is assigned to a different compute node, so that models with different parameter settings are solved at the same time. It can be combined with HPC.

Speedup: HPC
Solver rating: the number of seconds in a day (86,400) divided by the number of seconds required to run the benchmark. A higher rating means faster performance.

Speedup: HPC Parametric Pack
Chart: computation time of a parametric model with different numbers of HPC Parametric Packs (HPC PP). Bars: 1 job; 4 jobs (1 HPC PP); 8 jobs (2 HPC PP); 16 jobs (3 HPC PP); each step adds 1 HPC PP. Series: Calculation, Geometry update. Data labels: 110,160; 29,460; 15,060; 8,220 and 2083; 2069; 2082; 2068. Axis range: 0 to 120,000.

HPC software configuration

Software configuration options
- HPC (per-process)
- HPC Pack
  - HPC product rewarding volume parallel processing for high-fidelity simulations
  - Each simulation consumes one or more Packs
  - Parallel enabled increases quickly with added Packs
- HPC Workgroup
  - HPC product rewarding volume parallel processing for increased simulation throughput, shared among engineers throughout a single location or the world
  - 16 to 32768 parallel, shared across any number of simulations on a single server
- HPC Parametric Pack
  - Enables simultaneous execution of multiple design points while consuming just one set of licenses
- Single HPC solution for FEA/CFD/FSI and any level of fidelity

HPC Packs per simulation | Parallel enabled (cores)
1 | 8 (12 instead of 8 at Release 19.0 and higher)
2 | 32
3 | 128
4 | 512
5 | 2048
6 | 8192
7 | 32768

ANSYS 19.0 new features
- More products are now using ANSYS HPC: standalone HPC licenses, HPC Packs and HPC Workgroup become more flexible and work across physics with all ANSYS Mechanical, Fluids and Electronics products*.
- 4 built-in HPCs now across all physics: 4 built-in HPCs are now included in Mechanical, Fluids and Electronics products, including ANSYS AIM and ANSYS Chemkin Enterprise.
- HPC Packs are now additive: HPC Packs are additive to the 4 built-in HPCs (e.g. 1 HPC Pack licenses 8 + 4 = 12 total cores, 2 HPC Packs license 32 + 4 = 36 total cores, etc.).

* Impacted products:
- ANSYS Mechanical Pro, Premium, Enterprise
- ANSYS CFD Premium and Enterprise
- ANSYS Mechanical CFD
- ANSYS HFSS
- ANSYS AIM
- ANSYS Q3D Extractor
- ANSYS Maxwell
- ANSYS Icepak
- ANSYS Mechanical CFD Maxwell 3D
- ANSYS Chemkin-Pro and Enterprise
- ANSYS Mechanical Maxwell 3D
- ANSYS SIwave

Note: the R19.0 license manager is required. For ANSYS Mechanical and Fluids products the changes are backward compatible; for ANSYS Electronics products the changes are compatible with version 19.0 and later.
Note: built-in HPCs are linked to a solver seat and cannot be shared with other solver seats!
Note: the single, standalone HPCs are not additive to the Packs.

Introduction to ANSYS HPC Parametric Pack
HPC license for running parametric FEA or CFD simulations on multiple CPU cores simultaneously, and more cost-effectively.

Key benefits
- Ability to automatically and simultaneously execute design points while consuming just one set of application licenses
- Scalable, because the number of simultaneous design points enabled increases quickly with added packs
- Amplifies the complete workflow, because design points can include execution of multiple applications (pre, meshing, solve, HPC, post)

Number of HPC Parametric Pack licenses | Number of simultaneous design points enabled
1 | 4
2 | 8
3 | 16
4 | 32
5 | 64

HPC Parametric Packs greatly shorten design time
HPC Parametric Packs amplify both solver licenses and HPC licenses, allowing you to drastically reduce time to innovation without the cost of additional solver or HPC licenses.
Diagram labels: sequential series of design points (dp1, dp2, dp3, dp4); unused cores; one solver key without HPC; four solver keys OR one solver key and one HPC Parametric Pack; + 4 HPC keys; 94% reduced time to innovation.

GPU acceleration
GPU acceleration can be enabled through all ANSYS HPC product licenses: ANSYS HPC, ANSYS HPC Pack and ANSYS HPC Workgroup.

Electronics products: 1 GPU is unlocked by every 8 HPC tasks.
- 4 HPC licenses enable 1 GPU through the available 8 HPC tasks
- 1 HPC Pack enables up to 12 CPU cores + 1 GPU through the available 12 HPC tasks
- 2 HPC Packs enable up to 36 CPU cores + 4 GPUs through the available 36 HPC tasks

Fluids / Structural products: 1 GPU requires 1 HPC task, as long as the number of GPUs does not exceed the number of CPU cores. Examples:
- 2 HPC licenses enable up to 3 CPU cores + 3 GPUs through the available 6 HPC tasks
- 1 HPC Pack enables up to 6 CPU cores + 6 GPUs through the available 12 HPC tasks
- 2 HPC Packs enable up to 18 CPU cores + 18 GPUs through the available 36 HPC tasks

Which configuration should I choose?
- HPC license cost decreases as more are purchased, either as HPC Packs or as HPC Workgroups.
- ANSYS HPC and ANSYS HPC Workgroup give flexible use of a pool of licenses.
- ANSYS HPC Pack gives "quick" scale-up but is more restrictive in how users can use it.
- The added flexibility is why the HPC Workgroup options cost more than the HPC Packs.
- HPC Parametric Pack enables more cost-effective licensing for design exploration and optimization.
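
The licensing arithmetic quoted on the software configuration slides can be sketched in a few lines of Python. This is only an illustration built from the numbers on the slides (the 4 built-in HPC tasks, the Pack series 8, 32, 128, ..., the GPU examples, and the Parametric Pack series 4, 8, 16, ...). The function names are hypothetical and do not belong to any ANSYS tool. The floor division for Electronics GPUs and the even split for Fluids / Structural are assumptions chosen because they reproduce the slide examples.

```python
from typing import Tuple

BUILT_IN_TASKS = 4  # built-in HPC tasks per solver seat, as stated for Release 19.0


def tasks_from_packs(packs: int) -> int:
    """HPC tasks available to one simulation that uses `packs` HPC Packs."""
    if not 1 <= packs <= 7:
        raise ValueError("the slides tabulate 1 to 7 HPC Packs per simulation")
    # Pack series from the slides: 8, 32, 128, ... (each extra Pack multiplies by 4),
    # plus the 4 built-in tasks because Packs are additive.
    return BUILT_IN_TASKS + 8 * 4 ** (packs - 1)


def tasks_from_standalone(licenses: int) -> int:
    """HPC tasks available with `licenses` standalone HPC licenses (no Packs)."""
    if licenses < 0:
        raise ValueError("license count cannot be negative")
    return BUILT_IN_TASKS + licenses


def electronics_gpus(tasks: int) -> int:
    """Electronics products: 1 GPU per 8 HPC tasks (assumed to round down)."""
    return tasks // 8


def fluids_structural_split(tasks: int) -> Tuple[int, int]:
    """Fluids / Structural: (CPU cores, GPUs) with the most GPUs allowed.

    Each GPU uses 1 task and GPUs may not outnumber CPU cores, so at most
    half of the tasks can go to GPUs.
    """
    gpus = tasks // 2
    return tasks - gpus, gpus


def design_points(parametric_packs: int) -> int:
    """Simultaneous design points for 1 to 5 HPC Parametric Packs: 4, 8, ..., 64."""
    if not 1 <= parametric_packs <= 5:
        raise ValueError("the slides tabulate 1 to 5 HPC Parametric Packs")
    return 4 * 2 ** (parametric_packs - 1)


if __name__ == "__main__":
    for packs in (1, 2):
        tasks = tasks_from_packs(packs)
        print(packs, tasks, electronics_gpus(tasks), fluids_structural_split(tasks))
```

Keeping Packs and standalone licenses in separate functions mirrors the note that standalone HPCs are not additive to the Packs.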

Summary: software configuration
- Multiple licensing options to fit different requirements.
- HPC Packs for quick scale-up.
- HPC Workgroup for flexibility.
- GPUs are treated the same as cores in the licensing model.
- As you scale up, license cost per core decreases, so per-core pricing becomes less of an issue.
- Running on 2,000 cores instead of 20 cores costs 1.5X, not 100X.
- Filling up a 1024-core instead of a 128-core cluster with 32-core jobs will cut the price per job in half!
- Enabling 64 instead of 4 simultaneous design points costs 3X, not 16X.

What hardware configuration should I choose?
- HDD vs. SSD
- SMP vs. DMP
- Interconnects?
- Clusters?
- CPUs?
- GPUs?

HPC hardware terminology
Diagram: Machine 1 (or Node 1) through Machine N (or Node N), each containing a GPU, Processor 1 (or Socket 1) and Processor 2 (or Socket 2), linked by an interconnect (GigE or InfiniBand).

Shared-memory parallel
Single Machine Parallel (SMP) systems share a single global memory image that may be distributed physically across multiple cores, but is globally addressable.
OpenMP is the industry standard.
Diagram: Machine 1 (or Node 1), Processor 1 (or Socket 1).

Distributed-memory parallel
Distributed memory parallel processing (DMP) assumes that the physical memory for each process is separate from all other processes.
Parallel processing on such a system requires some form of message-passing software to exchange data between the cores.
MPI (Message Passing Interface) is the industry standard for this.
Diagram: Machine 1 (or Node 1), Processor 1 (or Socket 1).

Effect of clock speed: ANSYS Mechanical
- Chart: effect of increased core operating frequencies on the DMP benchmarks running on 12 cores
- Influence is highest for sparse solver benchmarks
- Using higher clock speed is always helpful to realize productivity gains

Effect of memory bandwidth: is 24 cores equal to 24 cores?
- 3 x (2 x 4) = 24 cores: X5570, X5570, X5570
- 2 x (2 x 6) = 24 cores: X5670, X5670
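
A small sketch can make the "24 cores vs. 24 cores" comparison concrete. It reads the slide's expressions as machines x (sockets x cores per socket), which is an interpretation, and it assumes the same memory bandwidth per socket for both layouts. The value of 32.0 GB/s is a placeholder for illustration, not a measurement from the slides, and the class name `ClusterLayout` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterLayout:
    machines: int
    sockets_per_machine: int
    cores_per_socket: int

    @property
    def total_cores(self) -> int:
        return self.machines * self.sockets_per_machine * self.cores_per_socket

    @property
    def total_sockets(self) -> int:
        return self.machines * self.sockets_per_machine

    def bandwidth_per_core(self, socket_bandwidth_gb_s: float) -> float:
        """Share of one socket's memory bandwidth available to each core."""
        return socket_bandwidth_gb_s / self.cores_per_socket

    def aggregate_bandwidth(self, socket_bandwidth_gb_s: float) -> float:
        """Memory bandwidth summed over every socket in the layout."""
        return self.total_sockets * socket_bandwidth_gb_s


# Placeholder per-socket bandwidth (assumption), identical for both layouts.
SOCKET_BANDWIDTH_GB_S = 32.0

four_core_layout = ClusterLayout(machines=3, sockets_per_machine=2, cores_per_socket=4)
six_core_layout = ClusterLayout(machines=2, sockets_per_machine=2, cores_per_socket=6)

if __name__ == "__main__":
    for name, layout in (("3 x (2 x 4)", four_core_layout), ("2 x (2 x 6)", six_core_layout)):
        print(
            name,
            layout.total_cores,
            layout.bandwidth_per_core(SOCKET_BANDWIDTH_GB_S),
            layout.aggregate_bandwidth(SOCKET_BANDWIDTH_GB_S),
        )
```

Under these assumptions both layouts have 24 cores, but the layout with fewer cores per socket leaves each core a larger share of memory bandwidth.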
Consider memory per core!

Distributed-memory parallel outperforms shared-memory parallel
Chart: SMP vs. DMP. Speedup factor vs. number of cores for ANSYS Mechanical. Series: SMP, DMP.

GPU acceleration
ANSYS application examples

Effect of interconnect speed
- Need fast interconnects to feed fast processors
- Two main characteristics for each interconnect: latency and bandwidth
- Distributed ANSYS is highly bandwidth bound

  +- D I S T R I B U T E D   A N S Y S   S T A T I S T I C S -+
  Release: 14.5    Build: UP20120802    Platform: LINUX x64
  Date Run: 08/09/2012    Time: 23:07
  Processor Model: Intel(R) Xeon(R) CPU E5-2690 0 2.90GHz
  Total number of cores available: 32
  Number of physical cores available: 32
  Number of cores requested: 4 (Distributed Memory Parallel)
  MPI Type: INTELMPI

  Core   Machine Name   Working Directory
  0      hpclnxsmc00    /data1/ansyswork
  1      hpclnxsmc00    /data1/ansyswork
  2      hpclnxsmc01    /data1/ansyswork
  3      hpclnxsmc01    /data1/ansyswork

  Latency time from master to core 1 = 1.171 microseconds
  Latency time from master to core 2 = 2.251 microseconds
  Latency time from master to core 3 = 2.225 microseconds

  Communication speed from master to core 1 = 7934.49 MB/sec   (same machine)
  Communication speed from master to core 2 = 3011.09 MB/sec   (QDR InfiniBand)
  Communication speed from master to core 3 = 3235.00 MB/sec   (QDR InfiniBand)

Effect of interconnect speed: ANSYS Mechanical
For ANSYS Mechanical, GigE does not scale to more than 1 node!
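
To see how latency and bandwidth combine, a common first-order estimate of the time to send one message is latency plus message size divided by bandwidth. The sketch below applies that simple model to the latency and communication-speed figures reported in the Distributed ANSYS statistics above. The model is a simplification rather than something ANSYS reports, the function name is hypothetical, and 1 MB is assumed to mean 10^6 bytes.

```python
def transfer_time_us(message_bytes: float, latency_us: float, bandwidth_mb_per_s: float) -> float:
    """Estimated one-way transfer time in microseconds.

    With bandwidth in MB/s (1 MB = 1e6 bytes, an assumption), one byte takes
    1 / bandwidth_mb_per_s microseconds, so the units work out directly.
    """
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_us + message_bytes / bandwidth_mb_per_s


# Figures taken from the statistics block: master to core 1 and master to core 2.
SAME_MACHINE = {"latency_us": 1.171, "bandwidth_mb_per_s": 7934.49}
QDR_INFINIBAND = {"latency_us": 2.251, "bandwidth_mb_per_s": 3011.09}

if __name__ == "__main__":
    for size in (8, 1_000, 1_000_000):  # message sizes in bytes, chosen for illustration
        local = transfer_time_us(size, **SAME_MACHINE)
        remote = transfer_time_us(size, **QDR_INFINIBAND)
        print(f"{size:>9} bytes: same machine {local:.3f} us, QDR InfiniBand {remote:.3f} us")
```

In this model small messages are dominated by latency and large messages by bandwidth.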

Effect of interconnect speed: ANSYS Mechanical
V13sp-5 model:
- Turbine geometry
- 2,100 K DOF
- SOLID187 FEs
- Static, nonlinear
- One iteration
- Direct sparse
- Linux cluster (8 cores per node)
Chart: Interconnect Performance. Rating (runs/day) at 8, 16, 32, 64 and 128 cores. Series: Gigabit Ethernet, DDR InfiniBand.
Using faster interconnects can be helpful to realize productivity gains, particularly at higher core/node counts.

Effect of storage speed: ANSYS Mechanical
- Need fast hard drives to feed fast processors
  - Check the bandwidth specs
- ANSYS Mechanical can be highly I/O bandwidth bound
  - Sparse solver in the out-of-core memory mode does lots of I/O
- Distributed ANSYS can be highly I/O latency bound
  - Seek time to read/write each set of files causes overhead
- Consider SSDs
  - High bandwidth and extremely low seek times
- Consider RAID configurations
  - RAID 0 for speed
  - RAID 1, 5 for redundancy
  - RAID 10 for speed and redundancy

Effect of storage speed: ANSYS Mechanical 18.1
- When the working directory is assigned to the Z Turbo Drive G2 and BMT models for the CG solver are run on more than 16 cores, the job speeds up by 1.4 times.
- When the working directory is assigned to the Z Turbo Drive G2 and BMT models for the SPARSE solver are run on more than 16 cores, the job speeds up by 1.8 to 2.6 times.
Chart labels (higher is better): 1.4x, 1.4x, 1.4x; 2.6x, 2.1x, 1.8x.
Hardware configuration: HP Z840 workstation with dual E5-2699v4 (2.2 GHz), 128 GB of 2400 MHz memory.
Optional storage: Micron SATA SSD (no RAID) or HP Z Turbo Drive G2 512 GB (no RAID).

Effect of storage speed: ANSYS Mechanical
Chart: Rating vs. number of cores.
Using faster disks can be helpful to realize productivity gains, particularly at higher core/node counts.

- Clock speed
- Memory bandwidth
- Interconnect speed
- GPU acceleration
- Storage speed: I/O is very important for the Mechanical solver
Raid 0 mandator
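
The ratings and speedup factors quoted in the storage and interconnect charts follow from the solver rating definition given earlier in the deck (seconds in a day divided by benchmark run time). A minimal sketch is shown below. The function names are hypothetical, and the two run times are invented for illustration; they are not measurements from the slides.

```python
SECONDS_PER_DAY = 86_400


def solver_rating(run_seconds: float) -> float:
    """Rating in runs per day: seconds in a day divided by the benchmark run time."""
    if run_seconds <= 0:
        raise ValueError("run time must be positive")
    return SECONDS_PER_DAY / run_seconds


def speedup(baseline_seconds: float, faster_seconds: float) -> float:
    """Ratio of ratings, which is the same as baseline time over faster time."""
    return solver_rating(faster_seconds) / solver_rating(baseline_seconds)


if __name__ == "__main__":
    # Hypothetical run times for the same benchmark on two storage options.
    slower_disk_seconds = 1800.0
    faster_disk_seconds = 1000.0
    print(f"rating on slower disk: {solver_rating(slower_disk_seconds):.1f} runs/day")
    print(f"rating on faster disk: {solver_rating(faster_disk_seconds):.1f} runs/day")
    print(f"speedup: {speedup(slower_disk_seconds, faster_disk_seconds):.2f}x")
```

Because the rating is inversely proportional to run time, a speedup factor can be read either as a ratio of ratings or as a ratio of run times.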