*This course is open only to students who intend to become team members or coaches for the Student Cluster Competition.


Teaching methods:
Discussion, lectures, hands-on labs, oral presentations

Course schedule:
Advanced MPI topics: MCA, architecture, UCX, RDMA
Cluster system monitoring and control: Prometheus, Grafana, ELK
Application benchmarks: IO500, MLPerf, OSU Micro-Benchmarks
Application profiling methodology: Amdahl's Law, Roofline model, TMA
Application profiling in practice: CPU perf, VTune, branch prediction, vector instructions, etc.
Profilers: GPU Nsight Systems, ucx_info, iostat, RDMA ib commands
Case study: introduction to AI (GPU) computing, parallelization and performance optimization, DeepSpeed, TensorRT, quantization
Power management: BIOS, IPMI, fan control

Grading:
Weekly progress reports: 40%
Lab work: 20%
Final project: 40%