BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20230124T171522Z
LOCATION:C155
DTSTART;TZID=America/Chicago:20221113T113700
DTEND;TZID=America/Chicago:20221113T120000
UID:submissions.supercomputing.org_SC22_sess428_ws_p3hpc109@linklings.com
SUMMARY:Performance Portability of Sparse Block Diagonal Matrix Multiple V
 ector Multiplications on GPUs
DESCRIPTION:Workshop\n\nPerformance Portability of Sparse Block Diagonal M
 atrix Multiple Vector Multiplications on GPUs\n\nIbrahim\, Yang\, Maris
 \n\nThe emergence of multiple accelerator-based computer architectures an
 d programming models makes it challenging to achieve performance portab
 ility for large-scale scientific simulation software. In this paper\, w
 e focus on a sparse block diagonal matrix multiple vector (SpMM) comput
 ational kernel and discuss techniques that can be used to achieve perfo
 rmance portability on NVIDIA- and AMD-based accelerators using CUDA\, H
 IP\, OpenACC\, and Kokkos. We show that performance portability can var
 y significantly across programming models\, GPU architectures\, and pro
 blem settings\, up to 52x in the explored problems. Our study visits th
 e performance portability aggregation metric to guide the development a
 nd the selection of performance-portable variants.\n\nSession Format: Re
 corded\n\nTag: Performance Portability\n\nRegistration Category: Worksho
 p Reg Pass
END:VEVENT
END:VCALENDAR