BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20230124T171522Z
LOCATION:C143-149
DTSTART;TZID=America/Chicago:20221114T103500
DTEND;TZID=America/Chicago:20221114T104000
UID:submissions.supercomputing.org_SC22_sess439_ws_scsc115@linklings.com
SUMMARY:ReMPI: A Record-and-Replay Tool for Debugging Non-Deterministic MP
 I Applications
DESCRIPTION:Workshop\n\nReMPI: A Record-and-Replay Tool for Debugging No
 n-Deterministic MPI Applications\n\nSato\n\nDebugging massively parall
 el applications remains a highly challenging task. With trends towards la
 rger and more complex supercomputers\, remarkably increasing degrees o
 f parallelism\, more parallelism options (e.g.\, heterogeneity)\, and e
 merging programming models\, applications gain higher performance an
 d scalability by using more asynchronous algorithms. However\, these a
 lgorithms come at a productivity cost: they introduce non-determinism i
 n parallel program execution (i.e.\, the applications do not produce t
 he same output in different runs)\, and this makes debugging an even g
 reater challenge. A particularly well-known source of non-determinism a
 t large scale is the Message Passing Interface (MPI). As network and s
 ystem noise can affect the order of received messages\, applications c
 an take different computation paths depending on the order of the rec
 eived messages. This complicates debugging since computation paths an
 d associated computational results may vary between the original ru
 n (where a bug manifested itself) and the debugging runs. In this lig
 htning talk\, we introduce ReMPI (MPI Record-and-Replay Too
 l\, https://github.com/PRUNERS/ReMPI)\, which facilitates debugging no
 n-deterministic MPI applications. ReMPI records the execution of eac
 h MPI process as trace data\, which includes the order of the message r
 eceives. Then\, during debugging\, a replay mechanism uses these recor
 ded traces to ensure that every MPI process observes the same message e
 xchanges as in the recorded run.\n\nSession Format: Recorded\n\nTag: Rel
 iability and Resiliency\n\nRegistration Category: Workshop Reg Pass
END:VEVENT
END:VCALENDAR
