EMSDialog: AI-Generated Multi-Person Emergency Medical Dialogues for Training

Researchers created EMSDialog, a dataset of 4,414 synthetic multi-speaker emergency medical dialogues. The dataset is designed to train AI models to track evolving evidence in streaming clinical conversations and make accurate diagnoses.

Researchers have developed EMSDialog, a dataset of 4,414 synthetic multi-speaker emergency medical service (EMS) dialogues. The dialogues are grounded in electronic patient care reports (ePCRs) and produced by a multi-agent pipeline that iterates through planning, generation, and self-refinement, with rule-based checks for factual accuracy and topic flow.
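
To make the plan, generate, and refine cycle concrete, here is a minimal Python sketch of such a loop. It is an illustration under stated assumptions, not the researchers' pipeline: the names Turn, find_issues, refine, and toy_generate are hypothetical, the generator is a hard-coded stand-in for a language-model call, and the two checks (substring matching for facts, topic order against the plan) are simplified guesses at what rule-based checks might look like.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Turn:
    speaker: str
    topic: str
    text: str


def find_issues(dialogue: List[Turn], plan: List[str], facts: List[str]) -> List[str]:
    """Rule-based checks: report facts must be spoken, topics must follow the plan."""
    spoken = " ".join(turn.text.lower() for turn in dialogue)
    issues = [f"missing fact: {fact}" for fact in facts if fact.lower() not in spoken]
    # Collapse consecutive turns on the same topic before comparing with the plan.
    topics: List[str] = []
    for turn in dialogue:
        if not topics or topics[-1] != turn.topic:
            topics.append(turn.topic)
    if topics != plan:
        issues.append("topic order differs from plan")
    return issues


# Hypothetical generator signature: (plan, facts, issues from last round) -> dialogue.
Generator = Callable[[List[str], List[str], List[str]], List[Turn]]


def refine(
    plan: List[str], facts: List[str], generate: Generator, max_rounds: int = 3
) -> Tuple[List[Turn], List[str]]:
    """Regenerate until the checks pass or the round budget runs out."""
    dialogue: List[Turn] = []
    issues: List[str] = []
    for _ in range(max_rounds):
        dialogue = generate(plan, facts, issues)
        issues = find_issues(dialogue, plan, facts)
        if not issues:
            break
    return dialogue, issues


def toy_generate(plan: List[str], facts: List[str], issues: List[str]) -> List[Turn]:
    """Stand-in for an LLM call (assumption): the first draft drops the last fact."""
    usable = facts if issues else facts[:-1]
    speakers = ["medic_1", "medic_2", "patient"]
    return [
        Turn(speakers[i % len(speakers)], topic, f"Noted: {fact}.")
        for i, (topic, fact) in enumerate(zip(plan, usable))
    ]


# Made-up report content, for illustration only.
plan = ["history", "vitals"]
facts = ["chest pain for 20 minutes", "BP 90/60"]
dialogue, issues = refine(plan, facts, toy_generate)
```

In this toy run the first draft fails both checks, the list of issues is passed back to the generator, and the second draft passes. A real system would replace toy_generate with model calls and would need far more robust fact matching than a substring test.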

This dataset addresses a gap in existing medical dialogue corpora, most of which are limited to dyadic (two-party) interactions or lack the complexity of multi-party workflows. EMSDialog is designed to train AI models to track evolving evidence in streaming clinical conversations and decide when to commit to a diagnosis, a crucial skill for conversational diagnosis prediction.
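
The "decide when to commit" problem can be sketched as an early-stopping rule over a stream of turns. The Python below is a toy illustration, not the method used with EMSDialog: the cue table, the labels, the function names, and the two thresholds (min_share and min_score) are all assumptions, and a trained model would replace the keyword scorer.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical cue table for illustration only; not clinical guidance.
CUES = {
    "short of breath": "respiratory",
    "chest pain": "cardiac",
    "radiating to left arm": "cardiac",
    "diaphoretic": "cardiac",
}


def score_turn(text: str) -> Dict[str, float]:
    """Toy scorer: one point per cue phrase found in the turn."""
    scores: Dict[str, float] = {}
    lowered = text.lower()
    for phrase, label in CUES.items():
        if phrase in lowered:
            scores[label] = scores.get(label, 0.0) + 1.0
    return scores


def stream_diagnose(
    turns: Iterable[str],
    scorer: Callable[[str], Dict[str, float]] = score_turn,
    min_share: float = 0.7,
    min_score: float = 2.0,
) -> Optional[Tuple[str, int]]:
    """Return (label, turn_index) at the first turn where one label dominates, else None."""
    evidence: Dict[str, float] = {}
    for index, text in enumerate(turns):
        # Accumulate evidence turn by turn, as it arrives in the stream.
        for label, points in scorer(text).items():
            evidence[label] = evidence.get(label, 0.0) + points
        total = sum(evidence.values())
        if total == 0:
            continue
        label, score = max(evidence.items(), key=lambda item: item[1])
        # Commit only with enough absolute evidence and a clear lead.
        if score >= min_score and score / total >= min_share:
            return label, index
    return None


turns = [
    "She is short of breath.",
    "Any chest pain?",
    "Yes, radiating to left arm.",
    "She is diaphoretic.",
]
result = stream_diagnose(turns)
```

The two thresholds express the trade-off: committing too early risks acting on a single ambiguous cue, while waiting too long delays the prediction. Here the rule holds off through the first three turns and commits only on the fourth, once one label has both enough evidence and a clear lead.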

The introduction of EMSDialog opens new avenues for training AI models in emergency medical scenarios. Future research will likely focus on refining the generation pipeline and expanding the dataset to cover more diverse medical situations. Open questions include how well these synthetic dialogues generalize to real-world emergency settings and the ethical implications of using AI-generated data for medical training.