BACKGROUND: Systematic measurement of conversational features in the natural clinical setting is essential to better understand, disseminate, and incentivize high-quality serious illness communication. Advances in machine-learning (ML) classification of human speech offer an exceptional opportunity to complement human coding (HC) methods for measurement in large-scale studies.
OBJECTIVES: To test the reliability, efficiency, and sensitivity of a tandem ML-HC method for identifying one feature of clinical importance in serious illness conversations: Connectional Silence.
DESIGN: This was a cross-sectional analysis of 354 audio-recorded inpatient palliative care consultations from the Palliative Care Communication Research Initiative multisite cohort study.
SETTING/SUBJECTS: Hospitalized people with advanced cancer.
MEASUREMENTS: We created 1000 brief audio "clips" of randomly selected moments predicted by a screening ML algorithm to be two-second or longer pauses in conversation. Each clip included 10 seconds of speaking before and 5 seconds after each pause. Two HCs independently evaluated each clip for Connectional Silence as operationalized from conceptual taxonomies of silence in serious illness conversations. HCs also evaluated 100 minutes from 10 additional conversations having unique speakers to identify how frequently the ML screening algorithm missed episodes of Connectional Silence.
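The clip-windowing arithmetic described above (10 seconds of speech before each detected pause and 5 seconds after it) can be sketched as follows. This is a minimal illustration, not the study's code; the function name, parameter names, and the clamping-at-zero behavior are assumptions.

```python
def clip_window(pause_start, pause_end, pre=10.0, post=5.0):
    """Return (clip_start, clip_end) in seconds for one audio clip
    around a detected pause: `pre` seconds of speech before the pause
    and `post` seconds after it, clamped so the clip never starts
    before the beginning of the recording."""
    return max(0.0, pause_start - pre), pause_end + post
```

For example, a pause running from 30.0 s to 32.5 s yields a clip from 20.0 s to 37.5 s, while a pause near the start of the recording is clamped at 0.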
RESULTS: Connectional Silences were rare (5.5%) among all two-second or longer pauses in palliative care conversations. The tandem ML-HC method demonstrated strong reliability (kappa 0.62; 95% confidence interval: 0.47-0.76). HC alone required 61% more time than tandem ML-HC. No Connectional Silences were missed by the ML screening algorithm.
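The reliability figure above is Cohen's kappa between the two independent human coders. A minimal sketch of the statistic for binary clip labels (Connectional Silence present = 1, absent = 0) is below; the function name and label encoding are illustrative assumptions, not the study's analysis code.

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two raters' binary labels (lists of 0/1)."""
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n   # observed agreement
    pa1 = sum(a) / n                             # rater A positive rate
    pb1 = sum(b) / n                             # rater B positive rate
    pe = pa1 * pb1 + (1 - pa1) * (1 - pb1)       # chance agreement
    return (po - pe) / (1 - pe)                  # undefined if pe == 1
```

Kappa corrects raw percent agreement for the agreement expected by chance, which matters here because positive labels (Connectional Silence) are rare.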
CONCLUSIONS: Tandem ML-HC methods are reliable, efficient, and sensitive for identifying Connectional Silence in serious illness conversations.
OBJECTIVE: Automating conversation analysis in the natural clinical setting is essential to scale serious illness communication research to samples that are large enough for traditional epidemiological studies. Our objective is to automate the identification of pauses in conversations because these are important linguistic targets for evaluating dynamics of speaker involvement and turn-taking, listening and human connection, or distraction and disengagement.
DESIGN: We used 354 audio recordings of serious illness conversations from the multisite Palliative Care Communication Research Initiative cohort study.
SETTING/SUBJECTS: Hospitalized people with advanced cancer seen by the palliative care team.
MEASUREMENTS: We developed a Random Forest machine learning (ML) algorithm to detect Conversational Pauses of two seconds or longer. We triple-coded 261 minutes of audio with human coders to establish a gold standard for evaluating ML performance characteristics.
RESULTS: ML automatically identified Conversational Pauses with a sensitivity of 90.5% and a specificity of 94.5%.
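Sensitivity and specificity follow directly from the confusion-matrix counts against the triple-coded gold standard. The sketch below shows the arithmetic; the specific counts in the usage example are hypothetical numbers chosen only to reproduce the reported percentages, not the study's actual counts.

```python
def sens_spec(tp, fn, tn, fp):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP),
    both returned as percentages."""
    return 100 * tp / (tp + fn), 100 * tn / (tn + fp)

# Hypothetical counts (not from the study) that yield 90.5% / 94.5%:
sensitivity, specificity = sens_spec(tp=181, fn=19, tn=189, fp=11)
```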
CONCLUSIONS: ML is a valid method for automatically identifying Conversational Pauses in the natural acoustic setting of inpatient serious illness conversations.