Background: Data quality is fundamental to the integrity of quantitative research. The role of external researchers in data quality assessment (DQA) remains ill-defined when large, centrally curated health datasets are put to secondary use for research. To investigate the equity of palliative care provided to Indigenous Australian patients, researchers accessed a now-historical version of a national palliative care dataset developed primarily for continuous quality improvement.
Objectives: (i) To apply a generic DQA framework to the dataset and (ii) to report the process and results of this assessment and examine the consequences for conducting the research.
Method: The data were systematically examined for completeness, consistency and credibility. Data quality issues relevant to the Indigenous identifier and to the framing of the research questions were of particular interest.
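As an illustration only, the three dimensions named above could be checked along the following lines. This is a minimal sketch, not the procedure used in the study: the records, the field names (`indigenous`, `episode_start`, `episode_end`, `age`) and the plausible age range are all hypothetical assumptions introduced for the example.

```python
from datetime import date

# Hypothetical records; field names are illustrative, not the dataset's own.
records = [
    {"patient_id": 1, "indigenous": "no", "age": 71,
     "episode_start": date(2010, 3, 1), "episode_end": date(2010, 3, 9)},
    {"patient_id": 2, "indigenous": None, "age": 64,
     "episode_start": date(2012, 5, 4), "episode_end": date(2012, 5, 2)},
    {"patient_id": 3, "indigenous": "yes", "age": 130,
     "episode_start": date(2014, 8, 15), "episode_end": date(2014, 9, 1)},
]

def completeness(rows, field):
    """Completeness: proportion of rows with a non-missing value for `field`."""
    if not rows:
        return 0.0
    return sum(r.get(field) is not None for r in rows) / len(rows)

def inconsistent(rows):
    """Consistency: ids of rows whose episode ends before it starts."""
    return [
        r["patient_id"]
        for r in rows
        if r["episode_start"] is not None
        and r["episode_end"] is not None
        and r["episode_end"] < r["episode_start"]
    ]

def implausible(rows, low=0, high=120):
    """Credibility: ids of rows whose age falls outside an assumed plausible range."""
    return [
        r["patient_id"]
        for r in rows
        if r["age"] is not None and not (low <= r["age"] <= high)
    ]
```

In a sketch like this, the flagged identifiers would be the kind of anomaly that could be passed back to data managers, and per-field completeness would indicate which variables may need a missing-data strategy.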
Results: The dataset comprised 477,518 records of 144,951 patients (Indigenous N = 1515; missing Indigenous identifier N = 4998) collected from participating specialist palliative care services during a period (1 January 2010–30 June 2015) in which data-checking systems underwent substantial upgrades. Completeness of the data improved progressively over the study period. The data were error-free with respect to many credibility and consistency checks; the anomalies that were detected were reported to data managers. Because the proportion of missing values remained substantial for some clinical care variables, multiple imputation procedures were used in subsequent analyses.
Conclusion and implications: In secondary use of large curated datasets, DQA by external researchers may both influence proposed analytical methods and contribute to improvement of data curation processes through feedback to data managers.