Linking hospital episodes for health research: the effects of data quality

28 February 2020


Electronic health record (EHR) data are available for research in all UK nations, and cross-nation comparative studies are becoming more common. All UK inpatient EHRs are based around episodes: continuous periods of time spent under the care of one clinical speciality in one hospital. However, episode-based analysis does not capture the full patient journey. For example, when someone is admitted and then transferred to a second hospital, we want to capture the whole of their stay in both hospitals. If we look at each of their episodes in isolation, we will under-estimate the average length of time they spent in hospital and over-estimate the number of times they were admitted.
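To make the arithmetic concrete, here is a minimal sketch using one hypothetical patient who is admitted to one hospital and then transferred to a second. The hospital names and dates are invented for illustration and are not taken from the study data.

```python
from datetime import date

# Hypothetical episodes for one patient: (hospital, start, end).
episodes = [
    ("Hospital A", date(2019, 1, 1), date(2019, 1, 4)),
    ("Hospital B", date(2019, 1, 4), date(2019, 1, 10)),  # transferred in
]

# Episode-level view: each episode is counted as a separate admission.
episode_lengths = [(end - start).days for _, start, end in episodes]
episode_admissions = len(episodes)
episode_mean_stay = sum(episode_lengths) / episode_admissions

# Stay-level view: the two episodes form one continuous stay.
stay_admissions = 1
stay_length = (episodes[-1][2] - episodes[0][1]).days

print(episode_admissions, episode_mean_stay)  # 2 4.5
print(stay_admissions, stay_length)           # 1 9
```

Counted episode by episode, this patient appears to have two admissions averaging 4.5 days each; counted as a linked stay, they have one admission lasting 9 days.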

Across the UK there are several existing approaches to aggregating episodes into variants of a person spell, but there is no single UK-wide method for doing this. These approaches all use a combination of a temporal link (such as an episode starting within a day or two of a previous episode ending) and administrative codes which indicate a transfer of care. However, we know that the coding of transfers is not always complete, so if we rely on these codes we won’t always link together episodes that are actually connected. We also know that although episodes are not supposed to overlap, they sometimes do, which may also affect the way we link them together.

Our Study

This project came about as part of a review commissioned from SAIL Databank by the National Confidential Enquiry into Patient Outcome and Death (NCEPOD), which used routine data from across the UK to look at mental disorders and self-harm in children and young people. It was completed during the development of the Adolescent Mental Health Data Platform (ADP), an innovative resource supporting research to improve the mental health of children and young people. A key aim of the ADP is to make research easier and more efficient by creating and sharing research-ready datasets, both within Wales and more broadly across the UK and beyond. Part of this is harmonising datasets to make them as comparable as possible.

Our study examined the effects of different methods of linking episodes, focusing on transfer coding and temporal links between episodes (including the handling of overlapping episodes), to find the approach that worked best for our study and to inform future research.


Around 69% of transferred-in admissions had a preceding coded transfer-out discharge within one day. This increased to 78% when we looked for only a temporal link. The effect was greater for psychiatric admissions, with only 49% of transfers-in preceded by a coded transfer-out, increasing to 56% with only a temporal link. Although the proportion of episodes which overlap is small overall (0.2%), overlaps were more common in psychiatric specialties (4.0%).

We found that transfer coding had a bigger impact on episode linking than overlapping episodes. The most impactful method for grouping required only a temporal link (episodes starting and ending less than one day apart) and not a specified transfer code, and aggregated overlapping episodes into a single person spell.
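The grouping rule described above can be sketched roughly as follows. This is an illustrative sketch only, not the study's actual code: the function name, the record layout (patient ID, start date, end date), the example data and the exact one-day threshold are all assumptions made for the example.

```python
from datetime import date, timedelta

def group_into_spells(episodes, max_gap=timedelta(days=1)):
    """Group (patient_id, start, end) episodes into person spells.

    An episode joins the current spell if it starts no more than
    `max_gap` after the latest end date seen in that spell so far.
    Overlapping episodes satisfy this automatically. No transfer
    code is consulted; only the temporal link is used.
    """
    spells = []
    # Sorting puts each patient's episodes together in start-date order.
    for patient_id, start, end in sorted(episodes):
        last = spells[-1] if spells else None
        if (last and last["patient_id"] == patient_id
                and start - last["end"] <= max_gap):
            last["end"] = max(last["end"], end)  # extend the spell
            last["n_episodes"] += 1
        else:
            spells.append({"patient_id": patient_id, "start": start,
                           "end": end, "n_episodes": 1})
    return spells

# Hypothetical episodes (assumed data, for illustration only).
episodes = [
    ("p1", date(2019, 1, 1), date(2019, 1, 4)),
    ("p1", date(2019, 1, 4), date(2019, 1, 10)),  # starts the day the first ends
    ("p1", date(2019, 1, 8), date(2019, 1, 12)),  # overlaps the previous episode
    ("p1", date(2019, 3, 1), date(2019, 3, 3)),   # separate, later admission
    ("p2", date(2019, 2, 1), date(2019, 2, 2)),
]

spells = group_into_spells(episodes)
for spell in spells:
    print(spell["patient_id"], spell["start"], spell["end"], spell["n_episodes"])
```

In this example the first three episodes for patient p1 collapse into a single spell running from 1 January to 12 January, while the March admission and the episode for p2 each remain separate spells. A method that also required a transfer code would need an extra check at the point where an episode joins a spell.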

The significance of this work

There are important reasons for linking together episodes across providers: estimates of length of stay and admission rates will be affected if each episode is considered as a stand-alone encounter, rather than part of a longer sequence of care. Data quality affects the ways in which episodes can be linked, as well as the comparability of data from different nations. We hope that this project has raised some important considerations about how this may influence results.

The study was published in 2019 and was funded by MQ, the mental health research charity.
