Challenges Encountered and Lessons Learned when Using a Novel Anonymised Linked Dataset of Health and Social Care Records for Public Health Intelligence: The Sussex Integrated Dataset
Round 1
Reviewer 1 Report
I reviewed the manuscript: Challenges encountered and lessons learned when using a novel anonymised linked dataset of health and social care records for public health intelligence: The Sussex Integrated Dataset. It is an interesting text, but I have three concerns:
1. How were ethical issues addressed in this study? Is sensitive information available? How was this handled?
2. I think that, as far as the challenges are concerned, it is well established what they are. But the lessons still lack greater practicality.
3. The authors published a version of the same work in another journal, does this not constitute duplicity?
Johnston, N., Ford, E., Tyler, R., Spencer-Hughes, V., Madzvamuse, A., Evans, G. and Gilchrist, K. (2022) “Exploring a novel linked dataset and building linked data analytics skills in Public Health Intelligence teams in Sussex”, International Journal of Population Data Science, 7(3). doi: 10.23889/ijpds.v7i3.1937.
https://ijpds.org/article/view/1937
Author Response
Dear Reviewer,
Many thanks for taking the time to read our manuscript and for your thoughtful comments. We have addressed your three suggestions as outlined below:
1. How were ethical issues addressed in this study? Is sensitive information available? How was this handled?
We have added the following Ethics Statement to the methods section on page 7.
Ethics Statement
In the UK, NHS patient data that is rendered functionally anonymous and curated for public health purposes does not need Research Ethics Committee (REC) approval [26]. Data in SID is stripped of all identifiers such as name, date of birth (only age at consultation date is given), address, etc., so that patients become anonymous. Data users sign a formal agreement that no attempts at re-identification will be made. Certain highly sensitive data, such as HIV status or termination of pregnancy, is not routinely extracted from clinics or held within SID. As data is processed (lawfully) without patient consent, the SID team is committed to engaging with Sussex citizens about how they would like their data to be used for public health purposes and research, and what safeguards they would like to see [27].
2. I think that, as far as the challenges are concerned, it is well established what they are. But the lessons still lack greater practicality.
We have added some extra text directly highlighting lessons learned at the start of the discussion (page 13) and the end of the discussion (page 15). We drew attention to the need to automate decisions about which of several competing values to use, the importance of having external sources of data against which to check the accuracy of historic data and code lists, and the skillset needed to produce usable insights from the dataset.
3. The authors published a version of the same work in another journal, does this not constitute duplicity?
We would like to highlight that this is only a 300-word abstract published in a conference proceedings issue of a journal. We declared it in the cover letter with our submission. We do not think it constitutes duplicity as it does not describe the work in any detail.
We thank the reviewer for prompting us to consider ethical issues and lessons learned in more detail and feel this has strengthened the paper overall.
With best wishes,
Elizabeth Ford
Reviewer 2 Report
I have read with interest the paper titled “Challenges encountered and lessons learned when using a novel anonymised linked dataset of health and social care records for public health intelligence: The Sussex Integrated Dataset”. Indeed, the opportunity of using patient- and population-level data for health need and impact assessment (particularly for long-term conditions) is a very important one, towards which NHSs are directed.
This paper offers an interesting insight into the key issues in managing and exploiting these data. Overall, the paper is well written and fits the journal’s aim and general audience.
I have only a few comments/suggestions.
The authors might want to add a comparison between their conceptual progress and that made in other similar contexts. Similarly, more on the public health impact of this research should be said.
Author Response
Dear Reviewer,
We thank you very much for the time you took to review our article and for your thoughtful comments. We have addressed them as outlined below:
1) The authors might want to add a comparison between their conceptual progress and that made in other similar contexts.
We searched the literature and found few papers discussing these early stages of understanding linked EHR databases and their quality issues. However, we did find a few examples and have added them on page 14, particularly discussing problems with re-using open-source code lists, and issues with time-stamps not being accurate in the data.
2) Similarly, more on the public health impact of this research should be said.
We have added some more on the lessons learned for public health teams, and the anticipated impact on public health strategy when the database can be fully used for analysis of health inequalities, as planned. These additions are found on page 15.
We thank you again for prompting us to add these extra sections to the discussion; we feel the paper is strengthened as a result.
With best wishes,
Elizabeth Ford