How an NHS Test and Trace Excel error lost 16,000 COVID-19 cases
As systems integration experts, we normally focus on the positives of integration: better business efficiency, higher customer retention, and greater data security. But sometimes it’s worth talking about the dark side — what happens when digital integration goes wrong? In the case of the NHS Test and Trace Excel error, the answer is: thousands of people potentially exposed to coronavirus.
What was the NHS Test and Trace Excel error?
According to the Press Association (PA), the error became apparent when a large backlog of old COVID-19 test results was suddenly logged in NHS Test and Trace. An additional 12,871 cases were logged on Saturday evening, and 22,961 cases on Sunday evening.
This delay gave the impression of a sharp spike in infections. There was no such spike, but the late arrival of the data meant that the contacts of thousands of COVID-19-positive people were not traced in time. This gap in oversight may have caused thousands more infections.
Understanding the Excel integration failure
On a technical level, this is how the system should have worked:
- there are two “pillars” of testing centres: those run by the NHS and those run by well-known private IT contractors
- the NHS centres keep their test data in the NHS SGSS database before transfer to Public Health England
- the private contractors upload their test data to Public Health England in CSV spreadsheets
- Public Health England processes the data in Excel and transfers it to NHS Test and Trace
And what reportedly happened was this:
- Public Health England was using an old version of Excel, with a limit of 65,536 rows of data
- as the test numbers increased, the private IT contractors started uploading larger volumes
- at some point the CSV data exceeded the 65,536 row limit, meaning thousands of test results were not transferred
Figure: the Excel error message indicating that the row limit has been exceeded.
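The failure described above comes down to a missing guard: nothing stopped a file that was too large from being processed with rows left behind. As a minimal sketch in Python (not the actual Public Health England pipeline), a check like the following would fail loudly instead. The function names and the `XLS_MAX_ROWS` constant are hypothetical and chosen only for illustration.

```python
import csv
import io

# Row limit of the legacy Excel worksheet format, as cited in the article.
XLS_MAX_ROWS = 65_536


def count_csv_rows(csv_text: str) -> int:
    """Count the records in a CSV document, header row included."""
    return sum(1 for _ in csv.reader(io.StringIO(csv_text)))


def check_fits_xls(csv_text: str) -> None:
    """Raise an error rather than let rows past the limit be dropped."""
    rows = count_csv_rows(csv_text)
    if rows > XLS_MAX_ROWS:
        lost = rows - XLS_MAX_ROWS
        raise ValueError(
            f"{rows} rows exceed the {XLS_MAX_ROWS}-row limit; "
            f"{lost} rows would be lost"
        )
```

Running a check of this kind before every hand-off turns silent data loss into an error that someone has to deal with.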
What was the correct data integration approach?
The relevant parties are reportedly going to solve this issue by splitting the large Excel files into many smaller files, thus avoiding the 65,536 row limit.
But by taking a different integration approach, the contractors could have avoided this NHS Test and Trace Excel error entirely.
First of all, this error shows the importance of proper communication between systems integrators and their clients. If the IT contractors had understood Public Health England’s Excel system, they could have broken the CSV data into smaller chunks from the start.
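Breaking the CSV data into smaller chunks can be sketched in a few lines of Python. This is an illustration under assumptions, not the contractors' actual tooling: the `split_csv` function and its helper are hypothetical names, the first row is assumed to be a header, and the header is repeated in every chunk so that each file stands on its own.

```python
import csv
import io
from typing import Iterator, List

# Row limit of the legacy Excel worksheet format, as cited in the article.
XLS_MAX_ROWS = 65_536


def _to_csv(header: List[str], rows: List[List[str]]) -> str:
    """Serialise a header and a batch of rows back into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def split_csv(csv_text: str, max_rows: int = XLS_MAX_ROWS) -> Iterator[str]:
    """Yield CSV chunks of at most max_rows rows each, header included."""
    if max_rows < 2:
        raise ValueError("max_rows must allow a header plus one data row")
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to split
    per_chunk = max_rows - 1  # reserve one row in each chunk for the header
    batch: List[List[str]] = []
    for row in reader:
        batch.append(row)
        if len(batch) == per_chunk:
            yield _to_csv(header, batch)
            batch = []
    if batch:
        yield _to_csv(header, batch)
```

For example, `list(split_csv(text, max_rows=4))` on a file with ten data rows yields four chunks, each with the header plus at most three data rows, so no chunk can exceed the limit of the receiving system.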
And beyond that, commentators have rightly questioned why Excel is being used as a data management tool in the first place. There are highly cost-effective data management services available in Microsoft Azure, for example, which scale to data volumes far beyond what a spreadsheet can hold. This is one of the main benefits of using cloud-based, serverless integration platforms such as Azure: capacity scales with demand, there is far less infrastructure to maintain, and you only pay for what you use.
In integration, bigger doesn’t always mean better
When even private IT heavyweights make these kinds of mistakes, it shows that in the digital world size isn’t everything. In fact, the Government Digital Service was set up with the aim of encouraging public bodies to work with small and medium-sized IT service providers. Working with a boutique systems integrator such as ourselves, for example, brings the following advantages:
- a fully tailored service: treating each integration as a unique challenge in need of a customised solution
- in-depth technical expertise: in the leading languages, vendors, and technologies, such as Microsoft Azure and Dell Boomi
- a high ratio of senior to junior staff: which translates into greater value for money
- 30 years of experience: across all sectors, including education, publishing, manufacturing, and the public sector