
What we need is a Global Public-Health Data Strategy for Beating COVID-19 and Future Pandemics


In the first four months of 2020, the United States alone has seen a death toll from COVID-19 of 60,000 people, which matches all American deaths suffered during 12 years of war in Vietnam. For comparison, US deaths from cancer over the whole of 2020 are estimated at over 600,000. While the cause of the COVID-19 outbreak is still being determined, the speed with which the virus has spread, and the toll it has already taken in lives, economic prosperity and global disruption in a matter of months, is profound.

This article presents the thesis that the mishandling of steps to detect, contain and mitigate the impacts of the pandemic was, at its core, a failure of strategic decision-making by leaders across all nations and their governmental public health systems. The article examines this failure from a data strategy perspective, and posits that while data alone is not enough for robust decision-making, the critical lack of available data exacerbated the situation. It then presents a data strategy that can help nations and systems organize for, and mitigate even partially, the impacts of a second wave of COVID-19 or of other pandemics.


Over the past three months, we have been introduced to a host of new terms such as ‘infection rates’, ‘flattening the curve’, ‘contact tracing’, ‘antibody tests’, ‘mass testing’ and ‘vaccine development cycles’. For those of us not deeply immersed in the public health domain, these terms are unfamiliar and more reminiscent of a Netflix series or action film (such as the 1995 movie ‘Outbreak’, starring Dustin Hoffman and Rene Russo). For experts and practitioners in public health, scientists, and many medical professionals, this terminology is an integral part of their daily lives. Involved in the identification and treatment of infections are physicians, who regularly send patients for blood tests and other assays, then receive the results back for analysis and care. Similarly involved are public health experts and epidemiologists, who evaluate influenza infection rates and spread patterns. Last are hospitals, pharmaceutical companies, and manufacturers of vaccines and protective equipment.

In early May 2020, we find ourselves inundated daily, on every news channel, with extensive reports by experts (and politicians) who now use these “new” terms quite regularly as they attempt to explain to the public the state of affairs in their respective regions, the implications, and the decisions that are being made, or soon will be, to try to resume a normal life. While styles and methods vary, many credible parties have chosen a data-driven approach as the basis of their understanding, explanations, and ultimately, their decisions. New York Governor Cuomo is an excellent example: his daily briefings have been data-heavy, in an explicit effort to make the public aware of the facts and considerations underlying his government’s analyses and decisions.

“Canada’s Data Collection Failures Hurt Canadians, and Humanity”

This is the title of an April 30th, 2020 article by Canadian journalist Colby Cosh, which details a host of national and provincial data collection and sharing malpractices contributing to the COVID-19 crisis.

Data, the Essential Ingredient for Robust Decision-Making

One of the many things that COVID-19 has brought to light is the critical lack of reliable data needed to form an accurate picture of the state of COVID-19 “affairs”. National centers such as the US Centers for Disease Control and Prevention, the Public Health Agency of Canada, or the European Centre for Disease Prevention and Control are all mature agencies, adept at gathering data about known infections such as influenza, hepatitis, tuberculosis, and others. Where these important organizations fell short was in their ability to gather accurate, timely data about COVID-19 infection rates, because highly accurate testing was not available on the ground, as evidenced by a growing array of recent articles on this topic by various experts.

Bill Gates says US system produces ‘bogus’ testing numbers

Testing the Tests: COVID-19 Antibody Assays Scrutinized for Accuracy by UCSF, UC Berkeley Researchers

U.S. Confronts New Testing Dilemma: How to Figure Out Who Already Had Covid-19

Congress sounds alarm over inaccurate antibody tests

Experts Voice Concerns About Covid-19 Testing Accuracy

The Need for A Global Health Data-Strategy

As depicted in Figure 1 below, the required data falls into three main categories, based on the results or insights they are intended to yield: a. Actual infection rates (who has it, who does not?); b. Origination and spread (who gave it to whom and who else may be exposed and at risk?); c. Potential risks and infections (areas and populations that may be at risk). How might governments and healthcare systems secure the various data assets needed to derive these insights? The following sections address each data category.

Figure 1

Data Category #1: Actual Rates — Measuring actual infection rates using widespread testing

The sheer shortage of available tests, their cost per unit, and the requirements of existing testing methods (a deep, specially administered swab followed by a long incubation period in a lab) have made it virtually impossible to form an accurate, comprehensive picture of asymptomatic and symptomatic infection rates without a very gradual ramp-up of testing. As illustrated in Figure 2 from the World Economic Forum, as of late April 2020, the share of the total population that has been tested is 1% in the UK, 1.5% in the USA, and 1.8% in Canada, with an OECD average of only 2.3%. These levels can hardly be considered comprehensive or capable of yielding more than aggregate statistical metrics, which are of little use for developing specific treatment plans.
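To make the point about aggregate metrics concrete, the sketch below estimates an infection rate from test counts and attaches a Wilson score interval to it. It is a minimal illustration, not a method used by any agency named here: the function name and the sample counts are hypothetical, and it assumes the people tested are a random sample of the population, which was generally not the case in early 2020, when testing was concentrated on symptomatic individuals.

```python
import math
from typing import Tuple


def wilson_interval(positives: int, tested: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% confidence).

    Assumes the tested group is a random sample of the population of interest.
    """
    if tested <= 0:
        raise ValueError("tested must be positive")
    if not 0 <= positives <= tested:
        raise ValueError("positives must be between 0 and tested")
    p = positives / tested
    denom = 1 + z ** 2 / tested
    centre = (p + z ** 2 / (2 * tested)) / denom
    half_width = z * math.sqrt(p * (1 - p) / tested + z ** 2 / (4 * tested ** 2)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts, for illustration only: the same 5% positive rate
# observed in a small and in a large testing programme.
small = wilson_interval(positives=50, tested=1_000)
large = wilson_interval(positives=5_000, tested=100_000)
print(f"1,000 tests:   {small[0]:.3%} to {small[1]:.3%}")
print(f"100,000 tests: {large[0]:.3%} to {large[1]:.3%}")
```

The interval narrows as more people are tested, but even a narrow interval is still a population-level figure. It says nothing about which individuals are infected, which is why widespread and ongoing testing matters for decisions made at the level of a region or a person.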

Figure 2 — Diagnostic Testing for COVID-19 in OECD Countries

Multiple approaches to mass testing are being vetted globally. One example is self-test kits such as those by Spartan Bioscience, whose coffee-mug-sized diagnostic unit can be operated by non-technical individuals at home, in offices, at airports, or in other environments, with results obtained in under an hour. While this product showed initial promise and was even touted by Canadian Prime Minister Trudeau on March 31, 2020, it has since been recalled by Health Canada. The road to results will clearly be a rocky one, but also an opportunity for innovation.

Any number of successful solutions, if and when widely distributed, could immediately help solve the data source conundrum that has made it so challenging to understand the proliferation of this highly contagious virus, and would allow governments to deploy basic infrastructure that could be re-used for other viruses or illnesses as they arise. This represents an important opportunity for innovation at multiple levels: the firms that create the test kits, those that handle distribution and maintenance, and the parties that harness and then secure the data in order to distribute it to permitted parties. While the mechanisms for mass testing are not yet in place, it is comforting to see growing acceptance of the need for widespread and ongoing testing to yield far more accurate data for decision-making.

Data Category #2: Origin & Spread — Detecting infection origination and spread using behavioral data

In addition to actual infection testing, contact tracing is an important element gaining acceptance. Contact tracing is a common practice that enables public health professionals to identify the potential origin and transmission of an illness through interactions between individuals.

This type of data makes it possible to establish the physical behavior of individuals (where they may have been, how they got there, when and with whom they interacted) in order to determine the possible chain of origin and subsequent exposure. One way to gather this data is, of course, via mobile devices, as much of it already exists at the device level (Apple and Google). Until recently there has been no acceptance of the explicit gathering and use of such data, due to privacy concerns, nor a facility for utilizing it for public health purposes.
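The chain of exposure described above can be sketched as a walk through a time-ordered list of contacts. This is a simplified, hypothetical illustration: the record layout (two pseudonymous identifiers and a day number) and the function name are my assumptions, and it is not a description of how the Apple and Google system or any public health agency's tooling works. It also makes the conservative simplification that a person exposed on a given day can only expose others on later days.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record: (person_a, person_b, day) means the two pseudonymous
# identifiers were in close proximity on that day.
Contact = Tuple[str, str, int]


def trace_exposures(contacts: Iterable[Contact], index_case: str, start_day: int) -> Dict[str, int]:
    """Return {person: first day of possible exposure} for everyone who may have
    been exposed, directly or through a chain, starting from the index case."""
    by_day: Dict[int, List[Tuple[str, str]]] = defaultdict(list)
    for a, b, day in contacts:
        if day >= start_day:  # contacts before the index case was infectious are ignored
            by_day[day].append((a, b))

    exposed: Dict[str, int] = {index_case: start_day}
    for day in sorted(by_day):
        # Only people exposed before this day can pass the infection on today.
        could_transmit = set(exposed)
        for a, b in by_day[day]:
            for source, target in ((a, b), (b, a)):
                if source in could_transmit and target not in exposed:
                    exposed[target] = day

    del exposed[index_case]
    return exposed


# Hypothetical contact log, for illustration only.
log = [
    ("ana", "ben", 1),  # ana is the index case, infectious from day 1
    ("cai", "dee", 2),  # happens before cai is exposed, so dee is not at risk
    ("ben", "cai", 3),
    ("eli", "ana", 0),  # before the start day, so eli is not at risk
]
print(trace_exposures(log, index_case="ana", start_day=1))
```

Respecting the order of events matters: "dee" met "cai" before "cai" could have been exposed, so "dee" is not flagged. Ignoring timing would overstate the number of people at risk.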

As of April 2020, such an approach has gained public awareness, in particular thanks to Google and Apple, who are collaborating to leverage the capabilities of their mobile handset operating systems for massive contact tracing across billions of users, in parallel with a host of initiatives by independent developers and organizations globally. Overall these initiatives represent a very positive movement towards filling this critical data gap. However, to be useful for their intended purposes, they require massive adoption, alignment of data across all of them, and the ability to use this data responsibly at both the public health and the individual level.

Data Category #3: Potential Rates — Early detection of potential risks and infections using continuous micro-measurements of health, behavior and environmental data

Finally, a critical missing data asset is raw individual-level healthcare data. This information would allow public healthcare professionals to develop early warning diagnostic models that form a more accurate picture, not just of actual infection rates and exposure, but also of potential infection and exposure rates, in parallel with or in the absence of detailed testing. This ‘raw data’ consists of a variety of detailed information such as historical health records and conditions, historical and current vital signs, and behavioral and environmental information: for example, breathing, heart rate, body temperature, oxygen, sleep, exertion, physical activity, weight, BMI, fatigue, ambient temperature, geographic location, air pollution and more.
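As a rough illustration of what an early warning signal built on continuous measurements might look like, the sketch below flags days on which a reading departs sharply from an individual's own recent baseline. It is a deliberately simple stand-in for a real diagnostic model: the function name, the 14-day window, the threshold of three standard deviations, and the sample heart-rate values are all hypothetical and have no clinical validation behind them.

```python
from statistics import mean, stdev
from typing import List, Sequence


def flag_deviations(readings: Sequence[float], baseline_days: int = 14,
                    threshold: float = 3.0) -> List[int]:
    """Return the indices of readings that differ from the mean of the preceding
    `baseline_days` readings by more than `threshold` standard deviations."""
    if baseline_days < 2:
        raise ValueError("baseline_days must be at least 2")
    flagged = []
    for i in range(baseline_days, len(readings)):
        window = readings[i - baseline_days:i]
        mu, sigma = mean(window), stdev(window)
        # A perfectly flat baseline (sigma == 0) gives no scale to compare against.
        if sigma > 0 and abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily resting heart rate (beats per minute), for illustration only.
resting_hr = [60, 61, 59, 60, 62, 60, 61, 59, 60, 61, 60, 62, 59, 60, 61, 72, 60]
print(flag_deviations(resting_hr))
```

The comparison is against each person's own history rather than a population norm, which is only possible when data is collected continuously. A flag of this kind would be a prompt for testing or follow-up, not a diagnosis.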

Medical charts are still largely paper-based, although EMRs (electronic medical records) have been talked about for close to two decades now and many systems have been implemented across hospitals worldwide. While EMRs may be useful, they will not have a critical impact because they do not gather data on a continuous basis. They capture it only at infrequent points in time: when individuals are seen by their doctors (for periodic checkups or the odd ailment), or when they are admitted to hospital for an urgent matter. To illustrate this point, how often has your blood pressure or temperature been checked over the past 12 months? What about blood tests or swabs to test for markers of, or antibodies to, illnesses or viruses? This is not to imply that our doctors have done anything wrong, but rather to highlight that data gathering in the age of COVID-19, and the need for new data for accurate early detection of potential illnesses, call for a different approach to the problem.

Four or five months ago, people would not have seen an urgent need to gather vitals or the other forms of data listed above on a continuous basis, but only when someone fell into a risk category or became ill and exhibited certain symptoms. This has now changed, and we are beginning to see discourse in scientific and public circles about the need for granular data collection in order to develop appropriate models of the correlation between vital signs and outcomes in the form of illnesses. This is not restricted to COVID-19, but applies to a broad range of illnesses, infections, and conditions, for example in this research study published in The Lancet in January 2020.

In my article titled “The fight against COVID-19 needs an urgent lesson in data management from the finance and retail sectors” from April 12th, 2020, I detail in great depth this type of smartphone or wearable based system for successfully achieving mass adoption and the continuous gathering of raw data. While there are multiple opportunities to solve this data gap, it is clear that without proactive measures and investment, history will repeat itself in the form of future devastating impacts.

Mitigating COVID-19 impact on data privacy

The three types of large-scale data gathering mechanisms described above are essential for more robust, accurate, and proactive decision-making by leaders, and for avoiding future failures and crises such as the one we are experiencing today. The need for such capabilities will undoubtedly cause controversy amongst those concerned (rightfully so, I may add) about repeated abuse of data privacy rights (such as the Facebook-Cambridge Analytica scandal surrounding the 2016 US elections) while governments cite the “common good” as the basis for their actions. Quite simply, this does not need to be the case.

Data gathering exercises without mechanisms to ensure lawful collection (with full awareness and consent by individuals), secure storage, and protection of individual privacy should not become commonplace as a result of COVID-19. Robust, transparent, and lawful options must be urgently adopted (and rapidly created where needed) to ensure that the balance between privacy and the common good is strengthened by the COVID-19 crisis, not weakened. The emphasis should be on common adherence to strict national or regional standards such as the EU GDPR. Civil liberties need not suffer as a result of COVID-19 and the need to avert future loss of human life and financial disaster, and it is incumbent on governments to establish these protections from the outset of their activity.

Final Thoughts

COVID-19 has been a decision-making failure of global proportions, a result of poor underlying data and analytics strategies, practices, and capabilities. The path forward requires refined approaches to data collection, along with new analytical and data sharing paradigms, to ensure that such loss of human life and economic devastation does not recur, possibly as soon as this coming winter. While the undertaking is monumental, there is the potential to save lives through a gradual, iterative approach based on this form of comprehensive global healthcare data strategy.

About the Author

Simon Brightman is an Adjunct Professor at the University of Ottawa, Telfer School of Management. His research interests include global strategy, data analytics and Artificial Intelligence driven decision making in government, international organizations, as well as private corporations.

Simon has lectured and taught technology commercialization, innovation, product validation and product development for over 12 years in private, public and academic organizations. He has served in various executive roles leading data analytics product technology firms in the USA and Canada, including Head of Data Strategy & Open Banking at TransUnion (Credit Reporting Agency), Vice President at Panvista Analytics (a US wearable geo-location analytics firm), Head of Agile & Product Management at (Nasdaq traded, leading global loyalty e-Commerce provider to Airlines & Hotels), Senior Manager at KPMG (program management), presently a Senior Partner with Global Data Insights (data analytics advisory & investments across Fintech and other xTech domains).

Simon holds a BSc. in Computer Software & Business Management, an MBA in Technology Management as well as a Master's in International Relations from Cambridge University. Connect with Simon on LinkedIn
