Yesterday I credited the Ontario government with starting to include more detailed summaries of COVID-19 cases in Long Term Care homes in its daily updates. I also raised concerns about remaining gaps in the government’s reporting.
I’ve been asked to explain one concern in particular:
When the number of deaths among residents in a particular LTC home is fewer than 5, the government is reporting “<5” rather than the actual count. The government is wrong in thinking that privacy considerations prevent the release of this information.
Protecting an individual’s privacy while sharing sensitive information (e.g., their current health status, history of illness, or use of medication) is challenging and requires expertise. “Data suppression” is just one technique. Let’s take a simple example:
Bob is known to be a member of a small group (e.g. he’s one of only four 20-30 yo males living on Avenue A). To disclose publicly that “two 20-30 yo males living on Avenue A are HIV-positive” amounts to revealing that “there’s a fifty percent chance that Bob is HIV-positive.” This disclosure is generally thought to violate Bob’s privacy. In this case, the fact that “two 20-30 yo males living on Avenue A are HIV-positive” would be suppressed. On the other hand, if Bob is known to be a member of a large group (e.g. he’s one of five thousand 18-70 yo males living on Avenue B), the disclosure that “two 18-70 yo males living on Avenue B are HIV-positive” might not be considered a violation of Bob’s privacy.
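The arithmetic behind the Bob example can be sketched in a few lines of Python. This is only an illustration: the function name `inferred_risk` is made up for this post, and it assumes every member of the group is equally likely to be one of the positives.

```python
def inferred_risk(positives: int, group_size: int) -> float:
    """Chance that one known member of the group is among the positives,
    assuming each member is equally likely to be one (an assumption)."""
    if group_size <= 0 or not 0 <= positives <= group_size:
        raise ValueError("need 0 <= positives <= group_size and group_size > 0")
    return positives / group_size


# Avenue A: two positives among four 20-30 yo males.
print(inferred_risk(2, 4))     # 0.5
# Avenue B: two positives among five thousand 18-70 yo males.
print(inferred_risk(2, 5000))  # 0.0004
```

The same count of two reveals a great deal about Bob in the small group and almost nothing in the large one, which is why the size of the group matters more than the count itself.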
“To suppress, or not to suppress” data turns partly on the threshold n – the minimum size of a group required before sensitive information about the group may be disclosed without violating the privacy of the group’s members.
There’s no one-size-fits-all value for the threshold n; it depends on several considerations, most obviously the sensitivity of the information being disclosed. For instance, in its reporting on the National Household Survey, Statistics Canada determined that “no characteristics or tabulated data are to be released for areas below a population size of 40.”
The smallest value of threshold n that I’ve encountered is n = 5. In this case, when a group has fewer than five members, data about that group is suppressed, with a note that “n < 5” or simply “< 5”.
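A minimal sketch of that rule, assuming the only input to the decision is the size of the group (real disclosure-control rules weigh more than this). The function name `publish` and the wording of the suppression note are hypothetical.

```python
def publish(value: int, group_size: int, n: int = 5) -> str:
    """Return the value to publish for a group, or a suppression note
    when the group has fewer than n members."""
    if group_size < n:
        return f"n < {n}"  # suppressed: the group itself is too small
    return str(value)


print(publish(2, group_size=4))           # n < 5
print(publish(2, group_size=5000))        # 2
print(publish(7, group_size=39, n=40))    # n < 40
```

Note that the test is on `group_size`, not on `value`: a small number reported about a large group passes through untouched.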
In its reporting of COVID-19 in Ontario’s Long Term Care homes, the provincial government is misapplying the threshold n concept. To illustrate, let’s look at two entries in yesterday’s daily update:
| LTC Home | Beds | Confirmed Resident Cases |
|---|---|---|
| Albright Gardens Homes, Incorporated | 231 | <5 |
A few points:
- The entry “< 5” for the number of Confirmed Resident Cases in Albright Gardens is unjustified. For the government to disclose, for example, that there were 2 cases among a group of residents occupying up to 231 beds in Albright Gardens, would generally not be considered a violation of their privacy.
- If there had been, for instance, fewer than 5 beds in Albright Gardens, then disclosing that there were 2 cases would be considered a violation of the residents’ privacy, and the count should be suppressed.
- The disclosure that 50 out of a total of 60 residents in Stoneridge Manor are confirmed cases represents a greater violation of their privacy, since it implies that any given resident is more likely than not to be a case.
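The difference between the two readings of the threshold can be sketched as follows. Both function names are hypothetical, the count of 2 is the made-up figure from the first point above, and using the number of beds as a stand-in for the size of the resident group is an assumption.

```python
def count_based(cases: int, threshold: int = 5) -> str:
    """What the daily update appears to do: hide any count below the threshold."""
    return f"<{threshold}" if cases < threshold else str(cases)


def group_size_based(cases: int, group_size: int, threshold: int = 5) -> str:
    """Threshold applied to the size of the group the count describes."""
    return "suppressed" if group_size < threshold else str(cases)


# Albright Gardens: 231 beds, hypothetical count of 2 cases.
print(count_based(2))            # <5
print(group_size_based(2, 231))  # 2

# A hypothetical home with only 4 beds and the same count.
print(group_size_based(2, 4))    # suppressed
```

Neither rule flags the Stoneridge Manor figure of 50 out of 60, which is why a size threshold alone does not settle what is safe to publish.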
These points illustrate a few of the challenges the Ontario government faces as it strives for greater transparency in its public disclosures of the impact of COVID-19 in the province.