CEO SUMMARY: On June 4, the FDA released results of an evaluation of some of the 20 tests offered for sale in this country to identify antibodies to SARS-CoV-2. A quality control expert in clinical labs called the antibody test analysis deeply flawed, in part because of the study’s design. The FDA evaluated serology assays using 110 samples from patients, including 80 samples expected to be negative, the expert said. But the FDA may not know if any of those 80 samples are from patients who are immunocompromised, or who may have been infected with a similar virus, he added.
Federal scientists are using a deeply flawed methodology to evaluate serology assays for the novel coronavirus, according to an expert in clinical lab quality control.
If the analysis is flawed, this development would be the latest in a series of poor decisions and counterproductive directives that federal agencies have made regarding diagnostic testing since the COVID-19 pandemic began in the United States in late January.
A flawed analysis means that some of the serological COVID-19 tests that have emergency use authorizations (EUAs) from the Food and Drug Administration (FDA), and that clinical labs currently use, could be withdrawn as unreliable for clinical purposes.
On May 5, the FDA announced that it would conduct an independent evaluation of antibody assays for SARS-CoV-2 that have EUAs. The FDA and other federal health agencies are doing the evaluations to determine the accuracy of those tests, 20 of which are currently for sale nationwide after the FDA issued EUAs for them without review.
The goal of the analysis of the 20 serology or antibody tests is to determine if each assay or test kit will identify SARS-CoV-2 antibodies when those antibodies are present in a patient’s blood, the agency said. The analysis also will determine if the tests correctly give no signal when those antibodies are not present, the FDA added.
COVID-19 Test Reviews
On June 4, the agency released the first findings from its analysis of several of the 20 COVID-19 serology tests from what it said is an independent performance validation study. The serology tests were evaluated at the Frederick National Laboratory for Cancer Research (FNLCR), a federal research and development center affiliated with the National Cancer Institute (NCI), a division of the National Institutes of Health.
In an interview with The Dark Report, a clinical lab professional and expert in quality control processes for clinical laboratory testing questioned the methodology of the analysis. “I am concerned that the foundations of this study are so flawed that use of the results will have a very high risk of poor regulatory decisions,” said Michael A. Noble, MD, Chair of the Clinical Microbiology Proficiency Testing Program, and of the Program Office for Laboratory Quality Management, in the Department of Pathology and Laboratory Medicine at the University of British Columbia, in Vancouver.
In its announcement on May 5, the NCI said its researchers would use a validation set of 110 blood samples for each serology test being assessed. Of those 110 samples, 30 would be from individuals who had confirmed SARS-CoV-2 infections, and 80 would be from people whose specimens were collected before the pandemic began and who therefore would not have been infected with SARS-CoV-2, the virus that causes the COVID-19 illness.
The samples would be used to test for the presence of IgG and IgM antibodies, the NCI said.
Noble cited a number of concerns about the design of the evaluation study. “This analysis is sending off all sorts of danger signals, primarily because the NCI’s validation set is badly flawed,” he said. One significant problem stems from the use of the 110 validation samples. “Of those 110 samples, they are hoping that 80 of them (or 73%), will be negative,” he explained. “But they probably don’t know if any one of those 80 samples is from a patient who had a cold at the time of sample collection and thus could have antibodies to a beta-coronavirus.
Unexpected True Positives?
“Let’s assume that of the 80 patient samples, five samples (or 6%) were collected from patients who had colds and thus have antibodies. How would the researchers know if the results from those samples were false positives or unexpected true positives?” Noble asked.
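Noble's hypothetical can be put into numbers. The short Python sketch below is an illustration only: it assumes an assay that reacts to every one of the five cross-reactive samples and to nothing else, and the function name is ours, not the FDA's or NCI's.

```python
def apparent_specificity(n_negative_panel, n_reactive):
    """Share of the presumed-negative panel that the assay reports as negative."""
    return (n_negative_panel - n_reactive) / n_negative_panel


# Noble's hypothetical: 5 of the 80 presumed-negative samples carry
# antibodies from a cold. Assumption for illustration: the assay under
# review reacts to all 5 of them and to no other sample in the panel.
panel_size = 80
cross_reactive = 5

print(f"Share of panel affected: {cross_reactive / panel_size:.2%}")
print(f"Apparent specificity: {apparent_specificity(panel_size, cross_reactive):.2%}")
```

Under those assumptions, the assay would be scored as if it had produced five false positives, even though each signal reflects antibodies that are actually present in the sample.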
“Another problem is that only 30 (27%) of the 110 samples are from what the FDA calls confirmed cases, but any number of variables could affect the likelihood of those patients having antibodies,” he commented. “If any of those patients are over 80 years old and from nursing homes, then the odds are likely that they are poor antibody producers.
“Similarly, if any of those 30 samples are from patients who are obese and diabetic, then they tend also to be poor antibody producers,” he noted. “If the samples are from people with autoimmune disorders—such as lupus, rheumatoid arthritis, or who have had a transplant—then those patients likely are on therapy to suppress their immune systems, which would affect the analysis. Also, if the blood came from patients diagnosed with COVID-19 a month earlier, then those patients had a response but are starting to lose the signs of that response.”
More Questions to Ask
Noble then identified two other problems with the FDA’s positive group. “First, this group is so small that the risk of bias is huge. Second, it seems appropriate to ask that since the NCI is doing this study, are the sera being used from patients the NCI has previously tested? If so, then some of them—and perhaps all—are, by definition, immune-compromised, either by illness or treatment. Such a population is hardly one upon which we can make predictions for the general population.”
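One way to see why a group of 30 positives is small is to look at the statistical uncertainty around any sensitivity estimate drawn from it. The sketch below uses the Wilson score interval, a standard method for binomial proportions. It is our illustration, not a calculation the FDA or NCI has said it performs, and the counts of detected samples are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical outcomes for an assay run against the 30 confirmed positives.
for detected in (30, 27, 24):
    low, high = wilson_interval(detected, 30)
    print(f"{detected}/30 detected: 95% interval {low:.1%} to {high:.1%}")
```

Even if an assay detected all 30 positive samples, the lower end of the interval would still sit at about 89%, so a panel of this size cannot separate an excellent test from a merely adequate one. This calculation covers only sampling error. It does not address the patient-selection concerns Noble raised.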
Other problems that Noble cited about the NCI’s methodology would seem obvious to most clinical lab professionals involved in quality control. “One of the first questions I’d ask is why did the NCI decide to bias the validation panel toward getting a negative result by having more than double the number of specimens from people who probably were not exposed to the virus?” he said. “More important, have they agreed that the 110 samples should all be blinded for the laboratory doing the evaluation so that the researchers have no idea what to expect?
“Also, have the researchers created the sample sets so that the laboratory can get as many as four copies of the same serum sample from the same patient (and blinded, of course) to ensure that the same sample is read consistently?” he asked.
Noble posed other questions about the researchers’ plan to ensure accuracy. In May, NCI said that every sample in the validation panel would be tested by at least two separate labs, but it did not name the labs or describe the types of testing those labs do.
As a quality-control expert, Noble has often been critical of the FDA’s efforts to evaluate molecular and serological tests for the coronavirus, as well as the speed at which the agency has allowed COVID-19 tests into the market for patient care. In previous commentary for The Dark Report, Noble emphasized that quality control should be done slowly and methodically, but that the FDA has proceeded too quickly in an effort to get tests onto the market.
“This evaluation of the COVID-19 serology tests is another good example of choosing between doing it fast or doing it right,” he noted. (See Dark Daily e-briefing, “Chinese Firm to Replace Clinical Laboratory Test Kits After Spanish Health Authorities Report Tests from China’s Shenzen Bioeasy Were Only 30% Accurate,” April 3, 2020.)
Involve 10 Labs in Study
“Considering the importance of this study of the performance of COVID-19 serology tests that already have an FDA EUA, why would these federal agencies accept an answer as valid if there is agreement between only two laboratories?” he asked. “Given the size and significance of the NCI, and the fact that the researchers call this study a federal effort, why not require agreement among something closer to 10 laboratories?
“If a sample is found to have discrepant readings, then at a minimum, laboratories would want to know if concordance is one-out-of-two or nine-out-of-10,” he said. “A larger number of labs in concordance would provide more confidence about the COVID-19 serology test undergoing review.
“Also, I would expect the federal agencies to use a variety of laboratories to confirm the samples in the validation panel,” he explained. “Because these are all new COVID-19 tests that could be offered in a large number of laboratories nationwide, one would hope NCI would have testing done in a variety of laboratories.
Define Labs by Size, Type
“I understand that federal officials may not want to identify the laboratories, but at least they should be able to define them by size and type,” he said. “Are they research laboratories versus clinical laboratories? Or are they community-based private labs, university, or government labs?” he asked.
When the NCI explained its methods last month, it discussed sensitivity and specificity of the coronavirus test results in terms of false-positive and false-negative results. The NCI wrote, “False-negative results could lead people to believe that they haven’t been infected when they actually have, potentially preventing them from returning to work, school, or other activities.
“Also, false-positive results would cause people to think they have been infected and have developed an immune response, when they haven’t,” NCI added. “Incorrect results could also provide a skewed picture of how many people have been infected and the true death rate.”
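For readers who want the definitions in concrete form, the sketch below shows how sensitivity and specificity are computed from a validation panel of this size. The result counts are hypothetical and chosen only for illustration. They do not describe any assay in the FDA's evaluation.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of known-positive samples that the assay reports as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of known-negative samples that the assay reports as negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical results for one assay on a 110-sample panel.
tp, fn = 28, 2  # the 30 samples from confirmed infections
tn, fp = 78, 2  # the 80 samples collected before the pandemic

print(f"Sensitivity: {sensitivity(tp, fn):.1%}")  # false negatives lower this
print(f"Specificity: {specificity(tn, fp):.1%}")  # false positives lower this
```

Both figures depend on every sample's positive or negative label being correct, which is the assumption Noble questions.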
The NCI’s explanation about the general understanding of sensitivity and specificity is correct, Noble commented. “And, to their credit, the FDA has recently identified a so-called ‘gold standard’ for this testing in the sense that they are using the results of the ELISA (pan-Ig, IgG, and IgM) assay from the federal Centers for Disease Control and Prevention (CDC) and an IgG receptor binding domain (RBD) ELISA that the Krammer Laboratory developed,” he added. “In my opinion, that’s a good development.
“But then problems arise because of the use of the negative samples, and two of those samples tested positive to the gold standard,” he explained. “That fact proves my point as to why they are likely to find antibodies to other coronaviruses.”
Reactivity in Two Samples
When it started publishing its serology test evaluations this month, the FDA said it noted reactivity in two samples at the FNLCR lab. Sample C0063 showed reactivity in the pan-Ig CDC spike ELISA and sample C0087 showed reactivity in the IgG RBD ELISA, Noble reported. “In 80 samples of supposed-to-be-hard negatives, this result represents a failure rate of 2.5%,” he commented. “That result seems to be a potentially big deal.
“It’s possible to characterize these two sample reactivity results under the heading of ‘no test is perfect,’ but politicians and public health officials have put far too much emphasis and pressure on these test results,” he added. “The consequence of a false-positive result can mean a patient would need to be hospitalized, which could lead to exposure to nosocomial infections.
“Or, a false positive also could mean that a patient would lose days or weeks of work and maybe be required to be quarantined or isolated for a period of time. If the FDA designed this study with more precision, we might be able to put these false positives into perspective,” Noble commented. “But the study’s design flaws leave us with more questions than answers.
Antibodies to a Coronavirus
“Just because a person has had symptoms of the new coronavirus, does not mean that person is capable of making antibodies,” he noted. “And, just because that person was tested before the pandemic does not mean that individual did not have antibodies to a coronavirus related to but distinct from SARS-CoV-2.”
In conclusion, Noble offered some standards that quality control experts might apply when verifying or validating antibody testing for the new coronavirus. “Such studies should require planning that includes sound definitions of what constitutes a positive versus a negative sample,” he recommended. “Also, these studies should include a spectrum of samples, including a range of patient ages and conditions, and the studies should be designed in a manner so that the full range of results are available and appropriate for interpretation.
“On its surface, the FDA’s study with NCI does not meet any of those requirements or expectations,” continued Noble. “I would argue that considering the import and influence this study could have, an independent body of testing specialists should be asked to study and comment on the design. Also, those experts should have access to the patient information associated with the samples and have the authority to comment on and critique the results.”
‘We Want Results Tomorrow’
Finally, Noble offered an explanation about why the FDA and NCI proceeded as they have with this analysis. “With this study, they’ve tried to proceed as quickly as possible,” speculated Noble. “It’s as if someone said, ‘We need this done now, and we want the results tomorrow.’
“Designing and implementing a study in this manner, it’s as if the goal of federal officials is to complete the analysis as quickly as possible and worry about the details later,” he concluded. “From day one of this pandemic, that’s been the whole story of test analysis at the FDA for the new coronavirus.”
These observations about the design of the FDA’s assessment program being conducted at the FNLCR lab show that the entire clinical laboratory profession would benefit from more transparency and more engagement in this process.
Contact Michael Noble, MD, at firstname.lastname@example.org or 604-827-1337.