We are navigating between several health information worlds: health info, medical info and clinical info. The standards for truth vary with the risk, from the buyer-beware world of the Web and open information to the double-blind clinical trial. Is it feasible to apply the rules of clinical trials to online health information and the recommendations it contains? Where will that leave 23andMe and all the other health tech entrepreneurs?
The stories about regulating these online services raise multiple questions about how to move forward with Direct-to-Consumer (DTC) testing, or at least how to standardize it. In this second post in a three-part series, let’s look at some of the questions around regulating this space before we discuss a possible path forward in part three.
Read Part 1 of this series: The Social Conquest of Medicine: 23andMe and Conflict
If you’ve been following the 23andMe regulatory saga, there’s a new character: the Federal Trade Commission (FTC), which says that 23andMe and GeneLink have been making unsubstantiated claims.
“The settlement, which would only take place after a 30-day public comment period and a final decision from the FTC, would keep GeneLink and its former subsidiary, Forum International, from making any future claims that their products can impact the course of disease unless such claims are supported by two double-blind, randomized control trials—the gold standard of medicine.”
I hope the FTC can see that there’s a difference between health info, medical info and clinical info, and I hope others will join me in commenting on the case: this is not the bar we want to set for what are largely health claims.
Where do we go from here? Should information be regulated like a medical device or treatment?
We’ll get to unsubstantiated claims in a moment, but first, let’s talk about the broader picture of citizen science, because this is not just about a company and a product: 23andMe is also a citizen science network.
Do we or do we not regulate citizen science when it’s about us?
As highlighted in Part 1 of this series, Tim Spector and Barbara Prainsack said it best about 23andMe:
“the one characteristic that set it (23andMe) apart from other services offering genetic testing beyond the clinic is that they have tried to include “citizen science” elements and encourage wide data sharing. This has played a relatively marginal role in the wider discussion. We are not uncritically applauding the way that they have been doing this, but it adds an important new dimension and direction for science.”
I think they are right on both counts: this is an important new direction for science. Here’s the second half of the quote, and a reason for fans of citizen science to be concerned:
“Regulators, in contrast, still operate under the assumption that there are those who produce knowledge (traditional experts) and those who receive knowledge (patients/consumers), but this is not an accurate depiction of platforms such as 23andMe, and many others.”
Again, we are navigating difficult terrain between two worlds: the crowd and the clinician, the Web and the clinical trial. Now that we’re all connected to innumerable sources of information, and we all have a chance to participate in collecting some of that information, it won’t be so easy to separate science from the people doing it in the years ahead. That might be well and good when we’re collecting and analyzing info on galaxies as citizen analysts, but what do we do when the research is about ourselves? Don’t we each have the right to do research about ourselves and share that information? Don’t we have a right to this kind of analysis?
It would be a mistake to prevent or discourage people from doing research on themselves and their genomes. It is a core piece of who we are. I don’t need a double-blind study to substantiate the claims on the Internet; I assume they are false and go from there. Still, there’s little doubt we need to be careful when we get into medical information that is prescriptive.
Learning as we go, what level of rigor do we need to simply supply information?
John Wilbanks wonders about the difference between stopping harm and doing good. He writes: “That “traditional” submission to the FDA would be of a very specific kind of analysis based on randomized controlled trials. It is designed to keep bad things from happening to people, not to make sure good things happen to people. As one of my favorite papers lays out, parachutes would not receive FDA approval as a gravity-resisting device.”
He goes on to say, “Modern tech culture doesn’t work that way. Bayes’ rule is about probabilities that evolve as our system does. It’s a different way of knowing that you know something, and it’s one in which there is far more tolerance for uncertainty than the FDA is accustomed to.”
It’s more than “modern tech culture”: complex systems (modern tech is one), in which many elements are interacting, require a Bayesian approach. Because of that complexity, we simply can’t model accurately at the start. The same may be true for truly personalized medicine. We can’t run a randomized, controlled trial on every set of variables, because we have to look at them all in concert and in each individual.
In the tech world, user experience design, agile methods and other iterative development processes have been found to be the most effective. Designers know that “The frequency and timeliness of feedback is what distinguishes healthy projects from unhealthy ones.” Agile, Bayesian processes are designed to test assumptions by opening those assumptions up to feedback from users. If you cut users out of the feedback loop, how does anyone innovate in this paradigm? Bayes’ rule requires new information to improve the model.
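As a minimal sketch of what that Bayesian loop looks like, here is Bayes’ rule in odds form, applied to a risk estimate that is revised as each new piece of evidence arrives. The baseline risk and likelihood ratios are assumptions for illustration, not anyone’s actual model:

```python
def update_risk(prior_prob: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = likelihood ratio * prior odds."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1 + posterior_odds)

risk = 0.10  # assumed 10% baseline risk, purely for illustration
for lr in (1.3, 0.8, 1.1):  # hypothetical likelihood ratios from new evidence
    risk = update_risk(risk, lr)
    print(f"after evidence (LR={lr}): risk = {risk:.1%}")
```

Each new observation nudges the estimate. Cut off the flow of observations and the estimate stops improving, which is exactly the point about keeping users in the loop.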
The key will be in keeping the risks low to learn as we go.
We may need to develop a more agile approach to understanding how people use and gain value from personalized health data, but we must keep them in the loop and the process.
Standardization and Peer Review
Another path might be to standardize the language around statistics in health data in a more regimented way, reviewed by peers.
We could, in an open process, establish an expert peer-review system to determine which claims are accurate. Although clinical trials can influence standards of care, they ultimately rely on rational people to sift the evidence. Peer review and attribution is the original middle ground between organizational stamps of approval and “common knowledge.”
Could that hint at a middle ground? We, the consumers, are an integral part of this loop now that data can be ubiquitously captured about us, and we’re willing to have it captured to benefit the communities to which we belong, as we’ve seen with the evolution of the web and user-generated content. It’s part of the secret sauce.
We can and should pursue these new directions in bottom-up research, but we have to do it in a way that fits with the risks and potential for abuse. Part of that will come with transparency about how data is used and how analysis is performed, but another part comes from standardizing the language in which we represent statistics.
A New York Times article, in which a reporter received three different genetic screens, including one from 23andMe, shows that even the same result can be described in multiple different ways.
The article describes the scenario:
“In the case of Type 2 diabetes, inconsistencies on a semantic level masked similarities in the numbers. G.T.L. said my risk was ‘medium’ at 10.3 percent, but 23andMe said my risk was ‘decreased’ at 15.7 percent. In fact, both companies had calculated my odds to be roughly three-quarters of the average, but they used slightly different averages — and very different words — to interpret the numbers. In isolation, the first would have left me worried; the second, relieved.”
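Back-calculating from those figures makes the trick visible: the same ratio reads very differently against different baselines. The baselines below (roughly 13.7% and 20.9%) are inferred from the article’s numbers, for illustration only:

```python
# Baselines inferred from the quoted figures, for illustration only.
results = {"G.T.L.": (0.103, 0.137), "23andMe": (0.157, 0.209)}

for company, (absolute, baseline) in results.items():
    ratio = absolute / baseline
    print(f"{company}: {absolute:.1%} absolute / {baseline:.1%} baseline = {ratio:.2f}x average")
```

Both work out to roughly 0.75x the average; only the words attached to them differ.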
Medical ethicists and other experts have a different kind of worry about results like these: a lack of industry standards for weighing risk factors and defining terminology.
“The ‘risk is in the eye of the beholder’ standard is not going to work,” said Arthur L. Caplan, director of medical ethics at the New York University Langone Medical Center. “We need to get some kind of agreement on what is high risk, medium risk and low risk.”
Problems arise when we pull people into research that they don’t understand, and even statistical experts mix up what the statistics might mean. Framing and other cognitive biases come into play. There’s the famous example of physicians making differing recommendations depending on whether they are shown a 5% mortality rate or a 95% survival rate (they are the same thing). We can’t regulate away our own biases, but we can standardize language in a way that makes our biases more consistent.
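A hedged sketch of what standardized language could look like: one shared mapping from relative risk to a label, applied by everyone. The cutoffs here are placeholders, not proposed standards; setting them is exactly the kind of expert agreement Caplan calls for:

```python
def risk_label(relative_risk: float) -> str:
    # Placeholder thresholds -- the point is a SHARED mapping, not these cutoffs.
    if relative_risk < 0.8:
        return "decreased"
    if relative_risk <= 1.2:
        return "average"
    return "increased"

# Both quoted diabetes results (~0.75x the average) now get the same word:
for rr in (0.103 / 0.137, 0.157 / 0.209):
    print(f"{rr:.2f}x average -> {risk_label(rr)}")
```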
As medical economist Jane Sarasohn-Kahn recently told me, “Risk is a part of life, and we haven’t had an honest conversation about risk. We need to be able to treat people like grown-ups (when it comes to risk and uncertainty in medicine).”
A large part of the task ahead will be determining what information, and what resulting health decisions, actually pose a health risk, rather than relying on tacit assumptions about how consumers make decisions.
When do these tests create more harm than good?
According to David Dobbs,
“The FDA says it’s ‘concerned about the public health consequences of inaccurate results.’ I’ve not heard of any harm come to 23andme customers from such inaccuracies — and even if such cases exist, inaccurate results will inevitably occur (here’s a story of what seems to be one particularly unfortunate such case), and findings of medical relevance should be confirmed before people act on them. But do those apparently rare errors inflict more harm than the sorts of errors our medical system makes every day? Do they inflict more harm than a lack of information about important risk genes inflicts? Does any harm done outweigh the great good that people gain from having inexpensive access to lots of information about medical risk?”
My sense is that harm won’t come from fear on its own; the problems will come when companies use genetic information to sell products we don’t need: medications, procedures and other dangerous and expensive products sold out of fear. (This, of course, is nothing new in advertising, but I digress.) So, as we come up with solutions, let’s focus on the selling of fear-based products and other poor medical decisions as the problem.
Which data can legitimately be used for sales, which uses amount to selling people products they don’t need, and how do we differentiate between the two?
Does the FDA have a role in regulating data analysis as part of medical devices?
Let’s recall that the FDA hasn’t stopped SNP (single nucleotide polymorphism, one type of genetic difference) screening, which, ironically, is what most might consider the medical device part of the process (you send a sample, you get a result). They’ve only put an end, apparently, to combining the genetic test with the analysis (you can still take your results elsewhere to be analyzed, and this is where the statistical wordplay comes in). They haven’t stopped the science, just the confusing, opaque analysis that can go with it.
Even if the two are potentially more valuable together, perhaps that’s not such a bad thing. If the analysis comes bundled with the genetic test, it can appear to be the only possible reading of the information. But if the two are separated, it becomes a little clearer that there are many ways to analyze such data.
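As a sketch of that separation in practice: the raw file is just data, and any number of tools can interpret it. This assumes the tab-separated rsid / chromosome / position / genotype layout these raw-data downloads commonly use, with “#” comment lines; the file name and rsID below are placeholders:

```python
import csv

def load_genotypes(path: str) -> dict:
    """Map rsID -> genotype from a raw-data download, skipping comment lines."""
    genotypes = {}
    with open(path) as f:
        for row in csv.reader(f, delimiter="\t"):
            if not row or row[0].startswith("#"):
                continue
            rsid, _chromosome, _position, genotype = row
            genotypes[rsid] = genotype
    return genotypes

genotypes = load_genotypes("raw_genome_data.txt")  # placeholder file name
print(genotypes.get("rs0000000", "not tested"))    # placeholder rsID
```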
Let’s also recall that the problem is with being prescriptive, not necessarily diagnostic. The issue, thus far, is overcoming genetic risks by taking specific actions: THAT’s the part that’s unsubstantiated, not knowing the risks.
My hope is that the FDA and the FTC will continue to allow access to the test, the analysis and the statistics, in standardized language, in a way that’s open to more public scrutiny. Without independent verification, it may very well be hazardous to allow companies to differentiate on the analysis and make wild, unsubstantiated claims, even when they seem like common sense.
We need to help people to understand statistics and be open to what they might mean, but we shouldn’t try to protect people from the reality of risk. It’s a fine, difficult line.
What constitutes a diagnosis, or a claim about a condition?
The regulators seem to be saying that a scale can give you information that you’re overweight, but if it says you have a 30% increased risk of cardiovascular disease and might want to consider exercise, that needs to be substantiated. As crazy and obvious as that sounds, it shouldn’t be a difficult bar to meet.
People receive a lot of guidance on a wide range of health and non-health related issues, and on much of the fuzzy space in between, and that’s OK: buyer beware. Let’s be aware of what information is provided, in a very open manner, and stop false claims, but let people do their tests and gather the information.
The time is now for some clear guidance from the FDA on these questions, and I hope it’s on the side of allowing analysis and making claims, but not halting tests.
Does this mean algorithms should be regulated for effectiveness?
Where might it end if an algorithm must substantiate a recommendation based on personal information? Just because an algorithm is written, do we have to substantiate it? Or could we just license a company to provide this kind of service, much like we license physicians to practice?
Between the two worlds of buyer-beware health information online, including supplements, and the world of double-blind clinical trials, we’ve traditionally left the info in the middle up to expert opinion: physicians, nurses, clinical review boards and the legal system. These have developed our standards of care based on “reasonable people.” It gets tricky when algorithms become the experts.
There’s a reason second opinions are recommended, and a reason they call what we get from doctors “opinions.” They are often more art than science. Medicine is called a “practice” and an “art,” right? We can’t certify that a physician’s algorithms (the ones in their heads, gleaned from med school and clinical practice; yes, practice) are 100% accurate (some have argued that 80% is inaccurate), yet we test them every 10 years to keep them licensed?
This has some very real impacts as machine learning comes to the forefront in medicine: how can a computer legally make a recommendation when machine learning is involved, as with IBM’s Watson?
This is a deep, dark hole of regulation if we go down that path. The practice of medicine will never be 100%, and we need to understand what is good enough and verifiable enough given the risks and available info.
Can we separate data, information and statistics from clinical diagnostics? Should there be a new category?
Diagnostic indicators are by no means free from debate. Just see the current debate on the accuracy and validity of PSA screening for prostate cancer.
Data is not a treatment. Having an increased likelihood of a disease is not a diagnosis. Again, standard language, transparency and a minimum, peer-reviewed bar for claims seem to be in order, as well as a deeper reliance on experts like genetic counselors. Genetic counseling may be the hot new career: working with data scientists, genetic counselors can develop a deep understanding of what the standard of care should be and tease out what’s relevant and what’s not.
Managing the risks and benefits of knowing
Do I have a right to know that I have a single marker for a 10% higher risk of diabetes? Is it meaningless? Sure, but why wouldn’t I be able to have that information, actionable or not? Why not empower patients to act, or to form communities to solve the problem? Without better guidance, we may be killing the possibility of a “citizen science Framingham” (Framingham Heart Study) and all the benefits that could bring.
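The arithmetic shows why a single marker like that carries so little weight. Assuming, purely for illustration, a 10% baseline risk:

```python
baseline = 0.10           # assumed baseline risk, for illustration only
relative_increase = 0.10  # the marker's "10% higher risk"

absolute = baseline * (1 + relative_increase)
print(f"risk moves from {baseline:.0%} to {absolute:.0%}")  # 10% -> 11%
```

A one-percentage-point shift is information, not a diagnosis; it is hard to see why access to it needs clinical-trial-grade gatekeeping.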
Risks: Of course, there are some real, negative effects that sharing this information could cause. From FastCompany: “The confines of the Genetic Information Nondiscrimination Act (GINA) don’t yet extend to long-term-care insurance. Several states have banned the discriminatory use of genetic information in all areas, but there is not yet any sweeping federal protection.”
Let’s start with protecting citizens from discrimination, then move on to questionable claims.
How do we determine what is health-related?
According to Prusak and Prainsack:
“Something can be health-related without being medical (e.g., the fact that I run in the morning because it makes me feel better), and ‘medical’ is a wider term than ‘clinical.’ 23andMe’s way of operating is also so distressing for regulators because the results given to customers cannot be neatly teased apart into health-related and non-health-related information, contrary to what 23andMe is trying to do now.”
The problem with health-related data, medical data and clinical data is not the data itself, but the decisions it drives. Clinical decisions are riskier than medical decisions, which are riskier than health decisions.
The difference between taking a vitamin or not might be inconsequential, clinically, but having a BRCA mutation might drive a decision for surgery. We should continue to look at the specific case and not overreach on what’s really relevant. My sense is that learning of a 20% increase in cardiovascular risk should be on par with deciding to take fish oil supplements. No harm, no foul.
Do corporations have more of a right to our data than we do?
You can still send your saliva to 23andMe and they’ll send you back the results, but what are they able to do with that information? For all intents and purposes, they can do the analysis and more, and mix it with other data, but you can’t have access to the same information that they have (at least, not using their analytic tools).
Regulatory agencies (generally) don’t deal in unintended consequences. Regulators control what they can to fulfill their duties, and the rest of us have to live with the consequences. One unintended consequence of the FDA’s action is that companies can now analyze our data in ways that we can’t. Only 11 states have laws prohibiting “genetic theft.” When will the laws catch up?
How do we reconcile “nothing about us, without us” with market research in the genomic era?
Summary: How do we establish and continue to navigate what risk is tolerable?
Requiring a double-blind trial to make a claim for most informational products seems like an incredibly steep hill to climb, one that will drive a lot of innovation out of the market. We don’t want to go down a road where apps face the same level of scrutiny (and cost) as drug development.
But, by the same token, do we want people selling junk based on unsubstantiated claims that are personalized, therefore having the shine of science and validity? How do we find that middle area?
We shouldn’t try to regulate probabilities, but we should try to be consistent, evidence-based and open in how the language of life, death and disease is used, not only for consumers but for physicians, patients, researchers and everyone else.
Following the Hippocratic Oath’s “first, do no harm,” let’s look at what health harm may actually have been done by unsubstantiated claims about personalized risk.
We can’t regulate uncertainty. While companies have the primary role in ensuring that their products do what they say they will, customers have a role in making all products better, particularly complex products whose consequences we sometimes can’t predict.
Product managers who follow lean and agile methodologies and work to improve UX know that complex products evolve best when feedback is built into the design process. Many regulations were written before this era of constant feedback, ubiquitous person-to-person communication and networks. It’s time to upgrade for more complex products — and that will be the topic of the next post.