Why in Obesity Treatment Averages Are Not Good Enough
Tuesday, July 20, 2010
One of the topics that I have often thought about (especially in light of our seeming inability to develop zero-risk obesity drugs) is the problem of averages. Our entire medical philosophy of “evidence-based” medicine seems built on the “Gaussian” assumption that averages can reflect the true benefit (or risk) of a drug, when in real life (or medical practice) there is no such thing as the truly average patient.
Clearly, a drug that works in most cases may be entirely ineffective (or have rare but serious adverse effects) in a given patient. Similarly, a drug that is ineffective for most patients can potentially work miracles in a small set of individuals.
For those of you who like analogies, imagine wanting to treat every case of fever with penicillin. Yes, if you run your study during an epidemic of streptococcal infections, more people with fevers may respond than during other times. But even then you will need large numbers to cut through the “noise”, as many fevers will spontaneously resolve or continue unabated unto death (which is why we need a “control” group). Chances are, we may well find that treating all fevers with penicillin is not much better than placebo, and we will likely demonstrate quite nicely that simply taking penicillin for fever has unacceptable individual risks (including deaths from anaphylactic shock). Clearly, penicillin should not be on the market given its potential for “abuse” by anyone who has a fever.
But as we take a closer look at the data we may find that while penicillin is not a great drug for everyone who comes down with a fever, there may be a subset of patients (strangely those who appear to have bacterial infections), in which penicillin does seem to sometimes work. Yes, some of these patients may also have severe anaphylactic responses, but on “average”, people with fever due to bacterial infections do seem to get better faster than people with other causes of fever.
As we look even more closely at the data it seems that even among those with bacterial infections not everyone is “average” – fever patients infected with a certain type of bacteria (interestingly, those that stain positively with a certain dye) seem to respond well (albeit still with occasional anaphylactic responses), while those infected by non-staining bacteria (and even some of those that stain positive) seem entirely unresponsive.
You can see where I am going with this – as long as we treat fever as a uniform entity, our chances of finding a “cure” in a large randomized trial of patients presenting with fever are virtually zero unless we are dealing with a very common etiological cause of fever (as in a rare epidemic when most fevers in a population may just happen to be due to a penicillin-sensitive bug), or with a massive study that allows drilling down to meaningful subgroups in post-hoc analyses (purists will likely object to this no matter the size of the study).
In fact, our large randomized fever study will likely tell us that the risk/benefit of using penicillin to treat fever is entirely unacceptable (given that penicillin has the potential to kill) – clearly no regulator would ever consider allowing penicillin on the market, especially for a condition as common as fever. Imagine all the people “misusing” penicillin to treat their fevers – no benefit (on average) – huge risks (for individuals).
No doubt, a company hoping to develop penicillin as a new treatment for fever had better invest heavily in identifying the group of fever patients for whom penicillin does in fact work: patients in whom penicillin is so effective that even with the occasional death from anaphylactic shock, the “average” benefit remains indisputable. Clearly, simply taking 10,000 cases of fever off the street and treating them all with penicillin is unlikely to convince any regulator on the planet that this drug belongs on the market.
For readers who may argue that using penicillin for fever is a long stretch, I’d be happy to offer other analogies: try methotrexate for patients with malignancies, try allopurinol for patients with an inflamed joint, or try vitamin B12 injections for patients with anemia. While all of these treatments may well be highly effective in a subset of patients with these disorders, for the “average” patient with cancer, joint pain, or anemia, these treatments will bring nothing but side effects.
So what about obesity? The notion that we can take the next best 10,000 people with excess weight off the street and treat them all with a given compound that will result in clinically meaningful weight loss with virtually no side effects is not only overly optimistic but also contrary to any current understanding of the complex nature of obesity.
From where do the companies developing these compounds get the notion that a compound that is indeed powerful enough to override one of nature’s most intricate and essential survival instincts will be both safe and effective for the “average” person who happens to find himself in a state of positive energy balance?
What is the biological rationale for hoping to find a drug that is as effective in reducing emotional (hedonic) eating as it is in overeating due to true hunger (homeostatic overeating), or perhaps overeating in social settings (as in peer pressure)? And how should this compound work in the person where clearly the problem is not overeating but undermoving (perhaps from that back injury, asthma, or lack of time)? Indeed, it would truly have to be a miracle drug if it could also override the hyperphagic response to a hypoglycemic agent or to an atypical antipsychotic drug.
If scientific rationale does not convince us that obesity is a remarkably heterogeneous condition, let us simply look at the results of our clinical trials with antiobesity drugs. Yes, the average response is modest (indeed some people even gain weight in obesity trials), but that should hardly be a surprise. The real surprise (or is this expected?) is that there is often a subset of patients (perhaps as few as 15% of the entire study population) who do remarkably well, losing not twice, but three times the amount of weight seen in the control group. Not only do these patients reap clear benefits, but strangely, they may even appear to tolerate the drug better than the rest. Are these patients “random” outliers or are these the very patients for whom this drug would truly be nothing short of a Godsend?
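To make the arithmetic of averages concrete, here is a minimal sketch in Python. The 15% responder share and the "three times the control group" figure come from the paragraph above; the 3 kg control-group loss and the assumption that non-responders lose exactly what the control group loses are made-up illustration values, not trial data.

```python
def mean_weight_loss(responder_fraction, responder_loss_kg, nonresponder_loss_kg):
    """Average loss in a treated group made up of two subgroups."""
    return (responder_fraction * responder_loss_kg
            + (1 - responder_fraction) * nonresponder_loss_kg)


# Hypothetical illustration values (not taken from any real trial).
control_loss_kg = 3.0                      # assumed loss in the control group
responder_fraction = 0.15                  # "perhaps as few as 15%"
responder_loss_kg = 3 * control_loss_kg    # "three times" the control group
nonresponder_loss_kg = control_loss_kg     # assumption: drug adds nothing here

treated_mean = mean_weight_loss(
    responder_fraction, responder_loss_kg, nonresponder_loss_kg
)
average_effect = treated_mean - control_loss_kg        # what the trial reports
responder_effect = responder_loss_kg - control_loss_kg # what responders get

print(f"Mean loss on drug:        {treated_mean:.1f} kg")
print(f"Average drug effect:      {average_effect:.1f} kg")
print(f"Effect in responders:     {responder_effect:.1f} kg")
```

Under these assumed numbers the trial would report an average drug effect of under 1 kg, while the responders themselves are 6 kg ahead of the control group. The average is correct; it simply describes nobody in particular.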
Regulators may well agree that such subgroups exist but would want to see data to support this. They may not care about the biological reason why these “super responders” respond so well, but would certainly want to know if there is a way that these patients can be identified (so as to reasonably limit the license to this population).
But predicting responders (as any prediction) can be a tricky business. Once we know that penicillin is only likely to control fever in people with gram-positive infections, we can certainly limit the use of penicillin to patients with evidence for such infections (or even better use actual resistance testing) – but when we have no such “rationale”, can we somehow still screen for responders?
What easier way to screen than to actually try the drug – albeit in a limited and controlled setting. If a drug is meant to produce weight loss but fails to do so, clearly it is not working and should be discontinued. Even the safest weight loss drug is unlikely to have any benefits in someone who does not lose or even continues to gain weight – in such a setting even the smallest risk will have an infinitely high risk/benefit ratio.
Fortunately, response to weight loss medications can be easily measured (on a simple office scale). All we need to ask are the following questions:
1) How long would it take to be reasonably sure that we are dealing with a “responder”?
2) What is the risk of exposing “non-responders” to this drug long enough to determine if they are indeed “non-responders”?
3) How likely are “non-responders” to continue using the drug (despite not losing weight), thereby exposing themselves to unacceptable risk?
Most obesity experts will agree that the answer to the first question is probably 6-12 weeks. The answer to the second question will of course depend on the nature of the drug and its potential for serious (irreversible?) side effects with short-term treatment. The answer to the third question is, probably very few.
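The "try it and see" approach can be written down as a simple stopping rule. The sketch below is only an illustration: the 12-week assessment point sits at the upper end of the 6-12 week range mentioned above, while the 5% minimum loss, the `Visit` record, and the `trial_decision` function are hypothetical names and values chosen for the example, not a clinical recommendation.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    week: int          # weeks since starting the drug
    weight_kg: float   # weight on the office scale at that visit


def trial_decision(baseline_kg, visits, assessment_week=12, min_loss_fraction=0.05):
    """Decide whether to continue a drug after a time-limited trial of therapy.

    assessment_week and min_loss_fraction are hypothetical defaults.
    """
    # Only visits at or after the assessment point can settle the question.
    due = [v for v in visits if v.week >= assessment_week]
    if not due:
        return "too early to judge"
    # Use the first visit at or after the assessment point.
    first_due = min(due, key=lambda v: v.week)
    loss_fraction = (baseline_kg - first_due.weight_kg) / baseline_kg
    if loss_fraction >= min_loss_fraction:
        return "continue"
    return "discontinue"


# Three hypothetical patients, all starting at 100 kg.
print(trial_decision(100.0, [Visit(4, 98.5), Visit(12, 94.0)]))  # responder
print(trial_decision(100.0, [Visit(4, 99.5), Visit(12, 99.0)]))  # non-responder
print(trial_decision(100.0, [Visit(4, 98.0)]))                   # not yet assessable
```

The point of the rule is that exposure for non-responders is capped at the assessment window, which is exactly what the second and third questions above are asking about.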
Interestingly, this is exactly the way most drugs are actually used in the real world, i.e. outside of the highly artificial construct of randomized double-blind clinical trials.
In my clinical practice I routinely start patients on drugs for any number of complaints and conditions and, judging by my patients’ response (with regard to both efficacy and tolerability), I adjust the dose or discontinue the drug altogether (often only to switch to the next available agent or to run additional tests to confirm my diagnosis). Never in clinical practice would I (or my patients) consider continuing patients on drugs that have no demonstrable effect or precipitate unacceptable side effects (cost alone would prove a remarkable deterrent).
Denying approval for compounds that have the potential to deliver important benefits to even a subgroup of patients, simply on the argument that the “average” patient may not benefit and would therefore face an unacceptable risk/benefit ratio, cannot be ethically justified when it means denying patients who could well benefit from such compounds.
Obesity has high risks – killing an estimated 300,000 Americans every year. For those with medically relevant obesity the only evidence-based option today is bariatric surgery (surprisingly safe but definitely not without risk). If only a subset of obese patients (15%?) could be effectively and safely treated with existing or emerging anti-obesity compounds, is the potential for misuse by those who should not be taking these compounds enough of an ethical argument to deny this treatment to those who do benefit?
For those who choose to misuse or abuse these compounds, where is the role of personal responsibility, which we so readily call upon to justify ridiculously lax gun or gambling laws? (Inability to enforce these laws has certainly not convinced courts or legislatures of the need to reverse their decisions.)
On what legal precedents do regulators (and their advisors) base their recommendations to deny potentially safe and effective treatments to a few (for whom these treatments may well be safe and effective) in order to protect those who should clearly not be using these compounds in the first place?
If such compounds do exist, all I can say is, “restrictions, yes – denial, no”!
I firmly believe that as long as companies (and regulators) continue treating obesity as a homogeneous condition for which we can potentially find a drug that is both safe and effective for anyone with excess weight (irrespective of the cause), we will be unlikely to have safe and effective pharmacological treatments for ANY patients with obesity in the foreseeable future.
Ucluelet, BC
Tuesday, July 20, 2010
Great post Dr. Sharma!
Sadly it seems that many regulators do not understand the various potential causes of obesity.
Tuesday, July 20, 2010
Very well said! I believe this argument is true for other medical conditions as well and not ONLY obesity.
Many diseases are the result of interactions between biological and environmental factors. Given the variation in human genes, it is obvious that not all patients have the same, “harmonious” disease condition, nor do they respond in the same way to medication.
Nowadays, with “personalized medicine” offering an opportunity to individualize drug therapy for patients based on their genetic make-up (pharmacogenomics), and with this being a new era for medicine, we need more therapeutic choices for all patients and not only for “average” patients.
“Average” patient is a concept that I really have difficulty understanding.
Tuesday, July 20, 2010
I agree that current designs for randomised controlled trials in obesity are unhelpful and translate poorly onto the real world. They are of course based upon meeting the requirements of regulatory authorities (FDA in USA and EMEA in Europe) and also the rather rigid approaches of scientific journal editors. It is quite possible to design ‘real-life’ trials whereby only responders to an anti-obesity drug enter the long(er)-term part of the trial, as we did in the STORM study of sibutramine (Lancet 2000; 356: 2119-25). The trick is to design the concept of responder into a trial, so that when analyzed there can be no accusation of cherry-picking results of groups that were not pre-specified. The use of the measure ‘number needed to treat’ (NNT) (e.g. you have to treat 50 people to get 1 success) could be an interesting way of expressing realistic and useful thresholds for drug continuation/discontinuation.
So if you set a target of 10% weight loss at one year, many anti-obesity drugs would have an NNT of perhaps 10 (10 people treated to get one person losing 10% or more). If you only continued treatment beyond 4 weeks (say) for people who lost more than 2 kg, you might favourably decrease the NNT to 2 or 3. In the context of reimbursement this would drive a very beneficial cost:effectiveness analysis. To a certain extent that is the approach that NICE (the UK National Institute for Health and Clinical Excellence) took in their appraisals of orlistat, sibutramine and rimonabant. At the moment we really only have initial weight loss response as a predictor – in years to come we might have genetic or other markers allowing treatment to be personalised.
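The NNT arithmetic in this comment can be sketched in a few lines of Python. NNT is used here in the comment's simple sense (people treated per success); the formal definition is the reciprocal of the absolute difference in success rates between drug and control. The cohort of 100 and the NNT of 10 follow the comment; the counts of 25 early responders and 9 successes among them are hypothetical values picked only to land in the "2 or 3" range mentioned above.

```python
def nnt(treated, successes):
    """People treated per success (simple, uncontrolled sense of NNT)."""
    if successes == 0:
        return float("inf")  # nobody succeeds, however many are treated
    return treated / successes


# Hypothetical cohort; the counts below are illustration values only.
cohort = 100
successes_overall = 10       # reach 10% loss at one year -> NNT of 10
early_responders = 25        # assumed: lost more than 2 kg by week 4
successes_among_early = 9    # assumed: of those, reach 10% at one year

nnt_treat_everyone = nnt(cohort, successes_overall)
nnt_continue_responders = nnt(early_responders, successes_among_early)
missed_successes = successes_overall - successes_among_early

print(f"NNT, everyone treated for a year:   {nnt_treat_everyone:.1f}")
print(f"NNT, early responders only:         {nnt_continue_responders:.1f}")
print(f"Eventual successes stopped early:   {missed_successes}")
```

The last line is the price of the stopping rule: with these assumed counts, one patient who would eventually have succeeded is taken off the drug at week 4, which is the trade-off any responder threshold has to justify.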
Tuesday, July 20, 2010
Great point, Nick – to truly understand treatment effects in individuals, pragmatic studies should be the way to go. I wonder what it would take to get regulators to move in that direction.
Tuesday, July 20, 2010
Thanks for putting into eloquent wording what I believe I have talked about while lecturing around the globe for years!
I just assume that the SCOUT outcome prompted this?
Wednesday, July 21, 2010
@Rossner: SCOUT, BLOOM, Qnexa – in every study we see modest average results and a subset of “super responders”. I guess it just begs the question: will studying average responses ever lead to finding solutions for special cases?