Comparative Effectiveness for Populations and Individuals

Yesterday, on my flight to San Diego to attend the 28th Annual Scientific Meeting of The Obesity Society, I had the opportunity to catch up on some of my reading.

Two recent articles in the Journal of the American Medical Association (JAMA) caught my attention, as they related to topics that I have recently blogged about.

The first article, by David Kindig and John Mullahy, examines the issue of determining the comparative cost-effectiveness of public health interventions. Readers may recall the recent WHO/OECD report that concluded that the cost-effectiveness of population interventions to prevent or manage obesity is probably so low that such efforts are unlikely to yield returns within the next three to four decades.

In a similar vein, Kindig and Mullahy point out that rather limited evidence exists to guide public and private policy makers regarding investments to address the social, economic and cultural determinants of health behaviors, which would include tackling issues like the physical and built environment.

This is why, the authors argue, the significant funding that the US Government has proposed for comparative effectiveness research (CER) should be dedicated to understanding the effectiveness of investments across broad determinants of health, rather than focusing primarily within the health care domain alone (e.g. drug-drug comparisons).

Indeed, the authors warn that “without an adequate evidence base on which to judge the effectiveness of any particular strategy or intervention launched across such multiple sectors, the [Obama] childhood obesity initiative—as well as any other broad, multisectoral initiative on important population health problems—will succeed only by chance.”

The second article, by Helene Chmura Kraemer and Ellen Frank, discusses the evaluation of comparative treatment trials. As readers will recall, I recently pointed out that evaluating drug benefits and risks by merely looking at averages (as in the case of anti-obesity agents) may lead to treatments that are safe and effective for some people being removed from the market, or never made available, simply because they are not safe or do not work for most people.

As Kraemer and Frank point out, “The evaluation of the comparative effectiveness of treatments should not depend on the statistical effect of treatments on individual measures of outcome (benefits or harms), but rather on the clinical effects of treatments (both benefits and harms) on individual patients who experience both benefits and harms. Such evaluation requires both statistical assessment of the rates of co-occurrence of such benefits and harms and clinical assessment of their combined clinical effects on patients.”

This means that rather than simply looking for p-values, randomized controlled trials should be geared towards looking at effect sizes at the individual patient level. Analyses should thus take into account the simultaneous benefits and harms experienced by each patient, much as a physician does when deciding between two or more treatment options.

For instance, while there is concern that certain antidepressants may increase the risk of suicidal ideation and suicidality in some adolescents, if only harm is considered, individual patients may be denied a potentially lifesaving treatment; in contrast, if only benefits are considered, a small subgroup of young patients may be exposed to serious harm.

Rather, individual benefits and risks can be better represented by calculating effect sizes expressed as the area under the receiver operating characteristic curve (AUC), the success rate difference (SRD) (i.e., the rate difference for a favorable outcome), or the number needed to benefit (NNB) or harm (NNH).
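To make these measures concrete, here is a minimal sketch of how SRD, NNB, and AUC relate to one another for a binary (favorable/unfavorable) outcome. The success rates used below are made-up illustrative numbers, not data from any actual trial; in Kraemer's framework, for binary outcomes AUC and SRD are linked by SRD = 2·AUC − 1.

```python
# Hypothetical illustration of the effect-size measures discussed by
# Kraemer & Frank. The rates below are invented for illustration only.

def srd(p_treatment: float, p_comparator: float) -> float:
    """Success rate difference: difference in favorable-outcome rates."""
    return p_treatment - p_comparator

def nnb(success_rate_diff: float) -> float:
    """Number needed to benefit: patients treated per one extra success."""
    return 1.0 / success_rate_diff

# Suppose (hypothetically) 40% of treated patients and 25% of comparison
# patients achieve a clinically meaningful favorable outcome.
d = srd(0.40, 0.25)
print(round(d, 2))       # SRD = 0.15
print(round(nnb(d), 1))  # NNB ≈ 6.7: treat ~7 patients for one extra success

# For binary outcomes, AUC = (SRD + 1) / 2
auc = (d + 1) / 2
print(round(auc, 3))     # AUC = 0.575
```

The same arithmetic applied within subgroups (e.g. men vs. women, responders vs. non-responders) is what lets individual-level benefit and harm diverge from the overall average, which is the point of the sections that follow.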

Using the theoretical example of a treatment that may have different benefits and harms in men than in women, a very different assessment of individual effect sizes emerges than when looking at overall p-values. Obviously, things may not always be as clear as in the chosen example, where one has easily identifiable subgroups (i.e. men vs. women). But the key point here is that the evaluation of the comparative effectiveness of treatments should not depend on the effects of treatments on group averages of outcomes (benefits or harms), but rather on the overall clinical effects of treatments on individual patients. This requires statistical assessment of the rates of co-occurrence of positive and negative outcomes, and clinical assessment of their combined clinical effects on each patient.

Thus, in the case of an obesity drug that results in effective weight loss in both men and women but may cause teratogenicity, a decision to make the drug unavailable to both men and women may deprive men of a safe and effective obesity treatment.

Similarly, abandoning an obesity drug that provides measurable benefits for individuals who respond with weight loss, just because it may potentially be harmful in individuals who fail to lose weight on this agent, would deprive “responders” of an effective treatment option.

I certainly concur with Kraemer and Frank that, as we move towards an era of “personalised” medicine, we need to focus more on individual effect sizes than blindly stare at statistical averages when interpreting clinical trials.

San Diego, CA

For live updates from Obesity 2010 – follow me on Facebook and Twitter

Kindig D, & Mullahy J (2010). Comparative effectiveness–of what?: evaluating strategies to improve population health. JAMA : the journal of the American Medical Association, 304 (8), 901-2 PMID: 20736476

Kraemer HC, & Frank E (2010). Evaluation of comparative treatment trials: assessing clinical benefits and risks for patients, rather than statistical effects on measures. JAMA : the journal of the American Medical Association, 304 (6), 683-4 PMID: 20699462