Why is it that I still
remember that the formula for the volume of a sphere is (4/3)πr³,
which I learned in tenth grade geometry? And why is it that I never even heard
of a p-value, the measure commonly used to assess whether a result is
“statistically significant,” until I was in medical school? I haven’t had any
occasion to compute the volume of a sphere since I took calculus in college,
but I have to interpret statistical findings all the time. Something is not right
here.
Understanding at least the
rudiments of statistics matters—and not just to me, a physician who has to make
decisions about how to treat patients by evaluating articles in the medical
literature that rely on statistical methodology. Understanding basic statistics
matters to everyone. You need to know some statistics to realize that it is
more accurate to measure the population by using sampling techniques than by
trying to count everyone. You need to know some statistics to understand why
Nate Silver, with his FiveThirtyEight website, was so much more on target in
his predictions about the 2012 presidential elections than anyone else. And you
need to know some statistics to decide, as a patient, how to evaluate the
options your physician presents you with.
Just this morning, I read an article in the first section of the NY Times, “Study Discounts Testosterone-Suppressing Therapy for Early Prostate Cancer.” It turns out that millions of
men with early stage prostate cancer, mainly men over the age of 65, have been
treated with “Androgen Deprivation Therapy” (ADT), either by bilateral
orchiectomy (surgical removal of the testes) or by drugs. A new study,
published in JAMA Internal Medicine, concludes that ADT in such men does not
prolong life. It does cause lots of side
effects, ranging from osteoporosis to weight gain, to decreased libido, to
diabetes. The article quotes one expert who was not involved in the study as
saying that the findings were “eye-opening and even alarming.” According to
editorial writers from the Dana Farber Cancer Institute, the treatment is a good
candidate for inclusion in the “Choosing Wisely” campaign, a national effort to
eliminate the use of “low value medicine;” that is, treatments that achieve
little, given their cost. The article fits in nicely with a major theme of JAMA
Internal Medicine, which has a section called “Less is More.” It’s a theme that
resonates with me as well: I often argue on this blog that certain treatments,
especially when provided to frail, older individuals, may cause more harm than
good. Finding that a commonly used treatment, such as ADT in older men, doesn’t
do what it promises, would not be at all surprising to me. But is it true?
I looked up the article,
which isn’t actually in the print issue of the journal yet; it was published in
the “online first” section, which gets important articles distributed quickly.
The authors looked at data on 66,717 men age 66 or older with localized
prostate cancer diagnosed between 1992 and 2009. They defined “primary ADT” as
orchiectomy or the use of a drug such as a luteinizing hormone-releasing
hormone agonist (a drug that stimulates the pituitary to signal the testes to make
testosterone until they run out, at which point testosterone levels fall) as
the sole cancer therapy given to men with localized prostate cancer within 6
months of diagnosis. The outcomes they were interested in were cancer specific
mortality (that is, the death rate from prostate cancer) and overall mortality.
So far so good.
But since this was not a randomized study in which some men got
ADT and others received conservative management (i.e., no treatment unless
symptoms develop), with the selection made based on the flip of a coin, there
was no reason to believe that the two groups of men would be similar to one
another. In fact, they were quite different. The men who got ADT were a good
bit older than those who did not (average age 79 vs 77). They were considerably
sicker, with higher rates of other diseases such as heart disease or lung
disease. And they were far more likely to have “high risk” prostate cancer,
based on the characteristics of the cells in their tumors (47.7% vs 23%). Their
PSA scores were also much higher (an average of 19.5 in the ADT group compared
to 11.1 in the other men, where 4 is the typical cutoff for normal). Simply
comparing the outcomes in these two very dissimilar groups of men would not tell
the whole story. Somehow, the authors needed to try to compensate for the
inherent differences between the men. The only way to do that (other than
scrapping this approach entirely and randomizing men to get ADT or some other
treatment), is to build a statistical model.
Build a model the study
authors did. The specifics of what they actually did are too complicated to
describe here. I’m not sure I fully understand what they did, but it involved a
technique called “Instrumental Variable Analysis,” known as IV. Suffice it to
say that when they used this approach to try to adjust for all the differences
between the groups (only some of which they could specify), they concluded
that the 15-year prostate cancer specific survival rate was 85.4% in both
groups. And when they used a different method, the Cox multivariate model, they
found the mortality rate was 2.4/100 in the ADT group and 1.1/100 in the group
treated with conservative management or, after attempting to adjust for
differences based on what was known about other illnesses, PSA levels, etcetera,
the group treated with ADT was 1.53 times more likely to die.
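To see how much the adjustment moved the estimate, it helps to work out the unadjusted ratio from the two rates quoted above. The following is only a small arithmetic sketch in Python; the variable names are my own, and the crude ratio is my calculation from the quoted rates, not a figure reported by the study.

```python
# Unadjusted mortality rates quoted above, per 100 (units as given in the text).
adt_rate = 2.4            # ADT group
conservative_rate = 1.1   # conservative management group

# Crude ratio: how many times higher the ADT rate is before any adjustment.
# This is an illustrative calculation, not a number taken from the study.
crude_ratio = adt_rate / conservative_rate

# Adjusted figure quoted above from the Cox multivariate model.
adjusted_ratio = 1.53

print(f"crude ratio:    {crude_ratio:.2f}")
print(f"adjusted ratio: {adjusted_ratio:.2f}")
```

The crude ratio comes out a little above 2, so adjusting for the known differences between the groups pulls the estimate down toward 1.53 but does not erase the gap. Whether the remaining gap reflects the treatment or differences the model could not capture is exactly the open question.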
What the reader needs to
understand is that the results of the study depend entirely on which model you
choose. If you select IV, which the authors try hard to present as an excellent choice but which some experts think is a flawed approach, you find that ADT and conservative therapy are
equivalent. If you select the more conventional approach, you find that ADT is
actually worse than watchful waiting. Since neither model predicts that ADT
is better than conservative management, perhaps it follows that ADT is just a
bad choice for the treatment of early prostate cancer in older men. The right
conclusion, I think, is that we don’t actually know what to make of ADT. If we
chose yet another model, perhaps we would find that ADT is superior.
Learning about different
study designs—which ones you can trust, which ones are merely suggestive and
which have to be confirmed using a better, more reliable approach—is what kids
should be learning in high school and college. Learning about probability and
statistics is what kids should be learning, not trigonometry and solid
geometry. Our math curriculum reflects seventeenth century mathematical
knowledge (it typically includes elementary algebra, along with Euclidean geometry and perhaps calculus, which date from the fourth century BCE and the seventeenth century
respectively).
Today, big data is all the rage and there is a growing
enthusiasm for learning how to milk large data sets for useful information. But
the reality is that it’s not just big data that’s important and it’s not just
important for a small cadre of people. We all need to learn how to make sense
of what we read in the newspapers, of what our doctors tell us about different
treatments. And to do that, we need to develop basic statistical literacy.