The field of nutrition is confronting a strange paradox. On the one hand, nutrients are substances provided by the environment which an organism needs for the optimal functioning of its various physiological systems. In other words, a nutrient produces an effect that would not happen in its absence, i.e., it is efficacious for some vital endpoint or outcome.
On the other hand, over the past few years a near mountain of studies has accumulated reporting that particular nutrients (e.g., vitamins A, C, D, and E, calcium, and others) have little or no effect when tested in human subjects for certain endpoints or outcomes. Of course it could well be that a particular nutrient, while active in one body system, has no effect in others. Well conducted studies in those systems would then properly produce null results. But this is not a likely explanation for many of the failed studies, virtually all of which grew out of epidemiological or associational studies linking the nutrients concerned to various plausible outcomes. In this post I explore some of the reasons that might explain this seeming contradiction.
It is helpful to start with the dose-response relationship followed by most nutrients (See Defining Normal – Living on the Plateau). As has been noted in earlier posts, that relationship is best represented by an S-shaped (sigmoid) curve. When evaluating a particular nutrient we need to have at least a rough idea of the shape and location of that curve along the continuum of plausible intakes of the nutrient concerned.
The curve in Figure 1 is a forward-leaning S-shape (sigmoid) and is typical for most biochemical reactions. Such reactions usually have three rough response zones: a no-response region at the left (which represents something like priming the pump – necessary but not yet enough); then a relatively steep rise up to some maximum response; and finally a dose range in which further increases in intake produce no further response (which reflects saturation of the responding mechanisms). The forward orientation (higher response with greater exposure) is the usual way of graphing these effects, but a mirror-image curve (high at left and low at right) is how negative effects such as risk or harm would be shown, i.e., the nutrient or drug reduces risk or minimizes harm as its intake increases. Figure 1, which is an idealized response curve, both makes clear what is meant by that statement and shows why it is important to test intakes in the response region of the curve (and not at either the low or high end of the curve). This is obviously necessary in order to optimize the chance of finding the response, if present.
Most drugs or biochemically active agents operate over an exposure range spanning three or more orders of magnitude; clinical studies evaluating their effects are therefore commonly concentrated in the middle region, where response rises nearly linearly with dose and the underlying sigmoidal character can be ignored. With nutrients, by contrast, the entire curve is usually compressed within a single order of magnitude. (Examples include calcium, for which the 95% intake range extends from about 200 mg/d to about 2,000 mg/d, and vitamin D, for which the physiological range of serum 25-hydroxy-vitamin D extends from about 20 nmol/L to about 225 nmol/L.) Because the whole sigmoid is squeezed into this range of plausible intakes, the response to a given increment in intake will vary with the starting value. In fact, some starting levels will be either so low or so high that even a substantial change in intake will produce no response, or one so small as to be hard to detect. This point is illustrated in Figure 2, in which an identical increase in intake produces very different responses depending upon where the basal (or starting) level of the participants is located along the response curve.
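The point illustrated in Figure 2 can be sketched numerically. The snippet below uses an idealized logistic curve as a stand-in for the sigmoid of Figure 1; the midpoint (600 mg/d) and steepness are hypothetical values chosen only so the whole curve fits within a calcium-like intake range, not measured parameters.

```python
import math

def sigmoid_response(intake, midpoint=600.0, steepness=0.01):
    """Idealized S-shaped dose-response curve (logistic form).

    Hypothetical parameters, for illustration only: chosen so the whole
    curve sits within a calcium-like range of roughly 200-2,000 mg/d.
    Response is expressed as a fraction of the maximum (0 to 1).
    """
    return 1.0 / (1.0 + math.exp(-steepness * (intake - midpoint)))

# The same 300 mg/d increment applied at three different basal intakes:
for basal in (200, 500, 1500):
    gain = sigmoid_response(basal + 300) - sigmoid_response(basal)
    print(f"basal {basal:>4} mg/d -> response gain {gain:.3f}")
```

An identical increment produces a large gain when the basal intake sits on the steep portion of the curve, a small gain near the bottom, and essentially none near the plateau, which is exactly why the basal level of trial participants matters.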
It is easy to see, therefore, why it is important to have prior knowledge of the shape and location of the curve and why it is important both to start at a basal level low enough to be at the left end of the curve and to ensure that all participants in a trial start at about the same level. Obviously, conducting a trial on individuals with different starting nutrient status values will blur the average response. If a nutrient does in fact produce the hypothesized effect, some subjects will nevertheless fail to show it, not because they are non-responders, but because either they started from such a low status that the dose was not sufficient to get them up to the response zone or, alternatively, they were already at or close to the maximum response.
And if most or all of the participants had starting values close to the top of the range, the trial would almost certainly be null because most or all of the involved participants would already be experiencing whatever response may have been hypothesized. One might ask, in the latter situation, “Couldn’t one tell that the effect had already been produced?” The answer for quantitative response variables is “usually not”. If the effect of an agent is, for example, to lower blood pressure by about 4 mm Hg, there would be no way to tell, against the background of wide variation in individual blood pressure values, whether participants’ blood pressure values on entry had already benefited from that average 4 mm Hg lowering.
Establishing the basal level and using it as a criterion for entry into a study is not just a minor quibble. Several large, government-sponsored trials over the past 15 years, costing millions of dollars, fell into exactly this trap and, not surprisingly, they produced null (i.e., no-effect) results. An example is the calcium arm of the Women’s Health Initiative (WHI) trial. After the trial was fully enrolled it became clear that average calcium intake in the participants was already at or above the level recommended for their age. NHANES data, on which the WHI designers had relied, had shown that median dietary calcium intake of the target population was under 600 mg/d, or about half the intake of the actually enrolled participants. The most likely reason for the discordance between expected and actual basal calcium intake was a combination of healthy volunteer bias and a failure to take into account the growing use of calcium supplements in the population. In any event, had the presumed calcium intake been operationalized as an inclusion criterion, the calcium arm of WHI would have provided a far better test of the related hypotheses.
What is harder to explain is why the results of this trial continue to be cited (and accepted) as evidence that calcium has no effect on the several outcomes that had been planned for analysis in WHI. Nor was WHI an isolated instance. Exactly the same mistake was made with the calcium and preeclampsia prevention trial (CPEP) and its outcome, though for very different reasons.
Finally, studies may be null because of failure to optimize co-nutrient status. The importance of doing so can be illustrated by a few examples. It is well recognized that vitamin D is necessary for regulation of calcium absorption, but it is less well recognized that quantitative analysis of the relation makes clear that vitamin D follows the sigmoid curve of Figure 1, i.e., its effect on calcium absorption reaches a maximum. Above that point more vitamin D does not produce more absorption. It has further been shown that even maximal vitamin D status will not compensate for calcium intakes below a certain level, i.e., even maximal absorption of not very much ingested calcium will result in not very much absorbed calcium. Thus, when testing skeletal outcomes and evaluating the effect of either vitamin D or calcium, it is essential that intake of the nutrient not being directly tested be optimized. Otherwise the effect of the nutrient being tested may be missed. But it doesn’t stop there. Protein is as necessary for bone repair as is calcium, and without adequate dietary protein, increased dietary calcium will not lead to replacement of lost bone, a fact that may help explain why many trials of supplemental calcium failed to increase bone mass. Several of the B vitamins (e.g., folate, B6, B12) are intimately involved in a biochemical process called single-carbon transfer, and to discern the full effects of one, studies must ensure that intakes of the others are not so low as to limit the response to the one being tested.
These examples are just that – illustrative. But they are not solitary instances. The one constant in nutrition is that isolated nutrient deficiencies are the exception. Diets inadequate in one nutrient are almost always inadequate in several others as well.
It is surprising how little attention is paid to making certain that nutrients other than the one being evaluated are present in adequate quantity. This is partly because of a reductionist approach that looks at nutrients in isolation, and partly because the drug model for evaluating pharmacological agents attempts to minimize, and even eliminate, the impact of co-variates, whereas, as noted, nutritional co-variates must instead be optimized. Nevertheless this drug model has been carried over whole into the evaluation of nutrients without attending to critical differences between the two classes of agents.
The continued citing of studies that in hindsight could not have tested the corresponding hypothesis occurs most harmfully in approaches termed “systematic reviews” and “meta-analyses”. Both involve searching the published scientific literature to find all studies bearing on a particular question and then selecting some of them for inclusion in the review. The selection criteria usually consist of whether or not the studies meet certain design standards such as randomization or blinding, among others. Few such reviews impose biological criteria that included studies must meet. As a result, reviews of calcium, for example, inevitably include studies such as the WHI calcium arm and CPEP which, as just noted, could not have tested the corresponding hypotheses. Accordingly, such reviews tend to produce null results. This seems to be part of the reason that studies of so many of the nutrients mentioned at the outset continue to turn up with no apparent effect.
Accordingly, systematic reviews and meta-analyses need to be concerned with the same issues that are important for individual studies and, importantly, to use selection criteria for study inclusion that are based on the biology of the nutrient concerned and not exclusively on the reporting features of the papers to be analyzed.
Inspection of Figures 1 and 2 makes clear that, even if all studies have the same basal nutrient status, different doses of nutrients will usually produce differences in responses that are not linear, a fact that makes statistical adjustment for dose problematic. Given the often narrow dose range between the bottom and top of the sigmoid curve, twice the dose (doubling the intake) will usually not produce twice the response. It may be more than double or it may be less, and it may even be zero if the smaller dose had already gotten the participants up to the top of the sigmoid curve.
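The non-proportionality of dose doubling can be illustrated with the same idealized logistic curve used to stand in for Figure 1 (again with hypothetical parameters, a midpoint of 600 mg/d and a modest steepness, not measured values):

```python
import math

def response(dose, midpoint=600.0, steepness=0.01):
    """Idealized sigmoid dose-response; hypothetical parameters, not data."""
    return 1.0 / (1.0 + math.exp(-steepness * (dose - midpoint)))

# Doubling a dose on the rising limb vs. doubling one near the plateau:
ratio_low = response(800.0) / response(400.0)    # 400 -> 800 mg/d
ratio_high = response(1600.0) / response(800.0)  # 800 -> 1600 mg/d
print(f"doubling 400 mg/d multiplies response by {ratio_low:.2f}")
print(f"doubling 800 mg/d multiplies response by {ratio_high:.2f}")
```

On the steep limb, doubling the dose multiplies the response several-fold; near the plateau, the same doubling adds almost nothing. Averaging or statistically adjusting across doses as though response were proportional to intake would therefore misrepresent both regions.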
But just as the response to nutrient augmentation is non-linear, so too is the response to duration of intervention. For many nutrients, therefore, studies of substantially differing durations cannot easily be pooled. For example, calcium supplementation evokes what is called a “bone remodeling transient”: measurement of bone mass or density will often show an apparent rise at short time intervals (e.g., six months), a smaller rise (or no apparent change) at 12 months, and thereafter actual bone loss (though at a slower rate than without the supplement). The non-linear region of the response curve is usually confined to the first 12 months, after which a new steady state develops. For these reasons, attempts to assess the ultimate effect of calcium intake must be confined to the steady state region of the response curve. Clearly, pooling studies carried out over different portions of the transient will yield confusing or misleading results.
Failure to pay attention to these biological issues, either in individual studies or in systematic reviews and meta-analyses, will inevitably bias the results toward the null, which is statistical jargon for reducing the apparent size of the effect to a point where it is not statistically significantly different from no effect at all.
It may be objected that we may not have the knowledge needed to attend to these matters, even if we had the will. That much is certainly true. However, there is a second purpose to this discussion: it allows us to understand why studies of actually efficacious agents might turn out null. And in the case of systematic reviews and meta-analyses, these considerations should stop us from continuing to cite studies as evidence for a conclusion when, in hindsight, we ought to have recognized that those studies could not have validly tested the associated hypotheses. (Systematic reviews, basically, are simply a form of hindsight.) That mistake was a conspicuous feature of the systematic reviews relied upon by the Institute of Medicine and the U.S. Preventive Services Task Force in formulating their recent policy statements for calcium and vitamin D.