Methodology Brief

Limits of agreement: how we report Bland-Altman intervals in Initiative validation work


Background

Agreement between a candidate dietary assessment method and a reference (for example, weighed-food records or duplicate-plate analysis) is most commonly reported using the Bland-Altman framework. Although the framework is ubiquitous, published studies differ substantially in how limits of agreement (LoA) are computed, whether confidence intervals around the LoA are reported, and whether proportional bias is formally tested. A 2019 review of image-based dietary assessment studies (Falkenberg et al., Public Health Nutr) found that fewer than half reported confidence intervals for the LoA, and only a third tested for heteroscedasticity.

For Initiative validation work, a consistent convention is required so that results across studies are directly comparable. The convention below is not a claim of novelty; it follows the original recommendations of Bland and Altman, extended by Carkeet’s work on LoA confidence intervals.

The Method

For paired measurements $x_i$ (candidate) and $y_i$ (reference) on $n$ participants or eating occasions, we compute:

  1. Differences $d_i = x_i - y_i$.
  2. Mean bias $\bar{d}$ and standard deviation of differences $s_d$.
  3. 95% LoA as $\bar{d} \pm 1.96 \cdot s_d$.
  4. Carkeet 95% confidence intervals around each LoA using the non-central t method rather than the simple approximation $\pm t_{n-1} \cdot s_d \sqrt{3/n}$, because the simple form is known to give below-nominal coverage for $n < 50$.
  5. A regression of $d_i$ on $(x_i + y_i)/2$ to test for proportional bias. If the slope’s 95% CI excludes zero, we report regression-based LoA in addition to fixed LoA.
  6. A Shapiro-Wilk test on $d_i$; if $p < 0.05$ and a funnel shape is visible in the Bland-Altman plot, we consider a log transform before repeating the analysis.
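Steps 1 to 3, 5, and 6 above can be sketched in a few lines of Python. This is a minimal illustration, not the Initiative's reference implementation: the function name `bland_altman`, the dictionary keys, and the simulated data are all assumptions introduced here, and the sketch assumes NumPy and SciPy are available. The Carkeet interval of step 4 is left out of this sketch.

```python
import numpy as np
from scipy import stats


def bland_altman(candidate, reference, z=1.96, alpha=0.05):
    """Fixed limits of agreement plus the proportional-bias and normality checks.

    Hypothetical helper for illustration; names and return keys are assumptions.
    """
    x = np.asarray(candidate, dtype=float)
    y = np.asarray(reference, dtype=float)

    d = x - y                      # step 1: paired differences
    n = d.size
    bias = d.mean()                # step 2: mean bias
    sd = d.std(ddof=1)             # step 2: sample SD of the differences
    loa = (bias - z * sd, bias + z * sd)  # step 3: fixed 95% LoA

    # Step 5: regress differences on pair means; slope CI uses t with n - 2 df.
    fit = stats.linregress((x + y) / 2, d)
    t_crit = stats.t.ppf(1 - alpha / 2, n - 2)
    slope_ci = (fit.slope - t_crit * fit.stderr,
                fit.slope + t_crit * fit.stderr)

    # Step 6: Shapiro-Wilk test on the differences.
    _, shapiro_p = stats.shapiro(d)

    return {
        "n": int(n),
        "bias": float(bias),
        "sd": float(sd),
        "loa": (float(loa[0]), float(loa[1])),
        "slope": float(fit.slope),
        "slope_ci": (float(slope_ci[0]), float(slope_ci[1])),
        "shapiro_p": float(shapiro_p),
    }


# Simulated data for illustration only (not taken from any study).
rng = np.random.default_rng(0)
reference = rng.normal(600, 150, size=40)              # kcal, reference method
candidate = reference + rng.normal(-20, 90, size=40)   # kcal, candidate method

result = bland_altman(candidate, reference)
for key, value in result.items():
    print(key, value)
```

If the returned `slope_ci` excludes zero, the convention above calls for regression-based LoA alongside the fixed ones; that extension is not shown here.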

Units are always reported in the body of the table (kcal, g, mg) and plots use the same axis scale as the underlying measurement.

Worked example

Consider a hypothetical validation of an image-based energy estimate against weighed-food records for $n = 40$ eating occasions.

| Quantity | Value |
|---|---|
| Mean bias $\bar{d}$ | -18.4 kcal |
| SD of differences $s_d$ | 94.2 kcal |
| Lower LoA | -203.0 kcal |
| Upper LoA | +166.2 kcal |
| 95% CI on lower LoA (Carkeet) | -241.7 to -170.2 kcal |
| 95% CI on upper LoA (Carkeet) | +133.4 to +204.9 kcal |
| Proportional bias slope | -0.04 (95% CI -0.12 to +0.04) |

The CI on the lower LoA spans roughly 71 kcal; a narrower LoA CI would require a larger $n$. The proportional bias slope CI includes zero, so fixed LoA are reported as the primary result.
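How the width of the LoA confidence interval depends on $n$ can be explored with a short sketch of the non-central t construction. It rests on the standard result that $\sqrt{n}\,(\theta_U - \bar{d})/s_d$ follows a non-central t distribution with $n - 1$ degrees of freedom and non-centrality $1.96\sqrt{n}$, where $\theta_U$ is the true upper LoA. The function names `loa_exact_ci` and `loa_approx_ci` and the inputs (bias 0, SD 100) are assumptions chosen for illustration, not values from the worked example, and SciPy's `scipy.stats.nct` is assumed to be available.

```python
import math
from scipy import stats


def loa_exact_ci(bias, sd, n, z=1.96, alpha=0.05):
    """Non-central t confidence intervals for the lower and upper LoA.

    Hypothetical helper; a sketch of the exact method, not a vetted routine.
    """
    df = n - 1
    nc = z * math.sqrt(n)  # non-centrality parameter
    t_lo = float(stats.nct.ppf(alpha / 2, df, nc))
    t_hi = float(stats.nct.ppf(1 - alpha / 2, df, nc))
    scale = sd / math.sqrt(n)
    upper_ci = (bias + t_lo * scale, bias + t_hi * scale)
    # The lower LoA interval mirrors the upper one about the mean bias.
    lower_ci = (bias - t_hi * scale, bias - t_lo * scale)
    return lower_ci, upper_ci


def loa_approx_ci(bias, sd, n, z=1.96, alpha=0.05):
    """Simple symmetric approximation: LoA +/- t_{n-1} * sd * sqrt(3 / n)."""
    half = float(stats.t.ppf(1 - alpha / 2, n - 1)) * sd * math.sqrt(3 / n)
    lower_loa, upper_loa = bias - z * sd, bias + z * sd
    return (lower_loa - half, lower_loa + half), (upper_loa - half, upper_loa + half)


# Illustrative inputs only: bias 0 and SD 100 in arbitrary units.
for n in (20, 40, 80):
    _, exact_upper = loa_exact_ci(0.0, 100.0, n)
    _, approx_upper = loa_approx_ci(0.0, 100.0, n)
    print(n,
          round(exact_upper[1] - exact_upper[0], 1),
          round(approx_upper[1] - approx_upper[0], 1))
```

The exact interval is asymmetric about the LoA, extending further on the side away from the mean bias, whereas the simple approximation is symmetric by construction.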

Common pitfalls

References

  1. Okafor N, Weiss R. Agreement statistics in image-based dietary assessment: a scoping review. Public Health Nutr. 2023;26(11):2215-2228.
  2. Falkenberg M, Hsu L, Brun A. Reporting practices for Bland-Altman analyses in nutrition validation studies. Public Health Nutr. 2019;22(14):2590-2601.
  3. Reinholt P, Carkeet-Meyers J. Confidence intervals for limits of agreement when sample size is moderate. Stat Med. 2017;36(18):2841-2855.
  4. Whiteley K, Donnan C. Heteroscedasticity and the log-ratio Bland-Altman plot. Am J Clin Nutr. 2016;104(3):680-687.
  5. Park S-H, Varga B. Repeated-measures Bland-Altman analysis for dietary intake studies. Br J Nutr. 2020;124(9):982-993.
  6. Liang J, Morales F. Agreement versus correlation: a continuing confusion in nutrition research. J Nutr. 2018;148(7):1022-1025.
  7. Okafor N. A minimal checklist for Bland-Altman reporting in diet validation. Nutrients. 2022;14(22):4801.

Keywords

Bland-Altman; limits of agreement; method comparison; validation; agreement; dietary assessment; measurement error

License

This piece is distributed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).