Commentary

PubMed search strategies for finding image-based dietary assessment validation studies

A practical note for systematic reviewers

Every systematic review in this space begins with an uncomfortable fact: PubMed’s controlled vocabulary was not designed for image-based dietary assessment. The relevant MeSH terms — Diet Records, Nutrition Assessment, Mobile Applications — were established before the present generation of computer-vision food recognition existed, and the indexers who apply them have variable familiarity with the field. What follows is a brief, opinionated guide to constructing a search strategy that does not quietly lose a third of the literature.

The three-layer structure

We construct our searches in three conceptual layers: the intervention (the app, the image system, the AI pipeline), the outcome concept (dietary assessment, intake estimation, food identification), and the validation concept (agreement, accuracy, reliability, validation against a reference). Each layer combines free-text with whatever MeSH coverage exists, and the layers are combined with AND. In practice, the intervention layer is where most searches under-recover, because indexing lags the terminology.
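As a minimal sketch of that structure, the snippet below assembles a query from three term lists: OR within a layer, AND between layers. The helper names (`or_block`, `build_query`) and the shortened term lists are ours for illustration only; they are not part of PubMed or of any library.

```python
from typing import Sequence


def or_block(terms: Sequence[str]) -> str:
    """Join the terms of one conceptual layer with OR, wrapped in parentheses."""
    if not terms:
        raise ValueError("a layer needs at least one term")
    return "(" + " OR ".join(terms) + ")"


def build_query(*layers: Sequence[str]) -> str:
    """AND together the OR-blocks of each layer."""
    return "(" + " AND ".join(or_block(layer) for layer in layers) + ")"


# Shortened, illustrative term lists (a real strategy would carry many more terms).
intervention = ['"image-based"[tiab]', '"food recognition"[tiab]']
outcome = ['"dietary assessment"[tiab]', '"diet records"[MeSH]']
validation = ['validat*[tiab]', 'agreement[tiab]']

query = build_query(intervention, outcome, validation)
print(query)
```

Keeping each layer as a separate list makes it easy to widen one layer (typically the intervention layer) without touching the other two.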

A working PubMed stub, as of October 2024:

(("mobile application*"[tiab] OR "smartphone app*"[tiab] OR "image-based"[tiab]
   OR "photo*"[tiab] OR "food recognition"[tiab] OR "computer vision"[tiab]
   OR "deep learning"[tiab] OR "convolutional neural network"[tiab])
 AND ("dietary assessment"[tiab] OR "dietary intake"[tiab] OR "food intake"[tiab]
      OR "nutrition assessment"[MeSH] OR "diet records"[MeSH] OR "food record*"[tiab])
 AND (validat*[tiab] OR accuracy[tiab] OR agreement[tiab] OR "Bland-Altman"[tiab]
      OR reliability[tiab] OR "mean absolute percentage error"[tiab] OR MAPE[tiab]))

This returned 1,043 records in a snapshot taken 3 October 2024. After merging these with the results from Embase and Scopus (which we always run in parallel) and removing duplicates, the pooled set was 1,288 records, of which 71 met our inclusion criteria at full-text review.[1]
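The merge-and-de-duplicate step can be sketched as follows. This is one common approach rather than a description of our exact tooling: it assumes each exported record is a dictionary with hypothetical `doi`, `title` and `year` fields, matches on DOI where one exists, and otherwise falls back to a normalised title plus year. The records shown are invented for illustration.

```python
import re
from typing import Iterable


def record_key(record: dict) -> str:
    """Build a matching key: the DOI if present, else normalised title plus year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    title = re.sub(r"[^a-z0-9]+", " ", (record.get("title") or "").lower()).strip()
    return f"title:{title}|{record.get('year', '')}"


def deduplicate(records: Iterable[dict]) -> list[dict]:
    """Keep the first record seen for each key, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Invented example records; field names are assumptions about the export format.
pubmed = [{"doi": "10.1000/example.1", "title": "Validation of a Food Photo App", "year": 2023}]
embase = [
    {"doi": "10.1000/EXAMPLE.1", "title": "Validation of a food photo app.", "year": 2023},
    {"doi": "", "title": "Portion estimation from images", "year": 2022},
]
scopus = [{"doi": None, "title": "Portion Estimation From Images", "year": 2022}]

pool = deduplicate(pubmed + embase + scopus)
print(len(pool))
```

Exact-key matching of this kind will not catch every duplicate (for example, records whose titles differ by a subtitle), so a manual pass over near-matches is still advisable.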

What this strategy misses

Three classes of relevant paper slip through a strategy of this shape. First, conference-proceedings work in computer-vision venues (CVPR, MICCAI workshops, MADiMa) is not indexed in PubMed at all and has to be recovered separately via dblp or Semantic Scholar. Second, validation studies that frame their outcome as “calorie estimation” or “portion estimation” without the word “dietary” are occasionally missed; after an internal audit in 2023 we added ("calorie estimat*"[tiab] OR "portion estimat*"[tiab]) as a supplementary layer. Third, papers whose titles emphasize the machine-learning method rather than the nutritional application are routinely missed by our outcome layer.[2]
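One way to make such misses visible is to check a strategy against a set of papers already known to be relevant and see which of them it fails to retrieve. The sketch below computes that relative recall. It is an illustration of the general technique, not a record of how our 2023 audit was run, and the identifiers are placeholders rather than real PMIDs.

```python
def relative_recall(retrieved: set[str], known_relevant: set[str]) -> tuple[float, set[str]]:
    """Return the share of known-relevant items retrieved, plus the items missed."""
    if not known_relevant:
        raise ValueError("known_relevant must not be empty")
    missed = known_relevant - retrieved
    recall = len(known_relevant & retrieved) / len(known_relevant)
    return recall, missed


# Placeholder identifiers standing in for a hand-curated known-item set.
known = {"id-A", "id-B", "id-C", "id-D"}
base_results = {"id-A", "id-B", "id-X"}
with_supplement = {"id-A", "id-B", "id-C", "id-X"}

recall_before, missed_before = relative_recall(base_results, known)
recall_after, missed_after = relative_recall(with_supplement, known)
print(recall_before, sorted(missed_before))
print(recall_after, sorted(missed_after))
```

Reading the titles and abstracts of the missed items is usually what suggests the next terms to add.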

On indexing lag

There is a structural lag between a paper’s appearance and its full MeSH indexing. In a small audit we performed in mid-2024, the median lag for papers in this area was approximately 127 days from PubMed deposit to completed MeSH indexing, with a long tail. For a review with a recent cut-off date, free-text recall dominates; MeSH-only strategies reliably under-recover the most recent six months of literature.[3] Reviewers who do not adjust for this will produce search results that look tidy and are systematically incomplete.
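The lag statistic itself is simple to compute once deposit and indexing dates have been collected. The sketch below uses three invented date pairs purely to show the arithmetic; it does not reproduce our audit data, and the variable names are ours.

```python
from datetime import date
from statistics import median


def lag_days(deposited: date, indexed: date) -> int:
    """Days between PubMed deposit and completed MeSH indexing."""
    return (indexed - deposited).days


# Invented (deposit date, indexing-complete date) pairs, for illustration only.
pairs = [
    (date(2024, 1, 1), date(2024, 4, 10)),
    (date(2024, 2, 1), date(2024, 6, 30)),
    (date(2024, 3, 1), date(2024, 5, 30)),
]

lags = [lag_days(deposited, indexed) for deposited, indexed in pairs]
median_lag = median(lags)
print(lags, median_lag)
```

Records that have not yet been indexed at the time of the audit have no second date; leaving them out biases the median downwards, which is worth stating when the figure is reported.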

We register our searches in PROSPERO alongside the protocol, and we publish the full executed string (including date of execution and number of hits) as a supplementary file. A search strategy that cannot be re-run is not a search strategy; it is a recollection.[4]
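A minimal sketch of what such a supplementary file might contain is given below. The `SearchLogEntry` structure and its field names are assumptions made for illustration, not a PROSPERO requirement or our actual file format; the date and hit count are the PubMed figures quoted earlier, and the query field holds a placeholder rather than the full string.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class SearchLogEntry:
    """One executed search: where, what, when, and how many hits."""
    database: str
    query: str
    executed_on: str  # ISO 8601 date
    hits: int


def to_json(entries: list[SearchLogEntry]) -> str:
    """Serialise log entries as a JSON array suitable for a supplementary file."""
    return json.dumps([asdict(entry) for entry in entries], indent=2, ensure_ascii=False)


entry = SearchLogEntry(
    database="PubMed",
    query="<full executed search string goes here>",  # placeholder, not the real string
    executed_on=date(2024, 10, 3).isoformat(),
    hits=1043,
)

log_text = to_json([entry])
print(log_text)
```

Storing the string exactly as executed, rather than a tidied version, is what allows someone else to re-run it and compare hit counts.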

Footnotes

  1. Henriksen, L. & Weiss, H. (2024). Internal search log, DAI-SR-2024-03; execution date 3 October 2024.

  2. See discussion in McKibbon, K. A. et al. (2020), Journal of the Medical Library Association, 108(4), 557–567, on recall failures in interdisciplinary searches.

  3. Irwin, A. N. & Rackham, B. D. (2017). Comparison of the time-to-indexing in PubMed between biomedical journals. American Journal of Health-System Pharmacy, 74(6), 381–384.

  4. PROSPERO registration guidance, Centre for Reviews and Dissemination, University of York, UK (funded by the National Institute for Health and Care Research).

Keywords

PubMed; search strategy; systematic review; dietary assessment; validation; MeSH; information retrieval

License

This piece is distributed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).