Methodology Brief
Labelling vendor-reported vs. independently-replicated accuracy numbers: an editorial convention
Background
Accuracy claims for commercial dietary assessment tools enter the scientific record through several channels: peer-reviewed validation studies by vendors, peer-reviewed validation studies by independent groups, vendor technical reports, press releases, and secondary citations in review articles. These sources differ materially in their incentive structures, sample construction, reference choice, and preregistration practice, yet a reader encountering a bare “MAPE = 12%” in a narrative review usually cannot tell which channel it came from.
A 2023 survey of systematic reviews in image-based dietary assessment (Henriksen, J Nutr) found that one-third of accuracy figures in review tables could not be traced to a peer-reviewed study, and roughly one in six came from vendor white papers. The Initiative's editorial convention is therefore that every accuracy number in an Initiative publication must carry a provenance label.
The Method
Initiative documents classify accuracy numbers into four provenance tiers:
- Tier A - Independently replicated. The accuracy figure was produced by a group with no financial relationship to the vendor of the tool being evaluated, in a peer-reviewed publication, with a reference standard described in sufficient detail for replication.
- Tier B - Vendor-involved peer-reviewed. A peer-reviewed study with vendor authorship, funding, or material participation, where the reference standard and analysis are fully described.
- Tier C - Vendor-reported, non-peer-reviewed. Accuracy claims from vendor technical reports, pre-prints not yet peer-reviewed, or product documentation.
- Tier D - Secondary citation only. Numbers cited in reviews without a retrievable primary source.
In Initiative tables, accuracy figures are rendered with a superscripted tier letter. Tier C numbers may be included in narrative text but not in quantitative summary tables (for example, forest plots or pooled estimates). Tier D numbers are excluded from quantitative synthesis and flagged in a footnote if they appeared in the source literature.
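The inclusion rules above can be sketched as a small lookup table. This is an illustrative sketch only; the `TIER_POLICY` mapping and `allowed` function are hypothetical names, not part of any published Initiative tooling.

```python
# Hypothetical encoding of the tier inclusion rules described above.
# Tier letters follow the brief; the data structure itself is illustrative.
TIER_POLICY = {
    "A": {"summary_table": True,  "pooled": True},   # independently replicated
    "B": {"summary_table": True,  "pooled": True},   # vendor-involved, peer-reviewed
    "C": {"summary_table": False, "pooled": False},  # narrative text only
    "D": {"summary_table": False, "pooled": False},  # footnote flag only
}

def allowed(tier: str, context: str) -> bool:
    """Return whether a figure of the given tier may appear in the given context."""
    return TIER_POLICY[tier][context]
```

A check such as `allowed("C", "summary_table")` would return `False`, matching the rule that Tier C numbers stay out of quantitative summary tables.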
A secondary convention concerns the matching of reference standards. Two numbers may both be Tier A yet be non-comparable if one used weighed-food records and the other used 24-hour recall. When accuracy numbers are placed in the same table, the reference-standard column is always populated.
Worked example
A hypothetical evidence table for a candidate tool might read:
| Study | Year | N | Reference | MAPE | Tier |
|---|---|---|---|---|---|
| Valdez et al. | 2022 | 88 | Weighed-food | 14.8% | A |
| Park & Rhee | 2023 | 142 | 24-h recall | 9.7% | A |
| Bianchi et al. (vendor-funded) | 2022 | 210 | Weighed-food | 7.1% | B |
| Company X technical report | 2023 | not stated | "curated test set" | 4.3% | C |
| Review Y, 2024 citation | - | - | unclear | 11% | D |
Under Initiative conventions, the Tier A weighed-food and Tier B weighed-food results can be compared cautiously; the Tier A 24-h-recall figure is not pooled with them because the reference standards differ. The Tier C and Tier D entries appear in the table with their tier markers but are excluded from any summary statistic.
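The selection logic applied to this hypothetical table can be sketched in a few lines: drop Tiers C and D, then group the surviving rows by reference standard so that MAPE values are only compared within a matching reference. The row data below simply restates the hypothetical figures from the table above.

```python
# Illustrative sketch: selecting poolable rows from the worked-example table.
# Rows are (study, reference_standard, mape_percent, tier).
from collections import defaultdict

rows = [
    ("Valdez et al. 2022",    "Weighed-food",     14.8, "A"),
    ("Park & Rhee 2023",      "24-h recall",       9.7, "A"),
    ("Bianchi et al. 2022",   "Weighed-food",      7.1, "B"),
    ("Company X report 2023", "curated test set",  4.3, "C"),
    ("Review Y 2024",         "unclear",          11.0, "D"),
]

# Tiers C and D are excluded from any summary statistic; the remaining rows
# are grouped by reference standard before any comparison is made.
poolable = defaultdict(list)
for study, reference, mape, tier in rows:
    if tier in ("A", "B"):
        poolable[reference].append((study, mape))
```

Only the weighed-food group ends up with more than one entry, so only the Tier A and Tier B weighed-food figures are candidates for cautious comparison.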
Common pitfalls
- Citing a vendor white paper figure in a narrative review without marking it as non-peer-reviewed. This practice launders Tier C into the peer-reviewed literature.
- Pooling MAPE values across different reference standards. Reference choice is itself a major source of between-study heterogeneity.
- Treating a peer-reviewed paper with vendor authors as equivalent to an independently replicated one. The Initiative does not treat Tier B as inferior, but it does treat it as distinct, and reports both when available.
- Omitting the reference standard column from evidence tables. Without it, the reader cannot judge comparability.
- Using a single “accuracy” figure from a product-documentation page in a methods section. This is a frequent practice and is discouraged.
Recommended reporting
- Label every accuracy number in Initiative tables with a tier letter (A, B, C, D).
- Populate the reference-standard column for every row.
- Exclude Tier C and D from pooled estimates and forest plots.
- When a vendor-involved figure is included, disclose the specific financial relationship in the footnote.
- Where possible, retrieve and cite the primary study rather than a secondary review.
References
- Henriksen D. The provenance of accuracy numbers in AI dietary assessment reviews: a methodological audit. J Nutr. 2023;153(6):1640-1649.
- Weiss R, Okafor N. Conflict-of-interest disclosure and effect size in nutrition technology studies. Am J Clin Nutr. 2022;116(3):755-764.
- Patel R, Henriksen D. Reference-standard heterogeneity as a source of between-study variability in image-based dietary assessment. Nutrients. 2024;16(7):1015.
- Linde J, Caballero M. From press release to systematic review: tracing accuracy claims in digital health. JMIR mHealth Uhealth. 2023;11(2):e41122.
- Henriksen D, Weiss R. An evidence-tiering scheme for commercial dietary assessment tools. Public Health Nutr. 2025;28(1):14-23.
- Yoshida H, Morales F. Unreproducible accuracy claims in nutrition technology: three case studies. Br J Nutr. 2022;128(5):580-589.
- Weiss R. Editorial independence in nutrition technology evaluation. J Nutr. 2024;154(9):2621-2624.
Keywords
evidence quality; vendor claims; independent validation; reporting conventions; editorial policy; dietary assessment; accuracy
License
This piece is distributed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).