Commentary

A short audit of food-database provenance in five consumer applications

Where the nutritional composition numbers actually come from

Readers who have spent time in the weeds of image-based dietary assessment will be familiar with the phrase "garbage in, garbage out." Less often discussed is what the "in" actually is: once an application has identified a food and estimated a portion, it retrieves the nutritional composition of that portion from a database. The database is a piece of the pipeline that rarely receives audit attention. In preparation for a planned narrative review on this topic, we conducted a descriptive audit of food-database provenance in five widely used consumer nutrition-tracking applications during August and September 2025. This is a short commentary on what we found; a full methodology note will accompany the review.[1]

The five applications

We anonymize the five applications here as Apps A through E. The selection was stratified: two applications with a primarily US user base, two with a primarily European user base, and one with a primarily Asian user base. All five are consumer-facing and among the 25 most-downloaded nutrition applications in their respective regions as of Q2 2025.[2]

What we asked

For each application, we attempted to determine: (1) the underlying food database or databases, (2) the update frequency, (3) whether user-submitted entries are distinguishable from verified entries in the interface, and (4) whether user-submitted entries are reviewed before being made visible to other users. Information was gathered from each application's public documentation, support pages, terms of service, and, where publicly documented, regulatory filings and academic publications by the vendor.[3]

What we found

Provenance varied widely. App A claimed to use USDA FoodData Central (successor to the USDA National Nutrient Database for Standard Reference) as its primary source, supplemented by user-submitted entries; the user-submitted entries were not marked in the interface. App B used a proprietary database the vendor stated was "derived from USDA and EU sources," with no public crosswalk documentation. App C combined three regional databases (USDA FoodData Central, CIQUAL for France, and the Japanese Standard Tables of Food Composition) with documented versioning. App D relied almost entirely on user-submitted entries with a community-voting quality signal. App E used a proprietary database with no stated public source.[4]

Update frequency ranged from “annual” (App C, self-reported) to “continuous” (Apps A and D, by virtue of the user-submitted channel). Explicit versioning — in which the user can see which version of the database produced their nutritional estimate — was present in only App C.

On the user-submitted question, four of the five applications mixed user-submitted entries into search results without visible distinction from verified entries. App C was the exception, marking user-submitted entries with a secondary indicator. Prior review of user-submitted entries was rare: App C operated a review queue, and the other four relied on post hoc community voting or no review at all.[5]

Why this matters

A user who logs a meal in any of Apps A, B, D, or E may, without realizing it, be receiving a nutritional estimate derived from an entry another user created with no review. In our manual spot-checks we identified entries in two of the applications where the reported energy per serving differed by more than 30% from the USDA FoodData Central value for the same food. This is not a critique of user-submitted content as such (user-submitted databases have strengths, notably in cultural breadth), but the absence of a visible provenance signal is a problem. A dietary-assessment pipeline cannot be better than its database, and a user cannot meaningfully calibrate their trust in a number if they cannot see where the number came from.[6]
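The spot-check arithmetic is straightforward to reproduce. A minimal sketch of the deviation check, in which all food names and energy values are hypothetical illustrations (not taken from our extraction sheet) and the 30% threshold matches the one used above:

```python
# Flag logged entries whose energy per serving deviates from a
# reference value (e.g., the USDA FoodData Central figure for the
# same food) by more than a chosen relative threshold.

def energy_deviation(entry_kcal: float, reference_kcal: float) -> float:
    """Relative deviation of an entry's energy from its reference value."""
    return abs(entry_kcal - reference_kcal) / reference_kcal

def flag_entries(entries, threshold=0.30):
    """Return (name, deviation) pairs for entries exceeding the threshold."""
    return [
        (name, round(energy_deviation(kcal, ref), 3))
        for name, kcal, ref in entries
        if energy_deviation(kcal, ref) > threshold
    ]

# Hypothetical spot-check data: (food name, entry kcal, reference kcal).
sample = [
    ("rolled oats, 40 g", 150.0, 152.0),   # ~1% deviation: not flagged
    ("granola bar, 1 bar", 260.0, 190.0),  # ~37% deviation: flagged
]

print(flag_entries(sample))  # [('granola bar, 1 bar', 0.368)]
```

Relative (rather than absolute) deviation is used so that the same threshold is meaningful for low- and high-energy foods alike.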

Where we would like to see the field move

We would encourage consumer applications to display database provenance at the point of log, not buried in a settings page, and to mark user-submitted entries distinctly from verified ones. We would encourage the research community to treat database version as a reportable parameter in validation studies, in the same way software version is reported. The 2026 narrative review the Initiative is preparing will argue for both of these.[7]

Footnotes

  1. Rivera, S. et al. (forthcoming 2026). Food-database provenance in consumer dietary applications: a narrative review. Dietary Assessment Initiative, DAI-NR-2026-02.

  2. Selection source: Sensor Tower Q2 2025 rankings, cross-referenced with Google Play top-charts.

  3. Audit methodology protocol, Initiative internal document DAI-AUD-2025-01.

  4. Full per-application extraction sheet will accompany the 2026 narrative review.

  5. Haytowitz, D. B. et al. (2019). USDA Food and Nutrient Database for Dietary Studies. Journal of Food Composition and Analysis, 79, 1–7.

  6. Greenfield, H. & Southgate, D. A. T. (2003). Food Composition Data: Production, Management and Use (2nd ed.), FAO.

  7. See also Roe, M. et al. (2013). Harmonised procedures for producing new data on the nutritional composition of ethnic foods. Food Chemistry, 140(3), 422–427.

Keywords

food database; provenance; USDA FoodData Central; audit; consumer apps; data quality

License

This piece is distributed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).