// Framework

Clinical Evaluation Framework

Creator: Clinical App Report
Published: 2026-05-18T00:00:00.000Z

Every review and ranking on Clinical App Report is scored against the same 100-point framework: seven weighted criteria, plus an Evidence Grade (A–F) anchored in the published validation literature.

The 100-point framework

Clinical Evaluation Framework — criteria, weights, and what we measure
Criterion	Weight	What we measure
Evidence & Validation	25%	Peer-reviewed validation studies, regulatory posture (FDA/MHRA/CE), citation depth in clinical literature
Clinical Accuracy	20%	Measurement validity: MAPE vs. weighed reference meals, database verification level, robustness to noise
AI Recognition Performance	15%	Top-1/Top-3 food identification, portion MAPE, plate segmentation across lighting and angle
Macronutrient & Goal Framework	10%	Macro depth, goal personalization, adaptive coaching protocols, recipe analyzer fidelity
Behavioral Adherence	10%	Median time-to-log on a 20-task battery, friction, drop-off pattern from longitudinal studies
Privacy & Security	10%	Data handling clarity, HIPAA posture, ease of export/deletion, cancellation friction, monetization conflicts
Cost & Accessibility	10%	True 12-month cost, free-tier usefulness, language coverage, low-resource device support

Each criterion produces a sub-score from 0 to 100; the weighted sum is the overall score. The Evidence Grade is a separate, structured assessment of validation evidence (A–F).
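The weighted sum can be sketched in a few lines of Python. The weights come from the table above; the dictionary keys and the example sub-scores are hypothetical and chosen only for illustration.

```python
# Criterion weights from the framework table (they sum to 1.0).
# The key names are illustrative, not an official schema.
WEIGHTS = {
    "evidence_validation": 0.25,
    "clinical_accuracy": 0.20,
    "ai_recognition": 0.15,
    "macro_goal_framework": 0.10,
    "behavioral_adherence": 0.10,
    "privacy_security": 0.10,
    "cost_accessibility": 0.10,
}


def overall_score(sub_scores: dict[str, float]) -> float:
    """Return the weighted sum of the seven 0-100 sub-scores."""
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= sub_scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)


# Hypothetical sub-scores for an imaginary app (not real results).
example = {
    "evidence_validation": 60,
    "clinical_accuracy": 80,
    "ai_recognition": 70,
    "macro_goal_framework": 90,
    "behavioral_adherence": 75,
    "privacy_security": 85,
    "cost_accessibility": 50,
}
print(round(overall_score(example), 2))  # 71.5
```

Because the weights sum to 100%, the overall score stays on the same 0 to 100 scale as the sub-scores.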

Evidence & Validation (25%)

Evidence & Validation carries the largest weight because clinical credibility depends on it. We assess peer-reviewed validation studies, regulatory posture (FDA / MHRA / CE), citation depth in clinical literature, and the publisher's own methodology transparency. The Evidence Grade (A–F) is a structured summary: A requires at least one published RCT validating the app as a clinical intervention versus an active comparator; B requires peer-reviewed observational validation; C requires manufacturer-cited validation; D requires documented methodology; F means none of these is available.
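One way to read the grade ladder is as a cascade that returns the highest grade whose requirement is met. The sketch below assumes that reading; the `EvidenceRecord` class and its field names are hypothetical, and the text does not say how borderline evidence is judged.

```python
from dataclasses import dataclass


@dataclass
class EvidenceRecord:
    # All field names are hypothetical, chosen for illustration.
    published_rcts_vs_active_comparator: int = 0
    peer_reviewed_observational: bool = False
    manufacturer_cited_validation: bool = False
    documented_methodology: bool = False


def evidence_grade(rec: EvidenceRecord) -> str:
    """Return the highest grade whose requirement is met, else F.

    Assumes the grades cascade from A down to D; this ordering is
    an interpretation of the text, not a documented rule.
    """
    if rec.published_rcts_vs_active_comparator >= 1:
        return "A"
    if rec.peer_reviewed_observational:
        return "B"
    if rec.manufacturer_cited_validation:
        return "C"
    if rec.documented_methodology:
        return "D"
    return "F"


# An app with only manufacturer-cited validation and documented methodology.
print(evidence_grade(EvidenceRecord(manufacturer_cited_validation=True,
                                    documented_methodology=True)))  # C
```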

Clinical Accuracy (20%)

Clinical Accuracy is anchored to Mean Absolute Percentage Error (MAPE) against weighed reference meals. Each reference meal is built from USDA FoodData Central composition values, with every ingredient weighed on a calibrated kitchen scale (0.1g precision). We compute MAPE of each app's predicted kcal vs the reference value across the battery.

Scoring anchor: accuracy_points = clamp(100 − MAPE × 4, 0, 100), with MAPE expressed in percent. A 5% MAPE earns 80 points; a 15% MAPE earns 40; 25% or more earns zero. The slope was chosen so an app at the boundary of clinical usefulness (~5% MAPE per Schoeller 1995) earns a strong but not perfect sub-score.
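As a minimal sketch, the MAPE calculation and the scoring anchor can be written as follows. The function names and the three example meals are hypothetical; the formula itself is the one stated above, with MAPE in percent.

```python
def mape(predicted_kcal: list[float], reference_kcal: list[float]) -> float:
    """Mean Absolute Percentage Error, in percent, across a meal battery."""
    if not reference_kcal or len(predicted_kcal) != len(reference_kcal):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(r <= 0 for r in reference_kcal):
        raise ValueError("reference values must be positive")
    errors = [abs(p - r) / r for p, r in zip(predicted_kcal, reference_kcal)]
    return 100.0 * sum(errors) / len(errors)


def accuracy_points(mape_percent: float) -> float:
    """clamp(100 - MAPE x 4, 0, 100), as given in the scoring anchor."""
    return max(0.0, min(100.0, 100.0 - 4.0 * mape_percent))


# Anchor points quoted in the text.
print(accuracy_points(5))   # 80.0
print(accuracy_points(15))  # 40.0
print(accuracy_points(25))  # 0.0

# Hypothetical three-meal battery: errors of 5%, 5% and 0%.
m = mape([525, 380, 250], [500, 400, 250])
print(round(m, 2), round(accuracy_points(m), 2))
```

Each meal contributes equally to the mean regardless of its calorie content, so a large relative error on a small snack counts as much as the same relative error on a full dinner.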

AI Recognition Performance (15%)

For each AI-photo-capable app we run a 30-plate photo battery across three lighting conditions, three angles, and three plate sizes. Sub-scoring: Top-1 identification correctness (40 of 100 AI-subscore points), Top-3 identification correctness (20), portion-size MAPE (30), and plate segmentation accuracy on multi-item plates (10).
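The point allocation can be sketched as below. The text does not say how each measurement is converted into points, so this sketch assumes every component is first normalized to a fraction between 0 and 1 (for example, the share of plates identified correctly) and then scaled by its allocation. In particular, the mapping from portion-size MAPE to a fraction is left to the caller. All names are hypothetical.

```python
# Point allocation from the text (sums to 100 AI sub-score points).
AI_POINTS = {
    "top1": 40,          # Top-1 identification correctness
    "top3": 20,          # Top-3 identification correctness
    "portion": 30,       # portion-size MAPE, already mapped to a 0-1 fraction
    "segmentation": 10,  # plate segmentation accuracy on multi-item plates
}


def ai_subscore(fractions: dict[str, float]) -> float:
    """Combine normalized component results (0-1) into a 0-100 sub-score.

    Assumption: each component earns points linearly in its fraction.
    """
    total = 0.0
    for name, points in AI_POINTS.items():
        value = fractions[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
        total += points * value
    return total


# Hypothetical results for an imaginary app (not real data).
print(round(ai_subscore({"top1": 0.8, "top3": 0.9,
                         "portion": 0.5, "segmentation": 0.7}), 2))  # 72.0
```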

Macronutrient & Goal Framework (10%)

The Macronutrient & Goal Framework criterion covers four sub-dimensions: macro display depth (calories, P/C/F, net carbs, fiber as first-class metrics), target-setting flexibility (custom per-macro targets, time-windowed targets), adaptive coaching protocols (TDEE estimation, weekly target adjustment), and recipe builder fidelity.

Behavioral Adherence (10%)

Behavioral Adherence is measured as median time-to-log across a standardized 20-task battery, plus drop-off pattern from published longitudinal-use studies. Friction matters because logging consistency over weeks is the variable that most predicts weight-management outcomes: an app that is faster to log with tends to produce a more complete record, and so a more accurate picture over time, even when per-meal accuracy is comparable.
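The median time-to-log is straightforward to compute. The sketch below assumes one timing in seconds per task and rejects a battery that does not contain exactly 20 tasks; the timings are made up, and the conversion from the median to sub-score points is not specified in the text, so it is not shown.

```python
import statistics

BATTERY_SIZE = 20  # the standardized battery described in the text


def median_time_to_log(task_seconds: list[float]) -> float:
    """Median logging time, in seconds, across the 20-task battery."""
    if len(task_seconds) != BATTERY_SIZE:
        raise ValueError(f"expected {BATTERY_SIZE} task timings")
    return statistics.median(task_seconds)


# Hypothetical timings (seconds) for an imaginary app: 11, 12, ..., 30.
timings = [float(t) for t in range(11, 31)]
print(median_time_to_log(timings))  # 20.5
```

The median is less sensitive than the mean to a single slow task, such as one unusually complex recipe entry.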

Privacy & Security (10%)

Privacy is graded on data handling clarity, HIPAA posture (where applicable), retention policy transparency, ease of data export and deletion, cancellation friction, and whether the product's monetization model creates conflicts of interest with user advice quality.

Cost & Accessibility (10%)

Accessibility is computed as feature-density per dollar of annual cost plus free-tier usefulness, language coverage, and low-resource device support. Aggressive trial-conversion pricing reduces the sub-score.

Test cadence

Top-tier apps are re-evaluated quarterly. Mid-tier apps are re-evaluated semi-annually. A vendor release that changes core methodology, database source, or photo-AI model triggers a 30-day re-test window. Evidence Grade updates as new validation evidence publishes.

Quality control

All evaluation and scoring is reviewed against the test data before publication. Substantive corrections are logged with date and reason.

Why we don't take affiliate money

We don't maintain affiliate accounts with any of the apps we evaluate. Our reasoning is documented in our no-affiliate disclosure.