The Problem
AI/ML SaMD teams routinely struggle to design adjudication protocols that are regulator-defensible yet fast enough for commercial timelines. Sponsors face a thicket of inconsistent practices—panel sizes, voting rules, QC layers, segmentation consensus methods (e.g., STAPLE vs. per-pixel), measurement aggregation (mean/median/two-most-concordant), and when to rely on objective diagnosis instead of multi-expert opinion. The result is avoidable rework, slow study execution, and uncertainty during FDA interactions.
This article aims to solve three concrete problems:
- Ambiguity in “ground truth” construction. Teams lack a clear, task-fit playbook (detection, triage, diagnosis, segmentation, measurement, EEG/time-series) to convert expert reads and objective diagnosis into a single, auditable reference standard.
- Excessive operational burden. Over-sized panels, unnecessary synchronous meetings, and open-ended QC loops inflate cost and time. Sponsors need lean defaults (asynchronous reads, 2-of-3/2+1, thresholded QC) that preserve rigor without calendar drag.
- Regulatory predictability. Without precedent-aligned patterns, adjudication sections in protocols and submissions invite questions and delays. Teams need concise, precedent-backed templates that map methods to indications and explicitly define triggers, thresholds, and escalation paths.
In short: we are codifying pragmatic, precedent-aligned adjudication patterns that minimize meetings and re-reads, maximize use of objective diagnosis, and standardize thresholds—so we can lock ground truth faster, document it cleanly, and move to market sooner with fewer regulatory surprises.
Why this matters for our “FDA submission in 3 months” guarantee
Our 3-month submission guarantee depends on eliminating as many unknowns and timeline risks as possible. One of the biggest timeline risks is getting time on clinical experts' calendars. The fastest path to a defensible submission is to lock a clean, auditable “ground truth” early, with minimal meetings, minimal panel size, maximum automation, and maximum reliance on objective diagnosis. This article operationalizes that into concrete defaults we implement on day 1:
- Objective-first: We prioritize pathology/operative findings, PSG, echo, or structured chart review as primary truth. Expert reads verify mapping only. This compresses ground truthing from weeks into days, reduces FDA back-and-forth on reference standards, and avoids getting stuck on high inter-reader variability. It is hard to argue with the pathology result.
- Asynchronous reads with 2-of-3 / 2+1: Three blinded readers in parallel; adjudicator engages only on discordance. No standing panel meetings. This keeps synchronous meetings off your critical path.
- Segmentation consensus by algorithm (STAPLE), not meetings. We will generate a single per-case consensus mask by applying STAPLE to ≥3 independent expert masks (probability map → pre-specified threshold with morphology/topology guards).
- If the primary endpoint is a measurement (e.g., volume, area, length, mean HU), we will derive the scalar(s) from the STAPLE consensus mask for each case (no panel meetings or per-pixel manual reconciliation).
- If the endpoint is segmentation quality (Dice/HD), the primary analysis will compare the algorithm directly to the STAPLE consensus mask. As a secondary sensitivity, we will also analyze algorithm vs. each reader mask using a mixed-effects model with random effects for case and reader to account for clustering; the median of per-reader Dice will be reported as an additional robustness metric.
- Decouple detection and segmentation consensus. Detection consensus can be established by majority vote in most cases, but you must first predefine the detection threshold and the minimum detectable unit. For example, if your device detects lung nodules, you can count a nodule as positive when at least two radiologists' segmentations overlap. Segmentations with no overlap are removed (treated as false positives).
- Measurement truth = math: Mean/median across three measurements; escalate only on dispersion gates (e.g., >5 mm or >10%). Disagreements resolve without convening a room.
- Pre-specified triggers & “Indeterminate” handling: We write numeric gates (overlap seconds, Dice, IoU, dispersion) directly into your protocol, plus clear “Indeterminate → forced decision/objective” rules—preventing post-hoc debates and extra cycles.
- Parallel execution: We batch work, pre-calibrate readers, start as many workstreams as early as possible, keep alternates on standby, and for high-risk items sometimes run redundant parallel threads so that timeline contingencies are in place.
Bottom line: These adjudication patterns are how we keep your validation on schedule, your protocol reviewer-friendly, and your documentation bulletproof—so we can stand behind a three-month submission timeline with confidence. These are guiding principles rather than hard-and-fast rules. Nuances of your situation may require departures.
Seven Guiding Principles
1) Objective‑first truth hierarchy
Default: When an objective diagnosis exists, use it as the primary reference; restrict experts to verification/QC.
Objective Diagnosis: pathology/operative findings, PSG, echocardiography, structured chart review.
Why: Highest objectivity, minimal meetings, fastest lock.
Precedents: pathology/diagnostic imaging verification K221449; PSG scoring and physiologist review K213360, K233618; echo anchoring K213794; duplicate chart review for AF K233549; structured Sepsis‑3 committee with “Indeterminate/forced‑majority” handling DEN230036.
2) Asynchronous, minimal‑panel reads with 2+1 fallback for detection/triage
Default: Three independent, blinded readers in parallel; truth = 2‑of‑3 majority. If the first two disagree, use a third adjudicator (2+1)—no full panel meeting unless pre‑specified.
Why: Parallelizable; adjudicator time is limited to discordant cases.
Precedents: Majority for case‑level truth K231025, K241923, K242821, K251151, K220709; 2+1 adjudication in CADt/acute findings K213721, K214043, K243611, K243363, K231384.
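The 2‑of‑3 rule above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation; the function name `majority_of_three` is hypothetical, and returning `None` for a three-way split (possible only with more than two label classes) is an assumption about how such cases would be routed to an adjudicator.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_of_three(reads: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the label chosen by at least two of three readers, else None."""
    if len(reads) != 3:
        raise ValueError("expected exactly three independent reads")
    label, votes = Counter(reads).most_common(1)[0]
    # With binary labels a majority always exists; with three or more
    # classes a 1-1-1 split is possible and is left for adjudication.
    return label if votes >= 2 else None


# Usage: one case read by three blinded readers.
case_truth = majority_of_three([True, False, True])
```

Because the function is pure and per-case, it can run as soon as the third read arrives, which is what keeps meetings off the critical path.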
3) Segmentation: algorithmic consensus first, human QC only by exception
Default: ≥3 expert masks → STAPLE (primary) or per‑pixel majority to form the consensus mask; QC only if thresholds are breached (e.g., Dice/HD outside bounds).
Instance segmentation: match objects (IoU) first, then apply the same consensus per instance; use an enclosing box rule if working with boxes.
Why: Removes most meetings; reproducible and familiar to FDA.
Precedents: STAPLE consensus K220034, K223268, K252362, K250686; per‑pixel majority/QC K241108, K242607; thresholded expert QC/reconciliation K242745, K243647; box reconciliation by enclosure K213566.
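To make the "probability map → threshold" idea concrete, here is a deliberately simplified sketch of binary STAPLE as an expectation-maximization loop over flattened 0/1 masks. It assumes a fixed global foreground prior, fixed starting sensitivity/specificity of 0.99, and no spatial prior or morphology/topology guards; the function name `staple_binary` and its parameters are hypothetical. A real study would more likely rely on a validated library implementation.

```python
from typing import List, Sequence


def staple_binary(masks: Sequence[Sequence[int]], iterations: int = 50,
                  init_sens: float = 0.99, init_spec: float = 0.99) -> List[float]:
    """Simplified binary STAPLE: per-voxel posterior probability of foreground."""
    n_readers, n_voxels = len(masks), len(masks[0])
    # Fixed prior: overall fraction of voxels the readers marked as foreground.
    prior = sum(sum(m) for m in masks) / (n_readers * n_voxels)
    sens = [init_sens] * n_readers
    spec = [init_spec] * n_readers
    weights = [0.0] * n_voxels
    for _ in range(iterations):
        # E-step: posterior foreground probability given current reader quality.
        for i in range(n_voxels):
            fg_like, bg_like = prior, 1.0 - prior
            for j in range(n_readers):
                if masks[j][i]:
                    fg_like *= sens[j]
                    bg_like *= 1.0 - spec[j]
                else:
                    fg_like *= 1.0 - sens[j]
                    bg_like *= spec[j]
            total = fg_like + bg_like
            weights[i] = fg_like / total if total > 0 else 0.0
        # M-step: re-estimate each reader's sensitivity and specificity.
        fg_mass = sum(weights)
        bg_mass = n_voxels - fg_mass
        for j in range(n_readers):
            if fg_mass > 0:
                sens[j] = sum(w for w, d in zip(weights, masks[j]) if d) / fg_mass
            if bg_mass > 0:
                spec[j] = sum(1.0 - w for w, d in zip(weights, masks[j]) if not d) / bg_mass
    return weights


# Usage: three readers, eight voxels, pre-specified 0.5 threshold.
reader_masks = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
]
probabilities = staple_binary(reader_masks)
consensus_mask = [int(p >= 0.5) for p in probabilities]
```

Unlike a plain per-pixel vote, the EM loop down-weights a reader who systematically over- or under-segments, which is the main reason to prefer it when reader styles differ.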
4) Measurements: compute first (mean/median), escalate only on dispersion
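Default: Three blinded measurements → truth = median (or mean); escalate only if dispersion exceeds a pre-specified gate (e.g., >5 mm or >10%).
Why: Fast, automatic adjudication without convening readers.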
Precedents: Mean/average across three readers K232083, K230534, K230497; “two most concordant” rule / dispersion handling K222361, K231324; measurement on consensus structures/derived metrics K242607, K241038.
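A compute-first measurement rule might look like the sketch below. The gate values come from the examples in this article (>5 mm or >10%); defining relative dispersion as range divided by the median is an assumption that a protocol would need to state explicitly, and the name `measurement_truth` is hypothetical.

```python
from statistics import median
from typing import Sequence, Tuple


def measurement_truth(values: Sequence[float], abs_gate: float,
                      rel_gate: float) -> Tuple[float, bool]:
    """Return (median, needs_escalation) for one case's blinded measurements."""
    truth = median(values)
    spread = max(values) - min(values)
    # Assumed definition: relative dispersion = range / |median|.
    if truth != 0:
        rel_spread = spread / abs(truth)
    else:
        rel_spread = 0.0 if spread == 0 else float("inf")
    return truth, (spread > abs_gate or rel_spread > rel_gate)


# Usage: diameters in mm with a 5 mm absolute gate and a 10% relative gate.
accepted = measurement_truth([41.0, 42.0, 43.5], abs_gate=5.0, rel_gate=0.10)
flagged = measurement_truth([30.0, 31.0, 38.0], abs_gate=5.0, rel_gate=0.10)
```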
5) Pre‑specify adjudication triggers, thresholds, and “Indeterminate” handling
Default: Put numbers in the protocol:
- Detection/classification: 2‑of‑3 majority triggers truth; “Indeterminate” allowed with conversion rules.
- Segmentation: Dice < Y or HD > Z → QC re‑read; otherwise accept consensus.
- Instance matching: IoU ≥ 0.5 (operationally set).
- Time‑series: event is “true” if ≥2/3 reviewers’ intervals overlap ≥1s.
Precedents: Temporal‑overlap rules (≥1s) and majority definitions K211452, K240993, with majority EEG policies K120260, K141883; STAPLE/consensus thresholds and QC workflows K220034, K252362, K241108; formal committee logic incl. “Indeterminate” DEN230036; “two most concordant” measurement rule K222361.
6) Right‑size panels & meetings; reserve MRMC for reader‑aid claims
Default:
- Detection/triage: stick to 3 readers (2‑of‑3 or 2+1).
- Ordinal/nuanced categories (e.g., breast density): use consensus of 5 only when justified; median tie‑rule acceptable.
- Reader‑aid claims: use MRMC; otherwise avoid.
- Synchronous consensus: deploy only for a pre‑flagged minority (e.g., policy choice for CADe), not for the entire cohort.
Precedents: 5‑reader density consensus/median K202013, K241561, K222275; MRMC reader‑performance studies K240301, K243234, K223347, K223623.
7) Engineer for speed & resilience
Default:
- Batch cases; run asynchronous first reads; pre‑book alternate readers; set response‑time SLAs.
- Calibrate with a short pilot to reduce downstream discordance.
- Document once: store raw reads, final consensus/metrics, and brief rationale only where adjudication occurred.
Why: Prevents stalls from drop‑offs and re‑reads; satisfies auditability without overhead.
Precedents: Reader‑variability assessment/calibration K223347; dual‑reader reconciliation to consensus with iterative corrections K243647; targeted expert QC in segmentation K242745; senior adjudicator final‑say for edge disagreements K231631; 2+1 schemes that keep progress unblocked K213721, K214043, K243611, K243363, K231384.
Themes, patterns, and outliers
A. Dominant patterns
- Multi‑expert truthing is the norm. Most sponsors use either 2‑of‑3 majority, two readers + a third adjudicator, or explicit panel consensus (often with blinding). Examples span triage/detection, diagnosis, segmentation, and measurement tasks K231025, K241923, K242821, K251151, K213721, K214043, K243611, K243363, K231384, K202013, K241561, K243234.
- Method matches task.
- Detection/triage → majority/2+1 adjudication K231025, K241923, K243611.
- Segmentation → STAPLE or per‑pixel majority + QC K220034, K252362, K241108, K242607, K242745, K243647.
- Measurements → mean/median or “two most concordant,” with thresholds for escalation K230497, K232083, K231324, K242607, K222361.
- Time‑series (EEG/sleep) → temporal‑overlap rules (e.g., ≥1s overlap among ≥2/3 reviewers) K211452, K240993, K241390, K233438, K120260, K141883.
- Quality layers are common. Senior‑review or sequential reconciliation appears frequently, especially for complex masks K221449, K243647, K242745, K240411.
- External objective diagnoses are used where available (pathology, PSG, echo, chart review), often superseding expert opinion in CADx contexts K221449, K213360, K213794, K233549, K233618, DEN230036.
B. Outliers and special cases
- Synchronous (“same‑room”) consensus is a policy choice. FDA summaries usually say “consensus,” not whether it was synchronous; you may still choose synchronous for CADe to accelerate reconciliation (see evidence of consensus workflows: K202013, K241561, K243647).
- User review as a safety net appears in a few algorithm‑improvement contexts, but should not be the primary truthing mechanism K243769.
- Acceptance‑category scoring (without explicit adjudication) is rare; pre‑specify clinical thresholds if you go this route K222745.
- Union‑of‑annotations for localization is a niche but transparent option DEN200080.
2) Multi‑expert truthing: how to choose (task‑driven)
Detection/triage (CADe/CADt): Default to three blinded experts with 2‑of‑3 majority and pre‑specified 2+1 escalation; this is fast, familiar, and robust K231025, K241923, K242821, K213721, K214043, K243611, K243363, K231384.
Diagnosis (CADx): Prefer external objective diagnosis (pathology/operative/PSG/echo/validated chart review) as primary truth; use expert verification only as needed K221449, K213360, K213794, K233549, K233618, DEN230036.
Segmentation (incl. instance segmentation): Use STAPLE or per‑pixel majority across ≥3 experts; add senior QC and thresholded re‑reads (e.g., Dice/HD limits). For multi‑object tasks, match instances (IoU/Hungarian) then combine per instance K220034, K252362, K223268, K241108, K242607, K242745, K243647.
Measurements (CADm): Take three independent measurements; truth = mean (or median if outlier‑prone). If dispersion exceeds a preset threshold (e.g., >10% or >5mm), escalate; optionally use two most concordant K230497, K232083, K231324, K242607, K222361.
Your preferred policies (incorporated):
- CADe: Synchronous panel consensus after independent reads (adds a live reconciliation layer). Evidence shows consensus is standard (synchronous is your enhancement): K202013, K241561, K243647, plus majority/2+1 patterns K231025, K241923, K243611.
- CADx: Objective‑first truthing; avoid expert consensus where the objective diagnosis is definitive K221449, K213360, K213794, K233549, DEN230036, K233618.
- Instance segmentation: STAPLE to combine masks, then a synchronous consensus review for edge cases K220034, K252362, K223268, K250686, K243647, K242994.
3) Examples by Adjudication Family
A. Majority vote (2‑of‑3)
Common for presence/absence decisions in detection and triage: K231025, K241923, K242821, K251151, K220709, K221552, K222076, K230020, K231130, K232410, K232431, K232751, K241390, K241440, K242292, K243685, K243851, DEN170073, DEN200069, DEN200080.
B. Two readers + third adjudicator (2+1)
A workhorse pattern for CADt and nuanced calls: K190424, K191556, K213721, K214043, K231384, K231767, K241480, K243611, K243363, K251766.
C. Panel consensus
Especially useful for difficult tasks with high anticipated inter-observer variability: Breast density (K202013, K241561), dental (K243234), vessel extraction (K223490), organs RT pipelines (K221305, K242745), MR Planner (K211841), and others: K223491, K233968, K242522, K242600.
D. Segmentation consensus: STAPLE / per‑pixel
- STAPLE: white‑matter changes, brain hyperintensities, multi‑expert masks K220034, K223268, K252362, K250686.
- Per‑pixel majority: spinal/lumbar structures, per‑pixel rules and medians for measures K241108, K242607, K220497.
- QC layers: board‑certified reviewer corrections or dual‑reader reconciliation K242745, K243647.
E. Measurements: mean/median/“two most concordant”
Aorta diameters, EF, CAC quant, other continuous outputs: mean of 3 K230497, K232083, two most concordant K222361, median/thresholds K231324, EF vs consensus seg K232331, K241038.
F. Time‑series adjudication (EEG/sleep)
Explicit overlap/epoch‑majority rules: seizures and spikes K211452, K240993, sleep staging K233438, additional EEG adjudication K241390, with classic 2/3 policies K120260, K141883.
G. External objective diagnosis
Pathology/diagnostic reports and confirmatory tests supersede panel opinion: breast lesion localization with path/diag images K221449, PSG as gold standard K213360, echo pairing K213794, duplicate chart review in AF K233549, adjudication committee for sepsis DEN230036, sleep scoring by trained physiologists K233618.
Examples
CADe: MSK X‑ray Fracture Detection (Presence/Absence + Localization)
Objective
Establish case‑level truth for fracture presence/absence and fracture localization on extremity radiographs.
Readers & Blinding
- 5 board‑certified MSK radiologists (independent, AI‑blinded).
- Prior to consensus, each submits a binary label and one or more localization marks per case.
Adjudication
- Primary: Synchronous panel consensus session to finalize a single case‑level label and a single set of localization marks.
- If consensus stalls: revert to a simple majority of the independent reads on presence/absence; localization marks resolved by enclosing the overlapping boxes (or union region) from agreeing readers.
- If still discordant: senior MSK adjudicator renders final decision with written rationale (2+1 fallback).
QC & Calibration
- 50‑case calibration round with feedback on label definitions and localization tolerance.
Precedent:
Consensus/iterative reconciliation: K202013, K241561, K243647. Majority/2+1 patterns for CADe/CADt: K231025, K241923, K242821, K213721, K214043, K243611, K243363. Localization via box reconciliation: K213566. MSK fracture panels: K220164, K240845, K193417.
CADt: PE Triage on CTPA (Case‑Level Triage)
Objective
Define ground truth for pulmonary embolism presence (case‑level triage).
Readers & Blinding
- Two ABR‑certified thoracic radiologists (independent, AI‑blinded).
- Adjudicator: third senior thoracic radiologist (blinded).
Adjudication
- Primary: 2+1—if the two initial reads disagree, the adjudicator reviews and finalizes.
- Secondary: If adjudicator flags “indeterminate,” escalate to anchored evidence (e.g., confirmatory ultrasound/CTA addendum if available), then finalize.
Precedent:
2+1 triage schemes: K213721, K214043, K243611, K243363, K231384. Three‑expert/majority triage precedents: K241727, K232751, K230020, K221330, K251151.
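The 2+1 flow with an "indeterminate" escape hatch could be expressed roughly as follows. This is a sketch under assumptions: the two primary readers give binary labels, the adjudicator is modeled as a callable so it is only invoked on discordance, and the label strings and the name `two_plus_one` are hypothetical.

```python
from typing import Callable, Optional, Tuple

POSITIVE, NEGATIVE, INDETERMINATE = "positive", "negative", "indeterminate"


def two_plus_one(read_a: str, read_b: str,
                 adjudicate: Callable[[], str],
                 objective_evidence: Optional[str] = None) -> Tuple[str, str]:
    """Return (final_label, source) for one case under a 2+1 scheme."""
    if read_a == read_b:
        return read_a, "primary readers agree"
    verdict = adjudicate()  # the adjudicator is consulted only on discordance
    if verdict != INDETERMINATE:
        return verdict, "adjudicator"
    if objective_evidence is not None:
        # Assumed escalation: anchored evidence overrides an indeterminate call.
        return objective_evidence, "objective evidence"
    return INDETERMINATE, "unresolved"


# Usage: discordant reads, adjudicator undecided, confirmatory evidence positive.
result = two_plus_one(POSITIVE, NEGATIVE, lambda: INDETERMINATE, POSITIVE)
```

Recording the `source` alongside the label gives the audit trail a per-case rationale without any extra documentation step.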
CADx: Breast Cancer Diagnosis (Objective‑First Truthing)
Objective
Determine lesion‑level truth using objective diagnosis; avoid expert consensus unless the objective diagnosis is ambiguous.
Primary Truth (Objective‑First)
- Pathology report, diagnostic/post‑biopsy imaging, and radiology reports serve as primary reference.
- A single MQSA‑qualified radiologist verifies objective diagnosis extraction/mapping to lesions; a second verifier only if ambiguous.
Adjudication
- No panel consensus if the objective diagnosis is definitive.
Precedent:
Objective‑first (pathology/diagnostic/echo/PSG/duplicate chart review; adjudication committees for CDS): K221449, K213360, K213794, K233549, K233618, DEN230036.
Segmentation: Brain WMH on MRI (Mask Consensus via STAPLE)
Objective
Create robust consensus masks for white‑matter hyperintensities (WMH).
Readers & Blinding
- 3 experienced neuroradiologists (independent, AI‑blinded) produce voxel‑level masks.
Adjudication
- Primary: STAPLE to combine expert masks into a probabilistic and binary consensus mask.
- QC thresholds: If Dice < 0.80 or 95% HD > 10mm vs STAPLE for any individual mask, a senior neuroradiologist reviews that case and issues corrections (threshold‑triggered QC).
- Archive all original masks + STAPLE + final.
Precedent:
STAPLE for segmentation consensus: K220034, K223268, K252362, K250686. Per‑pixel/consensus variants and QC: K241108, K242607, K242745, K243647.
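The threshold-triggered QC step can be illustrated with a small Dice check. This sketch represents each mask as a set of voxel coordinates and covers only the Dice gate (0.80, as in the example above); the 95% Hausdorff gate is omitted for brevity, and the names `dice` and `readers_needing_qc` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Voxel = Tuple[int, int, int]


def dice(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Dice similarity coefficient between two voxel sets."""
    if not a and not b:
        return 1.0  # two empty masks are treated as perfect agreement
    return 2.0 * len(a & b) / (len(a) + len(b))


def readers_needing_qc(reader_masks: Dict[str, Set[Voxel]],
                       consensus: Set[Voxel],
                       min_dice: float = 0.80) -> List[str]:
    """List readers whose mask falls below the Dice gate versus the consensus."""
    return sorted(name for name, mask in reader_masks.items()
                  if dice(mask, consensus) < min_dice)


# Usage: a tiny synthetic case with one outlying reader.
consensus = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
masks = {
    "reader_1": {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)},
    "reader_2": {(0, 0, 0), (0, 0, 1), (0, 1, 0)},
    "reader_3": {(0, 0, 0), (0, 0, 1), (5, 5, 5), (5, 5, 6)},
}
flagged_readers = readers_needing_qc(masks, consensus)
```

A case is sent to the senior reviewer only when the returned list is non-empty, so clean cases never consume reviewer time.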
CADm: Aortic Diameter on CT (Continuous Measurement Truth)
Objective
Define reference for maximum aortic diameter (mm).
Readers & Blinding
- 3 cardiovascular radiologists (independent, AI‑blinded) measure diameters.
Adjudication
- Primary: truth = mean of 3 measurements.
- Dispersion guardrails: if range > 5mm or CV > 10%, a fourth senior reader measures; truth becomes the mean of the two most concordant measurements.
- Median may be used if distribution is skewed.
Precedent:
Means/medians/two‑most‑concordant and thresholds: K232083, K230534, K243859, K231324, K222361. Related continuous metrics precedents: K241038, K242607.
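The escalation path in this example (mean of three, then a fourth senior read and the two most concordant values) could be sketched as below. Several details are assumptions a protocol would need to pin down: CV is computed with the sample standard deviation, the senior read joins the pool of candidates, and ties between equally close pairs resolve to the first pair found. The name `aortic_diameter_truth` is hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev
from typing import Optional, Sequence


def aortic_diameter_truth(reads_mm: Sequence[float],
                          senior_read_mm: Optional[float] = None,
                          max_range_mm: float = 5.0,
                          max_cv: float = 0.10) -> Optional[float]:
    """Reference diameter in mm, or None while a required senior read is pending."""
    spread = max(reads_mm) - min(reads_mm)
    cv = stdev(reads_mm) / mean(reads_mm)  # assumed: sample SD over mean
    if spread <= max_range_mm and cv <= max_cv:
        return mean(reads_mm)
    if senior_read_mm is None:
        return None  # dispersion gate tripped; wait for the fourth read
    pool = list(reads_mm) + [senior_read_mm]
    # Two most concordant: the pair with the smallest absolute difference.
    a, b = min(combinations(pool, 2), key=lambda pair: abs(pair[0] - pair[1]))
    return (a + b) / 2


# Usage: a concordant case and a discordant case resolved by a senior read.
concordant = aortic_diameter_truth([50.0, 51.0, 52.0])
pending = aortic_diameter_truth([48.0, 50.0, 58.0])
resolved = aortic_diameter_truth([48.0, 50.0, 58.0], senior_read_mm=49.5)
```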
Measurement‑from‑Segmentation: LV Ejection Fraction (Echo)
Objective
Establish EF (%) derived from consensus LV segmentation.
Two‑Stage Adjudication
- Segmentation: 3 expert sonographers/radiologists generate masks; combine via STAPLE (primary) or per‑pixel majority; QC thresholds (Dice/HD) trigger senior review.
- EF measurement: Three experts compute EF from the accepted consensus mask; truth = mean of the three; if dispersion > 10% EF, apply two‑most‑concordant rule and add a senior re‑measurement.
Evidence pattern references
EF vs consensus segmentation and multi‑reader EF: K232331, K241038, K241430, K232501. STAPLE/per‑pixel + QC: K220034, K241108, K242607, K242745, K252362.
Classification‑from‑Segmentation: Breast Density Categories
Objective
Produce BI‑RADS density category truth when classification is derived from segmentation outputs.
Readers & Blinding
- 5 MQSA‑qualified radiologists (independent, AI‑blinded).
Adjudication
- Primary: panel consensus for category; if exact tie persists, set truth to the median category across the panel.
- If segmentation is used to aid density: finalize segmentation via per‑pixel majority (or STAPLE) before category voting.
Precedent:
5‑reader density consensus/median: K202013, K241561, K222275. Segmentation consensus aids: K241108, K220034.
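The median tie-rule for an ordinal category is simple to compute. The sketch below assumes an odd-sized panel and the four BI‑RADS density letters in order; the name `density_truth` is hypothetical. With an odd panel, any strict majority label is also the median, so the median alone covers both situations.

```python
from typing import Sequence

CATEGORIES = ["A", "B", "C", "D"]  # BI-RADS density, least to most dense


def density_truth(reads: Sequence[str]) -> str:
    """Median ordinal category across an odd-sized reader panel."""
    if len(reads) % 2 == 0:
        raise ValueError("use an odd panel so the median category is unambiguous")
    ranks = sorted(CATEGORIES.index(r) for r in reads)
    return CATEGORIES[ranks[len(ranks) // 2]]


# Usage: five readers with no strict majority.
split_panel = density_truth(["B", "B", "C", "C", "D"])
```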
Instance Segmentation: Rib Fracture Instances on CT
Objective
Establish instance‑level truth masks and counts for rib fractures.
Readers & Blinding
- 3 thoracic radiologists (independent, AI‑blinded) annotate instance masks.
Adjudication
- Match instances across readers algorithmically (IoU ≥0.5; operational choice).
- For matched instances, combine masks via STAPLE (primary) or per‑pixel majority to create a final instance mask.
- For unmatched instances, require 2‑of‑3 readers to concur on presence; majority mask becomes final; if still discordant, a senior adjudicator decides.
- For box‑based localizations, replace multiple boxes by the smallest enclosing box around agreeing boxes.
Precedent:
STAPLE/per‑pixel: K220034, K252362, K241108, K242607. Box reconciliation: K213566. Majority presence standards: K231025, K241923, K242821.
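For the box-based variant, instance matching and enclosing-box reconciliation might be sketched as follows. This uses a simple greedy grouping against each group's first box rather than optimal (Hungarian) assignment, which is an assumption made for brevity; the IoU threshold of 0.5 and the 2-reader minimum come from the example above, and all function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def enclosing_box(boxes: Sequence[Box]) -> Box:
    """Smallest box that contains every box in the group."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def consensus_boxes(reader_boxes: Sequence[Sequence[Box]],
                    min_iou: float = 0.5, min_readers: int = 2) -> List[Box]:
    """Group boxes across readers and keep groups supported by enough readers."""
    groups: List[Tuple[Box, Dict[int, Box]]] = []  # (seed box, {reader: box})
    for reader, boxes in enumerate(reader_boxes):
        for box in boxes:
            for seed, members in groups:
                # Greedy: join the first group whose seed overlaps enough,
                # allowing at most one box per reader per group.
                if reader not in members and iou(seed, box) >= min_iou:
                    members[reader] = box
                    break
            else:
                groups.append((box, {reader: box}))
    return [enclosing_box(list(members.values()))
            for _, members in groups if len(members) >= min_readers]


# Usage: two readers agree on one fracture; a third marks an isolated finding.
truth_boxes = consensus_boxes([
    [(0, 0, 10, 10)],
    [(1, 1, 11, 11)],
    [(50, 50, 60, 60)],
])
```

The isolated third-reader box is dropped here; under the example protocol it would instead be queued for the senior adjudicator.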
EEG: Seizure/Spike Detection (Temporal Overlap Rules)
Objective
Define truth for EEG event detection (seizures/spikes) using epoch overlap criteria.
Readers & Blinding
- 3 board‑certified EEG experts (independent, AI‑blinded) provide start/stop times for events.
Adjudication
- Event is true if ≥2 of 3 reviewers’ intervals overlap by ≥1s (seizures) or meet the spike overlap rule.
- Localization truth uses the overlapping time range of the agreeing reviewers.
- Consensus required for rhythmic/periodic patterns across the two reviewers assigned to those sub‑tasks.
Precedent:
Temporal overlap adjudication: K211452, K240993; majority EEG rules: K120260, K141883; neurologist majority approaches: K241390.
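The temporal-overlap rule can be computed directly from the readers' start/stop times. In this sketch, an event is kept when any two reviewers' intervals overlap by at least the minimum duration; merging adjacent pairwise overlaps into one event when all three reviewers agree is an assumption, and the name `consensus_events` is hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (start_s, end_s)


def consensus_events(reviewers: Sequence[Sequence[Interval]],
                     min_overlap_s: float = 1.0) -> List[Interval]:
    """Return time ranges where at least two reviewers overlap long enough."""
    overlaps: List[Interval] = []
    for marks_a, marks_b in combinations(reviewers, 2):
        for a_start, a_end in marks_a:
            for b_start, b_end in marks_b:
                start, end = max(a_start, b_start), min(a_end, b_end)
                if end - start >= min_overlap_s:
                    overlaps.append((start, end))
    # Merge touching or overlapping pairwise ranges into single events.
    merged: List[Interval] = []
    for start, end in sorted(overlaps):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Usage: three reviewers; only the first event has two-reviewer support.
events = consensus_events([
    [(10.0, 20.0), (100.0, 105.0)],
    [(12.0, 22.0)],
    [(19.5, 25.0), (200.0, 210.0)],
])
```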
Dental: Caries Detection + Pixel‑Level Segmentation
Objective
Establish tooth/surface‑level presence of caries and (if applicable) pixel‑level segmentation.
Readers & Blinding
- 3 licensed dentists (independent, AI‑blinded).
- Adjudicator: oral/dental radiologist (blinded).
Adjudication
- Classification (tooth/surface): consensus of 3; if non‑consensus, apply majority of 3; remaining ties adjudicated by the oral radiologist.
- Segmentation (if present): per‑pixel majority (3 readers); if structure‑level measures are needed, take the median or mean across readers’ measurements; adjudicator may correct masks for protocol deviations.
- (Optional) MRMC if the device claims reader‑aid.
Precedent:
Dentist consensus + oral radiologist adjudication: K220928, K212519, K222746. Pixel‑majority/consensus references (dentistry and general): K233590, K242607, K241108. MRMC for reader‑aid studies: K243234, K223347, K223623.
EEG seizures/spikes (time‑series events)
Typical adjudication. Independent expert marking with temporal‑overlap rules—an event is “true” if at least 2 of 3 reviewers’ intervals overlap by a pre‑specified minimum (e.g., ≥1s); epoch boundaries/localization come from the overlapping region. Majority voting is also used for seizure presence on longer windows.
Precedent: Overlap rules and majority criteria were explicitly defined for seizures/spikes and other EEG patterns K211452, K240993, with majority rules also reported in other EEG devices K120260, K141883; seizure/spike adjudication via multi‑expert panels is likewise described in neurologic indications K241390, K231779.
Coronary artery calcium (CAC) on CT (including non‑gated chest CT)
Typical adjudication. 2‑of‑3 majority among experienced radiologists for CAC category; when disagreement persists or for quantitative scoring/thresholding, a senior adjudicator finalizes the grade.
Precedent: Majority‑rule CAC category from three radiologists is used (and replicated across versions) K210085, K241440; CAC level finalization by a senior radiologist is described when reviewers disagree K231631; consensus/majority truth for CAC segmentation/labels is also used in related CT workflows K242188.
Breast density (BI‑RADS A–D)
Typical adjudication. Panel consensus with larger reader groups (often 5 experts) to stabilize ordinal categories; some programs compute the median category as the final label.
Examples. Five‑reader consensus and median‑based truth appear across multiple submissions K202013, K241561, K243685, with median/consensus rules also documented by another sponsor K222275.
Breast lesion diagnosis / cancer confirmation (CADx)
Typical adjudication. External objective diagnosis first (pathology, diagnostic/post‑biopsy imaging, radiology reports); experts verify mapping of objective diagnosis to the case/lesion but do not overrule definitive objective diagnosis.
Examples. Objective‑first truthing and verification by MQSA‑qualified radiologists (including pathology/diagnostic image review) are explicitly described K221449. Parallel objective‑first models exist for sleep (PSG) and auscultation/echo (see items 12 and 13).
Intracranial hemorrhage (ICH)/SDH on head CT (detection/triage)
Typical adjudication. Three neuroradiologists with majority or explicit consensus; many CADt workflows use 2+1 (third reader adjudicates disagreements).
Examples. Majority read of three neuroradiologists for ICH K203260, K232431; SDH truth established by three expert neuroradiologists K232436; explicit 2+1 adjudication schemes for ICH/SDH triage in related products K243363; neuroradiologist consensus for broader stroke triage is also documented K251983.
Pulmonary embolism (PE) triage on CTPA
Typical adjudication. 2+1 (two independent thoracic/neuroradiology readers; third adjudicator if disagreement) or 3‑reader majority, all blinded.
Examples. Multi‑site triage studies compared device performance to ground truth by three experts (often “2:3 concurrence”) K251151, K220499, with other programs using three senior radiologists/majority voting K232751, K230020, K241727.
Midline shift (MLS) quantification on head CT
Typical adjudication. Quantitative truth from multiple independent measurements (often three), combined by mean/average; segmentation components may use consensus/STAPLE.
Examples. Mean of three neuroradiologist measurements defines the reference standard K232083; average shift distance of all annotators is also used K223268; MLS truth by three experts appears in allied stroke tools K243378.
Aortic aneurysm diameter (abdominal/CT): quantitative CADm
Typical adjudication. Three independent measurements → mean (or median) as truth; escalation to a senior reviewer if dispersion exceeds preset thresholds (e.g., >5mm or >10%).
Examples. Abdominal aorta diameter truth from three experts across multi‑center validation K230534, with a similar three‑expert measurement paradigm in a larger cohort K241112.
Dental caries/periapical radiolucency (tooth or surface level; optional pixel masks)
Typical adjudication. Three‑dentist consensus/majority for classification; oral radiologist adjudicates non‑consensus; when masks are used, per‑pixel majority (or consensus) defines the ground truth segmentation.
Examples. Consensus labels with oral‑radiologist adjudication for non‑consensus cases K212519, K222746; three‑dentist consensus with study‑specific majority/consensus rules K230144; adjudicated labeling by a dental specialist in another program K232384; pixel‑level majority/consensus ground truth appears in related dental imaging K242600, K242522, and pixel‑majority principles are documented in broader imaging tasks K233590.
Lung nodule detection/segmentation/quantification (CT/CXR)
Typical adjudication. Three‑expert majority or consensus for presence; for segmentation/measurement, combine masks (STAPLE/per‑pixel) and then average measurements or apply rules like “two most concordant.”
Examples. Dataset truthed by three dedicated chest radiologists K221592; lung nodules defined using multi‑expert truthers (with CT/auxiliary reports for context) K231805; nodule delineations by three expert radiologists for quantification tasks K240740.
Rib fracture detection/localization (X‑ray/CT)
Typical adjudication. Three‑expert majority or 2+1 adjudication for presence; bounding‑box disagreements reconciled by enclosing or consensus regions; pediatric/adult sub‑panels where relevant.
Examples. 2+1 adjudication to resolve inconsistencies K202992; majority consensus with third‑reader review for initial disagreements across adult/pediatric panels K242171; multi‑expert MSK fracture truthing in related devices K220164, K240845.
White‑matter hyperintensities (WMH) segmentation (MRI)
Typical adjudication. Multi‑expert masks combined via STAPLE or per‑pixel majority, with senior clinical expert QC; longitudinal change protocols often separate annotators/reviewers/experts by design.
Examples. STAPLE‑based consensus for hyperintensities K252362 and related neuro segmentation K220034; per‑pixel consensus/QC pipelines K241108; standardized multi‑stage annotation with expert corrections K213706, with longitudinal designs describing disjoint annotator/reviewer/expert groups K232305.
Sleep staging / sleep physiology (PSG‑anchored and algorithmic staging)
Typical adjudication. Objective diagnosis first (PSG scored to AASM standards by trained scorers/physiologists); for algorithmic staging evaluation, 2‑of‑3 technologist majority per epoch is common.
Examples. PSG datasets scored by trained physiologists; video annotations by blinded reviewers for auxiliary labels K233618; sleep‑staging software compared to 2/3 expert consensus per epoch K233438; PSG gold standard and independent scorers in OSA screening K213360.
References
K Number | Device Name | Applicant | Adjudication Quote |
---|---|---|---|
DEN170073 | ContaCT | Viz.ai, Inc. | "In cases where the neuro-radiologists did not agree on whether a study required further review, an additional neuro-radiologist provided an additional opinion and established a ground truth by majority consensus." |
DEN180005 | OsteoDetect | Imagen Technologies, Inc. | "Ground truth for each case was determined by three US board certified orthopedic hand surgeons who independently interpreted images using the standard clinical definition of a distal radius fracture. Ground truth for the presence/absence of distal radius fracture is defined as the majority opinion of at least 2 of the 3 clinicians participating in the truthing process." |
DEN190040 | Caption Guidance | Bay Labs, Inc. | "Following the study and control exams, a panel of five (5) expert cardiologist readers independently provided assessments of whether the patient study, in its totality, provided sufficient information to assess ten clinical parameters." |
DEN200069 | Cognoa ASD Diagnosis Aid | Cognoa, Inc. | "Majority rule was used to resolve discrepancies between the two central reviewers and the site diagnosing specialist who all evaluated the same subjects." |
DEN200080 | Paige Prostate | Paige.AI | "The union of annotations between at least 2 of the 3 annotating pathologists was used as the localization ground truth." |
DEN220066 | BrainSee | Darmiyan, Inc. | "All ground truth labels were reviewed and confirmed by consensus of three physicians with clinical experience evaluating aMCI patients." |
DEN230003 | Viz HCM | Viz.ai, Inc. | "Study protocols must include a description of the adjudication process(es) for determining ground truth of training and test datasets." |
DEN230027 | NaviCam ProScan | Ankon Technologies Co., Ltd | "When the cutoff value for consistency is less than 3, two arbitration experts independently review and modify the classification results, correcting any missed diagnoses, misdiagnoses, or misjudgments. If difficult questions arise, the arbitration experts engage in collective discussion and confirmation." |
DEN230036 | Sepsis ImmunoScore | Prenosis, Inc. | "The adjudication process for resolving reader disagreements involved "a retrospective chart review done by a team of three physicians that reviewed the medical chart to determine the presence of a sepsis event." "The entirety of the patient's record was sent to an adjudication committee of three physicians." "If it was unclear whether the infection was the cause of organ dysfunction, the adjudicator was instructed to answer 'Indefinite,' and the patient's Sepsis status was labeled as 'Indeterminate.' In addition to providing the 'Septic.' 'Non-Septic,' or 'Indeterminate' label for each subject, each adjudicator was also asked to also provide a 'forced decision' in 'Indeterminate' cases. This led to two groups for analysis, the adjudicated forced majority group and the adjudicated forced unanimous - the majority group was all patients that received adjudication and their Sepsis 3 determination was defined by the majority rule of diagnosis by physicians and the unanimous was where all physicians agreed on the diagnosis." |
K120260 | ICTA | EXCEL-TECH LTD. (XLTEK) | "Due to the anticipated inter-rater variability among EEG experts, a majority rule (at least 2 out of 3) was applied to make the final determination of 'true' electrographic seizure." |
K141883 | CLINISCANSM EEG | PICOFEMTO LLC | "Due to the expected inter-rater variability, a two-thirds majority rule was used to determine the ground truth for seizure presence." |
K142273 | EmboGuide | PHILIPS MEDICAL SYSTEMS NEDERLAND B.V. | "First, feeding vessels of the lesions were defined by consensus of two experienced interventional radiologists (located outside the United States) who also performed the procedures by using all available information (2D angiography, MR and/or CT, Cone Beam CT (CBCT) and EmboGuide). This was used as the "ground truth"." |
K182177 | Accipiolx | MaxQ-AI Ltd. | "Device sensitivity and specificity was compared to ground truth established by concurrence of at least two expert neuroradiologist readers." |
K183019 | SIS Software version 3.3.0 | Surgical Information Sciences, Inc. | adjudication was done. |
K190072 | BriefCase | Aidoc Medical, Ltd. | "Another radiologist was used to break ties between the report and the reviewer." |
K190424 | HealthICH | Zebra Medical Vision Ltd. | "In the event that the two ground truthers did not agree, a third, more senior US Board Certified neuro-radiologist reviewed the axial CT series and determined ground truth (presence or absence of ICH)." |
K191556 | Red Dot | Behold.AI Technologies Limited | "The ground truth was determined by two readers with a third reader in the event of disagreement/discrepancy." |
K191647 | QLAB Advanced Quantification Software | Philips Healthcare | "The results of the validation show that when used as intended, the healthcare professional was able to successfully determine which contours required revision and was capable of revising in the "tracking revision" screen prior to accepting the measurements for a report to create accurate measurements of the RV volume." |
K192109 | KOALA | IB Lab GmbH | "This dataset contained a total of 6597 radiographs, representing 1149 individuals for which ground truth grading for Kellgren Lawrence grades, as well as osteophyte, sclerosis and joint space narrowing grades according to the OARSI (Osteoarthritis Research Society International) guidelines, was established by three physicians following adjudication procedures for discrepancies." |
K192320 | HealthCXR | Zebra Medical Vision, Ltd. | "The validation data set was truthed (ground truth) by three US Board-Certified Radiologists (truthers)." |
K192969 | Ezra Plexo Software | Ezra AI Inc. | "consensus ground truth created by five U.S. board certified expert radiologists." |
K193087 | Rapid ICH | iSchemaView Incorporated | "the RAPID ICH performance has been validated through the use of phantoms and retrospective case data and through the use of reader truthing of the data." |
K193267 | AI-Rad Companion (Musculoskeletal) | Siemens Medical Solutions USA, Inc. | "Ground truth annotations were established using manual vertebra height and density measurements performed by four radiologists (two readers per case plus a third reader for adjudications)." Note: "four" radiologists is likely a typo for "three", given the parenthetical. |
K193300 | AIMI-Triage CXR PTX | RADLogics, Inc. | "The AIMI-Triage CXR PTX output was compared to the ground truth established by 3 independent US-board certified radiologists (Truther involved in the ground truthing process was blinded to any other Truther's results, to any existing report, and to the results obtained by the AIMI-Triage CXR PTX software)." |
K193417 | FractureDetect (FX) | Imagen Technologies, Inc. | "Each case had been previously evaluated by a panel of three U.S. board-certified orthopedic surgeons or U.S. board-certified radiologists who assigned a ground truth binary label indicating the presence or absence of a fracture." |
K193658 | Viz ICH | Viz.ai, Inc. | "Sensitivity and specificity were calculated in the image database, comparing the Viz ICH's output to ground truth as established by trained neuro-radiologists." |
K200621 | Caption Interpretation Automated Ejection Fraction Software | Caption Health | "Results of the Clip Annotator were compared to evaluation by a panel of expert readers. That study met the pre-defined acceptance criteria and found that the observed PPV point estimates for the Clip Annotator were greater than 97% for identification of the imaging mode and the view." |
K200667 | EyeArt | Eyenuk, Inc | "Each subject’s images were graded independently by 2 experienced and certified graders and in case of significant differences (determined using prespecified significance levels) in the 2 independent gradings, a more experienced adjudication grader graded the same images." |
K200717 | CLEWICU System (ClewICUServer and ClewICUnitor) | CLEW Medical Ltd. | "As an initial matter, a tagging system was developed and validated (against human physician readers as ground truth)." |
K200760 | Rapid ASPECTS | iSchemaView Inc. | "Data truthing was performed by three experts." |
K200855 | CINA | AVICENNA.AI | "Device sensitivities and specificities were compared to ground truth established by concurrence of three US-board-certified neuroradiologist readers." |
K200873 | HALO | NICo-Lab B.V. | "Ground truth was established by an expert panel consisting of 3 neuro radiologists." |
K201034 | Syngo.CT CaScoring | Siemens Medical Solutions USA, Inc. | "No statistically relevant difference between the performance of the three individual readers compared to their consensus, and the algorithm compared to the consensus was found." |
K201411 | Visage Breast Density | Visage Imaging GmbH | "Three board certified radiologists with MQSA qualification per site performed a breast density classification and the consensus of the three reviewers was determined for each study." |
K202013 | WRDensity by Whiterabbit.ai | Whiterabbit.ai Inc. | "consensus of five expert radiologists who independently assessed breast density on a test dataset." |
K202928 | DV. Target | Deepvoxel INC | "The ground truth OARs contours on the public validation data were generated from the consensus of three board-certified physicians." |
K202992 | BriefCase, RIB Fractures Triage (RibFx) | Aidoc Medical, Ltd. | "Ground truthing was performed by two radiologists with an additional third radiologist to resolve inconsistencies." |
K203235 | VBrain | Vysioneer Inc. | "The ground truth of each tumor contours was generated from the consensus of three board-certified radiation oncologists." |
K203256 | Imbio RV/LV Software | Imbio, LLC | "The second test (Reader Study- II) will demonstrated the accuracy of RVLV diameter ratios compared to radiologist's measurement of the RVLV diameter ratio." |
K203258 | syngo.CT Lung CAD | Siemens Healthcare GmbH | "The reference standard was based on reader majority (three out of five) followed by expert adjudication, as needed." |
K203260 | syngo.CT Brain Hemorrhage | Siemens Medical Solutions USA, Inc. | "The data cohort consisted of 600 anonymized head CT cases from 5 sites in US and Europe with approximately equal distribution of positive (case with ICH) and negative (case without ICH) cases. Sensitivity and specificity of syngo.CT Brain Hemorrhage in processing of non-contrast head CT have been analyzed by comparison to a ground truth established by majority read of 3 US board certified neuroradiologists with more than 10 years of experience." |
K203517 | Saige-Q | DeepHealth, Inc. | "Each case was reviewed by two independent expert radiologists (and an adjudicator if discordance was observed) to establish the reference standard for each case." |
K203696 | RBknee | Radiobotics ApS | "Ground truth grading for Kellgren Lawrence grades, as well as osteophyte, sclerosis and joint space narrowing grades according to the OARSI (Osteoarthritis Research Society International) guidelines, and measurements of the minimum joint space width was established by two physicians following adjudication procedures with a third reviewer for discrepancies." |
K210085 | HealthCCSng | Zebra Medical Vision Ltd. | "Ground truth category was determined by the majority agreement of two of three radiologists." |
K210187 | Overjet Dental Assist | Overjet, Inc. | "These measurements were then adjudicated by two US Dental Radiologists." |
K210237 | CINA CHEST | Avicenna.AI | "Device sensitivities and specificities were compared to ground truth established by concurrence of several US-board-certified radiologist readers." |
K211452 | Encevis | Austrian Institute of Technology GmbH | "An event was considered as "true seizure" only if the time interval of two out of three reviewers overlapped by at least 1 second. A seizure epoch was then defined as the overlapping time range of two reviewers." "An event was considered as "true spike" only if the time interval of two out of three reviewers overlapped." "the 3D-coordinates of the electrode which is next to the spike maximum averaged over reviewers was used." "Annotations had to be consistent between both reviewers to be used in the sensitivity and specificity measurement." "The detection performance was analyzed for consensus annotations of the two reviewers. The consensus annotations only include annotation segments where both reviewers showed the same decision about Burst Suppression pattern." |
K211803 | HealthPPT | Zebra Medical Vision Ltd. | "The validation data set was truthed (ground truth) by three US Board-Certified Radiologists (truthers)." |
K211841 | MRI Planner | Spectronic Medical AB | "Manual delineations were generated by two expert truthers using the consensus approach, based on US clinical guidelines." |
K212519 | Overjet Caries Assist | Overjet, Inc. | "Ground truth was established by the consensus labels of three US licensed dentists, and non-consensus labels were adjudicated by a Dental Radiologist." |
K212758 | Autoplaque | Cedars-Sinai Medical Center: AIM | "Ground truthing was performed by two cardiologists with one additional highly experienced radiologist to resolve discrepancies." |
K213155 | RT-Mind-AI | MedMind Technology Co., Ltd. | "Ground truthing of each image was generated from the consensus of at least three licensed physicians." |
K213272 | Formus Hip | Formus Labs, Ltd | "A third senior radiologist reviewed each pair of segmentation and selected the most accurate segmentation which was the final manually segmented mesh." |
K213360 | SleepCheckRx | ResApp Health | "The clinical study used a binomial endpoint comparing the presence and severity of OSA, by using sleep sounds captured and analyzed by the SleepCheckRx algorithm and comparing them to a simultaneous PSG diagnosis (gold standard). PSG diagnosis was established by independent scorers, in accordance with the Type II (in-home) American Academy of Sleep Medicine (AASM) 2017 Guidelines. Each sleep study was scored by a qualified independent sleep scorer." |
K213409 | ZEUS System (Zio Watch) | iRhythm Technologies, Inc. | "The ECG-based preliminary findings in the Zio Watch Transmission Reports are quality reviewed by Certified Cardiographic Technicians (CCTs) prior to publishing." |
K213519 | Rune Labs Tremor Transducer System | Rune Labs, Inc. | "The choreiform movement score (CMS) was calculated from sensor data in the pilot study and compared to dyskinesia ratings from three MDS-certified experts during multiple MDS-UPDRS assessments." |
K213566 | ClearRead Xray Pneumothorax | Riverain Technologies, Inc. | "The final image label and associated annotations were derived from a majority voting rule, where the associated annotation bounding boxes were replaced with a single box that enclosed all bounding boxes." |
K213686 | SKOUT Software | Iterative Scopes Inc. | "Ground truth was defined as data reviewed and either validated or created by expert gastroenterologists through a process referred to as gastroenterologist review. During gastroenterologist review, experts reviewed and either validated, rejected new labels post primary annotation." |
K213706 | AI-Rad Companion Brain MR | Siemens Healthcare GmbH | "For each test dataset, the three initial annotations are annotated by three different in-house annotators. Then, each initial annotation is reviewed by the in-house reviewer. Afterwards, each initial annotation is reviewed by the referred clinical expert. The clinical expert reviews and corrects the initial annotation of the WMH according to the annotation protocol." |
K213721 | BriefCase | Aidoc Medical, Ltd. | "Ground truthing was performed by two US Board-certified radiologists and a third one to resolve inconsistencies." |
K213794 | Eko Murmur Analysis Software (EMAS) | Eko Devices, Inc. | "All recordings were annotated by multiple cardiologists in respect to their quality and the presence of any murmur." and "Ground truth for murmur classification was obtained via pairing cardiologist annotations with gold standard echocardiogram." |
K213941 | Annalise Enterprise CXR Triage Pneumothorax | Annalise-AI | "To determine the ground truth, each deidentified CXR case was annotated in a blinded fashion by at least two American Board of Radiology (ABR)-certified and protocol-trained radiologists (ground truthers), with consensus determined by two ground truthers and a third ground truther in the event of disagreement." |
K213944 | HealthOST | NanoxAI Ltd. | "Ground truth measurements were determined by the three US board-certified radiologists." |
K213986 | CerebralGo Plus | Yukun (Beijing) Technology Co., Ltd | "When the two radiologists conflicted, the third radiologist would arbitrate and generate the reference standard." |
K214043 | BriefCase | Aidoc Medical, Ltd. | "Ground truthing was performed by two radiologists with an additional third radiologist to resolve inconsistencies." |
K220034 | NEUROShield | In-Med Prognostics L3C | "This ground truth was combined into one tracing per case by the STAPLE (Simultaneous Truth and Performance Level Estimation) algorithm. The STAPLE-derived ground truth was then compared with segmentation provided by each radiologist and statistical tests were performed to ensure the validity of ground truth." |
K220105 | Saige-Dx | DeepHealth, Inc. | "For exams where there were discrepancies between the two truther's assessment of density, lesion type, and/or lesion location, a third truther served as the adjudicator." |
K220164 | Rayvolve | AZmed SAS | "Each case had been previously evaluated by a panel of three US board-certified MSK radiologists to provide ground truth binary labeling indicating the presence or absence of fracture and the localization information for fractures." |
K220349 | TeraRecon Neuro | TeraRecon, Inc | "The evaluator was asked to confirm through qualitative assessment that the generated maps of TeraRecon Neuro are at least 85% substantially equivalent or better than the predicate and reference devices." |
K220408 | AVIEW RT ACS | Coreline Soft Co.,Ltd | "Second, segmentation results generated by 1 expert are sequentially edited by 2 experts. In the editing process, the first expert makes corrections, and the result is received by another expert completes the gold standard by finalizing it. This process was performed by a panel of three radiation oncology physicians' experiences." |
K220437 | Neurophet AQUA | NEUROPHET, Inc. | "Ground-truth data were initially generated using FreeSurfer (General Hospital Corporation, Boston, MA, USA, version 6.0) and verified and corrected by four radiologists." |
K220497 | CoLumbo | Smart Soft Healthcare AD | "The standalone software performance assessment study compared the CoLumbo software outputs without any editing by a radiologist to the ground truth defined by 3 radiologists on segmentations and measurements. ... The per-pixel majority opinion of the three (3) radiologists established the ground truth for each segmented tissue. Similarly, each radiologist used a commercial software tool to produce a standard set of areal, angular and linear measurements. The ground truth measurements were established by taking the median of three radiologists' measurements." |
K220499 | Rapid PE Triage and Notification (PETN) | iSchemaView Inc. | "Final performance validation included 306 CTPA cases with ground truth established by 3 experts using a 2:3 confirmation." |
K220709 | BriefCase | Aidoc Medical, Ltd. | "The study compared the software's performance to the ground truth, as determined by 3 expert US board certified Neurologists reviewers, using majority voting." |
K220815 | BrainInsight | Hyperfine, Inc. | "Ground truth for midline shift was determined based on the average shift distance of all annotators." "Ground truth for segmentation is calculated using Simultaneous Truth and Performance Level Estimation (STAPLE)." |
K220928 | Overjet Calculus Assist | Overjet Inc. | "Ground truth was established by the consensus labels of three US-licensed dentists, and non-consensus labels were adjudicated by an oral radiologist." |
K220940 | EchoPAC Software Only, EchoPAC Plug-in | GE Medical Systems Ultrasound and Primary Care Diagnostics, | "For all datasets, two certified cardiologists performed manual delineation, then reviewed the annotations for each other. A consensus reading was first done whereby the two cardiologists discussed if they agreed on or not. A panel of experienced experts further reviewed annotations that the two cardiologists could not agree on." |
K221241 | DrAid for Radiology v1 | VinBrain Joint Stock Company | "This data set was truthed by a panel of 3 US board certified radiologists." |
K221305 | AI-Rad Companion Organs RT | Siemens Medical Solutions USA, Inc | "adjudication was done." |
K221330 | BriefCase | Aidoc Medical, Ltd. | "the ground truth as determined by 2 out of 3 majority voting senior board-certified radiologists." |
K221449 | Genius AI Detection 2.0 | Hologic, Inc. | "The truth was verified by another MQSA-qualified, board-certified radiologist to ensure accuracy and consistency." |
K221552 | EFAI ChestSuite XR Pneumothorax Assessment System | Ever Fortune AI Co., Ltd. | "The reference standard (ground truth) was generated by the majority agreement between the three board-certified radiologists." |
K221592 | AVIEW Lung Nodule CAD | Coreline Soft Co.,Ltd. | "Three dedicated chest radiologists with at least ten years of experience determined the ground truth using a dataset of 151 Chest CTs with 103 negative controls and 48 cases with one or more lung nodules." |
K221716 | CINA | AVICENNA.AI | "Device sensitivities and specificities were compared to ground truth established by concurrence of three US-board-certified neuroradiologist readers." |
K221868 | QOCA image Smart CXR Image Processing System | Quanta Computer Inc. | "The dataset was truthed by three radiologists." |
K221921 | DTX Studio Clinic 3.0 | Nobel Biocare AB | "The dataset of 452 adult IOR images was 'ground-truthed by a group of 10 dental practitioners followed by an additional expert review.'" |
K222054 | Denti.AI Auto-Chart | Denti.AI Technology Inc. | "The GT was established with the help of two experienced dental hygienists with an experienced dentist reviewing cases of disagreement." |
K222076 | EFAI ChestSuite XR Pleural Effusion Assessment System | Ever Fortune.AI Co., Ltd. | "Three US board-certified radiologists determined the presence of pleural effusion in each case independently. The majority agreement was used as the reference standard (ground truth)." |
K222179 | Annalise Enterprise CXR Triage Trauma | Annalise-AI Pty Ltd | "To determine the ground truth, each deidentified CXR case was annotated in a blinded fashion by ABR-certified and protocol trained radiologists (ground truthers), with consensus determined by two ground truthers and a third ground truther in the event of disagreement for the primary finding." |
K222268 | Annalise Enterprise CXR Triage Trauma | Annalise-AI Pty Ltd | "To determine the ground truth, each deidentified CXR case was annotated in a blinded fashion by at least two ABR-certified and protocol-trained radiologists (ground truthers), with consensus determined by two ground truthers and a third ground truther in the event of disagreement." |
K222275 | Saige-Density | DeepHealth, Inc. | "Ground truth was established for each case as the consensus of five expert radiologists' breast density categories on the same set of cases, and calculated as the median of the reported categories for each case." |
K222361 | AI-Rad Companion (Musculoskeletal) | Siemens Medical Solutions USA, Inc. | "For outliers, a third annotation was blindly provided by one of the radiologist who had not annotated before. The ground truth was generated by the average of the two most concordant measurements. For all other cases, the two annotations were used as ground truth." |
K222692 | BriefCase | Aidoc Medical, Ltd. | "ground truth as determined by three senior board-certified radiologists" |
K222745 | Axial3D Insight | Axial Medical Printing Limited | "all cases were scored within the acceptance criteria of 1 or 2a [1]." |
K222746 | Overjet Caries Assist | Overjet, Inc. | "Standalone performance of the OCA device was compared to a ground truth established by consensus of labels of three US licensed dentists, and non-consensus labels were adjudicated by an oral radiologist." |
K222781 | Augmento | Deeptek Medical Imaging Private Limited | "Replicability was demonstrated by measurements made by two readers in twelve independent X-ray scans...These measurements were compared to the angles measured using Augmento. Two statistical analysis tests: the equivalence test and T-test were used." |
K223240 | Annalise Enterprise CTB Triage Trauma | Annalise-AI Pty Ltd | "To determine the ground truth, each deidentified case was annotated in a blinded fashion by at least two ABR-certified and protocol-trained neuroradiologists (ground truthers), with consensus determined by two ground truthers and a third ground truther in the event of disagreement." |
K223268 | BrainInsight | Hyperfine, Inc. | "Ground truth for midline shift was determined based on the average shift distance of all annotators." "Ground truth for segmentation is calculated using Simultaneous Truth and Performance Level Estimation (STAPLE)." |
K223296 | Videa Perio Assist | VideaHealth, Inc. | No response |
K223347 | UltraSight AI Guidance | UltraSight Inc. | "The clips acquired during those scans were reviewed by a panel of 5 expert cardiologists blinded to whether the clip was acquired by a non-expert user or a sonographer and to each other's evaluations." and "Assessment of intra-cardiologists' variability using Cohen's kappa coefficient (k) was assessed on a randomly selected 10% of the examinations on which a repeated assessment was performed." |
K223396 | Rapid RV/LV | iSchemaView Inc. | "Final performance validation included 124 CTPA cases with ground truth established by 3 experts." |
K223443 | Viz AAA | Viz.ai, Inc. | "Sensitivity and specificity were calculated for the image database, comparing Viz AAA's output to ground truth as established by trained radiologists with fellowship in vascular radiology." |
K223490 | FlightPlan for Embolization | GE Medical Systems SCS | "For vessel extraction, the ground truth was produced by the consensus of 3 board certified radiologists." |
K223491 | Critical Care Suite with Pneumothorax Detection AI Algorithm, Critical Care Suite 2.1, Critical Care Suite | GE Medical Systems, LLC | "The reference standard was established by three blinded radiologists." |
K223502 | MR Diffusion Perfusion Mismatch V1.0 | Olea Medical | "the qualitative assessment allowed an US board-certified neuroradiologist to conclude that all parametric maps were substantially equivalent." "the appraisal performed by an US board-certified neuroradiologist led to the conclusion that Volume 1 was visually equivalent for all 30 cases" "the visual inspection performed by an US board-certified neuroradiologist led to the conclusion that Volume 2 was equivalent for all 30 cases." |
K223623 | SubtleMR (2.3.x) | Subtle Medical Inc. | "Based upon the results of this testing, the SubtleMR performance was determined to be substantially equivalent to the predicate device." |
K223646 | IB Lab LAMA | IB Lab GmbH | "If any pair of assessments differs by more than the threshold defined in the Test-Plan, the respective leg was consensus read by the two truthers in order to establish a reliable ground truth." |
K223757 | Bonelogic | Disior Ltd | "adjudication was done." |
K223774 | Contour ProtégéAI | MIM Software Inc. | "The initial segmentations were then reviewed and corrected by a radiation oncologist against the same standards and guidelines. Qualified staff at MIM Software (M.D. or licensed dosimetrists) then performed a final review and correction." |
K230020 | BriefCase | Aidoc Medical, Ltd. | "The study compared the software's performance to the ground truth, as determined by three senior board-certified radiologists, using majority voting." |
K230039 | uOmnispace | Shanghai United Imaging Healthcare Co., Ltd | "annotators will refine the first round annotation, they will check each other's annotation. At last, a senior clinical specialist will check and modify annotations to make sure the ground truth correct." |
K230074 | Rapid Aneurysm Triage and Notification | iSchemaView Inc. | "Final performance validation included 266 (151 pos, 115 neg) CTA cases with ground truth established by 3 experts." |
K230082 | Auto Segmentation | GE Medical Systems, LLC | "Ground truth annotations were established following RTOG and DAHANCA clinical guidelines manually by three independent, qualified radiotherapy practitioners." |
K230144 | Denti.AI Detect | Denti.AI Technology, Inc. | "Ground truthing was performed by three independent dentists with the consensus rule applied to establish final reference standard." "Ground truthing was performed by three independent dentists with majority rule applied to establish final reference standard." |
K230209 | Sonix Health | Ontact Health Co., Ltd. | "The ground truth annotation for the test was performed by two experienced sonographers with a Registered Diagnostic Cardiac Sonographer (RDCS) certification. The annotation was supervised by two experienced cardiologists and the consensus annotation was used as the final ground truth." |
K230497 | Bladder AI (AIBV01) | Exo Inc | "The ground truth for bladder volume (reference data) was obtained as the average bladder volume measurement among three expert clinicians." |
K230534 | BriefCase-Quantification | Aidoc Medical, Ltd. | "Aidoc conducted a retrospective, blinded, multicenter study with the BriefCase-Quantification software to evaluate the software's performance in providing maximum axial diameter measurements of the abdominal aorta in CT images in 160 cases, from 6 US-based clinical sites, both academic and community centers, compared to the ground truth, as determined by three US board-certified radiologists." |
K230685 | AutoContour Model RADAC V3 | Radformation, Inc. | "Ground truthing of each test data set were generated manually using consensus (NRG/RTOG) guidelines as appropriate by three clinically experienced experts consisting of 2 radiation therapy physicists and 1 radiation dosimetrist." |
K230899 | qXR-PTX-PE | Qure.ai Technologies | "The ground truth was established by 3 ABR thoracic radiologists with a minimum of 10 years of experience." |
K231001 | DeepTek CXR Analyzer v1.0 | DeepTek Medical Imaging Pvt Ltd | "The ground truth (GT) label for the presence or absence of ROI for each category was defined as the majority opinion of 2 out of the 3 the radiologists." |
K231025 | EFAI NeuroSuite CT ICH Assessment System | Ever Fortune.AI Co., Ltd. | "The presence of ICH in each case was determined independently by three U.S. board-certified neuroradiologists, and the reference standard (ground truth) was generated by the majority agreement between the three experts." |
K231094 | Annalise Enterprise CTB Triage-OH | Annalise-AI Pty Ltd | "To determine the ground truth, each deidentified case was annotated in a blinded fashion by at least two ABR-certified and protocol-trained neuroradiologists (ground truthers), with consensus determined by two ground truthers and a third ground truther in the event of disagreement." |
K231130 | TumorSight Viz | SimBioSys, Inc. | "In cases where the two radiologists did not agree on whether the segmentation was appropriate, a third radiologist provided an additional opinion and established a ground truth by majority consensus." |
K231324 | DASI Dimensions (V1.0) | DASI Simulations | "The reference standard was derived from 2 qualified truthing each CTA, whose measurements were averaged for each case. If there was a significant variance between the initial two truthers, an adjudicator was involved." |
K231355 | Aurora | EnsoData | "For an event to be officially scored or reported, a consensus of at least two-thirds among the scorers was required." |
K231384 | Annalise Enterprise CTB Triage Trauma | Annalise-AI Pty Ltd. | "To determine the ground truth, each deidentified case was annotated in a blinded fashion by at least two ABR-certified and protocol-trained neuroradiologists (ground truthers), with consensus determined by two ground truthers and a third ground truther in the event of disagreement." |
K231396 | CEPHX- Cephalometric Analysis Software | Orca Dental AI LTD | "The study design involved the comparison of 21 clinically significant landmarks detected automatically by the AI algorithm to the manually detected landmarks by the three orthodontic specialists, with a margin of up to 2.0mm considered "pass" and a margin above this range considered "fail"." |
K231631 | BriefCase-Quantification | Aidoc Medical, Ltd. | "In cases where the reviewers disagree on the level of CAC, the senior US board-certified radiologist provided a final opinion which has established the ground truth." |
K231678 | Overjet Periapical Radiolucency Assist | Overjet, Inc | "The consensus reference standard established by 3 endodontists." |
K231683 | inHEART Models | inHEART, SAS | "In order to use this as a ground truth, two external experts evaluated the concordance of the manual segmentations for the task in which the use of this software is inscribed." |
K231690 | iCAS-LV | HighRAD Ltd. | "The ground truthing process involved two experienced radiologists, one of whom is US board-certified, independently identifying and delineating liver metastases in abdominal ceCT scans. A third senior radiologist reviewed and compared their findings, with the final lesion delineations validated or modified by the third radiologist being considered as the Ground Truth for the study." |
K231767 | Annalise Enterprise CTB Triage Trauma | Annalise-AI Pty Ltd | "To determine the ground truth, each deidentified case was annotated in a blinded fashion by at least two ABR-certified and protocol-trained neuroradiologists (ground truthers), with consensus determined by two ground truthers and a third ground truther in the event of disagreement." |
K231779 | REMI AI Discrete Detection Module | Epitel, Inc. | "Consensus ground truth electrographic seizure negative determinations were made using the wired EEG records when at least 2 of 3 members identified the presence or absence of an electrographic seizure event." |
K231805 | qXR-LN | Qure.ai Technologies | "The standalone study was performed to compare qXR-LN's performance against a ground truth determined by 5 ABR certified ground truthers. They read the Chest X-rays with the accompanying CT scans and reports and the ground truth was based on the nodules visible on the Chest Xray." |
K231837 | Brainomix 360 Triage LVO | Brainomix Limited | "To determine the ground truth, each case was reviewed by two ABR-certified neuroradiologists (ground truthers), with a consensus determined by a third ground truther in the event of disagreement." |
K231871 | Radify Triage | Envisionit Deep AI Ltd | "The ground truth was established by 3 board-certified ABR (USA) radiologists with a minimum of 11 years of experience." |
K232083 | BriefCase-Quantification | Aidoc Medical, Ltd. | "Aidoc conducted a retrospective, blinded, multicenter, study with the BriefCase-Quantification software to evaluate the software's performance in providing adequate measurements of the midline shift in non-contrast head CT images in 284 cases from 228 unique patients from 6 US-based clinical sites, both academic and community centers, compared to the ground truth, as determined by three neuroradiologists, who independently measured the midline shift, the reference standard was created as the mean of all three measurements." |
K232096 | Transpara Density 1.0.0 | Screenpoint Medical B.V. | "Ties in the panel majority-vote were resolved by taking the majority vote of the three most experienced radiologists in the panel." |
K232237 | Tyto Insights for Wheeze Detection | Tyto Care Ltd. | "To establish the ground truth, all of the recordings were read by three blinded experienced Pulmonologists at random, the binary ground truth was determined by majority vote of these three Pulmonologists." |
K232305 | AI-Rad Companion Brain MR | Siemens Medical Solutions U.S.A. | "For each dataset, three sets of ground truth of white matter hyperintensity changes between two time points are annotated manually. Each set is annotated by a disjoint group of annotator, reviewer, and clinical expert, with the expert randomly assigned per case to minimize annotation bias." |
K232331 | InVision Precision LVEF (LVEF) | InVision Medical Technology Corporation | "The primary success criterion was that the subject device would produce an ejection fraction number with a Root Mean Square Deviation below a set threshold as compared to the reference ground truth EF as well as Dice score above a set threshold compared to the consensus annotation of three cardiologists." |
K232384 | Videa Dental Assist | VideaHealth, Inc. | "US licensed dentists labeled the data and a US licensed dentist adjudicated those labels to establish a reference standard for the study." |
K232410 | SmartChest | Milvue | "The presence or absence of pneumothorax and pleural effusion was established by three ABR-certified radiologists with a minimum of 5 years of experience in cardiologists independently interpreted each case and the third radiologist independently reviewed the cases where there was disagreement between the first two. The final reference standard was determined by majority consensus." |
K232431 | syngo.CT Brain Hemorrhage | Siemens Medical Solutions USA, Inc. | "The performance of the syngo.CT Brain Hemorrhage device has been va[lidated in a stand-]alone performance study. Sensitivity and specificity of syngo.CT Brain Hemorrhage in processing of noncontrast head CT have been analyzed by comparison to a ground truth established by majority read of 3 US board certified neuroradiologists with more than 10 years of experience." |
K232436 | Rapid SDH | iSchemaView, Inc. | "Truth was established using three (3) expert neuro-radiologists." |
K232501 | AI Platform (AIP001) | Exo Inc | "The ground truth for ejection fraction (reference data) was obtained as the average ejection fraction measurement of three experts." "The ground truth of the presence of A-line was determined by consensus of two or more experts." "The ground truth of B-line counts was determined as the average of B-line counts from three experts." |
K232751 | BriefCase-Triage | Aidoc Medical, Ltd. | "The study compared the software's performance to the ground truth, as determined by three senior board-certified radiologists, using majority voting." |
K232928 | DeepContour (V1.0) | Wisdom Technologies., Inc. | "a third qualified internal staff member available to adjudicate if needed." |
K233176 | uOmnispace.MI | Shanghai United Imaging Healthcare Co., Ltd. | "For ground truth annotations in spine labeling: 'Finally, a senior clinical specialist will check and modify annotations to make sure the ground truth correct.'" "For ground truth annotations in rib labeling: 'At last, a senior clinical specialist will check and modify annotations to make sure the ground truth correct.'" |
K233186 | uOmnispace.MR | Shanghai United Imaging Healthcare Co., Ltd. | "If there is a disagreement, a consensus between the experts was done." |
K233196 | Medihub Prostate | JLK Inc. | "The ground truthing was conducted by expert-level radiologists. They independently annotated the prostate images, and these annotations were then consolidated into a definitive ground truth through a majority rule approach. The rationale for employing consensus among three radiologists, resolved through discussion and mutual agreement in cases of ties, ensures a reliable and unbiased representation of the prostate, crucial for the accurate clinical performance evaluation of our device." |
K233209 | uOmnispace.CT | Shanghai United Imaging Healthcare Co., Ltd. | "After the first round of annotation, they will check each other's annotation. Finally, all ground truth are evaluated by two licensed physicians with U.S. credentials." |
K233247 | Heuron ICH | Heuron Co., Ltd. | "The ground truth was determined by the two US board-certified neuroradiologists (truthers) interpretating each NCCT images, and in case of disagreement between the two truthers, a third truther reviewed the case for generating the final ground truth." |
K233438 | SleepStageML | Beacon Biosignals, Inc. | "SleepStageML software performance was evaluated against the expert consensus sleep stages that were constructed using 2/3 majority scoring (i.e., the stage per epoch where at least 2 of the 3 experts agree)." |
K233549 | Tempus ECG-AF | Tempus AI, Inc. | "Each clinical site contributed >1000 patient records, from which the AF status of each patient was determined based on duplicate manual chart review." |
K233590 | Overjet Charting Assist | Overjet, Inc | "The results were compared to a robust consensus reference standard established by trained dentists via majority pixel voting." |
K233618 | Oxevision Sleep Device | Oxehealth Limited | "Reference PSG measurements were assessed and scored (in accordance with the American Academy of Sleep Medicine Manual for the Scoring of Sleep and Associated Events version 2.6 of January 2020) by three trained sleep physiologists, blinded to the video data collected by the standard off-the-shelf camera." "Oxevision video data was reviewed and annotated (to obtain a reference standard) for periods of bed occupancy by two reviewers, blinded to the algorithm development details." |
K233753 | AI-Rad Companion (Pulmonary) | Siemens Healthcare GmbH | "In case of disagreement a third radiologist (9 years of experience) served as an adjudicator." |
K233968 | CINA-iPE | Avicenna.AI | "Device Sensitivity [95% CI] and Specificity [95% CI] were computed against the ground truth established by consensus of three US-board-certified expert radiologists." |
K233998 | TRAQinform IQ | AIQ Global, Inc. | "with blinded or otherwise neutral adjudication regarding interpretation/classification source." |
K234042 | EFAI Bonesuite XR Bone Age Pro Assessment System (BAP-XR-100) | Ever Fortune.AI Co., Ltd. | "The study design measured the performance of EFAI BAPXR against the ground truth (GT) from four U.S. board-certified expert radiologists. As shown in the following figure A) for the ground truthing workflow, the ground truthing was generated through the truthing process based on the current standard of care, with the addition of multiple checkpoints to ensure consistency and consensus among all readers reviewing the radiographs when comparing them to the Greulich-Pyle Atlas." |
K234141 | AISAP Cardio V1.0 | Aisap | "Any discrepancies were interpreted by a third ground truth cardiologist ("2+1" annotation strategy). Any persistent disagreements were decided at a meeting of the three ground truth cardiologists." |
K240003 | Velmeni for Dentists (V4D) | Velmeni Inc. | "Standalone performance was compared to ground truth established by consensus labels of three US licensed dentists, and nonconsensus labels were adjudicated by an oral radiologist." |
K240094 | LumiNE US; Lumi | Augmedit B.V. | "The U.S data was individually truthed by 3 U.S. based neurosurgeons with relevant experience including fellowships. The definitive US ground truth test set was established by mutual agreement after internal discussion and signed off per scan per truther." |
K240291 | EFAI CARDIOSUITE CTA ACUTE AORTIC SYNDROME ASSESSMENT SYSTEM | Ever Fortune.AI, Co., Ltd. | "The presence of AD or IMH in each case was determined independently by three U.S. board-certified radiologists, and the reference standard (ground truth) was generated by the majority agreement between the three experts." |
K240301 | MammoScreen® (3) | Therapixel | "The study applied a fully crossed design, so that each case was read by each reader both with and without the aid of MammoScreen 3." |
K240411 | uAI Portal | Shanghai United Imaging Intelligence Co., Ltd. | "During the ground truthing process, two Chinese radiologists, each with at least 5 years of clinical experience, independently annotated vessel mask for each patient case, resulting in two sets of annotations per case. Both radiologists are hospital employees and are independent from United Imaging. After completion, an American Board-Certified Radiology adjudicator with at least 10 years of clinical experience reviewed both sets of segmented images. Based on his assessment, the adjudicator selected the most accurate segmentation set as the final ground truth. If needed, he would make any necessary modification until a satisfactory ground truth was established for the study." |
K240555 | Tyto Insights for Crackles Detection | Tyto Care Ltd. | "To establish the ground truth, all the recordings were read by three blinded experienced Pulmonologists at random, the binary ground truth was determined by a majority vote of these three Pulmonologists." |
K240612 | CINA-VCF | Avicenna.AI | "Device Area Under the Receiver Operating Characteristic curve (ROC AUC) was computed against the ground truth established by consensus of three US-board-certified expert radiologists, as the primary endpoint, in accordance with the established required technical method under the QFM product code." |
K240642 | SMART Bun-Yo-Matic CT | Disior Ltd | "Based on the majority vote of three, two same responses were required to establish a ground truth on each of the DICOM series." |
K240697 | See-Mode Augmented Reporting Tool, Thyroid (SMART-T) | See-Mode Technologies Pte. Ltd. | "The ground truth labels for localisation, ACR TI-RADS lexicon descriptors, and TI-RADS level agreement were based on the labels of two expert US-board certified radiologists and an adjudicator (also US-board certified radiologist with the most years of experience)." |
K240712 | icobrain aria | icometrix NV | "Ground truth obtained via a consensus of 3 experts." |
K240736 | SMART Bun-Yo-Matic X-Ray | Disior Ltd | "The ground truth for the testing data was established by two (2) clinicians with over five (5) years of experience practicing medicine. Each clinician was given the same image data to review dorsoplantar and lateral x-ray images. Each clinician then marks on a spreadsheet the presence of the bone in the image." |
K240740 | qCT LN Quant | Qure.ai Technologies | "Ground Truth was established by three expert radiologists. The truthers independently read the scans and mark out the boundaries of the nodule in all slices" |
K240791 | ADAS 3D | Adas3D Medical S.L | "Ground truth annotations were generated using the FDA-cleared ADAS 3D software by two clinical experts independent of the clinical experts who established the ground truth of the training dataset." |
K240845 | Rayvolve | AZmed SAS | "Each case had been previously evaluated by a panel of three US board-certified MSK radiologists to provide ground truth binary labeling the presence or absence of fracture and the localization information for fractures." |
K240901 | Stethophone | Sparrow Acoustics Inc. | "Each recording in a testing dataset was annotated by multiple expert cardiologists. Annotation of each recording included determining the presence of a heart murmur of any type and providing timings of all S1 and S2 heart sounds that were audible in the recording." |
K240942 | CINA-CSpine | Avicenna.AI | "Device Sensitivity [95% CI] and Specificity [95% CI] were computed against the ground truth established by consensus of three US-board-certified expert radiologists." |
K240993 | encevis (2.1) | AIT Austrian Institute of Technology GmbH | "An event was considered as 'true seizure' only if the time interval of two out of three reviewers overlapped by at least 1 second. A seizure epoch was the overlapping time range of two reviewers." "An event was considered as 'true spike' only if the time interval of two out of three reviewers overlapped." "Annotations had to be consistent between both reviewers to be used in the sensitivity and specificity measurement" |
K241009 | PeriCALM Patterns 3.0 | PeriGen, Inc. | "To resolve the inter-observer variation, a majority opinion approach was used." |
K241038 | Cardiac CT Function Software Application | Circle Cardiovascular Imaging | "Compared to a reference standard established from three expert readers, the ML-based model is capable of segmenting the LV cavity with less than 10% difference in MAE, a Dice coefficient above 86%, a HD below 9.5 mm, and an EF bias of 1.3% with a 95% confidence interval of [-12, 14]." |
K241108 | RemedyLogic AI MRI Lumbar Spine Reader | Remedy Logic Inc. | "For the segmentation, each radiologist used a specialized pixel labeling tool to independently label the pixels of the tissues at the predetermined levels of the preselected axial and sagittal slices. The per-pixel majority opinion of the five (5) radiologists established the ground truth for each anatomical structure. Specially, if at least 3 of the 5 radiologists labeled a pixel as belonging to a particular anatomical structure, the pixel was included. Otherwise, the pixel was excluded." |
K241112 | BriefCase-Quantification | Aidoc Medical, Ltd. | "determined by three US board-certified radiologists." |
K241211 | CoLumbo | Smart Soft Healthcare | "the ground truth was defined by 3 radiologists" |
K241232 | Galen™ Second Read™ | Ibex Medical Analytics Ltd. | "the GT determination for a slide was performed by two independent expert pathologists; slides where the pathologists disagreed, a third independent expert pathologist was asked to review the slide and the majority rule determined the GT for the slide." |
K241380 | FETOLY-HEART | Diagnoly | "Images in which the pair of annotators disagreed were reviewed by an adjudicator, who made the final decision." "If the overlap was lower or there was a disagreement on the criterion presence, an adjudicator reviewed the boxes. The final decision regarding the presence was based on majority consensus among the adjudicator and annotators. The final decision for the criteria localization was based on the adjudicator's decision to either keep one of the annotator's boxes or draw a new one." |
K241390 | NeuroMatch | LVIS Corporation | "A reference standard was established for the validation dataset by a panel of three independent EEG trained neurologists who reviewed and annotated the EEG recordings for seizure episodes. Seizures were identified based on a 2 out of 3 majority rule." "A reference standard was established for the validation dataset by a panel of three independent EEG trained neurologists who reviewed and annotated the EEG recordings for spike events. The reference standard for spike is established with majority consensus among the annotating physicians (i.e., consensus of at least 2 out of the 3 physicians)." |
K241430 | EchoMeasure | iCardio.ai | "Ground truth annotations were established using manual measurements and segmentations performed by experienced clinicians (using the mean of three experienced US-based cardiac sonographers per case to establish the Ground Truth)." |
K241440 | HealthCCSng | Nano-X AI Ltd. | "The ground truth (Coronary Artery Calcium Category) was determined by the majority agreement of two out of three US board certified radiologists, experienced in identifying coronary calcium on non-gated CT studies." |
K241480 | JBS-LVO | JLK, Inc. | "In this standalone performance evaluation, each case output from the JBS-LVO device was compared with a ground truth was determined by two ground truthers, with a third ground truther intervening in cases of disagreement. All truthers were US board-certified neuroradiologists." |
K241561 | MammoScreen BD | Therapixel | "The primary objective was to evaluate the accuracy of MammoScreen BD in assessing the breast density value in terms of agreement between MammoScreen BD and the ground truth (GT) established by consensus among the visual assessment of 5 breast radiologists." |
K241593 | BoneMetrics (US) | Gleamer SAS | "Any cases with discrepancies exceeding the predetermined threshold were subjected to an adjudication process, where the three experts mutually agreed on a value for the ground truth." |
K241696 | Ortho AI | Ortho AI LLC | "After the reviews from each blinded surgeon, a final senior-level surgeon adjudicator reviewed the modifications and added further modifications to the segmentations, if necessary." |
K241719 | NeuroICH | Neurocareai Inc. | "comparing the NeuroICH's output to the ground truth as established by three US board certified Neurologists." |
K241725 | Better Diagnostics Caries Assist (BDCA) Version 1.0 | Better Diagnostics AI Corp | "Ground truth was determined through the consensus of two out of three experienced, licensed dentists, each with over 10 years of professional experience. These dentists examined and labeled dental surfaces, agreeing on the final labels for analysis when at least two dentists identified a surface as carious." |
K241727 | BriefCase-Triage | Aidoc Medical, Ltd. | "determined by three senior board-certified radiologists" |
K241747 | Saige-Dx | DeepHealth, Inc | "Briefly, each cancer exam and supporting medical reports were reviewed by two independent truthers, plus an additional adjudicator if needed." |
K241923 | EFAI Neurosuite CT Midline Shift Assessment System (MLS-CT-100) | Ever Fortune.AI, Co., Ltd. | "The presence of MLS in each case was determined independently by three U.S. board-certified radiologists, and the reference standard (ground truth) was generated by the majority agreement between the three experts." |
K242062 | 1CMR Pro | Mycardium AI Limited | "This was done by 3 independent US based truthers, all with >5 years experience." |
K242120 | OTOPLAN | Cascination AG | "The ground truth has been established by three qualified surgeons." |
K242166 | TribusConnect | TribusMed Beheer BV | "Validation of Heart segmentation is performed by 2 US board certified radiologists who qualitatively evaluated the performance." |
K242171 | TechCare Trauma | Milvue | "The ground-truth was established by American Board of Radiology (ABR)-certified radiologists with a minimum of 5 years of experience since ABR certification. Pediatric and adult cases followed two parallel ground-truthing (GT) pathways: the pediatric cases were annotated by a pediatric GT panel made of three ABR-certified pediatric radiologists and the adult cases by an adult GT panel made of three ABR-certified musculoskeletal (MSK) radiologists independently interpreted each case for the presence or absence of fracture and EJE using the standard clinical definitions of these pathologies. The third radiologist independently reviewed the cases where there was disagreement between the first two. The final reference standard was determined by majority consensus." "...with a third reviewing cases with initial disagreements. The final reference standard was determined by majority consensus." |
K242188 | ClearRead CT CAC | Riverain Technologies, Inc. | "In total, 491 cases were used in the clinical assessment of the device... as determined by a consensus ground truth review by three radiologists." |
K242203 | BriefCase-Quantification | Aidoc Medical, Ltd. | "Aidoc conducted a retrospective, blinded, multicenter study with the BriefCase-Quantification software to evaluate the software's performance...compared to the ground truth, as determined by three US board-certified radiologists." |
K242292 | uAI Easy Triage ICH | Shanghai United Imaging Intelligence Co., Ltd. | "Sensitivity and specificity of uAI Easy Triage ICH in processing of non-contrast head CT have been analyzed by comparison to a ground truth established by majority read of 3 U.S.-board-certified neuroradiologists." |
K242342 | Fetal EchoScan | BrightHeart | "The reference standard was derived from the dataset through a truthing process in which three pediatric cardiologists assessed the presence or absence of each of the eight findings, and majority voting was used." |
K242411 | Brainomix 360 e-Lung | Brainomix Limited | "The lung segmentation performance of the updated algorithm was validated through a head-to-head comparison between proposed and predicate devices. The study evaluated the accuracy of the e-Lung lung mask generation compared to a ground truth mask generated from the consensus of three experienced US board certified radiologists, who segmented the lungs following their usual standard of care." |
K242437 | Smile Dx® | Cube Click, Inc. | "Both devices were evaluated in a multi-reader, multi-case (MRMC) retrospective study with at least 13 US licensed dentists (Smile Dx® had 14 readers). Ground truth was established by the consensus labels of at least three US licensed dentists (the ground truth for Smile Dx®'s study was established by four US licensed dentists)." |
K242461 | IRISeg | Intuitive Surgical Inc. | "A consensus of three U.S. Board Certified Radiologists was used to resolve discrepancies / reader disagreements during the performance testing of the machine learning model." |
K242522 | Second Opinion CC | Pearl Inc. | "The ground truth (GT) was established using the consensus approach based on agreement among at least three out of four expert readers." |
K242600 | Second Opinion Periapical Radiolucency Contours | Pearl Inc. | "The ground truth (GT) was established using the consensus approach based on agreement among at least three out of four expert readers." |
K242607 | ScanDiags Ortho L-Spine MR-Q | ScanDiags AG | "Consent ground truth for anatomic structure segmentation determined by pixel-based majority opinion between the three radiologists. Consent ground truth for area and distance measurements determined by averaging the measurements of all three readers." |
K242729 | AutoContour (Model RADAC V4) | Radformation, Inc. | "Ground truthing of each test data set were generated manually using consensus (NRG/RTOG) guidelines as appropriate by three clinically experienced experts consisting of 2 radiation therapy physicists and 1 radiation dosimetrist." |
K242745 | AI-Rad Companion Organs RT | Siemens Healthcare GmbH | "Additionally, a quality assessment including review and correction of each annotation was done by a board-certified radiation oncologist using validated medical image annotation tools." |
K242781 | cvi42 Software Application | Circle Cardiovascular Imaging Inc. | "The performance of the constrained tissue tracking algorithm was also compared to manual tracking in ES phase by three expert readers." |
K242807 | HeartFocus (V.1.1.1) | Deski | "When necessary, disagreements were resolved either through direct reconciliation by the 2 experts or by a third expert. The ground truth (or gold standard) was defined from the consensus between the first expert annotator and the expert reviewer(s)." |
K242821 | EFAI Chestsuite XR Malpositioned ETT Assessment System (ETT-XR-100) | Ever Fortune.AI, Co., Ltd. | "The determination of malpositioned ETT in each case was independently assessed by three U.S. board-certified radiologists, with cases classified as positive for malpositioned ETT. Cases where the ETT was correctly positioned or with no ETT were classified as negative. The reference standard (ground truth) was based on the majority agreement among the three U.S. board-certified radiologists, resulting in 259 positive cases and 681 negative cases (Correctly Positioned ETT: 316, With No ETT: 365)." |
K242837 | BriefCase-Triage | Aidoc Medical, Ltd. | "compared to the ground truth as determined by three senior board-certified radiologists." |
K242925 | MR Contour DL | GE HealthCare | "all (3) independently validated ground-truth contours were incorporated in the performance evaluation." |
K242994 | OncoStudio (OS-01) | OncoSoft. Co., Ltd. | "Ground truth segmentations were established by three radiation oncologists following international clinical guidelines." "First, the 1 radiation oncologist manually delineated the organs. Second, segmentation results generated by 1 radiation oncologist are sequentially edited and confirmed by 2 radiation oncologists. In this editing process, the first radiation oncologist makes corrections, and the corrected results are received and finalized by another radiation oncologist." |
K243145 | syngo.CT LVO Detection | Siemens Medical Solutions USA, Inc. | "Ground truth was established by two US-board certified neuroradiologists independently assessing the cases. In case of disagreement, adjudication was performed by a third US-board certified neuroradiologist." |
K243189 | TumorSight Viz | SimBioSys, Inc. | "In cases where the two radiologists did not agree on whether the segmentation was appropriate, a third radiologist provided an additional opinion and established a ground truth by majority consensus." |
K243230 | Second Opinion® BLE | Pearl Inc. | "measurement differences that were clinically significant required an adjudication. These divergent measurements were then adjudicated by two U.S. Dental Radiologists." |
K243234 | Second Opinion® CS | Pearl Inc. | "ground truth determined by four experienced dentists achieving consensus (Jaccard index ≥0.4)." |
K243239 | Lung AI (LAI001) | Exo Inc | "Adjudication, in case of disagreement, was provided by a third expert." |
K243294 | Brainomix 360 e-ASPECTS | Brainomix Limited | "Ground truth was determined by the consensus of three board-certified US neuroradiologists." |
K243341 | Genius AI Detection 2.0 | Hologic, Inc. | "The ground truthing to evaluate performance metrics including the locations of cancer lesions was done by two MQSA-certified radiologists with over 20 years of experience." |
K243350 | Rapid Neuro3D | iSchemaView, Inc. | "The primary endpoint, clinical accuracy, as determined by the consensus of up to three clinical experts against the source DICOM images, passed for all RN3D outputs with 99.8% agreement for MIP images, 98.6% agreement for VR images, 100.0% agreement for SSE images, and 100.0% agreement for CPR." |
K243363 | JLK-ICH | JLK, Inc. | "Each case output from the JLK-ICH device was compared with a ground truth standard determined by two ground truthers, with a third ground truther intervening in cases of disagreement (i.e., 2+1 truther scheme). All truthers were US board-certified neuroradiologists." |
K243378 | Rapid MLS | iSchemaview Inc. | "Final performance validation included 153 NCCT cases with ground truth established by 3 experts." |
K243548 | BriefCase-Triage | Aidoc Medical, Ltd. | "The study compared the software's performance to the ground truth, as determined by three senior board-certified radiologists." |
K243611 | JLK-SDH | JLK, Inc. | "Each case output from the JLK-SDH device was compared with a ground truth standard determined by two ground truthers, with a third truther intervening in cases of disagreement (i.e., 2+1 truther scheme). All truthers were US board-certified neuroradiologists." |
K243647 | Synapse PACS (7.5) | FUJIFILM Healthcare Americas Corporation | "An initial bone mask was created by certified technologist, then subjected to an independent dual-reader consensus review: two U.S. board-certified radiologists independently evaluated the mask, recorded any discrepancies, and iteratively reconciled them until consensus was achieved. The resulting consensus mask serves as the definitive ground truth for performance testing." |
K243685 | MammoScreen BD | Therapixel | "The reference standard for breast density value was established by majority rule among the assessment of 5 breast radiologists with at least 10 years of experience in breast imaging interpretation." |
K243743 | autoSCORE (V 2.0.0) | Holberg EEG AS | "A consensus of three HEs was used as the reference standard for all calculations. Each segment was prepared in two forms: 1. Without any markers placed by autoSCORE v 2.0 for recording level validation 2. With autoSCORE v2.0 markers and their assigned type of abnormality for marker level validation." "A marker was classified as a True Positive (TP) if at least two HEs agreed that it correctly the abnormality type. Conversely, if fewer than two HEs agreed, the marker was considered a False Positive (FP)." |
K243769 | QFR (3.0) | QFR Solutions bv | "For all of these algorithmic improvements the user is able to review and correct the results before the QFR value is calculated." |
K243810 | TraumaCad Neo (1.1) | Brainlab Ltd. | "Accuracy of implant presence and 2D landmark detection have been tested against ground-truth annotations done by qualified and trained personnel." |
K243851 | CHLOE BLAST | Fairtility Ltd. | "The TLI videos were annotated at a frame level with the ground truth of one of the morphokinetic stages and at a video level with blastulation results. Number of pronuclei (PNs) and embryo quality (according to SART) were also annotated to allow subgroup analysis." |
K243859 | PRAEVAorta®2 | Nurea | "The manual measurements performed by these healthcare professionals are referred to as the "ground truth." The measurements performed by these professionals showed no discrepancy greater than 5 mm at the end of the collected data process." |
K243863 | Opulus™ Lymphoma Precision | Roche Molecular Systems, Inc. | "Reference standard (ground truth) was established using three radiologists/nuclear medicine physicians with expertise in interpreting PET/CT scans from patients with FDG-avid lymphoma. The ground truth for each scan was based on the independent input from three radiologists randomly selected from a pool of nine radiologists." |
K250035 | Contour ProtégéAI+ | MIM Software Inc. | "The initial segmentations were then reviewed and corrected by radiation oncologists against the same standards and guidelines. Qualified staff at MIM Software (MD or licensed dosimetrists) then performed a final review and correction." |
K250221 | StrokeSENS ASPECTS Software Application | Circle Cardiovascular Imaging Inc. | "The primary standalone performance assessment was a region-level Clustered ROC Analysis to demonstrate the standalone performance of the ASPECTS device with respect to the expert consensus reference standard." |
K250239 | NeuroMatch | LVIS Corporation | "Specifically, a clinical study was designed to evaluate the concordance of the SL algorithms and the resected brain areas, following the 510(k) summary of the FDA-cleared device PreOp (K172858). In this study, three US board-certified epileptologists were recruited to independently complete a survey. The physicians were presented with the source localization results of each device, along with normalized post-operative MRIs with distinctive resection regions. They were instructed to first determine the resection region at the sublobar level. They then assessed whether SL output of each device (NeuroMatch: sLORETA on idealized brain model, CURRY: LORETA on idealized brain model, PreOp: sLORETA on individualized brain model) had any overlap with the determined resection region at a sublobar level. For a particular patient, for every device, the physicians responded to a Yes/No question that asked whether there is concordance for the corresponding device." |
K250248 | BriefCase-Triage | Aidoc Medical, Ltd. | "ground truth [was] determined by three senior board-certified radiologists." |
K250686 | GyriCalc (Version 1.0.0) | NeuroSpectrum Insights Corp. | "For each brain MRI, the expert used an annotation platform to view the image series and a pre-loaded initialization of 16 subregions of the brain. The expert then reviewed the initial segmentation and edited the segmentations as necessary for accuracy. The segmentations of the 3 experts [were] then combined to produce a single segmentation using the STAPLE method. Reference measurements (i.e., volume, surface area, and local gyrification index) were derived from the combined segmentation." |
K250831 | Annalise Enterprise | Annalise-AI | "To determine the ground truth, each deidentified case was annotated in a blinded fashion by at least two ABR-certified and protocol-trained radiologists who interpret chest X-ray as part of regular clinical practice (ground truthers), with consensus determined by two ground truthers and a third ground truther in the event of disagreement." |
K251071 | Fetal EchoScan (v1.1) | BrightHeart | "The reference standard was derived from the dataset through a truthing process in which three pediatric cardiologists assessed the presence or absence of each of the eight findings, and majority voting was used." |
K251151 | Rapid CTA 360 | iSchemaView | "Final performance validation included 403 CTA cases with ground truth established by 3 experts (2:3 concurrence), all cases are independent of the development data." |
K251342 | EchoPAC Software Only / EchoPAC Plug-in | GE Medical Systems Ultrasound and Primary Care Diagnostics | "A review panel of five clinical experts provided feedback on the annotations which were corrected (as needed) until a consensus agreement was achieved between the annotators and reviewers." |
K251456 | BrightHeart View Classifier | BrightHeart | "The reference standard was derived from the dataset through a truthing process in which a sonographer and an MFM specialist with experience in fetal echocardiography determined the presence or absence of standard views on fetal ultrasound images." |
K251528 | syngo.via MI Workflows; Scenium; syngo MBF | Siemens Medical Solutions USA, Inc. | "In the first analysis conducted, the reference standard used to evaluate the subject device method performance consisted of liver VOI positioning obtained semi-automatically by two expert readers. The subject device algorithm was then compared to the reference standard and shown to yield results in better agreement with semi-automatic evaluation by expert readers compared with the method of placement used in the predicate device." |
K251590 | Methinks CTA Stroke | Methinks Software S.L. | "Ground truthing was established by two US board certified neuroradiologists that read the cases and a third ground truther in case the two first readers were in disagreement regarding LVO findings. The final ground truth was established based on the majority vote." |
K251766 | TumorSight Viz | SimBioSys, Inc. | "In cases where the two radiologists did not agree on whether the segmentation was appropriate, a third radiologist provided an additional opinion and established a ground truth by majority consensus." |
K251837 | Salix Coronary Plaque (V1.0.0) | Artrya Limited | "Discrepancies between the expert readers [were] resolved by a third independent adjudicator with Level III qualifications or equivalent experience." |
K251983 | Brainomix 360 Triage Stroke | Brainomix Limited | "Truthing was conducted by consensus of three experienced US board certified neuroradiologists." |
K252362 | GBrain MRI | Galileo CDS, Inc | "The GBrain MRI segmentation performance was evaluated by comparing the software-derived segmentations to a Simultaneous Truth and Performance Level Estimation (STAPLE) algorithm generated consensus of three expert-labeled segmentations of hyperintensities for volume measurement accuracy, and segmentation overlap agreement. Comparisons to expert segmentations were quantified using OLS Regression, and Dice similarity coefficient (extent of software-derived vs. ground truth overlap). The three expert labeled segmentations were performed by three independent US board certified, experienced neuroradiologists." |
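
Two patterns recur in the rows above: a majority vote across three independent readers (e.g., K251071, K251151) and a "2+1" design in which a third reader is consulted only when the first two disagree (e.g., K250831, K251590). The sketch below illustrates both for a single categorical finding. It is a minimal illustration, not any sponsor's actual implementation: the function names `majority_vote` and `two_plus_one`, and the `third_read` callback, are hypothetical names introduced here.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence, Tuple


def majority_vote(labels: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the label chosen by a strict majority of readers, else None."""
    if not labels:
        raise ValueError("at least one read is required")
    label, count = Counter(labels).most_common(1)[0]
    # Strict majority: more than half of all reads must agree.
    return label if count * 2 > len(labels) else None


def two_plus_one(
    first: Hashable,
    second: Hashable,
    third_read: Callable[[], Hashable],
) -> Tuple[Optional[Hashable], bool]:
    """2+1 adjudication (hypothetical helper).

    Two primary reads are compared; the third read is requested only on
    disagreement. Returns (reference_label, third_read_was_used).
    """
    if first == second:
        return first, False
    # Disagreement: obtain the tie-breaking read, then take the majority.
    third = third_read()
    return majority_vote([first, second, third]), True


if __name__ == "__main__":
    # Three independent reads of one binary finding (2-of-3 concurrence).
    print(majority_vote([True, True, False]))
    # 2+1: the readers disagree, so the third reader is consulted.
    print(two_plus_one(True, False, lambda: False))
```

Passing the third read as a callback means it is only requested when needed, which mirrors the operational point of the 2+1 design: the third expert's time is spent only on discordant cases. For findings with more than two categories, `majority_vote` can return `None` when all three reads differ; a protocol would need to define an escalation path for that case.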