Deliverables
Defining a Label for Data Quality and Utility
This report outlines how we define and measure the quality and usefulness of data, offering a structured framework for assessing dataset reliability. It builds on previous research, expert discussions, and a rigorous consensus-building methodology to develop a standardized approach to data labeling. To ground the technical specification, we conducted a systematic literature review and expert consultations, which identified 54 dimensions of data quality. Through iterative expert evaluation, statistical validation, and prioritization, we refined this list to 12 core dimensions applicable to any data type and reuse purpose. For each of these dimensions, we developed quantifiable metrics and scoring criteria, agreed on by experts from diverse data quality domains.

Evaluating Data Holders’ Readiness
This report presents the Data Holders’ Maturity Model, a framework for assessing how well organizations manage high-quality data. It defines ten key dimensions with five maturity levels and guides progression from lower to higher maturity. Typically, each organization completes a single assessment, but separate assessments can be completed where maturity varies across data domains (e.g., clinical vs. genomic data). Designed for long-term use, the model lets data holders track and improve their maturity over time.
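As an illustration only, the structure described above (ten dimensions, five levels, one assessment per organization or per data domain, repeated over time) could be recorded roughly as in the sketch below. The dimension names are placeholders because the report's actual names are not listed here. The 1 to 5 level numbering, the assumption that every dimension is rated on the same five levels, and the names `Assessment` and `level_changes` are likewise assumptions made for this example, not part of the model itself.

```python
from dataclasses import dataclass
from datetime import date

NUM_DIMENSIONS = 10            # the model defines ten key dimensions
MIN_LEVEL, MAX_LEVEL = 1, 5    # five maturity levels; numbering them 1-5 is an assumption

# Placeholder names: the real dimension names are defined in the report itself.
DIMENSIONS = [f"dimension_{i}" for i in range(1, NUM_DIMENSIONS + 1)]


@dataclass
class Assessment:
    """One maturity assessment for an organization (hypothetical structure)."""
    organization: str
    assessed_on: date
    levels: dict[str, int]      # maturity level per dimension
    data_domain: str = "all"    # e.g. "clinical" or "genomic" when assessed separately

    def __post_init__(self) -> None:
        # Every dimension must be rated exactly once, with a level in range.
        if set(self.levels) != set(DIMENSIONS):
            raise ValueError("levels must cover exactly the ten dimensions")
        for name, level in self.levels.items():
            if not MIN_LEVEL <= level <= MAX_LEVEL:
                raise ValueError(f"{name}: level {level} is out of range")


def level_changes(earlier: Assessment, later: Assessment) -> dict[str, int]:
    """Per-dimension change in level between two assessments of the same domain."""
    if earlier.data_domain != later.data_domain:
        raise ValueError("compare assessments of the same data domain")
    return {d: later.levels[d] - earlier.levels[d] for d in DIMENSIONS}


# Usage: two assessments of the same (fictional) data holder, one year apart.
first = Assessment("Example Hospital", date(2024, 1, 15),
                   {d: 2 for d in DIMENSIONS}, data_domain="clinical")
second = Assessment("Example Hospital", date(2025, 1, 15),
                    {**first.levels, "dimension_3": 4}, data_domain="clinical")
improved = {d: delta for d, delta in level_changes(first, second).items() if delta > 0}
```

Keeping one record per data domain and date is one simple way to support both the per-domain assessments and the tracking over time that the model is designed for.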

Understanding Stakeholder Training Needs for High-Quality Data
This report identifies the key training needs of stakeholders involved in data quality and utility. Through a detailed analysis, it highlights gaps in current knowledge and proposes a tailored educational curriculum to help professionals understand and apply the Data Quality & Utility (DQ&U) label. The findings will guide the development of training materials that help stakeholders keep data accurate, reliable, and fit for purpose.