Data veracity assurance
- Data spaces will lead to new levels and complexities of cross-party data sharing scenarios by standardising technical aspects of data federation as well as the contractual and consent aspects.
- However, in these more dynamic and more complex networks of data interdependencies, the risks associated with incomplete or erroneous data are also magnified.
- The data veracity building block will provide an integrated framework for assuring data quality in data-space-based data federation.
Start date: T0 (expected: Q1 2024)
End date: T0 + 12 months
Duration (in months): 12
- Development has yet to start.
- Agreements on data quality: lightweight, extendable semantic descriptions of the exchanged data and of the constraints it must satisfy (e.g., completeness, resolution, precision, internal linking, … properties).
- Striking veracity level agreements: building on the consent management capabilities, facilities for striking data ‘veracity level agreements’ between the consenting parties. We plan to provide a decentralized implementation option for the agreement management.
- Veracity evaluation modules: a range of veracity evaluation implementations for key data ensemble types and veracity requirements. These include:
  - means for ensuring the veracity of properties that are easy to check (e.g., in data federation connectors);
  - facilities for weaving third-party data veracity assurance services (which may involve AI support) into the data transfer;
  - checkable decentralized commitments (from simple hashes to zero-knowledge proofs);
  - decentralized agreement schemes for data where veracity evaluation is not feasible to automate and has to rely on either the judgement of third parties (e.g., in an audit role) or majority consensus.
- Rule-based reconciliation: workflow- and decision-model-based procedures for reconciliation once a breach of a data veracity agreement has been accepted or agreed on. This reconciliation will leverage fault-tolerant patterns to improve trust in both the data sources and the AI components. We plan to provide a decentralized implementation option.
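
As an illustration only, two of the simpler ingredients listed above (an easy-to-check quality constraint and a hash-based checkable commitment) could be sketched as follows. This is a minimal sketch under stated assumptions, not the planned design: the names `CompletenessConstraint`, `commit` and `verify`, the record layout, and the choice of a salted SHA-256 commitment are all hypothetical and do not come from the building block specification.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class CompletenessConstraint:
    """Hypothetical constraint: at least `min_ratio` of the records carry `field`."""
    field: str
    min_ratio: float

    def holds(self, records: list[dict]) -> bool:
        # An empty data set is treated as violating the constraint (an assumption).
        if not records:
            return False
        present = sum(1 for r in records if r.get(self.field) is not None)
        return present / len(records) >= self.min_ratio


def commit(payload: bytes) -> tuple[str, bytes]:
    """Return (commitment, nonce) for a salted SHA-256 hash commitment."""
    nonce = secrets.token_bytes(16)  # random salt so equal payloads do not collide
    digest = hashlib.sha256(nonce + payload).hexdigest()
    return digest, nonce


def verify(commitment: str, nonce: bytes, payload: bytes) -> bool:
    """Check that a revealed payload and nonce match an earlier commitment."""
    expected = hashlib.sha256(nonce + payload).hexdigest()
    return hmac.compare_digest(expected, commitment)
```

In such a setup, a data provider would publish the commitment when the agreement is struck and reveal the payload and nonce later, so that the consuming party (or an auditor) can check that the delivered data is the data that was committed to. Zero-knowledge proofs and consensus-based schemes would need considerably more machinery than this sketch shows.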
ISO 8000 (Data Quality), emerging ISO/IEC CD 5259 family of standards (Data quality for AI)
BME will lead this task, based on its solid background in fault tolerance, the theory and practice of blockchain technologies, model-driven engineering, and advanced data analysis.
(The participating research group has been involved in 20+ EU projects.)