Ensuring Data Reliability and Provenance
Provenance: what it is, why it is important
- what it is
  - the who, what, and when of metadata
  - trusted identification of individuals/organizations and services
- why it is important
  - assigning credit for creation and citation
  - privacy rights
  - judging the value of the data
  - replicability of results
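
As an illustration of the "who, what, and when" of metadata, here is a minimal sketch of a provenance record. The class name, field names, and the identifier string are hypothetical and are not taken from any particular metadata standard.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class ProvenanceRecord:
    """One provenance event (hypothetical structure, for illustration only)."""
    who: str   # identifier of the responsible person, organization, or service
    what: str  # the action taken on the data
    when: str  # ISO 8601 timestamp, normalized to UTC


def record_event(who: str, what: str, when: datetime) -> ProvenanceRecord:
    # Reject naive timestamps: without a time zone the "when" is ambiguous.
    if when.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ProvenanceRecord(
        who=who,
        what=what,
        when=when.astimezone(timezone.utc).isoformat(),
    )


# "example-id:alice" is a made-up identifier, not a real naming scheme.
event = record_event(
    "example-id:alice",
    "created dataset",
    datetime(2010, 1, 15, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(asdict(event)))
```

Making the record immutable (frozen) reflects the idea that a provenance event, once written, should not be silently altered.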
Provenance: how to achieve it
Data as Publication
- the technology is there
- institutional and social change/engagement needed
- need to get institutional credit for data publication
- encouragement of researchers to publish and cite
- annotation/quality control
Handles: globally unique, persistent identifiers for
- entities: people, organizations, roles
- documents
- views
- mashups
- doors
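
To make the idea of a handle concrete: a handle is written as a prefix (the naming authority) and a suffix (the local name), separated by the first slash. The sketch below splits a handle and builds a link through the public hdl.handle.net proxy. The function names and the example handle are hypothetical, and the code does not contact the network or check that the handle is actually registered.

```python
from typing import Tuple
from urllib.parse import quote

PROXY = "https://hdl.handle.net/"


def split_handle(handle: str) -> Tuple[str, str]:
    """Split a handle into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a valid handle: {handle!r}")
    return prefix, suffix


def proxy_url(handle: str) -> str:
    """Build a resolvable URL for the handle via the proxy."""
    prefix, suffix = split_handle(handle)
    # Percent-encode characters that are unsafe in a URL path;
    # slashes inside the suffix are kept as they are.
    return PROXY + quote(prefix, safe="") + "/" + quote(suffix, safe="/")


# Made-up handle, used only to show the shape of the result.
print(proxy_url("20.500.12345/example-dataset"))
```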
Software as a Service
Reliability
- preservation of the bits
- access and use, including privacy
- comprehensibility
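
"Preservation of the bits" is commonly checked by recording a cryptographic digest when the data is deposited and recomputing it later. The following is a minimal sketch of such a fixity check using SHA-256. The function names and the sample data are hypothetical, and a real archive would also need to protect the recorded digest itself.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def bits_preserved(stream: BinaryIO, recorded_digest: str) -> bool:
    """True if the stream's current digest matches the one recorded at deposit."""
    return sha256_of_stream(stream) == recorded_digest


# Made-up data standing in for a deposited file.
original = b"station,temp_c\nA,12.5\n"
recorded = sha256_of_stream(io.BytesIO(original))

print(bits_preserved(io.BytesIO(original), recorded))         # unchanged copy
print(bits_preserved(io.BytesIO(original + b"x"), recorded))  # altered copy
```

Reading in chunks keeps memory use constant, so the same function works for large data files opened in binary mode.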
Suggested first steps
- proactive education
- carrots
  - mentors publish/share their data sets as model for next generation
  - provide a "cite as" button with data
  - service provision for data structure, integrity validation, and conversion
- sticks
  - publishers/editors require provenance information
  - editors and funding agencies encourage data sets to be published
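
A "cite as" button could be backed by something as simple as a function that turns a data set's metadata into a citation string. The sketch below is one possible shape, not a recommended standard: the field names, the citation layout, and every value in the example are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetCitation:
    """Metadata needed to cite a data set (hypothetical field set)."""
    creators: List[str]
    year: int
    title: str
    publisher: str
    identifier: str  # a persistent identifier, e.g. a handle


def cite_as(c: DatasetCitation) -> str:
    """Format a citation string in an assumed, illustrative layout."""
    if not c.creators:
        raise ValueError("at least one creator is required for credit")
    authors = "; ".join(c.creators)
    return f"{authors} ({c.year}). {c.title}. {c.publisher}. {c.identifier}"


# All values below are made up.
example = DatasetCitation(
    creators=["Doe, J.", "Roe, R."],
    year=2010,
    title="Example Temperature Readings",
    publisher="Example Data Archive",
    identifier="hdl:20.500.12345/example-dataset",
)
print(cite_as(example))
```

Requiring at least one creator ties the button back to the goal of assigning credit: a citation with no named creator gives no one credit.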