Availability of services
- storage as a service
- software as a service
- high-performance computing as a service
- distribution of services: agreements between "centers" so that one can take over if another ceases to exist
"Pipelines" of services
- provenance metadata becomes essential
- PREMIS is an existing metadata set that includes provenance metadata
- intermediate results may be worth storing as a new resource
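As a rough illustration of why provenance matters in a pipeline, the sketch below records one provenance event per service call and treats every intermediate result as a new resource. The names `Resource`, `run_step`, `lowercase` and `tokenize` are hypothetical and chosen only for this example; the event fields are an assumption and do not follow the PREMIS schema.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Resource:
    """A piece of data plus the chain of events that produced it."""
    content: str
    provenance: list = field(default_factory=list)

    @property
    def checksum(self) -> str:
        # Identifies the exact input a later step was run on.
        return hashlib.sha256(self.content.encode("utf-8")).hexdigest()


def run_step(resource, tool_name, tool_version, func, **params):
    """Apply one service and return a NEW resource with extended provenance."""
    result = func(resource.content, **params)
    event = {
        "tool": tool_name,
        "version": tool_version,
        "parameters": params,
        "input_checksum": resource.checksum,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # The input resource is left untouched; the result carries the full chain.
    return Resource(content=result, provenance=resource.provenance + [event])


# Two toy "services" (hypothetical stand-ins for real tools).
def lowercase(text):
    return text.lower()


def tokenize(text, sep=" "):
    return "\n".join(text.split(sep))


raw = Resource("Das ist ein Beispiel")
lowered = run_step(raw, "lowercase", "1.0", lowercase)
tokens = run_step(lowered, "tokenize", "0.3", tokenize, sep=" ")
```

Here `lowered` is an intermediate result that could itself be stored as a new resource, and `tokens.provenance` lists both steps in order, each with the checksum of the input it was applied to.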
What types of data are there? Can one generalize over all types, e.g. regarding the notion of "data publication"? Some types of data, e.g. in typology or field linguistics, are very time-consuming to produce, whereas other types may be generated in seconds.
Customizable editions of data
- tuning the way you look at the data to specific research needs: one view is not enough
- Wittgenstein example
- how to reference such a rendition? The PID needs to include all parameter settings, and the availability of the "tool" that produces the rendition needs to be guaranteed
- data mashups: user participates in creating data
- store mashups again as new data? Keeping provenance information about the sources is again essential
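One possible way to reference a customized rendition is to attach the tool name, tool version and all parameter settings to the PID of the underlying data in a canonical order, so that identical settings always yield the identical reference. This is only a sketch under assumptions: the functions `rendition_reference` and `parse_reference`, the query-string layout, the PID `hdl:0000/example` and the parameter names are all made up for illustration and are not part of any PID system.

```python
from urllib.parse import urlencode, parse_qsl


def rendition_reference(base_pid, tool, tool_version, params):
    """Build a reference from a base PID plus every setting of the view."""
    settings = {"tool": tool, "version": tool_version, **params}
    # Sort the keys so that the same settings always give the same string.
    query = urlencode(sorted((key, str(value)) for key, value in settings.items()))
    return f"{base_pid}?{query}"


def parse_reference(reference):
    """Split a reference back into the base PID and its settings."""
    base_pid, _, query = reference.partition("?")
    return base_pid, dict(parse_qsl(query))


# Hypothetical PID and parameters, for illustration only.
ref = rendition_reference(
    "hdl:0000/example",
    tool="viewer",
    tool_version="2.1",
    params={"normalized": True, "layer": "diplomatic"},
)
same_ref = rendition_reference(
    "hdl:0000/example",
    tool="viewer",
    tool_version="2.1",
    params={"layer": "diplomatic", "normalized": True},
)
pid, settings = parse_reference(ref)
```

Such a reference only remains resolvable as long as the named version of the tool stays available, which is the guarantee the notes above ask for.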
Privacy issues
- some technology exists to anonymize source materials, e.g. by masking, but it is often not automatic and sometimes not possible, e.g. for sign language data.
- becomes very complex with international access, since different countries may have different rules for protecting privacy. This may require country-specific licences and legal advice.
- DOBES example: legal specialists advised keeping all data closed. A Code of Conduct/ethical rules is the only workable solution, and some data needs to remain closed.