s20bar_2016_session10
Home: https://pad.okfn.org/p/s20bar_2016
Session 10
Title: Data formats for Open Science
Moderator: Johanna
Participants:
- Dierk Höppner
- Fidan Limani
- Athanasios
Notes:
Questions:
Which data formats are needed, and which are the right ones?
How can we provide sustainability (readability and editability)?
How to mark up provenance, privacy, policies, licences, and world regions in open research data?
Suggestion: use the Web of Data / linked data, not just the papers; semantics -> metadata
Semantic Web technologies are state of the art
Establishing structures within documents is much more advanced and crucial in biology -> for automatic extraction of information
Minimal standard: text-based structured data (XML, JSON); better: RDF triples
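A minimal sketch of the step from text-based structured data to triples noted above. The record, its example.org identifier, and the helper name to_ntriples are made up for illustration; mapping the JSON keys directly onto Dublin Core terms is an assumption, not something agreed in the session. Escaping via json.dumps is a simplification that works for plain strings like these.

```python
import json

# Hypothetical record, for illustration only.
record = {
    "id": "http://example.org/dataset/1",
    "title": "Example dataset",
    "creator": "Jane Doe",
}

# Dublin Core terms namespace; using it for every key is an assumption.
DC = "http://purl.org/dc/terms/"

def to_ntriples(rec):
    """Turn a flat record into N-Triples lines (subject, predicate, literal)."""
    subject = rec["id"]
    lines = []
    for key, value in rec.items():
        if key == "id":
            continue  # the id becomes the subject, not a statement
        literal = json.dumps(value, ensure_ascii=False)  # quote and escape the string
        lines.append(f"<{subject}> <{DC}{key}> {literal} .")
    return lines

print(json.dumps(record, indent=2))       # the "minimal standard" form
print("\n".join(to_ntriples(record)))     # the same facts as triples
```

The JSON form is readable on its own, but only the triple form says what "title" means by pointing to a shared vocabulary.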
How to store weather data? What are the formats?
Metadata are attached in post-processing, and only minimal metadata, because terabytes are generated in seconds.
In weather measurement projects every byte of XML would be too much storage; hence binary formats
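A rough sketch of the storage argument above, comparing one reading as XML text and as a packed binary record. The reading, its fields, and the XML element names are invented for illustration; real weather formats are more elaborate than this fixed layout.

```python
import struct

# Hypothetical reading: (Unix timestamp, temperature in degrees C, pressure in hPa).
reading = (1456790400, 12.5, 1013.25)

# The same reading as XML text; element names are made up.
xml_text = (
    "<reading><time>{}</time><temp>{}</temp><pressure>{}</pressure></reading>"
    .format(*reading)
)
xml_bytes = xml_text.encode("utf-8")

# "<Iff": little-endian, one unsigned 32-bit integer and two 32-bit floats,
# which is 12 bytes per reading.
binary_bytes = struct.pack("<Iff", *reading)

print(len(xml_bytes), len(binary_bytes))

# The binary record carries no field names, so the layout must be documented
# elsewhere (for example in metadata attached in post-processing).
print(struct.unpack("<Iff", binary_bytes))
```

The saving comes entirely from dropping the self-description, which is why the metadata has to live somewhere else.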
Discipline-independent metadata would be Dublin Core, followed by more domain-specific metadata, which are hopefully linked open data and addressable via identifiers
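As an illustration of the discipline-independent layer, a small sketch that writes a few Dublin Core elements as XML. The wrapper element "metadata", the function name dublin_core, and all field values are hypothetical; only the Dublin Core element namespace is taken from the standard.

```python
import xml.etree.ElementTree as ET

# Dublin Core Metadata Element Set, version 1.1, namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

def dublin_core(fields):
    """Serialize a dict of Dublin Core element names and values to XML."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")

# Made-up values; the identifier is what would make the record addressable.
xml = dublin_core({
    "title": "Example dataset",
    "creator": "Jane Doe",
    "identifier": "https://example.org/dataset/1",
})
print(xml)
```

Domain-specific metadata could then be added as further elements from other namespaces alongside the dc: ones.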
Libraries have a different experience: everyone uses different metadata standards
Metadata unification is a social process
Aspects of Long-term Preservation
Maybe store the data together with the software that generated them, so that emulation remains possible.
Dependency on hardware is hard to emulate
Need for data archaeology and data curators who, with the help of a Technology Watch, are able to
Recommended formats: LaTeX for the content (makes the structure searchable and machine-readable) and PDF for the appearance (more human-readable, stores all visual features...)
The safest data format is stone tablets