s20bar_2016_session13
Home: https://pad.okfn.org/p/s20bar_2016
Session 13
Title: Practicalities of data sharing
Moderator: Johannes Breuer
Participants:
- Andreas Leimbach
- Kendra Sticht
- Tamara Heck
- Thilo Paul-Stüve
- Konrad Förstner
- Johanna Kuhnert
- Jasmin Schmitz
Notes:
- Johannes presents his workflow
- How should I structure my data so that I can share it with others? Key question: what kind of information is relevant for them?
- It is important to plan ahead (see also the translation issue below)
- This also depends on the field: how can I abstract the model/system?
- International collaborations lack a common standard
- Translation issue
- Languages are very nuanced
- Comma as decimal or thousands separator (Germany vs. USA), e.g. 1.234,56 vs. 1,234.56
- Data formats
- In psychology, 90% use SPSS; sociologists use Stata
- What is the main goal of openness? Think about it before the project starts
- Reproducibility:
- Script
- Syntax for SPSS
- Comment your code, not only for others but also for yourself!
- The "why" is often missing!
- The more open the data, the better the documentation etc. has to be
- Literate programming
- Lower the hurdle to start with automation/scripting for data analysis: this is a matter of teaching
- To know how to search for help, you need a basic idea of the concepts
- There are a couple of good resources
- Curricula need to change to include statistics and programming
- Teachers need to have expertise
- Software Carpentry http://software-carpentry.org/
- Also offers excellent instructor training
- Learning to program requires hands-on, practical application
- Is it useful to share the workload and have dedicated data scientists, statisticians, bioinformaticians?
- Be careful with "pet bioinformaticians"; giving credit is important
- Data literacy and statistics are also important for public datasets
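
Illustration of the separator issue from the notes (Germany vs. USA): a minimal Python sketch, not a workflow that was presented in the session. It assumes a semicolon-delimited export with decimal commas, as is common for German locale settings. The function names `parse_localized_number` and `read_german_csv`, the column names, and the sample values are all hypothetical.

```python
import csv
import io
from decimal import Decimal


def parse_localized_number(text, decimal_sep=",", thousands_sep="."):
    """Convert a number written with explicit separators into a Decimal.

    Defaults follow the German convention (1.234,56). Pass
    decimal_sep="." and thousands_sep="," for the US convention.
    """
    if decimal_sep == thousands_sep:
        raise ValueError("decimal and thousands separators must differ")
    # Drop the grouping character first, then normalise the decimal mark.
    cleaned = text.strip().replace(thousands_sep, "").replace(decimal_sep, ".")
    return Decimal(cleaned)


# Hypothetical sample data: semicolon-delimited with decimal commas.
GERMAN_EXPORT = "participant;reaction_time_ms\nP01;1.234,5\nP02;987,25\n"


def read_german_csv(text):
    """Read the sample export and return rows with numeric values parsed."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [
        {
            "participant": row["participant"],
            "reaction_time_ms": parse_localized_number(row["reaction_time_ms"]),
        }
        for row in reader
    ]


if __name__ == "__main__":
    for row in read_german_csv(GERMAN_EXPORT):
        print(row["participant"], row["reaction_time_ms"])
```

Stating the separators explicitly, rather than relying on the locale of whoever opens the file, is one way to document the convention alongside the shared data. Note that a string such as "1,234" cannot be interpreted reliably without knowing which convention was used, which is why this belongs in the data documentation.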