
pro-iBiosphere_pilots_20131008 pro-iBiosphere workshop Berlin
October 8, 2013
http://wiki.pro-ibiosphere.eu/wiki/Workshop_Berlin_2_(8_October_2013)_-_Promote_and_foster_the_development_%26_adoption_of_common_mark-up_standards_and_interoperability_between_schemas

M4.2 Pilots: mark-up issues (pilots


Notes

09:35 - 9:40 Jeremy Miller: Spiders (pilot)
      Figures are all uploaded to MorphBank



09:40 - 9:45 Sylvia Mota de Oliveira: Bryophyta - Campylopus pilot
      Suriname is part of the Guianas, and data from Flora of Suriname and other literature sources for the region can be used as a platform for the completion of Flora of the Guianas. Mark-up and importation of data into CDM must guarantee that Flora of the Guianas supersedes the older Flora of Suriname.


09:45 - 9:50 Teodor Georgiev: Eupolybothrus Chilopoda



09:50 - 9:55 Donat Agosti: Ants

09:55 - 10:00 Quentin Groom: Chenopodium (phytogeographic records from literature)


10:00 - 10:05 Peter Hovenkamp: Nephrolepis - a revision of ferns containing treatments with varying detail and lots of synonyms


10:05 - 10:10 Tony Walduck: families Loranthaceae and Viscaceae


10:10 - 10:15 Don Kirkup: Flora of Tropical East Africa, etc.

10:15 - 10:20 Thomas Hamann: Mark-up of Flora Malesiana and Flore du Gabon using FlorML

10:20 - 10:25 Hong Cui: Charaparser


10:25 - 10:45 General Discussion
prioritization of literature


POINTS TO BE CONSIDERED

markup strategy

markup Costs
Scaling


Nomenclatorial issues

Post-processing

Schemas used


GoldenGATE
Not yet mentioned:
    Quality control of documents: how do we ensure that the marked-up content corresponds to the original source?
    How to handle errors in the original source document, for example, Tachgs for Tachys? This example error is described in an accompanying errata for the source document. Do we always want the mark-up to match the source, or use it as an opportunity to correct it ;-) ?
    Quality control: help the user to recognize/remember what has already been done, what still has to be done, and what would be nice to mark up. This could be listed in a small area with 3 columns (TO DO / DONE / NICE TO DO).
    Also, when saving the document, GG should display a pop-up like "hey, there is no nomenclatural tag in your treatment, you should improve that" or "you didn't mark up any citation or reference - are there none?"
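
The save-time check and the three-column list suggested above could look roughly like the sketch below. It is written in Python against plain XML, not as a GoldenGATE plug-in. The element names (treatment, nomenclature, citation, reference) and the function name review_treatment are hypothetical placeholders; the real names depend on the schema in use.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names, chosen for illustration only.
# A real schema will use different (and possibly namespaced) tags.
REQUIRED = {"nomenclature": "no nomenclatural tag in the treatment"}
NICE_TO_HAVE = {
    "citation": "no citation marked up",
    "reference": "no reference marked up",
}


def review_treatment(xml_text):
    """Sort mark-up checks into TO DO / DONE / NICE TO DO lists."""
    root = ET.fromstring(xml_text)
    # Collect every tag name that occurs anywhere in the document.
    present = {el.tag for el in root.iter()}
    report = {"TO DO": [], "DONE": [], "NICE TO DO": []}
    for tag, message in REQUIRED.items():
        if tag in present:
            report["DONE"].append(tag)
        else:
            report["TO DO"].append(message)
    for tag, message in NICE_TO_HAVE.items():
        if tag in present:
            report["DONE"].append(tag)
        else:
            report["NICE TO DO"].append(message)
    return report


# Made-up sample treatment: one citation, no nomenclature, no reference.
sample = (
    "<treatment>"
    "<citation>Example citation</citation>"
    "<description>Example description</description>"
    "</treatment>"
)
report = review_treatment(sample)
for column in ("TO DO", "DONE", "NICE TO DO"):
    print(column + ":", "; ".join(report[column]) or "-")
```

Which tags count as required and which as nice to have is an editorial decision. The split used here only mirrors the two example pop-up messages in the note.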


Use of encoded documents at other end of interchange
    Adequacy and completeness of markup for consuming application
    What is the measure of success? How is it measured?

Prioritization

Markup: How far to go?

Ontologies
Who is doing what?

Workflow
Tools

OCR
Implementation of workflow

Documentation as an issue

input literature

Next steps
PILOT


Interoperability
Define workflow
Use an OCR that recognizes all characters etc., i.e. accurate text capture -> farm this step out to professional vendors
Crowdsourcing for OCR
Propagate workflows in the public

markup through simple MSWord macros

increase the incentives to do the markup

Granularity

increased mark-up


13:15 - 14:45  M4.2 interoperability of mark-up schemas A105
  This section will present a brief overview and characterization of the exchange of data between our systems (Pensoft, Plazi, EOL, EDIT-CDM, GBIF, Antweb, HNS, KEW, Naturalis), together with an assessment of pros and cons and of where emphasis should be placed to enhance the exchanges, and will hopefully indicate a best practice.
Notes will be taken by all the participants using Etherpad http://new.okfnpad.org/p/pro-iBiosphere_integration_20131008
13:15 - 13:20   Introduction 
13:20 - 13:25   Patricia Kelbert: http://wiki.pro-ibiosphere.eu/wiki/Pilots#Pilot_3
13:25 - 13:30   Guido Sautter: [1] 
13:30 - 13:35   Thomas Hamann: Flora Malesiana-CDM
13:35 - 13:40   Don Kirkup: Flora of Tropical East Africa, etc.-CDM
13:40 - 13:45   Hong Cui: Charaparser
13:45 - 13:50   Markus Doering: published Materials Citation import into GBIF
13:55 - 14:00   Katja Schulz: Plazi / Pensoft - EOL (Darwin Core Archive)
14:00 - 14:05   Jordan Biserkov: Common query/response model
14:45 - 15:00  coffee break
15:00 - 16:30  M4.2 interoperability of mark-up schemas (ctd.: Solutions and steps forward)
This section will be used to discuss content, granularity and quality control issues: who on which side will be responsible for what. Elements of best practices will be developed.