
Journal Metadata Federation -- EADH AO Forum

Skype call with Daniel Zoller (Würzburg / Bibsonomy), 2017-08-22

English summary:
    
I have talked with the computer scientists in Würzburg and we have come up with a new idea for a lightweight technical solution. They run the "Bibsonomy" bibliographical data project, which we may be able to use. See it here: https://www.bibsonomy.org/
Importing: You can add publication data manually, via a DOI, or as a bulk upload of BibTeX files. They have so-called scrapers, which collect bibliographical data from websites, and we could pay a student to scrape or otherwise collect metadata on a one-off basis for the journals that are no longer active. For the active journals, the idea would be to analyze their RSS feeds regularly, identify new articles, and then scrape the metadata for import. This would work in a semi-automatic fashion.
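The "analyze the RSS feed, identify new articles" step could be sketched as follows. This is a minimal illustration using only the standard library; the feed content and the idea of keeping a set of already-imported article links are assumptions, not an existing part of the Bibsonomy workflow.

```python
import xml.etree.ElementTree as ET

def find_new_articles(rss_xml, known_links):
    """Return items from an RSS 2.0 feed whose <link> is not yet known.

    rss_xml: the feed as a string; known_links: set of article URLs
    that have already been imported into the federation's store.
    """
    root = ET.fromstring(rss_xml)
    new_items = []
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        title = item.findtext("title", default="").strip()
        if link and link not in known_links:
            new_items.append({"title": title, "link": link})
    return new_items

# Example with a minimal, made-up feed:
FEED = """<rss version="2.0"><channel>
  <item><title>Old article</title><link>http://example.org/a1</link></item>
  <item><title>New article</title><link>http://example.org/a2</link></item>
</channel></rss>"""

print(find_new_articles(FEED, {"http://example.org/a1"}))
```

In the semi-automatic workflow, the returned links would then be handed to a scraper (or a human) for full metadata collection before import.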
Metadata: They offer a lot of fields covering almost all the metadata we need (much more than plain Dublin Core, as in standard RDF). And you can define custom fields, which we could use for the abstract in the original language, for example. TaDiRAH keywords, if they exist, could be added using the normal keyword scheme. The back-end format is BibTeX.
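Since the back-end format is BibTeX, a record could look roughly like the following. The entry key and the custom field name `abstract_de` are illustrative assumptions; how exactly Bibsonomy stores custom fields and TaDiRAH-style keywords would need to be checked with the Würzburg team.

```bibtex
@article{baasner1999digitalisierung,
  title       = {Digitalisierung - Geisteswissenschaften - Medienwechsel?
                 Hypertext als fachgerechte Publikationsform},
  author      = {Baasner, Rainer},
  journal     = {Jahrbuch für Computerphilologie},
  volume      = {1},
  year        = {1999},
  url         = {http://computerphilologie.digital-humanities.de/jahrbuch/jb1/baasner.html},
  keywords    = {Digitalisierung, tadirah:creation},
  abstract    = {This would be the English-language abstract.},
  abstract_de = {Das hier wäre der deutsche Abstract.}
}
```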
Retrieval: They have an API (but it is read-write; there is no read-only option, which is a bit risky), as well as several nice plugins (for TYPO3 and WordPress at least) that allow you to run a query on Bibsonomy and display a nicely formatted bibliography on a website. Formatting uses the Citation Style Language (CSL), so another standard. That could be a way of letting journals display the metadata.
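Querying the API could look roughly like this sketch, which only builds the authenticated request (Bibsonomy's REST API uses HTTP Basic auth with a username and API key). The username, key, and tags here are placeholders, and the exact query parameters should be verified against the current Bibsonomy API documentation before use.

```python
import base64
import urllib.parse
import urllib.request

API_BASE = "https://www.bibsonomy.org/api"  # Bibsonomy REST endpoint

def build_posts_request(user, api_key, tags, fmt="json"):
    """Prepare an authenticated GET request for publication posts.

    The parameter names (resourcetype, tags, format) follow the
    Bibsonomy REST API as we understand it; since the API is
    read-write, credentials should be handled carefully.
    """
    query = urllib.parse.urlencode(
        {"resourcetype": "bibtex", "tags": " ".join(tags), "format": fmt}
    )
    req = urllib.request.Request(f"{API_BASE}/posts?{query}")
    token = base64.b64encode(f"{user}:{api_key}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req

# Hypothetical federation account and tags:
req = build_posts_request("edhjmf", "SECRET-API-KEY", ["dh", "journal"])
print(req.full_url)
```

On the display side, the TYPO3/WordPress plugins would take over; a journal outside those systems could fetch the same data with a request like the above and format it itself.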
This is not the professional RDF solution, but it is a lot more manageable. And I know the people behind Bibsonomy well; they will certainly be willing to help us, or even add a feature if we need it.

German notes


---------------------
Call for applications: work contract for a student

For the "European Digital Humanities Journal Metadata Federation" (EDHJMF) we are looking for a computer science student with an interest in web crawlers, metadata and Bibsonomy. The tasks in detail:

For the current holdings of three journals:
For one of the three and one additional, new journal:

The tasks are to be assigned as a work contract (Werkvertrag). Questions about further details, the process, and the scope of the contract can be clarified in a personal conversation. If interested, please contact Daniel Zoller (zoller@informatik.uni-wuerzburg.de) and/or Christof Schöch (c.schoech@gmail.com).

----------------------------


TODOs

Tasks for the student


List of metadata items
Basic metadata
Further metadata
=> It is actually not that straightforward to represent all of this in Dublin Core! Journal name, page numbers, DOI, TaDiRAH keywords, and explicit language versions are all not provided out of the box.
See this for a critique: https://reprog.wordpress.com/2010/09/03/bibliographic-data-part-2-dublin-cores-dirty-little-secret/

Journal article metadata in XML (hypothetical, non-valid example!!)

<?xml version="1.0" encoding="UTF-8"?>
<metadata
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:title>Digitalisierung - Geisteswissenschaften - Medienwechsel? Hypertext als fachgerechte Publikationsform</dc:title>
  <dc:title xml:lang="eng">Digitisation - Humanities - Media Change? Hypertext as an appropriate form of publication</dc:title>
  <dc:creator>Baasner, Rainer</dc:creator>
  <dc:creator>Buchsner, Raimund</dc:creator>
  <dc:subject type="free">Digitalisierung</dc:subject>
  <dc:subject type="TaDiRAH_goal">creation</dc:subject>
  <dcterms:abstract xml:lang="ger">Das hier wäre der deutsche Abstract.</dcterms:abstract>
  <dcterms:abstract xml:lang="eng">This would be the English-language abstract.</dcterms:abstract>
  <dc:publisher>mentis</dc:publisher>
  <dcterms:issued>1999</dcterms:issued>
  <dc:type>Text</dc:type>
  <dc:format>HTML</dc:format>
  <dc:identifier>http://computerphilologie.digital-humanities.de/jahrbuch/jb1/baasner.html</dc:identifier>
  <dc:identifier>doi</dc:identifier>
  <dcterms:bibliographicCitation>Jahrbuch für Computerphilologie 1</dcterms:bibliographicCitation>
  <dc:language xsi:type="dcterms:ISO639-3">deu</dc:language>
  <dc:rights>not specified</dc:rights>
</metadata>
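A consumer of such records would need namespace-aware parsing to pull out, for instance, the titles per language. A minimal sketch (the shortened record string here is illustrative, trimmed from the example above):

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace behind xml:lang

def titles_by_language(record_xml, default_lang="ger"):
    """Map xml:lang values to <dc:title> texts in a record like the one above.

    Titles without an explicit xml:lang are assumed to be in the
    record's original language (default_lang).
    """
    root = ET.fromstring(record_xml)
    titles = {}
    for el in root.findall(f"{{{DC}}}title"):
        lang = el.get(f"{{{XML_NS}}}lang", default_lang)
        titles[lang] = el.text
    return titles

RECORD = """<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Digitalisierung - Geisteswissenschaften - Medienwechsel?</dc:title>
  <dc:title xml:lang="eng">Digitisation - Humanities - Media Change?</dc:title>
</metadata>"""

print(titles_by_language(RECORD))
```

The same pattern would apply to the bilingual `dcterms:abstract` elements.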




Skype-Meeting, Christof and Max from SUB Göttingen (June 27, 2017)
Goal: get more information on practical implementation of collecting, storing and providing OAI-compatible metadata records about journals
Questions 
Outcomes


Skype-Meeting 16 June 2017 (13:00-13:45)

News from Human IT (editor: Mats Dahlström)

Reasons for delay in realising the Journal Metadata Federation

Other thoughts 

Solution so we can start already even before several current journals are online

Future of the disbursement model via the EADH AO forum
@Elisabeth: as far as I can see, you have not spent any of the money from the first arrangement yet. If I am not mistaken, we are already in a second phase (year). Could you draw up a budget and calculate expenditure?

Next meeting: in Montréal with Annika via Skype.


Description
Build a loose federation of interconnected journals rather than one large multilingual journal. That is, set up a small technical infrastructure to share metadata about current journal issues among the various European DH journals, so that each individual journal website can have a section displaying the current tables of contents of the other journals. This could of course be expanded to international DH journals. Why is this useful? Apart from the existing German journal "Zeitschrift für digitale Geisteswissenschaften" (zfdg.de), the French Humanistica is in the process of starting a journal on "Humanités numériques". So there appears to be a wish to have journals in a local language, something which gives visibility and recognition to DH in a local context and keeps a strong connection with the respective linguistic communities. But of course, it would be a pity if the linguistic communities were separated.
Automatically connecting all journals would give international visibility to all of them and demonstrate the (linguistic and other) diversity of the DH network. Many people who know one or two DH journals may not know the others, and this "journal federation" could counter that. The more journals join, the greater the positive network effect.

The main counter-argument I see right now is that this kind of metadata sharing could also be done manually, by simply sending the table of contents around to the other journals; after all, compared to getting an issue ready, this is not a big effort. An automatic process would ensure this exchange happens regularly and quickly, but has a higher technological overhead.

Note that the German journal would be very interested in this, and the (planned) French journal as well. The DHN doesn't have a designated journal yet. Discussions of the prior DHN board came to the conclusion that, if need be, the (Swedish) journal Human IT could be asked to 'host' DHN content. However, this year's conference and the discussions of the new board have shown that there is little interest in creating a Nordic (linguistically speaking) journal. One reason for this is the publication reward system of Norway, where new journals have to undergo a national evaluation process to be regarded as scholarly and then be voted for by a national committee to get 'high impact' status. Since DH is not a discipline in this system, there is no way to have a designated Nordic DH journal ranked highly enough to be an interesting publication outlet. The Swedish, Danish, and Finnish (and Icelandic) systems are somewhat different; however, most people prefer publication in English anyway and tend to publish with international, already established journals. (Annika)
How much would it cost? It all depends on how the journals manage their metadata in the back-end. If they have well-structured metadata available via an API anyway, then it shouldn't be too difficult. However, OJS, for example, does not appear to be perfect in this respect, at least not out of the box: http://forum.pkp.sfu.ca/t/rest-api-for-metadata/7949/3 I'll ask the German journal how difficult they believe this to be. (The French may or may not use revues.org for their journal, with OJS / Lodel in the background.) The second issue is how to distribute the metadata: the solution requiring the least conversion is probably to collect everything into a shared format and make it available for reuse by the other journals from there. That means running a small server that does this. Each journal can then pull the other journals' metadata into a page in whatever way it sees fit.
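The core job of that small server amounts to merging per-journal record lists into one shared feed that the others can pull. A sketch, in which the record fields and journal names are purely illustrative and the shared format is assumed to be JSON:

```python
import json

def merge_feeds(feeds):
    """Merge per-journal metadata lists into one shared feed, newest
    first, so each journal can pull and render the others' TOCs."""
    merged = [rec for feed in feeds for rec in feed]
    # ISO dates (YYYY-MM-DD) sort correctly as strings.
    merged.sort(key=lambda rec: rec["issued"], reverse=True)
    return merged

# Hypothetical per-journal inputs:
zfdg = [{"journal": "ZfdG", "title": "Beispielartikel", "issued": "2017-06-01"}]
hn = [{"journal": "Humanités numériques", "title": "Exemple", "issued": "2017-08-01"}]

shared = merge_feeds([zfdg, hn])
print(json.dumps(shared, ensure_ascii=False, indent=2))
```

In practice the server would convert each journal's native metadata (OJS, Lodel, or Bibsonomy output) into this shared format once, which is where most of the per-journal effort would go.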


Journals to involve
(Strategy: First make it work with several European journals, then expand beyond, e.g. DHQ.)


Functional requirements


Organisational details