
how-to-improve-DBpedia: Please add questions and feedback; they will be discussed during the session on Thursday 15th at 13:30

Meeting URL: http://wiki.dbpedia.org/meetings/Leipzig2016
Slides URL:  https://docs.google.com/presentation/d/1rte2VRrlS_N2yNTdnnOXbAZv3AwFYcIm26OhJFbMI5E/edit#slide=id.g1188047e5b_1_13


ADD YOUR IDEAS AND QUESTIONS BELOW

FUNDING STRATEGIES (see slides above)

JP: for all funding options: what would be the incentives/advantages/benefits offered to subscribers, etc.? What would be the impact on access to and usage of data resources?

Strategy 1 Public Fundraising


Strategy 2 Direct Fundraising
- KTP

Strategy 3 Membership fees
- add small personal gain (c.f. http://wiki.dbpedia.org/membership)
- a goal should be to increase memberships from industry

Strategy 4 Community/ Projects
- ERIC 
- copy the CLARIN  model
Enno: my colleague at Huygens, Gertjan Filarski, is very well informed about these EU funding possibilities; I'll check with him
- start infrastructure

NEW LANGUAGE CHAPTERS
- Japanese Chapter applied
- Babis and the Greek Chapter

NEXT DBPEDIA MEETING?
- early next year
- ideally not the Netherlands and not Germany
JP: Switzerland, at the EBU (condition is a participation fee to cover organisation costs)?
-> SH: the local host normally supplies the room for free, sometimes also food/coffee; alternatively a sponsor can be found. Switzerland is very expensive.
GR: Greece, Thessaloniki
-> SH: your proposal will go to the board


HOW TO IMPROVE DBPEDIA
Uptime

JP: the EBU would propose that DBpedia become the reference from which other content, such as Wikipedia, would be generated. This probably means a richer ontology with more properties. As an example, the EBU would be happy to contribute properties to provide more structured sport results (RDF/XML, Turtle or as text in a page) from which Wikipedia tables would be published. New templates should also be provided.

Uptime can be improved if we take the heaviest users off the public service. For example, packaging DBpedia in Docker and offering an easy way to configure it should help heavy users run it on their own infrastructure, thus freeing resources for occasional (and less skilled) users.
Freemium model: limit number of queries per IP (daily/monthly), which are offered for free. Additional queries require contract/subscription/contribution.
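
A minimal sketch of the per-IP daily limit, assuming an in-memory counter in front of the endpoint. The class name DailyQuota, the allow() method and the subscriber flag are hypothetical and only illustrate the idea; a real deployment would need shared, persistent storage and a decision on how to treat proxies and shared IPs.

```python
from collections import defaultdict
from datetime import date


class DailyQuota:
    """Count queries per client IP per calendar day (in-memory sketch)."""

    def __init__(self, free_per_day):
        self.free_per_day = free_per_day
        self._counts = defaultdict(int)  # (ip, day) -> free queries served

    def allow(self, ip, day=None, subscriber=False):
        """Return True if the query may run, False if the free quota is used up."""
        if subscriber:
            # Assumption: contract/subscription users are not limited here.
            return True
        day = day or date.today()
        key = (ip, day)
        if self._counts[key] >= self.free_per_day:
            return False
        self._counts[key] += 1
        return True
```

A monthly limit would work the same way with (ip, year, month) as the key.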

List of public DBpedia endpoints
Official tutorials on setting up a DBpedia mirror

Quality

JP: there are concerns about the editorial quality of Wikipedia, which may contain a lot of subjective content. If DBpedia becomes the source, this could allow better structuring of content to facilitate filtering by users. This would also greatly facilitate NLP processing.
JP: quality can be improved in terms of parsability. An example: an athlete's role would be better described as "role=athlete + sport=biathlon" than as "role=biathlete". This probably applies elsewhere.
JP: Persistence is an important issue
JP: versioning should also be well addressed to help updating by users
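
To illustrate the parsability point about "role=athlete + sport=biathlon", here is a small sketch with made-up records. The names A, B, C and the keys "role" and "sport" are hypothetical and are not actual DBpedia properties; the point is only that the decomposed form can be filtered generically.

```python
# Hypothetical records using a single combined role value.
combined = [
    {"name": "A", "role": "biathlete"},
    {"name": "B", "role": "swimmer"},
    {"name": "C", "role": "politician"},
]

# The same records with the role split into role + sport.
decomposed = [
    {"name": "A", "role": "athlete", "sport": "biathlon"},
    {"name": "B", "role": "athlete", "sport": "swimming"},
    {"name": "C", "role": "politician"},
]

# Combined values: finding all athletes needs a hand-maintained list of role names.
KNOWN_ATHLETE_ROLES = {"biathlete", "swimmer"}
athletes_combined = [r["name"] for r in combined if r["role"] in KNOWN_ATHLETE_ROLES]

# Decomposed values: one generic filter per property is enough.
athletes_decomposed = [r["name"] for r in decomposed if r["role"] == "athlete"]
biathletes = [r["name"] for r in decomposed if r.get("sport") == "biathlon"]
```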

There is no other way than aligning with Wikidata and making sure we bring together what is currently offered in the various languages.

Coverage

JP: as mentioned above, many Wikipedia pages with good content have no DBpedia equivalent. DBpedia should cover more, and the easiest way would be to help make DBpedia the source/reference (more properties and templates...)
@JP I'm afraid templates and properties are at the discretion of Wikipedia editors. No progress unless they decide to cooperate. What about extraction from free text?


(new) Services

Services like DBpedia Lookup / Spotlight
Mappings Wiki
JP: improve the SPARQL endpoint to facilitate natural language queries

Citation
Add links to original sources/research work. This can also be reflected in provenance.

NLP & DBpedia 
- NE hot topic
- two presentations from GSoC
- research should focus on bigger problems

Ontology Session
- task: research extracting data with a different schema and see if inferencing is possible
- Monika will choose an upper ontology or schema.org, create mappings to dbo
- Dimitris will temporarily replace the mappings to the new ones and rerun extraction to get data with the new schema

Wikipedia References
- quite comprehensive work on the Polish DBpedia, there will be a follow-up

Developer session
- overview about the dev process and the merging into the framework
- - DataID - Integration
- - RML- Integration



Actions:
- set up a task force for crowd-sourced coordinated hosting
-- rough figure from OpenLink: 99.99% uptime would cost around 50k/year
- Soeren will prepare a survey to assess the direction in which we are going
- new application: OKFN Greece, SWC, 
- suggestion to add a small personal benefit to membership or make it clearer
- add references to data, so it can be ingested by Wikidata
- next meeting: application from Greece, Thessaloniki, in spring
- add extraction framework to the discussion bucket
- provide checksums for files
- rethink bzip2 
- Dimitris to provide a short paragraph about the experiment with the mappings
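
For the checksum action item, a minimal sketch of how a published checksum could be produced and verified, assuming SHA-256 (the notes do not name an algorithm). The function names sha256_of_file and verify are hypothetical; only Python's standard hashlib is used.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Compare a downloaded file against the checksum published next to it."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

The digest is computed over the file as distributed, so it would stay valid for a given file regardless of which compression format is eventually chosen.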