This is a read-only archive of pad.okfn.org. See the shutdown announcement for details.

rpb-export: see https://github.com/refugee-phrasebook.

Automate export of the Refugee Phrasebook - technical pad

To learn more about the project, please visit http://www.refugeephrasebook.de/

For non-technical suggestions, please visit https://pad.okfn.org/p/rpb-suggestions

Users should be able to:
  1. open a site
    1. select the needed languages (use ISO codes, for easier parsing)
    2. select whether phonetics are needed
    3. select the paper size (A4, A5, A6)
    4. select the orientation (portrait, landscape)
    5. select single- or double-sided printing
    6. select Eco Mode (uses less toner, prints more pages)
    7. press Create
  2. get a PDF for printing
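The option set above could be captured in one small structure with validation up front; a minimal sketch in Python (the names are illustrative, not taken from the repository):

```python
from dataclasses import dataclass

PAPER_SIZES = {"A4", "A5", "A6"}
ORIENTATIONS = {"portrait", "landscape"}

@dataclass
class ExportOptions:
    """User-selected options for one PDF export (illustrative names)."""
    languages: list          # ISO 639-2 codes, e.g. ["deu", "ara"]
    phonetics: bool = False
    paper_size: str = "A4"
    orientation: str = "portrait"
    double_sided: bool = True
    eco_mode: bool = False

    def __post_init__(self):
        # Reject bad input before any LaTeX run is started.
        if self.paper_size not in PAPER_SIZES:
            raise ValueError(f"unsupported paper size: {self.paper_size}")
        if self.orientation not in ORIENTATIONS:
            raise ValueError(f"unsupported orientation: {self.orientation}")
        if not self.languages:
            raise ValueError("at least one language must be selected")
```

Pressing Create would then hand one validated `ExportOptions` instance to the PDF pipeline.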

ToDo (See https://github.com/refugee-phrasebook)

A quick conversion hack, scraping from Google Sheets: https://github.com/refugee-phrasebook/py_rpb

A quick fix would be some kind of shell script (Mac/Linux: bash + brew): manually download the book as a zip, unpack it, cat/cut/grep through the HTML (with options such as which languages to use), create .tex files (using tables?), and replace rows with suitable fonts:
  1. get the data from Google onto your hard drive
  2. parse the documents (normalize)
  3. add options for which languages to include
  4. create .tex files (table, tabular, ... what to use?)
  5. align according to writing direction (LR, RL)
  6. replace fonts in single rows (recommendations for language-specific fonts needed)
  7. run your local TeX distribution
  8. done :)
  9. ... upload the PDF to its destination
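Steps 1-4 above could also be sketched in Python rather than shell; assuming the sheet has been exported as CSV with one column per language (the column headers and file layout here are assumptions):

```python
import csv
import io

def phrase_rows(csv_text, languages):
    """Select the requested language columns from an exported CSV sheet.
    Columns are assumed to be headed with ISO codes ('eng', 'deu', ...)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        yield [row.get(lang, "") for lang in languages]

def latex_table(rows, languages):
    """Render the selected rows as a simple LaTeX tabular environment."""
    cols = "l" * len(languages)
    lines = [r"\begin{tabular}{%s}" % cols]
    for row in rows:
        lines.append(" & ".join(row) + r" \\")
    lines.append(r"\end{tabular}")
    return "\n".join(lines)

# Tiny example sheet with three language columns:
sheet = "eng,deu,ara\nHello,Hallo,مرحبا\n"
table = latex_table(phrase_rows(sheet, ["eng", "deu"]), ["eng", "deu"])
```

The resulting tabular block would then be pasted into a template and compiled with the local TeX distribution (steps 5-8).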

The discussions below look like a duplicate of what we (Sebastian and Ursula) also discussed.
The tools differ, but the brainstorming could come together via a Google Hangout, and an action plan could be devised from that.
Thoughts?
OK, the refugee-phrasebook IRC channel now exists on OFTC:
    http://webchat.oftc.net/?channels=refugee-phrasebook

  1. Webserver running Linux (preferably Debian/CentOS) with root access (for installing packages)
  2. Install LaTeX (ziegenhagen@gmail.com can help with that; TeX Live 2015 recommended)
  3. Create scripts that download (export) the Google Docs to an SQL server (sync) - UTF-8 - use the data already on GitHub
  4. Table layout for the SQL server: https://pad.okfn.org/p/rpb-backend
  5. Scripts that download the icons (as vector graphics: SVG or EPS; for LaTeX, PNG or PDF is preferred), or use a reference to the raw data files on the web in case the icons are changed
  6. Create a front-end with nice buttons
  7. Create scripts that match the pressed buttons with data from the DB and create LaTeX documents (normalize the data/syntax to catch errors)
  8. Feed back to the original data if normalization does not work -> change the text in the original (search & replace)
  9. Pay attention to left-to-right and right-to-left writing for alignment (LR, RL)
  10. Run pdfLaTeX or XeLaTeX and create a PDF for download (different TeX engines produce different output - needs testing on ready tex files)
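For step 7, the phrase data from the sheets can contain characters that are special in LaTeX, so some normalization is needed before documents are generated; a minimal escaping sketch (the exact rules the project needs are an assumption):

```python
# Characters with special meaning in LaTeX and their escaped forms.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}

def latex_escape(text):
    """Escape LaTeX special characters in a phrase. Input that still
    fails to compile should be reported back so the original sheet
    can be fixed (step 8)."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)
```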

Changes to RPB
  1. Match languages to ISO codes (an additional row is needed) - use the 3-letter version, https://en.wikipedia.org/wiki/List_of_ISO_639-2_codes (the 3-letter code alone is not enough)
  2. Match the phonetic version to its language (i.e. ger -> ger-phonetic)
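The mapping from sheet language names to ISO codes could live in one small table; a sketch with a few illustrative entries (the full list would come from ISO 639-2):

```python
# Illustrative subset; the full mapping would follow ISO 639-2.
ISO_CODES = {
    "German": "deu",   # the pad uses "ger" (the B code); "deu" is the T code
    "English": "eng",
    "Arabic": "ara",
    "Farsi": "fas",
}

def column_key(language, phonetic=False):
    """Return the sheet column key for a language, or its phonetic
    companion column (e.g. 'deu' -> 'deu-phonetic')."""
    code = ISO_CODES[language]
    return f"{code}-phonetic" if phonetic else code
```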
LaTeX experts
  1. Create layouts/templates
  2. Match languages with well-readable fonts (can be challenging; I don't know many fonts that are visually nice when different languages are mixed). Helvetica World could be a good choice (very expensive, however; we could ask Linotype - maybe they will license it and can give some recommendations on which font is most readable in a given language). I recommend using the Google Noto font family: http://www.google.com/get/noto/ (free, open font license)
-> use different fonts for different languages (every row (language) in a different font) - preferably CC-licensed
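With XeLaTeX, per-row fonts can be declared via the fontspec package; a sketch that generates one \newfontfamily command per language (the Noto family names below are assumptions and would need checking against the fonts actually installed):

```python
# Assumed Noto family names per ISO code; verify against installed fonts.
NOTO_FONTS = {
    "eng": "Noto Serif",
    "ara": "Noto Naskh Arabic",
    "fas": "Noto Naskh Arabic",
    "ell": "Noto Serif",
}

def fontspec_preamble(languages):
    """Emit fontspec \\newfontfamily commands, one per language,
    so each table row can switch to a suitable font."""
    lines = [r"\usepackage{fontspec}"]
    for lang in languages:
        font = NOTO_FONTS.get(lang, "Noto Serif")  # fallback family
        lines.append(r"\newfontfamily\%sfont{%s}" % (lang, font))
    return "\n".join(lines)
```

A row for Arabic would then start with `\arafont`, keeping the per-language font choice out of the table body itself.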

Questions:
Wouldn't a simpler workflow be:
  1. MediaWiki tables and text saved into file.mw
  2. (file.mw can be edited manually to introduce some extra info)
  3. Pandoc converts file.mw to PDF using LaTeX/LuaTeX/XeLaTeX in conjunction with a template
What do you think? Just a suggestion. But it would allow the current system of wiki tables to continue to be used, and not scare folks like me who come more from design and are a bit afraid of SQL.
-whoever wrote this, could you please test? thanks! ^mn
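The Pandoc route suggested above would reduce to roughly one command per export; a sketch that builds the command line (the template path is an assumption, and `--pdf-engine` requires a recent Pandoc - older versions used `--latex-engine`):

```python
def pandoc_command(source="file.mw", output="phrasebook.pdf",
                   template=None, engine="xelatex"):
    """Build the pandoc invocation converting a MediaWiki file to PDF.
    Run it with subprocess.run(cmd, check=True) once pandoc and a TeX
    distribution are installed."""
    cmd = ["pandoc", source, "-f", "mediawiki", "-o", output,
           "--pdf-engine=" + engine]
    if template:
        cmd += ["--template", template]
    return cmd
```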