This is a read-only archive of pad.okfn.org.
# Can Open Data Go Wrong
## Facilitator Contact Details
Lindsay Beck: @becklindsay
Alix Dunn: @alixtrot
Tin Geber: @tingeber
Mushon Zer-Aviv: @mushon
## Your session hashtag
## Liveblog [please forgive any typos!] by Lindsay Beck @becklindsay
Reuben Binns @RDBinns
Friedhelm Weinberg @WhyFrycek @huridocs
Janet Gunter @JanetGunter @RestartProject
Mushon Zer-Aviv @mushon
Podcast interview about the session and the topic with Tin & Mushon:
Concrete cases where it has gone wrong - we're introducing examples to start conversation about where open data has gone wrong
We'll open it to you to discuss when things have gone less than perfect
Start with Janet Gunter: case on land data (can exacerbate existing inequality)
Friedhelm: coalition of NGOs on FOI portal which may have misrepresented what they wanted to communicate (when opening data sets can backfire)
Reuben Binns: healthcare data debacle in UK
Mushon: opening a parliamentary dataset, "open washing" for Israeli gov't
Background in international development projects: you end up riding in trucks (pickup trucks)
Certain tools we use can convey our power and can be disturbing in different scenarios
data sets and data can be like Lexuses in development contexts
will be talking about northern Mozambique
when a car rolls up, people ask: what do the people in white trucks want? this can be big companies, who can roll into a traditional leader's walled compound for a closed-door conversation
community consultation within walls, and communities want to see a map of what land was discussed/given away
good = community wants a map. map = good, more maps are better
why are major donors interested in community mapping in a place that's empty on Google Maps?
[Good background on Mozambique at the moment, from the smallholder perspective: http://newint.org/features/2013/05/01/smallholders-last-land-keynote/]
why institutions want land data more available. could defend small farmers, but:
- Bhoomi case study: Bangalore
- Published in 2007, ethnography of what happened when Indian authorities in Bangalore wanted to digitise 20 mil land records
- IT companies wanted to get this data
- Read it: Solomon Benjamin, R. Bhuvaneswari, P. Rajan, Manjunatha, ca. 2007 http://casumm.files.wordpress.com/2008/09/bhoomi-e-governance.pdf
- Aimed to remove "messy" politics and deal directly with outside investors
- Be skeptical of the claim that development cannot be driven at the local level, and that local actors are naive.
- Played out in both Mozambique and India
- Takeaways: people are exposed to new risks, and this can happen quite quickly in their terms (1-2 years); when we open data sets, think beyond the "net positive": who is empowered by this activity?
- Work w/ 4 Georgian NGOs: opendata.ge
- not open data, but gathering FOI requests from 4 major NGOs
- covers a lot, but only focuses on FOI requests
- goal: make FOI more transparent, but also to pool information to ensure that same responses are given and cross-referenced
- ex: one request - categories, (public institutions have to respond within 10 working days), very structured information
- which public institutions are always late? on time? disparity between these agencies
- requests not equal = 1 question or 10 questions, easy vs. hard questions
- should time to respond differ?
- aggregation of this data very interesting. comparing performance and categories, but...
- ... fundamental problem: even NGOs don't agree on what a request is. if you really care about data, all of this data aggregation is bullshit. There was no agreement on what to use as performance indicators, and requests were different. But, you have to already work with those doing this work.
- What this teaches: we need to open ourselves up and determine if this means anything. Or, is aggregation really desirable?
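A minimal sketch of why aggregating unlike requests can mislead. All numbers, institution names and field choices below are made up for illustration and are not opendata.ge data; the only figure taken from the session is the 10-working-day deadline. It compares two possible "performance indicators" on the same requests.

```python
from collections import defaultdict

# Hypothetical FOI requests: (institution, number of questions, working days to respond).
REQUESTS = [
    ("Ministry A", 1, 4),
    ("Ministry A", 1, 6),
    ("Ministry A", 1, 9),
    ("Ministry B", 10, 14),
    ("Ministry B", 12, 12),
    ("Ministry B", 1, 5),
]

DEADLINE_DAYS = 10  # response deadline mentioned in the session


def on_time_rate(requests):
    """Share of requests answered within the deadline, per institution."""
    totals, on_time = defaultdict(int), defaultdict(int)
    for institution, _questions, days in requests:
        totals[institution] += 1
        if days <= DEADLINE_DAYS:
            on_time[institution] += 1
    return {name: on_time[name] / totals[name] for name in totals}


def days_per_question(requests):
    """Average working days spent per question asked, per institution."""
    days_sum, question_sum = defaultdict(int), defaultdict(int)
    for institution, questions, days in requests:
        days_sum[institution] += days
        question_sum[institution] += questions
    return {name: days_sum[name] / question_sum[name] for name in days_sum}


rates = on_time_rate(REQUESTS)
per_question = days_per_question(REQUESTS)
for name in sorted(rates):
    print(f"{name}: on time {rates[name]:.0%}, {per_question[name]:.2f} days per question")
```

In this toy data Ministry A looks best on the on-time indicator, while Ministry B looks best once the size of each request is taken into account. Which indicator is "right" depends on an agreed definition of a request, which is exactly what was missing.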
UK - National Health Service: general practitioners (GPs), who then refer to specialists.
some people can have the same GP, and an understanding of privacy.
data always with GP
2012 - Health and Social Care Information Centre was to aggregate this data from GPs to get other (...) to get this data and provide services
Data: red, amber and green. Red was PII but would be stored on their servers and not published
- Amber: "soft anonymisation": identity was replaced by "code" (pseudonymous) and linked to hospital visits, and linked throughout data sets
- Amber data was not processed according to best practices for this kind of anonymisation. It was occasionally also uploaded to Google BigQuery for analysis, or uploaded to Earthware and was publicly searchable until they realised the mistake.
- Hospitals gather this data; in a few cases people were given access through a governance framework; in one case it turned out they had uploaded it to Google BigQuery for analysis and uploaded it to http://www.earthware.co.uk/, where it was publicly searchable
- hospital data is mostly visits, etc.; not a ton of data
- Use of "open data" label which doesn't apply to pseudonymous data which is paid for; OD community needs to be careful about how terminology is used
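A minimal sketch of why "soft anonymisation" with a stable code is fragile. This is not the method actually used for the NHS data; the salted hash, the field names and every record below are hypothetical. It shows that a code which is the same in every data set lets the data sets be linked, and that a few facts an outsider already knows can then single out one person.

```python
import hashlib

SECRET_SALT = b"hypothetical-salt"  # assumption: one key reused for every data set


def pseudonym(patient_id: str) -> str:
    """Replace an identifier with a stable code (same input, same code)."""
    return hashlib.sha256(SECRET_SALT + patient_id.encode()).hexdigest()[:12]


# Two made-up data sets that share the same pseudonymous code.
gp_records = [
    {"code": pseudonym("patient-001"), "birth_year": 1975, "postcode_area": "N1"},
    {"code": pseudonym("patient-002"), "birth_year": 1988, "postcode_area": "E2"},
]
hospital_visits = [
    {"code": pseudonym("patient-001"), "admitted": "2013-03-04", "ward": "maternity"},
    {"code": pseudonym("patient-002"), "admitted": "2013-06-11", "ward": "orthopaedics"},
]


def link(gp, visits):
    """Join the two data sets on the shared code, as a linked release would allow."""
    by_code = {row["code"]: row for row in gp}
    return [{**by_code[v["code"]], **v} for v in visits if v["code"] in by_code]


def reidentify(linked_rows, birth_year, postcode_area, admitted):
    """Find linked rows that match facts an outsider might already know."""
    return [
        row for row in linked_rows
        if row["birth_year"] == birth_year
        and row["postcode_area"] == postcode_area
        and row["admitted"] == admitted
    ]


linked = link(gp_records, hospital_visits)
matches = reidentify(linked, 1975, "N1", "2013-03-04")
print(len(matches), "matching record(s)")
```

No name or original identifier appears anywhere in the linked rows, yet a birth year, a postcode area and an admission date are enough here to narrow the data down to a single record.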
Mushon Zer-Aviv - The Public Knowledge Workshop:
Slides here: http://mushon.com/blog/2014/07/29/when-data-goes-wrong/
things happening in Israel are worrisome, and will not discuss in this session, but can talk 1-1 later
The Public Knowledge Workshop is a volunteer-based NGO, more than 100 vols that work on diff projects. Don Quixote of Israeli democracy - fighting a lot of windmills
open budget: 2010 by Adam Kariv to get gov't budget open. collab with Israel's .gov website, first of this open data initiative
lots of data on site, celebrate as gov't transparency, etc.
what this tells me is that budget is open, but you're stupid. if you don't understand, that's your problem. not the gov't's.
so, is it really open? data set and basic pie charts are open?
showed NYT Obama budget visualization (2013 budget) - can actually understand this budget based on the visualization
was convinced to model theirs after this - rewrote code and open sourced it, worked with Calcalist online newspaper. got a lot of news coverage for it, people looking at data, went to Israeli parliament and showed it, used viz as table of contents for the budget
is it open enough? but what they realized, it's not the full story. As soon as budget is signed off, 10% of budget starts getting shuffled. and this is the most interesting data.
comparing one line to another, and they didn't have transaction data.
oversight: get a bunch of files at vote meeting, they can't question it, just vote on it.
- the cool visualisation in a way distracted the public from the real information, it was a fake achievement
He has a problem with use of the term "evidence" - might be beautiful or ugly. Visualization does not equal evidence. Masquerades as evidence. Viz is speech, or beautiful argument.
- one MP put pressure on not voting on a budget without more requests available and time to review
- Finance ministry gave some more review time. She then filed Supreme Court case against MF.
- MF then turned to Public Knowledge Workshop - had to transparency wash. They played good cop bad cop in the scope of Supreme Court case. may release information 3-4 days ahead of case.
- People can get alert about transactions and when they are happening. Working on new website that can be more communicative and full context of what's happening in these transactions. Trying to work to make budget transparent by design.
- They are in a position to discuss budget transparency.
- Now what? we are software devs, not financial analysts. who are we to define discussions on budget? we have a bias toward things that we can quantify. if it can't be quantified, doesn't mean it's not important. Trying to work with more professional entities to get more knowledge, and accepting that not everybody can know everything needed to do a dataset.
- "disinformation visualisation"
- Edward Tufte: Beautiful Evidence. Also, official hater of PPT and pie charts
- Galileo - drawing of sunspots, "delight by the wonder of the spectacle and accuracy of expression".
More on this point: https://visualisingadvocacy.org/blog/disinformation-visualization-how-lie-datavis
Too excited about quantification and aggregation - the way we speak is the way we argue, data that can't be talked back to is useless or even dangerous
*Chatham house rules*
policy for a government on comps and internet
charging for open data - like banging on the door, or charging through the door and smashing up what's behind it
wrote loads of policy papers - one on OD as right or gift
seen as gift by most ministers
gov't give rights to life, etc. but "gifts" are like bribes.
we stop giving gifts, and will ensure a right to OD.
s/he sent it off, and was contacted by civil servant and asked about right to data leg. they had sent paper to parl and was already voted on without him/her being notified about it
civil service gives right = put it in leg but it won't do much more
it's a component on FOI and changed the law of the land
gov't can get so enthusiastic and change law in stupid ways because they don't understand OD as well as we do
data being open and responsibility about people reading
science story trending on reddit about "farts changing DNA and can cure cancer", yet sci study says none of this
worked with investigative journalists and providers of data from dangerous countries
seen cases where mishandled open data can kill, and endanger sources
int'l agency - data published, and creation of new map data
entered into agreement with large internet co. to create maps and make it more available
means, that people use this tool and endorse use of tool that isn't inherently open
open data efforts to create closed data sets
internet blowback on this
they backtracked on agreement and used only open mapping platforms
to think that tech fix, anonymize data better, etc. is oversimplifying
what makes us as a society is the idea of social security
the risk of losing a leg or land is equal
the more we know is that it's not true
the more we know, the less we are a society.
big debate in UK about sharing tax data - who has access, etc.?
promise of gov't transparency of clinical data to work with gov't on new policy
many orgs did, things seemed fine but likely too optimistic
finally, in May of this year, after negotiations on the policy, with a hardcore pharma lobby behind closed doors, they changed this policy and said this data was covered by IP rights, and limited NGO access to this data. it's screen-read only, can't be downloaded, and it is 10s of thousands of data sets, heavy TOS
pharma industry was able to produce 2 versions - version a which is complete, and version b which was heavily redacted
started online campaign and nearly managed to get over the screen read only, but under the frame that it's a false victory. but, clinical research and public access is still not done.
if you don't keep the ball rolling and checking up the consumption of the news, this is pointless
Gianfranco Cecconi, @giacecco, Digital Contraptions Imaginarium Ltd.
Lindsay Beck, @becklindsay, Program Manager, Open Technology Fund
## Participants: pre-event, to get in touch with each other (feel free to add your Twitter handle)
## Agenda + pre-festival materials, resources, instructions
AT THE FESTIVAL
## Participants - name, contact (if you want to leave it), number of attendees
## Notes from the session
AFTER THE FESTIVAL
## What did you learn and/or make?
## How/what could you teach others?