We need more voices in this debate about data. Interview with Catherine d'Ignazio.

2016-06-22

Aleksandra Jach: How can we represent the complexity of data and still communicate effectively? Is it possible to combine those two aims? These were the issues I was thinking of while reading your blog post “What would feminist data visualization look like?”

Catherine d'Ignazio: The post came from my observations during my work at Emerson College and the MIT Media Lab. I spend a lot of time teaching journalism and communication, and also teaching art students. One of the things that I see with those novices, especially those who don't consider themselves technical, is the rhetorical power of data visualization. As data visualization becomes a more mainstream form of communication, we also have to be careful with the aura of truth that visualizations often invoke. A deeper understanding of the ways photographs are produced and the ways they can be manipulated means we have a more sophisticated understanding of photography. It is now not so difficult to say whether a photo can actually prove something or not. I think that even the basic understanding of data visualization has become more sceptical. When you see a photo, you do not just assume that it is true, because it depends on where you see it, what the delivery mechanism is, and so on. It matters whether it is in the New York Times or on Buzzfeed.

The main issue I was wrestling with in this blog post was that we have to be sensitive to the way in which data visualizations leave out all of the complicated aspects of how they are produced, of who is producing them, and of why they have been produced. That's troublesome to me. Of course, I don't want to say that we should never create beautiful, effective visualizations, but one of my main goals is to teach data scepticism at all stages, not only in terms of visualization. Journalists often end up getting data from open data feeds or from governments, through APIs for example, and they often regard it as the truth. But it is not. Data sets are representations made by a particular set of stakeholders who decided to catalogue the world in a very particular way and for a particular reason. They contain many gaps, depending on how the data was collected and what it was intended to do.

I don't have a straightforward answer, but I believe that these two things can go hand in hand. On the one hand, I teach students how to communicate effectively; on the other hand, I teach them to attend to the power of data, particularly when writing for non-technical audiences. If you are somebody who works with data, if you are a statistician or a data scientist, you have actually been trained in scepticism. These people have a much more sophisticated way of looking at data, of critiquing a visualization, a spreadsheet or a data set. Laypeople don't have those tools yet.

Aleksandra Jach: They don't know how to read the context in which data is located...

Catherine d'Ignazio: This is a structural problem. For example, the open data movement is publishing data online, but the issue is that the data is often decontextualized. At the same time, people are starting to develop models to provide more context. Some institutions provide data dictionaries or user guides that accompany data visualizations. I think that the biggest problems with data stem from a cultural fallacy: the idea that data just speaks for itself. Once we have the numbers, we just crunch them and they tell us what is going on. But you don't know how to begin processing or interpreting the data without deeply contextualized knowledge.

Aleksandra Jach: How can this context be added? Should it be included in metadata?

Catherine d'Ignazio: We need more robust metadata and more narrative accounts as well. A data dictionary is a good example of metadata: it literally goes through each field in the data set. But I think we actually need a narrative. I'm always telling my students: whoever's data you are using, call them, talk to them on the phone. Often it is very difficult to convince these people, because obviously they don't want to spend all day on the phone, but I believe that narratives, or ethnographic kinds of accounts, are important. The other problem arises when an outsider comes to the data set. If you are an insider, you have this deep contextualized knowledge and you know how to work with the data. Because we are publishing more data streams, and because there is an imperative to work with data, a lot of people are coming as “strangers” or “foreigners” into the data field, and sometimes you need a kind of acculturation process for that.

(Laughs)

Aleksandra Jach: If you want to include complexity in a data visualization, you make a political statement; you cannot include everything, so you select specific factors. What do you think about the politics of data visualization?

Catherine d'Ignazio: There is always some kind of politics at play. We should treat data visualization as we treat any other form of communication. It is a message, and it has been sculpted and designed. Of course, it might have very good intentions behind it. It can be something that draws your attention to an issue that urgently needs highlighting in the world. But when we talk about more self-reflexive data visualizations, I do think that in some sense they should be more transparent, for example by acknowledging who the stakeholders of this specific data are, who produces it, and what the intention behind it is. It doesn't mean that the politics goes away. Self-reflexive data visualization can focus our attention on information not as an output, but by drawing us back into the process. It's funny, because I feel two ways about it. On the one hand, the power of a tool that works, such as data visualization or maps, can be leveraged in the service of desirable outcomes (social justice, for example). I think that there are multiple ethical ways to deal with it. So one track is that you are just leveraging: data visualization has this rhetorical power, so we will exploit the hell out of it and make it really convincing.

The other track is when you call the visualization back and speak more clearly about the intentions behind it: who creates it, why they are doing this, and what the unknowns are. I actually think that we don't have a good model for that kind of representation, even in the visual language.

The third way you can go, and something that appears in my art practice, is to take a given data visualization which has already been produced and un-visualize it, re-configure it. That's more what I would call “humanistic critique”, but I think it is still relevant and brings us back to the question of how we can communicate more ethically.

Aleksandra Jach: Why have you chosen feminism as an ideological and political frame for talking about data visualization?

Catherine d'Ignazio: I thought about using the term “critical data visualization”, because there is a whole conversation emerging across different disciplines around critical data studies. Going back to the blog post, I was interested in Donna Haraway's work, in the idea of “situated knowledges”, which comes from the feminist tradition. In that article she is trying to say that super-empiricist, rationalist models of science produce one kind of objectivity, which often excludes other perspectives: those of women, of people of colour, and so on. On the other hand, Haraway doesn't want to say that everything is relative. There is such a thing as truth, but there are multiple truths. What you see depends very much on where you are situated in the world, for example, where your body is. That really resonates for me with the question of how we can be more self-reflexive about producing knowledge in the world. The body is usually absent from data visualization. Visualizations often look like they have become (?) a weirdly disembodied eye which looks at me. So how do we bring the body back to data visualization? Another reason why I decided to use feminism as an ideological frame was that I knew people would be provoked by it, so that was purely rhetorical.

(Laughs)

Aleksandra Jach: Why is the discussion about ethics and data so important now?

Catherine d'Ignazio: I think this is the most important question right now. There are basic structural inequalities in the collection of data, in the storing of data, and in the analysis of data and its production and interpretation. The capability of doing this is very unequally distributed. Alas, by and large it is huge corporations and governments who collect and maintain a lot of that data. We have elite technocrats who process the data and then produce the representations in the world. Those things are democratizing, and we are seeing movement in that space, and people are discussing how to get more people up to speed with data. But still, there is so much rhetoric about data visualization as a powerful form of communication and something that is “more true”. Also, people should be able to understand the reasons why data is collected, and to interpret the particular interests that are behind it (predictive policing, algorithmic discrimination).

We need more voices in this debate about data. We need more people from journalism, education, law and industry accountability involved in this conversation. Right now technology is really far ahead of the common person's understanding of it. My goal is just to push more people into this. That is why ethics is so important right now, because the more we offload to data, to algorithms, to more complex computer processing, the more power we are ceding to technocrats, corporations and governments, and that might not necessarily be the best outcome for everybody.

Catherine D’Ignazio is an Assistant Professor of Data Visualization and Civic Media at Emerson College who investigates how data visualization, technology and new forms of storytelling can be used for civic engagement. Professor D'Ignazio has conducted research on geographic bias in the news media, developed custom software to geolocate news articles and designed an application, "Terra Incognita", to promote global news discovery. She is working on sensor journalism around water quality with PublicLab, data literacy projects and various community-educational partnerships with her journalism students. Notably, she co-organized a hackathon at the MIT Media Lab called the "Make the Breast Pump Not Suck!" Hackathon. Her art and design projects have won awards from the Tanne Foundation, Turbulence.org, the LEF Foundation, and Dream It, Code It, Win It. In 2009, she was a finalist for the Foster Prize at the ICA Boston. Her work has been exhibited at the Eyebeam Center for Art & Technology, Museo d’Antiochia of Medellin, and the Venice Biennale. Professor D'Ignazio is a Fellow at the Emerson Engagement Lab and a Research Affiliate at (and alumna of) the MIT Center for Civic Media.

For more information: http://www.kanarinka.com/

2016

the anthropocene index

commons
