After attending so many talks on Open Data last week, I feel conflicted and yet compelled to share my data. I have hundreds of thousands (perhaps millions, if I managed to parse out one particularly problematic file) of lines of chat data from my ethnography in World of Warcraft. Can you imagine how valuable that would be to data mine?
Yet, I am reserved about sharing. Mostly because the dominant discourse around Open Data does not exactly pertain to my data. Here are some reasons why:
1. I don’t own the chat logs. Blizzard’s terms state:
Subject to your agreement to and continuing compliance with the Blizzard Agreements, Blizzard Entertainment hereby grants, and you hereby accept, a limited, revocable, non-transferable, non-sublicensable, non-exclusive license to use the Service solely for your own non-commercial entertainment purposes by accessing it with an authorized, unmodified Game Client. You may not use the Service for any other purpose, or in connection with any other software.
All title, ownership rights and intellectual property rights in and to World of Warcraft (including without limitation any user accounts, titles, computer code, themes, objects, characters, character names, stories, dialogue, catch phrases, locations, concepts, artwork, animations, sounds, musical compositions, audio-visual effects, methods of operation, moral rights, any related documentation, “applets” incorporated into World of Warcraft, transcripts of the chat rooms, character profile information, recordings of games played on World of Warcraft, and the World of Warcraft client and server software) are owned by Blizzard Entertainment or its licensors. World of Warcraft is protected by the copyright laws of the United States, international copyright treaties and conventions, and other laws. All rights are reserved. World of Warcraft may contain certain licensed materials, and Blizzard Entertainment’s licensors may protect their rights in the event of any violation of this Agreement.
So I do not own the chat transcripts. I have a license to use them for non-commercial purposes, which I have done, but I cannot transfer them.
Barring that, though, I would endeavor to obtain a license to open my data, except:
2. I didn’t tell my participants that I would release their data. When I started this project, I didn’t even know that the Open movement was a thing. (It was 2007.) When I talked to my participants and told them about the project, I always said that I was the only one who would see their data and I asked for permission to use their stories. I kept a notice in the guild’s information and on the forums that I was collecting the chat logs and talked directly to as many people as I could, but I never said anything about sharing. Confidentiality was a big thing, and one of the reasons that I think my participants opened up to me.
Which leads to…
3. There is an ethical question here about personally identifiable information. Sure, it’s not your medical history or your address, but a lot of my participants revealed very real and very personal stories to me. When I wanted to re-tell their stories in my writing, I always tried to ask, and when I interviewed them directly, I explained at the very beginning that I was collecting stories about their experience to share. I could scrub the data of that personal information (it would be hard and take a lot of time, but I could do it), but I really don’t know what my participants might consider personal. Something I might consider confidential, they might not, and vice versa. They trusted me with those stories and secrets, and that trust was the only way that some of them would open up to me. I have to cherish that and protect it.
Just because I don’t (or can’t, perhaps) share my data does not mean I won’t share the results openly. I try to make my writing as accessible as possible (while still being up to academic standards that are required for publication), and even though some of my earlier work is in closed-access journals, I have made versions available openly on my institutional repository.
So when we talk about “Open Data”, we need to consider data sets that contain not only sensitive medical information but also socially confidential information. Is it ethical to share those words? This is a question perhaps unique to the social sciences and humanities, and we must consider it.