Module 3.2: Protecting Privacy
While the open data movement is premised on the value of transparency and availability of data for innovation and accountability, a commitment to openness should be accompanied by respect for the privacy of the individuals who are represented in data. In this module, we explore privacy in the context of open civic data.
When should we not collect or share data for privacy reasons?
What are examples of strategies to protect the privacy of individuals represented in civic data?
The librarian wants to share the data about Wi-Fi access before and during the pandemic, but they are concerned about protecting patron privacy. While no patron names were collected, they see that personally identifiable information, including browser history, usernames and passwords, emails, and data entered into online forms, is part of the data set. The librarian contacts the library’s IT team to make sure these data are removed once users log off.
Overview: In Module 3.2, we discussed assessment and mitigation of risk to personal privacy when sharing open civic data. In this activity, we will evaluate the risk associated with publishing an example dataset – library WiFi data – and consider approaches for de-identification of the data.
To guide the privacy review, we will use a tool developed by DataSF, the city of San Francisco’s open data initiative, that we adapted for this module. Through this exercise, we will reflect on risk and potential harms of data sharing, weigh approaches for mitigation of risk to the people represented in the data, and build experience with using a risk assessment tool.
This activity can be done individually or in small groups.
Supplies: Printed or digital copies of the activity handout
Time: 30 minutes
1. Set-Up
If working in a group setting, divide into groups of two to three.
2. Situate yourself in a data sharing scenario
You are a Public Services Librarian in a public library system. Following an ecosystem mapping activity and engagement with other data intermediaries, your library has determined that it will participate in your local open data ecosystem as a data publisher. This involves identifying library data for sharing and preparing the data for release in a local open data portal.
For the purposes of this exercise, imagine that you have received previous requests from local government employees for access to library WiFi usage data. They have communicated that this data supports their work in advocating for funds for public networks. In addition, you have used a strengths-based approach to understanding local data needs (Module 2.3) and have identified WiFi data as valuable to the digital equity efforts of community organizations.
At the time of your review, access to your library’s WiFi network requires users to enter their library account number (located on their library card).
Working with your library's IT team, you must now assess the WiFi usage data to determine whether there are privacy concerns and, if there are, how you will address them.
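One way to make this kind of assessment concrete is to count how many records share the same combination of potentially identifying fields. This is a k-anonymity-style check, a common de-identification heuristic that the module itself does not prescribe. The sketch below is illustrative only: the field names (branch, date, device_type) and the sample records are hypothetical and are not taken from the activity handout.

```python
from collections import Counter

def smallest_group_size(records, fields):
    """Return the size of the smallest group of records that share
    the same values for the given fields."""
    groups = Counter(tuple(r[f] for f in fields) for r in records)
    return min(groups.values())

# Hypothetical WiFi session records (invented for illustration).
records = [
    {"branch": "Main", "date": "2020-03-02", "device_type": "laptop"},
    {"branch": "Main", "date": "2020-03-02", "device_type": "laptop"},
    {"branch": "Main", "date": "2020-03-02", "device_type": "phone"},
    {"branch": "East", "date": "2020-03-02", "device_type": "phone"},
]

# Grouping by date alone puts all four sessions in one group.
k_coarse = smallest_group_size(records, ["date"])

# Adding more fields produces groups with a single session,
# which are easier to link back to one person.
k_fine = smallest_group_size(records, ["branch", "date", "device_type"])
```

A smallest group size of 1 means at least one record is unique on those fields. The more fields a release includes, the more likely that becomes, which is one reason to weigh each field's value against its risk.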
3. Complete the Privacy Review Form [using the Activity Handout]
Your library has adapted a tool by DataSF to guide standardized privacy assessments. With your team, review the data for privacy considerations, using the form and additional guidance in the form’s appendix.
The form has three sections or steps:
Step 1: Identify personally identifiable and sensitive data
Step 2: Complete a Privacy Risk Assessment
Step 3: Determine privacy approaches
For some of the questions within the steps, you can refer to the Form Appendix for guidance that will help you to make your assessment.
“Think aloud” as you work on your form, talking through your assessment in your group. You will ultimately be making an assessment on whether the library WiFi data should be released through the local open data portal and what de-identification approaches are needed, if any, based on the data fields described on the form.
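As one illustration of what a de-identification approach might look like in practice, the sketch below drops the library account number, aggregates sessions to a count per branch and day, and suppresses small counts. It is a minimal example under stated assumptions, not the method from the DataSF tool: the field names, the sample records, and the threshold of 3 are all hypothetical, and your own review may point to different approaches.

```python
from collections import Counter

# Hypothetical raw session records (invented for illustration).
sessions = [
    {"account_number": "A1001", "branch": "Main", "date": "2020-03-02"},
    {"account_number": "A1002", "branch": "Main", "date": "2020-03-02"},
    {"account_number": "A1003", "branch": "Main", "date": "2020-03-02"},
    {"account_number": "A1001", "branch": "East", "date": "2020-03-02"},
]

def aggregate_sessions(records, min_count=3):
    """Count sessions per (branch, date), leaving out account numbers.

    Counts below min_count are replaced with None (suppressed) so that
    very small groups are not published. The threshold is an assumption.
    """
    counts = Counter((r["branch"], r["date"]) for r in records)
    rows = []
    for (branch, date), n in sorted(counts.items()):
        rows.append({
            "branch": branch,
            "date": date,
            "sessions": n if n >= min_count else None,
        })
    return rows

published = aggregate_sessions(sessions)
```

In this example, the Main branch row keeps its count of 3, while the single East branch session is suppressed. The published rows contain no account numbers, yet they still show usage by branch and day for the advocacy and digital equity uses described in the scenario.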
4. Reconvene and debrief as a group [if applicable]
If completing this exercise in a group, reconvene and discuss the experience of working through the assessment.
Consider, for example:
Were there differences of opinion within your small group? Were there questions on which it was challenging to reach agreement?
What was your small group’s ultimate decision regarding data sharing and privacy approaches?
Did all groups reach the same decision? What differences do you observe among the groups' assessments (for example, different assessments of value or different de-identification methods)?
The Data 101 series by the Western Pennsylvania Regional Data Center and the Carnegie Library of Pittsburgh provides an introduction to data literacy and open educational materials. This series approaches data education through paper-based and low-tech activities, aiming to build confidence with data concepts. Members of the Data 101 Team are on the Civic Switchboard Literacy Team.
In support of Module 3.2: Protecting Privacy, we recommend completing the Data 101 activity "When to Share?" This activity encourages us to consider how our relationship to data may impact sharing and approaches to privacy. Use the facilitator activity guide and accompanying worksheet to complete this activity.