Sunday, May 17, 2009

CyLab Seminar Series Notes: User-Controllable Security and Privacy -- Norman Sadeh asks, "Are Expectations Realistic?"


"As we all realize on a daily basis, application developers have great expect- ations about what we users are capable of doing. They expect us to be able to properly configure the firewall on our home computer and virus settings on our cell phone. As enterprises move towards more agile and decentralized business practices, developers also expect us to configure increasingly complex access control policies at work. Are these expectations realistic? If they are not, how much trouble are we in and what can we do about it?"

[NOTE: CyLab's weekly seminar series provides a powerful platform for highlighting vital research. The physical audience in the auditorium is composed of Carnegie Mellon University faculty and graduate students, but CyLab's corporate partners also have access to both the live stream and the archived content via the World Wide Web. From time to time, CyBlog will whet your appetite by offering brief glimpses into these talks. Here are some of my notes from a talk delivered by Norman Sadeh on 3-16-09. Sadeh's team of collaborators in this important research includes faculty members Jason Hong, Lorrie Cranor, Lujo Bauer, Tuomas Sandholm, postdocs Paul Hankes Drielsma, Eran Toch, Jinghai Rao, and PhD students Patrick Kelley, Jialiu Lin, Janice Tsai, Michael Benisch and Ram Ravichandran. -- Richard Power]

Can users be expected to effectively specify their policies? Do people even know what policies they want or need? Even if they did, could they articulate these policies? What if policies evolve over time? Are we always willing to invest enough time to have perfect policies, or are there important trade-offs between user burden and policy accuracy? Can we develop technologies that mitigate these potential problems and empower users to more accurately and efficiently specify security and privacy policies?

To shed some light on these compelling questions, Norman Sadeh shared some insights into data from lab and field research on mobile social networking applications.

An example is a location sharing application that uses GPS and WiFi triangulation on laptops and cell phones and allows people to share their locations with friends, families, colleagues, and ... Well, that is one of the big issues that arises in this space, who exactly are you sharing this information with? And to what extent can you control access to it?

According to Sadeh, although many such applications have been released over the past several years, adoption has been rather limited. Early on, Sadeh and his team noticed that users had great difficulty specifying location sharing privacy policies that accurately reflected their preferences.

“So what’s going on? Is it because these applications have bad user interfaces? Do people who define more privacy rules do better? Do the people who spend more time defining and refining their rules do better?” Sadeh continued. “Location sharing applications seemed to be a very good domain to study these and related issues. Because, at the end of the day, the problems are the same, whether you are trying to configure a firewall at home or at work, or you are trying to configure social networking policies. Ultimately, the question is whether we can empower users (both expert users and lay users) to specify combinations of rules that enact the behaviors they really want to enforce?”

From 2003 to 2005, Sadeh and his colleagues worked on early prototypes and did some lab studies. In 2006 and 2007, they launched the "People Finder" application, which involved a couple of hundred users in multiple pilots, with laptops and some cell phones.

In 2008, they developed their first Facebook application, Locyoution, which was piloted by over one hundred users on their laptops.

In February 2009, Sadeh and his colleagues launched Locaccino, a new Facebook app, which could scale to hundreds of thousands of users if successful.

Data from the team’s research indicates that the problem is not bad interfaces, or the number of rules defined, or even the time spent defining and refining those rules.

But Sadeh’s work and the data he has collected through a number of pilots are providing a number of powerful insights into what it takes to better support users as they define and maintain security policies. One element of functionality that Sadeh and his team have shown to have a major impact on the ability of users to specify policies they feel more comfortable with is auditing functionality:


Auditing (‘feedback’) functionality that enables users to review decisions made by the rules they have specified and ask questions such as “Which rule in my policy is responsible for this particular decision?” can help users better understand the behaviors their policies give rise to.
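
To make this auditing idea concrete, here is a minimal Python sketch of a policy engine that records which rule produced each decision, so that a user can later ask exactly the question quoted above. The rule format, class names, and example rules are illustrative assumptions of mine, not the People Finder or Locaccino implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical request, rule, and decision types (not the actual People Finder / Locaccino code).

@dataclass
class Request:
    requester: str
    group: str   # e.g. "friends", "colleagues"
    hour: int    # hour of day, 0-23

@dataclass
class Rule:
    name: str
    condition: Callable[[Request], bool]
    allow: bool

@dataclass
class Decision:
    request: Request
    allowed: bool
    rule: Optional[Rule]   # which rule fired; None means the default applied

class AuditablePolicy:
    """Evaluates rules in order and keeps an audit log of which rule made each decision."""

    def __init__(self, rules: List[Rule]):
        self.rules = rules
        self.log: List[Decision] = []

    def decide(self, req: Request) -> Decision:
        for rule in self.rules:
            if rule.condition(req):
                decision = Decision(req, rule.allow, rule)
                break
        else:
            decision = Decision(req, False, None)   # default: do not disclose location
        self.log.append(decision)
        return decision

    def explain(self, decision: Decision) -> str:
        """Answers: which rule in my policy is responsible for this particular decision?"""
        if decision.rule is None:
            return "No rule matched; the default deny applied."
        verb = "allowed" if decision.allowed else "denied"
        return f"Rule '{decision.rule.name}' {verb} the request from {decision.request.requester}."

# Example policy: share with colleagues during working hours, with friends anytime.
policy = AuditablePolicy([
    Rule("colleagues 9-5", lambda r: r.group == "colleagues" and 9 <= r.hour < 17, allow=True),
    Rule("friends anytime", lambda r: r.group == "friends", allow=True),
])
decision = policy.decide(Request("alice", "colleagues", hour=20))
print(policy.explain(decision))   # a 20:00 request matches no rule, so the default deny applies
```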

The chart on "Evaluating Usefulness of Feedback," provides a summarized view of the impact of auditing (or “feedback”) functionality on user’s comfort and, ultimately, their "willingness to share their locations with others." What you are looking at in these two charts are the total number of hours per week different users were willing to share their location with others, depending on whether they had access to feedback functionality or not.. People who had access to the auditing functionality (“Feedback” chart) started to feel more comfortable and gradually relaxed their rules, utlimately resulting in more sharing than what was observed among users who did not have access to this functionality (“No Feedback” chart).

"That makes perfect sense. You see what is going on, you gain more confidence that the system is, in fact, not leading to any sort of abuse, and is not leading to any bad scenarios, and you end up sharing your location on a more regular basis,” Sadeh explained. “This is, by the way, one of those very simply types of functionality that none of the commercial applications out there today supporting location-sharing offers. So it is not surprising that when these applications get launched, tens of thousands of people download them, but these people only end up using the application for a few days.” Current location-sharing applications are very restrictive in the types of controls they allow their users to define and provide no such feedback functionality. The end result is very little sharing. In other words, the applications are of little value.

In his remarks, Sadeh went on to explore another challenging question, "How expressive should security or privacy policies be?"

Security and privacy policies can be viewed in the light of research on mechanism design. Through recent work, Sadeh and his colleagues have looked at the benefits afforded by more expressive mechanisms, or more expressive security and privacy policies, when it comes to more accurately capturing the preferences of a user or organization: “What are the sorts of features, and the types of attributes, I will need to make available in my language to my users, so that they can end up with policies that accurately capture their intended policies?”

" You can think of a security or privacy mechanism as being some sort of function that associates different actions with different sets of conditions subject to a collection of preferences expressed by a user. Work in mechanism design typically assumes a fully rational user. In other words, given some level of expressiveness in a policy language, we would assume that our user will be able to fully take advantage of that expressiveness. ... This is what is stated in this complex formula with the arg max. The notion of efficiency is a traditional one in mechanism design. Ideally we would want our policy, or mechanism, to be as efficient as possible, namely to do the best possible job capturing our user’s ground truth preferences. If however the policy language the user is given imposes restrictions on what he or she can express, the efficiency of the resulting mechanism may be less than 100%. In other words, the user may have to make some sacrifices. For instance, you may have to decide that you will not disclose your location to a given individual at all because you don’t have the ability to accurately specify the fine conditions under which you would have been willing to do so. Instead, given the restrictions of the available policy language, you decide that you will “play it safe” and simply deny all requests for your location from that individual. In general, one can define the efficiency of a security or privacy mechanism by looking at all possible scenarios and looking at the percentage of the time when the best policy a user can define (subject to the expressiveness of the available policy language) accurately captures what the user would like to do (e.g. sharing your location versus not sharing it). However rather than doing this for a single user, we will try to do this for the entire population of users for whom the mechanism is being designed. In practice, one can approximate this by looking for a representative sample of the target user population, collect their ground truth preferences and see how we can optimally configure policies to capture their preferences subject to different restrictions in the available policy language.” –For instance, in the case of location sharing applications, we can collect people’s ground truth preferences about sharing their locations with others and examine the impact of different levels of expressiveness in the language made available to users to specify the conditions under which they are willing to disclose their location to others. This means estimating the benefits afforded by a privacy language where users can specify rules that include restrictions tied to groups of people (e.g. friends, colleagues), restrictions tied to the day or time of the request, or to where the user is at the time his or her location is requested (....or some combination of the above).

What Sadeh and his colleagues found was that such insight could be applied to the design of any security or privacy mechanism to help users take fuller advantage of the expressiveness of the language through the interface. But real users are not fully rational. There is a point where users will say, "Well, I don't care. Yes, in principle I could get a higher efficiency, i.e., policies that more closely reflect what I really want, but perhaps I am not willing to invest the time, or no matter how hard I try, beyond six or seven rules I get completely confused."
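
To give a feel for how such efficiency estimates can be approximated in practice from sampled ground-truth preferences, as Sadeh describes, here is a small, self-contained Python simulation. The scenario space, the synthetic preference model, and the two policy languages being compared (blanket per-group rules versus per-group time-window rules) are toy assumptions of my own, not the data or policy languages from the team's studies.

```python
import itertools
import random

# Toy scenario space: every (group, hour-of-day) pair is a possible location request.
GROUPS = ["friends", "colleagues", "family"]
HOURS = range(24)
SCENARIOS = list(itertools.product(GROUPS, HOURS))

def random_ground_truth(rng):
    """Synthetic preferences: each group is shareable only inside one random time window."""
    prefs = {}
    for g in GROUPS:
        start = rng.randrange(24)
        length = rng.randrange(25)                      # willing to share 0..24 hours a day
        ok_hours = {(start + i) % 24 for i in range(length)}
        for h in HOURS:
            prefs[(g, h)] = h in ok_hours
    return prefs

def best_group_only_accuracy(prefs):
    """Best achievable agreement when the language only allows one blanket allow/deny per group."""
    correct = 0
    for g in GROUPS:
        share_hours = sum(prefs[(g, h)] for h in HOURS)
        correct += max(share_hours, 24 - share_hours)   # pick whichever blanket rule fits better
    return correct / len(SCENARIOS)

def best_group_and_time_accuracy(prefs):
    """A language with per-group time-window rules can reproduce this ground truth exactly."""
    return 1.0

rng = random.Random(0)
users = [random_ground_truth(rng) for _ in range(500)]  # stand-in for surveyed preferences
scores_simple = [best_group_only_accuracy(u) for u in users]
scores_rich = [best_group_and_time_accuracy(u) for u in users]
print("group-only language:   %.2f" % (sum(scores_simple) / len(scores_simple)))
print("group + time language: %.2f" % (sum(scores_rich) / len(scores_rich)))
```

In this toy setup the richer language can always match a user's ground truth, while the blanket per-group language leaves a measurable gap; real users, as Sadeh points out, may not be willing or able to exploit the extra expressiveness.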

At this point, Sadeh remarked, the next natural question arises, "What about machine learning? Could machine learning help us?"

In some of the team's early experiments, using case-based reasoning, it was clear that yes, in principle, machine learning could make "a huge difference."

"You might say this is wonderful, problem solved, let's just use your game theory results, add machine learning, and we're done. So why is it that this is not the case?'

"There is a slight problem," Sadeh points out, "and that is that we are talking about privacy and security. Machine learning can be used for lots of different things, and be more accurate than we humans can be, but machine learning is not 100% accurate and there lies the potential problem. It could end up making a decision that we don't feel comfortable with at all. Even if machine-learning gives us 99% accuracy, in security or privacy the remaining 1% could be devastating: you could be giving away national security secrets, or sensitive corporate data ... “

The problem, Sadeh adds, is that machine learning is traditionally configured as a “black box” technology, i.e., users are unlikely to understand the policies they end up with.

"So we are developing different families of machine learning techniques that essentially reconcile the power of machine learning, which is unquestionable, with the principle that ultimately the user has to remain in control. If the user loses control, you have not accomplished anything. “

"Can we develop technology that incrementally suggests policy changes to users? This leads us to the concept of user-controllable policy learning. The idea is that users are willing to provide feedback. We have seen that they are actually keen to have the auditing interface; and we have also seen that they are willing to provide feedback, e.g. thumbs up or thumbs down on decisions made by their current policy. They are not necessarily going to review every decision that was made, but they are willing to go back occasionally and provide feedback. ... So what we do is take this feedback, but instead of taking over, we develop suggestions we are going to present to the user and let the user decide whether or not to accept these suggestions. You might say, 'that sounds very easy, anybody can do that.' Well, there is another problem, in order for these suggestions to be meaningful, they have to be understandable: we have to develop suggestions that a user can relate to. If your suggestion is a new decision tree with a number of different branches, the user will stare at it for a very long time and not know what to do. Instead, we tend to limit ourselves to incremental changes to user policies. We start from the policy that the user has already defined, and see if we can learn over time small, incremental variations to the policy that can be presented to the user in a way that he or she can still relate to them. When you do that, the user can make a meaningful decision, as to whether or not he likes policy changes you are suggesting and gradually improve the accurary of his or her policy. If conditions suddenly change, the user can also directly manipulate his or her policy, because he or she continues to understand it. There is no need to wait for machine learning to adapt to the new situation. So you have the best of both worlds, with users and machine learning working hand in hand.”

Yes, patents are pending.

Some References

User-Controllable Security and Privacy Project

N. Sadeh, J. Hong, L. Cranor, I. Fette, P. Kelley, M. Prabaker, and J. Rao, "Understanding and Capturing People's Privacy Policies in a Mobile Social Networking Application", Journal of Personal and Ubiquitous Computing.

P.G. Kelley, P. Hankes Drielsma, N. Sadeh, and L.F. Cranor, "User-Controllable Learning of Security and Privacy Policies", First ACM Workshop on AISec (AISec'08), ACM CCS 2008 Conference, Oct. 2008.

J. Tsai, P. Kelley, P. Hankes Drielsma, L. Cranor, J. Hong, and N. Sadeh, "Who's Viewed You? The Impact of Feedback in a Mobile-location Application", to appear in CHI '09.

M. Benisch, P.G. Kelley, N. Sadeh, T. Sandholm, L.F. Cranor, P. Hankes Drielsma, and J. Tsai, "The Impact of Expressiveness on the Effectiveness of Privacy Mechanisms for Location Sharing", CMU Technical Report CMU-ISR-08-139, December 2008.

Other Relevant Links

CyLab Chronicles: Wombat, the Latest CyLab Success Story

CyLab Research Update: Locaccino Enables the Watched to Watch the Watchers

CyLab Chronicles: Q&A w/ Norman Sadeh

Some Other CyLab Seminar Notes

CyLab Seminar Series: Of Frogs, Herds, Behavioral Economics, Malleable Privacy Valuations, and Context-Dependent Willingness to Divulge Personal Info

CyLab Seminar Series Notes: Why do people and corporations not invest more in security?

CyLab Research Update: Basic Instincts in the Virtual World?

For information on the benefits of partnering with CyLab, contact Gene Hambrick, CyLab Director of Corporate Relations: hambrick at andrew.cmu.edu