Tuesday, November 27, 2012

CyLab Researchers Make Major Advances In Audit Technology For Privacy Protection

A team of researchers at Carnegie Mellon University led by Dr. Anupam Datta, Assistant Research Professor at CyLab and Electrical & Computer Engineering, has developed algorithms that can help protect individual privacy by checking that organizations such as hospitals and banks are disclosing personal information about their customers to third parties in compliance with privacy regulations. They have produced the first complete formal specification of disclosure clauses in two important US privacy laws -- the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule and the Gramm-Leach-Bliley Act (GLBA).

They also built an algorithm that can help investigators detect violations of these laws and similar privacy policies. The research team included Henry DeYoung (a graduate student in the Computer Science Department) and three postdoctoral researchers in Dr. Datta's research group: Dr. Deepak Garg (now faculty at MPI-SWS), Dr. Limin Jia (now faculty at CMU CyLab), and Dr. Dilsun Kaynar (now faculty at the CMU Computer Science Department).

Privacy has become a significant concern in modern society as personal information about individuals is increasingly collected, used, and shared, often using digital technologies, by a wide range of organizations. To mitigate privacy concerns, organizations are required to respect privacy laws in regulated sectors (e.g., HIPAA in healthcare, GLBA in the financial sector) and to adhere to self-declared privacy policies in self-regulated sectors (e.g., privacy policies of companies such as Google and Facebook in Web services). Enforcing these kinds of privacy policies in organizations is difficult because privacy laws and enterprise policies typically identify a complex set of conditions governing the disclosure of personal information. For example, the HIPAA Privacy Rule includes over 80 clauses that permit, deny, and even require the disclosure of personal health information, making it difficult to manually ensure that all disclosures are compliant with the law.

The research team at Carnegie Mellon University created a formal language for specifying a rich class of privacy policies. They then used this language to produce the first complete formal specification of disclosure clauses in two important US privacy laws -- the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule and the Gramm-Leach-Bliley Act (GLBA). Recognizing that certain portions of complex privacy policies such as HIPAA are subjective and might require input from human auditors for compliance determination, the specification clearly separates out the subjective and the objective portions of a given policy.

The team then developed an algorithm that checks audit logs for compliance with privacy policies expressed in their language. The algorithm has two distinctive characteristics. First, it automatically checks the objective portion of the privacy policy for compliance and outputs the subjective portion for inspection by human auditors. Second, recognizing that audit logs are often incomplete in practice (i.e., they may not contain sufficient information to determine whether a policy is violated or not), the algorithm proceeds iteratively: in each iteration it checks as much of the policy as it possibly can over the current log and outputs a residual policy that can only be checked when the log is extended with additional information. Initial experiments with a prototype implementation checking compliance of simulated audit logs with the HIPAA Privacy Rule indicate that the algorithm is fast enough to be used in practice.
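To make the iterative idea concrete, here is a minimal Python sketch of residual-policy checking. It is an illustration only, not the team's algorithm or policy language: the `Atom`, `And`, and `Or` types, the `reduce_policy` function, and the example predicates are all hypothetical, and the toy policy is not an actual HIPAA clause. The sketch assumes a policy is a Boolean formula over named facts, that the log is a dictionary of the facts recorded so far, and that subjective facts are never decided automatically.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Atom:
    """A named fact; subjective atoms are always left for a human auditor."""
    name: str
    subjective: bool = False


@dataclass(frozen=True)
class And:
    left: "Policy"
    right: "Policy"


@dataclass(frozen=True)
class Or:
    left: "Policy"
    right: "Policy"


Policy = Union[bool, Atom, And, Or]


def reduce_policy(policy: Policy, log: Dict[str, bool]) -> Policy:
    """Check as much of the policy as the log allows.

    Returns True (compliant), False (violation), or a smaller residual
    policy made of the parts that could not be decided yet.
    """
    if isinstance(policy, bool):
        return policy
    if isinstance(policy, Atom):
        # Undecidable now: subjective, or simply missing from the log.
        if policy.subjective or policy.name not in log:
            return policy
        return log[policy.name]
    left = reduce_policy(policy.left, log)
    right = reduce_policy(policy.right, log)
    if isinstance(policy, And):
        if left is False or right is False:
            return False
        if left is True:
            return right
        if right is True:
            return left
        return And(left, right)
    # Remaining case: Or
    if left is True or right is True:
        return True
    if left is False:
        return right
    if right is False:
        return left
    return Or(left, right)


# Hypothetical toy policy: the recipient must be a provider, and either the
# patient consented or a professional judges the situation an emergency.
policy = And(
    Atom("recipient_is_provider"),
    Or(Atom("patient_consented"), Atom("judged_emergency", subjective=True)),
)

# Iteration 1: the log only records who received the data.
residual = reduce_policy(policy, {"recipient_is_provider": True})

# Iteration 2: the log is extended; only the subjective part remains.
residual = reduce_policy(residual, {"patient_consented": False})
print(residual)
```

In this sketch the second call leaves only the subjective `judged_emergency` atom, which mirrors the article's description: objective parts are checked automatically as log information arrives, and whatever remains is handed to a human auditor or deferred until the log grows.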

Additional information about this work can be found on the project web page: http://www.andrew.cmu.edu/user/danupam/privacy.html