CyBlog: Security, Privacy and Mobility in the Information Age: May 2012

Saturday, May 26, 2012

CyLab Chronicles: CyLab's Strong Presence at IEEE Security and Privacy 2012 Packs A Wallop

CyLab's Zongwei Zhou talks on Building Verifiable Trusted Path on Commodity X86 Computers

The 33rd annual IEEE Symposium on Security and Privacy, held at the St. Francis hotel in downtown San Francisco (May 20-23, 2012), is one of the most respected venues in the field. Once again, numerous papers presented by Carnegie Mellon University CyLab researchers and several sessions chaired by CyLab faculty made for a powerful presence.

Seven papers authored or co-authored by CyLab researchers were presented in the course of the three-day program. In addition to the papers presented, CyLab faculty also chaired three sessions.

Here is the CyLab 2012 IEEE Security and Privacy roster of papers and presenters, with brief excerpts from each paper:

Jiyong Jang talked on ReDeBug: Finding Unpatched Code Clones in Entire OS Distributions, a paper co-authored with Abeer Agrawal and CyLab faculty member David Brumley.

"ReDeBug was designed for scalability to entire OS distributions, the ability to handle real code, and minimizing false detection. ReDeBug found 15,546 unpatched code clones, which likely represent real vulnerabilities, by analyzing 2.1 billion lines of code on a commodity desktop. We demonstrate the practical impact of ReDeBug by confirming 145 real bugs in the latest version of Debian Squeeze packages. We believe ReDeBug can be a realistic solution for regular developers to enhance the security of their code in day-to-day development."

Michael Carl Tschantz presented Formalizing and Enforcing Purpose Restrictions of Privacy Policies, a paper co-authored with Anupam Datta and Jeannette M. Wing.

"Our work makes the following contributions: 1) The first semantic formalism of when a sequence of actions is for a purpose; 2) Empirical validation that our formalism closely corresponds to how people understand the word “purpose”; 3) An algorithm employing our formalism and its implementation for auditing; and 4) The characterization of previous policy enforcement methods in our formalism and a comparative study of their expressiveness. The first two contributions illustrate that planning can formalize purpose restrictions. The next two illustrate that our formalism may aid automated auditing and analysis."

Xin Zhang, who graduated from Carnegie Mellon University and now works for Google, delivered Secure and Scalable Fault Localization under Dynamic Traffic Patterns, co-authored with CyLab Technical Director Adrian Perrig and Chang Lan of Tsinghua University.

"While existing path-based FL protocols aim to identify a specific faulty link (if any), DynaFL localizes data-plane faults to a coarser-grained 1-hop neighborhood, to achieve four distinct advantages. First, DynaFL does not require any minimum duration time of paths or flows in order to detect data-plane faults as path-based FL protocols do. Thus, DynaFL can fully cope with short-lived flows which are popularly seen in modern networks. Second, in DynaFL, a source node does not need to know the exact outgoing path, unlike path-based FL protocols. Hence, DynaFL can support agile (e.g., packet-level) load balancing such as VL2 routing [20] for datacenter networks. Third, a DynaFL router only needs around 4MB per-neighbor state based on our classic Sketch implementation, while a router in a path-based FL protocol requires per-path state. Finally, a DynaFL router only maintains a single secret key shared with the AC, while a router in a path-based FL protocol needs to manage 100 to 10000 secret keys in measured ISP topologies."

Sang Kil Cha spoke on Unleashing Mayhem on Binary Code, co-authored with Thanassis Avgerinos, Alexandre Rebert and David Brumley.

"We presented MAYHEM, a tool for automatically finding exploitable bugs in binary (i.e., executable) programs in an efficient and scalable way. To this end, MAYHEM introduces a novel hybrid symbolic execution scheme that combines the benefits of existing symbolic execution techniques (both online and offline) into a single system. We also present index-based memory modeling, a technique that allows MAYHEM to discover more exploitable bugs at the binary-level. We used MAYHEM to analyze 29 applications and automatically identified and demonstrated 29 exploitable vulnerabilities."

Saranga Komanduri talked on Guess again (and again and again): Measuring password strength by simulating password-cracking algorithms, co-authored with Patrick Gage Kelley, Michelle L. Mazurek, Richard Shay, Tim Vidas, Lujo Bauer, Nicolas Christin, Lorrie Faith Cranor, and Julio Lopez.

"We introduced a new, efficient technique for evaluating password strength, which can be implemented for a variety of password-guessing algorithms and tuned using a variety of training sets to gain insight into the comparative guess resistance of different sets of passwords. Using this technique, we performed a more comprehensive password analysis than had previously been possible. We found several notable results about the comparative strength of different composition policies. Although NIST considers basic16 and comprehensive8 equivalent, we found that basic16 is superior against large numbers of guesses. Combined with a prior result that basic16 is also easier for users [46], this suggests basic16 is the better policy choice. We also found that the effectiveness of a dictionary check depends heavily on the choice of dictionary; in particular, a large blacklist created using state-of-the-art password-guessing techniques is much more effective than a standard dictionary at preventing users from choosing easily guessed passwords. Our results also reveal important information about conducting guess-resistance analysis ..."

Hsu-Chun Hsiao presented LAP: Lightweight Anonymity and Privacy, co-authored with Tiffany Hyun-Jin Kim and Adrian Perrig, along with Akira Yamada (KDDI R&D), Sam Nelson and Marco Gruteser (Rutgers University), and Wei Ming (Tsinghua University).

"In this framework, our approach is simple yet effective: by leveraging encrypted packet-carried forwarding state, ISPs that support our protocol can efficiently forward packets towards the destination, where each encrypted ISP-hop further camouflages the source or destination address or its location. Although encrypted packet-carried forwarding state is currently not supported in IP, we design simple extensions to IP that could enable this technology. In particular, our approach is even more relevant in future network architectures, where the design can be readily incorporated. This new point in the design space of anonymity protocols could also be used in concert with other techniques, for example in conjunction with Tor to prevent one Tor node from learning its successor. Despite weaker security properties than Tor, we suspect that LAP contributes a significant benefit towards providing topological anonymity, as LAP is practical to use for all communication."

Zongwei Zhou delivered Building Verifiable Trusted Path on Commodity X86 Computers, co-authored with CyLab Director Virgil Gligor, as well as James Newsome and Jonathan M. McCune.

"Building a general-purpose trusted path mechanism for commodity computers with a significant level of assurance requires substantial systems engineering, which has not been completely achieved by prior work. Specifically, it requires (1) effective countermeasures against I/O attacks enabled by inadequate I/O architectures and potentially compromised operating systems; and (2) small trusted codebases that can be integrated with commodity operating systems. The design presented in this paper shows that, in principle, trusted path can be achieved on commodity computers, and suggests that simple I/O architecture changes would simplify trusted-path design considerably."

-- Richard Power

See Also:

CyLab Research has Powerful Impact on 2010 IEEE Security and Privacy Symposium

Microcosm & Macrocosm: Reflections on 2010 IEEE Symposium on Security and Privacy; Q and A on Cloud, Cyberwar and Internet Freedom with Dr. Peter Neumann

Five Papers Add to Impressive CyLab Presence at ACM CCS 2011

CyLab Research Presentations Impact CHI 2011

USENIX Security 2011: Another Ring on the Tree Trunk for One of Cyber Security's Worthiest Gatherings, and a Strong CyLab Presence

USENIX Security 2011: CyLab Researchers Release Study on Illicit Online Drug Trade and Attacks on Pharma Industry

A Report on 2012 IEEE Symposium on Privacy and Security

Hsu-Chun Hsiao delivers a paper on LAP: Lightweight Anonymity and Privacy

As noted in previous CyBlog posts, IEEE's annual Symposium on Security and Privacy (a.k.a. "Oakland") is an important event in the realm of academic research on how best to strengthen cyber security and privacy. This year's Symposium lived up to expectations. (And I am not just saying that because Carnegie Mellon University CyLab's imprint was on eight different sessions. See CyLab Chronicles: CyLab's Strong Presence at IEEE Security and Privacy 2012 Packs A Wallop.)

Here are a few glimpses into some sessions that interested me.

Prudent Practices for Designing Malware Experiments

Christian Rossow of the Institute for Internet Security delivered a talk on "Prudent Practices for Designing Malware Experiments," a paper co-authored with Christian J. Dietrich and Norbert Pohlmann, also of the Institute for Internet Security, along with Chris Grier, Christian Kreibich and Vern Paxson of the University of California, Berkeley and the International Computer Science Institute, Berkeley, as well as Herbert Bos and Maarten van Steen of VU University Amsterdam, The Network Institute.

Rossow articulated numerous guidelines on safety, transparency, realism and correct data sets.

I have pulled out an example of one of the guidelines from each category:

Safety: "1) Deploy and describe containment policies. Well-designed containment policies facilitate realistic experiments while mitigating the potential harm malware causes to others over time. Experiments should at a minimum employ basic containment policies such as redirecting spam and infection attempts, and identifying and suppressing DoS attacks. Authors should discuss the containment policies and their implications on the fidelity of the experiments. Ideally, authors also monitor and discuss security breaches in their containment."

Transparency: "4) Mention the system used during execution. Malware may execute differently (if at all) across various systems, software configurations and versions. Explicit description of the particular system(s) used (e.g., 'Windows XP SP3 32bit without additional software installations') renders experiments more transparent, especially as presumptions about the 'standard' OS change with time. When relevant, authors should also include version information of installed software."

Realism: "5) Consider allowing Internet access to malware. Deferring legal and ethical considerations for a moment, we argue that experiments become significantly more realistic if the malware has Internet access. Malware often requires connectivity to communicate with command-and-control (C&C) servers and thus to expose its malicious behavior. In exceptional cases where experiments in simulated Internet environments are appropriate, authors need to describe the resulting limitations."

Correct data sets: "2) Balance datasets over malware families. In unbalanced datasets, aggressively polymorphic malware families will often unduly dominate datasets filtered by sample-uniqueness (e.g., MD5 hashes). Authors should discuss if such imbalances biased their experiments, and, if so, balance the datasets to the degree possible. [...] explicitly if they decide to blend malicious traces with benign background activity."
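To make the dataset-balancing guideline concrete, here is a minimal sketch of one way to keep a polymorphic family from dominating a hash-filtered dataset: cap the number of unique samples kept per family. It is only an illustration, not the authors' method. The function name `balance_by_family`, the `(md5, family)` sample layout, the cap of 10, and the sample data are all assumptions made for this example.

```python
import random
from collections import defaultdict


def balance_by_family(samples, cap, seed=0):
    """Keep at most `cap` unique samples per malware family.

    `samples` is an iterable of (md5, family) pairs (an assumed layout).
    """
    by_family = defaultdict(set)
    for md5, family in samples:
        by_family[family].add(md5)  # the set drops duplicate hashes

    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    balanced = []
    for family in sorted(by_family):
        hashes = sorted(by_family[family])
        if len(hashes) > cap:
            hashes = rng.sample(hashes, cap)  # random subset of the large family
        balanced.extend((md5, family) for md5 in hashes)
    return balanced


# Hypothetical data: one polymorphic family dominates by unique-hash count.
samples = [(f"poly{i:04d}", "polymorphic_family") for i in range(1000)]
samples += [(f"rare{i:04d}", "rare_family") for i in range(5)]

balanced = balance_by_family(samples, cap=10)
counts = defaultdict(int)
for _, family in balanced:
    counts[family] += 1
print(dict(counts))  # {'polymorphic_family': 10, 'rare_family': 5}
```

Capping is only one possible balancing policy; whatever policy is used, the guideline asks authors to discuss how the imbalance might have biased their results.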

Detecting Hoaxes, Frauds, and Deception in Writing Style Online

Sadia Afroz of Drexel University delivered a talk on "Detecting Hoaxes, Frauds, and Deception in Writing Style Online," a paper co-authored with colleagues Michael Brennan and Rachel Greenstadt.

This fascinating paper used a compelling story from recent headlines: the strange tale of Amina, the "Gay Girl in Damascus," whose blog captured the attention of the world during the early days of the Arab Spring, only to be later revealed as the work of Thomas Macmaster, a 40-year-old American man.

In reporting on the research, Afroz and her colleagues concluded:

"Stylometry is necessary to determine authenticity of a document to prevent deception, hoaxes and frauds. In this work, we show that manual counter-measures against stylometry can be detected using second-order effects. That is, while it may be impossible to detect the author of a document whose authorship has been obfuscated, the obfuscation itself is detectable using a large feature set that is content-independent. Using Information Gain Ratio, we show that the most effective features for detecting deceptive writing are function words. We analyze a long-term deception and show that regular authorship recognition is more effective than deception detection to find indication of stylistic deception in this case."

As Afroz and her colleagues also point out, such research has implications for adversarial learning in general:

"Machine learning is often used in security problems from spam detection, to intrusion detection, to malware analysis. In these situations, the adversarial nature of the problem means that the adversary can often manipulate the classifier to produce lower quality or sometimes entirely ineffective results. In the case of adversarial writing, we show that using a broader feature set causes the manipulation itself to be detectable. This approach may be useful in other areas of adversarial learning to increase accuracy by screening out adversarial inputs."

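The excerpts above single out function words as the most effective content-independent features for detecting deceptive writing. As a rough illustration of what such a feature looks like, the sketch below computes the relative frequency of a few function words in a text. The word list, the tokenizer, and the name `function_word_features` are assumptions made for this example. They are not the paper's feature set, and the sketch does not compute Information Gain Ratio.

```python
import re
from collections import Counter

# A tiny illustrative list; a real stylometry feature set would be much larger.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "i", "it", "but")


def function_word_features(text):
    """Return each function word's share of all tokens in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter(tokens)
    return {w: (counts[w] / total if total else 0.0) for w in FUNCTION_WORDS}


features = function_word_features("The cat sat in the hat, and it slept.")
for word, share in features.items():
    if share:
        print(f"{word}: {share:.3f}")
```

Because the features ignore topic words such as "cat" and "hat," two texts on unrelated subjects can still be compared on style alone.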
Oakrams: Searching Through Strands of Oakland's DNA

The three-day event culminated in an all-star panel on "How can a Focus on 'Science' Advance Research in Cyber Security?" Moderated by Carl Landwehr, the panel members, including Alessandro Acquisti (Carnegie Mellon), Dan Boneh (Stanford), Joshua Guttman (Worcester Polytechnic Institute), Wenke Lee (Georgia Tech) and Cormac Herley (Microsoft), debated whether or not the realm of cyber security as currently constituted should be or is already "science." But honestly, in spite of some sparkling insights, particularly from Acquisti and Herley, this debate has a certain dog-chasing-its-tail futility to it. It is the kind of debate that becomes central after it is already too late to grasp the reality of a situation. It reminded me of a sage perspective delivered back in the 1990s by the legendary Donn B. Parker: "Information Security, A Folk Art in Need of An Upgrade." Parker was spot-on about that, as well as on other issues.

So before the theme music to the Bill Murray film Groundhog Day once again starts to rise up in my psyche, let me turn away from the august panel and its erudite dialogue, and end this report from Oakland on a "short talk" in which Hilarie Orman (Purple Streak, Inc.) shared her "Oakrams."

I suggest there is at least as much import in them as in the debate over "cyber security" as "science."

Orman was kind enough to explain her exercise to me.

"I call them 'Oakrams' (the conference used to be called 'Oakland' informally, and the software is based on an open source system called 'WordCram'). I modified WordCram so that I could control the coloring based on the word position, and so that I could reuse a word placement while changing size and color. This resulted in two sequences of images. I preprocessed the text of the papers so that for each year I had an ordered list of all non-trivial words that occurred 20 times or more. In the first sequence, for each year of the conference, I arranged the words so that the size and color intensity was proportional to a word's frequency for that year. I modified WordCram to get word arrangements that were both denser and more uniform than its usual algorithms could produce. The word coloring varied uniformly over a small color range from top to bottom and left to right. Each year had a slightly different range, overlapping with the previous year, and drifting from yellow through green in 1980 to the final blue through reddish yellow in 2012.

"The word arrays seemed endlessly interesting to me. Some words are loaded with context in the security world, and their presence or absence in an array was a source for reflection. As a small example, the word 'alice' appeared briefly in one or two years, but never rose to prominence. These arrays showed that 'system,' 'information,' and 'security' were usually the most frequent words in each year. This wasn't surprising, but I wanted to get more information about the words that had varying popularity, and I wondered if the words could point out trends in topics. That led to the next phase.

"The second sequence of images used only 50 words. These were the words that were the 'most popular' over the 33 years. For each year, each word had the same placement in the visual array, but the size and color varied. The size of a word was proportional to its frequency for that year. The color hue varied from red to blueish-purple, where red meant the word had not occurred in the previous 5 years, and the amount of blue represented its average frequency during the previous 5 years. As words moved in and out of popularity their size and color and opacity varied to reflect their usage. It was interesting to see how long it took for networking terms like 'message,' 'packet,' and 'node' to get traction. I was amazed that 'privacy' has rarely been a major term, despite it being part of the 'Security and Privacy' symposium's name! And, to me, it was quite significant that 'application' and 'attack' have become major terms --- we used to focus on provably secure operating systems, now we try to protect individual applications against specific attacks. I'm a calligrapher and student of typography; the wordcrams are artistic objects that I enjoy, but they carry some fragments of meaning, like pieces of DNA."
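For readers curious about the mechanics, the preprocessing Orman describes can be approximated in a few lines. This is a minimal sketch under stated assumptions, not her code and not the WordCram API: the stop-word list, the whitespace tokenizer, and the function names `frequent_words`, `relative_sizes`, and `previous_average` are hypothetical. Only the 20-occurrence threshold, the size-proportional-to-frequency rule, and the 5-year look-back come from her description.

```python
from collections import Counter

STOP_WORDS = {"the", "of", "and", "a", "to", "in", "is"}  # assumed "trivial" words
MIN_COUNT = 20  # threshold from the description above


def frequent_words(text, min_count=MIN_COUNT):
    """Return (word, count) pairs, most frequent first, for non-trivial
    words that occur at least `min_count` times."""
    words = [w for w in text.lower().split() if w.isalpha() and w not in STOP_WORDS]
    counts = Counter(words)
    return [(w, c) for w, c in counts.most_common() if c >= min_count]


def relative_sizes(ranked, max_size=100.0):
    """Scale sizes so that each word's size is proportional to its count."""
    if not ranked:
        return {}
    top = ranked[0][1]
    return {w: max_size * c / top for w, c in ranked}


def previous_average(history, word, year, window=5):
    """Average count of `word` over the `window` years before `year`.
    A result of 0 corresponds to a word that would be drawn in red."""
    prior = [history.get(y, {}).get(word, 0) for y in range(year - window, year)]
    return sum(prior) / window


# Hypothetical one-year corpus.
text = "system " * 40 + "security " * 20 + "alice " * 3 + "the " * 50
ranked = frequent_words(text)
sizes = relative_sizes(ranked)
print(ranked)  # [('system', 40), ('security', 20)]
print(sizes)   # {'system': 100.0, 'security': 50.0}

# Hypothetical per-year counts for the look-back coloring.
history = {1981: {"packet": 10}}
print(previous_average(history, "packet", 1985))  # 2.0
```

In this toy corpus 'alice' falls below the threshold and drops out, much as it never rose to prominence in the real arrays.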

-- Richard Power

Monday, May 14, 2012

New You Tube & iTunes Video - CyLab Business Risk Forum: Michelle Dennedy on Privacy by Design for our Technology and Our Future - Why the Future Still Needs Us

Every week during the school year, the CyLab Seminar Series provides updates on the latest research by our faculty, as well as by visiting scholars from other prestigious institutions. In addition to these academic research presentations, occasional Business Risks Forum events feature security experts from business and government, who deliver invaluable insights on the facts on the ground in the operational environment.

Access to our weekly webcasts and online archive of the CyLab Seminar Series is one of the exclusive benefits of membership in CyLab's private sector consortium. From time to time, CyLab offers rare glimpses into its Seminar Series with the release of select videos via both the CyLab YouTube Channel and CyLab at iTunesU.

On April 16th, 2012, CyLab presented Michelle Dennedy, VP and Chief Privacy Officer for McAfee, in a CyLab Seminar Series Business Risks Forum event. Dennedy spoke on Privacy by Design for our Technology and Our Future - Why the Future Still Needs Us.

CyLab Business Risks Forum: Privacy by Design for our Technology and Our Future