Research Topics

The Joint Research Center aims to advance the science and technology which will allow the cost-effective development of high-quality software products. The following description includes some of the main lines being explored:


30-key-iconCryptography and Privacy

The security and privacy of electronic communications are two very important aspects of today’s web-based world, and are often constructed using cryptographic tools. However, both security and privacy are often breached, sometimes severely, and are not always the result of “total breaks” due to flaws in operating systems, social engineering or physical attacks. Such breaches may be due to the use of bad cryptography, to bad implementations of good cryptography, or to issues in the applications themselves, where information about private data may be leaked by their effect on public results.

In the “cryptography and privacy” line of collaboration, IMDEA and MSR researchers focus on these aspects of security and privacy by developing and applying formal techniques and practical tools to formally prove strong security and privacy properties of algorithms and their implementations. Collaborations in this line follows two main axes. The first axis concerns cryptography, and includes the formalization of complex cryptographic proofs of primitive and protocols, and the verification of cryptographic implementations. The second axis focuses on information-flow properties, ranging from the development of practical tools for verifying standard non-interference properties of programs to quantifying the amount of information leaked by a program through “side-channels” (timing, power consumption, cache accesses).

Past and current collaborations on cryptography include:

  • An extension of the EasyCrypt tool (developed jointly by IMDEA and INRIA) to formally proving differential privacy, a standard notion of privacy to bound the leakage of information about a database when allowing external queries. This collaboration between MSR Cambridge, IMDEA and INRIA includesan application to a smart-metering protocol [1].
  • miTLS, a verified reference implementation of the TLS protocol, which is used to secure (authenticate and protect secrecy) of https connections, among other things [2, 3]. This collaboration involves people from Microsoft Cambridge, the MSR-INRIA joint centre, and IMDEA.
  • The development of techniques to prove security properties of cryptographic programs written in C using general-purpose verification tools. These techniques were applied using MSR’s VCC verifier [4], and were used in partially analyzing the TPM2.0 draft specification. This project has involved researchers from MSR Cambridge, MSR Redmond, the Extreme Computing Group, the European Microsoft Innovation Center and IMDEA.
  • rF*, an extension of the F* dependently-typed language to “hyperproperties”, that is non-functional properties over pairs of traces [5]. This extension, developed collaboratively by MSR Cambridge, MSR Redmond and IMDEA researchers captures both cryptographic and information-flow properties.

Past and current collaborations on information-flow include:

  • Merlin, an approach for automatically inferring explicit information-flow specifications from program code [6]. This project involves researchers from MSR Redmond, MSR India and IMDEA.
  • The development of techniques to quantitatively analyze information-flow properties [7]. This project involves researchers from MSR Cambridge and IMDEA, and was recently consolidated further by a Microsoft PhD Scholarship, with the view of applying these techniques to quantify side-channel leakage in implementations of cryptographic protocols.

 

[1] Verified Computational Differential Privacy with Applications to Smart Metering, by Gilles Barthe, George Danezis, Benjamin Grégoire, César Kunz and Santiago Zanella-Béguelin. In IEEE Computer Security Foundations Symposium (CSF'13). 2013, pp. 287--301.
[2] Implementing TLS with Verified Cryptographic Security, by Karthikeyan Bhargavan, Cédric Fournet, Markulf Kohlweiss, Alfredo Pironti and Pierre-Yves Strub. In IEEE Symposium on Security and Privacy (S&P'13). 2013, pp. 445--459
[3] Modular Code-Based Cryptographic Verification , by Cédric Fournet, Markulf Kohlweiss and Pierre-Yves Strub. In ACM Conference on Computer and Communications Security (CCS'11). 2011, pp. 341--350.
[4] Guiding a General-Purpose C Verifier to Prove Cryptographic Protocols, by François Dupressoir, Andrew D. Gordon, Jan Jürjens and David A. Naumann. In Journal of Computer Security (JCS), to appear.
[5] Probabilistic Relational Verification for Cryptographic Implementations, by Gilles Barthe, Cédric Fournet, Benjamin Grégoire, Pierre-Yves Strub, Nikhil Swamy and Santiago Zanella-Béguelin. In ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'14). 2014
[6] Merlin: Specification Inference for Explicit Information Flow Problems, by Ben Livshits, Aditya Nori, Sriram Rajamani and Anindya Banerjee. In ACM SIGPLAN conference on Programming Language Design and Implementation (PLDI'09). 2009, pp. 75--86.
[7] Approximation and Randomization for Quantitative Information-Flow Analysis, by Boris Köpf and Andrey Rybalchenko. In IEEE Computer Security Foundations Symposium (CSF'10). 2010, pp. 3--14.

139-flags-iconConcurrency and memory models

Concurrent programming has always been considered the hardest kind of programming. With the advent of multicore and cloud computing, concurrency is going mainstream, which poses significant challenges for ensuring software reliability. Part of the reason why concurrent programming is so hard is that modern concurrency frameworks provide only weak guarantees about the consistency of the data they manage in the face of concurrent accesses. Thus, multicore processors provide so-called weak memory models that allow more efficient implementations at the expense of sometimes exhibiting counterintuitive behaviors. Similarly, cloud databases often give up traditional ACID guarantees in favour of eventual consistency, where different copies of data may be temporarily inconsistent, but eventually converge to the same state.

We are developing techniques to improve the reliability of concurrent software running on weak consistency models. On the weak memory model front, we are developing methods for reasoning about libraries of concurrent algorithms running on the weak memory models of common processors (such as x86 and ARM) and programming languages (such as C and C++). On the cloud computing front, we are developing techiques for specifying the semantics of eventually consistent systems and reasoning about their behaviour. We are particularly targeting the most subtle aspect of eventual consistency—mechanisms of resolving conflicting concurrent updates, encapsulated into so-called replicated data types. We are also investigating how our formalisation efforts can help designers of eventually consistent systems to improve their programmability and efficiency.

Arvind Haran, Montgomery Carter, Michael Emmi, Akash Lal, Shaz Qadeer, and Zvonimir Rakamaric. SMACK+Corral: A Modular Verifier. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS),2015.
Sebastian Burckhardt, Alexey Gotsman, Hongseok Yang, and Marek Zawirski  Replicated data types: specification, verification, optimality  POPL'14: Symposium on Principles of Programming Languages, San Diego, CA, USA, pages 271-284. ACM, 2014.
Sebastian Burckhardt, Alexey Gotsman, and Hongseok Yang Understanding eventual consistency, Microsoft Research Technical Report MSR-TR-2013-39
Alexey Gotsman, Madanlal Musuvathi, and Hongseok Yang  Show no weakness: sequentially consistent specifications of TSO libraries  DISC'12: International Symposium on Distributed Computing, Salvador, Bahia, Brazil, LNCS 7611, pages 31-45. Springer, 2012.
Sebastian Burckhardt, Alexey Gotsman, Madanlal Musuvathi, and Hongseok Yang  Concurrent library correctness on the TSO memory model  ESOP'12: European Symposium on Programming, Tallinn, Estonia, LNCS 7211, pages 87-107. Springer, 2012.
Alexey Gotsman, Josh Berdine, and Byron Cook Precision and the conjunction rule in concurrent separation logic MFPS'11: Conference on the Mathematical Foundations of Programming Semantics, Pittsburgh, PA, USA, ENTCS 276:171-190. Elsevier, 2011.

175-macbook-iconProgramming Languages and Verification

The F* language.

The F* language is a new higher-order, effectful programming language (like ML) designed with program verification in mind.

F* currently compiles to .NET bytecode in type-preserving style, and interoperates smoothly with other .NET languages, including F#, on which it is based. It also compiles securely to JavaScript, enabling safe interop with arbitrary, untrusted JavaScript libraries. As a consequence, programmers can deploy F* programs on web pages as JavaScript and receive a formal guarantee that despite running their compiled code within an arbitrary JavaScript context, their programs still behave just as they would according to F*’s semantics.

The F* language, via its rF* extension, has been used to verify a number of interesting programs implementing various cryptographic constructions and protocols. The rF* extension adapts the type checker to verify properties of multiple programs, or properties of more than one execution of a program, i.e., so called “hyperproperties”. Last but not the least, the F* type checker has been verified using F* itself, using a novel bootstrapping technique called self-certification.

More information at http://research.microsoft.com/en-us/projects/fstar/

Secure Distributed Programming

Designing and implementing distributed applications with effective security guarantees is known to be a complicated task. They are widely-deployed applications involving complex information flows, making code reviews and testing less effective. They use modern cryptographic primitives that must be carefully designed.

On the other hand, formal methods provide techniques and tools to improve the reliability of programs. They rely on careful mathematical arguments, typically conducted within a logical formalism, that establish properties of programs or systems.

miTLS is a verified reference implementation of the TLS protocol that fully supports its wire formats, ciphersuites, sessions and connections, re-handshakes and resumptions, alerts and errors, and data fragmentation.

It is written in F# and specified in F7. F7, developed at MSR Cambridge, is a language that allows checking security properties expressed using refinement types.

More information at http://www.mitls.org/

Automated reasoning in the Coq proof assistant

Most interactive theorem provers provide support for some form of user-customizable proof automation. In a number of popular systems, such as Coq and Isabelle, this automation is achieved primarily through tactics, which are programmed in a separate language from that of the prover’s base logic. While tactics are clearly useful in practice, they can be difficult to maintain and compose because, unlike lemmas, their behavior cannot be specified within the expressive type system of the prover itself.

In a recent line of work, a novel approach, allowing the user to specify the behavior of custom automated routines in terms of Coq’s own type system, has been proposed. This technique avoids the aforementioned problems with tactics by allowing one to program custom automated routines within the expressive dependent type system of Coq itself.

Quantities as first-class citizens in program analysis.

Quantitative information-flow analysis (QIF) is an emerging technique for establishing information-theoretic confidentiality properties. Automation of QIF is an important step towards ensuring its practical applicability, since manual reasoning about program security has been shown to be a tedious and expensive task. Existing automated techniques for QIF fall short of providing full coverage of all program executions, especially in the presence of unbounded loops and data structures, which are notoriously difficult to analyze automatically.

We are working on a blend of approximation and randomization techniques to bear on the challenge of sufficiently precise, yet efficient computation of quantitative information flow properties.

Related Publications:

Cédric Fournet, Nikhil Swamy, Juan Chen, Pierre-Évariste Dagand, Pierre-Yves Strub, and Benjamin Livshits. Fully abstract compilation to javascript.  In Roberto Giacobazzi and Radhia Cousot, editors, POPL, pages 371–384.  ACM, 2013.
Georges Gonthier, Beta Ziliani, Aleksandar Nanevski, and Derek Dreyer.  How to make ad hoc proof automation less ad-hoc. J. Funct. Program., 23(4):357–401, 2013.
Boris Köpf and Andrey Rybalchenko.  Automation of quantitative information-flow analysis.  In Marco Bernardo, Erik P. de Vink, Alessandra Di Pierro, and Herbert Wiklicky, editors, SFM, volume 7938 of Lecture Notes in Computer Science, pages 1–28. Springer, 2013.
Ilya Sergey, Dimitrios Vytiniotis, and Simon L. Peyton Jones. Modular, higher-order cardinality analysis in theory and practice. In Suresh Jagannathan and Peter Sewell, editors, POPL, pages 335–348. ACM, 2014.
Pierre-Yves Strub, Nikhil Swamy, Cédric Fournet, and Juan Chen. Self-certification: bootstrapping certified typecheckers in F* with Coq. In John Field and Michael Hicks, editors, POPL, pages 571–584. ACM, 2012.
Nikhil Swamy, Juan Chen, Cédric Fournet, Pierre-Yves Strub, Karthikeyan Bhargavan, and Jean Yang. Secure distributed programming with value-dependent types. J. Funct. Program., 23(4):402–451, 2013.
Nikhil Swamy, Cédric Fournet, Aseem Rastogi, Karthikeyan Bhargavan, Juan Chen, Pierre-Yves Strub, and Gavin M. Bierman. Gradual typing embedded securely in javascript. In Suresh Jagannathan and Peter Sewell, editors, POPL, pages 425–438. ACM, 2014.

Researchers Involved