Irfan Ul Haq, PhD Student, IMDEA Software Institute
Variables in a program commonly interact with one another. Sometimes programmers use variables erroneously, resulting in an undesired interaction, e.g., storing euros in a variable that should hold dollars. In this work, we propose a technique that uses Natural Language Processing (NLP) and Abstract Type Inference (ATI) to detect such undesired interactions. First, we use ATI to group variables that interact with each other (ATI clusters), and then use the semantic similarity between the variables' names to validate these interactions.
We evaluate our approach using two open-source projects, the Exim mail server and grep. Although these programs have been extensively tested and deployed for years, we found a programming mistake in Exim in the top ATI cluster reported by our tool.
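To illustrate the idea, here is a toy sketch of ours (not the authors' tool; it substitutes crude string similarity for real NLP-based semantic similarity): given ATI clusters of interacting variable names, flag clusters whose least-similar name pair falls below a threshold.

```python
from difflib import SequenceMatcher
from itertools import combinations

def name_similarity(a, b):
    # Crude stand-in for semantic similarity between identifiers;
    # the actual approach uses NLP, not string matching.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def rank_clusters(clusters, threshold=0.4):
    # Given ATI clusters (sets of variable names that interact),
    # report clusters whose least-similar name pair falls below the
    # threshold: dissimilar names suggest an undesired interaction.
    suspicious = []
    for cluster in clusters:
        worst = min(name_similarity(a, b)
                    for a, b in combinations(sorted(cluster), 2))
        if worst < threshold:
            suspicious.append((worst, cluster))
    suspicious.sort(key=lambda pair: pair[0])  # most suspicious first
    return suspicious

clusters = [{"price_eur", "price_euros"}, {"amount_dollars", "tax_rate"}]
print(rank_clusters(clusters))  # only the dollars/tax_rate cluster is flagged
```

Clusters whose names all look alike validate each other, while a mixed cluster such as the second one is surfaced for manual inspection.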
Ilya Sergey, Post-doctoral Researcher, IMDEA Software Institute
Designing scalable concurrent objects, which can be efficiently used on multicore processors, often requires one to abandon standard specification techniques, such as linearizability, in favor of more relaxed consistency requirements. However, the variety of alternative correctness conditions makes it difficult to choose which one to employ in a particular case, and to compose them when using objects whose behaviors are specified via different criteria. The lack of syntactic verification methods for most of these criteria poses challenges in their systematic adoption and application.
In this work, we argue for using Hoare-style program logics as an alternative and uniform approach for the specification and compositional formal verification of safety properties of concurrent objects and their client programs. Through a series of case studies, we demonstrate how an existing program logic for concurrency can be employed off the shelf to capture important state and history invariants, allowing one to explicitly quantify over the interference of environment threads and provide intuitive and expressive Hoare-style specifications for several non-linearizable concurrent objects that were previously specified only via dedicated correctness criteria. We illustrate the adequacy of our specifications by verifying a number of concurrent client scenarios that make use of the previously specified concurrent objects, capturing the essence of such correctness conditions as concurrency-aware linearizability, quiescent consistency, and quantitative quiescent consistency. All our examples are verified mechanically in Coq.
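To give a flavor of this style of specification, consider a schematic spec (this example is ours, not taken from the paper) for incrementing a shared counter c when environment threads may concurrently increment it as well; the existential e quantifies over their interference:

```latex
% Hypothetical Hoare-style spec for incr() on a shared counter c:
% environment threads may contribute e >= 0 additional increments.
\[
\{\, c = a \,\}\ \mathtt{incr()}\ \{\, \exists e \geq 0.\ c = a + e + 1 \,\}
\]
```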
This is joint work with Aleks Nanevski, Anindya Banerjee and Germán Delbianco.
Platon Kotzias, PhD Student, IMDEA Software Institute
Code signing is a solution to verify the integrity of software and its publisher's identity, but it can be abused by malware to appear benign. This work performs a systematic analysis of Windows Authenticode code signing abuse, evaluating the effectiveness of existing defenses by certification authorities. We build an infrastructure that automatically analyzes signed malware, classifies it into operations, and produces a blacklist of malicious certificates. We evaluate it on 350K malware samples from 2006 to 2015. Our analysis shows a constant increase of signed malware over time and that CA defenses such as identity checks and revocation are not currently effective. Up to 97% of the signed malware uses CA-issued certificates, and only 15% of those certificates are revoked. Our generated blacklist is 9x larger than current ones. We analyze the code signing infrastructure of the largest operations and show how they evolve over time, using multiple identities and leveraging the lack of CA synchronization to move from one CA to another. We also identify a design issue in Authenticode where timestamped signed malware successfully validates even after its code signing certificate is revoked. We propose hard revocations as a solution.
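The timestamping issue can be sketched as follows (a simplified model of ours, not the real Authenticode API): under ordinary "soft" revocation, a signature carrying a trusted timestamp that predates revocation still validates; a hard revocation rejects it regardless.

```python
from datetime import datetime

def signature_valid(signing_time, revocation_time, timestamped, hard_revocation):
    # Simplified model of Authenticode-style validation; field names are
    # illustrative, not the real API.
    if revocation_time is None:
        return True                 # certificate never revoked
    if hard_revocation:
        return False                # hard revocation: always invalid
    # Soft revocation: a trusted timestamp predating the revocation still
    # validates -- the design issue described above.
    return timestamped and signing_time < revocation_time

revoked = datetime(2015, 6, 1)
signed = datetime(2014, 1, 1)
print(signature_valid(signed, revoked, timestamped=True, hard_revocation=False))  # True
print(signature_valid(signed, revoked, timestamped=True, hard_revocation=True))   # False
```

Hard revocation closes the window that timestamped malware exploits, at the cost of also invalidating legitimately signed older binaries of the same certificate.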
Joint work with Srdjan Matic, Richard Rivera, and Juan Caballero.
Benedikt Schmidt, Post-doctoral Researcher, IMDEA Software Institute
Analyzing cryptographic constructions in the computational model, or simply verifying the correctness of security proofs, are complex and error-prone tasks. Although computer tools have significant potential to increase confidence in security proofs and to reduce the time for building these proofs, existing tools are either limited in scope, or suffer from an impedance mismatch (i.e., they can only be used by formal methods experts, and have a significant overhead). In effect, it has remained a challenge to design usable and intuitive tools for building and verifying cryptographic proofs, especially for more advanced fields such as pairing-based or lattice-based cryptography.
This paper introduces a formal logic which captures some key reasoning principles in provable security, and more importantly, operates at a level of abstraction that closely matches cryptographic practice. Automation of the logic is supported by an effective proof search procedure, which in turn embeds (extended and customized) techniques from automated reasoning, symbolic cryptography and program verification.
Although the logic is general, some of the techniques for automating proofs are specific to fixed algebraic settings. Therefore, in order to illustrate the strengths of our logic, we implement a new tool, called AutoGP, which supports extremely compact, and often fully automated, proofs of cryptographic constructions based on (bilinear or multilinear) Diffie-Hellman assumptions. For instance, we provide a 100-line proof of Waters' Dual System Encryption (CRYPTO'09), and fully automatic proofs of Boneh-Boyen Identity-Based Encryption (CRYPTO'04). Finally, we provide an automated tool that generates independently verifiable EasyCrypt proofs from AutoGP proofs.
Joint work with Gilles Barthe and Benjamin Grégoire.
Goran Doychev, PhD Student, IMDEA Software Institute
Provably fixing security problems can make a system unusable, while patching the system to protect against the latest security threat does not anticipate the vulnerabilities waiting around the corner. This is particularly true for side-channel attacks, which can often be eliminated, but only at a prohibitively high performance cost. For example, cache attacks can be prevented by disabling the cache, and timing attacks can be prevented by making operations take constant time. Because of the high cost of eliminating side channels, in practice some side-channel leakage is tolerated, and one is faced with the problem of striking a balance between performance and security against such attacks.
In this talk, I will present a systematic approach for choosing a protection against timing attacks, using the example of cryptosystems based on discrete logarithms. Our model includes a resource-bounded timing adversary who strives to maximize the probability of key recovery, and a defender who strives to reduce the cost while maintaining a fixed level of security. We obtain the optimal protection as an equilibrium in a game between the defender and the adversary. At the heart of the equilibrium computation are novel bounds for the probability of key recovery, which are expressed as a function of the applied protection and the attack strategy of a timing adversary.
We put our techniques to work in a case study in which we identify optimal protections for libgcrypt's ElGamal implementation. We determine situations in which the optimal choice is to use a defensive, constant-time implementation and a small key, and situations in which the optimal choice is a more aggressively tuned (but leaky) implementation with a longer key.
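A toy version of the defender's side of this game (illustrative numbers and names of ours, not the paper's data): each protection has a cost and a per-adversary-strategy key-recovery probability; the defender picks the cheapest protection whose worst-case recovery probability meets the security target.

```python
# Illustrative numbers only: each protection has a running cost and, per
# adversary strategy, a key-recovery probability.
protections = {
    "constant_time_small_key": {"cost": 10.0,
                                "recovery": {"timing": 1e-9, "brute_force": 1e-6}},
    "leaky_fast_long_key": {"cost": 4.0,
                            "recovery": {"timing": 1e-7, "brute_force": 1e-9}},
    "leaky_fast_small_key": {"cost": 2.0,
                             "recovery": {"timing": 1e-3, "brute_force": 1e-6}},
}

def optimal_protection(protections, target=1e-6):
    # The adversary plays its best strategy against each protection; the
    # defender keeps only protections meeting the security target and
    # picks the cheapest one.
    feasible = [(p["cost"], name) for name, p in protections.items()
                if max(p["recovery"].values()) <= target]
    return min(feasible)[1] if feasible else None

print(optimal_protection(protections))  # leaky_fast_long_key
```

With these numbers the leaky but longer-key implementation wins, mirroring the kind of trade-off identified in the case study.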
Artem Khyzha, PhD Student, IMDEA Software Institute
Linearizability is a commonly accepted notion of correctness for libraries of concurrent algorithms, and recent years have seen a number of proposals of program logics for proving it. Although these logics differ in technical details, they embody similar reasoning principles. In our ongoing work we aim to explicate these principles and propose a logic for proving linearizability that is generic: it can be instantiated with different means of compositional reasoning about concurrency, such as separation logic, rely-guarantee or RGsep. In this talk, I will present a generalisation of the Views framework for reasoning about concurrency to handle relations between programs, required for proving linearizability. This is joint work with Alexey Gotsman (IMDEA) and Matthew Parkinson (MSR).
Maximiliano Klemen, PhD Student, IMDEA Software Institute
In this talk we will review the general resource analysis framework present in the CiaoPP system. We will start by describing a preliminary version and its evolution into a user-definable resource analysis able to infer both lower and upper bounds on the resources used by program executions. Examples of user-definable resources are energy, bits sent or received by an application over a socket, number of calls to a procedure, number of files left open, number of accesses to a database, monetary units spent, etc., as well as the more traditional execution steps or execution time. Then, we will discuss a recent implementation based on abstract interpretation and the use of sized types, which overcomes some limitations of the previous implementation and of related existing approaches, and provides additional advantages. Afterwards, we will show how this resource analyzer is used in combination with CiaoPP's general framework for resource usage verification in order to estimate and verify the energy consumption of C-like programs. Finally, we will present other recent improvements to key components of the CiaoPP tool.
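As a dynamic counterpart of one of these user-definable resources, "number of calls to a procedure", consider this toy Python sketch of ours (CiaoPP infers such bounds statically, for (Ciao) Prolog programs): a run-time counter lets us check a statically claimed upper bound on the resource.

```python
import functools

def count_calls(fn):
    # Instrument a procedure so that the "calls" resource is counted
    # at run time.
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return fn(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@count_calls
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

fib(10)
# A static analysis could infer the upper bound 2**(n+1) - 1 on the
# number of calls; the dynamic count must respect it.
assert fib.calls <= 2 ** 11 - 1
print(fib.calls)  # 177
```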
Alessandra Gorla, Assistant Research Professor, IMDEA Software Institute
Dynamic invariant detection allows mining of specifications from existing systems, but the quality of the resulting invariants depends on the executions observed: Unobserved behavior is not captured by dynamically inferred invariants, which may thus be unsound. Although this can be countered by producing additional executions with automated test generation techniques, it is crucial to generate new inputs that exercise relevant unobserved executions.
In this talk I will present ongoing work that aims to produce a test suite from which, despite the limited set of executions, many sound invariants for Java classes can be discovered automatically.
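The underlying issue can be seen in a minimal sketch (ours, not the tool from the talk): invariants mined from observed executions hold only for the inputs actually seen, so an unobserved input can violate them.

```python
def observe(fn, inputs):
    # Run the function on the observed inputs and mine a candidate
    # range invariant lo <= x <= hi over its argument.
    lo, hi = float("inf"), float("-inf")
    for x in inputs:
        fn(x)
        lo, hi = min(lo, x), max(hi, x)
    return (lo, hi)

def holds(invariant, x):
    lo, hi = invariant
    return lo <= x <= hi

inv = observe(abs, [1, 5, 3])
print(inv)             # (1, 5) -- mined only from what was observed
print(holds(inv, -2))  # False: an unobserved input breaks the "invariant"
```

Generating additional tests that exercise inputs like -2 is exactly what lets the mined invariants be widened toward sound ones.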
Alexey Gotsman, Assistant Research Professor, IMDEA Software Institute
To achieve scalability and availability, modern distributed systems often provide only weak guarantees about the consistency of data processing. To help programmers cope with this, researchers have proposed novel programming concepts, such as constructs for strengthening consistency on demand, replicated data types for resolving conflicts (aka CRDTs) and new forms of transactions. These concepts have a subtle semantics and using them correctly remains nontrivial. I will present our ongoing work to develop ways of checking the correctness of their usage. This is joint work with Hongseok Yang (Oxford), Carla Ferreira (U Nova Lisboa), Mahsa Najafzadeh and Marc Shapiro (UPMC).
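One of the conflict-resolution constructs mentioned, a replicated grow-only counter (G-Counter), can be sketched in a few lines (a standard textbook CRDT, not the verification machinery of the talk): each replica increments its own slot, and concurrent states merge deterministically by element-wise maximum, so replicas converge regardless of delivery order.

```python
class GCounter:
    # Grow-only counter CRDT: one slot per replica.
    def __init__(self, replicas):
        self.counts = {r: 0 for r in replicas}

    def incr(self, replica):
        self.counts[replica] += 1      # a replica only bumps its own slot

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max makes merging commutative, associative,
        # and idempotent -- the CRDT conflict-resolution recipe.
        for r, n in other.counts.items():
            self.counts[r] = max(self.counts.get(r, 0), n)

a, b = GCounter("ab"), GCounter("ab")
a.incr("a"); a.incr("a"); b.incr("b")
a.merge(b); b.merge(a)          # merge in either order
print(a.value(), b.value())     # both converge to 3
```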
Joaquín Arias, PhD Student, IMDEA Software Institute
Logic programming systems featuring Constraint Logic Programming and tabled execution have been shown to increase the declarativeness and efficiency of Prolog, while at the same time making it possible to write very expressive programs.
Previous implementations fully integrating both capabilities (i.e., forcing suspension, answer subsumption, etc., in all the places where this is necessary in order to avoid recomputation and terminate whenever possible) did not feature a simple, well-documented, easy-to-understand interface that makes it possible to integrate arbitrary CLP solvers into existing tabling systems. This clearly hinders a more widespread adoption of the combination of the two facilities.
We examine the requirements that a constraint solver must fulfill in order to be interfaced with a tabling system. We propose a minimal set of operations (e.g., entailment checking and projection) which the constraint solver has to provide to the tabling engine.
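The shape of such an interface could look roughly like this (a hypothetical Python rendering of ours; the actual system is implemented in Ciao Prolog and the operation names are illustrative): the tabling engine needs entailment to decide call/answer subsumption, and projection to restrict a constraint store to the variables of a tabled goal.

```python
from abc import ABC, abstractmethod

class TablingSolver(ABC):
    # Minimal (hypothetical) contract a constraint solver exposes to
    # a tabling engine.
    @abstractmethod
    def entails(self, store_a, store_b):
        """True if every solution of store_a also satisfies store_b."""

    @abstractmethod
    def project(self, store, variables):
        """Restrict the store to constraints over the given variables."""

class IntervalSolver(TablingSolver):
    # Toy instance: a store is a {var: (lo, hi)} interval map.
    def entails(self, a, b):
        return all(v in a and b[v][0] <= a[v][0] and a[v][1] <= b[v][1]
                   for v in b)

    def project(self, store, variables):
        return {v: r for v, r in store.items() if v in variables}

s = IntervalSolver()
print(s.entails({"X": (1, 2)}, {"X": (0, 5)}))  # True: tighter entails looser
```

With entailment, the engine can detect that a new tabled call is subsumed by an existing one and reuse its answers instead of recomputing.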
We validate and evaluate the performance of our design through a series of use cases and benchmarks: we re-engineer a previously existing tabled constraint domain (difference constraints) in Ciao Prolog, we integrate Holzbaur's CLP(Q) implementation with Ciao Prolog's tabling engine, and we implement a simple abstract analyzer whose fixpoint is reached by means of tabled execution and whose domain operations are implemented as a constraint solver, thereby avoiding recomputation of subsumed abstractions.
Nataliia Stulova, PhD Student, IMDEA Software Institute
Using annotations, often referred to as assertions or contracts, to describe program properties for which run-time tests are to be generated has become frequent in dynamic programming languages. However, the frameworks proposed to support such run-time testing generally incur high time and/or space overheads over standard program execution.
We present an approach for reducing this overhead that is based on the use of memoization to cache intermediate results of check evaluation, avoiding repeated checking of previously verified properties.
Compared to approaches that reduce checking frequency, our proposal has the advantage of being exhaustive (i.e., all tests are checked at all points) while still being much more efficient than standard run-time checking. Compared to the limited previous work on memoization, it performs the task without requiring modifications to the data structure representation or the checking code.
While the approach is general and system-independent, for concreteness we present it in the context of the Ciao run-time checking framework. We also report on a prototype implementation and provide experimental results which suggest that using a relatively small cache leads to significant reductions in run-time checking overhead.
Ratan Lal, PhD Student, IMDEA Software Institute
We consider the problem of computing a bounded-error approximation of the solution, over a bounded time interval [0, T], of a parameterized linear system ẋ(t) = A x(t), where A is constrained by a compact polyhedron Ω. Our method consists of sampling the time domain [0, T] as well as the parameter space Ω and constructing a continuous piecewise bilinear function which interpolates the solution of the parameterized system at these sample points. More precisely, given an ϵ > 0, we compute a sampling interval δ > 0 such that the piecewise bilinear function obtained from the sample points is within ϵ of the original trajectory. We present experimental results which suggest that our method is scalable.
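A scalar instance of the construction (our illustration, with a 1x1 matrix A and no parameter dimension, so the interpolation is piecewise linear rather than bilinear): sample the exact solution x(t) = exp(a t) x0 on a grid of step δ, interpolate between samples, and observe that refining δ shrinks the maximum error.

```python
import math

def interpolant(a, x0, T, delta):
    # Sample the exact solution on a delta-grid and interpolate linearly.
    ts = [k * delta for k in range(int(T / delta) + 1)]
    xs = [x0 * math.exp(a * t) for t in ts]
    def approx(t):
        k = min(int(t / delta), len(ts) - 2)   # clamp to the last segment
        w = (t - ts[k]) / delta
        return (1 - w) * xs[k] + w * xs[k + 1]
    return approx

def max_error(a, x0, T, delta, probes=1000):
    # Empirical maximum deviation of the interpolant from the true solution.
    f = interpolant(a, x0, T, delta)
    return max(abs(f(t) - x0 * math.exp(a * t))
               for t in (i * T / probes for i in range(probes + 1)))

coarse = max_error(-1.0, 1.0, 1.0, 0.1)
fine = max_error(-1.0, 1.0, 1.0, 0.01)
print(coarse > fine)  # True: a smaller sampling interval tightens the bound
```

The paper's contribution is the converse direction: given a target ϵ, compute a δ guaranteeing the interpolant stays within ϵ, uniformly over all parameters in Ω.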
Dragan Ivanovic, Post-doctoral Researcher, IMDEA Software Institute
In this talk I'll try to practically demonstrate how some of the key features of Prolog-style logic programming can be easily implemented as a user-defined syntactic and semantic extension of a functional programming language -- in this case F# with its computational expressions. As a result, functional-style developers can benefit from automatic backtracking, non-determinism, and cuts when needed, without sacrificing other advantages of the host language/platform.
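As a rough analogue in another host language (Python rather than F#; this sketch is ours, not the talk's material): generators already give nondeterminism and backtracking for free, much as computation expressions do in F#, with an early return playing the role of a cut.

```python
def member(xs):
    # Nondeterministic choice point: each yield is one solution.
    for x in xs:
        yield x

def pairs_summing_to(xs, total):
    # Nested choice points; exhausting an inner generator backtracks
    # to the next choice of the outer one.
    for a in member(xs):
        for b in member(xs):
            if a + b == total:
                yield (a, b)

print(list(pairs_summing_to([1, 2, 3], 4)))  # [(1, 3), (2, 2), (3, 1)]

def first_pair(xs, total):
    # Committing to the first solution: the analogue of a cut.
    for p in pairs_summing_to(xs, total):
        return p

print(first_pair([1, 2, 3], 4))  # (1, 3)
```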