IMDEA initiative

Home > Events > Invited Talks

Invited Talks

Veronica Dahl

Monday, November 04, 2019

3:00pm Meeting room 302 (Mountain View), level 3

Veronica Dahl, Research Professor, Simon Fraser University, Canada

AI for Social Responsibility: Embedding principled guidelines into AI systems

Abstract:

In this position talk we briefly retrace the historic and evolutionary context that led to AI's results not necessarily being used first and foremost to benefit the public that funded it, nor to necessarily focus on human values and concerns. Next, we discuss how the AI language Constraint Handling Rules -CHR- can promote social responsibility by making it easy to embed principled guidelines into our systems, and we exemplify this idea within an application to enhance voting and decision-making power. Finally, we examine the very notion of intelligence in the light of the more recent notion of group intelligence, and draw consequences on what might be needed to ensure that AI capabilities are put to socially responsible uses only. In particular, we identify what legislations might help place AI at the service of the urgently needed solutions for today's various crises, with the overall aim, as K. Raworth put it, to "meet the needs of all within the means of the planet".


Time and place:
3:00pm Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Bernardo David

Wednesday, October 16, 2019

10:45am Meeting room 302 (Mountain View), level 3

Bernardo David, Associate Research Professor, IT University of Copenhagen, Denmark

Efficient Privacy Preserving Computation meets Blockchains

Abstract:

Multiparty Computation (MPC) protocols allow a set of mutually distrustful parties to compute a program without revealing their private inputs. It has been suggested that MPC can be combined with blockchain systems to achieve two goals: 1. Determine cash distribution according to private inputs; 2. Improve fairness of MPC protocols through financial punishments for misbehaving parties. In this talk, we will present an approach to constructing general purpose MPC protocols that can be efficiently combined with blockchain systems and distributed applications, such as gambling, distributed cryptocurrency exchanges and privacy preserving smart contracts. First, we will present a general approach for combining MPC protocols with public verifiability and cheater identification protocols with blockchain based smart contracts in order to achieve the two goals above. Next we will present a new compiler for achieving public verifiability and cheater identification from MPC protocols with certain simpler properties properties given publicly verifiable oblivious transfer and homomorphic commitments. Finally, we will briefly describe novel constructions of such oblivious transfer and commitment protocols that achieve high concrete efficiency. Based joint work with Carsten Baum, Ignacio Cascudo, Ivan Damgård, Rafael Dowsley, Nico Döttling and Irene Giacomelli.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Yu-yang Lin

Tuesday, October 15, 2019

10:45am Meeting room 302 (Mountain View), level 3

Yu-yang Lin, PhD Student, Queen Mary, London University

A Bounded Model Checking Technique for Higher-Order Programs

Abstract:

We present a Bounded Model Checking technique for higher-order programs based on defunctionalization and points-to analysis. The vehicle of our study is a higher-order calculus with general references. Our technique is a symbolic state syntactical translation based on SMT solvers, adapted to a setting where the values passed and stored during computation can be functions of arbitrary order. We prove that our algorithm is sound and provide a prototype implementation with experimental results showcasing its performance. The first novelty of our technique is a presentation of defunctionalization using nominal techniques, which provides a theoretical background to proving soundness of our technique, coupled with SSA adapted to higher-order values. The second novelty is our use of defunctionalization and points-to analysis to directly encode general higher-order functional programs.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Alejandro Ranchal-Pedrosa

Monday, October 14, 2019

10:45am Meeting room 302 (Mountain View), level 3

Alejandro Ranchal-Pedrosa, PhD Student, University of Sydney, Australia

Platypus: Offchain Protocols without Synchrony

Abstract:

Offchain protocols aim at bypassing the scalability and privacy limitations of classic blockchains by allowing a subset of participants to execute multiple transactions outside the blockchain. While existing solutions like payment networks and factories depend on a complex routing protocol, other solutions simply require participants to build a childchain, a secondary blockchain where their transactions are privately executed. Unfortunately, all childchain solutions assume either synchrony or a trusted execution environment. In this paper we present Platypus, an offchain protocol that requires neither synchony nor a trusted execution environment. Relieving the need for a trusted execution environment allows Platypus to ensure privacy without trusting a central authority, like Intel, that manufactures dedicated hardware chipset, like SGX. Relieving the need for synchrony means that no attacker can steal coins by leveraging clock drifts or message delays to lure timelocks. In order to prove our algorithm correct, we formalize the chilchain problem as a Byzantine variant of the classic Atomic Commit problem, where closing an offchain protocol is equivalent to committing the whole set of payments previously recorded on the childchain ``atomically'' on the main chain. Platypus is resilience optimal and we explain how to generalize it to crosschain payments.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Christian Roldán

Wednesday, October 02, 2019

10:45am Meeting room 302 (Mountain View), level 3

Christian Roldán, PhD Student, University of Buenos Aires, Argentina

About semantics of replicated data stores

Abstract:

Replicated data stores provide highly available, low-latency access to data at the expense of consistency, i.e., observers can perceive differences in the state of a system. Storages adopt different strategies to cope with temporary inconsistencies and resolve conflicts among concurrent updates, such as Replicated data types (RDTs). The selected solution impacts on the properties ensured by a store, i.e., on the kind of inconsistencies or anomalies that are allowed to happened and observed by applications. Such anomalies are characterised in terms of consistency models (such as read-your-write and causal consistency) defined axiomatically by constraining the execution traces of a data store. In this talk, we review the semantics of replicated data stores, moving from the operational approaches for RDTs to a purely functional description. We comment how this approach enables a categorical presentation. In addition, we propose an alternative characterisation for consistency models, offering an operational model that is general enough for capturing some of the well-known consistency model.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Gregory Chockler

Tuesday, September 10, 2019

10:45am Meeting room 302 (Mountain View), level 3

Gregory Chockler, Research Professor, Royal Holloway, University of London (RHUL),

Atomic Transaction Commit for Modern Data Stores

Abstract:

Transaction commit protocols play a pivotal role in supporting scalability and availability guarantees of today's large-scale transactional databases. Their theoretical properties are traditionally captured and analysed through Atomic Commitment Problem (ACP), introduced by Gray in the early 70s. Roughly, ACP is concerned with ensuring all-or-nothing (atomicity) semantics for transactions (the 'A' in the famous ACID acronym). It is formulated as a one-shot agreement problem in which a single COMMIT or ABORT decision must be output for a given transaction depending on the COMMIT or ABORT votes provided by a collection of fail-prone sites holding the data objects involved in the transaction. We argue that ACP is too simplistic to adequately capture complexities of transaction commit in modern transactional data stores. In particular, its one-shot nature ignores the fact that a decision to commit or abort an individual transaction is not taken in isolation, but rather influenced by how conflicts with other simultaneously executing transactions are handled by the concurrency control mechanism, which ensures the isolation of transactions - 'I' in ACID. The lack of a formal framework capturing such intricate interdependencies between mechanisms for atomicity and isolation in real-world databases makes it difficult to understand them and prove correct. We address this by proposing a new problem, called Transaction Certification Service (TCS), that formalises how the classical Two-Phase Commit (2PC) protocol interacts with concurrency control in many practical transaction processing systems in a way parametric in the isolation level provided. In contrast to ACP, TCS is a multi-shot problem in the sense that the outcome of a transaction must be justified by the entire history of the past transactions rather than a single set of input votes. We then use TCS to identify core algorithmic insights underlying transaction commit in several widely used transactional data stores (such as Google Spanner), and leverages these to develop a new fault-tolerant multi-shot transaction commit protocol. This protocol is theoretically optimal in terms of its time complexity, and can serve as a reference implementation for future systems. Joint work with Alexey Gotsman, IMDEA Software Institute, Spain.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Yotam Feldman

Wednesday, September 04, 2019

10:45am Meeting room 302 (Mountain View), level 3

Yotam Feldman, PhD Student, Tel Aviv University, Israel

Inferring Inductive Invariants from Phase Structures

Abstract:

Infinite-state systems such as distributed protocols are challenging to verify using interactive theorem provers or automatic verification tools. Of these techniques, deductive verification is highly expressive but requires the user to annotate the system with inductive invariants. To relieve the user from this labor-intensive and challenging task, invariant inference aims to find inductive invariants automatically. Unfortunately, when applied to infinite-state systems such as distributed protocols, existing inference techniques often diverge, which limits their applicability. This paper proposes user-guided invariant inference based on phase invariants, which capture the different logical phases of the protocol. Users conveys their intuition by specifying a phase structure, an automaton with edges labeled by program transitions; the tool automatically infers assertions that hold in the automaton's states, resulting in a full safety proof.The additional structure from phases guides the inference procedure towards finding an invariant. Our results show that user guidance by phase structures facilitates successful inference beyond the state of the art. We find that phase structures are pleasantly well matched to the intuitive reasoning routinely used by domain experts to understand why distributed protocols are correct, so that providing a phase structure reuses this existing intuition. Joint work with James R. Wilcox, Sharon Shoham, and Mooly Sagiv, that has appeared in CAV'19.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Yotam Feldman

Tuesday, September 03, 2019

10:45am Meeting room 302 (Mountain View), level 3

Yotam Feldman, PhD Student, Tel Aviv University, Israel

Order out of Chaos: Proving Linearizability Using Local Views

Abstract:

Proving the linearizability of highly concurrent data structures, such as those using optimistic concurrency control, is a challenging task. The main difficulty is in reasoning about the view of the memory obtained by the threads, because as they execute, threads observe different fragments of memory from different points in time. Until today, every linearizability proof has tackled this challenge from scratch. We present a unifying proof argument for the correctness of unsynchronized traversals, and apply it to prove the linearizability of several highly concurrent search data structures, including an optimistic self-balancing binary search tree, the Lazy List and a lock-free skip list. Our framework harnesses sequential reasoning about the view of a thread, considering the thread as if it traverses the data structure without interference from other operations. Our key contribution is showing that properties of reachability along search paths can be deduced for concurrent traversals from such interference-free traversals, when certain intuitive conditions are met. Basing the correctness of traversals on such local view arguments greatly simplifies linearizability proofs. To apply our framework, the user proves that the data structure satisfies two conditions: (1) acyclicity of the order on memory, even when it is considered across intermediate memory states, and (2) preservation of search paths to locations modified by interfering writes. Establishing the conditions, as well as the full linearizability proof utilizing our proof argument, reduces to simple concurrent reasoning. The result is a clear and comprehensible correctness proof, and elucidates common patterns underlying several existing data structures. Joint work with Constantin Enea, Adam Morrison, Noam Rinetzky, and Sharon Shoham, that has appeared in DISC'18.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Antonio Nappa

Tuesday, July 02, 2019

10:45am Meeting room 302 (Mountain View), level 3

Antonio Nappa, Researcher, Brave Software

Trusted Hardware: The Good, The Bad, The Ugly

Abstract:

Trusted hardware is one of the most complex and desired components of modern computers. For example, almost all mobile phones are equipped with a TEE (Trusted-Execution-Environment) as well as modern x86 computers (SGX). Apparently, (almost) no one is claiming to use such technology in production, because it looks like its foundations are still too shaky. An example of problems that can undermine trusted hardware are vulnerabilities like Foreshadow on Intel platforms or CVE-2018-14491 on Qualcomm based devices (phones, tablets). These architectures, on one hand offer unique functionalities, on the other hand their programming model is very convoluted, making this technology both a blessing and a curse. In this talk we will show what makes secure computation a unique tool for research and industry by describing existing systems which leverage such technology. We will engage a discussion about their difficult programming model and possible mitigations to hardware vulnerabilities which, by definition can undermine all the security premises of such architectures.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Mooly Sagiv

Wednesday, June 12, 2019

10:45am Meeting room 302 (Mountain View), level 3

Mooly Sagiv, Research Professor, Tel Aviv University, Israel

Deductive verification of distributed protocols in first-order logic

Abstract:

Distributed protocols such as Paxos play an important role in many computer systems. Therefore, a bug in a distributed protocol may have tremendous effects. Accordingly, a lot of effort has been invested in verifying such protocols. However, due to the infinite state space (e.g., unbounded number of nodes, messages) and protocols complexity, verification is both undecidable and hard in practice. I will describe a deductive approach for verification of distributed protocols, based on decidable fragments of first-order logic, inductive invariants and user interaction. The use of decidable fragments of first-order logic allows to completely automate some verification tasks. Tasks that remain undecidable (e.g. finding inductive invariants) are solved with user interaction, based on graphically displayed counterexamples. I will also describe the application of these techniques to verify safety of several variants of Paxos, and a way to extend the approach to verify liveness and temporal properties. This work is part of a Ph.D. by Oded Padon https://www.cs.tau.ac.il/~odedp/. The ideas were integrated into Ivy https://github.com/microsoft/ivy. It is joint work with Giuliano Losa, Jochen Hoenicke, Ken McMillan, Aurojit Panda, Sharon Shoham, Marcelo Taube, James R. Wilcox, and Doug Woos.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


František Farka

Tuesday, April 30, 2019

10:45am Meeting room 302 (Mountain View), level 3

František Farka, PhD Student, Heriot-Watt University, United Kingdom

Proof-Relevant Resolution: The Foundations of Constructive Automation

Abstract:

In this talk, we introduce proof-relevant resolution, a framework for constructive proof automation. The intended application of the framework is verifiable proof automation in strongly typed programming languages. We motivate the framework by two use-cases that show its strengths. First, we show a proof-relevant approach to type inference and term synthesis. Secondly, we demonstrate the use of the framework for the purpose of study semantical properties of programming languages, namely soundness of type-class elaboration. In the talk, we describe the key features of big-step and small-step operational semantics and show soundness of the small-step w.r.t. the big-step semantics. We briefly outline the proof as it requires a use of logical relation. We conclude the talk by a discussion of future applications and extensions of the framework.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Ignacio Cascudo

Friday, April 26, 2019

10:45am Meeting room 302 (Mountain View), level 3

Ignacio Cascudo, Assistant Research Professor, Aalborg University, Denmark

Some developments in secure multiparty computation for binary circuits

Abstract:

Secure multiparty computation studies deals with privacy-preserving computation, where several parties, some of them holding private inputs, want to collaborate in computing some function depending on them while at the same time avoiding to reveal more information about these inputs that it is necessary to perform the computation. Some of the most popular ways of performing secure computation involve the notion of secret sharing. The problem however is that secret-sharing based secure computation protocols require to represent the function to be computed as a arithmetic circuit over a large finite field, which for certain functions is not the most natural representation. In this talk, I will present some of my recent research that it is oriented to adapt some of these protocols to compute boolean circuits, in a way that the amortized communication complexity is still almost as good as the one for the original protocols.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Klaus von Gleissenthall

Thursday, April 25, 2019

10:45am Meeting room 202 (Hill View), level 2

Klaus von Gleissenthall, Post-doctoral Researcher, UC San Diego

Pretend Synchrony: Synchronous Verification of Asynchronous Distributed Programs

Abstract:

In this talk, I will present pretend synchrony, a new approach to verifying distributed systems, based on the observation that while distributed programs must execute asynchronously, we can often soundly treat them as if they were synchronous, when verifying their correctness. To do so, we compute a synchronization, a semantically equivalent program where all sends, receives, and message buffers, have been replaced by simple assignments, yielding a program that can be verified using Floyd-Hoare style verification conditions and SMT. We have implement our approach as a framework for writing verified distributed programs in Go. Pretend synchrony allows us to develop performant systems while making verification of functional correctness simpler by reducing manually specified invariants by a factor of 6, and faster, by reducing checking time by three orders of magnitude.


Time and place:
10:45am Meeting room 202 (Hill View), level 2
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Marco Guarnieri

Wednesday, April 24, 2019

10:45am Meeting room 302 (Mountain View), level 3

Marco Guarnieri, Post-doctoral Researcher, IMDEA Software Institute

Principled detection of speculative information flows

Abstract:

Modern CPUs employ speculative execution to avoid expensive pipeline stalls by predicting the outcome of branching (and other) decisions and speculatively executing the corresponding instructions. Attackers can exploit speculative execution's side effects to leak sensitive information and violate confidentiality. The family of Spectre attacks demonstrates that this vulnerability affects all modern general-purpose CPUs and poses a serious threat against platforms with multiple tenants. Since the advent of Spectre, a number of countermeasures have been proposed and deployed. Rigorously reasoning about their effectiveness, however, requires a well-defined notion of security against speculative execution attacks, which has been missing until now. In this talk, I will present a novel, principled approach for reasoning about software defenses against Spectre-style attacks. My approach builds on speculative non-interference, the first semantic notion of security against speculative execution attacks. I will also present Spectector, an algorithm based on symbolic execution for automatically proving speculative non-interference, or detecting violations. I will show how Spectector can be used to detect subtle leaks – and optimizations opportunities – in the way major compilers place Spectre countermeasures.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Joao Marques Silva

Tuesday, April 23, 2019

10:45am Meeting room 202 (Hill View), level 2

Joao Marques Silva, Research Professor, Universidade de Lisboa

Logic-Enabled Explanations for Machine Learning Models

Abstract:

The practical successes of Machine Learning (ML) in different settings motivates the ability of computing small explanations for predictions made. Small explanations are generally accepted as easier for human decision makers to understand. Existing work on computing explanations is based on heuristic approaches, providing no guarantees of quality, in terms of how close such solutions are from cardinality- or subset-minimal explanations. This talk describes a novel constraint-agnostic approach for computing explanations for any Machine Learning (ML) model. The proposed solution exploits abductive reasoning, and imposes the requirement that the ML model be represented as sets of constraints using some target constraint reasoning system for which the decision problem can be answered with some oracle. The talk also compares the new logic-enabled approach for computing explanations with existing heuristic approaches on well-known datasets.


Time and place:
10:45am Meeting room 202 (Hill View), level 2
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Alessandra Gorla

Monday, April 15, 2019

3:15pm Meeting room 302 (Mountain View), level 3

Alessandra Gorla, Assistant Research Professor, IMDEA Software Institute

Using Natural Language Processing to Improve Reliability and Trustworthiness of Software Systems

Abstract:

Many artifacts that software developers produce are written in natural language: code comments, commit messages, text in user interfaces, privacy policies, high-level description of the system, and so on. In this talk I will present my research work on using natural language processing (nlp) techniques to analyze such artifacts. We infer useful information such as pre- and post-conditions of procedures, and we use it to automatically generate test oracles. In this talk, I will give an overview of my recent contributions on using nlp to make software systems more reliable and trustworthy, together with my plans for the future.


Time and place:
3:15pm Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Maria Schett

Wednesday, April 10, 2019

10:45am Meeting room 302 (Mountain View), level 3

Maria Schett, PhD Student, University College London, United Kingdom

Blockchain Superoptimizer

Abstract:

Etherum smart contracts written in higher level programming languages like Solidity or Viper are compiled to bytecode, which is executed by the Ethereum Virtual Machine (EVM). Any instruction executed on the EVM has a monetary cost.

In this talk, I present our tool that automatically optimizes EVM bytecode. Our EVM Bytecode SuperOptimizer "ebso" implements (unbounded) superoptimization, which relies on an SMT solver to guarantee correctness of the transformation. We then perform a large scale evaluation of ebso on contracts deployed on the Etherum Blockchain.

This is joint work by Julian Nagele and Maria A Schett.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Kenji Maillard

Wednesday, April 10, 2019

2:45pm Meeting room 302 (Mountain View), level 3

Kenji Maillard, PhD Student, INRIA Paris, France

Designing Dijkstra Monads

Abstract:

Verifying a program consist of proving that a given program meets its specification. Various frameworks have been studied to provide such specifications, for instance in Hoare logic programs are specified with pairs of pre/post-conditions. When dealing with programs mixing a variety of side-effects such as mutability, divergence, raising exceptions, or non-determinism, providing a framework for specifying those programs becomes challenging. Computational monads are a convenient algebraic gadget to uniformly represent a wide class of such side effects in programming languages. Inspired by Dijkstra's work, we want to use predicate transformers to verify monadic programs. We will explain how these predicate transformers can themselves be uniformly represented as a monad, leading to a powerful verification tool called Dijkstra monads used heavily in the F* programming language.


Time and place:
2:45pm Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Marko Vukolić

Monday, April 08, 2019

10:45am Meeting room 302 (Mountain View), level 3

Marko Vukolić, Researcher, IBM Research - Zurich

Hyperledger Fabric: a Distributed Operating System for Permissioned Blockchains

Abstract:

Fabric is a modular and extensible open-source system for deploying and operating permissioned blockchains and one of the Hyperledger projects hosted by the Linux Foundation (www.hyperledger.org). Fabric supports modular consensus protocols, which allows the system to be tailored to particular use cases and trust models. Fabric is also the first blockchain system that runs distributed applications written in standard, general-purpose programming languages, without systemic dependency on a native cryptocurrency. This stands in sharp contrast to existing blockchain platforms that require "smart-contracts" to be written in domain-specific languages or rely on a cryptocurrency. Fabric realizes the permissioned model using a portable notion of membership, which may be integrated with industry-standard identity management. To support such flexibility, Fabric introduces an entirely novel blockchain design and revamps the way blockchains cope with non-determinism, resource exhaustion, and performance attacks. Although not yet performance-optimized, Fabric achieves, in certain popular deployment configurations, end-to-end throughput of more than 3500 transactions per second (of a Bitcoin-inspired digital currency), with sub-second latency, scaling well to over 100 peers. In this talk we discuss Hyperledger Fabric architecture, detailing the rationale behind various design decisions. We also briefly discuss distributed ledger technology (DLT) use cases to which Hyperledger Fabric is relevant, including financial industry, manufacturing industry, supply chain management, government use cases and many more.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Ingo Mueller

Friday, April 05, 2019

10:45am Meeting room 302 (Mountain View), level 3

Ingo Mueller, Post-doctoral Researcher, ETH Zurich, Switzerland

The State of the Art of Data Analytics Systems and What is Wrong about it

Abstract:

Few technological advances have affected as many aspects of science, economy, and society in general as the ability to collect, analyze, and understand large amounts of data. Data analytics systems play an important role in this development as they translate the exponential performance improvements made by hardware into similar improvements at higher abstraction levels. As one example, I will present a thorough study of a core database primitive, grouping with aggregation, done in the context of a commercial system for relational in-memory processing. For this primitive alone, we had to address a number of challenges: (provable) cache-efficiency, CPU-friendliness, parallelism within and across processors, robust handling of skewed data, adaptive processing, processing with constrained memory, and integration with modern database architectures. I argue that this approach corresponds to the state of the art of system building: Today's systems typically implement one analysis/platform combination, requiring data scientists to constantly switch tools and duplicating implementation effort of systems and their applications. Still, they are all very similar on a conceptual level, suggesting that we have not fundamentally understood what makes up the essence of our systems. I will thus sketch my vision and research theme for the foreseeable future: a common abstraction for a large span of types of data analytics that can run efficiently on a variety of hardware platforms.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Joseph Izraelevitz

Thursday, April 04, 2019

10:45am Meeting room 302 (Mountain View), level 3

Joseph Izraelevitz, Post-doctoral Researcher, UC San Diego

Practical and Formal Infrastructure for Nonvolatile Memory

Abstract:

For decades, programmers have interacted with persistent storage via a well-defined block-based API, namely, that of the file system. However, it is expected that byte-addressable nonvolatile memory (NVM) will soon be commonplace, potentially transforming main memory into a storage device. This transition forces fundamental changes in how to reason about and manage persistent storage. The possibility that programmers may wish for data in main memory to survive program runs and even system crashes suggests the need to reassess design decisions at all levels of the system stack. This talk will discuss how system architecture and formal foundations are changing to meet the demands of NVM at all scales. At a formal level, we explore what it means for a program to be "correct" on a machine with NVM. At the hardware level, extensions of memory consistency control the ordering and timing of writes into persistence. At the library level, data structures are transformed to be consistent not only in the face of concurrent access, but also in the wake of crashes. Finally, at the compiler and language level, extensions to transactional memory give programmers the ability to ensure that complex changes to persistent state are not only isolated and consistent, but also failure atomic and durable. We conclude by discussing the second-order effects of these changes and how we expect the software ecosystem will mutate after NVM has become deeply integrated into the stack.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Andreas Pavlogiannis

Tuesday, March 26, 2019

10:45am Lecture hall 1, level B

Andreas Pavlogiannis, Post-doctoral Researcher, EPFL, Switzerland

Algorithmic Advances in Automated Program Analysis

Abstract:

Modern-day software is increasingly complex and software engineering is commonly accepted as a challenging, error-prone task. Consequently, software bugs are prevalent, incurring detrimental financial costs and often risking human lives. Program analyses provide rigorous and effective means to automated bug detection, and are becoming an integral part of the software development process. In this talk, I will present recent algorithmic advances in two impactful domains of program analysis: (i) static analysis and (ii) stateless model checking. Static analyses are lightweight approximations to program behavior and largely constitute the first step to bug detection in the software development process. In the first part, I will introduce the Algebraic Path Problem on Recursive Graphs (APP) and its close variant of Dyck Reachability (DR). These problems serve as algorithmic formulations of static analyses on various domains, such as dataflow/alias/data-dependence analysis and program slicing. I will present new algorithms for APP and DR based on the graph-theoretic notion of treewidth, that lead to theoretical complexity improvements and provide a significant scalability boost in practice. Stateless model checking is one of the most influential techniques in the analysis of concurrent programs because of its effectiveness in dealing with the state-explosion problem. In the second part, I will present Dynamic Partial Order Reduction (DPOR), a widespread stateless technique for model checking concurrent programs. I will outline some foundational limitations to existing DPOR approaches, and introduce a new abstraction of concurrent semantics, called the Observation Equivalence (OE), which overcomes some of these limitations. I will demonstrate algorithmic applications of OE in two domains: (i) stateless model checking and (ii) dynamic race detection.


Time and place:
10:45am Lecture hall 1, level B
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Paolo Giarrusso

Friday, March 22, 2019

10:45am Lecture hall 1, level B

Paolo Giarrusso, Post-doctoral Researcher, EPFL, Switzerland

Towards Semantic Type Soundness for Dependent Object Types and Scala with Logical Relations in Iris

Abstract:

The metatheory of the Scala core type system (DOT), first established recently, is still hard to extend, like other systems combining subtyping and forms of dependent types. Soundness of important Scala features remains an open problem in theory and in practice. To obtain a more extensible metatheory, in ongoing work we prove in Coq soundness of DOT semantically. This is challenging, because in DOT variables can be in scope in their own type, objects can define mutually recursive and impredicative type members, and can contain evidence of subtyping relations that must be sound. We address these challenges by novel applications of step-indexing, as supported by the Iris logic. Our proof clarifies difficulties in past proofs; ongoing work on extensions, such as paths, appears encouraging.


Time and place:
10:45am Lecture hall 1, level B
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Rahul Chatterjee

Thursday, March 21, 2019

10:45am Lecture hall 1, level B

Rahul Chatterjee, PhD Student, Cornell University, New York, USA

Empiricism-Informed Secure System Design: From Improving Passwords to Helping Domestic Violence Victims

Abstract:

Security often fails in practice due to a lack of understanding of the nuances in real-world systems. For example, users choose weak passwords to deal with the several usability issues with passwords, which in turn degrades the security of passwords. I will talk about how we can build better security mechanisms by combining methodical empiricism with analytical frameworks. First, in the context of passwords, I will show how to improve the usability of passwords by allowing users to log in with typos in their passwords. I will detail in the talk how to do so without giving attackers any additional advantage to impersonate a user. In the second part of my talk, I will talk about my recent research direction on how traditional authentication mechanisms fail to properly model digital attacks by domestic abusers, and therefore are ineffective for victims. As a result, abusers can spy on, stalk, or harass victims using seemingly innocuous apps and technologies. I will finish with some recent progress that I have made in helping victims of tech abuse, and provide some future research directions.


Time and place:
10:45am Lecture hall 1, level B
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Martin A.T. Handley

Tuesday, March 19, 2019

10:45am Lecture hall 1, level B

Martin A.T. Handley, PhD Student, University of Nottingham, United Kingdom

Liquidate your assets: reasoning about resource usage in Liquid Haskell

Abstract:

Liquid Haskell is an extension to the type system of Haskell that supports formal reasoning about program correctness by encoding logical properties as refinement types. In this interactive talk, I demonstrate how Liquid Haskell can also be used to reason about program efficiency in the same setting, with the system’s existing verification machinery being used to ensure that the results are both meaningful and precise. My experience is that reasoning about efficiency in Liquid Haskell is often just as simple as reasoning about correctness, and even that the two can coincide. I hope to convince you of the same!


Time and place:
10:45am Lecture hall 1, level B
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Eduardo Bezerra

Thursday, March 14, 2019

10:45am Meeting room 202 (Hill View), level 2

Eduardo Bezerra, Software Engineer, Amazon Madrid

Strong Consistency at Scale

Abstract:

Today’s online services must meet strict availability and performance requirements. State machine replication, one of the most fundamental approaches to increasing the availability of services without sacrificing strong consistency, provides configurable availability but limited performance scalability. Scalable State Machine Replication (S-SMR) achieves scalable performance by partitioning the service state and coordinating the ordering and execution of commands. While S-SMR scales the performance of single-partition commands with the number of deployed partitions, it relies on atomic multicast and can present higher latency than traditional SMR, specially when executing multi-partition commands. In this talk, Eduardo Bezerra will revisit Scalable State Machine Replication and Ridge, a high-throughput, low-latency multicast algorithm that mitigates that issue.


Time and place:
10:45am Meeting room 202 (Hill View), level 2
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Michael D. Adams

Monday, January 21, 2019

10:45am Meeting room 302 (Mountain View), level 3

Michael D. Adams, Software Engineer, The University of Utah, USA

Advances in Parsing

Abstract:

Parsing is sometimes thought of as a solved problem. However, recent advances show that there is still much to be discovered in this area. This talk reviews two such advances: indentation-sensitive parsing and disambiguation via tree automata. Several popular languages, such as Haskell, Python, and F#, use the indentation and layout of code as part of their syntax. Parsers for these languages currently use ad hoc techniques to handle layout. This talk presents a simple extension to context-free grammars that can express these layout rules, and derives GLR and LR(k) algorithms for parsing these grammars. Dealing with precedence and associativity rules in context free grammars (CFGs) has long posed a challenge. While they can be encoded in the structure of the CFG, the result can be difficult to work with and often obfuscates the language designer’s intent. This talk shows how tree automata can specify all of these while still allowing the CFG to be written in a natural manner. This unified theory subsumes and generalizes these other techniques and provides an elegant means of specifying grammatical restrictions. These advances indicate that there is still much to be discovered about parsing. We have only uncovered the tip of the iceberg


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Miguel Á.  Carreira-Perpiñán

Friday, January 18, 2019

10:45am Meeting room 302 (Mountain View), level 3

Miguel Á. Carreira-Perpiñán, Professor, University of California at Merced, USA

A new way to train decision trees: tree alternating optimization (TAO)

Abstract:

Decision trees with hard decision nodes stand apart from other machine learning models in their interpretability, fast inference and automatic feature selection -- in addition to their ability to handle naturally nonlinear data, multiclass classification and regression, discontinuous predictor functions, and discrete input features. This has made decision trees widespread in certain practical applications for decades, in spite of the fact that their predictive accuracy is generally considered inferior to that of other models. In fact, this perceived lower accuracy is greatly due to the lack of a good learning algorithm. Learning a decision tree from data is a difficult optimization problem because of the search over discrete tree structures and the lack of gradient information over the node parameters. The most widespread tree learning algorithms in practice (CART, C4.5 and variations of them) date back to the 1980s and are based on greedily growing the tree structure and setting the node parameters via a heuristic criterion. We present a new algorithm, tree alternating optimization (TAO), that makes it practical for the first time to learn decision trees with a guarantee that the classification error is systematically reduced. Starting with a given tree structure and node parameters, TAO iteratively updates the parameters so that the resulting tree has the same or a smaller structure and provably reduces or leaves unchanged the classification error. TAO can be applied to both axis-aligned and oblique trees and our experiments show it consistently outperforms various other algorithms while being highly scalable to large datasets and trees. Further, TAO can handle a sparsity penalty, so it can learn "sparse oblique trees", having few nonzero parameters at each node. This combines the best of axis-aligned and oblique trees: flexibility to model correlated data, low generalization error, fast inference, smaller trees and interpretable nodes that involve only a few features in their decision. This makes decision trees attractive even in tasks where they were traditionally unsuitable, such as MNIST handwritten digit classification. This is joint work with my PhD students Pooya Tavallali and Arman Zharmagambetov.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Julian Thomé

Friday, January 11, 2019

10:45am Meeting room 302 (Mountain View), level 3

Julian Thomé, Junior researcher, ShiftLeft

Automated Security Code Analysis Made Easy

Abstract:

Security auditing, i.e., the examination of the source code for the purpose of detecting vulnerabilities, helps to detect vulnerabilities during the early phases of software development. When performed manually, this task can be laborious, error-prone and does not scale to large software systems. Over the course of the last years, a lot of research has been done with regard to approaches in the areas of Static Analysis, Symbolic Execution and Constraint Solving which aim to make security auditing more effective and cost-efficient. In this presentation, we will see how we at ShiftLeft automate security auditing by using a pragmatic approach, i.e., the combination of techniques proposed by the research community and security expert knowledge, which allows us to support different languages/frameworks and scale to large software systems. We will also see some examples with a live demonstration of our security auditing tool Ocular.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Invited Talks - 2018