IMDEA Software

Dejan Kostic

Friday, October 7, 2011

11:00am, IMDEA conference room

Dejan Kostic, Assistant Professor, Ecole Polytechnique Federale de Lausanne, Switzerland

Online Testing of Deployed Federated and Heterogeneous Distributed Systems

Abstract:

It is notoriously difficult to make distributed systems reliable. It is harder still for widely deployed systems that are heterogeneous (multiple implementations) and federated (multiple administrative entities). The set of routers in charge of the Internet’s inter-domain routing is a prime example of such a system.

We argue that a key step in making these systems reliable is to automatically explore system behavior and check for potential faults. In this talk, I will describe the design and implementation of DiCE, a system for online testing of heterogeneous and federated distributed systems. DiCE runs concurrently with the production system by leveraging distributed checkpoints and isolated communication channels. DiCE orchestrates the exploration of relevant system states by controlling the inputs that drive system actions. While respecting privacy among different administrative entities, DiCE detects faults by checking for violations of properties that capture the desired system behavior. We demonstrate the ease of integrating DiCE with a BGP router and a DNS server, the building blocks of two vital Internet services. Our testbed evaluation shows that DiCE quickly and successfully detects three important classes of faults: those resulting from configuration mistakes, policy conflicts, and programming errors.
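To make the checkpoint, explore, and check cycle described above more concrete, here is a minimal single-node sketch. It is not DiCE's implementation or API: the ToyRouter class, the explore function, the property, and the sample announcements are all hypothetical and chosen only for illustration. A real system would use distributed checkpoints and isolated communication channels rather than an in-process copy.

```python
import copy


class ToyRouter:
    """Hypothetical stand-in for a deployed node (not a real BGP router API)."""

    def __init__(self, name, owned_prefixes):
        self.name = name
        self.owned = set(owned_prefixes)
        # prefix -> next hop; owned prefixes are routed locally
        self.routes = {p: name for p in owned_prefixes}

    def handle_announcement(self, prefix, next_hop):
        # Deliberately buggy for the example: no filtering is applied,
        # so a neighbor can override a locally owned prefix.
        self.routes[prefix] = next_hop


def owned_prefixes_stay_local(router):
    """Property: every owned prefix is still routed to the router itself."""
    return all(router.routes.get(p) == router.name for p in router.owned)


def explore(live_router, candidate_inputs, prop):
    """Replay each input on a copy of the state; return inputs that violate prop."""
    violations = []
    for prefix, next_hop in candidate_inputs:
        # Checkpoint (assumed here to be a deep copy): the live state is untouched.
        clone = copy.deepcopy(live_router)
        clone.handle_announcement(prefix, next_hop)
        if not prop(clone):
            violations.append((prefix, next_hop))
    return violations


router = ToyRouter("AS1", ["10.0.0.0/8"])
inputs = [("192.168.0.0/16", "AS2"), ("10.0.0.0/8", "AS3")]
found = explore(router, inputs, owned_prefixes_stay_local)
print(found)  # [('10.0.0.0/8', 'AS3')]
```

In this toy setting, the second announcement is flagged because it overrides a prefix the router owns, while the live router's table is left unchanged because every input runs against a copy.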

Joint work with Marco Canini, Vojin Jovanovic, Daniele Venzano, Gautam Kumar, Dejan Novakovic, Boris Spasojevic, and Olivier Crameri.