Invited Talks

Samer Hassan

Wednesday, March 14, 2018

10:45am Meeting room 302 (Mountain View), level 3

Samer Hassan, Associate Research Professor, Berkman Klein Center at Harvard University & Universidad Complutense de Madrid

Decentralized Blockchain-based Organizations for Bootstrapping the Collaborative Economy

Abstract:

Today's Collaborative Economy faces three challenges: 1. It relies on centralized hubs whose business models depend on monetizing the massive collection of personal data. 2. There is a large power imbalance between the owners of the infrastructure and the user communities, with decision-making concentrated in the former. 3. The economic profits derived from the communities' activity are likewise concentrated in the owners of the infrastructure. Can we build platforms that are decentralized, democratic, and that distribute their profits? In this talk, I will present P2P Models (p2pmodels.eu), a new ERC 1.5M€ research project to build Blockchain-powered organizations which are decentralized, democratic and distribute their profits, in order to boost a new type of Collaborative Economy. The project has three legs: 1. Infrastructure: to provide a software framework for building decentralized infrastructure for Collaborative Economy organizations, with building blocks for agent-mediated "Decentralized Autonomous Organizations" (DAOs). 2. Governance: to enable democratic-by-design models of governance for communities, whose rules are, at least partially, encoded in the software (the "smart contracts" of those DAOs) to ensure higher levels of equality. 3. Economics: to enable value distribution models which are interoperable across organizations, improving the economic sustainability of both contributors and organizations.
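To make the third leg concrete: a "value distribution rule encoded in software" can be as simple as a function that splits profits in proportion to recorded contributions. The following is a hypothetical Python sketch for illustration only (names and structure are not from the project; a real DAO would encode such a rule in a smart contract on a blockchain):

# Hypothetical sketch of a profit-distribution rule of the kind a DAO's
# smart contract could encode; all names here are illustrative.
from typing import Dict

def distribute_profits(profits: float, contributions: Dict[str, float]) -> Dict[str, float]:
    """Split `profits` among members in proportion to their recorded contributions."""
    total = sum(contributions.values())
    if total == 0:
        return {member: 0.0 for member in contributions}
    return {member: profits * share / total
            for member, share in contributions.items()}

if __name__ == "__main__":
    # Three contributors with unequal recorded activity.
    payouts = distribute_profits(900.0, {"alice": 10, "bob": 5, "carol": 3})
    print(payouts)  # {'alice': 500.0, 'bob': 250.0, 'carol': 150.0}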


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Zsolt István

Friday, February 23, 2018

10:45am Meeting room 302 (Mountain View), level 3

Zsolt István, PhD Student, ETH Zurich, Switzerland

Caribou -- Intelligent Distributed Storage for the Datacenter

Abstract:

In the era of Big Data, datacenter and cloud architectures decouple compute and storage resources from each other for better scalability. While this design choice enables elastic scale-out, it also causes unnecessary data movement. One solution is to push parts of the computation down to storage, where data can be filtered more efficiently. Systems that do this are already in use and rely either on regular server machines as storage nodes or on network-attached storage devices. Even though the former provide complex computation and rich functionality, since there are plenty of conventional cores available to run the offloaded computation, this solution is quite inefficient because of the over-provisioning of computing capacity and the bandwidth mismatches between storage, CPU, and network. Networked storage devices, on the other hand, are better balanced in terms of bandwidth, but at the price of offering very limited options for offloading data processing. With Caribou, we explore an alternative design that offers rich offloading functionality in a much more efficient package (size, energy consumption) than regular servers, but without sacrificing features such as a general-purpose interface, reliable networking, or replication for fault tolerance. Our FPGA-based prototype system has been designed such that the internal data management logic can saturate the network and the processing logic can saturate the storage bandwidth, without either of the two being over-provisioned. Each Caribou node is a stand-alone FPGA that implements all functionality necessary for a distributed data store, including replication, which is typically not supported by FPGA-based solutions. Caribou has been released as open source. Its modular design and extensible processing pipeline make it a convenient platform for exploring domain-specific processing inside storage nodes.
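As a software analogy of the offloading idea (Caribou itself is an FPGA design, so this sketch is illustrative only, with hypothetical names), the difference between shipping all data to the client and pushing a filter down to the storage node can be seen in a few lines of Python:

# Hypothetical software analogy of near-data filtering: the storage node
# applies a predicate locally and returns only matching records, instead
# of shipping the whole dataset to the compute node.
from typing import Callable, Dict, List

class StorageNode:
    """A key-value store that can evaluate simple predicates near the data."""

    def __init__(self) -> None:
        self.store: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self.store[key] = value

    def scan(self) -> List[bytes]:
        # Baseline: return every value; all filtering happens at the client.
        return list(self.store.values())

    def scan_filtered(self, predicate: Callable[[bytes], bool]) -> List[bytes]:
        # Pushdown: evaluate the predicate where the data lives and move
        # only the matching values over the network.
        return [v for v in self.store.values() if predicate(v)]

if __name__ == "__main__":
    node = StorageNode()
    for i in range(1000):
        node.put(f"key{i}".encode(), f"record-{i % 10}".encode())

    # Client-side filtering moves 1000 values; pushdown moves only 100.
    client_side = [v for v in node.scan() if v.endswith(b"-7")]
    pushed_down = node.scan_filtered(lambda v: v.endswith(b"-7"))
    assert client_side == pushed_down
    print(len(node.scan()), "values moved without pushdown;",
          len(pushed_down), "with pushdown")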


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Deepak Padmanabhan

Wednesday, February 21, 2018

10:45am Meeting room 302 (Mountain View), level 3

Deepak Padmanabhan, Lecturer, Queen's University Belfast, United Kingdom

Multi-view Data Analytics

Abstract:

Conventional unsupervised data analytics techniques have largely focused on processing datasets of single-type data, e.g., one of text, ECG, sensor readings, or image data. With increasing digitization, it has become common for data objects to have representations that encompass different "kinds" of information. For example, the same disease condition may be identified through EEG or fMRI data; thus, a dataset of EEG-fMRI pairs would be considered a parallel two-view dataset. Datasets of text-image pairs (e.g., a description of a seashore and an image of it) and text-text pairs (e.g., problem-solution text, or multi-language text from machine translation scenarios) are other common instances of multi-view data. The challenge in multi-view exploratory analytics is to effectively leverage such parallel multi-view data to perform analytics tasks such as clustering, retrieval, and anomaly detection. This talk will cover some emerging trends in processing multi-view parallel data and outline the speaker's research plan in the area. It will cover two recent research publications authored by the speaker, one on multi-view clustering and another on multi-view dimensionality reduction.
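For illustration, one standard baseline for clustering parallel two-view data (not the speaker's method) is to learn a shared latent space with canonical correlation analysis (CCA) and cluster there. A hedged Python sketch using scikit-learn on synthetic paired views:

# Hedged baseline (not the speaker's method): cluster parallel two-view data
# by projecting both views into a shared space with CCA, then running k-means
# on the concatenated projections.
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic parallel dataset: 300 objects observed through two "views"
# (e.g., two sensing modalities) driven by the same three latent clusters.
latent = rng.integers(0, 3, size=300)
view1 = latent[:, None] + 0.3 * rng.standard_normal((300, 5))
view2 = 2.0 * latent[:, None] + 0.3 * rng.standard_normal((300, 8))

# Learn one shared dimension that maximally correlates the two views.
cca = CCA(n_components=1)
z1, z2 = cca.fit_transform(view1, view2)

# Cluster in the shared space, using information from both views.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(np.hstack([z1, z2]))
print("cluster sizes:", np.bincount(labels))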


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Pedro Reviriego

Tuesday, February 13, 2018

10:45am Meeting room 302 (Mountain View), level 3

Pedro Reviriego, Associate Professor, Universidad Nebrija, Spain

Reducing the False Positive Rate for Correlated Queries with the Adaptive Cuckoo Filter (ACF)

Abstract:

In this talk we will present the adaptive cuckoo filter (ACF), a data structure for approximate set membership that extends cuckoo filters by reacting to false positives and removing them for future queries. As an example application, in packet processing, queries may correspond to flow identifiers, so a search for an element is likely to be followed by repeated searches for that element. Removing false positives can therefore significantly lower the false positive rate. The ACF, like the cuckoo filter, uses a cuckoo hash table to store fingerprints. We allow fingerprint entries to be changed in response to a false positive, in a manner designed to minimize the effect on the performance of the filter. We will show that the ACF significantly reduces the false positive rate, presenting both a theoretical model for the false positive rate and simulations using synthetic data sets and real packet traces.
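The adaptation idea can be sketched in Python. The toy code below is a deliberately simplified, hypothetical illustration: it uses a plain hash table with one slot per bucket and omits the cuckoo displacement of a real ACF, and it assumes, as the ACF does, that the stored items themselves are available so a slot's fingerprint can be recomputed with an alternative hash function once a false positive is detected:

# Simplified, hypothetical illustration of the ACF's adaptation idea:
# on a detected false positive, swap the fingerprint function used by the
# offending slot. A real ACF uses a cuckoo hash table with multi-way buckets
# and displacement; this sketch omits those parts for brevity.
import hashlib

NUM_BUCKETS = 64
NUM_FP_FUNCS = 4          # alternative fingerprint functions per slot
FP_BITS = 8               # fingerprint size in bits

def _h(data: str, seed: int) -> int:
    return int.from_bytes(hashlib.sha256(f"{seed}:{data}".encode()).digest()[:8], "big")

def bucket(key: str) -> int:
    return _h(key, 0) % NUM_BUCKETS

def fingerprint(key: str, selector: int) -> int:
    return _h(key, 1 + selector) % (1 << FP_BITS)

# Each slot holds (selector, fingerprint, stored_key). Storing the key here
# stands in for the ACF's assumption that items live in a backing store.
table = [None] * NUM_BUCKETS

def insert(key: str) -> None:
    table[bucket(key)] = (0, fingerprint(key, 0), key)

def query(key: str) -> bool:
    entry = table[bucket(key)]
    if entry is None:
        return False
    selector, fp, _ = entry
    return fp == fingerprint(key, selector)

def report_false_positive(key: str) -> None:
    """Called when query(key) returned True but key is not in the set:
    re-hash the stored item's fingerprint with the next function."""
    idx = bucket(key)
    selector, _, stored_key = table[idx]
    new_selector = (selector + 1) % NUM_FP_FUNCS
    table[idx] = (new_selector, fingerprint(stored_key, new_selector), stored_key)

if __name__ == "__main__":
    insert("flow-A")
    # Find a key that collides with "flow-A" (a false positive), then adapt.
    fp_key = next(k for k in (f"probe-{i}" for i in range(200_000)) if query(k))
    report_false_positive(fp_key)
    print("stored item still found:", query("flow-A"))
    # Very likely True: the new fingerprint no longer matches the probe key.
    print("false positive now rejected:", not query(fp_key))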


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Manuel Bravo

Tuesday, January 30, 2018

10:45am Meeting room 302 (Mountain View), level 3

Manuel Bravo, PhD Student, University of Lisbon, Portugal

Towards a Distributed Metadata Service for Causal Consistency

Abstract:

The problem of ensuring consistency in applications that manage replicated data is one of the main challenges of distributed computing. The observation that delegating consistency management entirely to the programmer makes application code error-prone, and that strong consistency conflicts with availability, has spurred the quest for meaningful consistency models that can be supported effectively by the data service. Among the several invariants that may be enforced, ensuring that updates are applied and made visible respecting causality has emerged as a key ingredient among the many consistency criteria and client session guarantees that have been proposed and implemented in the last decade. Mechanisms to preserve causality can be found in systems across the spectrum from weaker to stronger consistency guarantees. In fact, causal consistency is pivotal in that spectrum, given that it has been proved to be the strongest consistency model that does not compromise availability. In this talk, I present a novel metadata service that can be used by geo-replicated data services to efficiently ensure causal consistency across geo-locations. Its design brings two main contributions:
• It eliminates the tradeoff between throughput and data freshness inherent to previous solutions. To avoid impairing throughput, our service keeps the size of the metadata small and constant, independently of the number of clients, servers, partitions, and locations. By using clever metadata propagation techniques, we also ensure that the visibility latency of updates approximates that of weakly consistent systems, which are not required to maintain metadata or to causally order operations.
• It allows data services to fully benefit from partial geo-replication by implementing genuine partial replication, requiring datacenters to manage only the metadata concerning data items replicated locally.
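To ground the notion of causal visibility, the sketch below shows the classic baseline that such services improve upon: each update carries a dependency vector with one entry per datacenter, and a receiving replica buffers the update until its own version vector covers those dependencies. This is a hedged Python illustration, not the metadata service presented in the talk (whose metadata stays small and constant rather than growing with the number of locations):

# Hedged illustration of causal visibility with per-datacenter version vectors.
# This is the classic baseline whose metadata grows with the number of
# datacenters; the talk's service keeps its metadata constant in size.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Update:
    origin: str                     # datacenter that created the update
    seq: int                        # sequence number at the origin
    key: str
    value: str
    deps: Dict[str, int]            # version vector the client had observed

@dataclass
class Replica:
    name: str
    applied: Dict[str, int] = field(default_factory=dict)   # local version vector
    data: Dict[str, str] = field(default_factory=dict)
    pending: List[Update] = field(default_factory=list)

    def _covered(self, deps: Dict[str, int]) -> bool:
        return all(self.applied.get(dc, 0) >= n for dc, n in deps.items())

    def receive(self, u: Update) -> None:
        self.pending.append(u)
        self._drain()

    def _drain(self) -> None:
        # Apply every pending update whose causal dependencies are satisfied.
        progress = True
        while progress:
            progress = False
            for u in list(self.pending):
                if self._covered(u.deps) and self.applied.get(u.origin, 0) == u.seq - 1:
                    self.data[u.key] = u.value
                    self.applied[u.origin] = u.seq
                    self.pending.remove(u)
                    progress = True

if __name__ == "__main__":
    eu = Replica("EU")
    # Update b causally depends on update a (both created in the US datacenter).
    a = Update("US", 1, "post", "hello", deps={})
    b = Update("US", 2, "comment", "hi back", deps={"US": 1})
    eu.receive(b)                    # arrives first: buffered, not visible
    print(eu.data)                   # {}
    eu.receive(a)                    # dependency arrives: both become visible
    print(eu.data)                   # {'post': 'hello', 'comment': 'hi back'}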


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Miguel Á. Carreira-Perpiñán

Friday, January 12, 2018

10:45am Meeting room 302 (Mountain View), level 3

Miguel Á. Carreira-Perpiñán, Professor, University of California, Merced, USA

Model compression as constrained optimization, with application to neural nets

Abstract:

Deep neural nets have in recent years become a widespread practical technology, with impressive performance in computer vision, speech recognition, natural language processing and many other applications. Deploying deep nets in mobile phones, robots, sensors and IoT devices is of great interest. However, state-of-the-art deep nets for tasks such as object recognition are too large to be deployed in these devices, because of the limits these devices impose on CPU speed, memory, bandwidth, battery life or energy consumption. This has made compressing neural nets an active research problem. We give a general formulation of model compression as constrained optimization. This includes many types of compression: quantization, low-rank decomposition, pruning, lossless compression and others. Then, we give a general algorithm to optimize this nonconvex problem based on the augmented Lagrangian and alternating optimization. This results in a "learning-compression" (LC) algorithm, which alternates a learning step of the uncompressed model, independent of the compression type, with a compression step of the model parameters, independent of the learning task. This simple, efficient algorithm is guaranteed to find the best compressed model for the task in a local sense under standard assumptions. We then describe specializations of the LC algorithm for various types of compression, such as binarization, ternarization and other forms of quantization, pruning, low-rank decomposition, and other variations. We show experimentally with large deep neural nets such as ResNets that the LC algorithm can achieve much higher compression rates than previous work on deep net compression for a given target classification accuracy. For example, we can often quantize down to just 1 bit per weight with negligible accuracy degradation. This is joint work with my PhD students Yerlan Idelbayev and Arman Zharmagambetov.
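The alternation at the heart of the LC algorithm can be illustrated on the simplest case, quantizing the weights of a linear least-squares model to {-a, +a} with a quadratic penalty. This is a hedged toy sketch, not the paper's augmented-Lagrangian implementation for deep nets:

# Hedged toy sketch of the learning-compression (LC) alternation for
# binarizing weights into {-a, +a}. Uses a plain quadratic penalty on a
# least-squares problem; the paper uses an augmented Lagrangian and deep nets.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
w_true = np.where(rng.random(10) < 0.5, -1.0, 1.0)     # ground truth is binary
y = X @ w_true + 0.1 * rng.standard_normal(200)

def l_step(theta: np.ndarray, mu: float) -> np.ndarray:
    # Learning step: minimize ||Xw - y||^2 + (mu/2)||w - theta||^2 in closed form.
    n = X.shape[1]
    return np.linalg.solve(2 * X.T @ X + mu * np.eye(n), 2 * X.T @ y + mu * theta)

def c_step(w: np.ndarray) -> np.ndarray:
    # Compression step: project w onto {-a, +a}^n; the optimal scale is mean(|w|).
    a = np.mean(np.abs(w))
    return a * np.sign(w)

theta = c_step(l_step(np.zeros(10), mu=0.0))   # direct compression as a start
for mu in [0.01, 0.1, 1.0, 10.0, 100.0]:       # slowly increase the penalty
    w = l_step(theta, mu)
    theta = c_step(w)

print("compressed weights:", np.round(theta, 3))
print("sign agreement with ground truth:", np.mean(np.sign(theta) == np.sign(w_true)))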


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Invited Talks - 2017