IMDEA initiative

Home > Events > Invited Talks

Invited Talks

Michael D. Adams

Monday, January 21, 2019

10:45am Meeting room 302 (Mountain View), level 3

Michael D. Adams, Assistant Research Professor, The University of Utah, USA

Advances in Parsing

Abstract:

Parsing is sometimes thought of as a solved problem. However, recent advances show that there is still much to be discovered in this area. This talk reviews two such advances: indentation-sensitive parsing and disambiguation via tree automata. Several popular languages, such as Haskell, Python, and F#, use the indentation and layout of code as part of their syntax. Parsers for these languages currently use ad hoc techniques to handle layout. This talk presents a simple extension to context-free grammars that can express these layout rules, and derives GLR and LR(k) algorithms for parsing these grammars. Dealing with precedence and associativity rules in context free grammars (CFGs) has long posed a challenge. While they can be encoded in the structure of the CFG, the result can be difficult to work with and often obfuscates the language designer’s intent. This talk shows how tree automata can specify all of these while still allowing the CFG to be written in a natural manner. This unified theory subsumes and generalizes these other techniques and provides an elegant means of specifying grammatical restrictions. These advances indicate that there is still much to be discovered about parsing. We have only uncovered the tip of the iceberg


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Miguel Á.  Carreira-Perpiñán

Friday, January 18, 2019

10:45am Meeting room 302 (Mountain View), level 3

Miguel Á. Carreira-Perpiñán, Professor, University of California at Merced, USA

A new way to train decision trees: tree alternating optimization (TAO)

Abstract:

Decision trees with hard decision nodes stand apart from other machine learning models in their interpretability, fast inference and automatic feature selection -- in addition to their ability to handle naturally nonlinear data, multiclass classification and regression, discontinuous predictor functions, and discrete input features. This has made decision trees widespread in certain practical applications for decades, in spite of the fact that their predictive accuracy is generally considered inferior to that of other models. In fact, this perceived lower accuracy is greatly due to the lack of a good learning algorithm. Learning a decision tree from data is a difficult optimization problem because of the search over discrete tree structures and the lack of gradient information over the node parameters. The most widespread tree learning algorithms in practice (CART, C4.5 and variations of them) date back to the 1980s and are based on greedily growing the tree structure and setting the node parameters via a heuristic criterion. We present a new algorithm, tree alternating optimization (TAO), that makes it practical for the first time to learn decision trees with a guarantee that the classification error is systematically reduced. Starting with a given tree structure and node parameters, TAO iteratively updates the parameters so that the resulting tree has the same or a smaller structure and provably reduces or leaves unchanged the classification error. TAO can be applied to both axis-aligned and oblique trees and our experiments show it consistently outperforms various other algorithms while being highly scalable to large datasets and trees. Further, TAO can handle a sparsity penalty, so it can learn "sparse oblique trees", having few nonzero parameters at each node. This combines the best of axis-aligned and oblique trees: flexibility to model correlated data, low generalization error, fast inference, smaller trees and interpretable nodes that involve only a few features in their decision. This makes decision trees attractive even in tasks where they were traditionally unsuitable, such as MNIST handwritten digit classification. This is joint work with my PhD students Pooya Tavallali and Arman Zharmagambetov.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Julian Thomé

Friday, January 11, 2019

10:45am Meeting room 302 (Mountain View), level 3

Julian Thomé, Junior researcher, ShiftLeft

Automated Security Code Analysis Made Easy

Abstract:

Security auditing, i.e., the examination of the source code for the purpose of detecting vulnerabilities, helps to detect vulnerabilities during the early phases of software development. When performed manually, this task can be laborious, error-prone and does not scale to large software systems. Over the course of the last years, a lot of research has been done with regard to approaches in the areas of Static Analysis, Symbolic Execution and Constraint Solving which aim to make security auditing more effective and cost-efficient. In this presentation, we will see how we at ShiftLeft automate security auditing by using a pragmatic approach, i.e., the combination of techniques proposed by the research community and security expert knowledge, which allows us to support different languages/frameworks and scale to large software systems. We will also see some examples with a live demonstration of our security auditing tool Ocular.


Time and place:
10:45am Meeting room 302 (Mountain View), level 3
IMDEA Software Institute, Campus de Montegancedo
28223-Pozuelo de Alarcón, Madrid, Spain


Invited Talks - 2018