Elena Gutierrez, PhD Student, IMDEA Software Institute
There exist two main formalisms to describe context-free languages: context-free grammars and pushdown automata. In fact, there exists a standard algorithm for converting a pushdown automaton into a context-free grammar that preserves exactly the language. One may wonder which is more economical in terms of size. It has been shown before that, in general, pushdown automata are more concise to represent a particular language than context-free grammars.
Now, we consider the open question: If we are interested in preserving, not each word exactly, but just the number of occurrences of each symbol, regardless of the order: are pushdown automata more concise?
In this talk, I show that the answer is still positive. The number of occurrences of each symbol in a word is called its Parikh image. I present an infinite family of pushdown automata with n states, p stack symbols and size proportional to np, for which every context-free grammar that preserves the Parikh image must have size proportional to n²p. To present this result, I introduce a new tree-based semantics to describe runs of a pushdown automaton.