Thaleia Doudali, Assistant Research Professor, IMDEA Software Institute
The massive scale and heterogeneity of current workloads and platforms, such as cloud applications and large machine learning models, undermine the effectiveness of conventional resource management approaches and create the need for new, custom-tailored systems solutions. Machine learning methods can enable robust management decisions, but they come with substantial overheads and raise practicality and interpretability concerns, so it is crucial to make their use practical. In this talk, I will demonstrate data-driven insights and observations that enable the use of lightweight prediction models for forecasting resource usage and improving cloud resource efficiency. In addition, I will describe missed opportunities in the efficient serving of Large Language Model (LLM) inference. Finally, I will conclude with my research vision on the upcoming challenges and directions in system-level resource management.