Two new papers accepted at AAAI 2025

Researches in Formela lab discover algorithmic advances in planning for autonomous agents.

12 Dec 2024

Formela has two papers accepted at AAAI 2025, a top-tier venue for research in artificial intelligence.

The paper Threshold UCT: Cost-Constrained Monte Carlo Tree Search with Pareto Curves by M. Kurečka, V. Nevyhoštěný, P. Novotný, and V. Unčovský presents a new algorithm for optimizing performance of autonomous agents under constraints on the riskiness of their behavior. The paper provides experiments on challenging benchmarks, such as model of autonomous vehicle routing in Manhattan, which demonstrates that the proposed method outperforms the existing ones.

The paper Multiple Mean-Payoff Optimization under Local Stability Constraints by D. Klaška, A. Kučera, V. Kůr, V. Musil, and V. Řehák considers the problem of optimizing the long-run average performance of an autonomous agent while maintaining the stability of agent's behavior, i.e. preventing fluctuations in its performance. The paper provides the first efficient algorithmic solution for this problem.

Abstracts are provided below.

Running from 1980, the AAAI Conference on Artificial Intelligence is one of the top venues for presenting AI research. Members of our lab will present their papers at the next iteration of the conference held on February 25 - March 4 in Philadelphia, Pennsylvania, USA.

Abstracts

M. Kurečka, V. Nevyhoštěný, P. Novotný, and V. Unčovský: Threshold UCT: Cost-Constrained Monte Carlo Tree Search with Pareto Curves

Constrained Markov decision processes (CMDPs), in which the agent optimizes expected payoffs while bounding expected cost, have emerged as a leading framework for safe reinforcement learning. Among algorithms for planning and learning in CMDPs, methods based on Monte Carlo tree search (MCTS) have particular importance due to their efficiency and extendibility to more complex frameworks (such as games). Existing MCTS-based techniques for CMDPs exhibit several conceptual limitations, such as cost-agnostic tree search or unsound cost threshold updates. In this paper, we introduce Threshold UCT, an online MCTS-based planning algorithm for CMDPs, which overcomes these limitations by making decisions based on the estimates of the cost-payoff trade-offs throughout the search tree. Our experiments demonstrate that our approach significantly outperforms state-of-the-art methods from the literature

D. Klaška, A. Kučera, V. Kůr, V. Musil, and V. Řehák: Multiple Mean-Payoff Optimization under Local Stability Constraints

The long-run average payoff per transition (mean payoff) is the main tool for specifying the performance and dependability properties of discrete systems. The problem of constructing a controller (strategy) simultaneously optimizing several mean payoffs has been deeply studied for stochastic and game-theoretic models. One common issue of the constructed controllers is the instability of the mean payoffs, measured by the deviations of the average rewards per transition computed in a finite ``window'' sliding along a run. Unfortunately, the problem of simultaneously optimizing the mean payoffs under local stability constraints is computationally hard, and the existing works do not provide a practically usable algorithm even for non-stochastic models such as two-player games. In this paper, we design and evaluate the first efficient and scalable solution to this problem applicable to Markov decision processes.