Risk-Aware Decision Making

Most real-world decisions involve tradeoffs between the rewards that can be achieved and the losses that must be accepted. Mainstream game-theoretic methods typically reduce this tradeoff to a single reward-maximization objective by encoding risks as negative rewards. This, however, gives up explicit control over the tradeoff. We develop techniques for solving games in which both reward and risk are considered explicitly and quantitatively in the players' decisions.
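
The difference between the two approaches can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the three options, their numbers, the choice of "probability of an unacceptable outcome" as the risk measure, and the function names. It is not one of our algorithms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    expected_reward: float
    risk: float  # assumed risk measure: probability of an unacceptable outcome


# Illustrative numbers only.
OPTIONS = [
    Option("safe", 1.0, 0.00),
    Option("balanced", 3.0, 0.05),
    Option("bold", 6.0, 0.30),
]


def pick_scalarized(options, penalty):
    """Fold risk into the reward: maximize expected_reward - penalty * risk."""
    return max(options, key=lambda o: o.expected_reward - penalty * o.risk)


def pick_risk_bounded(options, max_risk):
    """Maximize expected reward among options whose risk is within the bound."""
    feasible = [o for o in options if o.risk <= max_risk]
    if not feasible:
        raise ValueError("no option satisfies the risk bound")
    return max(feasible, key=lambda o: o.expected_reward)


if __name__ == "__main__":
    for penalty in (10, 20, 50):
        print("penalty", penalty, "->", pick_scalarized(OPTIONS, penalty).name)
    for bound in (0.0, 0.1, 0.5):
        print("risk bound", bound, "->", pick_risk_bounded(OPTIONS, bound).name)
```

In the scalarized version, the penalty has no direct interpretation: to keep the risk below, say, 10%, it has to be tuned by trial and error. In the bounded version, the acceptable risk is stated directly.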

Risk-Aware Reinforcement Learning

Reinforcement learning (RL) is a subfield of machine learning concerned with agents that learn by acting in an interactive environment. Recent breakthroughs in game playing have been achieved using reinforcement learning methods. We concentrate on developing RL methods for risk-aware agents that act to maximize their rewards while keeping their risk at an acceptable level. Our research comprises theoretical reasoning about various classes of agents and environments and experimental evaluation of the developed algorithms.
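
One common way to write such a goal formally is as a constrained optimization problem. The notation below is an assumption made for illustration, not a definition taken from our work: \pi is the agent's policy, r_t the reward at step t, \gamma a discount factor, \Delta the acceptable risk level, and Risk(\pi) an abstract risk measure (for example, the probability of reaching a failure state).

```latex
\max_{\pi} \; \mathbb{E}^{\pi}\!\left[ \sum_{t \ge 0} \gamma^{t} r_{t} \right]
\quad \text{subject to} \quad
\mathrm{Risk}(\pi) \le \Delta
```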

RAlph - We develop a system for risk-aware reinforcement learning based on enhanced variants of the AlphaZero algorithm. RAlph allows acceptable risk levels to be specified explicitly, thereby controlling aspects of its behavior that are typically difficult to express using rewards alone. So far, RAlph has been developed only for fully observable Markov decision processes. Possible research topics:

Extensions of RAlph towards more complex formalisms, such as partially observable Markov decision processes (POMDPs) and multi-player games.
Adding risk awareness to more advanced RL algorithms, such as MuZero.
Theoretical study of risk-aware bandit problems.

What We Do

Formal reasoning

We study the theoretical properties of risk-aware agents, such as the existence of optimal strategies and their structure.
We consider the theoretical complexity of computing optimal strategies for various types of games.

Algorithms

We develop algorithms that synthesize risk-aware strategies for various types of games, that is, strategies that account for the risks involved in maximizing rewards.
We combine methods from reinforcement learning, Monte Carlo tree search, linear programming, and other areas.

Experiments

We develop experimental implementations of our reinforcement learning algorithms.
We experiment extensively on game arenas, ranging from various types of maze problems to simple real games such as Bomberman.
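
As a rough illustration of what such an experiment measures, the following Python sketch estimates both the average reward and the risk of two fixed routes through a toy corridor by Monte Carlo simulation. The environment, its parameters (`trap_prob`, `goal_reward`, `step_cost`), and the function names are hypothetical and far simpler than the arenas mentioned above. Risk is assumed here to be the fraction of episodes that end in a trap.

```python
import random


def run_episode(route_length, trap_prob, rng, goal_reward=10.0, step_cost=1.0):
    """Walk a corridor of `route_length` cells; every step may hit a trap.

    Returns (total_reward, failed).
    """
    total = 0.0
    for _ in range(route_length):
        total -= step_cost
        if rng.random() < trap_prob:
            return total, True  # the episode ends in failure
    return total + goal_reward, False


def evaluate(route_length, trap_prob, episodes=10_000, seed=0):
    """Estimate (average reward, failure frequency) over simulated episodes."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    reward_sum, failures = 0.0, 0
    for _ in range(episodes):
        reward, failed = run_episode(route_length, trap_prob, rng)
        reward_sum += reward
        failures += int(failed)
    return reward_sum / episodes, failures / episodes


if __name__ == "__main__":
    # A short but dangerous route versus a long but safer one.
    print("short route:", evaluate(route_length=3, trap_prob=0.10))
    print("long route: ", evaluate(route_length=6, trap_prob=0.01))
```

With these illustrative parameters, the short route has a failure probability of 1 - 0.9^3 = 0.271 and the long route about 0.059, while the short route earns the higher average reward. A reward-only agent would prefer the short route; an agent with an acceptable risk level of, say, 0.1 would have to take the long one.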