While reinforcement learning methods are steadily gaining in popularity and relevance, complex models often lead to non-transparent behavior and their usefulness is tied to narrowly defined tasks. The (RL)3 project aims to improve both interpretability and transferability in reinforcement learning by using understandable data representations as well as rule-based simplifications of neural networks.

Moritz Lange and Prof. Laurenz Wiskott focus on finding data representations to improve the training of reinforcement learning agents and the interpretability of their decisions. The work of Raphael Engelhardt and Prof. Wolfgang Konen aims at developing automated ways to extract human-understandable rules from trained agents.

Caption: In (RL)3, representation learning, reinforcement learning and rule learning should complement each other. While representation learning allows for a low-dimensional input space that facilitates reinforcement learning, rule-based learning will allow insight into the uncovered causalities and operation of the trained system. Illustration: Christoph J Kellner, Studio Animanova

Project Overview

Reinforcement learning is an approach to AI in which an agent learns to interact dynamically with its environment to achieve a certain goal. These agents, which are becoming impressively successful across applications from playing games to controlling industrial processes, are commonly based on deep neural networks. However, despite their usefulness, neural networks are notorious black-box models: their complexity makes them hard to understand, and their decisions are hard to reason about. Additionally, the resulting reinforcement learning algorithms are complex, highly specific, and heavily tailored to individual tasks. (RL)3 will improve the interpretability and transferability of those algorithms to achieve more understandable, more predictable reinforcement learning approaches that are ultimately more secure and easier to apply.

Creating understandable data representations and formulating decision processes as simple rules are among the most efficient approaches to achieving interpretability. Our representation learning research will focus on unsupervised techniques that can learn interpretable data representations independently of specific tasks. In parallel, our rule-learning research will investigate methods for transforming complex, high-performing reinforcement learning models into simple rules that can be interpreted and modified by domain experts. The developed approaches will be tested on games and in industrial applications.


Preliminary Results

We developed a first approach to rule learning based on observing a trained reinforcement learning agent interacting with its environment. From the recorded data, which contain the environment’s state and the agent’s corresponding action, we induce decision trees. For simple benchmark problems, we could show that human-readable decision trees of very limited complexity perform as well as the black-box deep reinforcement learning agents they are based on.
Our results have been published as an extended abstract and presented during the poster session of the workshop “Trustworthy AI in the Wild” at KI 2021 – 44th German Conference on Artificial Intelligence.
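
The general idea of inducing a decision tree from recorded state-action pairs can be illustrated with a minimal sketch. It is not the project's implementation: the hand-written `heuristic_policy` merely stands in for a trained deep reinforcement learning agent, the `step` function is a toy environment loosely modeled on the mountain-car benchmark, and the tree learner is a plain axis-parallel CART-style procedure with Gini impurity. All function and variable names are hypothetical.

```python
import math
import random
from collections import Counter

PUSH_LEFT, PUSH_RIGHT = -1, 1

def heuristic_policy(state):
    """Hypothetical stand-in for a trained agent: push in the direction of motion."""
    _, velocity = state
    return PUSH_RIGHT if velocity > 0 else PUSH_LEFT

def step(state, action):
    """Toy dynamics loosely modeled on the mountain-car benchmark (an assumption)."""
    position, velocity = state
    velocity += 0.001 * action - 0.0025 * math.cos(3 * position)
    velocity = max(-0.07, min(0.07, velocity))
    position = max(-1.2, min(0.6, position + velocity))
    if position == -1.2:
        velocity = 0.0  # the car stops at the left wall
    return (position, velocity)

def record_episodes(n_episodes, max_steps=200):
    """Observe the agent in its environment and log (state, action) pairs."""
    samples = []
    for _ in range(n_episodes):
        state = (random.uniform(-0.6, -0.4), 0.0)
        for _ in range(max_steps):
            action = heuristic_policy(state)
            samples.append((state, action))
            state = step(state, action)
            if state[0] >= 0.5:  # goal reached
                break
    return samples

def gini(labels):
    """Gini impurity of a list of action labels (0.0 for a pure list)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def build_tree(samples, depth, max_depth):
    """Return a leaf (an action) or a node (feature, threshold, left, right)."""
    labels = [action for _, action in samples]
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    best = None
    for f in range(len(samples[0][0])):
        values = sorted({state[f] for state, _ in samples})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between neighboring values
            left = [a for s, a in samples if s[f] <= t]
            right = [a for s, a in samples if s[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(samples)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:  # no feature offers two distinct values to split on
        return majority
    _, f, t = best
    left = [(s, a) for s, a in samples if s[f] <= t]
    right = [(s, a) for s, a in samples if s[f] > t]
    return (f, t, build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))

def predict(tree, state):
    """Follow the splits until a leaf (an action) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if state[f] <= t else right
    return tree

def to_rules(tree, names, indent=""):
    """Render the tree as nested, human-readable if/else rules."""
    if not isinstance(tree, tuple):
        return [f"{indent}action = {tree}"]
    f, t, left, right = tree
    return ([f"{indent}if {names[f]} <= {t:.4f}:"]
            + to_rules(left, names, indent + "    ")
            + [f"{indent}else:"]
            + to_rules(right, names, indent + "    "))

random.seed(0)
tree = build_tree(record_episodes(5), depth=0, max_depth=2)
test_samples = record_episodes(5)
fidelity = sum(predict(tree, s) == a for s, a in test_samples) / len(test_samples)
print("\n".join(to_rules(tree, ["position", "velocity"])))
print(f"fidelity on fresh episodes: {fidelity:.3f}")
```

The fidelity printed here only measures how often the tree picks the same action as the observed policy. Judging whether a tree performs as well as the agent it imitates would additionally require running the tree as the controller in the environment and comparing the returns.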

Project Publications

  • Raphael C. Engelhardt, Moritz Lange, Laurenz Wiskott and Wolfgang Konen “Shedding Light into the Black Box of Reinforcement Learning”. In: KI 2021 44th German Conference on Artificial Intelligence. Workshop on Trustworthy AI in the Wild (Sept. 27, 2021). 2021. eprint: https://dataninja.nrw/wp-content/uploads/2021/09/1_Engelhardt_SheddingLight_Abstract.pdf
  • Raphael C. Engelhardt, Moritz Lange, Laurenz Wiskott and Wolfgang Konen “Shedding Light into the Black Box of Reinforcement Learning (poster)”. In: KI 2021 44th German Conference on Artificial Intelligence. Workshop on Trustworthy AI in the Wild (Sept. 27, 2021). 2021. eprint: https://dataninja.nrw/wp-content/uploads/2021/09/1_Engelhardt_SheddingLight.pdf
  • Raphael C. Engelhardt, Moritz Lange, Laurenz Wiskott and Wolfgang Konen “Sample-based rule extraction for explainable reinforcement learning”. In: Giuseppe Nicosia, Varun Ojha, Emanuele La Malfa, Gabriele La Malfa, Panos Pardalos, Giuseppe Di Fatta, Giovanni Giuffrida, and Renato Umeton, editors, Machine Learning, Optimization, and Data Science, pages 330–345, Cham, 2023. Springer Nature Switzerland. doi:10.1007/978-3-031-25599-1_25
  • Raphael C. Engelhardt, Marc Oedingen, Moritz Lange, Laurenz Wiskott and Wolfgang Konen “Iterative oblique decision trees deliver explainable RL models”. In: Algorithms, 16(6), 2023. doi:10.3390/a16060282


