Spring School 2022 – dataninja

DataNinja Spring School 2022

Spring School Information

23rd to 25th of March 2022. Virtual event with morning sessions (9 AM to 12:30 PM, CET), two evening lectures (on the 23rd and 24th of March at 4:15 PM), and a poster session on the 23rd of March at 5:15 PM.
Organizers: Barbara Hammer, Malte Schilling

We invite you to our upcoming spring school on the topic: Artificial Intelligence – perspectives and challenges of real data. The Spring School is aimed at PhD students, master's students, and interested researchers from the broad area of Artificial Intelligence and Machine Learning. Our goal is to provide in-depth tutorials on current topics that span the spectrum of Machine Learning approaches, with a specific focus on explainable models that allow for inspection. The tutorials are also geared towards providing hands-on experience, so that participants can directly apply or transfer the methods to their own tasks and problems.

As the Spring School aims at the application of Machine Learning to real-world challenges, we also want to foster exchange between participants and learn who is attending and what their current tasks are. Therefore, there will be, first, a social platform (gather.town, works best in the Chrome or Firefox browser) as an open space to meet and discuss lectures, individual topics, or tasks. Secondly, we will host a poster session. It would be great if you shared with everybody what you are working on: participants are invited to submit a contribution (via email: contact@dataninja.nrw) as an extended abstract (maximum 2 pages in length). Contributions will be reviewed and selected by the organizers. The workshop contributions will appear as online proceedings on our webpage.
We want to give researchers a chance to present their (ongoing or planned) work. But we also want to provide a forum for relevant work that has recently been published in journals and other conferences.

Submission: Please submit your extended abstract by the 22nd of March 2022. There will be a prize for the best poster (certificate plus 300 Euro). The poster session will take place in gather.town, where there is a board on which you can hang a poster or information on your contribution. Please submit your poster as a file (a complete poster, or anything else that helps explain your approach, e.g. the main images or results) by Wednesday, 23rd of March, before 12 PM.

There will be an open gather.town environment to meet other people at the spring school – you can share or discuss the topics of the talks or your own research (or whatever else suits you). We are happy to have you there – and will have a nice small poster session on Wednesday evening.

Schedule, overview:

	Wednesday, 23.3.2022	Thursday, 24.3.2022	Friday, 25.3.2022
9:00 to 10:30	Explainable AI, Grégoire Montavon	Graph Neural Networks, Christopher Morris	Gaussian Processes & their Interpretability, Markus Lange-Hegermann
10:30	Break	Break
11:00 to 12:30	Symbolic and Sub-symbolic Representations of Knowledge Graphs, Maribel Acosta	AutoML, Bernd Bischl
16:15 to 17:15	Evening Talk: Outracing champion Gran Turismo drivers with deep reinforcement learning, Kaushik Subramanian	Evening Talk: Ethics of AI: A Quick Overview, Rainer Mühlhoff
18:00 to 20:00		LabTour through CITEC

Schedule DataNinja Spring School 2022

Overall, the spring school aims at a multidisciplinary perspective on key aspects and challenges of Trustworthy AI and Machine Learning. The three half-day sessions will consist of two tutorials each day in the morning. On Wednesday and Thursday, there will be an additional evening talk each day (at 4:15 PM).

Wednesday, 23.3.2022

Explainable AI, Grégoire Montavon – 9 AM

On Wednesday, 23rd of March, 9 AM (CET)

Lecturer: Grégoire Montavon is a Research Associate in the Machine Learning Group at the Technische Universität Berlin, and Junior Research Group Lead in the Berlin Institute for the Foundations of Learning and Data (BIFOLD). His current research is on advancing the foundations and algorithms of explainable AI (XAI) in the context of deep neural networks. One particular focus is on closing the gap between existing XAI methods and practical desiderata.

Summary: The tutorial will focus on Explainable AI (XAI) in the context of deep neural networks. It will be organized in three parts of 30 minutes each:

Part 1 will cover the different components of XAI (model, explanation, user), practical motivations, desiderata of an explanation system, and types of explanations.

Part 2 will present methods of XAI, with a focus on the problem of attribution. It will cover common techniques such as Shapley Value, Integrated Gradients, and Layer-wise Relevance Propagation. Theoretical aspects of these explanation techniques will also be presented.

Part 3 will showcase applications of XAI for model validation and scientific discovery. Along with the presented application scenarios, we will show how basic explanation methods can be extended to address application needs (e.g. XAI for unsupervised learning, and XAI for graphs).
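To make the attribution problem from Part 2 concrete, here is a minimal sketch of exact Shapley values for a tiny model. It is not taken from the tutorial material: the function `toy_model`, the all-zero baseline, and the convention of replacing missing features by their baseline value are assumptions chosen for illustration. The cost is exponential in the number of features, so this only works for toy cases.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley attributions for f at x, relative to a baseline input.

    Features outside a coalition are set to their baseline value
    (an assumption; other conventions for "missing" features exist).
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight of a coalition with `size` members.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def toy_model(z):
    # Hypothetical model: a linear part plus one interaction term.
    return 2.0 * z[0] + 1.0 * z[1] + z[0] * z[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
# The attributions sum to the difference between f(x) and f(baseline).
print(sum(phi), toy_model(x) - toy_model(baseline))
```

The interaction term `z[0] * z[2]` is split equally between features 0 and 2, which is the symmetry property of Shapley values at work.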

Further Material:

Slides from the tutorial.
Review Paper on XAI: Samek et al. (2021), “Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications”, in Proceedings of the IEEE, vol. 109, no. 3, pp. 247-278, March 2021, doi: 10.1109/JPROC.2021.3060483.
Tutorial on LRP: Implementing Layer-Wise Relevance Propagation, by Grégoire Montavon

Symbolic and Sub-symbolic Representations of Knowledge Graphs, Maribel Acosta – 11 AM

On Wednesday, 23rd of March, 11 AM (CET)

Lecturer: Maribel Acosta is a Junior Professor for Databases and Information Systems at Ruhr University Bochum. Her research lies in the areas of Databases, Semantic Web, and Artificial Intelligence, where she has extensively investigated novel techniques to query Knowledge Graphs efficiently and effectively. More recently, she has studied the application of Machine Learning approaches to solve diverse problems in the management of Knowledge Graphs.

Summary: Knowledge Graphs (KGs) allow for representing inter-connected facts or statements annotated with semantics. In KGs, concepts and entities are typically modeled as nodes, while their connections are modeled as directed and labeled edges, creating a graph. In recent years, KGs have become core components of intelligent systems, including recommender systems, chatbots, and advanced analytics.

KGs are traditionally represented with graph-based data models enhanced with ontologies, which allow for encoding the meaning of statements that go beyond simple or raw data. In the first part of this tutorial, we will learn about the symbolic representation of KGs and how vocabularies and ontologies like RDF/S and OWL allow for deriving implicit knowledge from KGs.

More recently, with the advances in Machine Learning, sub-symbolic representations of KGs have emerged. These representations in the form of embeddings or low-dimensional vectors allow for uncovering hidden connections or patterns in the KG. In the second part of this tutorial, we will learn about KG embeddings and their application to the problem of KG completion.

To conclude the tutorial, we will discuss open challenges in the problem of KG completion and current research directions for neuro-symbolic representations of KGs.
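As a rough illustration of how KG embeddings can be used for KG completion, the sketch below scores triples in the style of TransE, one common embedding model, which treats a relation as a translation from head to tail. Whether the tutorial covers this particular model is an assumption, and the entities, the relation, and the 2-dimensional vectors are hand-picked for illustration rather than trained.

```python
from math import sqrt

# Hypothetical 2-dimensional embeddings, hand-picked for illustration (not trained).
entity = {
    "Berlin": [1.0, 0.0],
    "Germany": [1.0, 1.0],
    "Paris": [3.0, 0.0],
    "France": [3.0, 1.0],
}
relation = {"capital_of": [0.0, 1.0]}

def transe_score(head, rel, tail):
    """TransE-style plausibility: negative Euclidean distance between head + rel and tail."""
    h, r, t = entity[head], relation[rel], entity[tail]
    return -sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def rank_tails(head, rel):
    """Rank all other entities as candidate tails for the query (head, rel, ?)."""
    candidates = [e for e in entity if e != head]
    return sorted(candidates, key=lambda e: transe_score(head, rel, e), reverse=True)

# Completion query: (Paris, capital_of, ?)
ranking = rank_tails("Paris", "capital_of")
print(ranking)
```

In a trained model the vectors would be learned so that observed triples score higher than corrupted ones. Ranking candidates by score is then how a missing link is predicted.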

Material: Slides from the tutorial.

Evening Talk: Outracing champion Gran Turismo drivers with deep reinforcement learning, Kaushik Subramanian – 4:15 PM

Lecturer: Kaushik Subramanian is a Senior Research Scientist at Sony AI who is part of the team that recently demonstrated their DRL framework in a successful race against professional e-sports drivers in Gran Turismo.

Summary: Many potential applications of artificial intelligence involve making real-time decisions in physical systems while interacting with humans. Automobile racing represents an extreme example of these conditions; drivers must execute complex tactical maneuvers to pass or block opponents while operating their vehicles at their traction limits. Racing simulations, such as the PlayStation game Gran Turismo, faithfully reproduce the nonlinear control challenges of real race cars while also encapsulating the complex multi-agent interactions. In this talk, I will describe the work done as a collaboration between Sony AI, Polyphony Digital and Sony Interactive Entertainment on how we trained agents for Gran Turismo that can compete with the world’s best e-sports drivers. We combine state-of-the-art model-free deep reinforcement learning algorithms with mixed scenario training to learn an integrated control policy that combines exceptional speed with impressive tactics. In addition, we construct a reward function that enables the agent to be competitive while adhering to racing’s important, but under-specified, sportsmanship rules. We demonstrate the capabilities of our agent, Gran Turismo Sophy, by winning a head-to-head competition against four of the world’s best Gran Turismo drivers. By describing how we trained championship-level racers, we illuminate the possibilities and challenges of using these techniques to control complex dynamical systems in domains where agents must respect imprecisely defined human norms.

References: Wurman, P.R., Barrett, S., Kawamoto, K., MacGlashan, J., Subramanian, K., et al. Outracing champion Gran Turismo drivers with deep reinforcement learning. Nature 602, 223–228 (2022). https://doi.org/10.1038/s41586-021-04357-7

More details: https://www.gran-turismo.com/us/gran-turismo-sophy/

Poster Session: 5:30 PM, will be held in gather.town (join us here)

We will host a virtual poster session in gather.town. Please join us and let us know what you are working on. This can be recent findings, current or unfinished work, or an overview of already published work. We would like to ask you to submit a short extended abstract of up to two pages (send it to contact@dataninja.nrw) by the end of the 22nd of March.

The session will be held in gather.town, where you get a small personal area: a space in which you can present to the people who join you there and discuss with them. There is also a board on which you can hang a poster or information on your contribution. Please submit your poster as a file (a complete poster, or anything else that helps explain your approach, e.g. the main images or results) by Wednesday, 23rd of March, before 12 PM. For some tips see: https://support.gather.town/help/poster-booth-sets#tips

There will be a poster prize (certificate and 300 Euro).

Thursday, 24.3.2022

Graph Neural Networks, Christopher Morris – 9 AM

On Thursday, 24th of March, 9 AM (CET)

Lecturer: Christopher Morris is a postdoc at the Mila – Quebec AI Institute and McGill University in the group of Siamak Ravanbakhsh. He works in the area of machine learning methods for graphs, networks, and relational data.

Summary: This lecture gives an overview of machine learning for graphs, focusing on graph neural networks (GNNs). We review classical, non-neural algorithms, introduce the basic concepts behind GNNs, and discuss state-of-the-art architectures. We then discuss GNNs’ relation to the Weisfeiler-Leman algorithm, a simple heuristic for the graph isomorphism problem, and design provably more powerful architectures. Finally, we show how to implement GNNs using the PyTorch Geometric framework.
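For readers unfamiliar with the Weisfeiler-Leman heuristic mentioned in the summary, the following is a minimal sketch of 1-dimensional color refinement in plain Python. It is not the lecture's code: the function name `wl_histogram`, the adjacency-list format, and the example graphs are assumptions made for illustration. Each node repeatedly updates its color from its own color and the multiset of its neighbors' colors, and two graphs with different color histograms cannot be isomorphic.

```python
from collections import Counter

def wl_histogram(adj, rounds=3):
    """1-dimensional Weisfeiler-Leman color refinement.

    adj maps each node to a list of its neighbors. Colors are nested tuples,
    so they are directly comparable across different graphs.
    """
    colors = {v: () for v in adj}  # every node starts with the same color
    for _ in range(rounds):
        # New color = (own color, sorted multiset of neighbor colors).
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())

# A path and a star on four nodes: the heuristic tells them apart.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(wl_histogram(path) == wl_histogram(star))

# A 6-cycle and two disjoint triangles: both are 2-regular, so every node
# keeps the same color and the heuristic cannot distinguish them.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(wl_histogram(hexagon) == wl_histogram(two_triangles))
```

The second pair shows the limitation that motivates more powerful architectures: the two graphs are not isomorphic, yet color refinement assigns them identical histograms.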

Slides: Christopher Morris Tutorial on GNN Slides.

AutoML, Bernd Bischl – 11 AM

On Thursday, 24th of March, 11 AM (CET)

Lecturer: Bernd Bischl holds the chair of Statistical Learning and Data Science at the Department of Statistics at the Ludwig-Maximilians-University Munich and is a co-director of the Munich Center for Machine Learning (MCML), one of Germany’s national competence centers for ML. His research interests include AutoML, model selection, interpretable ML, as well as the development of statistical software.

References:

Bischl, Bernd, et al. “Hyperparameter optimization: Foundations, algorithms, best practices and open challenges.” arXiv preprint arXiv:2107.05847 (2021).

Evening Talk: Ethics of AI: A Quick Overview, Rainer Mühlhoff – 4:15 PM

Lecturer: Rainer Mühlhoff is a Philosophy professor at the University of Osnabrück. He does research and teaches on the social implications of artificial intelligence and digital media.

Summary: This introduction will take the participants of the DataNinja spring school on a brief tour to the philosophical discipline of ethics of AI. We will look at some of the most salient issues in the current controversy about the societal impact of AI technology. A main focus will be on data ethics and data-driven machine learning applications, where topics such as unfair bias and new privacy issues arise. We will also discuss the role of technical trends such as Explainable AI in relation to responsible AI.

Friday, 25.3.2022

On Gaussian Processes and their Interpretability, Markus Lange-Hegermann – 9 AM

On Friday, 25th of March, 9 AM (CET)

Lecturer: Markus Lange-Hegermann is a professor of Mathematics and Data Science at the Ostwestfalen-Lippe University of Applied Sciences in Lemgo. His current research interests are in the application of mathematics, statistics, and machine learning.

Summary: Gaussian processes appear as functional priors in various corners of machine learning, e.g. in low data regimes, as limits of infinitely wide neural networks, or in Bayesian optimization. In this tutorial talk, we discuss Gaussian process regression models and their standard Bayesian inference algorithms. Their predictions come with a variance as an uncertainty measure, which is used to maintain safety guarantees in applications or to steer Bayesian optimization.
The behaviour of Gaussian processes has a well-understood mathematical description in terms of reproducing kernel Hilbert spaces. This makes it possible to systematically incorporate prior knowledge about the data domain into a Gaussian process model and to automatically choose the best interpretation of the data. We discuss several such examples of making Gaussian processes interpretable, e.g. parameter identification and combining differential equations with data.
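The point that predictions come with a variance can be seen in a small sketch of Gaussian process regression, written in plain Python without external libraries. It is not taken from the tutorial: the squared-exponential kernel, its fixed hyperparameters, the noise variance, and the sine toy data are all assumptions made for illustration.

```python
from math import exp, sin, sqrt

def rbf(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel (an assumed choice of prior)."""
    return variance * exp(-0.5 * ((a - b) / lengthscale) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_cholesky(L, b):
    """Solve (L L^T) x = b by forward and backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP at a single test input.

    `noise` is the observation noise variance added to the kernel diagonal.
    """
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    L = cholesky(K)
    alpha = solve_cholesky(L, y_train)
    k_star = [rbf(x_test, a) for a in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_cholesky(L, k_star)
    var = rbf(x_test, x_test) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var

x_train = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_train = [sin(x) for x in x_train]

mean_near, var_near = gp_predict(x_train, y_train, 0.5)   # between data points
mean_far, var_far = gp_predict(x_train, y_train, 10.0)    # far from all data
print(mean_near, var_near)
print(mean_far, var_far)
```

Close to the data the posterior variance is small. Far away from it, the prediction falls back to the prior mean of zero and the variance returns to the prior variance, which is the behaviour that uncertainty-aware applications such as Bayesian optimization rely on.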

Material: Slides from the tutorial.