Tutorials

T1. Fairness in the sharing economy and stochastic models for MAS

Presenters: Wynita Griggs (Monash University), Jakub Marecek (Czech Technical University in Prague), and Robert Shorten (Imperial College London)

Contact email: jakub.marecek@fel.cvut.cz

Date: Monday 6 May

Duration: 8:30-12:30, 14:00-18:00

Room: Vista 1

Expected background:

– undergraduate probability theory

– an interest in applications of multi-agent systems to the sharing economy, and in AI regulation (such as the AI Act in the EU), is beneficial.

Skills to be gained:

– Understanding of guarantees available for stochastic models of multi-agent systems

– Understanding of why additive surge pricing is better than multiplicative surge pricing in Uber

Brief Description:

Numerous problems in sharing-economy platforms (such as Uber, Airbnb, and TaskRabbit) and virtual power plants (such as the Tesla Virtual Power Plant in the US and Next Kraftwerke in Europe) can be modelled as multi-agent systems, once deterministic discrete-event systems are suitably generalized. In the past decade, many such platforms have been deployed at scale, there has been substantial progress in modelling stochastic systems, and there is much interest in regulating such platforms (cf. renewable energy communities and citizen energy communities in the Renewable Energy Directive and the Internal Electricity Market Directive, or the Digital Markets Act in the EU). According to some opinions, this will yield a new wave of interest in multi-agent systems.

Many novel, fundamental questions arise in connection with the deployment of such generalized multi-agent systems in the sharing economy and beyond, where they not only provide decision support, but actually execute actions ("perform actuation"). First, the participants have only partial information about the system and are not perfectly rational. This is well understood in behavioral economics, but has not been considered in many calculi for multi-agent systems. Stochastic aspects have been studied in modeling agents' behavior in consensus problems, but this approach has yet to be developed for more general settings. Second, in studying the behavior of a multi-agent system, one should consider both the perspective of the aggregate behaviour and the perspective of the individual participants, which requires a probabilistic formulation of the associated desiderata. Finally, the number of participants changes over time, which limits the direct applicability of results from control theory and game theory.

Within multi-agent systems, control and verification have often been seen as niche subjects. This lack of interest is rooted in the fact that state-space approaches to supervision and verification of modular discrete-event systems are PSPACE-complete even in deterministic calculi, and are undecidable in some stochastic calculi. While space-efficient methods may still be possible for some special cases, radically novel methods are required to manage the state-explosion problem. In a string of recent papers, we have developed guarantees for such stochastic models of multi-agent systems, utilizing non-trivial conditions from non-linear control theory (incremental input-to-state stability) and conditions from applied probability (contractivity on average). These allow for the study of both the ergodic properties of the multi-agent system and fairness properties from the individual point of view.
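As a toy illustration of the "contraction on average" condition (our own sketch, not from the tutorial materials), consider the scalar stochastic system x_{t+1} = a_t * x_t + b_t with random gains a_t. If E[log a_t] < 0, the theory of iterated random functions guarantees a unique invariant distribution even though some individual maps expand; the sketch below checks both the coupling of trajectories (the incremental-stability flavour) and the stabilization of time averages (ergodicity):

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy system x_{t+1} = a_t * x_t + b_t: "contractive on average" because
    # E[log a_t] < 0, even though some draws of a_t exceed 1 (local expansion).
    T = 5000
    a = rng.uniform(0.2, 1.3, size=T)
    b = rng.normal(0.0, 0.1, size=T)
    assert np.mean(np.log(a)) < 0

    x, y = 10.0, -10.0            # two very different initial conditions
    gap, xs = [], []
    for t in range(T):
        x = a[t] * x + b[t]
        y = a[t] * y + b[t]
        gap.append(abs(x - y))    # incremental stability: the gap shrinks to 0
        xs.append(x)

    print("final |x - y|:", gap[-1])            # trajectories have coupled
    print("time average:", np.mean(xs[500:]))   # ergodic average has stabilised

In the richer settings of the tutorial, the same ideas apply to multi-agent models where the quantity of interest is the fairness of each individual agent's long-run share.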

 

T2. Handling Multiple Objectives in Single and Multi-Agent Reinforcement Learning

Presenters:

Roxana Rădulescu (contact@roxanaradulescu.com)

Ann Nowé (ann.nowe@vub.be)

Peter Vamplew (p.vamplew@federation.edu.au)

Date: Monday 6 May

Duration: 8:30-12:30

Room: Crystal room 2

Expected background:

Previous experience (however brief) in either game theory, reinforcement learning, or utility theory is desirable but not required.

Expected gained skills:

Understanding of the theory on multi-objective decision-making, in both single and multi-agent settings
Overview of multi-objective approaches and tools for single and multi-agent settings

Brief description:

Many, if not most, real-world decision problems have more than a single objective and, often, more than one agent. As the multi-objective aspect fundamentally changes everything you thought you knew about decision-making in the context of reinforcement learning, in this tutorial we start from what it means to care about more than one aspect of the solution, and why you should consider this when modelling your problem domain. We then turn to what agents should optimise for in multi-objective settings and discuss different assumptions, culminating in the corresponding taxonomies for both multi-objective single- and multi-agent systems, together with the accompanying solution concepts. We advocate and present a utility-based approach as a framework for such settings, and also discuss how this framework can support and address additional ethical concerns such as transparency and accountability. We then follow up with a few initial multi-objective reinforcement learning approaches.
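To make the utility-based framing concrete, here is a minimal sketch (our own, with a hypothetical utility function) of the distinction between optimizing the utility of the expected return and the expected utility per episode, two criteria that only coincide for linear utilities:

    import numpy as np

    rng = np.random.default_rng(1)

    # Toy policy: each episode yields a two-objective return vector (r1, r2).
    episodes = rng.uniform(0, 1, size=(10_000, 2))

    # A hypothetical nonlinear utility over objective vectors.
    def utility(v):
        return np.minimum(v[..., 0], v[..., 1])

    ser = utility(episodes.mean(axis=0))   # SER: utility of the expected return
    esr = utility(episodes).mean()         # ESR: expected utility per episode

    print(f"SER = {ser:.3f}, ESR = {esr:.3f}")   # ~0.500 vs ~0.333

Which criterion is appropriate depends on whether the user cares about average outcomes across many episodes or about the outcome of each single episode, one of the modelling assumptions the taxonomy organizes.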

 

T3. Bandit Learning in Mechanism Design: Matching Markets and Beyond

Presenters:

Shuai Li (Shanghai Jiao Tong University), Fang Kong (Shanghai Jiao Tong University)

Contact email: shuaili8@sjtu.edu.cn

Date: Tuesday 7 May

Duration: 8:30-12:30

Room: Crystal Room 2

Expected Background:

A level of understanding of multi-armed bandits that would typically be acquired in an undergraduate reinforcement learning course.

Some understanding of mechanism design, especially matching markets (at the level of stable matching), would be helpful.

Expected Gained Skills:

Learning basic models of multi-armed bandits and matching markets, including key concepts, theories, and their applications.

Developing insights into learning-algorithm techniques that account for multi-player game interactions in uncertain environments.

Brief Description:

Mechanism design is fundamental in economics and game theory, aiming to develop economic mechanisms that achieve a desired objective. Stable matching is an important problem in this field that characterizes the equilibrium state among agents. Because agents usually have uncertain preferences, bandit learning has recently attracted substantial research attention for this problem. This line of work mainly focuses on two objectives: stable regret, which characterizes the utility of agents relative to the stable matching, and incentive compatibility, which characterizes the robustness of the system. In this tutorial, the participants will first be introduced to the basics of matching markets and multi-armed bandits (MAB). They will then learn about concrete algorithmic techniques and findings in various matching-market scenarios for achieving stable regret minimization and incentive compatibility. Additionally, the tutorial will discuss the application of bandit learning to other mechanism design challenges, such as auctions, and conclude with a discussion of some classic open problems in the field. No detailed prior knowledge of multi-armed bandits or mechanism design is assumed.
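As a flavour of the algorithms covered, here is a minimal explore-then-commit sketch (ours, much simpler than the tutorial's methods): agents first estimate their preferences over arms by sampling, and a matching is then computed on the estimated preferences with Gale-Shapley deferred acceptance:

    import numpy as np

    rng = np.random.default_rng(2)
    n_agents = n_arms = 3
    true_means = rng.uniform(0, 1, size=(n_agents, n_arms))   # unknown to agents
    arm_prefs = [list(rng.permutation(n_agents)) for _ in range(n_arms)]

    # Explore: every agent samples every arm a fixed number of times.
    est = np.array([[rng.normal(true_means[i, j], 0.1, 50).mean()
                     for j in range(n_arms)] for i in range(n_agents)])

    # Commit: deferred acceptance on the estimated preferences.
    def deferred_acceptance(est, arm_prefs):
        prefs = [list(np.argsort(-est[i])) for i in range(n_agents)]
        free, nxt, match = list(range(n_agents)), [0] * n_agents, {}
        while free:
            i = free.pop(0)
            j = prefs[i][nxt[i]]              # i's next most-preferred arm
            nxt[i] += 1
            k = match.get(j)
            if k is None:
                match[j] = i                  # arm j tentatively accepts i
            elif arm_prefs[j].index(i) < arm_prefs[j].index(k):
                match[j] = i                  # arm j trades up, releases k
                free.append(k)
            else:
                free.append(i)                # arm j rejects i
        return match

    print(deferred_acceptance(est, arm_prefs))   # arm -> agent, stable w.r.t. est

Stable regret then measures the loss each agent incurs relative to the stable matching under the true preferences, and the tutorial's algorithms interleave exploration and matching far more carefully than this sketch.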

 

T4. Differentiable Agent-Based Models: Systems, Methods and Applications

Presenters: 

Ayush Chopra: ayushc@mit.edu

Arnau Quera-Bofarull: arnau.quera-bofarull@cs.ox.ac.uk

Sijin Zhang: sijin.Zhang@esr.cri.nz

Joel Dyer: joel.dyer@cs.ox.ac.uk

Date: Monday 6 May

Duration: 14:00-18:00

Room: Crystal Room 2

Expected Background:

Required: Python programming

Good to have: Pytorch experience, some background in deep learning

Expected Gained Skills:

Engineering million-scale multi-agent simulations through tensorization and GPU computation

Bridging automatic differentiation to multi-agent systems research

Leveraging modern deep neural network techniques for calibration and analysis of multi-agent systems

Designing differentiable multi-agent learning algorithms

Deploying multi-agent systems for country-scale decision making.

Summary of tutorial content:

In this tutorial, we will introduce a new paradigm for agent-based models (ABMs), where we leverage automatic differentiation to obtain the simulator's gradients in a fast and accurate way. We will review automatic differentiation, formally define differentiable agent-based modeling, and showcase examples of differentiable ABMs with millions of agents used to study disease spread in multiple countries. We will discuss state-of-the-art methods for the simulation, calibration and analysis of differentiable ABMs, and present the technical advances in modeling and multi-agent learning that enabled them. We will walk through open-source platforms for building differentiable ABMs, and attendees will apply these methods to a few comprehensive examples. Finally, we will deep-dive into a real-world implementation of differentiable ABMs for country-scale disease surveillance, developed in collaboration with a New Zealand Crown Research Institute. The tutorial is of relevance to the AAMAS community as well as local modelers and policy makers in New Zealand.
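As a taste of the approach, the sketch below (our own minimal example, not the tutorial's platform) implements a differentiable SIR-style step in PyTorch: Bernoulli infection draws are replaced by a RelaxedBernoulli (Gumbel-Softmax) so that gradients flow through the simulation and the infection-rate parameter can be calibrated by gradient descent:

    import torch

    torch.manual_seed(0)
    n_agents, horizon = 5000, 20

    def simulate(beta, tau=0.1):
        infected = torch.zeros(n_agents)
        infected[:50] = 1.0                        # seed infections
        prevalence = []
        for _ in range(horizon):
            p = 1.0 - torch.exp(-beta * infected.mean())   # random-mixing contagion
            draw = torch.distributions.RelaxedBernoulli(
                torch.tensor(tau), probs=p.expand(n_agents)).rsample()
            infected = infected + (1.0 - infected) * draw  # differentiable update
            prevalence.append(infected.mean())
        return torch.stack(prevalence)

    target = simulate(torch.tensor(0.6)).detach()          # synthetic observations
    beta = torch.tensor(0.2, requires_grad=True)           # parameter to calibrate
    opt = torch.optim.Adam([beta], lr=0.05)
    for _ in range(200):
        opt.zero_grad()
        loss = ((simulate(beta) - target) ** 2).mean()
        loss.backward()                                    # gradients through the ABM
        opt.step()
    print("calibrated beta:", beta.item())   # approaches 0.6, up to relaxation bias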

 

T5. Recent Developments in Mixed Fair Division

Presenters:
Xinhang Lu (UNSW Sydney), Mashbat Suzuki (UNSW Sydney), and Toby Walsh (UNSW Sydney)
Contact email: xinhang.lu@unsw.edu.au

Date: Tuesday 7 May

Duration: 8:30-12:30

Room: Gallery 4

Expected Background:
No prior knowledge of fair division or game theory is assumed.

Expected Gained Skills:
– Understand the current directions in fair division concerning mixed types of resources.
– Know interesting open questions in those directions.

Brief Description:
Fair division considers the allocation of scarce resources among self-interested agents in such a way that every agent gets a fair share. It is a fundamental problem in society and has received significant attention and rapid developments from the game theory and artificial intelligence communities in recent years. The majority of the fair division literature can be divided along at least two orthogonal directions: goods versus chores, and divisible versus indivisible resources. In this tutorial, besides describing the state of the art, we outline a number of interesting open questions and future directions in three mixed fair division settings:
– indivisible goods and chores,
– divisible and indivisible goods (mixed goods), and
– indivisible goods with subsidy (money).
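As one example of the solution concepts involved, the snippet below (ours, with toy numbers) checks envy-freeness up to one good (EF1), a central fairness notion for indivisible items; the mixed settings above generalize such notions, e.g., to bundles containing both goods and chores:

    # Additive valuations: agent -> {good: value} (toy numbers).
    valuations = {
        "a": {"g1": 4, "g2": 5, "g3": 1},
        "b": {"g1": 4, "g2": 4, "g3": 2},
    }
    allocation = {"a": ["g1"], "b": ["g2", "g3"]}

    def value(i, bundle):
        return sum(valuations[i][g] for g in bundle)

    def is_ef1(allocation):
        for i in allocation:
            for j in allocation:
                if i == j or value(i, allocation[i]) >= value(i, allocation[j]):
                    continue  # no envy from i towards j
                # envy must vanish after removing some single good from j's bundle
                if not any(value(i, allocation[i]) >=
                           value(i, allocation[j]) - valuations[i][g]
                           for g in allocation[j]):
                    return False
        return True

    print(is_ef1(allocation))  # True: a envies {g2, g3}, but not once g2 is dropped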

 

T6. Automated Planning
Presenters: Roman Barták (Charles University)

Date: Tuesday 7 May

Duration: 14:00-18:00

Room: Jade Room 3

Expected Background:
– basic understanding of logic expressions (propositional and predicate logic)
– understanding algorithms
– basic knowledge of search techniques (specifically, algorithm A*)
Expected Gained Skills:
– understanding planning conceptual models and planning techniques
– constructing planning domain models
Brief Description:
Automated planning, that is, reasoning about future actions to achieve some goal, is a fundamental capability of autonomous agents. As a model-based approach, it relies on a formal model describing states of the world and how an agent's actions modify these states. The tutorial gives a survey of the core techniques of automated planning. It introduces the formalism for describing planning domain models and specifying planning tasks. Then, it describes the core concepts of planning techniques, namely state-space planning and plan-space planning, and discusses approaches to improving the efficiency of planning via additional knowledge added to the model in the form of control rules and hierarchical task models. Examples of domain models for selected problems will be given.
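For a concrete feel of state-space planning, here is a compact sketch (ours, with a hypothetical two-action domain) of STRIPS-style actions searched with A* under a simple goal-counting heuristic:

    from heapq import heappush, heappop
    from itertools import count

    # action name -> (preconditions, add effects, delete effects)
    actions = {
        "pick":  ({"handempty", "clear_b"}, {"holding_b"}, {"handempty", "clear_b"}),
        "place": ({"holding_b"}, {"on_b_table", "handempty", "clear_b"}, {"holding_b"}),
    }
    init = frozenset({"handempty", "clear_b"})
    goal = {"on_b_table"}

    def h(state):                       # goal-counting heuristic
        return len(goal - state)

    def plan(init, goal):
        tie = count()                   # tie-breaker so the heap never compares states
        frontier, seen = [(h(init), 0, next(tie), init, [])], set()
        while frontier:
            _, g, _, state, path = heappop(frontier)
            if goal <= state:
                return path
            if state in seen:
                continue
            seen.add(state)
            for name, (pre, add, delete) in actions.items():
                if pre <= state:        # action applicable in this state
                    nxt = frozenset((state - delete) | add)
                    heappush(frontier,
                             (g + 1 + h(nxt), g + 1, next(tie), nxt, path + [name]))

    print(plan(init, goal))             # ['pick', 'place']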

 

T7. Autonomous agents and ABS applied to Bond Markets: Can we build a better market using ABMs?

Presenters:
Alicia Vidler (UNSW), Professor Toby Walsh (UNSW: Chief Scientist, Laureate Fellow & Scientia Professor of AI at UNSW’s AI Institute)
Contact email: a.vidler@unsw.edu.au

Date: Monday 6 May

Duration: 14:00-18:00

Room: Gallery 4

Expected Background:
• No specific knowledge of finance or bonds is expected, and participants should feel comfortable attending even if they have no prior knowledge of financial markets.
• A level of understanding of logic that would typically be acquired in an undergraduate computer science course is assumed.
• A curiosity about the application of (agent-based) models to financial markets is recommended.
Expected Gained Skills:
• Understanding the theory and algorithms applied to financial markets through agent-based models.
• Learning about the complexity of applying agent-based modeling techniques to finance.
Brief Description:
The world of finance presents interesting use cases for new and innovative complex-system modeling. The goal of this tutorial is to illustrate and detail an interdisciplinary approach to financial market modeling using agent-based systems. Exploring complex adaptive financial trading environments through multi-agent-based simulation methods is an innovative approach within the realm of quantitative finance. Despite the dominance of multi-agent reinforcement learning approaches in financial markets with observable data, there exists a set of systemically significant financial markets that pose challenges due to their partial or obscured data availability. We will explore the techniques, challenges and requirements needed to apply agent-based modeling methods to such systemically important bond markets. The tutorial will also discuss what makes bond markets unique, and why developing AI, and in particular explainable AI methods, for these markets has applications to other areas of finance. This tutorial is designed to be of interest to a wide range of audiences. The interdisciplinary approach will allow the technical focus to be of special interest to industry practitioners, whilst researchers will find the details around financial market design and applications insightful. We will begin with an introduction to financial markets, so no prior technical knowledge of financial markets is required.
 

T8. Tutorial on Multi-Agent Optimization

Presenters: Filippo Bistaffa (IIIA-CSIC), Gauthier Picard (ONERA, Université de Toulouse)

Contact email: gauthier.picard@onera.fr

Date: Monday 6 May

Duration: 8:30-12:30

Room: Gallery 4

Expected Background:

– constraint-based reasoning and multi-agent systems
– problem modeling (constraint based modeling, linear programming, optimization)

Expected Gained Skills:

– understanding the optimization problems behind coordination issues
– understanding the CF models and algorithms
– understanding the DCOP models and algorithms
– basic knowledge in modeling real life problems into multi-agent optimization problems such as DCOP and CF
– understanding pros and cons of different solution methods, depending on the operational constraints

Brief Description:

Teams of agents often have to coordinate their decisions in a distributed manner to achieve both individual and shared goals. Examples include service-oriented computing, sensor network problems, and smart home device coordination problems. Such problems can be formalized and solved in different ways, but in general the multi-agent coordination process is non-trivial and NP-hard.
In this Tutorial on Multi-Agent Optimization we will discuss two fundamental approaches that have been proposed in the Multi-Agent Systems (MAS) literature to tackle coordination problems, one based on Coalition Formation (CF) and one based on Distributed Constrained Optimization Problems (DCOPs).
In the first part, we will discuss a core concept used to model MAS, i.e., Characteristic Function Games (CFGs), and the optimal and approximate approaches available for forming coalitions in both unconstrained and constrained CFGs. We will conclude this part by establishing an interesting link between the first and the second part, showing how Constrained Optimization Problems (COPs) can also be used to solve CF problems.
In the second part, we will present an accessible and structured overview of the core concepts and models of DCOPs. We will also expound on the available optimal and suboptimal approaches to solving DCOPs.
Finally, we will invite the attendees to model some sample problems coming from real applications, and discuss the relevant solution methods. The tutorial will conclude with the most recurrent challenges and open questions.
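To ground the second part, here is a minimal sketch (ours) of a graph-colouring DCOP solved with DSA-style local search, one of the suboptimal methods the tutorial covers: each agent owns one variable and moves to its best local value with some activation probability:

    import random

    random.seed(0)
    domain = [0, 1, 2]
    neighbours = {"x1": ["x2"], "x2": ["x1", "x3"], "x3": ["x2"]}  # constraint graph

    def local_cost(a, val, assign):      # cost 1 per neighbour sharing the value
        return sum(val == assign[b] for b in neighbours[a])

    assign = {a: random.choice(domain) for a in neighbours}
    for _ in range(50):                  # DSA-style rounds
        for a in neighbours:
            best = min(domain, key=lambda v: local_cost(a, v, assign))
            if local_cost(a, best, assign) < local_cost(a, assign[a], assign) \
               and random.random() < 0.7:          # activation probability
                assign[a] = best
    print(assign)                        # typically a conflict-free colouring

In a genuine DCOP deployment each agent would evaluate its local cost from messages exchanged with its neighbours rather than from a shared dictionary; the sketch only illustrates the decision rule.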

 

T9. Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity

Presenters:

Hung Le thai.le@deakin.edu.au

Hoang Nguyen s223669184@deakin.edu.au

Dai Do s223540177@deakin.edu.au

Date: Monday 6 May

Duration: 8:30-12:30

Room: Gallery 1

Expected background:

– Basic familiarity with reinforcement learning is assumed, and an additional understanding of deep learning and neural networks would be beneficial.

– No special equipment is required, but attendees are encouraged to bring their laptops to experiment with models hands-on.

Expected gained skills:

– Technical Understanding: Participants will gain a deeper understanding of exploration inefficiency in RL and how intrinsic motivation can address this challenge.

– Practical Skills: Attendees will acquire practical skills in implementing several memory-driven curiosity methods for efficient exploration, through hands-on coding demonstrations.

Brief description:

Despite remarkable successes in various domains such as robotics and games, Reinforcement Learning (RL) still struggles with exploration inefficiency. For example, in hard Atari games, state-of-the-art agents often require billions of trial actions, equivalent to years of practice, while a moderately skilled human player can achieve the same score in just a few hours of play. This contrast emerges from the difference in exploration strategies between humans, who leverage memory, intuition and experience, and current RL agents, who rely primarily on random trial and error. This tutorial reviews recent advances in enhancing RL exploration efficiency through intrinsic motivation, or curiosity, allowing agents to navigate environments without external rewards. Unlike previous surveys, we analyze intrinsic motivation through a memory-centric perspective, drawing parallels between human and agent curiosity, and providing a memory-driven taxonomy of intrinsic motivation approaches. The talk consists of three main parts. Part A provides a brief introduction to RL basics, delves into the historical context of the explore-exploit dilemma, and raises the challenge of exploration inefficiency. In Part B, we present a taxonomy of self-motivated agents leveraging deliberate, RAM-like, and replay memory models to compute surprise, novelty, and goals, respectively. Part C explores advanced topics, presenting recent methods using language models and causality for exploration. Whenever possible, case studies and hands-on coding demonstrations will be presented.
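To preview Part B, the sketch below (ours) shows the episodic-memory flavour of curiosity in its simplest form: the intrinsic reward is the distance from the current state to its nearest neighbour in an episodic memory, so familiar states earn a smaller bonus than novel ones:

    import numpy as np

    rng = np.random.default_rng(3)

    class EpisodicCuriosity:
        def __init__(self):
            self.memory = []

        def intrinsic_reward(self, state):
            if self.memory:
                dists = np.linalg.norm(np.array(self.memory) - state, axis=1)
                bonus = float(dists.min())   # novelty = distance to nearest memory
            else:
                bonus = 1.0                  # first state is maximally novel
            self.memory.append(state)
            return bonus

    curiosity = EpisodicCuriosity()
    for t in range(5):
        state = rng.normal(size=2)           # stand-in for an observation embedding
        r = 0.0 + 0.1 * curiosity.intrinsic_reward(state)  # sparse external reward
        print(f"step {t}: shaped reward = {r:.3f}")

Practical methods replace the raw states with learned embeddings and manage the memory's size, but the shaping principle is the same.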

 

T10. Reinforcement Learning for Operations Research: Unlocking New Possibilities

Presenters:

Xiangfeng Wang (East China Normal University), Junjie Sheng (East China Normal University), Wenhao Li (The Chinese University of Hong Kong, Shenzhen)

Contact email: liwenhao@cuhk.edu.cn

Date: Tuesday 7 May

Duration: 14:00-18:00

Room: Gallery 4

Expected Background:

No prior experience with operations research is required. A basic understanding of reinforcement learning and its principles is recommended to help attendees get the most out of this tutorial.

Expected Gained Skills:

Understand the fundamentals of RL and OR. Comprehend two representative OR problems, learning the contextual knowledge and fundamental principles of cloud resource scheduling (CRS) and multi-agent pathfinding (MAPF).

Gain knowledge about the SOTA methods, techniques, and models in integrating reinforcement learning with CRS and MAPF. Develop practical skills in applying RL strategies to domain-specific tasks or challenges within CRS and MAPF.

Brief Description:

This half-day tutorial is meticulously crafted to usher participants into the dynamic intersection of reinforcement learning (RL) and operations research (OR). Our aim is to unfold the immense potential of RL in addressing a broad spectrum of OR challenges, especially cloud resource scheduling and multi-agent pathfinding. This enriching journey will navigate through key areas including the scope of OR, the synergy between RL and OR, diverse industry case studies (including Huawei Cloud and Geekplus Inc.), and pioneering future directions in both realms. Participants will be immersed in a hands-on learning environment, engaging in interactive sessions and comprehensive case studies. This experience is designed to equip attendees with the skills to apply RL strategies to real-world OR problems effectively. The tutorial caters specifically to RL professionals and enthusiasts eager to expand their horizons into the vast domain of OR. By the conclusion of this tutorial, attendees will not only develop a deep appreciation for the diversity of OR problems but also acquire the capability to devise and implement innovative RL solutions. We encourage an environment of active engagement, inviting attendees to partake in discussions and share their experiences and perspectives at the confluence of RL and OR.

T11. Towards Causal Foundations of Safe AI

Presenters: Tom Everitt, James Fox, Francis Rhys Ward

Date: Tuesday 7 May

Duration: 8:30-12:30

Room: Jade Room 1

Expected background: 
– Basic probability theory.
– Experience with probabilistic graphical models (Bayesian networks) is helpful but not required.

Expected gained skills:
– Knowledge of the SOTA research in causal foundations of agency and safe AI
– Technical competence in analysing incentive concepts in ML system design (e.g., incentives to manipulate user preferences in recommender systems, or to discriminate in hiring algorithms)
– Incentive mitigation methods such as path-specific objectives

Brief description:

With great power comes great responsibility. Artificial intelligence (AI) is rapidly gaining new capabilities, and is increasingly trusted to make decisions impacting humans in significant ways (from self-driving cars to stock trading to hiring decisions). To ensure that AI behaves in ethical and robustly beneficial ways, we must identify potential pitfalls and develop effective mitigation strategies.

In this tutorial, we will explain how (Pearlian) causality offers a useful formal framework for reasoning about AI risk, and describe recent work on this topic. In particular, we'll cover: causal models of agents and how to discover them; causal definitions of fairness, intent, harm, and incentives; and risks from AI such as misgeneralization and preference manipulation, as well as how mitigation techniques including impact measures, interpretability, and path-specific objectives can help address them.
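As a minimal illustration of the causal viewpoint (our toy example, not the presenters' code), incentives can be probed by comparing expected utility under interventions in a small structural causal model: if no fixed intervention do(D=d) matches the utility of a policy that responds to an upstream variable S, the agent has an incentive to observe, and in richer models possibly to manipulate, S:

    import numpy as np

    rng = np.random.default_rng(4)
    N = 100_000
    S = rng.binomial(1, 0.5, N)           # upstream variable (e.g., user preference)

    def expected_utility(D):
        return float((D == S).mean())     # utility: decision matches preference

    for d in (0, 1):
        # fixed interventions ignore S and achieve only chance-level utility
        print(f"E[U | do(D={d})] = {expected_utility(np.full(N, d)):.3f}")   # ~0.5
    print(f"E[U] for the S-responsive policy = {expected_utility(S):.3f}")   # 1.0

The tutorial develops this kind of analysis formally via causal influence diagrams, where such incentive concepts can be read off the graph.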

 

T12. Rethinking Online Content Ecosystem in the Era of Generative AI: A Multiagent System Perspective

Presenters: Haifeng Xu

Date: Tuesday 7 May

Duration: 14:00-18:00

Room: Jade Room 1

Expected background:

Only basic knowledge of multi-agent modeling and game theory is needed. Anyone with an interest in multi-agent perspectives on online platforms is welcome.

Expected gained skills:

The audience will learn:

(1) an overview of recent developments in multi-agent and incentive modeling for online content ecosystems

(2) their research and deployment status at leading platforms like YouTube and Instagram

(3) how content creation via generative AI (GenAI) technology will affect and reshape these ecosystems

(4) lots of open questions along this exciting new frontier

Tutorial Content:

Online content creation/recommendation has emerged as a new format of the Internet economy worth hundreds of billions of dollars. How should we understand the incentives and competition of online content creators? What roles does the platform play in these competitions? Is their competition efficient and, if not, how can the platform optimize its welfare? More importantly, recent advances in generative AI technologies, such as Midjourney, Sora, and ChatGPT, have introduced unprecedented new methods for creating online content. How will these new technologies transform the incentives of human creators? Will they cause distorted competition between humans and AIs? How will this affect the authenticity and richness of online content in the future Internet (which, in turn, is crucial to the development of generative AI technology)?

This tutorial will survey the growing body of recent work that attempts to answer the above questions using multiagent system modeling and analysis. Many open questions, particularly in the second half about human-vs-generative-AI competition, will be discussed in order to engage AAMAS researchers in tackling these hard yet important challenges together.

The tutorial will have two parts.

Part A is about multiagent modeling and analysis of competing content creation, including

A.1 Modeling content creation competition;

A.2 Equilibrium analysis of content creation (see the toy sketch after this list);

A.3 Maximizing welfare of the competition through mechanism design.
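As a toy illustration of A.1-A.2 (ours, with made-up demand numbers), consider creators choosing topics, with each topic's traffic split evenly among the creators on it; best-response dynamics then finds a pure equilibrium of this congestion-style game:

    import numpy as np

    demand = np.array([5.0, 3.0, 1.0])        # platform traffic per topic
    n_creators = 4
    choice = np.zeros(n_creators, dtype=int)  # everyone starts on topic 0

    def payoff(i, topic, choice):
        crowd = 1 + sum(choice[j] == topic for j in range(n_creators) if j != i)
        return demand[topic] / crowd          # exposure share on that topic

    for _ in range(20):                       # best-response dynamics
        stable = True
        for i in range(n_creators):
            best = max(range(len(demand)), key=lambda t: payoff(i, t, choice))
            if payoff(i, best, choice) > payoff(i, choice[i], choice):
                choice[i], stable = best, False
        if stable:
            break
    print("equilibrium topic choices:", choice)   # a pure Nash equilibrium

The models in the tutorial enrich this picture with heterogeneous creator skills, recommendation-driven exposure, and platform-side mechanism design.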

Part B is about preparing content creation for the era of generative AIs, including

B.1 understanding the value of data (e.g., instrumental value of data, intrinsic data value such as Shapley values and RL-based valuation);

B.2 models and mechanisms for data acquisition/pricing;

B.3 Rethinking opportunities and challenges of content creation in the era of generative AI with much open discussion.