## Publications

### 2022

• E. van Krieken, E. Acar, and F. van Harmelen, “Analyzing differentiable fuzzy logic operators,” Artificial Intelligence, vol. 302, p. 103602, 2022.

The AI community is increasingly putting its attention towards combining symbolic and neural approaches, as it is often argued that the strengths and weaknesses of these approaches are complementary. One recent trend in the literature are weakly supervised learning techniques that employ operators from fuzzy logics. In particular, these use prior background knowledge described in such logics to help the training of a neural network from unlabeled and noisy data. By interpreting logical symbols using neural networks, this background knowledge can be added to regular loss functions, hence making reasoning a part of learning. We study, both formally and empirically, how a large collection of logical operators from the fuzzy logic literature behave in a differentiable learning setting. We find that many of these operators, including some of the most well-known, are highly unsuitable in this setting. A further finding concerns the treatment of implication in these fuzzy logics, and shows a strong imbalance between gradients driven by the antecedent and the consequent of the implication. Furthermore, we introduce a new family of fuzzy implications (called sigmoidal implications) to tackle this phenomenon. Finally, we empirically show that it is possible to use Differentiable Fuzzy Logics for semi-supervised learning, and compare how different operators behave in practice. We find that, to achieve the largest performance improvement over a supervised baseline, we have to resort to non-standard combinations of logical operators which perform well in learning, but no longer satisfy the usual logical laws.
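A minimal sketch (not the authors' code) of two of the differentiable operators the paper analyzes, the product t-norm and the Reichenbach implication, together with a finite-difference check of the antecedent gradient:

```python
import numpy as np

def product_tnorm(a, b):
    """Product t-norm: a differentiable fuzzy conjunction."""
    return a * b

def reichenbach_implication(a, b):
    """Reichenbach fuzzy implication I(a, b) = 1 - a + a*b."""
    return 1.0 - a + a * b

def grad_wrt_antecedent(a, b, f, eps=1e-6):
    """Finite-difference gradient of f with respect to the antecedent a."""
    return (f(a + eps, b) - f(a - eps, b)) / (2 * eps)

# The imbalance the paper discusses: for the Reichenbach implication,
# dI/da = b - 1 and dI/db = a, so when the antecedent is near 0 the
# consequent receives almost no gradient.
a, b = 0.1, 0.9
da = grad_wrt_antecedent(a, b, reichenbach_implication)
```

Since dI/db = a, a false antecedent (a near 0) starves the consequent of gradient; the paper's sigmoidal implications are designed to counteract precisely this kind of imbalance.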

@article{van2022analyzing,
title = {Analyzing differentiable fuzzy logic operators},
author = {van Krieken, Emile and Acar, Erman and van Harmelen,
Frank},
journal = {Artificial Intelligence},
volume = {302},
pages = {103602},
year = {2022},
publisher = {Elsevier},
url = "https://research.vu.nl/ws/portalfiles/portal/146020254/2002.06100v2.pdf",
abstract = "The AI community is increasingly putting its
attention towards combining symbolic and neural
approaches, as it is often argued that the strengths
and weaknesses of these approaches are
complementary. One recent trend in the literature
are weakly supervised learning techniques that
employ operators from fuzzy logics. In particular,
these use prior background knowledge described in
such logics to help the training of a neural network
from unlabeled and noisy data. By interpreting
logical symbols using neural networks, this
background knowledge can be added to regular loss
functions, hence making reasoning a part of
learning. We study, both formally and empirically,
how a large collection of logical operators from the
fuzzy logic literature behave in a differentiable
learning setting. We find that many of these
operators, including some of the most well-known,
are highly unsuitable in this setting. A further
finding concerns the treatment of implication in
these fuzzy logics, and shows a strong imbalance
between gradients driven by the antecedent and the
consequent of the implication. Furthermore, we
introduce a new family of fuzzy implications (called
sigmoidal implications) to tackle this
phenomenon. Finally, we empirically show that it is
possible to use Differentiable Fuzzy Logics for
semi-supervised learning, and compare how different
operators behave in practice. We find that, to
achieve the largest performance improvement over a
supervised baseline, we have to resort to
non-standard combinations of logical operators which
perform well in learning, but no longer satisfy the
usual logical laws."
}

• D. W. Romero, A. Kuzina, E. J. Bekkers, J. M. Tomczak, and M. Hoogendoorn, “CKConv: Continuous Kernel Convolution For Sequential Data,” in International Conference on Learning Representations (ICLR), 2022.

Conventional neural architectures for sequential data present important limitations. Recurrent networks suffer from exploding and vanishing gradients, small effective memory horizons, and must be trained sequentially. Convolutional networks are unable to handle sequences of unknown size and their memory horizon must be defined a priori. In this work, we show that all these problems can be solved by formulating convolutional kernels in CNNs as continuous functions. The resulting Continuous Kernel Convolution (CKConv) allows us to model arbitrarily long sequences in a parallel manner, within a single operation, and without relying on any form of recurrence. We show that Continuous Kernel Convolutional Networks (CKCNNs) obtain state-of-the-art results in multiple datasets, e.g., permuted MNIST, and, thanks to their continuous nature, are able to handle non-uniformly sampled datasets and irregularly-sampled data natively. CKCNNs match or perform better than neural ODEs designed for these purposes in a faster and simpler manner.
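A minimal numerical sketch of the central idea (parameter names and sizes are made up; in CKConv these are learned): the kernel is produced by a small MLP over relative positions, so one convolution can span the whole sequence and the kernel can be re-sampled at any resolution:

```python
import numpy as np

rng = np.random.default_rng(0)

def kernel_mlp(positions, w1, b1, w2):
    """A tiny MLP mapping relative positions to kernel values, so the
    convolutional kernel is a continuous function of position."""
    h = np.tanh(positions[:, None] * w1 + b1)  # (length, hidden)
    return h @ w2                              # (length,)

# Hypothetical (random) parameters; in CKConv these are learned.
w1, b1, w2 = rng.normal(size=8), rng.normal(size=8), rng.normal(size=8)

x = rng.normal(size=100)              # an input sequence
pos = np.linspace(-1.0, 1.0, len(x))  # relative positions
kernel = kernel_mlp(pos, w1, b1, w2)  # one kernel spanning the input

# A single global convolution: fully parallel, no recurrence.
y = np.convolve(x, kernel, mode="same")

# Because the kernel is continuous, it can be evaluated at any set of
# positions, e.g. for a differently or irregularly sampled sequence.
kernel_coarse = kernel_mlp(np.linspace(-1.0, 1.0, 50), w1, b1, w2)
```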

@article{DBLP:journals/corr/abs-2102-02611,
author = {David W. Romero and
Anna Kuzina and
Erik J. Bekkers and
Jakub M. Tomczak and
Mark Hoogendoorn},
title = {CKConv: Continuous Kernel Convolution For Sequential Data},
journal={International Conference on Learning Representations (ICLR)},
year = {2022},
url={https://openreview.net/pdf?id=8FhxBtXSl0},
abstract = {Conventional neural architectures for sequential data present important limitations. Recurrent networks suffer from exploding and vanishing gradients, small effective memory horizons, and must be trained sequentially. Convolutional networks are unable to handle sequences of unknown size and their memory horizon must be defined a priori. In this work, we show that all these problems can be solved by formulating convolutional kernels in CNNs as continuous functions. The resulting Continuous Kernel Convolution (CKConv) allows us to model arbitrarily long sequences in a parallel manner, within a single operation, and without relying on any form of recurrence. We show that Continuous Kernel Convolutional Networks (CKCNNs) obtain state-of-the-art results in multiple datasets, e.g., permuted MNIST, and, thanks to their continuous nature, are able to handle non-uniformly sampled datasets and irregularly-sampled data natively. CKCNNs match or perform better than neural ODEs designed for these purposes in a faster and simpler manner.}
}

• F. Sarvi, M. Heuss, M. Aliannejadi, S. Schelter, and M. de Rijke, “Understanding and Mitigating the Effect of Outliers in Fair Ranking,” in WSDM 2022: The Fifteenth International Conference on Web Search and Data Mining, 2022.

Traditional ranking systems are expected to sort items in the order of their relevance and thereby maximize their utility. In fair ranking, utility is complemented with fairness as an optimization goal. Recent work on fair ranking focuses on developing algorithms to optimize for fairness, given position-based exposure. In contrast, we identify the potential of outliers in a ranking to influence exposure and thereby negatively impact fairness. An outlier in a list of items can alter the examination probabilities, which can lead to different distributions of attention, compared to position-based exposure. We formalize outlierness in a ranking, show that outliers are present in realistic datasets, and present the results of an eye-tracking study, showing that users' scanning order and the exposure of items are influenced by the presence of outliers. We then introduce OMIT, a method for fair ranking in the presence of outliers. Given an outlier detection method, OMIT improves fair allocation of exposure by suppressing outliers in the top-k ranking. Using an academic search dataset, we show that outlierness optimization leads to a fairer policy that displays fewer outliers in the top-k, while maintaining a reasonable trade-off between fairness and utility.
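The paper assumes an outlier detector is given; as a stand-in, a simple z-score rule over a displayed attribute of the top-k illustrates what "suppressing outliers in the top-k" operates on (the numbers below are made up):

```python
import numpy as np

def outlier_mask(topk_attrs, z_thresh=2.0):
    """Flag items in a top-k list whose attention-grabbing attribute
    (e.g. a displayed score) deviates strongly from the rest."""
    s = np.asarray(topk_attrs, dtype=float)
    z = (s - s.mean()) / (s.std() + 1e-12)
    return np.abs(z) > z_thresh

# A top-7 list where one item's displayed attribute is extreme and
# would draw disproportionate attention.
displayed = [0.90, 0.88, 0.87, 0.86, 12.0, 0.85, 0.84]
mask = outlier_mask(displayed)
```

A method like OMIT would then re-rank so that such flagged items do not dominate the exposure of the top-k.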

@inproceedings{sarvi-2022-understanding,
author = {Sarvi, Fatemeh and Heuss, Maria and Aliannejadi,
Mohammad and Schelter, Sebastian and de Rijke,
Maarten},
booktitle = {WSDM 2022: The Fifteenth International Conference on
Web Search and Data Mining},
month = {February},
publisher = {ACM},
title = {Understanding and Mitigating the Effect of Outliers
in Fair Ranking},
year = {2022},
url = {https://arxiv.org/abs/2112.11251},
abstract = "Traditional ranking systems are expected to sort
items in the order of their relevance and thereby
maximize their utility. In fair ranking, utility is
complemented with fairness as an optimization
goal. Recent work on fair ranking focuses on
developing algorithms to optimize for fairness,
given position-based exposure. In contrast, we
identify the potential of outliers in a ranking to
influence exposure and thereby negatively impact
fairness. An outlier in a list of items can alter
the examination probabilities, which can lead to
different distributions of attention, compared to
position-based exposure. We formalize outlierness in
a ranking, show that outliers are present in
realistic datasets, and present the results of an
eye-tracking study, showing that users' scanning
order and the exposure of items are influenced by
the presence of outliers. We then introduce OMIT, a
method for fair ranking in the presence of
outliers. Given an outlier detection method, OMIT
improves fair allocation of exposure by suppressing
outliers in the top-k ranking. Using an academic
search dataset, we show that outlierness
optimization leads to a fairer policy that displays
fewer outliers in the top-k, while maintaining a
reasonable trade-off between fairness and utility."
}

• E. Liscio, M. van der Meer, L. C. Siebert, C. M. Jonker, and P. K. Murukannaiah, “What values should an agent align with?,” Autonomous Agents and Multi-Agent Systems, vol. 36, iss. 23, 2022. doi:10.1007/s10458-022-09550-0

The pursuit of values drives human behavior and promotes cooperation. Existing research is focused on general values (e.g., Schwartz) that transcend contexts. However, context-specific values are necessary to (1) understand human decisions, and (2) engineer intelligent agents that can elicit and align with human values. We propose Axies, a hybrid (human and AI) methodology to identify context-specific values. Axies simplifies the abstract task of value identification as a guided value annotation process involving human annotators. Axies exploits the growing availability of value-laden text corpora and Natural Language Processing to assist the annotators in systematically identifying context-specific values. We evaluate Axies in a user study involving 80 human subjects. In our study, six annotators generate value lists for two timely and important contexts: Covid-19 measures and sustainable Energy. We employ two policy experts and 72 crowd workers to evaluate Axies value lists and compare them to a list of general (Schwartz) values. We find that Axies yields values that are (1) more context-specific than general values, (2) more suitable for value annotation than general values, and (3) independent of the people applying the methodology.

@Article{Liscio2022,
author = {Liscio, Enrico and van der Meer, Michiel and Siebert, Luciano C. and Jonker, Catholijn M. and Murukannaiah, Pradeep K.},
title = {What values should an agent align with?},
journal = {Autonomous Agents and Multi-Agent Systems},
year = 2022,
volume = 36,
number = 23,
month = {March},
DOI = {10.1007/s10458-022-09550-0},
URL = {https://doi.org/10.1007/s10458-022-09550-0},
abstract = {The pursuit of values drives human behavior and promotes cooperation. Existing research is focused on general values (e.g., Schwartz) that transcend contexts. However, context-specific values are necessary to (1) understand human decisions, and (2) engineer intelligent agents that can elicit and align with human values. We propose Axies, a hybrid (human and AI) methodology to identify context-specific values. Axies simplifies the abstract task of value identification as a guided value annotation process involving human annotators. Axies exploits the growing availability of value-laden text corpora and Natural Language Processing to assist the annotators in systematically identifying context-specific values. We evaluate Axies in a user study involving 80 human subjects. In our study, six annotators generate value lists for two timely and important contexts: Covid-19 measures and sustainable Energy. We employ two policy experts and 72 crowd workers to evaluate Axies value lists and compare them to a list of general (Schwartz) values. We find that Axies yields values that are (1) more context-specific than general values, (2) more suitable for value annotation than general values, and (3) independent of the people applying the methodology.}
}

• G. Nadizar, E. Medvet, and K. Miras, “On the Schedule for Morphological Development of Evolved Modular Soft Robots,” in European Conference on Genetic Programming (Part of EvoStar), 2022, p. 146–161.

Development is fundamental for living beings. As robots are often designed to mimic biological organisms, development is believed to be crucial for achieving successful results in robotic agents, as well. What is not clear, though, is the most appropriate scheduling for development. While in real life systems development happens mostly during the initial growth phase of organisms, it has not yet been investigated whether such assumption holds also for artificial creatures. In this paper, we employ an evolutionary approach to optimize the development—according to different representations—of Voxel-based Soft Robots (VSRs), a kind of modular robots. In our study, development consists in the addition of new voxels to the VSR, at fixed time instants, depending on the development schedule. We experiment with different schedules and show that, similarly to living organisms, artificial agents benefit from development occurring at early stages of life more than from development lasting for their entire life.

@inproceedings{nadizar2022schedule,
title={On the Schedule for Morphological Development of Evolved Modular Soft Robots},
author={Nadizar, Giorgia and Medvet, Eric and Miras, Karine},
booktitle={European Conference on Genetic Programming (Part of EvoStar)},
pages={146--161},
year={2022},
organization={Springer},
abstract={Development is fundamental for living beings. As robots are often designed to mimic biological organisms, development is believed to be crucial for achieving successful results in robotic agents, as well. What is not clear, though, is the most appropriate scheduling for development. While in real life systems development happens mostly during the initial growth phase of organisms, it has not yet been investigated whether such assumption holds also for artificial creatures. In this paper, we employ an evolutionary approach to optimize the development—according to different representations—of Voxel-based Soft Robots (VSRs), a kind of modular robots. In our study, development consists in the addition of new voxels to the VSR, at fixed time instants, depending on the development schedule. We experiment with different schedules and show that, similarly to living organisms, artificial agents benefit from development occurring at early stages of life more than from development lasting for their entire life.}
}

• J. Kiseleva, Z. Li, M. Aliannejadi, S. Mohanty, M. ter Hoeve, M. Burtsev, A. Skrynnik, A. Zholus, A. Panov, K. Srinet, A. Szlam, Y. Sun, K. Hofmann, M. Côté, A. Awadallah, L. Abdrazakov, I. Churin, P. Manggala, K. Naszadi, M. van der Meer, and T. Kim, “Interactive Grounded Language Understanding in a Collaborative Environment: IGLU 2021,” arXiv, 2022. doi:10.48550/ARXIV.2205.02388

Human intelligence has the remarkable ability to quickly adapt to new tasks and environments. Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural language instructions. To facilitate research in this direction, we propose \emph{IGLU: Interactive Grounded Language Understanding in a Collaborative Environment}. The primary goal of the competition is to approach the problem of how to build interactive agents that learn to solve a task while provided with grounded natural language instructions in a collaborative environment. Understanding the complexity of the challenge, we split it into sub-tasks to make it feasible for participants.

@article{IGLU2022,
author = {Kiseleva, Julia and Li, Ziming and Aliannejadi, Mohammad
and Mohanty, Shrestha and ter Hoeve, Maartje and
Burtsev, Mikhail and Skrynnik, Alexey and Zholus,
Artem and Panov, Aleksandr and Srinet, Kavya and
Szlam, Arthur and Sun, Yuxuan and Hofmann, Katja and
Côté, Marc-Alexandre and Awadallah, Ahmed and
Abdrazakov, Linar and Churin, Igor and Manggala,
Putra and Naszadi, Kata and van der Meer, Michiel
and Kim, Taewoon},
keywords = {Computation and Language (cs.CL), Artificial
Intelligence (cs.AI), FOS: Computer and information
sciences, FOS: Computer and information sciences},
title = {Interactive Grounded Language Understanding in a
Collaborative Environment: IGLU 2021},
publisher = {arXiv},
year = {2022},
abstract = "Human intelligence has the remarkable ability to quickly
adapt to new tasks and environments. Starting from a
very young age, humans acquire new skills and learn
how to solve new tasks either by imitating the
behavior of others or by following provided natural
language instructions. To facilitate research in
this direction, we propose \emph{IGLU: Interactive
Grounded Language Understanding in a Collaborative
Environment}. The primary goal of the competition
is to approach the problem of how to build
interactive agents that learn to solve a task while
provided with grounded natural language instructions
in a collaborative environment. Understanding the
complexity of the challenge, we split it into
sub-tasks to make it feasible for participants.",
doi = {10.48550/ARXIV.2205.02388},
url = {https://arxiv.org/abs/2205.02388},
}

• D. Grossi, “Social Choice Around the Block: On the Computational Social Choice of Blockchain,” in 21st International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2022, Auckland, New Zealand, May 9-13, 2022, 2022, p. 1788–1793.

One of the most innovative aspects of blockchain technology consists in the introduction of an incentive layer to regulate the behavior of distributed protocols. The designer of a blockchain system faces therefore issues that are akin to those relevant for the design of economic mechanisms, and faces them in a computational setting. From this perspective the present paper argues for the importance of computational social choice in blockchain research. It identifies a few challenges at the interface of the two fields that illustrate the strong potential for cross-fertilization between them.

@inproceedings{DBLP:conf/atal/Grossi22,
author = {Davide Grossi},
editor = {Piotr Faliszewski and
Viviana Mascardi and
Catherine Pelachaud and
Matthew E. Taylor},
title = {Social Choice Around the Block: On the Computational Social Choice
of Blockchain},
booktitle = {21st International Conference on Autonomous Agents and Multiagent
Systems, {AAMAS} 2022, Auckland, New Zealand, May 9-13, 2022},
pages = {1788--1793},
publisher = {International Foundation for Autonomous Agents and Multiagent Systems
{(IFAAMAS)}},
year = {2022},
url = {https://www.ifaamas.org/Proceedings/aamas2022/pdfs/p1788.pdf},
abstract = {One of the most innovative aspects of blockchain technology
consists in the introduction of an incentive layer to regulate the
behavior of distributed protocols. The designer of a blockchain system
faces therefore issues that are akin to those relevant for the design
of economic mechanisms, and faces them in a computational setting.
From this perspective the present paper argues for the importance
of computational social choice in blockchain research. It identifies
a few challenges at the interface of the two fields that illustrate the
strong potential for cross-fertilization between them.}
}

• M. G. Atigh, J. Schoep, E. Acar, N. van Noord, and P. Mettes, “Hyperbolic Image Segmentation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 4453-4462.

For image segmentation, the current standard is to perform pixel-level optimization and inference in Euclidean output embedding spaces through linear hyperplanes. In this work, we show that hyperbolic manifolds provide a valuable alternative for image segmentation and propose a tractable formulation of hierarchical pixel-level classification in hyperbolic space. Hyperbolic Image Segmentation opens up new possibilities and practical benefits for segmentation, such as uncertainty estimation and boundary information for free, zero-label generalization, and increased performance in low-dimensional output embeddings.
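The geometry involved can be illustrated with the Poincaré-ball distance, a standard formula on which hyperbolic classifiers are built (this is not the paper's full segmentation model):

```python
import numpy as np

def poincare_distance(u, v):
    """Geodesic distance in the Poincaré ball (||u||, ||v|| < 1), the
    hyperbolic model typically used for hierarchical embeddings."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    num = 2.0 * np.sum((u - v) ** 2)
    den = (1.0 - np.sum(u * u)) * (1.0 - np.sum(v * v))
    return np.arccosh(1.0 + num / den)

origin = np.zeros(2)
inner = np.array([0.5, 0.0])
edge = np.array([0.9, 0.0])
# Distance grows without bound toward the boundary of the ball, which
# is what lets hyperbolic space embed tree-like label hierarchies
# (and hierarchical pixel classes) with low distortion.
```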

@InProceedings{Atigh_2022_CVPR,
author = {Atigh, Mina Ghadimi and Schoep, Julian and Acar,
Erman and van Noord, Nanne and Mettes, Pascal},
title = {Hyperbolic Image Segmentation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022},
pages = {4453-4462},
url =
{https://openaccess.thecvf.com/content/CVPR2022/papers/Atigh_Hyperbolic_Image_Segmentation_CVPR_2022_paper.pdf},
abstract = {For image segmentation, the current standard is to
perform pixel-level optimization and inference in
Euclidean output embedding spaces through linear
hyperplanes. In this work, we show that hyperbolic
manifolds provide a valuable alternative for image
segmentation and propose a tractable formulation of
hierarchical pixel-level classification in
hyperbolic space. Hyperbolic Image Segmentation
opens up new possibilities and practical benefits
for segmentation, such as uncertainty estimation and
boundary information for free, zero-label
generalization, and increased performance in
low-dimensional output embeddings.}
}

• R. Verma and E. Nalisnick, “Calibrated Learning to Defer with One-vs-All Classifiers,” in ICML 2022 Workshop on Human-Machine Collaboration and Teaming, 2022.

The learning to defer (L2D) framework has the potential to make AI systems safer. For a given input, the system can defer the decision to a human if the human is more likely than the model to take the correct action. We study the calibration of L2D systems, investigating if the probabilities they output are sound. We find that Mozannar & Sontag’s (2020) multiclass framework is not calibrated with respect to expert correctness. Moreover, it is not even guaranteed to produce valid probabilities due to its parameterization being degenerate for this purpose. We propose an L2D system based on one-vs-all classifiers that is able to produce calibrated probabilities of expert correctness. Furthermore, our loss function is also a consistent surrogate for multiclass L2D, like Mozannar & Sontag’s (2020). Our experiments verify that not only is our system calibrated, but this benefit comes at no cost to accuracy. Our model’s accuracy is always comparable (and often superior) to Mozannar & Sontag’s (2020) model’s in tasks ranging from hate speech detection to galaxy classification to diagnosis of skin lesions.
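The one-vs-all idea can be sketched as follows (a schematic with made-up names, not the authors' loss or training procedure): each class, and the event "the expert is correct", gets its own sigmoid, so the expert-correctness probability is a valid probability on its own:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ova_decision(class_logits, expert_logit):
    """One-vs-all L2D sketch: independent sigmoids per class plus one
    for 'the expert would be correct on this input'. Defer when the
    expert is more likely correct than the model's best class."""
    p_classes = sigmoid(np.asarray(class_logits, float))
    p_expert = sigmoid(expert_logit)
    if p_expert > p_classes.max():
        return "defer", p_expert
    return int(np.argmax(p_classes)), p_classes.max()
```

Because each probability comes from its own sigmoid rather than a shared softmax over classes plus a deferral option, P(expert correct) is not entangled with the class scores and can be calibrated separately.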

@inproceedings{Verma-Nalisnick-ICML:2022,
title={Calibrated Learning to Defer with One-vs-All Classifiers},
author={Rajeev Verma and Eric Nalisnick},
year={2022},
booktitle = {ICML 2022 Workshop on Human-Machine Collaboration and Teaming},
abstract={The learning to defer (L2D) framework has the potential to
make AI systems safer. For a given input, the system
can defer the decision to a human if the human is
more likely than the model to take the correct
action. We study the calibration of L2D systems,
investigating if the probabilities they output are
sound. We find that Mozannar & Sontag’s (2020)
multiclass framework is not calibrated with respect
to expert correctness. Moreover, it is not even
guaranteed to produce valid probabilities due to its
parameterization being degenerate for this
purpose. We propose an L2D system based on
one-vs-all classifiers that is able to produce
calibrated probabilities of expert
correctness. Furthermore, our loss function is also
a consistent surrogate for multiclass L2D, like
Mozannar & Sontag’s (2020). Our experiments verify
that not only is our system calibrated, but this
benefit comes at no cost to accuracy. Our model's
accuracy is always comparable (and often superior)
to Mozannar & Sontag’s (2020) model's in tasks
ranging from hate speech detection to galaxy
classification to diagnosis of skin lesions.},
url = {https://icml.cc/Conferences/2022/ScheduleMultitrack?event=18123}
}

• P. Manggala, H. H. Hoos, and E. Nalisnick, “Bayesian Weak Supervision via an Optimal Transport Approach,” in ICML 2022 Workshop on Human-Machine Collaboration and Teaming, 2022.

Large-scale machine learning is often impeded by a lack of labeled training data. To address this problem, the paradigm of weak supervision aims to collect and then aggregate multiple noisy labels. We propose a Bayesian probabilistic model that employs a tractable Sinkhorn-based optimal transport formulation to derive a ground-truth label. The translation between true and weak labels is cast as a transport problem with an inferred cost structure. Our approach achieves strong performance on the WRENCH weak supervision benchmark. Moreover, the posterior distribution over cost matrices allows for exploratory analysis of the weak sources.
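The transport formulation rests on standard Sinkhorn iterations; a generic sketch (not the paper's Bayesian model, which additionally infers the cost matrix) looks like this:

```python
import numpy as np

def sinkhorn_plan(cost, a, b, eps=0.1, n_iter=200):
    """Entropy-regularized optimal transport via Sinkhorn iterations:
    returns a transport plan with row marginals a and column
    marginals b."""
    K = np.exp(-cost / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Toy cost: keeping a weak label equal to the true label is cheap.
cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
a = np.array([0.6, 0.4])  # weak-label mass
b = np.array([0.6, 0.4])  # true-label mass
plan = sinkhorn_plan(cost, a, b)
```

The plan concentrates mass where translation is cheap; in the paper's setting, inspecting the inferred cost structure is what enables the exploratory analysis of weak sources.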

@inproceedings{manggala2022optimaltransportweaksupervision,
title={Bayesian Weak Supervision via an Optimal Transport Approach},
author={Manggala, Putra and Hoos, Holger H. and Nalisnick, Eric},
year={2022},
booktitle = {ICML 2022 Workshop on Human-Machine Collaboration and Teaming},
abstract={Large-scale machine learning is often impeded by a lack of
labeled training data. To address this problem, the
paradigm of weak supervision aims to collect and
then aggregate multiple noisy labels. We propose a
Bayesian probabilistic model that employs a
tractable Sinkhorn-based optimal transport
formulation to derive a ground-truth label. The
translation between true and weak labels is cast as
a transport problem with an inferred cost structure.
Our approach achieves strong performance on the
WRENCH weak supervision benchmark. Moreover, the
posterior distribution over cost matrices allows for
exploratory analysis of the weak sources.},
url = {https://openreview.net/forum?id=YJkf-6tTFiY}
}

• R. Dobbe, “System Safety and Artificial Intelligence,” in Oxford Handbook of AI Governance, 2022 (to appear).

This chapter formulates seven lessons for preventing harm in artificial intelligence (AI) systems based on insights from the field of system safety for software-based automation in safety-critical domains. New applications of AI across societal domains and public organizations and infrastructures come with new hazards, which lead to new forms of harm, both grave and pernicious. The text addresses the lack of consensus for diagnosing and eliminating new AI system hazards. For decades, the field of system safety has dealt with accidents and harm in safety-critical systems governed by varying degrees of software-based automation and decision-making. This field embraces the core assumption of systems and control that AI systems cannot be safeguarded by technical design choices on the model or algorithm alone, instead requiring an end-to-end hazard analysis and design frame that includes the context of use, impacted stakeholders and the formal and informal institutional environment in which the system operates. Safety and other values are then inherently socio-technical and emergent system properties that require design and control measures to instantiate these across the technical, social and institutional components of a system. This chapter honors system safety pioneer Nancy Leveson, by situating her core lessons for today’s AI system safety challenges. For every lesson, concrete tools are offered for rethinking and reorganizing the safety management of AI systems, both in design and governance. This history tells us that effective AI safety management requires transdisciplinary approaches and a shared language that allows involvement of all levels of society.

@incollection{dobbe_system_2022,
title = {System {Safety} and {Artificial} {Intelligence}},
volume = {To Appear},
isbn = {978-0-19-757932-9},
url = {https://arxiv.org/abs/2202.09292},
abstract = {This chapter formulates seven lessons for
preventing harm in artificial intelligence (AI)
systems based on insights from the field of system
safety for software-based automation in
safety-critical domains. New applications of AI
across societal domains and public organizations and
infrastructures come with new hazards, which lead to
new forms of harm, both grave and pernicious. The
text addresses the lack of consensus for diagnosing
and eliminating new AI system hazards. For decades,
the field of system safety has dealt with accidents
and harm in safety-critical systems governed by
varying degrees of software-based automation and
decision-making. This field embraces the core
assumption of systems and control that AI systems
cannot be safeguarded by technical design choices on
the model or algorithm alone, instead requiring an
end-to-end hazard analysis and design frame that
includes the context of use, impacted stakeholders
and the formal and informal institutional
environment in which the system operates. Safety and
other values are then inherently socio-technical and
emergent system properties that require design and
control measures to instantiate these across the
technical, social and institutional components of a
system. This chapter honors system safety pioneer
Nancy Leveson, by situating her core lessons for
today's AI system safety challenges. For every
lesson, concrete tools are offered for rethinking
and reorganizing the safety management of AI
systems, both in design and governance. This history
tells us that effective AI safety management
requires transdisciplinary approaches and a shared
language that allows involvement of all levels of
society.},
booktitle = {Oxford {Handbook} of {AI} {Governance}},
author = {Dobbe, Roel},
year = {2022}
}

• A. Sauter, E. Acar, and V. François-Lavet, “A Meta-Reinforcement Learning Algorithm for Causal Discovery,” arXiv, 2022. doi:10.48550/ARXIV.2207.08457

Causal discovery is a major task with the utmost importance for machine learning since causal structures can enable models to go beyond pure correlation-based inference and significantly boost their performance. However, finding causal structures from data poses a significant challenge both in computational effort and accuracy, let alone its impossibility without interventions in general. In this paper, we develop a meta-reinforcement learning algorithm that performs causal discovery by learning to perform interventions such that it can construct an explicit causal graph. Apart from being useful for possible downstream applications, the estimated causal graph also provides an explanation for the data-generating process. In this article, we show that our algorithm estimates a good graph compared to the SOTA approaches, even in environments whose underlying causal structure is previously unseen. Further, we make an ablation study that shows how learning interventions contribute to the overall performance of our approach. We conclude that interventions indeed help boost the performance, efficiently yielding an accurate estimate of the causal structure of a possibly unseen environment.
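Why interventions help can be seen in a toy structural causal model (hypothetical, not the paper's environments): observational data leaves the edge direction ambiguous, but an intervention breaks the symmetry:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, do_x=None):
    """Toy SCM with ground-truth edge X -> Y; do_x clamps X, i.e. the
    intervention do(X = do_x)."""
    x = rng.normal(size=n) if do_x is None else np.full(n, float(do_x))
    y = 2.0 * x + rng.normal(size=n)
    return x, y

# Observationally, X and Y are merely correlated either way round.
# Under do(X = 3), Y shifts; intervening on Y would leave X unchanged.
# That asymmetry is the signal an intervening agent can exploit.
_, y_obs = sample(10_000)
_, y_do = sample(10_000, do_x=3.0)
```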

@misc{Sauter22MetaRL,
doi = {10.48550/ARXIV.2207.08457},
url = {https://arxiv.org/abs/2207.08457},
author = {Sauter, Andreas and Acar, Erman and François-Lavet, Vincent},
title = {A Meta-Reinforcement Learning Algorithm for Causal Discovery},
abstract = { Causal discovery is a major task with the utmost
importance for machine learning since causal
structures can enable models to go beyond pure
correlation-based inference and significantly boost
their performance. However, finding causal
structures from data poses a significant challenge
both in computational effort and accuracy, let alone
its impossibility without interventions in
general. In this paper, we develop a
meta-reinforcement learning algorithm that performs
causal discovery by learning to perform
interventions such that it can construct an explicit
causal graph. Apart from being useful for possible
downstream applications, the estimated causal graph
also provides an explanation for the data-generating
process. In this article, we show that our algorithm
estimates a good graph compared to the SOTA
approaches, even in environments whose underlying
causal structure is previously unseen. Further, we
make an ablation study that shows how learning
interventions contribute to the overall performance
of our approach. We conclude that interventions
indeed help boost the performance, efficiently
yielding an accurate estimate of the causal
structure of a possibly unseen environment.},
publisher = {arXiv},
year = {2022},
}

### 2021

• M. van Bekkum, M. de Boer, F. van Harmelen, A. Meyer-Vitali, and A. ten Teije, “Modular design patterns for hybrid learning and reasoning systems,” Appl. Intell., vol. 51, iss. 9, p. 6528–6546, 2021. doi:10.1007/s10489-021-02394-3
@article{DBLP:journals/apin/BekkumBHMT21,
author = {Michael van Bekkum and
Maaike de Boer and
Frank van Harmelen and
Andr{\'{e}} Meyer{-}Vitali and
Annette ten Teije},
title = {Modular design patterns for hybrid learning and reasoning systems},
journal = {Appl. Intell.},
volume = {51},
number = {9},
pages = {6528--6546},
year = {2021},
url = {https://doi.org/10.1007/s10489-021-02394-3},
doi = {10.1007/s10489-021-02394-3},
timestamp = {Wed, 01 Sep 2021 12:45:13 +0200},
biburl = {https://dblp.org/rec/journals/apin/BekkumBHMT21.bib},
bibsource = {dblp computer science bibliography, https://dblp.org},
}

• A. Kuzina, M. Welling, and J. M. Tomczak, “Diagnosing Vulnerability of Variational Auto-Encoders to Adversarial Attacks,” in ICLR 2021 Workshop on Robust and Reliable Machine Learning in the Real World, 2021.

In this work, we explore adversarial attacks on Variational Autoencoders (VAEs). We show how to modify a data point to obtain a prescribed latent code (supervised attack) or a drastically different code (unsupervised attack). We examine the influence of model modifications ($\beta$-VAE, NVAE) on the robustness of VAEs and suggest metrics to quantify it.

@inproceedings{kuzina2021diagnosing,
title={Diagnosing Vulnerability of Variational Auto-Encoders to Adversarial Attacks},
author={Kuzina, Anna and Welling, Max and Tomczak, Jakub M},
year={2021},
booktitle = {ICLR 2021 Workshop on Robust and Reliable Machine Learning in the Real World},
url={https://arxiv.org/pdf/2103.06701.pdf},
abstract={In this work, we explore adversarial attacks on the Variational Autoencoders (VAE). We show how to modify data point to obtain a prescribed latent code (supervised attack) or just get a drastically different code (unsupervised attack). We examine the influence of model modifications ($\beta$-VAE, NVAE) on the robustness of VAEs and suggest metrics to quantify it.}
}

• H. Zheng and B. Verheij, “Rules, cases and arguments in artificial intelligence and law,” in Research Handbook on Big Data Law, R. Vogl, Ed., Edward Elgar Publishing, 2021, pp. 373-387.

Artificial intelligence and law is an interdisciplinary field of research that dates back at least to the 1970s, with academic conferences starting in the 1980s. In the field, complex problems are addressed about the computational modeling and automated support of legal reasoning and argumentation. Scholars have different backgrounds, and progress is driven by insights from lawyers, judges, computer scientists, philosophers and others. The community investigates and develops artificial intelligence techniques applicable in the legal domain, in order to enhance access to law for citizens and to support the efficiency and quality of work in the legal domain, aiming to promote a just society. Integral to the legal domain, legal reasoning and its structure and process have gained much attention in AI & Law research. Such research is today especially relevant, since in these days of big data and widespread use of algorithms, there is a need in AI to connect knowledge-based and data-driven AI techniques in order to arrive at a social, explainable and responsible AI. By considering knowledge in the form of rules and data in the form of cases connected by arguments, the field of AI & Law contributes relevant representations and algorithms for handling a combination of knowledge and data. In this chapter, as an entry point into the literature on AI & Law, three major styles of modeling legal reasoning are studied: rule-based reasoning, case-based reasoning and argument-based reasoning, which are the focus of this chapter. We describe selected key ideas, leaving out formal detail. As we will see, these styles of modeling legal reasoning are related, and there is much research investigating relations. We use the example domain of Dutch tort law (Section 2) to illustrate these three major styles, which are then more fully explained (Sections 3 to 5)

@InCollection{Zheng:2021,
author = {H. Zheng and B. Verheij},
title = {Rules, cases and arguments in artificial intelligence and law},
booktitle = {Research Handbook on Big Data Law},
publisher = {Edward Elgar Publishing},
editor = {R. Vogl},
year = 2021,
url = {https://www.ai.rug.nl/~verheij/publications/handbook2021.htm},
pages = {373-387},
abstract = {Artificial intelligence and law is an interdisciplinary field of research that dates back at least to the 1970s, with academic conferences starting in the 1980s. In the field, complex problems are addressed about the computational modeling and automated support of legal reasoning and argumentation. Scholars have different backgrounds, and progress is driven by insights from lawyers, judges, computer scientists, philosophers and others. The community investigates and develops artificial intelligence techniques applicable in the legal domain, in order to enhance access to law for citizens and to support the efficiency and quality of work in the legal domain, aiming to promote a just society. Integral to the legal domain, legal reasoning and its structure and process have gained much attention in AI & Law research. Such research is today especially relevant, since in these days of big data and widespread use of algorithms, there is a need in AI to connect knowledge-based and data-driven AI techniques in order to arrive at a social, explainable and responsible AI. By considering knowledge in the form of rules and data in the form of cases connected by arguments, the field of AI & Law contributes relevant representations and algorithms for handling a combination of knowledge and data. In this chapter, as an entry point into the literature on AI & Law, three major styles of modeling legal reasoning are studied: rule-based reasoning, case-based reasoning and argument-based reasoning, which are the focus of this chapter. We describe selected key ideas, leaving out formal detail. As we will see, these styles of modeling legal reasoning are related, and there is much research investigating relations. We use the example domain of Dutch tort law (Section 2) to illustrate these three major styles, which are then more fully explained (Sections 3 to 5)}
}

• A. C. Kurtan and P. Yolum, “Assisting humans in privacy management: an agent-based approach,” Autonomous Agents and Multi-Agent Systems, vol. 35, iss. 7, 2021. doi:10.1007/s10458-020-09488-1

Image sharing is a service offered by many online social networks. In order to preserve the privacy of images, users need to think through and specify a privacy setting for each image that they upload. This is difficult for two main reasons: first, research shows that many times users do not know their own privacy preferences, but only become aware of them over time. Second, even when users know their privacy preferences, editing these privacy settings is cumbersome and requires too much effort, interfering with the quick sharing behavior expected on an online social network. Accordingly, this paper proposes a privacy recommendation model for images using tags and an agent that implements this, namely pelte. Each user agent makes use of the privacy settings that its user has set for previous images to automatically predict the privacy setting for an image that is uploaded to be shared. When in doubt, the agent analyzes the sharing behavior of other users in the user’s network to be able to recommend to its user what should be considered as private. Contrary to existing approaches that assume all the images are available to a centralized model, pelte is compatible with distributed environments, since each agent accesses only the privacy settings of the images that the agent owner has shared or those that have been shared with the user. Our simulations on a real-life dataset show that pelte can accurately predict privacy settings even when a user has shared only a few images with others, the images have only a few tags or the user’s friends have varying privacy preferences.

@Article{kurtan-yolum-21,
author = {A. Can Kurtan and P{\i}nar Yolum},
title = {Assisting humans in privacy management: an agent-based approach},
journal = {Autonomous Agents and Multi-Agent Systems},
year = {2021},
volume = {35},
number = {7},
abstract = {Image sharing is a service offered by many online social networks. In order to preserve privacy of images, users need to think through and specify a privacy setting for each image that they upload. This is difficult for two main reasons: first, research shows that many times users do not know their own privacy preferences, but only become aware of them over time. Second, even when users know their privacy preferences, editing these privacy settings is cumbersome and requires too much effort, interfering with the quick sharing behavior expected on an online social network. Accordingly, this paper proposes a privacy recommendation model for images using tags and an agent that implements this, namely pelte. Each user agent makes use of the privacy settings that its user have set for previous images to predict automatically the privacy setting for an image that is uploaded to be shared. When in doubt, the agent analyzes the sharing behavior of other users in the user's network to be able to recommend to its user about what should be considered as private. Contrary to existing approaches that assume all the images are available to a centralized model, pelte is compatible to distributed environments since each agent accesses only the privacy settings of the images that the agent owner has shared or those that have been shared with the user. Our simulations on a real-life dataset shows that pelte can accurately predict privacy settings even when a user has shared a few images with others, the images have only a few tags or the user's friends have varying privacy preferences.},
doi = {10.1007/s10458-020-09488-1}
}

• E. Liscio, M. van der Meer, L. C. Siebert, C. M. Jonker, N. Mouter, and P. K. Murukannaiah, “Axies: Identifying and Evaluating Context-Specific Values,” in Proc. of the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021), Online, 2021, p. 799–808.

The pursuit of values drives human behavior and promotes cooperation. Existing research is focused on general (e.g., Schwartz) values that transcend contexts. However, context-specific values are necessary to (1) understand human decisions, and (2) engineer intelligent agents that can elicit human values and take value-aligned actions. We propose Axies, a hybrid (human and AI) methodology to identify context-specific values. Axies simplifies the abstract task of value identification as a guided value annotation process involving human annotators. Axies exploits the growing availability of value-laden text corpora and Natural Language Processing to assist the annotators in systematically identifying context-specific values. We evaluate Axies in a user study involving 60 subjects. In our study, six annotators generate value lists for two timely and important contexts: Covid-19 measures, and sustainable Energy. Then, two policy experts and 52 crowd workers evaluate Axies value lists. We find that Axies yields values that are context-specific, consistent across different annotators, and comprehensible to end users.

@inproceedings{Liscio2021a,
author = {Liscio, Enrico and van der Meer, Michiel and
Siebert, Luciano C. and Jonker, Catholijn M. and
Mouter, Niek and Murukannaiah, Pradeep K.},
booktitle = {Proc. of the 20th International Conference on
Autonomous Agents and Multiagent Systems (AAMAS
2021)},
keywords = {Context,Ethics,Natural Language
Processing,Values,acm reference format,catholijn
m,context,enrico liscio,ethics,jonker,luciano
c,michiel van der meer,natural language
processing,siebert,values},
pages = {799--808},
publisher = {IFAAMAS},
title = {{Axies: Identifying and Evaluating Context-Specific
Values}},
year = {2021},
abstract = "The pursuit of values drives human behavior and
promotes cooperation. Existing research is focused
on general (e.g., Schwartz) values that transcend
contexts. However, context-specific values are
necessary to (1) understand human decisions, and (2)
engineer intelligent agents that can elicit human
values and take value-aligned actions. We propose
Axies, a hybrid (human and AI) methodology to
identify context-specific values. Axies simplifies
the abstract task of value identification as a
guided value annotation process involving human
annotators. Axies exploits the growing availability
of value-laden text corpora and Natural Language
Processing to assist the annotators in
systematically identifying context-specific values.
We evaluate Axies in a user study involving 60
subjects. In our study, six annotators generate
value lists for two timely and important contexts:
Covid-19 measures, and sustainable Energy. Then, two
policy experts and 52 crowd workers evaluate Axies
value lists. We find that Axies yields values that
are context-specific, consistent across different
annotators, and comprehensible to end users"
}

• E. Liscio, M. van der Meer, C. M. Jonker, and P. K. Murukannaiah, “A Collaborative Platform for Identifying Context-Specific Values,” in Proc. of the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021), Online, 2021, p. 1773–1775.

Value alignment is a crucial aspect of ethical multiagent systems. An important step toward value alignment is identifying values specific to an application context. However, identifying context-specific values is complex and cognitively demanding. To support this process, we develop a methodology and a collaborative web platform that employs AI techniques. We describe this platform, highlighting its intuitive design and implementation.

@inproceedings{Liscio2021,
author = {Liscio, Enrico and van der Meer, Michiel and Jonker,
Catholijn M. and Murukannaiah, Pradeep K.},
booktitle = {Proc. of the 20th International Conference on
Autonomous Agents and Multiagent Systems (AAMAS
2021)},
keywords = {Context,Ethics,Natural Language
Processing,Values,acm reference format,and
liscio,ethics,jonker,michiel van der meer,natural
language processing,values},
pages = {1773--1775},
publisher = {IFAAMAS},
title = {{A Collaborative Platform for Identifying
Context-Specific Values}},
year = {2021},
url =
"https://www.ifaamas.org/Proceedings/aamas2021/pdfs/p1773.pdf",
abstract = "Value alignment is a crucial aspect of ethical
multiagent systems. An important step toward value
alignment is identifying values specific to an
application context. However, identifying
context-specific values is complex and cognitively
demanding. To support this process, we develop a
methodology and a collaborative web platform that
employs AI techniques. We describe this platform,
highlighting its intuitive design and
implementation."
}

• K. Miras, J. Cuijpers, B. Gülhan, and A. Eiben, “The Impact of Early-death on Phenotypically Plastic Robots that Evolve in Changing Environments,” in ALIFE 2021: The 2021 Conference on Artificial Life, 2021.

In this work, we evolve phenotypically plastic robots (robots that adapt their bodies and brains according to environmental conditions) in changing environments. In particular, we investigate how the possibility of death in early environmental conditions impacts evolvability and robot traits. Our results demonstrate that early-death improves the efficiency of the evolutionary process for the earlier environmental conditions. On the other hand, the possibility of early-death in the earlier environmental conditions results in a dramatic loss of performance in the latter environmental conditions.

@inproceedings{miras2021impact,
title = {The Impact of Early-death on Phenotypically Plastic
Robots that Evolve in Changing Environments},
author = {Miras, Karine and Cuijpers, Jim and G{\"u}lhan, B. and Eiben, A.},
booktitle = {ALIFE 2021: The 2021 Conference on Artificial Life},
year = {2021},
organization = {MIT Press},
url =
"https://direct.mit.edu/isal/proceedings-pdf/isal/33/25/1929813/isal_a_00371.pdf",
abstract = "In this work, we evolve phenotypically plastic
robots-robots that adapt their bodies and brains
according to environmental conditions-in changing
environments. In particular, we investigate how the
possibility of death in early environmental
conditions impacts evolvability and robot
traits. Our results demonstrate that early-death
improves the efficiency of the evolutionary process
for the earlier environmental conditions. On the
other hand, the possibility of early-death in the
earlier environmental conditions results in a
dramatic loss of performance in the latter
environmental conditions."
}

• K. Miras, “Constrained by Design: Influence of Genetic Encodings on Evolved Traits of Robots,” Frontiers Robotics AI, vol. 8, p. 672379, 2021. doi:10.3389/frobt.2021.672379

Genetic encodings and their particular properties are known to have a strong influence on the success of evolutionary systems. However, the literature has widely focused on studying the effects that encodings have on performance, i.e., fitness-oriented studies. Notably, this anchoring of the literature to performance is limiting, considering that performance provides bounded information about the behavior of a robot system. In this paper, we investigate how genetic encodings constrain the space of robot phenotypes and robot behavior. In summary, we demonstrate how two generative encodings of different nature lead to very different robots and discuss these differences. Our principal contributions are creating awareness about robot encoding biases, demonstrating how such biases affect evolved morphological, control, and behavioral traits, and finally scrutinizing the trade-offs among different biases.

@article{DBLP:journals/firai/Miras21,
author = {Karine Miras},
title = {Constrained by Design: Influence of Genetic Encodings on Evolved Traits
of Robots},
journal = {Frontiers Robotics {AI}},
volume = {8},
pages = {672379},
year = {2021},
url = {https://doi.org/10.3389/frobt.2021.672379},
doi = {10.3389/frobt.2021.672379},
abstract = "Genetic encodings and their particular properties are known to have a strong influence on the success of evolutionary systems. However, the literature has widely focused on studying the effects that encodings have on performance, i.e., fitness-oriented studies. Notably, this anchoring of the literature to performance is limiting, considering that performance provides bounded information about the behavior of a robot system. In this paper, we investigate how genetic encodings constrain the space of robot phenotypes and robot behavior. In summary, we demonstrate how two generative encodings of different nature lead to very different robots and discuss these differences. Our principal contributions are creating awareness about robot encoding biases, demonstrating how such biases affect evolved morphological, control, and behavioral traits, and finally scrutinizing the trade-offs among different biases."
}

• P. Manggala, H. H. Hoos, and E. Nalisnick, “Bayesian Regression from Multiple Sources of Weak Supervision,” in ICML 2021 Machine Learning for Data: Automated Creation, Privacy, Bias, 2021.

We describe a Bayesian approach to weakly supervised regression. Our proposed framework propagates uncertainty from the weak supervision to an aggregated predictive distribution. We use a generalized Bayes procedure to account for the supervision being weak and therefore likely misspecified.

@inproceedings{manggala2021bayesianregression,
title = {Bayesian Regression from Multiple Sources of Weak
Supervision},
author = {Manggala, Putra and Hoos, Holger H. and Nalisnick,
Eric},
year = {2021},
booktitle = {ICML 2021 Machine Learning for Data: Automated
Creation, Privacy, Bias},
url =
{https://pmangg.github.io/papers/brfmsows_mhn_ml4data_icml.pdf},
abstract = {We describe a Bayesian approach to weakly supervised
regression. Our proposed framework propagates
uncertainty from the weak supervision to an
aggregated predictive distribution. We use a
generalized Bayes procedure to account for the
supervision being weak and therefore likely
misspecified.}
}

• C. Steging, S. Renooij, and B. Verheij, “Discovering the rationale of decisions: towards a method for aligning learning and reasoning,” in ICAIL ’21: Eighteenth International Conference for Artificial Intelligence and Law, São Paulo Brazil, June 21 – 25, 2021, 2021, p. 235–239. doi:10.1145/3462757.3466059

In AI and law, systems that are designed for decision support should be explainable when pursuing justice. In order for these systems to be fair and responsible, they should make correct decisions and make them using a sound and transparent rationale. In this paper, we introduce a knowledge-driven method for model-agnostic rationale evaluation using dedicated test cases, similar to unit-testing in professional software development. We apply this new quantitative human-in-the-loop method in a machine learning experiment aimed at extracting known knowledge structures from artificial datasets from a real-life legal setting. We show that our method allows us to analyze the rationale of black box machine learning systems by assessing which rationale elements are learned or not. Furthermore, we show that the rationale can be adjusted using tailor-made training data based on the results of the rationale evaluation.

@inproceedings{StegingICAIL21,
author = {Cor Steging and
Silja Renooij and
Bart Verheij},
editor = {Juliano Maranh{\~{a}}o},
title = {Discovering the rationale of decisions: towards a method for aligning
learning and reasoning},
booktitle = {{ICAIL} '21: Eighteenth International Conference for Artificial Intelligence
and Law, S{\~{a}}o Paulo Brazil, June 21 - 25, 2021},
pages = {235--239},
publisher = {{ACM}},
year = {2021},
url = {https://doi.org/10.1145/3462757.3466059},
doi = {10.1145/3462757.3466059},
abstract = "In AI and law, systems that are designed for decision support should be explainable when pursuing justice. In order for these systems to be fair and responsible, they should make correct decisions and make them using a sound and transparent rationale. In this paper, we introduce a knowledge-driven method for model-agnostic rationale evaluation using dedicated test cases, similar to unit-testing in professional software development. We apply this new quantitative human-in-the-loop method in a machine learning experiment aimed at extracting known knowledge structures from artificial datasets from a real-life legal setting. We show that our method allows us to analyze the rationale of black box machine learning systems by assessing which rationale elements are learned or not. Furthermore, we show that the rationale can be adjusted using tailor-made training data based on the results of the rationale evaluation."
}

• C. Steging, S. Renooij, and B. Verheij, “Discovering the Rationale of Decisions: Experiments on Aligning Learning and Reasoning,” in 4th EXplainable AI in Law Workshop (XAILA 2021), 2021, p. 235–239.

In AI and law, systems that are designed for decision support should be explainable when pursuing justice. In order for these systems to be fair and responsible, they should make correct decisions and make them using a sound and transparent rationale. In this paper, we introduce a knowledge-driven method for model-agnostic rationale evaluation using dedicated test cases, similar to unit-testing in professional software development. We apply this new method in a set of machine learning experiments aimed at extracting known knowledge structures from artificial datasets from fictional and non-fictional legal settings. We show that our method allows us to analyze the rationale of black-box machine learning systems by assessing which rationale elements are learned or not. Furthermore, we show that the rationale can be adjusted using tailor-made training data based on the results of the rationale evaluation.

@inproceedings{StegingXAILA21,
author = {Cor Steging and
Silja Renooij and
Bart Verheij},
title = {Discovering the Rationale of Decisions: Experiments on Aligning Learning
and Reasoning},
maintitle = {{ICAIL} '21: Eighteenth International Conference for Artificial Intelligence
and Law, S{\~{a}}o Paulo Brazil, June 21 - 25, 2021},
booktitle = {4th EXplainable AI in Law Workshop (XAILA 2021)},
pages = {235--239},
publisher = {{ACM}},
year = {2021},
url = {https://arxiv.org/abs/2105.06758},
abstract = "In AI and law, systems that are designed for decision support should be explainable when pursuing justice. In order for these systems to be fair and responsible, they should make correct decisions and make them using a sound and transparent rationale. In this paper, we introduce a knowledge-driven method for model-agnostic rationale evaluation using dedicated test cases, similar to unit-testing in professional software development. We apply this new method in a set of machine learning experiments aimed at extracting known knowledge structures from artificial datasets from fictional and non-fictional legal settings. We show that our method allows us to analyze the rationale of black-box machine learning systems by assessing which rationale elements are learned or not. Furthermore, we show that the rationale can be adjusted using tailor-made training data based on the results of the rationale evaluation."
}

• U. Khurana, E. Nalisnick, and A. Fokkens, “How Emotionally Stable is ALBERT? Testing Robustness with Stochastic Weight Averaging on a Sentiment Analysis Task,” in Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems, Punta Cana, Dominican Republic, 2021, p. 16–31.

Despite their success, modern language models are fragile. Even small changes in their training pipeline can lead to unexpected results. We study this phenomenon by examining the robustness of ALBERT (Lan et al., 2020) in combination with Stochastic Weight Averaging (SWA), a cheap way of ensembling, on a sentiment analysis task (SST-2). In particular, we analyze SWA’s stability via CheckList criteria (Ribeiro et al., 2020), examining the agreement on errors made by models differing only in their random seed. We hypothesize that SWA is more stable because it ensembles model snapshots taken along the gradient descent trajectory. We quantify stability by comparing the models’ mistakes with Fleiss’ Kappa (Fleiss, 1971) and overlap ratio scores. We find that SWA reduces error rates in general; yet the models still suffer from their own distinct biases (according to CheckList).

@inproceedings{khurana-etal-2021-emotionally,
title = "How Emotionally Stable is {ALBERT}? Testing
Robustness with Stochastic Weight Averaging on a
Sentiment Analysis Task",
author = "Khurana, Urja and Nalisnick, Eric and Fokkens,
Antske",
booktitle = "Proceedings of the 2nd Workshop on Evaluation and
Comparison of NLP Systems",
month = nov,
year = "2021",
address = "Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.eval4nlp-1.3",
pages = "16--31",
abstract = "Despite their success, modern language models are
fragile. Even small changes in their training
pipeline can lead to unexpected results. We study
this phenomenon by examining the robustness of
ALBERT (Lan et al., 2020) in combination with
Stochastic Weight Averaging (SWA){---}a cheap way of
ensembling{---}on a sentiment analysis task
(SST-2). In particular, we analyze SWA{'}s stability
via CheckList criteria (Ribeiro et al., 2020),
examining the agreement on errors made by models
differing only in their random seed. We hypothesize
that SWA is more stable because it ensembles model
snapshots taken along the gradient descent
trajectory. We quantify stability by comparing the
models{'} mistakes with Fleiss{'} Kappa (Fleiss,
1971) and overlap ratio scores. We find that SWA
reduces error rates in general; yet the models still
suffer from their own distinct biases (according to
CheckList).",
}

• M. B. Vessies, S. P. Vadgama, R. R. van de Leur, P. A. F. M. Doevendans, R. J. Hassink, E. Bekkers, and R. van Es, “Interpretable ECG classification via a query-based latent space traversal (qLST),” CoRR, vol. abs/2111.07386, 2021.
@article{DBLP:journals/corr/abs-2111-07386,
author = {Melle B. Vessies and
Rutger R. van de Leur and
Pieter A. F. M. Doevendans and
Rutger J. Hassink and
Erik Bekkers and
Ren{\'{e}} van Es},
title = {Interpretable {ECG} classification via a query-based latent space
traversal (qLST)},
journal = {CoRR},
volume = {abs/2111.07386},
year = {2021},
url = {https://arxiv.org/abs/2111.07386},
abstract = {Electrocardiography (ECG) is an effective and non-invasive diagnostic tool that measures the electrical activity of the heart. Interpretation of ECG signals to detect various abnormalities is a challenging task that requires expertise. Recently, the use of deep neural networks for ECG classification to aid medical practitioners has become popular, but their black box nature hampers clinical implementation. Several saliency-based interpretability techniques have been proposed, but they only indicate the location of important features and not the actual features. We present a novel interpretability technique called qLST, a query-based latent space traversal technique that is able to provide explanations for any ECG classification model. With qLST, we train a neural network that learns to traverse in the latent space of a variational autoencoder trained on a large university hospital dataset with over 800,000 ECGs annotated for 28 diseases. We demonstrate through experiments that we can explain different black box classifiers by generating ECGs through these traversals.}
}

• B. Dudzik and J. Broekens, “A Valid Self-Report is Never Late, Nor is it Early: On Considering the Right Temporal Distance for Assessing Emotional Experience,” in Momentary Emotion Elicitation and Capture Workshop (MEEC’21), Yokohama, Japan, 2021.
@inproceedings{Dudzik2021,
author = {Dudzik, Bernd and Broekens, Joost},
booktitle = {Momentary Emotion Elicitation and Capture Workshop
(MEEC'21), May 9, 2021, Yokohama, Japan},
number = {1},
publisher = {Association for Computing Machinery},
title = {{A Valid Self-Report is Never Late, Nor is it Early:
On Considering the Right Temporal Distance for
Assessing Emotional Experience}},
volume = {1},
year = {2021}
}

• B. Dudzik, S. Columbus, T. M. Hrkalovic, D. Balliet, and H. Hung, “Recognizing Perceived Interdependence in Face-to-Face Negotiations through Multimodal Analysis of Nonverbal Behavior,” in Proceedings of the 2021 International Conference on Multimodal Interaction, New York, NY, USA: Association for Computing Machinery, 2021, p. 121–130. doi:10.1145/3462244.3479935

Enabling computer-based applications to display intelligent behavior in complex social settings requires them to relate to important aspects of how humans experience and understand such situations. One crucial driver of peoples’ social behavior during an interaction is the interdependence they perceive, i.e., how the outcome of an interaction is determined by their own and others’ actions. According to psychological studies, both the nonverbal behavior displayed by Motivated by this, we present a series of experiments to automatically recognize interdependence perceptions in dyadic face-to-face negotiations using these sources. Concretely, our approach draws on a combination of features describing individuals’ Facial, Upper Body, and Vocal Behavior with state-of-the-art algorithms for multivariate time series classification. Our findings demonstrate that differences in some types of interdependence perceptions can be detected through the automatic analysis of nonverbal behaviors. We discuss implications for developing socially intelligent systems and opportunities for future research.

@inproceedings{10.1145/3462244.3479935,
author = {Dudzik, Bernd and Columbus, Simon and Hrkalovic,
Tiffany Matej and Balliet, Daniel and Hung, Hayley},
title = {Recognizing Perceived Interdependence in
Face-to-Face Negotiations through Multimodal
Analysis of Nonverbal Behavior},
year = {2021},
isbn = {9781450384810},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
booktitle = {Proceedings of the 2021 International Conference on
Multimodal Interaction},
pages = {121–130},
numpages = {10},
doi = {10.1145/3462244.3479935},
url = {https://research.tudelft.nl/en/publications/recognizing-perceived-interdependence-in-face-to-face-negotiation},
abstract = {Enabling computer-based applications to display
intelligent behavior in complex social settings
requires them to relate to important aspects of how
humans experience and understand such
situations. One crucial driver of peoples' social
behavior during an interaction is the
interdependence they perceive, i.e., how the outcome
of an interaction is determined by their own and
others' actions. According to psychological studies,
both the nonverbal behavior displayed by Motivated
by this, we present a series of experiments to
automatically recognize interdependence perceptions
in dyadic face-to-face negotiations using these
sources. Concretely, our approach draws on a
combination of features describing individuals'
Facial, Upper Body, and Vocal Behavior with
state-of-the-art algorithms for multivariate time
series classification. Our findings demonstrate that
differences in some types of interdependence
perceptions can be detected through the automatic
analysis of nonverbal behaviors. We discuss
implications for developing socially intelligent
systems and opportunities for future research.}
}

• C. Steging, S. Renooij, and B. Verheij, “Rationale Discovery and Explainable AI,” in Legal Knowledge and Information Systems – JURIX 2021: The Thirty-fourth Annual Conference, Vilnius, Lithuania, 8-10 December 2021, 2021, p. 225–234. doi:10.3233/FAIA210341

The justification of an algorithm’s outcomes is important in many domains, and in particular in the law. However, previous research has shown that machine learning systems can make the right decisions for the wrong reasons: despite high accuracies, not all of the conditions that define the domain of the training data are learned. In this study, we investigate what the system does learn, using state-of-the-art explainable AI techniques. With the use of SHAP and LIME, we are able to show which features impact the decision making process and how the impact changes with different distributions of the training data. However, our results also show that even high accuracy and good relevant feature detection are no guarantee for a sound rationale. Hence these state-of-the-art explainable AI techniques cannot be used to fully expose unsound rationales, further advocating the need for a separate method for rationale evaluation.

@inproceedings{DBLP:conf/jurix/StegingRV21,
author = {Cor Steging and
Silja Renooij and
Bart Verheij},
editor = {Erich Schweighofer},
title = {Rationale Discovery and Explainable {AI}},
booktitle = {Legal Knowledge and Information Systems - {JURIX} 2021: The Thirty-fourth
Annual Conference, Vilnius, Lithuania, 8-10 December 2021},
series = {Frontiers in Artificial Intelligence and Applications},
volume = {346},
pages = {225--234},
publisher = {{IOS} Press},
year = {2021},
url = {https://doi.org/10.3233/FAIA210341},
doi = {10.3233/FAIA210341},
abstract = {The justification of an algorithm’s outcomes is important in many domains, and in particular in the law. However, previous research has shown that machine learning systems can make the right decisions for the wrong reasons: despite high accuracies, not all of the conditions that define the domain of the training data are learned. In this study, we investigate what the system does learn, using state-of-the-art explainable AI techniques. With the use of SHAP and LIME, we are able to show which features impact the decision making process and how the impact changes with different distributions of the training data. However, our results also show that even high accuracy and good relevant feature detection are no guarantee for a sound rationale. Hence these state-of-the-art explainable AI techniques cannot be used to fully expose unsound rationales, further advocating the need for a separate method for rationale evaluation.}
}

• L. Ho, “Knowledge Representation Formalisms for Hybrid Intelligence,” in Doctoral consortium of the Knowledge Representation Conference, KR2021, online, 2021.

Knowledge graphs can play an important role to store and provide access to global knowledge, common and accessible to both human and artificial agents, and store local knowledge of individual agents in a larger network of agents. Studying suitable formalisms to model complex, conflicting, dynamic and contextualised knowledge is still a big challenge. Therefore, we investigate the usage of knowledge representation formalisms to allow artificial intelligence systems to adapt and work with complex, conflicting, dynamic and contextualized knowledge.

@inproceedings{Loan2021,
author = {Ho, Loan},
publisher = {KR Conference},
title = {Knowledge Representation Formalisms for Hybrid
Intelligence},
year = {2021},
booktitle = {Doctoral consortium of the Knowledge Representation Conference, {KR}2021},
abstract = "Knowledge graphs can play an important role to store
and provide access to global knowledge, common and
accessible to both human and artificial agents, and
store local knowledge of individual agents in a
larger network of agents. Studying suitable
formalisms to model complex, conflicting, dynamic
and contextualised knowledge is still a big
challenge. Therefore, we investigate the usage of
knowledge representation formalisms to allow
artificial intelligence systems to adapt and work
with complex, conflicting, dynamic and
contextualized knowledge."
}

• B. H. Kargar, K. Miras, and A. Eiben, “The effect of selecting for different behavioral traits on the evolved gaits of modular robots,” in ALIFE 2021: The 2021 Conference on Artificial Life, 2021.

Moving around in the environment is a fundamental skill for mobile robots. This makes the evolution of an appropriate gait, a pivotal problem in evolutionary robotics. Whereas the majority of the related studies concern robots with predefined modular or legged morphologies and locomotion speed as the optimization objective, here we investigate robots with evolvable morphologies and behavioral traits included in the fitness function. To analyze the effects we consider morphological as well as behavioral features of the evolved robots. To this end, we introduce novel behavioral measures that describe how the robot locomotes and look into the trade-off between them. Our main goal is to gain insights into differences in possible gaits of modular robots and to provide tools to steer evolution towards objectives beyond ‘simple’ speed.

@inproceedings{kargar2021effect,
title = {The effect of selecting for different behavioral
traits on the evolved gaits of modular robots},
author = {Kargar, Babak H and Miras, Karine and Eiben, AE},
booktitle = {ALIFE 2021: The 2021 Conference on Artificial Life},
year = {2021},
organization = {MIT Press},
url = {https://direct.mit.edu/isal/proceedings/isal/33/26/102968},
abstract = {Moving around in the environment is a fundamental
skill for mobile robots. This makes the evolution of
an appropriate gait, a pivotal problem in
evolutionary robotics. Whereas the majority of the
related studies concern robots with predefined
modular or legged morphologies and locomotion speed
as the optimization objective, here we investigate
robots with evolvable morphologies and behavioral
traits included in the fitness function. To analyze
the effects we consider morphological as well as
behavioral features of the evolved robots. To this
end, we introduce novel behavioral measures that
describe how the robot locomotes and look into the
trade-off between them. Our main goal is to gain
insights into differences in possible gaits of
modular robots and to provide tools to steer
evolution towards objectives beyond ‘simple’ speed.}
}

• G. Boomgaard, S. Santamaría, I. Tiddi, R. J. Sips, and Z. Szlávik, “Learning profile-based recommendations for medical search auto-complete,” in AAAI-MAKE 2021 Combining Machine Learning and Knowledge Engineering, 2021, p. 1–13.

Query popularity is a main feature in web-search auto-completion. Several personalization features have been proposed to support specific users’ searches, but often do not meet the privacy requirements of a medical environment (e.g. clinical trial search). Furthermore, in such specialized domains, the differences in user expertise and the domain-specific language users employ are far more widespread than in web-search. We propose a query auto-completion method based on different relevancy and diversity features, which can appropriately meet different user needs. Our method incorporates indirect popularity measures, along with graph topology and semantic features. An evolutionary algorithm optimizes relevance, diversity, and coverage to return a top-k list of query completions to the user. We evaluated our approach quantitatively and qualitatively using query log data from a clinical trial search engine, comparing the effects of different relevancy and diversity settings using domain experts. We found that syntax-based diversity has more impact on effectiveness and efficiency, graph-based diversity shows a more compact list of results, and relevancy has the most effect on indicated preferences.

@inproceedings{boomgaard-etal-2021-learning,
title = "Learning profile-based recommendations for medical
search auto-complete",
author = "Guusje Boomgaard and Selene
Báez Santamaría and Ilaria Tiddi and Robert Jan Sips
and Zoltán Szlávik",
keywords = "Knowledge graphs, Medical information retrieval,
Professional search, Query auto-Completion",
year = "2021",
month = apr,
day = "10",
language = "English",
series = "CEUR Workshop Proceedings",
publisher = "CEUR-WS",
pages = "1--13",
editor = "Andreas Martin and Knut Hinkelmann and Hans-Georg
Fill and Aurona Gerber and Doug Lenat and Reinhard
Stolle and {van Harmelen}, Frank",
booktitle = "AAAI-MAKE 2021 Combining Machine Learning and
Knowledge Engineering",
url = "http://ceur-ws.org/Vol-2846/paper34.pdf",
abstract = "Query popularity is a main feature in web-search
auto-completion. Several personalization features
have been proposed to support specific users'
searches, but often do not meet the privacy
requirements of a medical environment (e.g. clinical
trial search). Furthermore, in such specialized
domains, the differences in user expertise and the
domain-specific language users employ are far more
widespread than in web-search. We propose a query
auto-completion method based on different relevancy
and diversity features, which can appropriately meet
different user needs. Our method incorporates
indirect popularity measures, along with graph
topology and semantic features. An evolutionary
algorithm optimizes relevance, diversity, and
coverage to return a top-k list of query completions
to the user. We evaluated our approach
quantitatively and qualitatively using query log
data from a clinical trial search engine, comparing
the effects of different relevancy and diversity
settings using domain experts. We found that
syntax-based diversity has more impact on
effectiveness and efficiency, graph-based diversity
shows a more compact list of results, and relevancy
has the most effect on indicated preferences.",
}

• S. Baez Santamaria, T. Baier, T. Kim, L. Krause, J. Kruijt, and P. Vossen, “EMISSOR: A platform for capturing multimodal interactions as Episodic Memories and Interpretations with Situated Scenario-based Ontological References,” in Proceedings of the 1st Workshop on Multimodal Semantic Representations (MMSR), Groningen, Netherlands (Online), 2021, p. 56–77.

We present EMISSOR: a platform to capture multimodal interactions as recordings of episodic experiences with explicit referential interpretations that also yield an episodic Knowledge Graph (eKG). The platform stores streams of multiple modalities as parallel signals. Each signal is segmented and annotated independently with interpretation. Annotations are eventually mapped to explicit identities and relations in the eKG. As we ground signal segments from different modalities to the same instance representations, we also ground different modalities across each other. Unique to our eKG is that it accepts different interpretations across modalities, sources and experiences and supports reasoning over conflicting information and uncertainties that may result from multimodal experiences. EMISSOR can record and annotate experiments in virtual and real-world settings, combine data, evaluate system behavior and their performance for preset goals but also model the accumulation of knowledge and interpretations in the Knowledge Graph as a result of these episodic experiences.

@inproceedings{baez-santamaria-etal-2021-emissor,
title = "{EMISSOR}: A platform for capturing multimodal
interactions as Episodic Memories and
Interpretations with Situated Scenario-based
Ontological References",
author = "Baez Santamaria, Selene and Baier, Thomas and Kim,
Taewoon and Krause, Lea and Kruijt, Jaap and Vossen,
Piek",
booktitle = "Proceedings of the 1st Workshop on Multimodal
Semantic Representations (MMSR)",
month = jun,
year = "2021",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.mmsr-1.6",
pages = "56--77",
abstract = "We present EMISSOR: a platform to capture multimodal
interactions as recordings of episodic experiences
with explicit referential interpretations that also
yield an episodic Knowledge Graph (eKG). The
platform stores streams of multiple modalities as
parallel signals. Each signal is segmented and
annotated independently with
interpretation. Annotations are eventually mapped to
explicit identities and relations in the eKG. As we
ground signal segments from different modalities to
the same instance representations, we also ground
different modalities across each other. Unique to
our eKG is that it accepts different interpretations
across modalities, sources and experiences and
supports reasoning over conflicting information and
uncertainties that may result from multimodal
experiences. EMISSOR can record and annotate
experiments in virtual and real-world settings, combine data,
evaluate system behavior and their performance for
preset goals but also model the accumulation of
knowledge and interpretations in the Knowledge Graph
as a result of these episodic experiences.",
}

• B. M. Renting, H. H. Hoos, and C. M. Jonker, “Automated Configuration and Usage of Strategy Portfolios for Bargaining,” in NeurIPS 2021 Workshop on Cooperative AI, 2021.

Bargaining can be used to resolve mixed-motive games in multi-agent systems. Although there is an abundance of negotiation strategies implemented in automated negotiating agents, most agents are based on single fixed strategies, while it is widely acknowledged that there is no single best-performing strategy for all negotiation settings. In this paper, we focus on bargaining settings where opponents are repeatedly encountered, but the bargaining problems change. We introduce a novel method that automatically creates and deploys a portfolio of complementary negotiation strategies using a training set and optimises pay-off in never-before-seen bargaining settings through per-setting strategy selection. Our method relies on the following contributions. We introduce a feature representation that captures characteristics for both the opponent and the bargaining problem. We model the behaviour of an opponent during a negotiation based on its actions, which is indicative of its negotiation strategy, in order to be more effective in future encounters. Our combination of feature-based methods generalises to new negotiation settings, as in practice, over time, it selects effective counter strategies in future encounters. Our approach is tested in an Automated Negotiating Agents Competition (ANAC)-like tournament, and we show that we are capable of winning such a tournament with a 5.6% increase in pay-off compared to the runner-up agent.

@unpublished{Renting2021AutomatedBargaining,
title = {Automated Configuration and Usage of Strategy
Portfolios for Bargaining},
author = {Renting, Bram M. and Hoos, Holger H. and Jonker,
Catholijn M.},
year = {2021},
month = dec,
booktitle = {NeurIPS 2021 Workshop on Cooperative AI},
abstract = {Bargaining can be used to resolve mixed-motive games
in multi-agent systems. Although there is an
abundance of negotiation strategies implemented in
automated negotiating agents, most agents are based
on single fixed strategies, while it is widely
acknowledged that there is no single best-performing
strategy for all negotiation settings. In this
paper, we focus on bargaining settings where
opponents are repeatedly encountered, but the
bargaining problems change. We introduce a novel
method that automatically creates and deploys a
portfolio of complementary negotiation strategies
using a training set and optimises pay-off in
never-before-seen bargaining settings through
per-setting strategy selection. Our method relies on
the following contributions. We introduce a feature
representation that captures characteristics for
both the opponent and the bargaining problem. We
model the behaviour of an opponent during a
negotiation based on its actions, which is
indicative of its negotiation strategy, in order to
be more effective in future encounters. Our
combination of feature-based methods generalises to
new negotiation settings, as in practice, over time,
it selects effective counter strategies in future
encounters. Our approach is tested in an Automated
Negotiating Agents Competition (ANAC)-like
tournament, and we show that we are capable of
winning such a tournament with a 5.6% increase in
pay-off compared to the runner-up agent.},
url = {https://www.cooperativeai.com/neurips-2021/workshop-papers},
}

• R. Dobbe, T. Krendl Gilbert, and Y. Mintz, “Hard choices in artificial intelligence,” Artificial Intelligence, vol. 300, 2021. doi:10.1016/j.artint.2021.103555

As AI systems are integrated into high stakes social domains, researchers now examine how to design and operate them in a safe and ethical manner. However, the criteria for identifying and diagnosing safety risks in complex social contexts remain unclear and contested. In this paper, we examine the vagueness in debates about the safety and ethical behavior of AI systems. We show how this vagueness cannot be resolved through mathematical formalism alone, instead requiring deliberation about the politics of development as well as the context of deployment. Drawing from a new sociotechnical lexicon, we redefine vagueness in terms of distinct design challenges at key stages in AI system development. The resulting framework of Hard Choices in Artificial Intelligence (HCAI) empowers developers by 1. identifying points of overlap between design decisions and major sociotechnical challenges; 2. motivating the creation of stakeholder feedback channels so that safety issues can be exhaustively addressed. As such, HCAI contributes to a timely debate about the status of AI development in democratic societies, arguing that deliberation should be the goal of AI Safety, not just the procedure by which it is ensured.

@article{dobbe_hard_2021,
title = {Hard choices in artificial intelligence},
volume = {300},
issn = {0004-3702},
url = {https://www.sciencedirect.com/science/article/pii/S0004370221001065},
doi = {10.1016/j.artint.2021.103555},
abstract = {As AI systems are integrated into high stakes social
domains, researchers now examine how to design and
operate them in a safe and ethical manner. However,
the criteria for identifying and diagnosing safety
risks in complex social contexts remain unclear and
contested. In this paper, we examine the vagueness
in debates about the safety and ethical behavior of
AI systems. We show how this vagueness cannot be
resolved through mathematical formalism alone,
instead requiring deliberation about the politics of
development as well as the context of
deployment. Drawing from a new sociotechnical
lexicon, we redefine vagueness in terms of distinct
design challenges at key stages in AI system
development. The resulting framework of Hard Choices
in Artificial Intelligence (HCAI) empowers
developers by 1. identifying points of overlap
between design decisions and major sociotechnical
challenges; 2. motivating the creation of
stakeholder feedback channels so that safety issues
can be exhaustively addressed. As such, HCAI
contributes to a timely debate about the status of
AI development in democratic societies, arguing that
deliberation should be the goal of AI Safety, not
just the procedure by which it is ensured.},
language = {en},
urldate = {2021-08-04},
journal = {Artificial Intelligence},
author = {Dobbe, Roel and Krendl Gilbert, Thomas and Mintz, Yonatan},
month = nov,
year = {2021},
keywords = {AI ethics, AI governance, AI regulation, AI safety, Philosophy of artificial intelligence, Sociotechnical systems}
}

### 2020

• B. Verheij, “Artificial intelligence as law,” Artif. Intell. Law, vol. 28, iss. 2, p. 181–206, 2020. doi:10.1007/s10506-020-09266-0
@article{Verheij20,
author = {Bart Verheij},
title = {Artificial intelligence as law},
journal = {Artif. Intell. Law},
volume = {28},
number = {2},
pages = {181--206},
year = {2020},
url = {https://doi.org/10.1007/s10506-020-09266-0},
doi = {10.1007/s10506-020-09266-0},
timestamp = {Fri, 05 Jun 2020 17:08:42 +0200},
biburl = {https://dblp.org/rec/journals/ail/Verheij20.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}

• N. Kökciyan and P. Yolum, “TURP: Managing Trust for Regulating Privacy in Internet of Things,” IEEE Internet Computing, vol. 24, iss. 6, pp. 9-16, 2020. doi:10.1109/MIC.2020.3020006

Internet of Things (IoT) applications, such as smart home or ambient assisted living systems, promise useful services to end users. Most of these services rely heavily on sharing and aggregating information among devices; many times raising privacy concerns. Contrary to traditional systems, where privacy of each user is managed through well-defined policies, the scale, dynamism, and heterogeneity of the IoT systems make it impossible to specify privacy policies for all possible situations. Alternatively, this paper argues that handling of privacy has to be reasoned by the IoT devices, depending on the norms, context, as well as the trust among entities. We present a technique, where an IoT device collects information from others, evaluates the trustworthiness of the information sources to decide the suitability of sharing information with others. We demonstrate the applicability of the technique over an IoT pilot study.

@ARTICLE{turp-ic-2020,
journal={IEEE Internet Computing},
title={TURP: Managing Trust for Regulating Privacy in Internet of Things},
year={2020},
volume={24},
number={6},
pages={9-16},
abstract = {Internet of Things (IoT) applications, such as smart home or ambient assisted living systems, promise useful services to end users. Most of these services rely heavily on sharing and aggregating information among devices; many times raising privacy concerns. Contrary to traditional systems, where privacy of each user is managed through well-defined policies, the scale, dynamism, and heterogeneity of the IoT systems make it impossible to specify privacy policies for all possible situations. Alternatively, this paper argues that handling of privacy has to be reasoned by the IoT devices, depending on the norms, context, as well as the trust among entities. We present a technique, where an IoT device collects information from others, evaluates the trustworthiness of the information sources to decide the suitability of sharing information with others. We demonstrate the applicability of the technique over an IoT pilot study.},
url = {https://webspace.science.uu.nl/~yolum001/papers/InternetComputing-20-TURP.pdf},
doi = {10.1109/MIC.2020.3020006}
}

• O. Ulusoy and P. Yolum, “Agents for Preserving Privacy: Learning and Decision Making Collaboratively,” in Multi-Agent Systems and Agreement Technologies, 2020, p. 116–131. doi:10.1007/978-3-030-66412-1_8

Privacy is a right of individuals to keep personal information to themselves. Often online systems enable their users to select what information they would like to share with others and what information to keep private. When an information pertains only to a single individual, it is possible to preserve privacy by providing the right access options to the user. However, when an information pertains to multiple individuals, such as a picture of a group of friends or a collaboratively edited document, deciding how to share this information and with whom is challenging as individuals might have conflicting privacy constraints. Resolving this problem requires an automated mechanism that takes into account the relevant individuals’ concerns to decide on the privacy configuration of information. Accordingly, this paper proposes an auction-based privacy mechanism to manage the privacy of users when information related to multiple individuals are at stake. We propose to have a software agent that acts on behalf of each user to enter privacy auctions, learn the subjective privacy valuations of the individuals over time, and to bid to respect their privacy. We show the workings of our proposed approach over multiagent simulations.

@InProceedings{ulusoy-yolum-20,
title = "Agents for Preserving Privacy: Learning and Decision
Making Collaboratively",
author = "Ulusoy, Onuralp and Yolum, P{\i}nar",
booktitle = "Multi-Agent Systems and Agreement Technologies",
year = "2020",
publisher = "Springer International Publishing",
pages = "116--131",
abstract = "Privacy is a right of individuals to keep personal
information to themselves. Often online systems
enable their users to select what information they
would like to share with others and what information
to keep private. When an information pertains only
to a single individual, it is possible to preserve
privacy by providing the right access options to the
user. However, when an information pertains to
multiple individuals, such as a picture of a group
of friends or a collaboratively edited document,
deciding how to share this information and with whom
is challenging as individuals might have conflicting
privacy constraints. Resolving this problem requires
an automated mechanism that takes into account the
relevant individuals' concerns to decide on the
privacy configuration of information. Accordingly,
this paper proposes an auction-based privacy
mechanism to manage the privacy of users when
information related to multiple individuals are at
stake. We propose to have a software agent that acts
on behalf of each user to enter privacy auctions,
learn the subjective privacy valuations of the
individuals over time, and to bid to respect their
privacy. We show the workings of our proposed
approach over multiagent simulations.",
isbn = "978-3-030-66412-1",
doi = {10.1007/978-3-030-66412-1_8},
url = {https://webspace.science.uu.nl/~yolum001/papers/ulusoy-yolum-20.pdf}
}

• L. Krause and P. Vossen, “When to explain: Identifying explanation triggers in human-agent interaction,” in 2nd Workshop on Interactive Natural Language Technology for Explainable Artificial Intelligence, Dublin, Ireland, 2020, p. 55–60.

With more agents deployed than ever, users need to be able to interact and cooperate with them in an effective and comfortable manner. Explanations have been shown to increase the understanding and trust of a user in human-agent interaction. There have been numerous studies investigating this effect, but they rely on the user explicitly requesting an explanation. We propose a first overview of when an explanation should be triggered and show that there are many instances that would be missed if the agent solely relies on direct questions. For this, we differentiate between direct triggers such as commands or questions and introduce indirect triggers like confusion or uncertainty detection.

@inproceedings{krause-vossen-2020-explain,
title = "When to explain: Identifying explanation triggers in human-agent interaction",
author = "Krause, Lea and
Vossen, Piek",
booktitle = "2nd Workshop on Interactive Natural Language Technology for Explainable Artificial Intelligence",
month = nov,
year = "2020",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.nl4xai-1.12",
pages = "55--60",
abstract = "With more agents deployed than ever, users need to be able to interact and cooperate with them in an effective and comfortable manner. Explanations have been shown to increase the understanding and trust of a user in human-agent interaction. There have been numerous studies investigating this effect, but they rely on the user explicitly requesting an explanation. We propose a first overview of when an explanation should be triggered and show that there are many instances that would be missed if the agent solely relies on direct questions. For this, we differentiate between direct triggers such as commands or questions and introduce indirect triggers like confusion or uncertainty detection.",
}

• P. K. Murukannaiah, N. Ajmeri, C. M. Jonker, and M. P. Singh, “New Foundations of Ethical Multiagent Systems,” in Proceedings of the 19th Conference on Autonomous Agents and MultiAgent Systems, Auckland, 2020, p. 1706–1710.

Ethics is inherently a multiagent concern. However, research on AI ethics today is dominated by work on individual agents: (1) how an autonomous robot or car may harm or (differentially) benefit people in hypothetical situations (the so-called trolley problems) and (2) how a machine learning algorithm may produce biased decisions or recommendations. The societal framework is largely omitted. To develop new foundations for ethics in AI, we adopt a sociotechnical stance in which agents (as technical entities) help autonomous social entities or principals (people and organizations). This multiagent conception of a sociotechnical system (STS) captures how ethical concerns arise in the mutual interactions of multiple stakeholders. These foundations would enable us to realize ethical STSs that incorporate social and technical controls to respect stated ethical postures of the agents in the STSs. The envisioned foundations require new thinking, along two broad themes, on how to realize (1) an STS that reflects its stakeholders’ values and (2) individual agents that function effectively in such an STS.

@inproceedings{Murukannaiah-2020-AAMASBlueSky-EthicalMAS,
author = {Pradeep K. Murukannaiah and Nirav Ajmeri and
Catholijn M. Jonker and Munindar P. Singh},
title = {New Foundations of Ethical Multiagent Systems},
booktitle = {Proceedings of the 19th Conference on Autonomous
Agents and MultiAgent Systems},
series = {AAMAS '20},
year = {2020},
pages = {1706--1710},
numpages = {5},
keywords = {Agents, ethics},
abstract = {Ethics is inherently a multiagent concern. However,
research on AI ethics today is dominated by work on
individual agents: (1) how an autonomous robot or
car may harm or (differentially) benefit people in
hypothetical situations (the so-called trolley
problems) and (2) how a machine learning algorithm
may produce biased decisions or recommendations. The
societal framework is largely omitted. To develop
new foundations for ethics in AI, we adopt a
sociotechnical stance in which agents (as technical
entities) help autonomous social entities or
principals (people and organizations). This
multiagent conception of a sociotechnical system
(STS) captures how ethical concerns arise in the
mutual interactions of multiple stakeholders. These
foundations would enable us to realize ethical STSs
that incorporate social and technical controls to
respect stated ethical postures of the agents in the
STSs. The envisioned foundations require new
thinking, along two broad themes, on how to realize
(1) an STS that reflects its stakeholders' values
and (2) individual agents that function effectively
in such an STS.}
}

• Z. Akata, D. Balliet, M. de Rijke, F. Dignum, V. Dignum, G. Eiben, A. Fokkens, D. Grossi, K. Hindriks, H. Hoos, H. Hung, C. Jonker, C. Monz, M. Neerincx, F. Oliehoek, H. Prakken, S. Schlobach, L. van der Gaag, F. van Harmelen, H. van Hoof, B. van Riemsdijk, A. van Wynsberghe, R. Verbrugge, B. Verheij, P. Vossen, and M. Welling, “A Research Agenda for Hybrid Intelligence: Augmenting Human Intellect With Collaborative, Adaptive, Responsible, and Explainable Artificial Intelligence,” IEEE Computer, vol. 53, iss. 08, pp. 18-28, 2020. doi:10.1109/MC.2020.2996587

We define hybrid intelligence (HI) as the combination of human and machine intelligence, augmenting human intellect and capabilities instead of replacing them and achieving goals that were unreachable by either humans or machines. HI is an important new research focus for artificial intelligence, and we set a research agenda for HI by formulating four challenges.

@article{9153877,
author = {Z. Akata and D. Balliet and M. de Rijke and
F. Dignum and V. Dignum and G. Eiben and A. Fokkens
and D. Grossi and K. Hindriks and H. Hoos and
H. Hung and C. Jonker and C. Monz and M. Neerincx
and F. Oliehoek and H. Prakken and S. Schlobach and
L. van der Gaag and F. van Harmelen and H. van Hoof
and B. van Riemsdijk and A. van Wynsberghe and
R. Verbrugge and B. Verheij and P. Vossen and
M. Welling},
journal = {IEEE Computer},
title = {A Research Agenda for Hybrid Intelligence:
Augmenting Human Intellect With Collaborative,
Adaptive, Responsible, and Explainable Artificial
Intelligence},
year = {2020},
volume = {53},
number = {08},
issn = {1558-0814},
pages = {18--28},
doi = {10.1109/MC.2020.2996587},
publisher = {IEEE Computer Society},
address = {Los Alamitos, CA, USA},
month = {aug},
url = "http://www.cs.vu.nl/~frankh/postscript/IEEEComputer2020.pdf",
abstract = "We define hybrid intelligence (HI) as the
combination of human and machine intelligence,
augmenting human intellect and capabilities instead
of replacing them and achieving goals that were
unreachable by either humans or machines. HI is an
important new research focus for artificial
intelligence, and we set a research agenda for HI by
formulating four challenges."
}

• B. M. Renting, H. H. Hoos, and C. M. Jonker, “Automated Configuration of Negotiation Strategies,” in Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems, 2020, pp. 1116–1124.

Bidding and acceptance strategies have a substantial impact on the outcome of negotiations in scenarios with linear additive and nonlinear utility functions. Over the years, it has become clear that there is no single best strategy for all negotiation settings, yet many fixed strategies are still being developed. We envision a shift in the strategy design question from: What is a good strategy?, towards: What could be a good strategy? For this purpose, we developed a method leveraging automated algorithm configuration to find the best strategies for a specific set of negotiation settings. By empowering automated negotiating agents using automated algorithm configuration, we obtain a flexible negotiation agent that can be configured automatically for a rich space of opponents and negotiation scenarios. To critically assess our approach, the agent was tested in an ANAC-like bilateral automated negotiation tournament setting against past competitors. We show that our automatically configured agent outperforms all other agents, with a 5.1% increase in negotiation payoff compared to the next-best agent. We note that without our agent in the tournament, the top-ranked agent wins by a margin of only 0.01%.

@inproceedings{Renting2020AutomatedStrategies,
title = {Automated Configuration of Negotiation Strategies},
booktitle = {Proceedings of the 19th International Conference on
Autonomous Agents and MultiAgent Systems},
author = {Renting, Bram M. and Hoos, Holger H. and Jonker,
Catholijn M.},
year = {2020},
month = may,
series = {AAMAS '20},
pages = {1116--1124},
publisher = {International Foundation for Autonomous Agents and
Multiagent Systems},
abstract = {Bidding and acceptance strategies have a substantial
impact on the outcome of negotiations in scenarios
with linear additive and nonlinear utility
functions. Over the years, it has become clear that
there is no single best strategy for all negotiation
settings, yet many fixed strategies are still being
developed. We envision a shift in the strategy
design question from: What is a good strategy?,
towards: What could be a good strategy? For this
purpose, we developed a method leveraging automated
algorithm configuration to find the best strategies
for a specific set of negotiation settings. By
empowering automated negotiating agents using
automated algorithm configuration, we obtain a
flexible negotiation agent that can be configured
automatically for a rich space of opponents and
negotiation scenarios. To critically assess our
approach, the agent was tested in an ANAC-like
bilateral automated negotiation tournament setting
against past competitors. We show that our
automatically configured agent outperforms all other
agents, with a 5.1% increase in negotiation payoff
compared to the next-best agent. We note that
without our agent in the tournament, the top-ranked
agent wins by a margin of only 0.01%.},
isbn = {978-1-4503-7518-4},
keywords = {automated algorithm configuration, automated
negotiation, negotiation strategy},
url = {https://ifaamas.org/Proceedings/aamas2020/pdfs/p1116.pdf}
}
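
The core idea of the Renting et al. paper — treating a negotiation strategy's design choices as tunable parameters and letting an automated configurator search for the best setting across a set of negotiation scenarios — can be illustrated with a minimal sketch. This is not the paper's actual method (which uses a full algorithm-configuration tool against ANAC opponents); the strategy, its parameters `e` and `accept`, and the random-search configurator below are all simplified illustrations.

```python
import random

# Hypothetical time-dependent bidding strategy: the target utility decays
# from 1.0 toward 0 as the deadline approaches. The concession exponent `e`
# and the minimum acceptance threshold `accept` are the tunable parameters.
def target_utility(t, e):
    """Target utility at normalized time t in [0, 1] for concession exponent e."""
    return 1.0 - t ** e

def simulate_negotiation(e, accept, opponent_offers):
    """Toy evaluation: accept the first opponent offer (a utility for us)
    that meets both the current target and the acceptance threshold;
    payoff is 0.0 on disagreement."""
    n = len(opponent_offers)
    for i, offer in enumerate(opponent_offers):
        t = i / (n - 1)
        if offer >= accept and offer >= target_utility(t, e):
            return offer
    return 0.0

def configure(settings, n_samples=200, seed=42):
    """Random-search configurator: sample parameter values and keep the
    configuration with the best mean payoff across all settings."""
    rng = random.Random(seed)
    best, best_score = None, -1.0
    for _ in range(n_samples):
        e = rng.uniform(0.2, 5.0)       # concession speed
        accept = rng.uniform(0.3, 0.9)  # minimum acceptable utility
        score = sum(simulate_negotiation(e, accept, s) for s in settings) / len(settings)
        if score > best_score:
            best, best_score = (e, accept), score
    return best, best_score

# Each "setting" is just a sequence of opponent offers; in the paper, a
# setting is a full opponent/scenario combination in a tournament.
settings = [[0.2, 0.4, 0.55, 0.7, 0.8], [0.1, 0.3, 0.5, 0.6, 0.65]]
params, score = configure(settings)
print(params, score)
```

The point of the sketch is the separation of concerns: the strategy exposes a parameter space, the evaluation defines a payoff over negotiation settings, and the configurator (here naive random search; in practice a tool such as SMAC) searches that space — yielding an agent that is configured for a target distribution of opponents rather than fixed at design time.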