Code & Data

On this page we publish data sets and experimental code developed by members of the Hybrid Intelligence Center. We believe these resources to be of use to other scientists, and we share them as part of our commitment to open science.

Personal assistive technologies need a user model that records information about the user’s goals, values, and context. Such user models require updating over time to accommodate changes and continuously align with what the user deems important. 

We performed an exploratory focus group user study, in which we showed participants six scenarios with different variants of how we envision such alignment dialogues might look. The scenarios and dialogues are available on 4TU.ResearchData. The transcriptions of the focus groups are in a separate dataset.

For details, see the paper Acquiring Semantic Knowledge for User Model Updates via Human-Agent Alignment Dialogues.

The award-winning paper Knowledge Engineering for Hybrid Intelligence investigates how classical Knowledge Engineering methods can be adapted for use in HI scenarios, and proposes a new ontology for HI knowledge roles and a set of HI tasks. We have also built an online repository for HI scenarios, allowing reuse, validation and design of existing and new HI applications. For seven Hybrid Intelligence scenarios, the repository provides the task decomposition in markdown as well as a visual (flowchart) representation. The task decomposition consists of a title, a description, a visual task decomposition, terminology, inference steps, and tasks. Users can add new scenarios through the usual GitHub mechanisms.

Memory is a crucial aspect of human interaction as it influences how we perceive, interpret, and respond to new information. Yet, the study of human memory in free-flowing conversations is an under-explored field. To address this gap, we have created the MEMO corpus, a collection of multi-party Zoom discussions annotated with conversational memory and various subjective and objective measures.

What is in MEMO? 

The MEMO corpus is composed of 45 group discussions centred around the topic of Covid-19. These discussions took place with a 3-4 day gap between each session, amounting to a total of 34 hours of discussion. The corpus features 15 groups, with each group consisting of 3 to 6 participants. In total, 59 individuals with diverse backgrounds participated in these discussions, sharing their experiences during the Covid-19 pandemic. Participants were asked to complete questionnaires before and after each session, providing insights into the moments they remembered from each conversation, their personalities, values, and perceptions of each other, the group, and the interaction as a whole. 

First results 

Our first study on the MEMO corpus [1] already indicates that low-level multimodal cues, such as gaze and speaker activity, can predict conversational memorability, and that non-verbal signals can indicate when a memorable moment starts and ends. The study also found that participants’ personal feelings and experiences are the most frequently mentioned grounds for remembering meeting segments. See the video and the publication for more details on this study.


In conclusion, the MEMO corpus provides rich data on human conversational dynamics, including verbal and non-verbal signals, memorable moments, participants’ personality, values, and perceptions of the interaction and others. This information could be used to train machine learning models to recognize social behaviours and understand humans better, which could ultimately lead to the development of more effective and natural human-computer interactions. Additionally, the corpus provides insight into the impact of individual, interpersonal, and group-level factors on memory formation and recall, which could inform the design of more human-centred hybrid intelligence systems that take these factors into account.  

How to obtain 

The corpus is already available on request: contact Maria Tsfasman and Catharine Oertel. In 2024 the corpus will be released in open access for any researcher to use. For any other questions and inquiries, feel free to contact Maria Tsfasman.


  1. Tsfasman, M., Fenech, K., Tarvirdians, M., Lorincz, A., Jonker, C., & Oertel, C. (2022). Towards creating a conversational memory for long-term meeting support: predicting memorable moments in multi-party conversations through eye-gaze. In ICMI 2022 – Proceedings of the 2022 International Conference on Multimodal Interaction (pp. 94-104). (ACM International Conference Proceeding Series). Association for Computing Machinery (ACM).



EMISSOR is a platform to capture multimodal interactions as recordings of episodic experiences with explicit referential interpretations that also yield an episodic Knowledge Graph (eKG).

The platform stores streams of multiple modalities as parallel signals. Each signal is segmented and annotated independently with interpretations. Annotations are eventually mapped to explicit identities and relations in the eKG. As we ground signal segments from different modalities to the same instance representations, we also ground different modalities across each other. Unique to our eKG is that it accepts different interpretations across modalities, sources and experiences, and supports reasoning over conflicting information and uncertainties that may result from multimodal experiences. EMISSOR can record and annotate experiments in virtual and real-world settings, combine data, and evaluate system behavior and performance against preset goals. It can also model the accumulation of knowledge and interpretations in the Knowledge Graph as a result of these episodic experiences.
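The grounding idea described above can be illustrated with a small sketch. This is not the EMISSOR data model or API: the class names, fields, and the `ex:carl` identifier are all hypothetical, chosen only to show how segments from different modalities end up linked through a shared instance in the graph.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Annotation:
    """An interpretation of a signal segment (hypothetical structure)."""
    label: str        # the interpretation, e.g. "face" or "person mention"
    instance_id: str  # hypothetical identifier of the instance in the eKG
    source: str       # who or what produced this interpretation


@dataclass
class Segment:
    """A slice of one signal, with bounds in that signal's own units."""
    start: int
    end: int
    annotations: List[Annotation] = field(default_factory=list)


@dataclass
class Signal:
    """One modality stream (e.g. text or image) within an episode."""
    modality: str
    segments: List[Segment] = field(default_factory=list)


def ground_across_modalities(signals: List[Signal]) -> Dict[str, List[Tuple]]:
    """Group segments by the instance they refer to.

    Segments from different modalities that share an instance id are
    thereby grounded to each other.
    """
    grounded: Dict[str, List[Tuple]] = {}
    for signal in signals:
        for segment in signal.segments:
            for annotation in segment.annotations:
                grounded.setdefault(annotation.instance_id, []).append(
                    (signal.modality, segment.start, segment.end, annotation.label)
                )
    return grounded


# A text mention and an image region that both refer to the same instance.
text = Signal("text", [Segment(0, 5, [Annotation("person mention", "ex:carl", "ner")])])
image = Signal("image", [Segment(10, 42, [Annotation("face", "ex:carl", "face-detector")])])
grounded = ground_across_modalities([text, image])
print(grounded["ex:carl"])
```

Because each annotation records its own source, two sources could attach different interpretations to the same segment without overwriting each other, which is the property the eKG is described as supporting.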

EMISSOR has been published in the Proceedings of the 1st Workshop on Multimodal Semantic Representations (MMSR) and is work by Selen Baez Santamaria, Thomas Baier, Taewoon Kim, Lea Krause, Jaap Kruijt and Piek Vossen.

The code is at

The key arguments underlying a large and noisy set of opinions help understand the opinions quickly and accurately. While NLP methods can extract arguments, they require large labeled datasets and only work well for known viewpoints, but not for novel points of view.

HyEnA, a hybrid (human + AI) method for extracting arguments from opinionated texts, combines the speed of automated processing with the understanding and reasoning capabilities of humans. HyEnA achieves higher coverage and precision than a state-of-the-art automated method, when compared on a common set of diverse opinions, justifying the need for human insight. HyEnA also requires less human effort and does not compromise quality compared to (fully manual) expert analysis, demonstrating the benefit of combining human and machine intelligence.

HyEnA has been published in “HyEnA: A Hybrid Method for Extracting Arguments from Opinions”, M. van der Meer, E. Liscio, C. M. Jonker, A. Plaat, P. Vossen, and P. K. Murukannaiah, in HHAI2022: Augmenting Human Intellect, Amsterdam, the Netherlands, 2022, pp. 17–31.



Space Cannons is a two-player shooting game designed as a reinforcement learning test bed. Space Cannons has two crucial features: (i) it can provide separate scores to agents when they cooperate to score points, depending on the degree of cooperation displayed by the agents. This is done by counting the number of hits by each enemy and then computing a coop-factor to decide if the hit was cooperative or not; (ii) it is designed to also collect demonstration data from human experts. This data can then be used in the training process of any learning algorithm.
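The coop-factor idea can be sketched as follows. The exact definition used in Space Cannons is not given here, so everything below is an assumption: we read the hit counting as "hits landed on each enemy, per agent", define the factor as the smaller agent's share of all hits on that enemy, and pick an arbitrary threshold. The names `HitTracker` and `COOP_THRESHOLD` are hypothetical.

```python
from collections import defaultdict

COOP_THRESHOLD = 0.25  # hypothetical cut-off, not taken from the game


class HitTracker:
    """Counts hits per enemy and per agent (illustrative only)."""

    def __init__(self, agents=("agent_a", "agent_b")):
        self.agents = agents
        # hits[enemy_id][agent_id] -> number of hits that agent landed on that enemy
        self.hits = defaultdict(lambda: defaultdict(int))

    def record_hit(self, enemy_id, agent_id):
        self.hits[enemy_id][agent_id] += 1

    def coop_factor(self, enemy_id):
        """Share of hits contributed by the less active agent (0.0 to 0.5)."""
        counts = [self.hits[enemy_id][agent] for agent in self.agents]
        total = sum(counts)
        if total == 0:
            return 0.0
        return min(counts) / total

    def is_cooperative(self, enemy_id):
        return self.coop_factor(enemy_id) >= COOP_THRESHOLD


tracker = HitTracker()
for agent in ["agent_a", "agent_a", "agent_a", "agent_b"]:
    tracker.record_hit("enemy_1", agent)   # both agents contribute
for _ in range(4):
    tracker.record_hit("enemy_2", "agent_a")  # only one agent contributes

print(tracker.coop_factor("enemy_1"), tracker.is_cooperative("enemy_1"))
print(tracker.coop_factor("enemy_2"), tracker.is_cooperative("enemy_2"))
```

Under these assumptions, a score for destroying an enemy could be split between the agents only when `is_cooperative` holds, and credited to a single agent otherwise.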

Space Cannons was designed by Mehul Verma as his AI MSc thesis at the Vrije Universiteit under supervision of Erman Acar.

It was published in the paper “Learning to Cooperate with Human Evaluative Feedback and Demonstrations”, Mehul Verma and Erman Acar, Hybrid Human-Artificial Intelligence Conference (HHAI) 2022, IOS Press (to appear).

The GitHub repo is at

Documentation for Space Cannons is at

Value alignment is a crucial aspect of ethical multiagent systems. An important step toward value alignment is identifying values specific to an application context. However, identifying context-specific values is complex and cognitively demanding.

The Axies platform simplifies the complex value identification task as a guided value annotation task. Our platform successfully supported the experiments involving two contexts and two groups of annotators by providing an intuitive design that allows the annotators to visualize all components.

Axies has two key features: (1) it requires collaborative work among human annotators, who perform several high-level cognitive tasks, and (2) it exploits natural language processing (NLP) and active learning techniques to guide annotation. The interface can be used on small (e.g., smartphone) and large screens.

Axies has been published in “A Collaborative Platform for Identifying Context-Specific Values”, Enrico Liscio, Michiel van der Meer, Catholijn Jonker, Pradeep Murukannaiah, AAMAS 2021, pp. 1773–1775, as well as a longer journal version.
There’s also a short paper at the HHAI 2022 conference and a poster on Axies.

See for a demonstration.

Axies is available on GitHub and in the TU Delft code repository.

ConfLab is a large-scale multimodal dataset of 48 people in a social networking setting. This dataset is useful for studying group dynamics of a scientific community. Besides the specific dataset, we also make available the general framework for collecting social data, including recommendations for data sharing and data gathering in ecologically valid in-the-wild settings.

The ConfLab data collection framework and dataset have been published in: Raman, C., Vargas-Quiros, J., Tan, S., Gedik, E., Islam, A., & Hung, H. (2022). ConfLab: A Rich Multimodal Multisensor Dataset of Free-Standing Social Interactions In-the-Wild. arXiv preprint arXiv:2205.05177. To appear at NeurIPS 2022.

The dataset can be found at the 4TU data repository:

An agent in RoomEnv has three types of memory systems: short-term, episodic, and semantic, each with different performance properties, and each modeled with a knowledge graph. Every observation is initially stored in the short-term memory system. When it gets full, the agent must decide what to do with the oldest short-term memory. The agent can take one of three actions: 1) forget it completely, 2) move it to the episodic part of the long-term memory system, or 3) move it to the semantic part of the long-term memory system. The memory systems are managed according to the actions taken, and they are eventually used to answer questions. The better the agent manages its memory systems, the more questions it is likely to answer correctly. The environment is OpenAI Gym compatible and highly configurable. For example, you can configure the capacity of the memory systems, the question frequency, whether the semantic memory system is prefilled, etc. Documentation can be found in the GitHub repos.
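The decision loop described above can be sketched in a few lines. This is a toy illustration, not the RoomEnv code or its Gym interface: the `MemorySystems` class, the tuple layout of a memory, the eviction behaviour of a full long-term store, and `toy_policy` are all assumptions made for the example.

```python
from collections import deque

# The three actions described above.
FORGET, TO_EPISODIC, TO_SEMANTIC = 0, 1, 2


class MemorySystems:
    """Toy container for the three memory systems (not the RoomEnv classes)."""

    def __init__(self, short_capacity=1, episodic_capacity=2, semantic_capacity=2):
        self.short_capacity = short_capacity
        self.short = deque()
        # Assumption: a full long-term store silently drops its oldest entry.
        self.episodic = deque(maxlen=episodic_capacity)
        self.semantic = deque(maxlen=semantic_capacity)

    def observe(self, memory, policy):
        """Store an observation; once short-term memory is full, ask the
        policy what to do with the oldest short-term memory."""
        self.short.append(memory)
        if len(self.short) >= self.short_capacity:
            oldest = self.short.popleft()
            action = policy(oldest)
            if action == TO_EPISODIC:
                self.episodic.append(oldest)
            elif action == TO_SEMANTIC:
                self.semantic.append(oldest)
            # FORGET: the memory is simply discarded.


def toy_policy(memory):
    """Hypothetical hand-written rule standing in for a learned agent."""
    head, relation, tail, timestamp = memory
    if relation != "AtLocation":
        return FORGET
    # Possessive heads look like personal events; the rest like general facts.
    return TO_EPISODIC if "'s " in head else TO_SEMANTIC


systems = MemorySystems()
observations = [
    ("Tae's laptop", "AtLocation", "desk", 0),
    ("laptop", "AtLocation", "desk", 1),
    ("laptop", "HasColor", "grey", 2),
]
for observation in observations:
    systems.observe(observation, toy_policy)

print(list(systems.episodic))
print(list(systems.semantic))
```

In a reinforcement learning setting, `toy_policy` would be replaced by the agent's learned action selection, with the reward coming from how many questions the stored memories allow it to answer.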

RoomEnv has been published in the paper “A Machine with Short-Term, Episodic, and Semantic Memory Systems”, Taewoon Kim, Michael Cochez, Vincent François-Lavet, Mark Neerincx, and Piek Vossen, AAAI 2023.

Code for the environment:

Code for the RL (DQN) agent:

A crucial challenge in AI is aligning learning (data) and reasoning (knowledge). We therefore provide two domains that can be used to investigate this connection: the fictional welfare benefit domain and the real-life tort law domain.

In our paper “Arguments, rules and cases in law: Resources for aligning learning and reasoning in structured domains” we provide a formal description of both domains. Using the code in our repository, one can generate artificial datasets that are based on these formal descriptions. Each domain therefore consists of a knowledge representation and a dataset. These have been used in previous experiments on aligning learning and reasoning, and can be used to investigate the connections between arguments, cases, rules, and more.
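To illustrate how a formal description can yield an artificial dataset, here is a minimal sketch. It does not use the code from our repository, and the rule below is a made-up stand-in, not the actual conditions of the welfare benefit or tort law domain; the feature names, value ranges, and the functions `eligible` and `generate_dataset` are hypothetical.

```python
import random


def eligible(case):
    """Hypothetical stand-in rule; NOT the conditions from the paper."""
    return case["age"] >= 65 and case["savings"] < 3000


def generate_dataset(n, seed=0):
    """Sample random cases and label each one with the rule above."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    data = []
    for _ in range(n):
        case = {"age": rng.randint(18, 100), "savings": rng.randint(0, 10000)}
        data.append({**case, "label": eligible(case)})
    return data


dataset = generate_dataset(1000)
print(dataset[0])
print(sum(row["label"] for row in dataset), "of", len(dataset), "cases are eligible")
```

Since every label is produced by a known rule, a model trained on such data can be checked not only for accuracy but also for whether it has learned the rule itself, which is the kind of question these domains are meant to support.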

The code and example datasets are publicly available on GitHub.