Building multimodal AI that can read affect, model minds, and recover what people mean beyond what they say. CogAFFC is motivated by a core gap: multimodal systems can often recognize visible emotion, yet still fail to explain why it emerges, what mental state causes it, and what hidden social meaning is being conveyed. It connects multimodal perception, Theory-of-Mind reasoning, implicit social context, and empathetic intelligence into one coherent agenda.
Flagship research works
Community workshops, surveys
Emotion reasoning
Theory of Mind
Implicit social context
Flagship Research
Layer 1
Multimodal affect signals
Start from the observable world: facial expression, speech, language, video context,
and other cross-modal cues that reveal emotional state.
Emotion perception and recognition
Cross-modal evidence alignment
From cues to grounded observation
Layer 2
Theory-of-Mind reasoning
Move from what is seen to what is mentally simulated: beliefs, intentions, causes,
and intermediate cognitive states that explain emotion.
Benchmarking cognitive depth
Process-level reasoning supervision
Faithful emotional explanation
Layer 3
Implicit social meaning
Go beyond explicit emotion to what people actually imply under social constraints:
politeness, strategic expression, irony, and hidden stance.
THOR-ISA studies implicit sentiment analysis, where opinion cues appear only in
oblique and indirect forms. The work introduces a three-hop Chain-of-Thought prompting
framework that progressively infers the latent aspect, the underlying opinion, and the final sentiment polarity.
Targets ISA settings that require commonsense and multi-hop reasoning over latent intent.
Introduces THOR, a three-step prompting principle for aspect, opinion, and polarity inference.
Improves the state of the art by over 6% F1 in the supervised setup and over 50% F1 in the zero-shot setting.
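The three-hop scheme can be sketched as a chain of prompts, each hop conditioning on the previous hop's answer. The prompt wording, the `llm` callable, and the rule-based stand-in below are illustrative assumptions, not the paper's exact templates:

```python
# A minimal sketch of three-hop Chain-of-Thought prompting in the spirit of
# THOR. Prompt phrasing and the `llm` interface are assumptions for exposition.

def thor_three_hop(llm, context: str, target: str) -> str:
    """Infer sentiment polarity toward `target` via three chained hops."""
    # Hop 1: surface the latent aspect the sentence is really about.
    hop1 = f'Given the sentence "{context}", which specific aspect of {target} is mentioned?'
    aspect = llm(hop1)

    # Hop 2: infer the implicit opinion about that aspect.
    hop2 = (f'Given the sentence "{context}", the mentioned aspect is {aspect}. '
            f'What is the implicit opinion toward this aspect?')
    opinion = llm(hop2)

    # Hop 3: map the inferred opinion to a final polarity label.
    hop3 = (f'Given the sentence "{context}", the aspect is {aspect} and the '
            f'implied opinion is {opinion}. What is the sentiment polarity '
            f'toward {target}: positive, negative, or neutral?')
    return llm(hop3)

def toy_llm(prompt: str) -> str:
    """Rule-based stand-in for an actual LLM call, for demonstration only."""
    if "which specific aspect" in prompt:
        return "the battery life"
    if "implicit opinion" in prompt:
        return "it drains far too quickly"
    return "negative"

print(thor_three_hop(toy_llm, "My phone dies before lunch.", "the phone"))
# prints "negative"
```

Each hop narrows the inference: aspect first, then opinion, then polarity, so the final label is grounded in intermediate reasoning rather than produced in one shot.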
HitEmotion reframes multimodal emotion understanding as a cognition-sensitive task.
It introduces a Theory-of-Mind-grounded benchmark and a reasoning pipeline that
explicitly tracks mental states instead of relying on shallow post-hoc rationales.
Restructures 24 datasets into three levels of cognitive depth.
Diagnoses where current multimodal models break as reasoning gets deeper.
Combines ToM-guided reasoning with TMPO process supervision and reinforcement learning.
MoCA formalizes the problem of recovering what people truly feel, intend, or imply
when their expression is indirect. The work extends the agenda from explicit emotion
understanding to latent social meaning in multimodal social contexts.
Defines three core dimensions: implicit affection, implicit intent, and implicit stance.
Builds a 3,000-instance multimodal benchmark from memes, debates, discussions, and sitcoms.
Introduces CoDAR, a conflict-driven abductive reasoning framework for hidden social inference.
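The conflict-driven idea can be illustrated with a toy abductive loop: detect a clash between modalities, then keep the hypothesis that best explains it. The data structure, the polarity labels, and the irony rule below are assumptions for exposition, not the actual CoDAR design:

```python
# Illustrative sketch of conflict-driven abductive inference; all details
# here are assumed for exposition and do not reproduce CoDAR itself.

from dataclasses import dataclass

@dataclass
class Observation:
    text_sentiment: str    # polarity read from the words alone
    visual_sentiment: str  # polarity read from the image/video alone

def abduce_hidden_stance(obs: Observation) -> str:
    """Resolve a cross-modal conflict by picking the best explanation."""
    # Step 1: detect a conflict between what is said and what is shown.
    if obs.text_sentiment == obs.visual_sentiment:
        # No conflict: take the surface reading at face value.
        return obs.text_sentiment
    # Step 2: abduction — keep the hypothesis consistent with both
    # modalities (here, irony flips the literal text polarity).
    if obs.text_sentiment == "positive" and obs.visual_sentiment == "negative":
        return "negative (likely ironic praise)"
    return "positive (likely self-deprecating humor)"

print(abduce_hidden_stance(Observation("positive", "negative")))
# prints "negative (likely ironic praise)"
```

The design point is that the conflict itself is the signal: agreement licenses the literal reading, while disagreement triggers a search over hidden-meaning hypotheses.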
Coming soon
Community
Workshop
The CogMAEC Workshop series
The workshop anchors the community around cognition-oriented multimodal affective and
empathetic computing. It frames the area around context, causal reasoning, emotion
understanding, and the next generation of affect-aware multimodal models.
Co-located with ACM Multimedia 2025 and organized as the first edition of the CogMAEC series.
Brings together work on affective computing, MLLM reasoning, and cognitive modeling.
Includes invited talks, oral presentations, posters, and 11 accepted papers.