
Overview of BAM: The Perceptual Boundary Alignment & Manipulation Framework
Overview
Human decision-making in cognitive tasks and daily life exhibits considerable variability, shaped by factors such as task difficulty, individual preferences, and personal experiences. Understanding this variability across individuals is essential for uncovering the perceptual and decision-making mechanisms that humans rely on when faced with uncertainty and ambiguity. We propose a systematic Boundary Alignment Manipulation (BAM) framework for studying human perceptual variability through image generation. BAM combines perceptual boundary sampling in artificial neural networks (ANNs) with human behavioral experiments to systematically investigate this phenomenon. Our perceptual boundary sampling algorithm generates stimuli along ANN perceptual boundaries that intrinsically induce significant perceptual variability. The efficacy of these stimuli is empirically validated through large-scale behavioral experiments involving 246 participants across 116,715 trials, culminating in the variMNIST dataset of 19,943 systematically annotated images. Through personalized model alignment and adversarial generation, we establish a reliable method for simultaneously predicting and manipulating the divergent perceptual decisions of pairs of participants. This work bridges the gap between computational models and human individual-difference research, providing new tools for personalized perception analysis.

Perceptual Boundary Sampling Algorithm

Flowchart of our Perceptual Boundary Sampling Algorithm. Samples are drawn along an ANN’s decision boundary (high-uncertainty regions), then passed to human observers to measure perceptual ambiguity.
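To make the sampling idea concrete, here is a minimal PyTorch sketch that pushes an input image toward a classifier's decision boundary between two classes, keeping both targets likely while equalizing their logits. The model `net`, the specific loss terms, and all hyperparameters are illustrative assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def sample_boundary_image(net, x, o1, o2, steps=200, lr=0.05):
    """Nudge image `x` toward net's decision boundary between classes o1 and o2."""
    net.eval()
    x = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        logits = net(x)
        log_p = F.log_softmax(logits, dim=-1)
        # Keep both target classes likely...
        loss = -(log_p[:, o1] + log_p[:, o2]).mean() / 2
        # ...while pulling their logits together, i.e., maximizing uncertainty.
        loss = loss + (logits[:, o1] - logits[:, o2]).abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        x.data.clamp_(0.0, 1.0)  # keep pixels in a valid range
    return x.detach()
```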
Collecting Human Perceptual Variability

Guidance Outcomes. (a) Definition of the three guidance outcomes. (b) Guidance outcomes for handwritten digits and natural images.
To comprehensively evaluate the guiding effectiveness of the generation method, we define three types of guidance outcomes: success, bias, and failure. For guidance targets \(o_1\) and \(o_2\), let \(p_1\) and \(p_2\) denote the probabilities of participants choosing \(o_1\) and \(o_2\), respectively.
A result is considered success if \(p_1 + p_2 \geq 80\%\) and \(\min(p_1, p_2) \geq 10\%\), indicating the generated stimuli guide participants to make a balanced choice between the two targets. A result is labeled as bias if \(p_1 + p_2 \geq 80\%\) but \(\min(p_1, p_2) < 10\%\), indicating a strong bias toward one target. A result is classified as failure if \(p_1 + p_2 < 80\%\), meaning the stimuli fail to guide participants effectively. These definitions allow us to evaluate and compare the performance of different guidance strategies and classifiers.
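Translating these definitions directly, here is a minimal Python sketch of the classification rule (the function name and probability encoding are our own):

```python
def guidance_outcome(p1: float, p2: float) -> str:
    """Classify a stimulus from the probabilities of choosing targets o1 and o2."""
    if p1 + p2 < 0.80:
        return "failure"   # stimuli do not steer choices toward the targets
    if min(p1, p2) >= 0.10:
        return "success"   # balanced choice between the two targets
    return "bias"          # strong preference for one target

assert guidance_outcome(0.45, 0.40) == "success"
assert guidance_outcome(0.85, 0.05) == "bias"
assert guidance_outcome(0.50, 0.20) == "failure"
```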
Predicting Human Perceptual Variability

Human Alignment Results.
(a) Accuracy of BaseNet, GroupNet, and IndivNet on MNIST, variMNIST, and variMNIST-i. All models performed similarly on MNIST. On variMNIST, GroupNet and IndivNet improved accuracy by \( \sim 20\%\) over BaseNet, with IndivNet outperforming GroupNet by \( \sim 5\%\) on variMNIST-i. Accuracy improved for 241 participants and decreased for 5 after individual fine-tuning.
(b) Fine-tuning results for the five classifiers used in this work.
(c) For VGG, Spearman rank correlation between model and human entropy increased from \( \rho = 0.08\) to \( \rho = 0.74\) after group fine-tuning.
(d) Performance of BaseNet, GroupNet, and IndivNet across varying entropy levels. For the example images, the selected participant's choices are 8, 6, 9, and 6, in order of increasing entropy. Gray backgrounds indicate model–subject disagreement. GroupNet and IndivNet improved over BaseNet at all entropy levels, while IndivNet's gains over GroupNet were concentrated on high-entropy images.
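For panel (c), the entropy comparison can be computed roughly as below; the per-image data layout (model softmax outputs and human choice histograms) is an assumption, and scipy's spearmanr supplies \(\rho\).

```python
import numpy as np
from scipy.stats import spearmanr

def entropy(p, eps=1e-12):
    """Shannon entropy of a (possibly unnormalized) distribution."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    return -(p * np.log(p + eps)).sum()

def model_human_entropy_corr(model_probs, human_counts):
    """Spearman correlation between model and human response entropy per image."""
    h_model = [entropy(p) for p in model_probs]    # softmax outputs, one per image
    h_human = [entropy(c) for c in human_counts]   # choice histograms, one per image
    rho, pval = spearmanr(h_model, h_human)
    return rho, pval
```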
To align the base models pretrained on MNIST with human behavior at both the group and individual levels, we adopted a two-stage fine-tuning approach: the group model (GroupNet) was fine-tuned from the base model (BaseNet), and the individual model (IndivNet) was fine-tuned from the group model. For the individual-level datasets (variMNIST-i), which are subsets of variMNIST corresponding to specific individuals, the validation sets were designed to avoid overlap with the group validation set. More details can be found in the paper.
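A minimal sketch of this two-stage procedure, assuming PyTorch DataLoaders `group_loader` (variMNIST, labeled with pooled human choices) and `indiv_loader` (one participant's variMNIST-i subset); the hyperparameters are placeholders, not the paper's settings.

```python
import copy
import torch
import torch.nn.functional as F

def finetune(model, loader, epochs=5, lr=1e-4):
    """Fine-tune a copy of `model` on human choices, leaving the parent intact."""
    model = copy.deepcopy(model)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in loader:  # y: human (group or individual) choices
            loss = F.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

# Stage 1: BaseNet (MNIST-pretrained) -> GroupNet on pooled human choices.
# group_net = finetune(base_net, group_loader)
# Stage 2: GroupNet -> IndivNet on a single participant's data.
# indiv_net = finetune(group_net, indiv_loader)
```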
Manipulating Human Perceptual Variability
Building on variMNIST and alignment experiments, we designed a paradigm to test whether individually fine-tuned models can amplify perceptual differences and guide decision-making. This experiment evaluates the ability of targeted stimuli to reveal individual variability and achieve precise manipulation of perceptual outcomes, highlighting the potential of personalized modeling in understanding human perception.
First Round
For the first round of experiments, we selected around 500 balanced samples from the variMNIST dataset as stimuli. After collecting behavioral data from pairs of participants, we fine-tuned their individual models. Controversial stimuli were then generated using the updated models, aiming to elicit distinct choices from the two participants, with each choosing their respective guidance target.
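A minimal sketch of controversial-stimulus generation between two individually fine-tuned models follows; the joint objective is an illustrative assumption, chosen so that each model is driven toward a different target.

```python
import torch
import torch.nn.functional as F

def generate_controversial(net1, net2, x, o1, o2, steps=200, lr=0.05):
    """Optimize `x` so participant 1's model favors o1
    while participant 2's model favors o2."""
    net1.eval()
    net2.eval()
    x = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        lp1 = F.log_softmax(net1(x), dim=-1)
        lp2 = F.log_softmax(net2(x), dim=-1)
        loss = -(lp1[:, o1] + lp2[:, o2]).mean()  # drive the two models apart
        opt.zero_grad()
        loss.backward()
        opt.step()
        x.data.clamp_(0.0, 1.0)
    return x.detach()
```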
Second Round
In the second round of experiments, these controversial stimuli were presented to participants in pairs, with each pair completing trials designed to test whether the fine-tuned models could effectively guide their decisions in opposite directions. The goal was to evaluate whether the generated stimuli amplified perceptual differences and aligned participants’ responses with their respective guidance targets. Experiment details can be found in the paper.

Manipulation Analysis.
(a) Manipulation experiment procedure.
(b) The middle two bars show the guidance outcomes for variMNIST and the individually customized dataset, with the customized dataset achieving a higher success rate. Compared to variMNIST, stimuli generated by IndivNets also improve the directionality of perceptual guidance.
(c) The left panel shows the guidance success rates for the first-round stimuli and the second-round stimuli generated by the fine-tuned models, with an improvement of \(\sim 3\%\) (\(p < 0.001\)). The right panel shows the targeted ratios (i.e., the proportion of participant choices aligned with the guidance direction) for these two groups of stimuli, with an increase of \(\sim 12\%\) (\(p < 0.001\)).
To analyze the effects of individual manipulation, we employed two key metrics. The first metric, referred to as the guidance outcome, was adapted from the Predicting Human Perceptual Variability section. It categorizes outcomes for two participants, \(s_1\) and \(s_2\), with respective guidance targets \(o_1\) and \(o_2\), and choices \(c_1\) and \(c_2\). A result is labeled as success if both participants' choices fall within their respective guidance targets and are distinct, i.e., \(c_1, c_2 \in \{o_1, o_2\}\) and \(c_1 \neq c_2\). If both choices are biased toward the same target, i.e., \(c_1 = c_2 = o_1\) or \(c_1 = c_2 = o_2\), the result is categorized as bias. Finally, if at least one choice falls outside the targets, i.e., \(c_1 \notin \{o_1, o_2\}\) or \(c_2 \notin \{o_1, o_2\}\), the outcome is labeled as failure.
The second metric, called the targeted ratio, quantifies the directionality of successful guidance. Within successful trials, participant choices are classified as either positive, where \(c_1 = o_1\) and \(c_2 = o_2\), meaning both choices align with their respective targets, or negative, where \(c_1 = o_2\) and \(c_2 = o_1\), indicating swapped choices. The targeted ratio is defined as the proportion of positive trials among all success trials, providing a measure of the effectiveness of directional guidance.
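Both metrics translate directly into code; a small sketch, assuming trials are stored as (c1, c2, o1, o2) tuples (the data layout is our own):

```python
def manipulation_metrics(trials):
    """Compute guidance-outcome counts and the targeted ratio for paired trials."""
    success = bias = failure = positive = 0
    for c1, c2, o1, o2 in trials:
        if c1 in (o1, o2) and c2 in (o1, o2):
            if c1 != c2:
                success += 1
                if c1 == o1 and c2 == o2:
                    positive += 1   # both choices align with their own targets
            else:
                bias += 1           # both choices pulled to the same target
        else:
            failure += 1            # at least one choice off-target
    targeted_ratio = positive / success if success else float("nan")
    return {"success": success, "bias": bias,
            "failure": failure, "targeted_ratio": targeted_ratio}
```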
Code Repository
View the full source code on GitHub: https://github.com/ncclab-sustech/HumanPerceptualVariability
View the full paper: https://arxiv.org/abs/2505.03641