Perceptual representations mediate effects of stimulus properties on liking for music
Abstract
Perceptual pleasure and its concomitant hedonic value play an essential role in everyday life, motivating behavior and thus influencing how individuals choose to spend their time and resources. However, how pleasure arises from perception of sensory information remains relatively poorly understood. In particular, research has neglected the question of how perceptual representations mediate the relationships between stimulus properties and liking (e.g., stimulus symmetry can only affect liking if it is perceived). The present research addresses this gap for the first time, analyzing perceptual and liking ratings of 96 nonmusicians (power of 0.99) and finding that perceptual representations mediate effects of feature-based and information-based stimulus properties on liking for a novel set of melodies varying in balance, contour, symmetry, or complexity. Moreover, variability due to individual differences and stimuli accounts for most of the variance in liking. These results have broad implications for psychological research on sensory valuation, advocating a more explicit account of random variability and the mediating role of perceptual representations of stimulus properties.
INTRODUCTION
Perception and appreciation of sensory stimuli are fundamental aspects of cognition, crucial for survival.1 Cognitive systems of humans and other organisms make sense of the world by establishing categories, regularities, and relationships to organize their cognitive representations of objects, situations, and events encountered or predicted.2 They motivate their behavior by assigning hedonic value (propitious or pernicious, desirable or distasteful, liked or disliked) to such representations considering the system's current state, aims, and expectations.3, 4 Value assignment is a fundamental neurobiological process as it allows comparing, choosing, and prioritizing actions.5, 6 It is variously referred to as sensory valuation, hedonic evaluation, evaluative judgment, or appreciation, of which liking and disliking constitute a prominent instance and signal the pleasure of perceiving sensory information.7
Psychological research has typically focused on direct relations between liking and either stimulus properties (e.g., Refs. 8 and 9) or perceptual representations assessed via subjective ratings (e.g., Refs. 10 and 11), tacitly assuming a transparent process of perception such that perceptual representations are equivalent to the corresponding stimulus properties. The relationship between perceived affect and hedonic value is well established,12 with emotion and meaning mediating the impact of stimulus properties on pleasure.13 However, a mediating role of perceptual representations of stimulus properties such as balance, contour, symmetry, or complexity has yet to be directly examined, despite their demonstrated influence on liking in various domains.14-16
Likewise, neuroscientific research showing that signals computed in the reward system assess the hedonic value of stimulus information relayed from sensory cortices7 has focused on activity in the reward system17 or connectivity between perceptual and reward systems.18 Therefore, it does not shed light on a mediating psychological role of perceptual representations in the pleasure and value judgment evoked by sensory objects. In summary, a thorough account of the relationships between stimulus properties, their perceptual representation, and their appreciation has yet to be established.
Liking for music
Humans assign elementary hedonic value to biologically relevant objects like food or faces and to abstract and cultural objects like money and music,19 which concurs with the common currency hypothesis for a single neural basis for pleasure arising from different sensory and cognitive states.20 In this context, music provides a rich domain: First, it is considered a pervasive cultural artifact21 whose perception and appreciation rest upon general cognitive mechanisms,22 as creating, making, and appreciating music involves practically every cognitive function.23 Second, musical systems allow stimulus properties to be combined into a virtually unlimited range of compositions across styles and cultures.24 Third, it fulfills a broad range of individual and social functions,25 such as emotion regulation26 and social bonding.27 Fourth, humans invest high personal value in music28 with respect to time, effort, and economic resources.29
Predictive processing is a fundamental cognitive mechanism driving music perception and appreciation.30, 31 Listeners make predictions based on veridical and schematic expectations,32 learning statistical properties implicitly through repeated exposure33 or explicitly through training.34 Whereas schematic expectations rest upon knowledge inferred about syntactic or stylistic regularities characterizing a large body of previously encountered stimuli,35 veridical expectations rest upon precise memory for specific previously encountered stimuli.32
Music activates the reward system by evoking sufficiently uncertain schematic expectations to build anticipation and presenting sufficiently surprising events to foster learning and reward.9, 36 It has been hypothesized that learning makes stimuli perceptually less complex (i.e., less surprising) and, hence, more pleasurable.37 This poses two critical questions: To what extent is perceived complexity related to perceived unpredictability? Do perceived complexity and unpredictability mediate the relationship between stimulus complexity and liking?
Stimulus complexity can be operationalized in terms of feature-based (e.g., event density) and information-based stimulus properties, such as information content (IC), a well-established characterization of stimulus unpredictability, reflecting the negative log probability of a stimulus event.31 Complexity perception and appreciation have been explained as a function of feature-based models like MUST38—which captures the imbalance, jaggedness, asymmetry, and complexity of melodies—and information-based models like IDyOM31—which relies on statistical learning and probabilistic prediction. Both feature-based and information-based measures of stimulus complexity account for shared and unique variability in perceived complexity and liking for short Western tonal melodies38 comparable to those in the present study.
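For reference, the definition of IC given above can be written out as follows. The symbol e_i (the i-th event in a melody) and the base-2 logarithm, which expresses IC in bits, are notational assumptions rather than details stated in the text.

```latex
\mathrm{IC}(e_i) = -\log_2 p\left(e_i \mid e_1, \ldots, e_{i-1}\right)
```

Under this reading, an event with conditional probability 0.5 carries 1 bit and an event with probability 0.125 carries 3 bits, so less probable events carry more information.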
Music-elicited pleasure is directly influenced by schematic expectation9, 36 but is also strongly modulated by veridical familiarity,39 that is, veridical knowledge derived through previous experience with particular pieces and a musical culture.40−42 Controlling for familiarity is, therefore, crucial for understanding the role of perceptual representations in the relationship between stimulus properties and liking for music. This has proven to be a helpful strategy in previous research, such as in Lahdelma and Eerola's study.43
Formal properties such as balance, contour, symmetry, and complexity have been consistently shown to affect liking for melodies8, 16, 44 and visual designs.45−47 Their consistent effects at a group level8, 12, 16 make them particularly suitable to the purposes of this study, that is, to inspect the role of their perceptual representations in these relationships. Beyond consistent general trends (for example, preference for balanced, jagged, asymmetric, and complex melodies8, 12, 16), systematic assessments show that between 50% and 92% of the variance in judgments of liking stems from differences between and within individuals.48−51 Thus, the question remains whether individuals prefer, for example, stimuli they perceive as being smoother regardless of how smooth the stimulus actually is or whether hedonic insensitivity to symmetry stems from an inability to recognize symmetry.
The present study
This study aims to further the understanding of psychological mechanisms underlying the appreciation of sensory stimuli. Specifically, it investigates the structure of relations between stimulus properties, their perceptual representations, and liking, testing the hypothesis that perceptual representations mediate the impact of stimulus properties upon liking for melodies. We consider feature-based (balance, contour, symmetry, and complexity) and information-based properties (IC) known to affect perceptual and hedonic evaluations of music8, 31 and visual images.16, 50 The choice of properties enables the approach to be extended with comparable visual stimuli in future research.
Clemente and colleagues38 characterized balance (balanced/imbalanced) as involving an equilibrated/skewed temporal distribution of events, contour (smooth/jagged) as containing small/large interval size and few/many direction changes, symmetry (symmetric/asymmetric) as possessing a mirrored/nonmirrored structure about a vertical axis, and complexity (simple/complex) as made of small/large number and variety of events. We hypothesized that these properties would affect perceived unpredictability because they reflect lower-level stimulus-specific sensory information, whereas unpredictability reflects a higher-level general mechanism of predictive processing.52, 53 Moreover, IC was modeled by combining pitch- and onset-related viewpoints and accounted for event density, thereby encompassing defining aspects of the feature-based properties (see Method). In particular, IC would increase for more imbalanced (irregularly distributed and, thus, rhythmically unpredictable notes), jagged (larger intervals in changing directions), asymmetric (less redundant), and complex melodies (more and more varied notes). Consequently, we hypothesized a structure of relations according to which: (1) the feature-based property varied in each subset (balance, contour, symmetry, or complexity) would influence perceptual representations of those feature-based properties; (2) these perceptual representations and IC would affect perceived unpredictability; and (3) perceptual representations of information- and feature-based properties would impact liking.
MATERIALS AND METHODS
Participants
Ninety-six participants (age: M = 35.18, SD = 11.95; range = [18, 59]; 46 women, 48 men) from the general population took part in the experiment. They were nonmusicians acculturated in the Western musical tradition to enable the generalizability of results within a particular culture. The sample size was based on comparable research in empirical aesthetics, in which 48 participants assessed short melodies in a lab-based experiment.8, 16 Following general recommendations for online studies,54 we doubled the sample size in those studies. According to Judd et al.’s power calculator (https://jakewestfall.shinyapps.io/two_factor_power/),55 the power was 0.99 for a hypothesized moderate effect size of 0.5, given the lack of previous experimental data using similar settings and analytical techniques.
All participants reported having normal or corrected-to-normal vision and hearing and no cognitive impairments. They were native English speakers, unaware of the study's purpose, recruited through Prolific (https://www.prolific.co/) with a minimum approval rate of 80%, and compensated for participation following Prolific recommendations. Informed consent was obtained from all individual participants prior to participation. Data were collected between March and July 2022. An age check revealed two participants with missing data, who were consequently excluded from the analyses. Ethical approval was granted by the Queen Mary University of London Ethical Committee (QMERC20.410).
Materials
We curated a novel set of 96 musical excerpts for research on music perception and cognition. The Naturalistic MUsical STimulus (NatMUST) set consists of four subsets of 24 melodic excerpts from the Western classical repertoire spanning the 14th−20th centuries. The stimulus duration varies between 2 and 10 s; the maximum duration falls within the short-term auditory memory span of nonmusicians,56, 57 and perception of musical symmetry has also been demonstrated for stimuli in this range of durations.58, 59 Using stimuli from the existing canon enhances the ecological validity of our approach as compared with previous studies using stimuli created specifically for a particular experiment8, 12, 16 and enhances the generalizability of results beyond a particular style within a culture. However, this also obliges us to restrict the analysis to unfamiliar melodies to minimize the effects of veridical knowledge resulting from specific episodic memories of the stimuli.
From each selected piece or movement, we curated two fragments representative of each pole of a particular property of interest. This minimized differences in style, mode, tempo, and other factors susceptible to confounding the results. The stimuli were transcribed and synthesized using grand piano sampling with MuseScore. The stimuli were rendered in XML, MIDI, CSV, and MP3 to facilitate their use by other researchers. Further details of the stimuli can be found at https://osf.io/k6gme/ and in the Supplementary Materials. Each subset varies systematically in a single stimulus property (balance, contour, symmetry, or complexity) and comprises 12 stimuli aligned toward each pole of a bimodal dimension: balanced−imbalanced, smooth−jagged, symmetric−asymmetric, and simple−complex (Figure 1). Each of these dimensions was quantified by a feature-based composite computational measure (capturing differences between and within poles) taken from the MUST toolbox, v1.1 (Clemente and colleagues,38 revised for the present research and available at https://osf.io/bfxz7/): BC for balance, CC for contour, SC for symmetry, and KC for complexity. We also computed an information-theoretic measure of the unpredictability of each stimulus: the log-scaled total IC summed for all events in each stimulus, derived from a long-term IDyOM model (LTM) trained on Western music corpora,31 which estimates the conditional probability of each event in each stimulus given the preceding sequence. We used LTM log-scaled total IC to account for event density and the Weber−Fechner law.a
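The aggregation described above can be illustrated with a minimal sketch. It assumes that "log-scaled total IC" means the natural logarithm of the summed per-event IC and that IC is measured in bits; both are assumptions, as are the function names and the example probabilities, none of which come from IDyOM or from the NatMUST data.

```python
import math

def event_ic(probability):
    """Information content (in bits) of one event: the negative log2 of its probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(probability)

def log_scaled_total_ic(event_probabilities):
    """Sum the IC over all events, then log-scale the total (assumed reading)."""
    total = sum(event_ic(p) for p in event_probabilities)
    return math.log(total)

# Hypothetical conditional probabilities for a four-note melody.
example_probabilities = [0.5, 0.25, 0.125, 0.125]
total_bits = sum(event_ic(p) for p in example_probabilities)  # 1 + 2 + 3 + 3 bits
scaled = log_scaled_total_ic(example_probabilities)
```

Because the total is a sum over events, it grows with the number of notes, which is one way of reading the statement that the measure accounts for event density.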

We validated the stimuli in the NatMUST in two ways, which are reported and discussed in full in the Supplementary Materials; the following is a brief summary. First, a behavioral assessment examined the extent to which the experimental design—that is, the grouping of stimuli by stimulus property (balance, contour, symmetry, complexity) and pole—matched perceptual ratings of balance, contour, symmetry, and complexity given by musically untrained participants. Second, a computational assessment examined the extent to which the experimental design matched quantitative measures of balance, contour, symmetry, and complexity from the MUST toolbox.38 The results show that perceptual ratings consistently reflected the intended variation within each subset, which was significantly associated with computational measures of feature-based balance, contour, symmetry, and complexity. This demonstrates that the NatMUST stimuli have acceptable construct validity in systematically manipulating nonmusicians’ perception of musical balance, contour, symmetry, and complexity.
Procedure
We created and hosted the experiment using the Gorilla Experiment Builder (https://www.gorilla.sc).60 After providing informed consent, prospective participants performed a browser soundcheck, a built-in volume calibration, and a headphone screening.61 Individuals failing this test were automatically excluded from the final sample referred to above, as they were not allowed to proceed with the experiment.
All stimuli were presented in the MP3 format (128 kbps). The paradigm comprised two blocks, thus presenting the stimuli twice: In the first block, the participants rated their liking for and familiarity with each stimulus. In the second block, they rated its perceived balance, contour, symmetry or complexity, and unpredictability. Each block consisted of four sub-blocks, each corresponding to a NatMUST subset. Sub-block and stimulus order were individually randomized for each participant. We presented the liking block before the perceptual block to prevent contamination of liking ratings by perceptual ratings.b
- Liking: I dislike it very much (−2), I dislike it (−1), I neither like nor dislike it (0), I like it (1), I like it very much (2);
- Perceptual representation: Very balanced/smooth/symmetric/simple (−2), Rather balanced/smooth/symmetric/simple (−1), Undefined (0), Rather imbalanced/jagged/asymmetric/complex (1), Very imbalanced/jagged/asymmetric/complex (2);
- Unpredictability: Very predictable (−2), Rather predictable (−1), Neutral (0), Rather unpredictable (1), Very unpredictable (2);
- Familiarity: Totally unknown (−2), Somehow familiar (0), Very well known (2).
The rating scales served as response cues immediately after the stimulus presentation. We opted for 5-point Likert scales because they provide straightforward interpretability by participants, thus facilitating consistency between ratings. Ratings were self-paced. After single responses on all scales on the screen had been provided, participants could submit and proceed to the next stimulus, block, sub-block, or questionnaire. Following the main blocks, participants completed the Gold-MSI training scale62 and a short demographic questionnaire (age, gender, education, musical education, and musicianship) to characterize the sample. Short breaks were allowed, but participants had to complete the experiment in one session. The experimental session lasted about 50 min.
Data analysis
The NatMUST was designed to avoid generally well-known melodies, but it was impossible to ensure a priori that all stimuli were unfamiliar to all participants. Therefore, to preclude any effects of veridical knowledge (i.e., specific knowledge of a stimulus arising from prior listening), participants reported familiarity with each stimulus, and the analysis was restricted to unfamiliar stimuli, retaining 75% of all data points.c
To investigate the structure of relations between stimulus properties, perceptual representations, and liking, we applied structural equation modeling (SEM) to ratings of the stimuli in each subset. SEM estimates a network's hierarchical associations between endogenous (dependent, response) and exogenous (independent, predictor) variables. Global estimation attempts to capture relationships between the variables in the model through a variance-covariance matrix. This approach assumes multivariate normal data that are sufficiently replicated to generate unbiased parameter estimates. In contrast, local estimation or piecewise SEM probes the relationships for each endogenous variable separately by fitting a linear model for each response, stringing together the inferences and evaluating them. Piecewise SEM (or confirmatory path analysis)63 expands upon traditional SEM by introducing a flexible mathematical framework that accommodates a variety of model structures, distributions, and assumptions, including interactions and non-Gaussian responses, random effects, and hierarchical models as well as alternate correlation structures. It is, therefore, the most appropriate approach to our data because none of the exogenous variables was normally distributed (ps < 0.05) and we factored in random effects.
Sufficient power is critical for robust unbiased inferences, especially for SEM, as it evaluates multiple hypotheses simultaneously and thus requires more data than other approaches. The system of equations must be overidentified to allow the extra information (degrees of freedom) to provide additional insight—that is, to test the model fit. Analogous to χ2 for global estimation, Fisher's C assesses the global goodness of fit; that is, whether the modeled relationships between variables deviate substantially from the relationships in the data. If not, the model is assumed to fit appropriately and can be used for inference. The model-wide p-value reflects whether the data support the hypothesized structure: If p > 0.05, the hypothesized structure is supported, meaning no potentially significant paths are missing. Conversely, a substantial deviation from the observed correlations (p < 0.05) suggests missing information that could make the estimates more aligned with the observations. The tests of directed separation explicitly identify and test whether each piece of missing information (each missing path) could change the overall model's interpretation. Two variables are d-separated if they are statistically independent, conditional on their joint influences. In summary, if a considerable proportion of the variance is explained in all endogenous variables and there are significant path coefficients, it follows that the residual error is low, so it is safe to assume that no other variables could clarify the model structure further. Consequently, each SEM is adjusted considering three parameters: degree to which the model is unsaturated (df > 0), global goodness of fit (sufficiently low C: p > 0.05), and no missing paths (tests of directed separation: p > 0.05).
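The global fit test described above can be sketched in a few lines. In piecewise SEM, Fisher's C is conventionally computed as −2 times the sum of the natural logs of the p-values from the k independence claims and compared with a chi-square distribution on 2k degrees of freedom. The functions below are illustrative helpers, not part of the piecewiseSEM package, and the p-values are hypothetical.

```python
import math

def fisher_c(independence_p_values):
    """Fisher's C: -2 times the sum of the natural logs of the d-separation p-values."""
    return -2.0 * sum(math.log(p) for p in independence_p_values)

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, using the closed form valid for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j      # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total

# Hypothetical p-values from five independence claims (k = 5, so df = 2k = 10).
claims = [0.62, 0.08, 0.45, 0.31, 0.27]
c_stat = fisher_c(claims)
model_p = chi2_sf_even_df(c_stat, 2 * len(claims))
structure_supported = model_p > 0.05  # no evidence of missing paths
```

As a sanity check, passing the reported Balance values (C = 15.94, df = 10) to `chi2_sf_even_df` gives approximately 0.10, in line with the model-wide p-value in Table 1.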
- percept ∼ MUST measure
- unpredictability ∼ IC + percept
- liking ∼ MUST measure + IC + percept + unpredictability e

- Each MUST measure would predict the corresponding perceptual representation (denoted as percept in the SEM structure) of the feature-based property varied in each subset (balance, contour, symmetry, or complexity),
- IC and those perceptual representations would predict unpredictability, and
- Perceptual representations of feature-based (perceived imbalance, jaggedness, asymmetry, or complexity) and information-based (perceived unpredictability) properties would impact liking.

Including the measures of stimulus properties (MUST measure and IC) allowed us to test for their direct, indirect, or null effects, entailing either no, partial, or total mediation of the perceptual representations, respectively.
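The mediation logic can be made concrete with a small sketch. It follows the conventional reading in which a stimulus property with only a direct path to liking is unmediated, one with both direct and indirect paths is partially mediated, and one with only an indirect path is fully mediated. The function name and labels are hypothetical and are not taken from the analysis code.

```python
def classify_mediation(direct_significant, indirect_significant):
    """Label the mediation pattern for one stimulus property.

    direct_significant: the stimulus property predicts liking directly.
    indirect_significant: every link of the chain through the perceptual
    representation(s) to liking is significant.
    """
    if direct_significant and indirect_significant:
        return "partial mediation"
    if indirect_significant:
        return "full mediation"
    if direct_significant:
        return "no mediation"
    return "no effect"

# Illustrative cases, one per possible outcome.
examples = {
    (True, True): classify_mediation(True, True),
    (False, True): classify_mediation(False, True),
    (True, False): classify_mediation(True, False),
    (False, False): classify_mediation(False, False),
}
```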
Following Barr and colleagues’64 suggestion, we applied linear mixed-effects analyses to model the maximal random-effects structure justified by the experimental design to prevent power loss, reduce type-I error, and enable the generalizability of results to other participants and stimuli. Thus, the models included the stimulus properties as fixed effects, and intercepts and slopes per participant and intercepts per stimulus as random effects to account for the variability within and between participants and stimuli. The difference between conditional and marginal coefficients of determination quantifies the relevance of such variability. We performed a stepwise model reduction through likelihood-ratio tests. For statistically significant differences (p < 0.05), a lower Akaike information criterion (AIC) indicates a better fit of one model over another. For conciseness, we report the best-fitting models of data in each subset. Additionally, we tested whether removing random effects significantly worsened the model fit.
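One step of the model comparison described above might look like the following sketch. It covers only the simplest case, two nested models that differ by a single parameter, and uses hypothetical log-likelihoods and parameter counts; the helper names are illustrative and do not correspond to the R functions used in the analysis.

```python
import math

def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L); lower values indicate a better fit."""
    return 2 * n_parameters - 2 * log_likelihood

def likelihood_ratio_statistic(loglik_full, loglik_reduced):
    """Twice the log-likelihood gain of the full model over the nested reduced model."""
    return 2 * (loglik_full - loglik_reduced)

def chi2_sf_df1(x):
    """Upper-tail chi-square probability for one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical values: the full model has one more parameter than the reduced one.
loglik_full, k_full = -6170.0, 8
loglik_reduced, k_reduced = -6176.5, 7

lr = likelihood_ratio_statistic(loglik_full, loglik_reduced)
p_value = chi2_sf_df1(lr)
# Retain the fuller model only if the test is significant and its AIC is lower.
keep_full = p_value < 0.05 and aic(loglik_full, k_full) < aic(loglik_reduced, k_reduced)
```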
All analyses were performed within the R environment for statistical computing, R version 4.2.3.65 We implemented the SEM analysis in R using the psem and plot functions in the “piecewiseSEM” package, version 2.3.0.66 The psem output includes unstandardized and standardized estimates for each predictor (allowing comparisons within and between models), statistical significance and coefficients of determination (r2) for the fixed effects only (marginal), and fixed plus random effects (conditional) regarding each response variable. We interpret r2 according to Chin.67 For the internal mixed-effects models, we used the glmmTMB function in the “glmmTMB” package68 fitted using maximum likelihood estimation via the TMB (Template Model Builder) algorithm because of its flexible architecture. In all models, the MUST composite measures were centered (subtracting the variable means) and scaled (dividing by the standard deviations) using the scale function in the “base” R package.
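The centering and scaling step can be mirrored outside R with a short sketch. R's `scale` function divides by the sample standard deviation (n − 1 denominator), which `statistics.stdev` also uses. The function name and the input values below are hypothetical.

```python
import statistics

def center_and_scale(values):
    """Subtract the mean and divide by the sample standard deviation (n - 1)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, matching R's scale() default
    if sd == 0:
        raise ValueError("cannot scale a constant variable")
    return [(v - mean) / sd for v in values]

# Hypothetical composite measure values for four stimuli.
raw_measure = [0.2, 0.4, 0.6, 0.8]
standardized = center_and_scale(raw_measure)
```

After this transformation the predictor has mean 0 and standard deviation 1, so a coefficient describes the change in the response per standard deviation of the stimulus measure.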
RESULTS
The results of the best-fitting SEM for each subset are shown in Figure 2 and Table 1. The analyses reveal consistencies in the structure of relationships across subsets: (1) the stimulus properties impacted their perceptual representations, (2) which influenced perceived unpredictability, (3) which in turn affected liking. Nevertheless, the effects of stimulus properties varied between subsets: IC only affected perceived imbalance and complexity, and stimulus complexity (KC) was the only stimulus property directly influencing liking. That is, perceptual representations fully mediated the effects of IC, balance, contour, and symmetry and partially mediated the effects of stimulus complexity. In addition, perceived unpredictability fully mediated the influence of perceived jaggedness, asymmetry, and complexity, whereas perceived imbalance mediated the effects of stimulus imbalance (BC) and IC on liking.

| Subset | C | p (model) | df | Response | r2m | r2c | Predictor | b | se | cv | p | ß |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Balance | 15.94 | 0.10 | 10 | Imbalance | 0.11 | 0.38 | IC | 0.56 | 0.27 | 2.04 | 0.04 | 0.20 |
| | | | | | | | BC | 0.34 | 0.13 | 2.59 | 0.01 | 0.26 |
| | | | | Unpredictability | 0.31 | 0.41 | Imbalance | 0.47 | 0.02 | 22.78 | < 0.01 | 0.54 |
| | | | | Liking | 0.01 | 0.30 | Imbalance | −0.07 | 0.02 | −4.02 | < 0.01 | −0.10 |
| Contour | 14.46 | 0.15 | 10 | Jaggedness | 0.12 | 0.35 | CC | 0.42 | 0.09 | 4.63 | < 0.01 | 0.34 |
| | | | | Unpredictability | 0.12 | 0.36 | IC | −0.05 | 0.11 | −0.43 | 0.67 | −0.03 |
| | | | | | | | Jaggedness | 0.33 | 0.02 | 13.64 | < 0.01 | 0.34 |
| | | | | Liking | 0.01 | 0.39 | Unpredictability | −0.08 | 0.02 | −3.98 | < 0.01 | −0.10 |
| Symmetry | 11.13 | 0.35 | 10 | Asymmetry | 0.08 | 0.26 | SC | 0.36 | 0.07 | 4.88 | < 0.01 | 0.28 |
| | | | | Unpredictability | 0.24 | 0.41 | IC | 0.06 | 0.12 | 0.55 | 0.58 | 0.03 |
| | | | | | | | Asymmetry | 0.45 | 0.02 | 22.73 | < 0.01 | 0.48 |
| | | | | Liking | 0.01 | 0.41 | Unpredictability | −0.06 | 0.02 | −3.26 | 0.01 | −0.07 |
| Complexity | 5.40 | 0.72 | 8 | Complexity | 0.53 | 0.70 | IC | 0.43 | 0.18 | 2.36 | 0.02 | 0.23 |
| | | | | | | | KC | 0.70 | 0.13 | 5.47 | < 0.01 | 0.54 |
| | | | | Unpredictability | 0.25 | 0.43 | Complexity | 0.46 | 0.03 | 15.05 | < 0.01 | 0.50 |
| | | | | Liking | 0.02 | 0.36 | KC | 0.12 | 0.06 | 2.08 | 0.04 | 0.12 |
| | | | | | | | Unpredictability | −0.10 | 0.02 | −4.12 | < 0.01 | −0.12 |
- Note: Structural equation models (SEMs) of liking for musical imbalance (BC), jaggedness (CC), asymmetry (SC), complexity (KC), IDyOM LTM log-scaled total information content (IC), and perceived imbalance, jaggedness, asymmetry, complexity, and unpredictability. For each model per subset, C refers to Fisher's C, p to p-value, and df to degrees of freedom. Marginal (r2m) and conditional (r2c) coefficients of determination are reported for each response variable. For each predictor, b stands for raw estimate, ß for standardized estimate, se for standard error, cv for critical value, and p for p-value. Statistically significant effects (p ≤ 0.05) are highlighted in bold.
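The gap between conditional and marginal coefficients of determination can be read directly off Table 1. The following sketch does this arithmetic for the liking models; the values are copied from the table, while the function and variable names are hypothetical.

```python
# (marginal R2, conditional R2) for the liking model in each subset, from Table 1.
liking_r2 = {
    "Balance": (0.01, 0.30),
    "Contour": (0.01, 0.39),
    "Symmetry": (0.01, 0.41),
    "Complexity": (0.02, 0.36),
}

def random_effects_gap(r2_marginal, r2_conditional):
    """Variance share added by the random effects beyond the fixed effects."""
    return r2_conditional - r2_marginal

gaps = {
    subset: round(random_effects_gap(r2m, r2c), 2)
    for subset, (r2m, r2c) in liking_r2.items()
}
```

In every subset the gap is far larger than the marginal value, which is the pattern behind the statement that random variability accounts for most of the explained variance in liking.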
Across models, the random effects explain a substantial proportion of the variance, most of it in the models of perceived unpredictability and liking. Likelihood-ratio tests show statistical superiority (all ps < 0.01) of the full models over models excluding any random effects (Table 2). Moreover, removing the random effects per participant in the Contour subset or per stimulus across subsets makes the SEMs unacceptable (p < 0.05) as they involve missing paths.
| Subset | Model | - RE \| participant | - RE \| stimulus |
|---|---|---|---|
| Balance | 12,366.32 | 12,715.12 | 12,899.61 |
| Contour | 12,660.20 | 13,164.95 | 13,293.05 |
| Symmetry | 13,584.18 | 14,163.98 | 14,109.86 |
| Complexity | 9,786.82 | 10,287.14 | 10,184.54 |
- Note: Akaike information criterion (AIC) for each model configuration in each subset. Removing the random effects per stimulus (- RE | stimulus) or participant (- RE | participant) significantly worsens the model fit across subsets and random effects: all ps < 0.05.
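To make the size of these differences easier to see, the sketch below computes the AIC increase caused by dropping each set of random effects. The AIC values are copied from Table 2; the variable names are hypothetical.

```python
# AIC per subset: (full model, without participant REs, without stimulus REs), from Table 2.
aic_table = {
    "Balance": (12366.32, 12715.12, 12899.61),
    "Contour": (12660.20, 13164.95, 13293.05),
    "Symmetry": (13584.18, 14163.98, 14109.86),
    "Complexity": (9786.82, 10287.14, 10184.54),
}

# Increase in AIC (worse fit) when each random-effects term is removed.
delta_aic = {
    subset: {
        "participant": round(no_participant - full, 2),
        "stimulus": round(no_stimulus - full, 2),
    }
    for subset, (full, no_participant, no_stimulus) in aic_table.items()
}

smallest_increase = min(d for deltas in delta_aic.values() for d in deltas.values())
```

Every reduced model has a higher AIC than the corresponding full model, with the smallest increase occurring when participant random effects are removed in the Balance subset.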
DISCUSSION
We sought to understand the role of perceptual representations in mediating the effects of stimulus properties (balance, contour, symmetry, and complexity) on liking for Western melodies carefully selected to vary systematically in those properties. SEM revealed a consistent relational network in which stimulus properties influenced the perceptual representation of those properties, which in turn influenced perceived predictability, which in turn predicted liking, consistent with our hypotheses. In other words, stimulus symmetry, contour, and complexity influenced perception of symmetry, contour, and complexity, respectively, which in turn influenced perceived unpredictability, which in turn influenced liking. However, there were variations in this network of influences for balance, in which perceived balance (rather than predictability) played the mediating role, and for complexity, in which stimulus complexity also directly impacted liking (rendering the mediating effect of perceived complexity and predictability only partial). The direction and magnitude of the mediation effects were consistent across subsets, suggesting that the characterization of perceived unpredictability and its relation to liking were relatively robust to the feature-based manipulations.
A notable contribution of this study is to demonstrate the importance of perceived unpredictability in liking for melodies from the repertoire. Perceived unpredictability drove liking for melodies across variation in contour, symmetry, or complexity, fully mediating the influence of perceptual representations of stimulus contour and symmetry and partially mediating the influence of stimulus complexity. Huron32 proposed that the effect of familiarity on liking, as evidenced in the mere exposure effect69 and in studies with musical stimuli (see Ref. 70 for a review), actually reflects an effect of predictability, since with repeated exposure stimuli become more predictable. Furthermore, quantitative information-theoretic conceptions of stimulus complexity have been shown to predict both perceived complexity38, 71 and liking for music9, 36, 38 as well as nonmusical auditory stimuli.72−74 However, the present results suggest a higher-level characterization of perceived unpredictability in which stimuli that are perceived as being more unbalanced, asymmetric, jagged as well as more complex are perceived as being more unpredictable.
IC did have an indirect effect on perceived predictability and liking through its influence on the perception and appreciation of balance and complexity. Regarding the former, IC and stimulus balance (BC) influenced perceived balance, which fully mediated the effect of these stimulus properties on liking. In other words, perceived balance was influenced by both the measure of stimulus balance (BC) that was used to construct the stimuli (essentially how unevenly distributed the events are toward the beginning or end of the stimulus) and the information-theoretic unpredictability of the timing and pitch of the events making up the stimulus. Regarding complexity, both feature-based complexity (KC) and information-based complexity (IC) influenced perceived complexity, pointing to the intertwined but also distinct nature of the two constructs. IC reflects more complex higher-order schematic models of stylistic pitch and rhythmic structure than KC but lacks the measure of event density that is included in KC. The absence of other effects of IC on perceived unpredictability suggests that IC (reflecting stylistic unpredictability) was controlled when manipulating contour and symmetry38 and that other factors (jaggedness and asymmetry) drove perceived unpredictability. These findings add to the literature on the relevance of predictive processing in music appreciation,32 underscoring the distinction between information-based stimulus unpredictability and perceived unpredictability (the former, when sufficiently varied, may influence the latter but is just one of several influences) and the preeminence of the latter in directly influencing liking.
Divergences from previous findings (i.e., significant group effects of the stimulus properties on liking in, e.g., Refs. 8, 12, and 16, but not here) may reflect differences in stimuli and experimental settings. We used naturalistic melodies covering a wide range of Western musical periods and styles, examining relations between stimulus properties, their perceptual representations, and liking (cf., Refs. 8 and 16), and we factored in variability between and within participants and stimuli (cf., Refs. 9 and 36). It is possible that the greater variability in our stimulus set created greater opportunity for individual differences in liking and relationships with (perception of) stimulus properties. Further research should examine this possibility and also extend the research to other stimulus properties (e.g., consonance, harmonicity, grammaticality), their perceptual representations, perceived unpredictability, and liking for these and other stimuli. To facilitate this, the computational measures of stimulus properties (BC, SC, CC, KC, and IC) developed and deployed in this research can all be applied to other melodic stimuli using software provided in the online repositories (see Materials section).
The random effects explained the largest proportion of the variance in most internal models (especially liking) and removing them worsened the SEM fit consistently across subsets. Therefore, accounting for variability between and within participants and stimuli was essential to unveil the relations between stimulus properties, their perceptual representations, and liking. Indeed, systematic assessments have shown that most variance explained in judgments of hedonic value reflects differences between and within individuals.48–51 In the realm of music, research has also explicitly examined individual variability in appreciation (e.g., Ref. 75) and the extent to which it is explained by other traits (e.g., Ref. 76). The present study is particularly illustrative as it yields null group-level effects of stimulus properties on appreciation but significant individual effects revealed by random effects. Therefore, these findings emphasize the perils of neglecting individual differences.16, 50
Together, these contributions mean that we cannot assume a direct relationship between stimulus properties and pleasurable responses but must first consider how those stimulus properties are perceived and represented, which may vary between individuals. For example, we might find a relationship of increasing liking with increasing symmetry, but when we look at the stimulus ratings, we might find that all the stimuli are rated as being asymmetric, so the relationship should really be cast as increasing liking with decreasing asymmetry. Or we might find that different participants scale their representations of stimulus properties differently. For instance, the region of subjective complexity corresponding to “relatively complex” for one individual might correspond to “relatively simple” for another individual. In addition to these individual differences in the representation of stimulus properties, the present results also suggest individual differences in the relationship between representations and liking. One participant might show a positive relationship between liking and complexity (i.e., a preference for greater complexity), while another might show a negative relationship (i.e., a preference for lower complexity). Or participants might show nonlinear relationships between stimulus representations and liking such as the inverted-U-shaped relationship (whose apex might differ among individuals). This paints a picture of a much more complex process of appreciation than has typically been reflected in prior research that focused either on stimulus properties or subjective ratings in isolation.
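A small simulation can illustrate how opposing individual relationships produce a null group-level effect alongside substantial individual effects. This is a hedged sketch with invented data, not an analysis of the study's ratings: the function names, the number of participants, and the slope values are all assumptions chosen for illustration.

```python
import random
from statistics import mean, pstdev


def slope(x, y):
    """OLS slope of y regressed on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_participant(beta, rng, n_stimuli=40):
    """Hypothetical participant: liking = beta * perceived complexity + noise."""
    x = [rng.uniform(-1, 1) for _ in range(n_stimuli)]
    y = [beta * xi + rng.gauss(0, 0.3) for xi in x]
    return x, y


rng = random.Random(7)
# Assumption: half of the participants prefer greater complexity, half prefer less.
true_slopes = [1.0 if i % 2 == 0 else -1.0 for i in range(50)]
data = [simulate_participant(beta, rng) for beta in true_slopes]

# Group-level analysis: pool all ratings and ignore who produced them.
pooled_x = [xi for x, _ in data for xi in x]
pooled_y = [yi for _, y in data for yi in y]
group_slope = slope(pooled_x, pooled_y)

# Individual-level analysis: one slope per participant.
individual_slopes = [slope(x, y) for x, y in data]
slope_sd = pstdev(individual_slopes)

print(f"group slope={group_slope:.3f}, SD of individual slopes={slope_sd:.3f}")
```

The pooled slope is near zero because positive and negative individual slopes cancel, whereas the spread of individual slopes is large. A mixed-effects model with random slopes by participant captures this spread instead of treating it as noise.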
Limitations
We ensured that the stimuli were entirely unfamiliar to participants by removing approximately 25% of the data. This strategy minimized the effects of veridical knowledge based on episodic memories of the music. However, familiarity effects (linked to schematic knowledge; e.g., see Refs. 77 and 78) cannot be completely eliminated, because the stimuli use a familiar tuning system and because knowledge of Western harmony and modes in melodies is embedded in our participants, all of whom belong to Western, educated, industrialized, rich, and democratic (WEIRD) countries.79 More importantly, removing these data meant that participants contributed unequally to the models and that the datasets for different subsets had unequal sizes. The sample size was sufficiently large to accommodate such differences between conditions, but replication with equal individual contributions to all dimensions would help to dispel any doubt.
The experimental paradigm precluded contamination between perceptual and liking ratings, but contamination between perceptual ratings could not be prevented. Hence, caution is warranted when considering the role of perceived unpredictability. However, the structure of relations and the effects of perceived unpredictability were robust to our block design intended to enhance the effects of feature-based variability in each subset. Nevertheless, further research is necessary to disentangle potential contamination from the genuine effects suggested by the results and to elucidate the nature of these relations in a randomized design.
We recruited remote cohorts of adults in predominantly English-speaking countries, which involves well-known caveats concerning WEIRD samples, internet access, and willingness to join a paid research pool. However, the NatMUST was designed with a broad Western population in mind. Hence, the universe of stimuli from which the present set was selected matches the common cultural and educational affordances of a broad range of participants acculturated in Western music. This strategy minimized local cultural biases affecting our hypotheses. The stimuli used represent broadly only one category of music, namely, Western classical, albeit with a vast variety of periods and styles covered.
As with any behavioral study, the present research relies on explicit ratings that may entail demand effects and, in the case of liking, may not fully represent feelings of pleasure. Importantly, previous research (e.g., Refs. 43 and 80) has demonstrated that preference and pleasantness may indeed be understood as disjoint concepts in music perception. However, the participants were instructed to rate the stimuli according to their perceptual representations and internal feelings of pleasure, interest, enjoyment, and desirability evoked or elicited by the stimulus.
Finally, although SEM analyses purport to yield causal links, they rely on correlational associations. Causal experimental paradigms are necessary to demonstrate causal relationships.
CONCLUSION
The present research pioneers a systematic investigation of the relationships between stimulus information, its perceptual representations, and the pleasure we get from perceiving. In so doing, it contributes new open resources for research and yields two fundamental empirical findings: First, it unveils a mediating role of perceptual representations in the impact of stimulus properties (stimulus balance, contour, symmetry, complexity, and unpredictability) upon liking for melodic excerpts. Second, it demonstrates the central relevance of variability due to participants and stimuli, typically disregarded as noise in the literature, to understand perceptual and evaluative mechanisms. Both findings align with growing research supporting the situated nature of appreciation, according to which the pleasure of perceiving is driven by how sensory information is processed in the brain, considering the individual's current state, goals, expectations, and context,1 and go a step further in testing how particular aspects of sensory information are represented and what their relevance is for evaluative judgments such as liking.
AUTHOR CONTRIBUTIONS
Conceptualization: A.C. and M.T.P. Methodology: A.C. Software: A.C. and T.M.K. Validation: A.C. Formal analysis: A.C. and T.M.K. Investigation: A.C. Resources: A.C. Data curation: A.C. Writing—original draft: A.C. Writing—review and editing: A.C., T.M.K., and M.T.P. Visualization: A.C. Supervision: M.T.P. Project administration: A.C. Funding acquisition: M.T.P.
ACKNOWLEDGMENTS
This research was supported by a doctoral studentship awarded to T.M.K. from the EPSRC-AHRC Centre for Doctoral Training in Media and Arts Technology under agreement EP/L01632X/1; and a Margarita Salas postdoctoral fellowship awarded to A.C. from the Spanish Ministry of Universities and funded by the European Union.
COMPETING INTERESTS
The authors declare no competing interests.
Open Research
PEER REVIEW
The peer review history for this article is available at: https://publons.com/publon/10.1111/nyas.15106
DATA AVAILABILITY STATEMENT
The anonymized raw data and the NatMUST stimuli and their associated computational measures are available at https://osf.io/k6gme/.