Silent Line-Videos Pick Music Winners

Aug 21, 2013

[In] classical music competitions, … nearly all participants — including highly trained musicians — were better able to identify the winners of competitions by watching silent video clips than by listening to audio recordings. … The effect held up even in high-level international competitions, which often feature not only top performers, but also highly trained musicians as judges. (more; HT Hugh Parsonage)

I give lots of quotes from the original study below the fold. Ordinary people and classical piano experts were rewarded for picking the winner from the top three candidates in ten prestigious international classical music competitions. People said and bet that they would guess better using sound only, but they in fact guessed better using video only, even when the video was reduced to line drawings.

They guessed worse when they had both audio and video. When they rated videos on various keywords, the word that best predicted winners was “passion.”

This strongly suggests that people are reluctant to admit to themselves how much pianists' passion and energy of motion influence their evaluation of those pianists. I recently puzzled over why people pay so much more attention to lead singers relative to backup singers when by most accounts the musical skill difference, if any, is very small. (Here’s a recent movie on this.) This new result suggests those usual musical skills are only a minor part of what people want from a singer; lead singers get most of the attention because they give most of what folks want: a vivid, passionate, attractive character to relate to.

I suspect similar results would hold for novelists and academics: people think they rate them mostly on content, but even experts usually put more weight on style, i.e., on energy and control relative to plot, setting, characters, problem choice, analysis, etc.

Overall this fits into the homo hypocritus framework, as it seems less licit or admirable to like musicians, novelists, or academics mainly because we like to affiliate with people with lots of energy and control. We prefer instead to think that what we like is nice music or words, and that artists are just instruments to get us those things.

Those promised quotes:

These findings suggest that there may be gaps between what we say we use to evaluate performance and what we actually use. People may be unlikely to recognize or admit that visual displays can affect their judgment about music performance, particularly in a domain in which other signals are deemed to be more indicative of quality. …

In highly competitive arenas such as music, competitions emerge as one launching pad for establishing careers. … Given different versions of competition performances, 1,164 participants in total were asked to identify the actual competition winners. These choices were then compared against the established outcomes, previously decided by panels of expert judges. …

Suppose that you have the chance to win cash bonuses if you can guess who won a live music competition. You may choose the type of recording you think would give you the best chance at winning the prize. You can select sound recordings, video recordings, or recordings with both video and sound. Which recordings do you choose? In experiment 1, participants were asked to make exactly that decision and bet their study earnings on their choices. As expected, 58.5% chose the sound recordings, significantly more so than the 14.2% who chose video recordings. … Despite a “tax” levied on selecting the recordings with both video and sound, 27.4% still chose those recordings. …

In experiments 2–5, the top three finalists in each of 10 prestigious international classical music competitions were presented to participants. Given such difficult decisions, untrained participants should fare no better than chance (33%) in identifying the winners of these competitions. In fact, even expert interrater agreement tends to be moderate, hovering at an average of 67%. … In experiment 2, novice participants were presented with both video-only and sound-only versions of 6-s clips of the top performances. … with silent video only recordings, participants were significantly above chance (52.5%). … With sound-only recordings, they were significantly below chance (25.5%) at identifying the winners. …

Experiment 3 tested judgment when more information was available, and presented participants with video-only, sound-only, or video-plus-sound versions of the performance clips included in experiment 2. Participants performed below chance with sound-only recordings (28.8%), and at chance with video-plus-sound recordings (35.4%). However, with silent video-only recordings, 46.4% of novices were able to identify the winners. …

In experiment 4, 96.3% of domain-expert participants reported that the sound mattered more for their evaluations. Despite musicians’ training to use and value sound in their evaluations, only 20.5% of experts identified the winners when they heard sound-only versions of the recordings. However, 46.6% did so upon viewing silent video clips. … In experiment 5, 82.3% of professional musicians cited sound as the most important information for judgment. However, when provided sound, only 25.7% of experts were able to identify the actual winners. … With video-only stimuli, musicians performed significantly better than chance (47.0%) … When provided with stimuli with both video and sound, experts were again at chance at 29.5%.

Experts were not significantly different from novices in their judgments of music performance. …

[In] Experiment 6 … recordings were distilled to their most basic representation as outlines of motion (Fig. S2). After seeing these 6-s silent clips of the three finalists, participants were asked to identify the actual winners. Participants were significantly better than chance (48.8%) at identifying the outcomes. … demographic cues such as race and sex … [and] physical attractiveness … did not significantly impact professional judgment in these competitions. …

In experiment 7, 262 participants were … asked to identify the most confident, creative, involved, motivated, passionate, and unique performer in each set of three finalists in the competitions. … Creativity, involvement, motivation, passion, and uniqueness were significantly more salient through visual cues rather than through sound. … Through silent videos, those selecting “the most passionate contestant” identified the actual winners at rates significantly higher than chance (59.6%). … Involvement (53.1%), motivation (52.8%), creativity (44.6%), and uniqueness (43.6%) also contributed.

Experts in particular reported a severe lack of confidence in their judgment when they were assigned to the video-only recordings, not knowing that their approximations of the actual outcomes would be superior under such constrained conditions. (more)
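As a rough illustration of what "significantly above chance" and "significantly below chance" mean in the quotes above, here is a minimal sketch of an exact binomial tail calculation against the 1-in-3 chance baseline. The sample size `n = 200` and the helper name `binom_tail` are my own assumptions for illustration; the quotes do not give per-experiment counts. The sketch also treats every guess as independent, which the actual study design (several guesses per participant) may not satisfy.

```python
from math import comb

def binom_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): probability of k or more correct guesses by luck."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# ASSUMPTION: hypothetical number of independent guesses, not taken from the study.
n = 200
chance = 1 / 3  # three finalists per competition

# Rates quoted for experiment 2 (novices).
for label, rate in [("video-only", 0.525), ("sound-only", 0.255)]:
    k = round(rate * n)
    upper = binom_tail(k, n, chance)          # P(at least k correct by luck)
    lower = 1 - binom_tail(k + 1, n, chance)  # P(at most k correct by luck)
    print(f"{label}: {k}/{n} correct, P(>=k)={upper:.3g}, P(<=k)={lower:.3g}")
```

A small upper-tail probability for the video-only rate, and a small lower-tail probability for the sound-only rate, is what a claim of being reliably above or below chance amounts to under these simplifying assumptions.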

Philip Goetz

May 15, 2023

Style matters in fiction, but only to editors and writers. I've been studying fan-fiction this past year, and I can say confidently that most readers do not give a damn about style, or even grammar. Content rules with readers. There are plenty of fan-fictions written with excellent style, and they are not especially popular. But you don't need to study fan-fiction to realize that; just check the bestseller lists.


dmytryl

I seem to be missing some misconception here. Do you think it is logically impossible that people would predict the outcome of a blind musical competition better based on the video only than the audio only, or what?

> "They guessed worse when they had both audio and video. "

Before you unleash the hindsight based pseudo-scientific method on this data, think a little about this passage. It does imply that individuals are in fact judging primarily based on the audio when video is available.

People have idiosyncratic preferences for specific styles of playing, that's the thing, the jury averages this out, but one person is stuck picking the player that matches their preference the best. Unless they don't even hear the player, I guess.

edit: indeed, let's consider world championship level chess competitions. I would assume that in theory* people would predict the outcome of those better based on a video of the first 10 moves showing just the players' faces than based on the first 10 moves showing just the pieces (the moves would reasonably be 100% perfect as far as anyone tested could tell); that wouldn't necessarily imply that chess is some sort of game where people look at each other's faces and the uglier-looking one resigns.

*In practice I doubt it'd be possible to get a large enough sample size.


Overcoming Bias