nif:isString
|
-
Protocols for this study were approved by the McMaster University Research Ethics Board. All participants gave written informed consent prior to participating. To create voices representative of a given population, we created an initial male and an initial female voice speaking the English monophthong vowel sounds: ‘eh’ as in ‘bet’, ‘ee’ as in ‘see’, ‘ah’ as in ‘father’, ‘oh’ as in ‘note’, and ‘oo’ as in ‘boot’. Such stimuli have been used in many studies on voice perception [17], [22], [27], [34], [35]. The initial voices were created by averaging 32 male voices (mean: 109.99 Hz, SD: 3.18 Hz, range: 86–152 Hz) and, separately, 32 female voices (mean: 210.81 Hz, SD: 20.67 Hz, range: 143–285 Hz) using STRAIGHT [36]. Briefly, this procedure entails extracting pitch and demarcating key spectral features (e.g., formant frequencies and vowel onset and offset) on spectrograms of the sound (Figure 1). These features are aligned in time; fundamental frequency and harmonics, amplitude, time, and formant frequencies are then averaged separately, and the voices are reconstructed. This method has been used successfully in other studies of voice processing [37]. Voices were averaged in pairs, iteratively, until one base voice of each sex had been created from all 32 voices. The final pitches of the averaged voices were 110 Hz for the male voice and 211 Hz for the female voice. Spectrograms of the average male and female voices used in this study are shown in Figure 1.
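The iterative pairwise averaging can be illustrated on a single scalar feature such as mean fundamental frequency. The sketch below is only an illustration under stated assumptions: STRAIGHT averages full time-aligned spectral representations, which is not reproduced here, and the function name `average_in_pairs` and the input values are hypothetical, not the study's data.

```python
def average_in_pairs(values):
    """Average values two at a time, round by round, until one value remains."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("expected a power-of-two number of values")
    current = list(values)
    rounds = 0
    while len(current) > 1:
        # Pair neighbours (0 with 1, 2 with 3, ...) and keep each pair's mean.
        current = [(a + b) / 2 for a, b in zip(current[0::2], current[1::2])]
        rounds += 1
    return current[0], rounds


# Hypothetical mean F0 values in Hz for 32 voices (illustrative only).
f0_values = [100.0 + i for i in range(32)]
base_f0, n_rounds = average_in_pairs(f0_values)
print(base_f0, n_rounds)
```

With 32 inputs the loop runs five rounds (32, 16, 8, 4, 2, 1), and because every pair is weighted equally the result equals the ordinary arithmetic mean of the inputs.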
Figure 1 (10.1371/journal.pone.0032719.g001; figure data removed from full text). Spectrograms of the average female (left panel) and male (right panel) voices for the five vowel sound stimuli. Spectrograms plot time on the X axis and frequency on the Y axis; amplitude is represented by shading.
Next, we manipulated voice pitch using the Pitch-Synchronous Overlap Add (PSOLA) algorithm [38] in the Praat acoustic phonetics software [39]. The initial voices were manipulated in 2 Hz steps. The PSOLA method selectively manipulates mean fundamental frequency and the corresponding harmonics independently of time and formant frequencies, and has been used successfully in many studies of voice preferences and other mate-choice-relevant contexts in humans [17], [21], [22], [27], [40], [41] and in other mammalian species [42], [43]. Although voice pitch was manipulated, formant frequencies were retained, and previous research has demonstrated that such manipulations create voices that still sound “adult-like” [35]. The pitch range for men's voices was 60–180 Hz, and the pitch range for women's voices was 160–300 Hz. These ranges extend well below the pitches of the 32 men's voices and above those of the 32 women's voices used in creating the initial averaged voices. Praat's pitch parameters were set to a minimum of 50 Hz and a maximum of 300 Hz for men's voices, and a minimum of 100 Hz and a maximum of 600 Hz for women's voices. Window length was determined automatically by Praat.
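As a rough sketch, the target pitches implied by the 2 Hz step size and the stated ranges can be enumerated as follows. It assumes that both endpoints of each range are included; the names `pitch_grid`, `mens_targets`, and `womens_targets` are hypothetical, and the PSOLA resynthesis itself (done in Praat) is not shown.

```python
STEP_HZ = 2  # step size stated in the text


def pitch_grid(low_hz, high_hz, step_hz=STEP_HZ):
    """Return target pitches from low_hz to high_hz inclusive (assumed)."""
    return list(range(low_hz, high_hz + 1, step_hz))


mens_targets = pitch_grid(60, 180)     # men's voices, 60-180 Hz
womens_targets = pitch_grid(160, 300)  # women's voices, 160-300 Hz

print(len(mens_targets), len(womens_targets))  # 61 71
```

Under this assumption there are 61 target pitches for the men's voices and 71 for the women's voices.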
Ten men (mean age: 21.80, SD: 1.45) and nine women (mean age: 22.05, SD: 1.58) participated in the study. All were university students with normal hearing. All participants reported English as their first language, and none reported any musical training. Participants rated voices in a two-alternative forced-choice paradigm: on each trial, two voices were presented, one after the other. Participants were free to replay the voices as desired. Voice trials were presented using the method of constant stimuli [44], in which each voice pitch was compared to every other pitch in random order. This method was chosen to avoid auditory adaptation to the stimuli, which could have affected just-noticeable differences (JNDs). Furthermore, extensive sampling of the pitch dimension allowed for an in-depth analysis of possible perceptual constraints on preferences for voice pitch. Extensive sampling of few participants is common practice in auditory psychophysics [45], [46], [47], [48], [49], [50], [51]. Men listened to all possible pairs of women's voices, and women listened to all possible pairs of men's voices. Catch trials (i.e., pairs with no difference in pitch) were included at each pitch interval. Men listened to 51 blocks of 50 voice pairs and 1 block of 6 voice pairs, while women listened to 37 blocks of 50 voice pairs and 1 block of 42 voice pairs. Each block of 50 voice pairs took approximately 15 minutes to complete, and participants completed a maximum of eight blocks per day. Each task took several weeks to complete. The frequencies of all voices were randomized within and between blocks. In all, men listened to 2556 voice pairs, and women listened to 1892 voice pairs.
The study was divided into three tasks. The first task was a simple pitch discrimination task. Four men and four women were asked to pick the voice with the higher pitch [47]. Several previous studies have assessed JNDs in pitch for vowel sounds [45], [46], [52], [53].
The second task was designed to determine JNDs in vocal attractiveness based on pitch manipulations. Four men and four women listened to all the voice trials and were asked to pick the voice they thought was more attractive. Two men and three women who completed the pitch discrimination task also completed the voice attractiveness task. As the tasks were performed months apart, it is unlikely that the voice attractiveness task affected performance on the pitch discrimination task. The third task assessed JNDs in voice pitch for perception of vocal dimorphism (masculinity or femininity). Four men and four women participated in the third task. Men were presented with pairs of women's voices and were asked to choose the voice they thought was more feminine (as in [27]). Women were presented with pairs of men's voices and were asked to choose the voice they thought was more masculine.
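The trial structure shared by all three tasks (method of constant stimuli, catch trials, blocks of 50) can be sketched as below. This is a minimal illustration, not the authors' presentation software. It assumes one trial per unordered pair of distinct pitches, one catch trial per pitch, and a random presentation order within each pair; `build_trials` and `split_into_blocks` are hypothetical names.

```python
import itertools
import random


def build_trials(pitches_hz, rng):
    """All unordered pairs of distinct pitches plus one catch trial per pitch."""
    trials = [list(pair) for pair in itertools.combinations(pitches_hz, 2)]
    for pair in trials:
        rng.shuffle(pair)  # assumed: which voice plays first is random
    trials += [[p, p] for p in pitches_hz]  # catch trials: identical pitches
    rng.shuffle(trials)  # randomize trial order across the whole task
    return [tuple(t) for t in trials]


def split_into_blocks(trials, block_size=50):
    """Cut the trial list into consecutive blocks of at most block_size."""
    return [trials[i:i + block_size] for i in range(0, len(trials), block_size)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
womens_voices = list(range(160, 301, 2))  # 160-300 Hz in 2 Hz steps
trials = build_trials(womens_voices, rng)
blocks = split_into_blocks(trials)

print(len(trials), len(blocks), len(blocks[-1]))  # 2556 52 6
```

For the 71 women's pitches this reproduces the reported 2556 voice pairs in 51 blocks of 50 plus one block of 6. The same construction applied to the 61 men's pitches (60–180 Hz) gives 1891 trials, one fewer than the 1892 reported; the text does not say where the additional trial comes from, so the pairing rule here should be read as an assumption.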
|