Sixteen human subjects (10 women, ages 20–31) participated in this study in accordance with a research protocol approved by the Institutional Review Board at Purdue University. All subjects were healthy volunteers with normal vision, and informed written consent was obtained from every subject. Of the 16 subjects, three were excluded from the subsequent analyses because they either fell asleep with their eyes closed or showed excessive head movement.

An uninterrupted movie segment (5 minutes and 37 seconds in length, from 163:06 to 168:50 in The Good, the Bad and the Ugly, directed by Sergio Leone) was converted to gray scale to provide a naturalistic visual stimulus. From this movie, a scrambled movie was created that preserved the statistical distribution of pixel intensity of the intact movie while being perceptually meaningless. Specifically, we used the 3-D Fast Fourier Transform (FFT), as implemented by the function 'fftn' in MATLAB, to convert the intact movie into the frequency domain, where the phase was randomly shuffled while the magnitude remained unchanged; the inverse FFT was then applied to the phase-shuffled data to generate a scrambled version of the original movie (a minimal sketch of this procedure is given after the acquisition details below). As illustrated in Fig 1, the intact and scrambled movies were matched in low-level visual features, so that the contrast between them was specific to the presence vs. absence of high-level natural content, independent of low-level visual properties.

Fig 1 (10.1371/journal.pone.0161797.g001). Illustration of the naturalistic vs. scrambled visual stimuli. Example frames from the intact movie (A) and the scrambled movie (B). The scrambled movie was created by shuffling the phase of the intact movie in the frequency domain (C). [Figure omitted.]

Among the 13 remaining subjects, seven underwent four sessions of fMRI acquisition with visual stimulation: two sessions for the intact movie and two sessions for the scrambled movie. These subjects were instructed to freely view the movie stimuli. Four subjects underwent two repeated fMRI sessions with the intact movie while instructed to fixate on a cross-hair (0.8 × 0.8 degrees in width and height) at the screen center. Two subjects underwent eight sessions of fMRI acquisition: two sessions for the intact movie (free viewing), two for the intact movie (fixation), two for the scrambled movie (free viewing), and two for the scrambled movie (fixation). Every stimulation session started with a blank gray screen presented for 42 seconds, followed by the intact or scrambled movie presented for 5 minutes and 37 seconds, and ended with the blank screen again for 30 seconds. No sound was played during the movie. The order of the stimulation sessions was randomized and counterbalanced across subjects.

All experiments were conducted in a 3T MRI system (Signa HDx, General Electric, Milwaukee). A 16-channel receive-only phased-array surface coil (NOVA Medical, Wilmington) was used throughout every experiment. T1-weighted anatomical images were acquired with a spoiled gradient recalled acquisition (SPGR) sequence (256 sagittal slices with 1 mm thickness and 1 × 1 mm² in-plane resolution, TR/TE = 5.7/2 ms, flip angle = 12°).
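For illustration, the phase-scrambling procedure referenced above can be sketched in a few lines of MATLAB. The variable names are assumptions: 'movie' holds the gray-scale intact movie as a height-by-width-by-frames array.

    % Phase scrambling of the intact movie (sketch; variable names assumed).
    F   = fftn(movie);                       % 3-D FFT into the frequency domain
    mag = abs(F);                            % magnitude spectrum, kept unchanged
    phi = angle(F);                          % phase spectrum, to be shuffled
    phi = reshape(phi(randperm(numel(phi))), size(phi));   % random phase shuffle
    % The real part is taken because shuffling breaks the conjugate symmetry
    % that makes the inverse FFT of a real-valued movie real.
    scrambled = real(ifftn(mag .* exp(1i * phi)));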
Functional MRI data were acquired with a standard single-shot, gradient-recalled (GRE) echo-planar imaging (EPI) sequence (38 interleaved axial slices with 3.5 mm thickness and 3.5 × 3.5 mm² in-plane resolution, TR = 2000 ms, TE = 35 ms, flip angle = 78°, field of view = 22 × 22 cm², in-plane acceleration by a factor of two based on sensitivity encoding (SENSE), no acceleration in the slice direction).

The visual stimuli were presented using the MATLAB-based Psychophysics Toolbox [15,16] and delivered to the subjects through a binocular goggle system (NordicNeuroLab, Norway) mounted on the head coil. The display resolution was 800 × 600. An MR-compatible monocular eye-tracking camera was integrated into the goggle system to monitor the left eye under infrared illumination. The eye-camera data were transmitted to an eye-tracking system (ViewPoint, Arrington Research, USA), which captured the eye movements and tracked the subject's gaze location at 30 Hz during the stimulation session. To ensure reliable and accurate eye tracking, the system was recalibrated before every stimulation session. Through the goggle system, the visual field covered by the movie was about 26.9 × 20.3 degrees. When the subjects were instructed to fixate at the screen center, fixation was not enforced by a fixation task but was confirmed retrospectively with the eye-tracking data.

MRI and fMRI data were preprocessed using FSL [17] and AFNI [18]. Briefly, T1-weighted anatomical images were non-linearly registered to the MNI brain template. T2*-weighted functional image series were corrected for slice timing, registered to the first volume within each series to account for head motion, masked to exclude non-brain tissues, aligned to the T1-weighted structural MRI, registered to the MNI template, and resampled into 3 × 3 × 3 mm³ voxels. After discarding the first six volumes, the fMRI data were temporally detrended using a third-order polynomial function to model the slow signal drift, and spatially smoothed with a 3-D Gaussian filter of 6 mm full width at half maximum (FWHM). As the focus of this study was on brain activity during sustained and dynamic movie stimuli, we used only the data from 12 seconds into the movie to the end of the movie. Data from all other periods were excluded from subsequent analyses to avoid the transient fMRI response shortly after the gray screen was replaced by the movie presentation.

Inter- and intra-subject reproducibility in fMRI activity: As previously published elsewhere [5], the map of visually evoked activations was obtained by assessing the zero-lag Pearson cross correlation of every voxel's time series between two repeated presentations of the identical movie, for the same or different subjects. Specifically, the intra-subject correlation was measured for every voxel and each subject, whereas the inter-subject correlation was measured for every voxel and each pair of different subjects. The intra- and inter-subject correlation coefficients were converted to z scores through Fisher's r-to-z transform, with the degrees of freedom set to the number of time points minus one. The intra- and inter-subject reproducibility was then tested for statistical significance at the group level, separately for the intact and scrambled movies under the free-viewing and fixation conditions.
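As a minimal sketch of this reproducibility measure, assume ts1 and ts2 are T-by-V (time-by-voxel) matrices holding the preprocessed fMRI time series from two repetitions of the same movie (from one subject in the intra-subject case, or from two different subjects in the inter-subject case); the scaling by the stated degrees of freedom is one common normalization of the Fisher transform:

    % Voxel-wise zero-lag Pearson correlation between two repetitions (sketch).
    [T, V] = size(ts1);
    r = zeros(1, V);
    for v = 1:V
        c = corrcoef(ts1(:, v), ts2(:, v));  % zero-lag cross correlation
        r(v) = c(1, 2);
    end
    % Fisher r-to-z transform, scaled by the stated degrees of freedom
    % (number of time points minus one).
    z = atanh(r) * sqrt(T - 1);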
For the intra-subject reproducibility, the z-transformed correlation coefficient (or z score) between sessions from the same subject was averaged across subjects; voxel-wise statistical significance was evaluated with a one-sample t-test (p < 0.0033, DOF = 8), corrected for multiple comparisons by controlling the false discovery rate (FDR) at < 0.03 [19] (FDR is also denoted as q in this paper). The same correction for multiple comparisons was used for all other statistical tests in this study unless otherwise noted.

For the inter-subject reproducibility, the voxel-wise cross correlation was obtained for each pair of different subjects. The pair-wise correlations were not all independent, because each subject could be involved in more than one subject pair. This lack of independence invalidated the use of parametric statistical tests such as the above-mentioned one-sample t-test. Instead, we used a non-parametric, resampling-based statistical inference as implemented in the ISC toolbox (www.nitrc.org/projects/isc-toolbox/) and described in detail elsewhere [20]. Briefly, an empirical distribution was obtained for the inter-subject cross correlations under the null hypothesis that such correlations were trivial and non-significant. To do so, every subject's signals were circularly shifted in time by a random and different amount, so that the signals from different subjects were no longer aligned in time. Following this temporal resampling, cross correlations were computed between the resampled time series at the same or different voxels from different subjects, yielding a null distribution of 10 million samples of trivial inter-subject cross correlations. This resampling-based inference also accounted for the auto-correlated nature of the fMRI signal. The observed (un-resampled) inter-subject cross correlations were then tested against this empirical null distribution, yielding p values corrected for multiple comparisons with FDR < 0.03. Note that neither physiological (respiratory or cardiac) fluctuations nor head-motion parameters were regressed out of the fMRI signals before calculating the intra- and inter-subject reproducibility, because such task-unrelated fluctuations were not expected to contribute to the cross correlation between signals from different fMRI sessions.

Assessing the effects of gaze behavior: In the natural-vision paradigm, interpreting reproducible fMRI activity in terms of fluctuating visual features may be of concern, because multiple confounding factors may contribute to the fMRI signal and drive or disrupt its reproducibility. We first addressed the degree to which fMRI activity and its reliability could be accounted for by the subjects' varying gaze behavior, as monitored with eye tracking. Specifically, every subject's gaze locations were estimated from instantaneously captured eye images. The time series of the horizontal (x) and vertical (y) gaze positions were first filtered with a moving average within a 1-s window to minimize the influence of extreme values caused by eye blinks. After removing the mean from the x and y time series, we characterized the gaze behavior by calculating the instantaneous saccadic amplitude (in degrees), i.e., the visual angle spanned between two consecutive gaze locations, based on a gaze displacement model described elsewhere [21,22].
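A minimal sketch of this gaze preprocessing follows, assuming gx and gy are the horizontal and vertical gaze traces in degrees sampled at 30 Hz; the Euclidean displacement below is a simple stand-in for the gaze displacement model of [21,22]:

    fs = 30;                       % eye-tracker sampling rate (Hz)
    gx = movmean(gx, fs);          % 1-s moving-average filter to suppress
    gy = movmean(gy, fs);          % extreme values caused by eye blinks
    gx = gx - mean(gx);            % remove the mean from each trace
    gy = gy - mean(gy);
    % Instantaneous saccadic amplitude: visual angle spanned between two
    % consecutive gaze samples (Euclidean approximation, in degrees).
    amp = hypot(diff(gx), diff(gy));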
The saccadic-amplitude signal was further demeaned and convolved with the canonical hemodynamic response function (HRF), as implemented in SPM (www.fil.ion.ucl.ac.uk/spm/). The result was then resampled to match the fMRI sampling times after applying an anti-aliasing low-pass filter with a cutoff frequency of 0.25 Hz (a minimal sketch of this regressor construction is given at the end of this section). A single regressor was thereby derived from the time-varying saccadic amplitude as an explanatory variable for the fMRI signal. The zero-lag cross correlation between the fMRI signal and the saccade-amplitude regressor was evaluated for every voxel. The correlation coefficients were transformed to z scores and tested across subjects for significance with a one-sample t-test (p < 0.011, DOF = 17, FDR < 0.03).

To assess the effects of gaze on the reproducibility of fMRI activity, the gaze-related regressor was removed from each voxel's time series using linear nuisance variable regression (NVR). The difference in intra-subject reproducibility with vs. without the NVR was tested for significance with a paired t-test (p < 0.03, DOF = 8, uncorrected). Alternatively, the effects of gaze behavior were addressed by comparing the intra-subject reproducibility of fMRI activity when the subjects freely watched the movie (n = 9) versus watched the same movie with their eyes fixated (n = 6). The latter served as a control condition without gaze-behavioral variation either within or across subjects. Between these two conditions, the voxel-wise difference in the z-transformed intra-subject correlation was tested for statistical significance with a two-sample t-test (p < 0.03, DOF = 13, uncorrected).

Assessing the effects of scene transitions: We further addressed the effects of scene transitions due to film editing, which manifested as transient and discontinuous changes in the visual input. The occurrence times of scene transitions were first detected wherever the sum of absolute pixel differences between two consecutive movie frames exceeded a given threshold, and were then visually confirmed. A train of binary events was defined as a time series of ones and zeros, with one indicating the occurrence of a transition event. The binary events were further weighted by the motion estimated between the two consecutive movie frames at the transition time, based on a block-matching algorithm implemented as a built-in MATLAB function (vision.BlockMatcher, block size = 9) [23], which measures the motion field of pixel blocks between two images regardless of their perceptual content (a minimal sketch of this detection and weighting is also given at the end of this section). The motion-weighted transition events were then convolved with the HRF, filtered, and resampled to construct a regressor for the fMRI signal. We used this regressor to evaluate the effects of scene transitions on fMRI activity (p < 0.0017, DOF = 17, FDR < 0.03) and its reproducibility (p < 0.001, DOF = 8, FDR < 0.03), using correlation and regression analyses similar to those used for addressing the effects of gaze behavior.

While the above analyses were performed on the volumetric data, the results were displayed on both the MNI volume template and cortical surface templates [24].
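As referenced above, a minimal sketch of the saccade-regressor construction, assuming amp is the saccadic-amplitude series from the previous sketch (30 Hz), TR = 2 s, and SPM's spm_hrf is on the MATLAB path; the fourth-order Butterworth design is an assumption, since the filter type is not specified in the text:

    fs = 30;  TR = 2;                     % gaze sampling rate (Hz) and fMRI TR (s)
    hrf = spm_hrf(1 / fs);                % canonical HRF sampled at 30 Hz (SPM)
    reg = conv(amp - mean(amp), hrf);     % demean and convolve with the HRF
    reg = reg(1:numel(amp));              % trim the convolution tail
    [b, a] = butter(4, 0.25 / (fs / 2));  % anti-aliasing low-pass, 0.25 Hz cutoff
    reg = filtfilt(b, a, reg);            % zero-phase filtering
    reg = reg(1:fs * TR:end);             % sample at the fMRI acquisition times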
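Similarly, a minimal sketch of the scene-transition detection and motion weighting, assuming movie is the gray-scale frame array used earlier and thr is the chosen frame-difference threshold; weighting each event by the mean block-motion magnitude is one plausible reading of the weighting described above:

    % Motion-weighted scene-transition events (sketch; thr is assumed).
    bm = vision.BlockMatcher('BlockSize', [9 9], ...
         'OutputValue', 'Horizontal and vertical components in complex form');
    events = zeros(1, size(movie, 3));
    for t = 2:size(movie, 3)
        prev = movie(:, :, t - 1);  curr = movie(:, :, t);
        d = sum(abs(curr(:) - prev(:)));       % summed absolute pixel difference
        if d > thr                             % candidate transition (visually confirmed)
            mv = step(bm, prev, curr);         % complex block-motion vectors
            events(t) = mean(abs(mv(:)));      % weight by mean motion magnitude
        end
    end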