nif:isString
|
-
The study was approved by the University of Maryland Institutional Review Board. All participants gave written informed consent and the capacity to provide informed consent was documented by testing all participants on whether they could recall the demands of the study, the risks of taking part in the study, and demonstrated knowledge of their ability to withdraw from the study.
Forty-eight participants with a diagnosis of SZ (N = 38) or schizoaffective disorder (N = 10; according to DSM-IV diagnostic criteria) and 32 controls were recruited for the experiment. Patients were clinically and pharmacologically (drug and dose) stable (> 4 weeks) outpatients from the Maryland Psychiatric Research Center or other nearby clinics. Controls were free from a lifetime history of SZ, other psychotic disorder, current Axis I disorder, drug dependence, neurological disorder, or cognitively impairing medical disorder, with no family history of psychosis in first-degree relatives. Controls were screened with the Structured Clinical Interview for DSM-IV [45,46]. One patient and one control were excluded for being unable to learn the easiest condition (Go-to-Win), defined as less than five correct responses. Three participants (2 SZ and 1 HC) were excluded for lack of deviance in responding, defined as making an extended run (> 40) of “Go” responses or “No-Go” responses. Forty trials covers close to a full block of persistent responding and it is known that for at least one participant this reflected gamepad malfunction. This left 45 SZs and 30 HCs for the behavioural analysis. Participants underwent detailed neuropsychological testing, see supplementary material (S1 File) for assessments reported on.
The task was derived from [9] and the EEG modification was derived from [10]. Four simple shape stimuli were presented 48 times each (total trials = 192) to participants in a pseudo-random order. Participants were instructed to respond by pressing a button (“Go”) or withhold responding (“NoGo”) to gain rewards (“Win”) or avoid punishments (“Avoid”). Stimuli were rewarded or punished at a probability of 0.8. Two stimuli were associated with reward (thumbs up image, reflecting monetary gain) and two stimuli were associated with punishment (thumbs down image, reflecting monetary loss). The alternative to reward or punishment was a neutral outcome (thumb to the side, no monetary change). Monetary gain or loss was set at $0.05 per trial. Action and valence were crossed, resulting in one of each of the four stimuli requiring “Go-to-Win”, “Go-to-Avoid”, “NoGo-to-Win” and “NoGo-to-Avoid” in order to achieve the best possible outcome. The stimulus presentation sequence and timings were as follows: a cross hair presented for 400–600 ms, the stimulus presented for 1000 ms, a no-response period presented for 250–2000 ms, a response window presented for 2500 ms indicated with an “O” for 1500 ms then a cross hair 1000 ms, finally feedback was presented for 2000 ms. Participants were instructed that four images would be presented and they would have to decide on the best response to make (to press the button or to not press the button) by trial and error to win the most money possible. Participants were told that some images had a chance of winning money if they made the right decision and others had a chance of losing money if they made the wrong decision. Depending on the outcome associated with the correct response (achieving a gain or avoiding a loss) the best strategy to some stimuli will be to press the button while for other stimuli the best decision will be to withhold responding. Following instructions, participants were given a series of practice trials with unique stimuli to get accustomed to the task. They were instructed through a Go-to-Win block followed by a NoGo-to-Avoid block, explaining the response options and the probabilistic nature of the rewards or punishments. Following the explicit instruction block, participants underwent a second practice session with two stimuli (Go-to-Win and NoGo-to-Avoid) without instruction to ensure an understanding of both response options and the structure of the task. Before the onset of the main experimental training phase, participants were reminded that each image has one best decision option, to press, or not to press, and that it stays the same for the entire task. Finally, it was reinforced that all four combinations of stimulus-response pairings were possible.
EEG was recorded from a 32 channel Biosemi system. Data were recorded unreferenced with the ground at AFz using a sampling rate of 1024 Hz with 512 Hz hardware filters. Data were imported into EEGLAB [47], offline referenced to linked mastoids, filtered between 1 and 40 Hz and down-sampled to 256 Hz. Data were epoched from -1500 ms to 1500 ms around stimulus and feedback event codes. Epochs with large potential fluctuations were removed using EEGLAB's pop_autorej procedure (starting probability was set at 5 SD and the maximum % of epochs to reject per iteration was set at 5). The first pass cleaned EEG data underwent ICA using the AMICA algorithm [48] before further artifact rejection was applied based on detection of significant linear trends over the epoch in component space or abnormal component signal strength in both the 0–2 Hz range and the 20–40 Hz range [49]. Another round of ICA was repeated on the second pass cleaned data, which was used to subtract activity associated with eye blinks and eye movements. ERPs were baseline corrected to a 100 ms baseline. Time-frequency analysis using the time-frequency analysis function from within EEGLAB [47] was applied to the data at logarithmically spaced frequencies from 3 to 40 Hz. Time-frequency power was baseline corrected using the average of the power response from -300 to -200 ms.
Models were adapted from previous modelling efforts using this task [9,10]. The final model used in the analysis was a six parameter model that included reward sensitivity (ρrew), punishment sensitivity (ρpun), learning rate (ε), irreducible noise (ξ), go bias (b) and Pavlovian bias (π). Hierarchical Bayesian parameter estimation using Monte-Carlo Markov Chain was performed using Stan [50]. This procedure obtains full posterior distributions on each parameter (i.e. not just their best guess value but the uncertainty about those values), and this method was found to improve parameter recovery in simulation experiments relative to other approaches. See supplementary material (S1 File) for more detail.
Bayesian repeated measures ANOVA-style models and Bayesian style t-tests were used to analyse the behavioural data [51,52]. More detail on the models used are in the supplementary material (S1 File). The advantages of these models include: can incorporate a t-distribution to render the analysis robust to outliers and some distortions of the normal distribution; model unequal variances; shrinkage to improve estimation and control for multiple comparisons.
Threshold Free Cluster Enhancement (TFCE) was developed to overcome problems associated with threshold selection for EEG data, that gives a fully parametric account of the functional brain response and the functional differences between groups [53,54]. TFCE was calculated according to the method in Mensen & Khatami [53] and Pernet et al. [54]. First, appropriate between-/within-subject t-statics or correlation coefficients were calculated for each time point and electrode for the ERP analysis, or time point and frequency for the time-frequency analyses. Clustering was applied using a thresholded 8 nearest neighbour approach in time and frequency space (for time-frequency analyses at channel FCz) or time and electrode space (for voltage analyses at Fz, F3, F4, FCz, Cz, C3, C4, Pz, P3, and P4). Violations of test assumptions and type I error rates were addressed using permutation statistics. See supplementary material (S1 File) for further details of the method and permutation testing.
Single trial ERP and theta power relationship with PE: For the ERP traces, voltages on a trial by trial basis at all time points (i.e., from -200 to 1000 ms post-stimulus in 3.9 ms increments) were obtained for each individual. For each of the 307 time points, the estimated PEs obtained from from the RL model (from S1 File ρrew|pun * r–Qt-1[at | st]; see e.g., [33]) were correlated with voltage using Spearman's rho. Spearman's rho coefficients underwent Fisher's r to z transform before entering into TFCE analysis and averaged for display. Similarly, for the relationship between PE and theta (4–8 Hz) power was averaged between 300 and 600 ms post-feedback onset for each trial. Bayesian linear mixed effects modelling using custom code calling Stan was used to regress theta power as a function of PE. Diagnostic group was included as an interacting factor with PE. Participants' intercepts and slopes were treated as random effects.
|