The data used in this study come from the Arkansas Children's Hospital Research Institute's autism IMAGE study [38]. The protocol was approved by the Institutional Review Board at the University of Arkansas for Medical Sciences, and all parents signed informed consent. The interested reader is referred to [38] for the detailed study design, including demographic information and inclusion/exclusion criteria. Briefly, children between the ages of 3 and 10 years were enrolled to assess levels of oxidative stress. ASD was defined by the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, the Autism Diagnostic Observation Schedule (ADOS), and/or the Childhood Autism Rating Scales (CARS; score > 30). FOCM/TS metabolites from 83 case (ASD), 47 sibling (SIB), and 76 age-matched control (NEU) children were used for classification. The metabolites under investigation are tabulated in Table 1, and additional details of these measurements and derivations are presented in [38]. Of the 83 participants on the autism spectrum, 55 also had Vineland II scores recorded for use in regression analysis (range 46–106). The Vineland Adaptive Behavior Composite evaluates adaptive skills across the domains of communication, socialization, daily living skills, and motor skills through a semi-structured caregiver interview [55]. Data are available in S1 Dataset.

Fisher Discriminant Analysis (FDA) is a dimensionality reduction tool that seeks to maximize differences between multiple classes. Specifically, for n samples of m measurements associated with k different classes, the between-cluster variability S_B is defined as

S_B = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})^T

where \bar{x}_i represents the mean vector of class i, \bar{x} represents the mean vector of all samples, and n_i represents the number of samples in class i. The within-cluster variation is defined as

S_W = \sum_{i=1}^{k} \sum_{j \in i} (x_j - \bar{x}_i)(x_j - \bar{x}_i)^T

where x_j represents an individual sample.
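The two scatter matrices defined above can be sketched as follows (a minimal NumPy sketch; the function and variable names are illustrative, not from the paper):

```python
import numpy as np

def scatter_matrices(X, labels):
    """Between-class (S_B) and within-class (S_W) scatter matrices.

    X      : (n, m) array, one row per sample
    labels : length-n array of class assignments
    """
    overall_mean = X.mean(axis=0)
    m = X.shape[1]
    S_B = np.zeros((m, m))
    S_W = np.zeros((m, m))
    for c in np.unique(labels):
        Xc = X[labels == c]                      # samples in class c
        mean_c = Xc.mean(axis=0)
        d = (mean_c - overall_mean)[:, None]
        S_B += len(Xc) * d @ d.T                 # n_i (mean_i - mean)(mean_i - mean)^T
        S_W += (Xc - mean_c).T @ (Xc - mean_c)   # sum_j (x_j - mean_i)(x_j - mean_i)^T
    return S_B, S_W
```

A useful sanity check is the identity S_B + S_W = total scatter of the pooled data about the overall mean.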
FDA seeks to find at most k − 1 vectors w that maximize

J(w) = \frac{w^T S_B w}{w^T S_W w}.

In other words, FDA seeks linear combinations of variables that project samples in the same group close to each other and samples in different groups far away from each other. The solution to this optimization problem is the set of generalized eigenvectors associated with the k − 1 largest generalized eigenvalues of S_W^{-1} S_B.

Kernel density estimation attempts to determine the underlying probability distribution function from a set of reference samples. The main assumption is that additional samples are likely to be found near the reference samples [56–58]. Using a Gaussian kernel, this assumption is formulated into an algorithm by associating a kernel function K\left(\frac{x - x_i}{\sigma}\right) with each observation x_i. Here, x is the additional sample and \sigma is the kernel parameter that controls the shape of the distribution function. The estimated density function \hat{f}(x) is then given by

\hat{f}(x) = \frac{1}{n\sigma} \sum_{i=1}^{n} K\left(\frac{x - x_i}{\sigma}\right)

where n is the number of reference samples. The kernel parameter \sigma is chosen to minimize the mean integrated squared error (MISE) between the unknown density function f(x) and the estimated density function \hat{f}(x),

MISE(\sigma) = \int_{-\infty}^{\infty} \left[ f(x) - \hat{f}(x) \right]^2 dx,

using a cross-validatory approach [56].

Kernel techniques provide general nonlinear extensions to the popular linear partial least squares (PLS) regression. The KPLS algorithm commences by defining a nonlinear transformation f = \psi(x) on the predictor set x. In this work, \psi(x) is a Gaussian kernel. Rather than regressing on x as in linear PLS, y is regressed onto the high-dimensional feature space f [42, 43].

To avoid over-fitting and over-stating results, leave-one-out cross-validation is employed in both the FDA and KPLS analyses. The approach leaves out a single sample, fits an FDA or KPLS model, and evaluates the prediction of the sample left out. This scheme is repeated for each sample.
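The generalized eigenvector solution for the FDA directions can be sketched in NumPy as follows (a sketch assuming S_W is invertible; since np.linalg.eig may return complex-typed results for a nonsymmetric product, the real parts are taken):

```python
import numpy as np

def fda_directions(S_B, S_W, k):
    """Return the k-1 discriminant directions: eigenvectors of
    S_W^{-1} S_B with the largest eigenvalues (S_W assumed invertible)."""
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1]   # largest eigenvalues first
    return eigvecs[:, order[:k - 1]].real
```

For two well-separated classes (k = 2), the single returned direction aligns with the within-class-whitened difference of the class means.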
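The Gaussian kernel density estimate above admits a short sketch (names are illustrative; in practice \sigma would be tuned by the MISE cross-validation described above rather than fixed by hand):

```python
import numpy as np

def gaussian_kde(x, refs, sigma):
    """Estimate f_hat(x) = (1/(n*sigma)) * sum_i K((x - x_i)/sigma)
    with Gaussian kernel K(u) = exp(-u^2/2)/sqrt(2*pi).
    `refs` holds the n reference samples; `x` may be an array of query points."""
    u = (np.asarray(x)[None, :] - refs[:, None]) / sigma   # (n_refs, n_query)
    K = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return K.mean(axis=0) / sigma                          # average kernels, scale by 1/sigma
```

Because each kernel integrates to one, the resulting estimate integrates to one as well.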
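The paper's KPLS implementation is not reproduced here; as a hedged illustration of regressing y in a Gaussian-kernel feature space, the sketch below uses kernel ridge regression, a simpler stand-in for KPLS (the function names and the regularization parameter `lam` are illustrative assumptions, not from the paper):

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Pairwise Gaussian kernel K(a, b) = exp(-||a - b||^2 / (2*sigma^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma**2))

def kernel_ridge_fit(X, y, sigma, lam):
    """Solve (K + lam*I) alpha = y; dual coefficients for the fitted model."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def kernel_ridge_predict(alpha, X_train, X_query, sigma):
    """Predictions are kernel evaluations against the training set times alpha."""
    return gaussian_kernel(X_query, X_train, sigma) @ alpha
```

KPLS differs in that it extracts a small number of latent components from the kernel matrix rather than regularizing with a ridge penalty, but the feature-space regression idea is the same.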
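The leave-one-out scheme described above can be sketched generically (the `fit`/`predict` callables and the ordinary-least-squares placeholder below stand in for the paper's FDA and KPLS models):

```python
import numpy as np

def loocv_predictions(X, y, fit, predict):
    """Refit with each sample held out and record the held-out prediction."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        mask = np.arange(len(y)) != i        # leave sample i out
        model = fit(X[mask], y[mask])
        preds[i] = predict(model, X[i:i + 1])
    return preds

# Placeholder model: ordinary least squares instead of FDA/KPLS.
ols_fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
ols_predict = lambda beta, Xq: (Xq @ beta)[0]
```

Comparing `preds` against the recorded outcomes then gives an out-of-sample estimate of model performance.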