To describe the probabilistic inference algorithm used for estimating the parameters, we first write the likelihood of the entire observed network adjacency matrix in terms of our model (1); this is equation (3). For a dyadic model, the likelihood factorizes into terms that each involve the parameters associated with only two nodes. Commonly, the parameters and hidden variables of such a model are estimated with maximum likelihood (ML) techniques, either in the form of an expectation-maximization type algorithm or via Monte Carlo sampling [40]. We prefer a Bayesian approach based on maximum a posteriori (MAP) estimates, which does not incur the computational cost of Monte Carlo sampling while being less sensitive to initial conditions and numerically more stable than ML, especially as the parameters that maximize (3) may lie on the borders of the admissible interval. Furthermore, the MAP approach provides a natural Occam's razor: the posterior distributions of the parameter estimates can only shrink in variance as more data are provided, whereas the ML approach assumes point estimates, i.e. delta functions, for the posterior from the start. This is an important feature of the Bayesian approach, as it provides a natural limit on the number of inferred classes and confidence levels for the assignments. Classes cannot be arbitrarily small if the posterior for the inter-class link preference is to be localized. In contrast, under an ML approach the likelihood increases monotonically as more, and hence smaller, classes are used, so that model selection criteria, as in [19], are needed. Finally, Bayesian techniques offer a principled way to incorporate prior domain knowledge when seeking an accurate approximate marginal posterior distribution for each of the model parameters.

A message-passing, or belief propagation, algorithm provides a principled way to calculate these approximate posterior marginal distributions [41], [42]. The starting point for this algorithm is a so-called factor or dependency graph: a graphical representation of the probabilistic dependencies between the variables (model parameters) we wish to infer from the data and the individual factors that constitute the likelihood (3). Figure 5A shows this for the case of a bipartite network, the likelihood (3) and model (1).
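As an illustrative sketch of such a dyad-factorized likelihood (the exact parametrization of model (1) is not reproduced here, so the symbols below are assumptions made for illustration), consider a Bernoulli link model in which actor $i$ carries class label $\sigma_i$ and activity $a_i$, event $j$ carries class label $\tau_j$ and popularity $b_j$, and classes interact through a preference matrix $\pi$:

\[ P(A \mid \sigma, \tau, a, b, \pi) \;=\; \prod_{i,j} p_{ij}^{A_{ij}} \,(1 - p_{ij})^{1 - A_{ij}}, \qquad p_{ij} = a_i\, b_j\, \pi_{\sigma_i \tau_j}. \]

Each factor in the product involves only the parameters of the two nodes forming the dyad $(i,j)$, which is precisely the factorization property exploited by the message-passing scheme described below.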
Figure data removed from full text. Figure identifier and caption: 10.1371/journal.pone.0021282.g005 Factor graphs and an example of an elementary message passing update. Factors of the likelihood function are represented as squares, variables of the generative model as circles. Connections indicate which variables enter the calculation of which factor. a) For a bipartite actor-event network represented by an adjacency matrix, the class label and activity of an actor enter in the calculation of all factors in the corresponding row. Equivalently, the class label and popularity of an event enter in the calculation of all factors in the corresponding column. The variables denoting the preference of actors in one class for events in another enter in every factor. Note that while each factor depends on only a handful of variables, the actor variables enter in the calculation of a full row of factors, the event variables in a full column, and the preference variables in all factors. b) Pictorial representation of the messages involved in calculating the message sent from a factor to a variable according to equation (9). c) For directed networks represented by non-symmetric adjacency matrices, the factors correspond to dyads. In addition to the interclass preference matrix, a symmetric matrix of reciprocities is included in the model. Every node carries a single class label, activity and attractiveness parameter. The variables associated with a node enter in the calculation of factors in both the corresponding row and column.
The algorithm proceeds by exchanging messages (conditional probabilities) between factors and variables connected in the dependency graph until convergence. Using the definitions (4), one can interpret an R-Message as the likelihood of a single observed matrix entry given only the corresponding parameter and the entire data matrix except for that entry. Equally, a Q-Message is interpreted as the posterior probability distribution of a parameter given the entire data matrix except for one entry. For the sake of notational economy, we identify functions by their arguments; messages with different arguments are thus different functions, not the same function evaluated at different points, as should be clear from the definitions (4). Formally, we obtain the R-Message sent from a factor to a variable by integrating out all other parameters from a likelihood function, as in (5). Using the independence of the given data entries, we can readily identify this likelihood with the corresponding factor of (1). Assuming that the joint distribution factorizes with respect to every single variable, one obtains the closed set of equations (6). Although the factorization assumption may seem strong, it merely means that the Q-Messages for any two distinct variables are assumed independent. Given that these distributions are conditioned on the entire data matrix except for a single entry, the error introduced by this assumption is negligible for large systems. The form of the Q-Message in (6) follows directly from Bayes' theorem and is the distribution through which we include prior information. These equations can be iterated until convergence, after which we finally obtain the desired approximate marginal posterior distribution for every single parameter, as given in (7).

To illustrate these ideas, explicit update equations for the inference of the hidden class index of a node are given below; expressions for the other parameters are reported in Material S1. With the abbreviations in (8), we can write the R- and Q-Messages exchanged between factors and class variables as in (9). The dependency graph greatly facilitates setting up these update equations. The rules are that R-Messages are always sent from factors to variables and Q-Messages from variables to factors; in an R-Message we sum or integrate over the incoming Q-Messages, while a Q-Message is proportional to the product of the incoming R-Messages. Following these rules, the update equations can be read off the dependency graph directly. Figure 5B shows a detail of Figure 5A, focusing on a single factor to illustrate the messages involved in calculating the R-Message it sends to a class variable, as in (9). Figure 5C illustrates the update equations in the case of directed uni-partite networks (cf. Material S1).
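To make the update scheme concrete, the following minimal Python sketch iterates the class-label messages for a bipartite network. It is not the code of the paper or of Material S1: it assumes a Bernoulli link model with fixed activities, popularities and interclass preferences, a flat prior over classes, and a damped update for stability, and all function and variable names (e.g. bp_class_labels, q_actor, r_event) are illustrative.

```python
import numpy as np

def bp_class_labels(A, pi, a, b, n_iter=50, damping=0.5, eps=1e-12):
    """Loopy belief propagation for hidden class labels in a bipartite
    actor-event network. Illustrative sketch only: activities a, popularities b
    and the interclass preference matrix pi are held fixed, and a flat prior
    over classes is assumed."""
    N, M = A.shape
    K, L = pi.shape

    # Bernoulli factor f_ij(sigma, tau) for every dyad and class combination.
    link_prob = np.clip(a[:, None, None, None] * b[None, :, None, None]
                        * pi[None, None, :, :], eps, 1 - eps)        # (N, M, K, L)
    f = np.where(A[:, :, None, None] == 1, link_prob, 1 - link_prob)

    # Q-messages from class variables to factors, initialised uniformly.
    q_actor = np.full((N, M, K), 1.0 / K)    # Q_{sigma_i -> f_ij}
    q_event = np.full((N, M, L), 1.0 / L)    # Q_{tau_j  -> f_ij}

    def normalise(log_msg):
        # Normalise a message given in log space, guarding against underflow.
        log_msg = log_msg - log_msg.max(axis=-1, keepdims=True)
        msg = np.exp(log_msg)
        return msg / msg.sum(axis=-1, keepdims=True)

    for _ in range(n_iter):
        # R-messages: integrate the factor against the incoming Q-message.
        r_actor = np.einsum('ijkl,ijl->ijk', f, q_event)   # factor -> sigma_i
        r_event = np.einsum('ijkl,ijk->ijl', f, q_actor)   # factor -> tau_j
        log_r_actor = np.log(np.clip(r_actor, eps, None))
        log_r_event = np.log(np.clip(r_event, eps, None))

        # Q-messages: product of all *other* incoming R-messages (cavity product),
        # done in log space by subtracting the excluded factor's contribution.
        new_q_actor = normalise(log_r_actor.sum(axis=1, keepdims=True) - log_r_actor)
        new_q_event = normalise(log_r_event.sum(axis=0, keepdims=True) - log_r_event)

        # Damped update improves numerical stability on loopy graphs.
        q_actor = damping * q_actor + (1 - damping) * new_q_actor
        q_event = damping * q_event + (1 - damping) * new_q_event

    # Approximate posterior marginals: product of all incoming R-messages.
    log_r_actor = np.log(np.clip(np.einsum('ijkl,ijl->ijk', f, q_event), eps, None))
    log_r_event = np.log(np.clip(np.einsum('ijkl,ijk->ijl', f, q_actor), eps, None))
    marg_actor = normalise(log_r_actor.sum(axis=1))
    marg_event = normalise(log_r_event.sum(axis=0))
    return marg_actor, marg_event
```

On a small synthetic adjacency matrix, calling marg_actor, marg_event = bp_class_labels(A, pi, a, b) returns, for every actor and event, an approximate posterior distribution over class labels, in the spirit of (7); updating the activity, popularity and preference parameters as well would follow the same factor-to-variable and variable-to-factor rules stated above.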