VEM algorithm to adjust the noisy stochastic block model to an observed dense adjacency matrix
Description
fitNSBM() estimates the model parameters of the noisy stochastic block model and provides a clustering of the nodes.
Usage
fitNSBM(
dataMatrix,
model = "Gauss0",
sbmSize = list(Qmin = 1, Qmax = NULL, explor = 1.5),
filename = NULL,
initParam = list(nbOfTau = NULL, nbOfPointsPerTau = NULL, maxNbOfPasses = NULL,
minNbOfPasses = 1),
nbCores = parallel::detectCores()
)
Arguments
dataMatrix |
observed dense adjacency matrix
|
model |
Implemented models:
Gauss
-
all Gaussian parameters of the null and the alternative distributions are unknown; this is the Gaussian model with the maximum number of unknown parameters
Gauss0
-
compared to Gauss, the mean of the null distribution is set to 0
Gauss01
-
compared to Gauss, the null distribution is set to N(0,1)
GaussEqVar
-
compared to Gauss, all Gaussian variances (of both the null and the alternative) are assumed to be equal, but unknown
Gauss0EqVar
-
compared to GaussEqVar, the mean of the null distribution is set to 0
Gauss0Var1
-
compared to Gauss, all Gaussian variances are set to 1 and the null distribution is set to N(0,1)
Gauss2distr
-
the alternative distribution is a single Gaussian distribution, i.e. the block memberships of the nodes do not influence the alternative distribution
GaussAffil
-
compared to Gauss, the alternative has one distribution for inter-group interactions and another for intra-group interactions
Exp
-
the null and the alternatives are all exponential distributions (i.e. Gamma distributions with shape parameter equal to one) with unknown scale parameters
ExpGamma
-
the null distribution is an unknown exponential, the alternative distributions are Gamma distributions with unknown parameters
|
sbmSize |
list of parameters determining the size of the SBM (the number of latent blocks) to be explored
Qmin
-
minimum number of latent blocks
Qmax
-
maximum number of latent blocks
explor
-
if Qmax is not provided, then Qmax is automatically determined as explor times the number of blocks where the ICL is maximal
|
filename |
results are saved in a file with this name (if provided)
|
initParam |
list of parameters that fix the number of initializations
nbOfTau
-
number of initial points for the node clustering (i.e. for the variational parameters tau)
nbOfPointsPerTau
-
number of initial points for the latent binary graph
maxNbOfPasses
-
maximum number of passes through the SBM models, that is, passes from Qmin to Qmax or vice versa
minNbOfPasses
-
minimum number of passes through the SBM models
|
nbCores |
number of cores used for parallelization
|
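The rule for explor described above can be illustrated with a few lines of base R. This is only a sketch of the arithmetic, not the package's internal code: the ICL values are made up, and rounding the product up with ceiling() is an assumption, since the documentation does not say how a non-integer result is handled.

```r
# Hypothetical ICL values for Q = 1, ..., 4 (illustrative numbers only)
icl <- c(-520.3, -498.7, -501.2, -510.9)
explor <- 1.5

# Number of blocks at which the ICL is maximal
bestQ <- which.max(icl)

# Candidate upper bound on the number of blocks to explore.
# ASSUMPTION: the product is rounded up; the actual rounding may differ.
Qmax <- ceiling(explor * bestQ)

cat("ICL is maximal at Q =", bestQ, "so models up to Q =", Qmax, "are explored\n")
```

With the default explor = 1.5, the search therefore continues somewhat beyond the current best number of blocks before stopping.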
Details
fitNSBM() supports different probability distributions for the edges and can estimate the number of node blocks.
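As a small check on how the exponential models relate to the Gamma family, the following base R sketch confirms that an exponential density is a Gamma density with shape parameter one. The evaluation points and the scale value are arbitrary illustrative choices and have nothing to do with any fitted model.

```r
# Arbitrary evaluation points and scale parameter (illustrative only)
x <- c(0.1, 0.5, 1, 2, 5)
scale <- 2

# Exponential density, parameterized by rate = 1 / scale
dExp <- dexp(x, rate = 1 / scale)

# Gamma density with shape 1 and the same scale
dGamma <- dgamma(x, shape = 1, scale = scale)

# The two densities agree up to numerical tolerance
sameDensity <- isTRUE(all.equal(dExp, dGamma))
print(sameDensity)
```

This is why Exp can be read as the special case of ExpGamma in which the alternative distributions are also restricted to shape one.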
Value
Returns a list of estimation results for all numbers of latent blocks considered by the algorithm.
Every element is a list composed of:
theta
-
estimated parameters of the noisy stochastic block model; a list with the following elements:
pi
-
parameter estimate of pi
w
-
parameter estimate of w
nu0
-
parameter estimate of nu0
nu
-
parameter estimate of nu
clustering
-
node clustering obtained by the noisy stochastic block model, more precisely, a hard clustering given by the
maximum a posteriori estimate of the variational parameters sbmParam$clusterProba
sbmParam
-
further results concerning the latent binary stochastic block model. A list with the following elements:
Q
-
number of latent blocks in the noisy stochastic block model
clusterProba
-
soft clustering given by the conditional probabilities of a node to belong to a given latent block.
In other words, these are the variational parameters tau; (Q x n)-matrix
edgeProba
-
conditional probabilities rho of an edge given the block memberships of the interacting nodes; (N_Q x N)-matrix
ICL
-
value of the ICL criterion at the end of the algorithm
convergence
-
a list of convergence indicators:
J
-
value of the lower bound of the log-likelihood function at the end of the algorithm
complLogLik
-
value of the complete log-likelihood function at the end of the algorithm
converged
-
indicates whether the algorithm has converged
nbIter
-
number of iterations performed
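A typical way to use the returned list is to pick the element with the largest ICL and read off a hard clustering from its soft clustering. The sketch below uses a hand-built mock object instead of a real fit, so it runs without the package: makeFit() is a hypothetical helper, all numbers are invented, and it assumes that ICL is stored under sbmParam, as the listing above suggests.

```r
# Hypothetical helper building one list element with the documented fields
# (only the fields used below are filled in)
makeFit <- function(Q, icl, tau) {
  list(sbmParam = list(Q = Q, clusterProba = tau, ICL = icl))
}

# Mock soft clusterings: (Q x n)-matrices, one column per node (n = 4)
tau1 <- matrix(1, nrow = 1, ncol = 4)
tau2 <- matrix(c(0.9, 0.1,
                 0.8, 0.2,
                 0.3, 0.7,
                 0.1, 0.9), nrow = 2)

# Mock result for Q = 1 and Q = 2 (invented ICL values)
res <- list(makeFit(1, -120.4, tau1), makeFit(2, -101.7, tau2))

# Select the model with the largest ICL
icl <- sapply(res, function(fit) fit$sbmParam$ICL)
best <- res[[which.max(icl)]]

# Hard clustering: for each node (column), the block with the largest probability
hardClustering <- apply(best$sbmParam$clusterProba, 2, which.max)
print(best$sbmParam$Q)
print(hardClustering)
```

On a real result, the same two steps apply to the object returned by fitNSBM(), and the hard clustering computed this way can be compared with the clustering element of the selected fit.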
Examples
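# Simulate a small network with n = 10 nodes from a noisy SBM with known parameters
n <- 10
theta <- list(pi= c(0.5,0.5), nu0=c(0,.1),
nu=matrix(c(-2,10,-2, 1,1,1),3,2), w=c(.5, .9, .3))
obs <- rnsbm(n, theta, modelFamily='Gauss')
# Fit with at most 3 latent blocks, a single initialization and one core to keep the run short
res <- fitNSBM(obs$dataMatrix, sbmSize = list(Qmax=3),
initParam=list(nbOfTau=1, nbOfPointsPerTau=1), nbCores=1)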