math.ST

(what is this?)

# Title: Optimal Bipartite Network Clustering

Abstract: We consider the problem of bipartite community detection in networks, or more generally the network biclustering problem. We present a fast two-stage procedure based on spectral initialization followed by the application of a pseudo-likelihood classifier twice. Under mild regularity conditions, we establish the weak consistency of the procedure (i.e., the convergence of the misclassification rate to zero) under a general bipartite stochastic block model. We show that the procedure is optimal in the sense that it achieves the optimal convergence rate that is achievable by a biclustering oracle, adaptively over the whole class, up to constants. The optimal rate we obtain sharpens some of the existing results and generalizes others to a wide regime of average degree growth. As a special case, we recover the known exact recovery threshold in the $\log n$ regime of sparsity. To obtain the general consistency result, as part of the provable version of the algorithm, we introduce a sub-block partitioning scheme that is also computationally attractive, allowing for distributed implementation of the algorithm without sacrificing optimality. The provable version of the algorithm is derived from a general blueprint for pseudo-likelihood biclustering algorithms that employ simple EM type updates. We show the effectiveness of this general class by numerical simulations.
 Subjects: Statistics Theory (math.ST); Social and Information Networks (cs.SI); Machine Learning (stat.ML) Cite as: arXiv:1803.06031 [math.ST] (or arXiv:1803.06031v1 [math.ST] for this version)

## Submission history

From: Zhixin Zhou [view email]
[v1] Thu, 15 Mar 2018 23:19:30 GMT (689kb,D)