Method for clustering nodes of a textual network taking into account textual content, computer-readable storage device and system implementing said method
Abstract:
The invention relates to a method for clustering the nodes of a network in which nodes are connected by message edges carrying text data. The method comprises an initialization step, in which a first initial clustering of the nodes is determined, and a step of iterative inference of a generative model of text documents. The edges are modeled with a Stochastic Block Model (SBM), and the sets of documents exchanged between and within clusters are modeled according to a generative model of documents. The inference step comprises iteratively modelling the text documents and the underlying topics of their textual content, and updating the clustering as a function of that modelling, until a convergence criterion is fulfilled. An optimized clustering and the corresponding optimized values of the model parameters are then output.
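The alternating structure described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method: the abstract does not specify the topic model, the update rule, or the convergence criterion, so the sketch makes loud simplifying assumptions. A smoothed unigram word distribution per pair of clusters is swapped in for the generative topic model, the clustering is updated by greedy single-node reassignment, and convergence means that a full sweep moves no node. All names (`fit_parameters`, `log_likelihood`, `cluster_nodes`) and the smoothing constants are hypothetical.

```python
import math
from collections import Counter


def fit_parameters(edges, labels, n_clusters, vocab, alpha=1.0):
    """Estimate SBM edge probabilities and word log-probabilities per cluster pair."""
    sizes = Counter(labels)
    edge_counts = [[0] * n_clusters for _ in range(n_clusters)]
    word_counts = [[Counter() for _ in range(n_clusters)] for _ in range(n_clusters)]
    for (i, j), words in edges.items():
        q, r = labels[i], labels[j]
        edge_counts[q][r] += 1
        word_counts[q][r].update(words)
    pi = [[0.0] * n_clusters for _ in range(n_clusters)]
    logp = [[None] * n_clusters for _ in range(n_clusters)]
    for q in range(n_clusters):
        for r in range(n_clusters):
            # Number of ordered node pairs (no self-loops) between clusters q and r.
            pairs = sizes[q] * sizes[r] - (sizes[q] if q == r else 0)
            # Assumed smoothing keeps pi strictly inside (0, 1) so the logs are finite.
            pi[q][r] = (edge_counts[q][r] + 0.5) / (pairs + 1.0)
            total = sum(word_counts[q][r].values()) + alpha * len(vocab)
            # Unigram model with add-alpha smoothing (a stand-in for a topic model).
            logp[q][r] = {w: math.log((word_counts[q][r][w] + alpha) / total)
                          for w in vocab}
    return pi, logp


def log_likelihood(edges, labels, pi, logp, n_nodes):
    """Joint log-likelihood of the edges and of the words carried by each edge."""
    ll = 0.0
    for i in range(n_nodes):
        for j in range(n_nodes):
            if i == j:
                continue
            q, r = labels[i], labels[j]
            if (i, j) in edges:
                ll += math.log(pi[q][r])
                ll += sum(logp[q][r][w] for w in edges[(i, j)])
            else:
                ll += math.log(1.0 - pi[q][r])
    return ll


def cluster_nodes(edges, n_nodes, n_clusters, init_labels, max_iter=20):
    """Alternate parameter fitting and greedy relabelling until no node moves."""
    labels = list(init_labels)
    vocab = sorted({w for words in edges.values() for w in words})
    pi, logp = fit_parameters(edges, labels, n_clusters, vocab)
    for _ in range(max_iter):
        changed = False
        for i in range(n_nodes):
            original = labels[i]
            best_q = original
            best_ll = log_likelihood(edges, labels, pi, logp, n_nodes)
            for q in range(n_clusters):
                if q == original:
                    continue
                labels[i] = q
                ll = log_likelihood(edges, labels, pi, logp, n_nodes)
                # Move only on a strict improvement, so ties keep the current label.
                if ll > best_ll + 1e-9:
                    best_q, best_ll = q, ll
            labels[i] = best_q
            changed = changed or best_q != original
        # Refit so the returned parameters match the returned clustering.
        pi, logp = fit_parameters(edges, labels, n_clusters, vocab)
        if not changed:
            break
    return labels, pi, logp
```

In this sketch `edges` is assumed to be a dictionary mapping a directed node pair `(i, j)` to the list of word tokens in the documents sent from `i` to `j`, and `init_labels` plays the role of the first initial clustering. Each candidate move recomputes the full likelihood, which is only practical for toy networks; a real implementation would update the affected terms incrementally.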