Monitoring information processing systems utilizing co-clustering of strings in different sets of data records
Abstract:
An apparatus includes a processing device configured to obtain first and second sets of data records, each data record comprising a string associated with an attribute. The processing device is also configured to generate a similarity matrix, wherein entries of the similarity matrix comprise values characterizing similarity between respective pairs of the strings, each pair comprising a first string from a data record in the first set and a second string from a data record in the second set. The processing device is further configured to construct, based on the similarity matrix, a graph network comprising edges that connect pairs of the data records according to the values of entries in the similarity matrix, to perform a clustering operation on the graph network to identify clusters, and to initiate remedial action responsive to identifying a given cluster comprising at least one data record from each of the first and second sets of data records.
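The steps in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not name a similarity measure, an edge rule, or a clustering algorithm, so the sketch assumes `difflib.SequenceMatcher` ratios for similarity, a hypothetical `threshold` of 0.8 for adding an edge, and connected components as the clustering operation. The function names `similarity_matrix` and `co_clusters` are likewise illustrative.

```python
from difflib import SequenceMatcher


def similarity_matrix(first, second):
    """Rows index strings of the first set, columns index strings of the second.

    Assumption: similarity is the SequenceMatcher ratio on lowercased strings.
    """
    return [
        [SequenceMatcher(None, a.lower(), b.lower()).ratio() for b in second]
        for a in first
    ]


def co_clusters(first, second, threshold=0.8):
    """Return clusters that hold at least one record from each set.

    Each cluster is a sorted list of nodes: ("A", i) is first[i] and
    ("B", j) is second[j]. The threshold value is an assumption.
    """
    sim = similarity_matrix(first, second)

    # Build a bipartite graph: an edge joins first[i] and second[j]
    # when their similarity entry reaches the threshold.
    adjacency = {("A", i): set() for i in range(len(first))}
    adjacency.update({("B", j): set() for j in range(len(second))})
    for i, row in enumerate(sim):
        for j, value in enumerate(row):
            if value >= threshold:
                adjacency[("A", i)].add(("B", j))
                adjacency[("B", j)].add(("A", i))

    # Assumed clustering operation: connected components via depth-first search.
    seen, clusters = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        # Edges only run between the two sets, so any component with
        # more than one node contains records from both sets.
        if len(component) > 1:
            clusters.append(sorted(component))
    return clusters
```

A caller would iterate over the returned clusters and trigger whatever remedial action applies to each one, for example raising an alert for the matched records. Connected components is the simplest choice here; a community-detection or spectral method could be substituted without changing the surrounding steps.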