Clustering and labeling streamed data
Abstract:
Aspects extend to methods, systems, and computer program products for clustering streamed or batch data. Aspects of the invention include dynamic clustering and labeling of streamed data and/or batch data, including failure and error logs (user, platform, etc.), latency logs, warning logs, information logs, Virtual Machine (VM) creation data logs, template logs, etc., for use in analysis (e.g., error log analysis). A clustering system can learn from previously identified patterns and use that information to group newer data dynamically as it is generated. The clustering system can leverage domain knowledge of the streamed data and/or batch data for preprocessing. In one aspect, the clustering system uses a similarity measure. Based on a similarity threshold (e.g., one configured by users), the clustering system clusters (e.g., automatically) streamed data and/or batch data into groups.
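The threshold-based grouping described in the abstract can be illustrated with a minimal sketch. The abstract does not name a particular similarity measure or preprocessing step, so the Jaccard similarity over token sets, the masking of numbers and hex identifiers, and the names `preprocess`, `jaccard`, and `StreamClusterer` are all assumptions made for illustration only, not the claimed method.

```python
import re


def preprocess(line):
    """Assumed domain-knowledge step: mask volatile fields, then tokenize."""
    line = line.lower()
    line = re.sub(r"0x[0-9a-f]+", "<hex>", line)  # mask hex identifiers
    line = re.sub(r"\d+", "<num>", line)          # mask counters, ids, durations
    return frozenset(re.findall(r"[a-z_<>]+", line))


def jaccard(a, b):
    """Jaccard similarity of two token sets (an assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class StreamClusterer:
    """Assigns each incoming record to the most similar existing cluster,
    or opens a new cluster when no similarity reaches the threshold."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold  # user-configurable similarity threshold
        self.clusters = []          # list of (representative tokens, members)

    def add(self, line):
        tokens = preprocess(line)
        best_id, best_sim = None, 0.0
        for cid, (rep, _members) in enumerate(self.clusters):
            sim = jaccard(tokens, rep)
            if sim > best_sim:
                best_id, best_sim = cid, sim
        if best_id is not None and best_sim >= self.threshold:
            self.clusters[best_id][1].append(line)
            return best_id
        # No sufficiently similar pattern seen so far: start a new cluster.
        self.clusters.append((tokens, [line]))
        return len(self.clusters) - 1


clusterer = StreamClusterer(threshold=0.6)
first = clusterer.add("VM 1234 creation failed: timeout after 30 s")
second = clusterer.add("VM 98 creation failed: timeout after 45 s")
third = clusterer.add("Disk quota exceeded for user alice")
```

In this sketch the two VM creation failures land in the same cluster because masking removes the fields that differ, while the quota error opens a new cluster. Raising the threshold produces more, tighter groups; lowering it merges more records into fewer groups.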