- Patent Title: System and method for analyzing result of clustering massive data
- Application No.: US14009907
- Application Date: 2012-10-31
- Publication No.: US10402427B2
- Publication Date: 2019-09-03
- Inventor: Chae Hyun Lee, Min Soeng Kim, Jun Sup Lee
- Applicant: SK Planet Co., Ltd.
- Applicant Address: KR Seongnam-si
- Assignee: SK PLANET CO., LTD.
- Current Assignee: SK PLANET CO., LTD.
- Current Assignee Address: KR Seongnam-si
- Agency: Brinks Gilson & Lione
- Priority: KR10-2012-0035995 20120406
- International Application: PCT/KR2012/008986 WO 20121031
- International Announcement: WO2013/151221 WO 20131010
- Main IPC: G06F16/28
- IPC: G06F16/28; G06F16/31; G06K9/62; G06F16/35

Abstract:
Disclosed are a system and a method for analyzing a result of clustering massive data. Hadoop, an open-source map/reduce framework, is used to calculate a silhouette coefficient, which serves as a significance verification index for evaluating a result of clustering massive data. To implement the system and the method, the clustered data is divided into blocks, input splits are generated for all of the blocks, and the generated input splits are assigned to multiple computers. Each computer stores in memory only the data of the blocks included in the input split assigned to it, and calculates a silhouette coefficient for each record. Each computer provides only the calculated silhouette coefficients to an index coefficient calculation apparatus, which then calculates a silhouette coefficient for each cluster. Therefore, the result of clustering the massive data can be analyzed rapidly and objectively.
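
The flow described in the abstract can be illustrated with a small single-process sketch. It is not the patented implementation and does not use Hadoop: the function names (`record_silhouette`, `make_splits`, `worker`, `cluster_silhouettes`), the Euclidean distance, and the block and split sizes are all assumptions made for illustration. For brevity, each simulated worker here can read the full data set when looking up distances, whereas the abstract states that a computer holds only the blocks of its assigned input split in memory. The per-record value is the standard silhouette coefficient s = (b - a) / max(a, b), where a is the mean distance to records in the same cluster and b is the smallest mean distance to the records of any other cluster.

```python
from collections import defaultdict
from math import dist  # Euclidean distance (Python 3.8+); the metric is an assumption


def record_silhouette(idx, points, labels):
    """Standard silhouette coefficient s = (b - a) / max(a, b) for one record."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for j, (p, lab) in enumerate(zip(points, labels)):
        if j == idx:
            continue  # a record is not compared with itself
        sums[lab] += dist(points[idx], p)
        counts[lab] += 1
    own = labels[idx]
    if counts[own] == 0:
        return 0.0  # common convention for a single-record cluster
    a = sums[own] / counts[own]
    others = [sums[lab] / counts[lab] for lab in counts if lab != own]
    if not others:
        return 0.0  # only one cluster exists, so b is undefined
    b = min(others)
    denom = max(a, b)
    return 0.0 if denom == 0 else (b - a) / denom


def make_splits(n_records, block_size, blocks_per_split):
    """Divide record indices into blocks, then group blocks into input splits."""
    blocks = [list(range(start, min(start + block_size, n_records)))
              for start in range(0, n_records, block_size)]
    return [blocks[start:start + blocks_per_split]
            for start in range(0, len(blocks), blocks_per_split)]


def worker(split, points, labels):
    """Stand-in for one computer: emit (cluster label, coefficient) per record."""
    return [(labels[i], record_silhouette(i, points, labels))
            for block in split for i in block]


def cluster_silhouettes(points, labels, block_size=2, blocks_per_split=2):
    """Stand-in for the index coefficient calculation apparatus.

    It receives only per-record coefficients and averages them per cluster.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for split in make_splits(len(points), block_size, blocks_per_split):
        for lab, s in worker(split, points, labels):
            totals[lab] += s
            counts[lab] += 1
    return {lab: totals[lab] / counts[lab] for lab in totals}


if __name__ == "__main__":
    # Two well-separated clusters; both cluster-level values should be close to 1.
    points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    labels = [0, 0, 1, 1]
    print(cluster_silhouettes(points, labels))
```

In this sketch the aggregation step sees only (cluster label, coefficient) pairs, mirroring the statement that each computer provides only the calculated coefficients. Averaging the per-record values within a cluster is one common way to obtain a cluster-level coefficient; the abstract itself does not specify the aggregation.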
Public/Granted literature
- US20150032759A1 SYSTEM AND METHOD FOR ANALYZING RESULT OF CLUSTERING MASSIVE DATA Public/Granted day: 2015-01-29