Invention Grant
- Patent Title: System and a method for associating contextual structured data with unstructured documents on map-reduce
- Application No.: US15229485
- Application Date: 2016-08-05
- Publication No.: US10915537B2
- Publication Date: 2021-02-09
- Inventor: Manish A. Bhide, Himanshu Gupta, Mukesh K. Mohania, Scott Schumacher
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Edward J. Wixted, III
- Main IPC: G06F16/2457
- IPC: G06F16/2457; G06F16/93; G06F16/33

Abstract:
In an approach for integrating documents, a processor extracts a first set of keywords from at least one structured document. A processor generates a first batch of keywords from the first set of keywords, wherein each keyword in the first batch of keywords includes a weight. A processor extracts a second set of keywords from at least one unstructured document. A processor compares the first batch of keywords to the second set of keywords. A processor determines that the at least one unstructured document matches, based on a predetermined threshold, the at least one structured document, based on the comparison of the first batch of keywords to the second set of keywords. A processor removes the at least one unstructured document from a list of unstructured documents which are to be processed.
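
The steps in the abstract can be illustrated with a minimal single-process sketch in Python. It is not the patented implementation: the tokenizer, the frequency-based weighting, the batch size, the threshold value, and every function name (`extract_keywords`, `build_weighted_batch`, `match_score`, `associate`) are assumptions chosen for illustration, and the map-reduce distribution named in the title is not modelled.

```python
import re
from collections import Counter


def extract_keywords(text):
    """Hypothetical extractor: lowercase alphabetic tokens of 3+ letters."""
    return re.findall(r"[a-z]{3,}", text.lower())


def build_weighted_batch(structured_records, batch_size=5):
    """Build a batch of keywords, each with a weight.

    Assumption: the weight is the keyword's relative frequency across all
    field values of the structured records; the abstract does not say how
    weights are derived.
    """
    counts = Counter()
    for record in structured_records:
        for value in record.values():
            counts.update(extract_keywords(str(value)))
    total = sum(counts.values())
    return {kw: n / total for kw, n in counts.most_common(batch_size)}


def match_score(batch, doc_keywords):
    """Sum the weights of batch keywords that occur in the document."""
    present = set(doc_keywords)
    return sum(weight for kw, weight in batch.items() if kw in present)


def associate(structured_records, pending_docs, threshold=0.5):
    """Split pending documents into matched and still-to-be-processed.

    pending_docs is a list of (doc_id, text) pairs. A document whose score
    reaches the (assumed) threshold is treated as matched and is left out
    of the returned list of documents that remain to be processed.
    """
    batch = build_weighted_batch(structured_records)
    matched, remaining = [], []
    for doc_id, text in pending_docs:
        score = match_score(batch, extract_keywords(text))
        (matched if score >= threshold else remaining).append(doc_id)
    return matched, remaining
```

Calling `associate` with a list of record dictionaries and a list of `(doc_id, text)` pairs returns the matched document ids and the ids still awaiting processing, which corresponds to removing matched documents from the processing list.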
Public/Granted literature
- US20170060915A1 SYSTEM AND A METHOD FOR ASSOCIATING CONTEXTUAL STRUCTURED DATA WITH UNSTRUCTURED DOCUMENTS ON MAP-REDUCE, Public/Granted day: 2017-03-02