Publication number: US09965460B1
Publication date: 2018-05-08
Application number: US15394436
Application date: 2016-12-29
Applicant: KONICA MINOLTA LABORATORY U.S.A., INC.
Inventors: Julie Wasiuk, Daniel Barber
CPC classification numbers: G06F17/278, G06F17/2705, G06F17/271, G06F17/2765, G06F17/277
Abstract: Disclosed herein is a method of extracting keywords from a document based on statistical, positional, and natural-language data, as well as on relationship maps between the keywords. In this method, document data are processed to obtain an NLP result for each sentence of the document; based on the NLP result, words in the document are filtered and grouped into terms; and a frequency analysis and a co-occurrence analysis are performed over the terms to output one or more keywords representing the document.
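
The pipeline named in the abstract (per-sentence processing, filtering and grouping words into terms, then frequency and co-occurrence analysis) can be illustrated with a minimal Python sketch. This is not the patented method: the stopword filter stands in for the NLP result, and the function names, the `STOPWORDS` set, and the scoring weight of 0.5 are all assumptions made for illustration only.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stopword list standing in for a real NLP filter (assumption).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "on"}

def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def sentence_terms(sentence):
    """Filter stopwords and group adjacent remaining words into terms."""
    terms, current = [], []
    for token in re.findall(r"[A-Za-z]+", sentence.lower()):
        if token in STOPWORDS:
            if current:  # a stopword closes the term being built
                terms.append(" ".join(current))
                current = []
        else:
            current.append(token)
    if current:
        terms.append(" ".join(current))
    return terms

def extract_keywords(text, top_n=3):
    """Rank terms by frequency plus a weighted co-occurrence count."""
    freq = Counter()  # how often each term appears in the document
    cooc = Counter()  # how many same-sentence pairs each term takes part in
    for sentence in split_sentences(text):
        terms = sentence_terms(sentence)
        freq.update(terms)
        for a, b in combinations(sorted(set(terms)), 2):
            cooc[a] += 1
            cooc[b] += 1
    # Assumed scoring rule: frequency plus half the co-occurrence count.
    scores = {t: freq[t] + 0.5 * cooc[t] for t in freq}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]

if __name__ == "__main__":
    sample = ("The keyword extraction is useful. "
              "A keyword extraction of the document is fast. "
              "The document is short.")
    print(extract_keywords(sample, top_n=2))
```

A real implementation would presumably replace `sentence_terms` with part-of-speech tagging or parsing and could add the positional data the abstract mentions; the sketch only shows how the frequency and co-occurrence counts can be combined into one ranking.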