System and method of automatic topic detection in text

    Publication (announcement) number: US12001797B2

    Publication (announcement) date: 2024-06-04

    Application number: US17318524

    Filing date: 2021-05-12

    CPC classification number: G06F40/289 G06F16/2468 G06F16/248 G06N3/04

    Abstract: A method and system for automatic topic detection in text may include receiving a text document of a corpus of documents and extracting one or more phrases from the document, based on one or more syntactic patterns. For each phrase, embodiments of the invention may: apply a word embedding neural network to one or more words of the phrase, to obtain one or more respective word embedding vectors; calculate a weighted phrase embedding vector; and compute a phrase saliency score, based on the weighted phrase embedding vector. Embodiments of the invention may subsequently produce one or more topic labels, representing one or more respective topics in the document, based on the computed phrase saliency scores, and may select one or more topic labels according to their relevance to the business domain of the corpus.
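
    Illustrative sketch (not part of the patent text): the steps named in the abstract could be arranged roughly as in the Python below. Everything specific here is an assumption made for illustration. The tokens are assumed to arrive already part-of-speech tagged; the EMBEDDINGS table is a toy stand-in for a word embedding neural network; the WEIGHTS values are hypothetical per-word weights; and the saliency score is assumed to be the cosine similarity between a phrase vector and the mean of all phrase vectors in the document, which the abstract does not specify. The final selection by business-domain relevance is not modelled.

```python
import math

# Toy stand-in for a word embedding neural network (hypothetical values).
EMBEDDINGS = {
    "billing": [0.9, 0.1, 0.0],
    "error": [0.7, 0.3, 0.1],
    "refund": [0.8, 0.2, 0.1],
    "request": [0.5, 0.5, 0.2],
    "nice": [0.0, 0.1, 0.9],
    "day": [0.1, 0.0, 0.8],
}

# Hypothetical per-word weights; words not listed default to 1.0.
WEIGHTS = {"billing": 2.0, "refund": 2.0, "nice": 0.5, "day": 0.5}


def extract_phrases(tagged, patterns):
    """Collect word sequences whose tag sequence equals one of the patterns."""
    phrases = []
    for i in range(len(tagged)):
        for pattern in patterns:
            window = tagged[i:i + len(pattern)]
            if tuple(tag for _, tag in window) == pattern:
                phrases.append(tuple(word for word, _ in window))
    return phrases


def phrase_embedding(phrase):
    """Weighted average of the embedding vectors of the known words."""
    dim = len(next(iter(EMBEDDINGS.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word in phrase:
        vec = EMBEDDINGS.get(word)
        if vec is None:
            continue  # skip out-of-vocabulary words
        w = WEIGHTS.get(word, 1.0)
        total = [t + w * v for t, v in zip(total, vec)]
        weight_sum += w
    return [t / weight_sum for t in total] if weight_sum else total


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topic_labels(tagged, patterns, top_k=2):
    """Rank phrases by the assumed saliency score and keep the top_k."""
    phrases = extract_phrases(tagged, patterns)
    vectors = {p: phrase_embedding(p) for p in phrases}
    # Assumed saliency reference: mean of all phrase vectors in the document.
    centroid = [sum(col) / len(vectors) for col in zip(*vectors.values())]
    scores = {p: cosine(v, centroid) for p, v in vectors.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [" ".join(p) for p in ranked[:top_k]]
```

    As a usage example, calling topic_labels on the tagged tokens of "customer reported billing error and sent refund request on nice day" with the patterns ("NOUN", "NOUN") and ("ADJ", "NOUN") extracts three candidate phrases; with the toy values above, the off-topic phrase "nice day" receives the lowest score and is dropped when top_k is 2.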
