Invention Grant
- Patent Title: Parse tree based vectorization for natural language processing
- Application No.: US16352358
- Application Date: 2019-03-13
- Publication No.: US10922486B2
- Publication Date: 2021-02-16
- Inventor: Mudhakar Srivatsa, Raghu Kiran Ganti, Yeon-sup Lim, Shreeranjani Srirangamsridharan, Antara Palit
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Garg Law Firm, PLLC
- Agent: Rakesh Garg; Joseph Petrokaitis
- Main IPC: G06F40/279
- IPC: G06F40/279; G06F40/211; G06F40/30; G06F40/284

Abstract:
A parse tree corresponding to a portion of narrative text is constructed. The parse tree includes a data structure representing a syntactic structure of the portion of narrative text as a set of tokens according to a grammar. Using a token in the parse tree as a focus word, a context window comprising a set of words within a specified distance from the focus word is generated, the distance determined according to a number of links of the parse tree separating the focus word and a context word in the set of words. A weight is generated for the focus word and the context word. Using the weight, a first vector representation of a first word is generated, the first word being within a second portion of narrative text.
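The context-window step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The example sentence, the hand-written dependency-style links, the function names, and the inverse-distance weight (1 / number of links) are all assumptions made for illustration; the abstract does not state how the weight is computed. The sketch measures distance in parse-tree links rather than in word positions and emits weighted (focus word, context word) pairs.

```python
from collections import defaultdict, deque

# Hypothetical parse tree for "the quick fox jumps over the lazy dog",
# written by hand as (child_index, head_index) links. A real parser
# would supply these links; this tree is only an illustration.
TOKENS = ["the", "quick", "fox", "jumps", "over", "the", "lazy", "dog"]
LINKS = [(0, 2), (1, 2), (2, 3), (4, 3), (7, 4), (5, 7), (6, 7)]


def build_adjacency(links):
    """Turn parse-tree links into an undirected adjacency map."""
    adj = defaultdict(list)
    for child, head in links:
        adj[child].append(head)
        adj[head].append(child)
    return adj


def tree_context(adj, focus, max_links):
    """Return {token_index: link_distance} for tokens within max_links of focus."""
    dist = {focus: 0}
    queue = deque([focus])
    while queue:
        node = queue.popleft()
        if dist[node] == max_links:
            continue  # do not expand past the window boundary
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    del dist[focus]  # the focus word is not its own context
    return dist


def weighted_pairs(tokens, links, max_links=2):
    """Emit (focus_word, context_word, weight) triples.

    The weight 1 / link_distance is an assumed choice, not taken from the patent.
    """
    adj = build_adjacency(links)
    pairs = []
    for focus in range(len(tokens)):
        for ctx, d in sorted(tree_context(adj, focus, max_links).items()):
            pairs.append((tokens[focus], tokens[ctx], 1.0 / d))
    return pairs


if __name__ == "__main__":
    for focus_word, context_word, weight in weighted_pairs(TOKENS, LINKS):
        print(focus_word, context_word, weight)
```

In this toy tree, "dog" is two links from "jumps" even though four word positions separate them, so it falls inside a two-link window that a conventional two-word linear window would miss. Weighted pairs of this kind could then be fed to a skip-gram style trainer to produce word vectors; that training step is not shown here.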
Public/Granted literature
- US20200293614A1 PARSE TREE BASED VECTORIZATION FOR NATURAL LANGUAGE PROCESSING (Public/Granted day: 2020-09-17)