Invention Grant
- Patent Title: Text processing method and apparatus
- Application No.: US16818690
- Application Date: 2020-03-13
- Publication No.: US11714964B2
- Publication Date: 2023-08-01
- Inventor: Maciej Pajak
- Applicant: Canon Medical Systems Corporation
- Applicant Address: JP Otawara
- Assignee: Canon Medical Systems Corporation
- Current Assignee: Canon Medical Systems Corporation
- Current Assignee Address: JP Otawara
- Agency: Oblon, McClelland, Maier & Neustadt, L.L.P.
- Main IPC: G06F40/284
- IPC: G06F40/284 ; G06F40/30 ; G06N20/00 ; G16H50/20

Abstract:
An apparatus comprises processing circuitry configured to pre-process text data for inputting to a trained model, the pre-processing comprising: receiving a set of text data including numerical information, the set of text data comprising a plurality of tokens, wherein a first subset of the plurality of tokens comprises tokens that do not comprise numerical information, and a second subset of the plurality of tokens comprises tokens that each comprise respective numerical information; transforming each of the plurality of tokens into a respective encoding vector, each of the plurality of tokens in the second subset having a common encoding vector; assigning a respective numerical vector to each of the plurality of tokens, wherein each token in the second subset is assigned a respective numerical vector in dependence on the numerical information in said token; and combining the encoding vectors and numerical vectors to obtain a vector representation of the text data.
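
The pre-processing steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The choices below are assumptions made only for illustration: whitespace tokenization, one-hot encoding vectors, a hypothetical `<NUM>` placeholder that gives every numeric token the same encoding vector, a one-dimensional numerical vector holding the parsed value (zero for non-numeric tokens), and concatenation as the way of combining the two vectors. The names `build_vocab`, `one_hot`, and `preprocess` are hypothetical.

```python
import re

# Hypothetical placeholder: every numeric token shares this vocabulary entry,
# so all numeric tokens receive a common encoding vector.
NUM_TOKEN = "<NUM>"
# Assumption: a token "comprises numerical information" if it is a plain number.
NUMERIC = re.compile(r"^[+-]?\d+(\.\d+)?$")


def build_vocab(tokens):
    """Map each non-numeric token to an index; index 0 is reserved for numbers."""
    vocab = {NUM_TOKEN: 0}
    for tok in tokens:
        if not NUMERIC.match(tok) and tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1.0 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def preprocess(tokens, vocab):
    """Return one combined vector (encoding + numerical) per token."""
    rows = []
    for tok in tokens:
        if NUMERIC.match(tok):
            # Second subset: common encoding, value-dependent numerical vector.
            encoding = one_hot(vocab[NUM_TOKEN], len(vocab))
            numerical = [float(tok)]
        else:
            # First subset: token-specific encoding, default numerical vector
            # (zero here is an assumption, not stated in the abstract).
            encoding = one_hot(vocab[tok], len(vocab))
            numerical = [0.0]
        # Assumption: "combining" is done by concatenation.
        rows.append(encoding + numerical)
    return rows


tokens = "heart rate 72 bpm and temperature 38.5".split()
vocab = build_vocab(tokens)
rows = preprocess(tokens, vocab)
```

In this sketch the rows for `72` and `38.5` have identical encoding parts and differ only in their final component, which carries the numeric value for the downstream model.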
Public/Granted literature
- US20210286947A1 TEXT PROCESSING METHOD AND APPARATUS Public/Granted day: 2021-09-16