Invention Grant
- Patent Title: Word embeddings and virtual terms
- Application No.: US17060198
- Application Date: 2020-10-01
- Publication No.: US11048884B2
- Publication Date: 2021-06-29
- Inventor: James Allen Cox, Russell Albright, Saratendu Sethi
- Applicant: SAS Institute Inc.
- Applicant Address: US NC Cary
- Assignee: SAS Institute Inc.
- Current Assignee: SAS Institute Inc.
- Current Assignee Address: US NC Cary
- Agency: Coats + Bennett, PLLC
- Main IPC: G06F40/44
- IPC: G06F40/44; G06F16/903; G06F40/30; G06F40/247; G06F40/284

Abstract:
A computing system receives a collection comprising multiple sets of ordered terms, including a first set. The system generates a dataset indicating an association between each pair of terms within a same set of the collection by generating co-occurrence score(s) for the first set. The system generates computed probabilities based on the co-occurrence score(s) for the first set. The computed probabilities indicate a likelihood that one term in a given pair of terms of the collection appears in a given set of the collection given that another term in the given pair of terms of the collection occurs. The system smooths the computed probabilities by adding one or more random observations. The system generates one or more association indications for the first set based on the smoothed computed probabilities. The system outputs an indication of the dataset. Additionally, or alternatively, based on association measure(s), the system generates a virtual term.
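
The pipeline summarized in the abstract (co-occurrence scores, conditional probabilities, smoothing, association indications) can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not give formulas, so the sketch makes several assumptions. It assumes co-occurrence is counted once per set, that "adding random observations" can be modeled as `k` pseudo-observations in which the second term appears at its background rate, and that the association indication is the log ratio of the smoothed conditional probability to that background rate. The function name `association_scores`, the parameter `k`, and the sample data are all hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import log


def association_scores(collection, k=1.0):
    """Return {(a, b): score} for every ordered pair of distinct terms
    that share at least one set in `collection`.

    Assumed (not from the patent): score = log(P_smooth(b | a) / P(b)),
    where P_smooth adds k pseudo-observations of `a` in which `b`
    occurs at its background rate P(b).
    """
    n_sets = len(collection)
    term_count = Counter()  # number of sets containing each term
    pair_count = Counter()  # number of sets containing both terms of a pair

    for terms in collection:
        unique = set(terms)
        term_count.update(unique)
        # Count each ordered pair (a, b) once per set.
        pair_count.update(permutations(sorted(unique), 2))

    scores = {}
    for (a, b), n_ab in pair_count.items():
        p_b = term_count[b] / n_sets  # background rate of b
        # Smoothed estimate of P(b in set | a in set).
        p_b_given_a = (n_ab + k * p_b) / (term_count[a] + k)
        scores[(a, b)] = log(p_b_given_a / p_b)
    return scores


# Hypothetical toy collection of ordered term sets.
collection = [
    ["new", "york", "city"],
    ["new", "york"],
    ["city", "hall"],
    ["new", "car"],
]
scores = association_scores(collection, k=1.0)
```

Under these assumptions a positive score means the second term appears with the first more often than its background rate would suggest, and a negative score means less often. Larger values of `k` pull every estimate toward the background rate, which damps scores that rest on only a few observations.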
Public/Granted literature
- US20210027024A1: Word Embeddings and Virtual Terms (Public/Granted day: 2021-01-28)