Domain-specific computational lexicon formation

Invention Grant

US09678941B2 Domain-specific computational lexicon formation (Active)

Patent Title: Domain-specific computational lexicon formation
Application No.: US14580583

Application Date: 2014-12-23
Publication No.: US09678941B2

Publication Date: 2017-06-13
Inventor: Branimir K. Boguraev, Esme Manandise, Benjamin P. Segal
Applicant: International Business Machines Corporation
Applicant Address: US NY Armonk
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Current Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Current Assignee Address: US NY Armonk
Agency: Cantor Colburn LLP
Agent: William Stock
Main IPC: G06F17/27
IPC: G06F17/27; G06F17/30

Abstract:

According to an aspect, a candidate token sequence including one or more word tokens is extracted from an unstructured domain glossary that includes entries associated with a domain. A look-up operation is performed to retrieve language data for each word token in the candidate token sequence, and each word token found by the look-up operation is annotated with the corresponding retrieved language data to form an annotated sequence. The annotated sequence is pattern matched against a repository of patterns, and a best matching pattern for the annotated sequence is identified from the repository based on matching criteria. The annotated sequence is refined with lexical information associated with the best matching pattern to form a refined annotated sequence. The candidate token sequence and the refined annotated sequence are output to a domain-specific computational lexicon file.
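
The steps named in the abstract (extract, look up, annotate, pattern match, refine, output) can be illustrated with a minimal Python sketch. This is not the patented implementation. Everything concrete in it is an assumption made for illustration: the toy `LANGUAGE_DATA` table, the `Pattern` repository, the "text before the first colon" extraction rule, the matching criteria (equal length, no tag conflicts, most agreeing tags), and the JSON Lines output format.

```python
import json
from dataclasses import dataclass

# Hypothetical toy language data: word token -> part-of-speech tag.
LANGUAGE_DATA = {"annual": "ADJ", "percentage": "NOUN", "rate": "NOUN", "net": "ADJ"}


@dataclass(frozen=True)
class Pattern:
    tags: tuple      # expected tag for each token position
    category: str    # lexical information attached to the pattern
    head: int        # index of the head token


# Hypothetical pattern repository.
PATTERNS = [
    Pattern(("ADJ", "NOUN"), "NP", 1),
    Pattern(("NOUN", "NOUN"), "NP", 1),
    Pattern(("ADJ", "NOUN", "NOUN"), "NP", 2),
]


def extract_candidates(glossary_text):
    """Assumption: the text before the first colon on a line is the candidate."""
    candidates = []
    for line in glossary_text.splitlines():
        headword, sep, _ = line.partition(":")
        tokens = headword.lower().split()
        if sep and tokens:
            candidates.append(tokens)
    return candidates


def annotate(tokens):
    """Look up each token; tokens missing from the data keep a tag of None."""
    return [(tok, LANGUAGE_DATA.get(tok)) for tok in tokens]


def best_pattern(annotated):
    """Assumed criteria: equal length, no tag conflicts, most agreeing tags."""
    best, best_score = None, -1
    for pattern in PATTERNS:
        if len(pattern.tags) != len(annotated):
            continue
        known = [(tag, ptag) for (_, tag), ptag in zip(annotated, pattern.tags)
                 if tag is not None]
        if any(tag != ptag for tag, ptag in known):
            continue
        if len(known) > best_score:
            best, best_score = pattern, len(known)
    return best


def refine(annotated, pattern):
    """Fill unknown tags and attach category and head from the matched pattern."""
    tokens = [tok for tok, _ in annotated]
    if pattern is None:
        tags, category, head = [tag for _, tag in annotated], None, None
    else:
        tags, category, head = list(pattern.tags), pattern.category, tokens[pattern.head]
    return {
        "candidate": " ".join(tokens),
        "tokens": [{"token": t, "pos": p} for t, p in zip(tokens, tags)],
        "category": category,
        "head": head,
    }


def build_lexicon(glossary_text):
    """Run extraction, annotation, pattern matching, and refinement in order."""
    annotated_sequences = map(annotate, extract_candidates(glossary_text))
    return [refine(seq, best_pattern(seq)) for seq in annotated_sequences]


def write_lexicon(entries, path):
    """Write one JSON object per line to the lexicon file."""
    with open(path, "w", encoding="utf-8") as fh:
        for entry in entries:
            fh.write(json.dumps(entry) + "\n")
```

In this sketch, a glossary line such as `Net yield: income after expenses.` yields the candidate `net yield`. The look-up finds a tag only for `net`, the two-token adjective-noun pattern is the only one without a conflict, and refinement then fills in the missing tag for `yield` from that pattern.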

Public/Granted literature

US20160179782A1 DOMAIN-SPECIFIC COMPUTATIONAL LEXICON FORMATION Public/Granted day: 2016-06-23
