Invention Grant
- Patent Title: Generating training data for disambiguation
- Application No.: US14954636
- Application Date: 2015-11-30
- Publication No.: US09720904B2
- Publication Date: 2017-08-01
- Inventor: Yohei Ikawa, Akiko Suzuki
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee Address: US NY Armonk
- Agency: Cantor Colburn LLP
- Agent: Keivan Razavi
- Priority: JP2014-166695 20140819
- Main IPC: G06F17/20
- IPC: G06F17/20; G06F17/27; G10L15/06

Abstract:
A method for generating training data for disambiguation of an entity, the entity comprising a word or word string related to a topic to be analyzed, includes: acquiring messages sent by users, each message including at least one entity in a set of entities; organizing the messages by user to obtain sets of messages, each set containing the messages sent by one user; identifying each set of messages in which the number of different entities is greater than or equal to a first threshold value, and identifying the user corresponding to each identified set as a hot user; receiving an instruction indicating an object entity to be disambiguated; determining a likelihood of co-occurrence of each keyword and the object entity in the sets of messages sent by the hot users; and determining training data for the object entity on the basis of that likelihood of co-occurrence.
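The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Every function name is hypothetical, entities are simplified to single lowercase words, a "keyword" is taken to be any other word in a hot user's message, and the likelihood of co-occurrence is approximated as the fraction of hot-user messages containing a keyword that also contain the object entity. The abstract does not specify any of these choices.

```python
from collections import defaultdict


def group_by_user(messages):
    """Organize (user, text) pairs into {user: [text, ...]}."""
    by_user = defaultdict(list)
    for user, text in messages:
        by_user[user].append(text)
    return dict(by_user)


def find_hot_users(by_user, entities, threshold):
    """Users whose messages mention at least `threshold` different entities."""
    hot = set()
    for user, texts in by_user.items():
        seen = set()
        for text in texts:
            seen |= entities & set(text.lower().split())
        if len(seen) >= threshold:
            hot.add(user)
    return hot


def cooccurrence_likelihood(by_user, hot_users, target):
    """Assumed estimate: for each keyword in hot users' messages, the share
    of messages containing that keyword that also contain `target`."""
    with_keyword = defaultdict(int)
    with_both = defaultdict(int)
    for user in hot_users:
        for text in by_user[user]:
            words = set(text.lower().split())
            has_target = target in words
            for word in words - {target}:
                with_keyword[word] += 1
                if has_target:
                    with_both[word] += 1
    return {w: with_both[w] / with_keyword[w] for w in with_keyword}


def select_training_data(messages, target, likelihood, min_likelihood):
    """Assumed selection rule: keep messages that contain `target` together
    with at least one keyword whose likelihood meets `min_likelihood`."""
    keywords = {w for w, p in likelihood.items() if p >= min_likelihood}
    selected = []
    for _, text in messages:
        words = set(text.lower().split())
        if target in words and words & keywords:
            selected.append(text)
    return selected


# Illustrative data only: "jaguar" is ambiguous between a car and an animal.
messages = [
    ("u1", "new jaguar engine sounds great"),
    ("u1", "test drove a bmw today"),
    ("u1", "audi engine recall"),
    ("u2", "saw a jaguar at the zoo"),
    ("u3", "jaguar engine specs leaked"),
]
entities = {"jaguar", "bmw", "audi"}

by_user = group_by_user(messages)
hot_users = find_hot_users(by_user, entities, threshold=2)
likelihood = cooccurrence_likelihood(by_user, hot_users, "jaguar")
training = select_training_data(messages, "jaguar", likelihood, 0.5)
```

In this toy data only `u1` mentions enough different entities to count as a hot user, so the keywords are drawn from that user's messages, and the zoo message is excluded from the selected training data because it shares no qualifying keyword with them.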
Public/Granted literature
- US20160085740A1 GENERATING TRAINING DATA FOR DISAMBIGUATION (Public/Granted day: 2016-03-24)