Invention Grant
- Patent Title: Device, method and program for generating accurate corpus data for presentation target for searching
- Application No.: US14420424
- Application Date: 2013-09-30
- Publication No.: US09645979B2
- Publication Date: 2017-05-09
- Inventor: Keiji Shinzato
- Applicant: RAKUTEN INC
- Applicant Address: JP Tokyo
- Assignee: Rakuten, Inc.
- Current Assignee: Rakuten, Inc.
- Current Assignee Address: JP Tokyo
- Agency: Sughrue Mion, PLLC
- International Application: PCT/JP2013/076545 WO 20130930
- International Announcement: WO2015/045155 WO 20150402
- Main IPC: G06F17/21
- IPC: G06F17/21 ; G06F17/27

Abstract:
A corpus generation device according to an embodiment includes a web page acquisition unit, a reference word acquisition unit, an attachment unit and an output unit. The web page acquisition unit acquires a web page including description sentence data regarding a presentation target. The reference word acquisition unit acquires a reference word that is an attribute value regarding the presentation target from the web page. The attachment unit extracts a broader word belonging to a layer above the reference word acquired by the reference word acquisition unit from a storage unit that stores hierarchical relationship information indicating a hierarchical relationship between attribute values, and attaches an attribute tag corresponding to the reference word to the broader word included in the description sentence data. The output unit outputs, as corpus data, the description sentence data to which the attribute tag is attached by the attachment unit.
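The flow in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the hierarchy contents, the function names `broader_words` and `attach_tags`, the attribute name `region`, and the inline `<attribute>...</attribute>` tag format are all assumptions chosen for illustration. The sketch assumes the reference word has already been acquired from the web page, looks up the words in layers above it in a stored hierarchy, and tags any of those broader words that occur in the description sentence.

```python
import re

# Hypothetical hierarchical relationship information:
# each attribute value maps to the value one layer above it.
HIERARCHY = {
    "Bordeaux": "France",
    "France": "Europe",
}


def broader_words(reference_word, hierarchy):
    """Return the words in layers above reference_word, nearest layer first."""
    result = []
    seen = {reference_word}  # guards against cycles in the stored hierarchy
    current = hierarchy.get(reference_word)
    while current is not None and current not in seen:
        result.append(current)
        seen.add(current)
        current = hierarchy.get(current)
    return result


def attach_tags(description, reference_word, attribute, hierarchy):
    """Wrap each broader word found in the description with the attribute tag."""
    tagged = description
    for word in broader_words(reference_word, hierarchy):
        pattern = re.compile(r"\b" + re.escape(word) + r"\b")
        tagged = pattern.sub(
            lambda m: f"<{attribute}>{m.group(0)}</{attribute}>", tagged
        )
    return tagged


if __name__ == "__main__":
    # The reference word "Bordeaux" is assumed to come from the page's
    # attribute data; the description sentence only mentions "France".
    sentence = "A red wine from France with a rich aroma."
    print(attach_tags(sentence, "Bordeaux", "region", HIERARCHY))
```

In this example the description never mentions the reference word itself, yet the broader word "France" still receives the attribute tag, which is the case the abstract describes.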
Public/Granted literature
- US20160041951A1 CORPUS GENERATION DEVICE, CORPUS GENERATION METHOD AND CORPUS GENERATION PROGRAM
- Public/Granted day: 2016-02-11