Invention Grant
- Patent Title: Method, program, and device for analyzing document structure
- Application No.: US11462871
- Application Date: 2006-08-07
- Publication No.: US07698627B2
- Publication Date: 2010-04-13
- Inventor: Chieko Asakawa, Tarsuya Ishihara, Takashi Itoh, Hironobu Takagi
- Applicant: Chieko Asakawa, Tarsuya Ishihara, Takashi Itoh, Hironobu Takagi
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Yee & Associates, P.C.
- Agent: Libby Z. Toub
- Priority: JP2005-255548 (2005-09-02)
- Main IPC: G06F17/00
- IPC: G06F17/00

Abstract:
A device, a control method, and a program increase the accuracy of voice read-out and text mining by automatically structuring a presentation file. The invention comprises: an overlap grouping part that extracts overlap information between objects in a presentation file and groups the objects in a parent-child relationship; a graph-dividing grouping part that groups the objects in a sibling relationship by representing the objects as nodes of a graph and recursively dividing the graph so that a predefined cost between the nodes is minimized; a distance information grouping part that further groups the objects in a sibling relationship if the distance information between them is below a threshold determined by a predefined computation from a distribution histogram of the distance information; and a link information extraction part that extracts arrow graphics representing a link relationship and generates link information including the link relationship and a link label. The resulting structured data is output as meta-information.
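As a rough illustration of two of the grouping steps named in the abstract, the Python sketch below treats slide objects as axis-aligned rectangles. It is not the patented method: the abstract does not define how overlap is measured or how the threshold is computed from the histogram. The sketch assumes that "overlap" means full containment, and that the distance threshold is the midpoint of the widest jump between sorted horizontal gaps. The names `Box`, `overlap_parents`, `gap_threshold`, and `group_by_distance` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical slide object: a named axis-aligned rectangle."""
    name: str
    x: float
    y: float
    w: float
    h: float

    @property
    def area(self) -> float:
        return self.w * self.h

    def contains(self, other: "Box") -> bool:
        # Assumption: "overlap" is simplified to full containment.
        return (self.x <= other.x and self.y <= other.y
                and self.x + self.w >= other.x + other.w
                and self.y + self.h >= other.y + other.h)


def overlap_parents(boxes):
    """Map each box name to its smallest enclosing box name, or None."""
    parents = {}
    for child in boxes:
        enclosing = [b for b in boxes
                     if b is not child
                     and b.area > child.area
                     and b.contains(child)]
        parent = min(enclosing, key=lambda b: b.area, default=None)
        parents[child.name] = parent.name if parent else None
    return parents


def gap_threshold(gaps):
    """Assumed rule: midpoint of the widest jump between sorted gaps."""
    ordered = sorted(gaps)
    if len(ordered) < 2:
        return float("inf")  # too few gaps to find a jump: never split
    jumps = [(b - a, (a + b) / 2) for a, b in zip(ordered, ordered[1:])]
    return max(jumps)[1]


def group_by_distance(boxes):
    """Scan left to right; objects closer than the threshold become siblings."""
    ordered = sorted(boxes, key=lambda b: b.x)
    if not ordered:
        return []
    gaps = [nxt.x - (cur.x + cur.w) for cur, nxt in zip(ordered, ordered[1:])]
    limit = gap_threshold(gaps)
    groups = [[ordered[0].name]]
    for box, gap in zip(ordered[1:], gaps):
        if gap < limit:
            groups[-1].append(box.name)   # below threshold: same sibling group
        else:
            groups.append([box.name])     # at or above threshold: new group
    return groups
```

For example, calling `overlap_parents` on a large frame and a title drawn inside it reports the frame as the title's parent. Calling `group_by_distance` on three boxes in a row, two close together and one far away, puts the close pair in one sibling group. The graph-dividing and arrow-extraction parts are left out because the abstract gives too little detail to sketch them.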
Public/Granted literature:
- US20070038937A1: Method, Program, and Device for Analyzing Document Structure (Public/Granted day: 2007-02-15)