- Patent Title: Reordering text from unstructured sources to intended reading flow
- Application No.: US14640987
- Application Date: 2015-03-06
- Publication No.: US09658991B2
- Publication Date: 2017-05-23
- Inventor: Nicholas V. Bruno, Jared M. Smythe
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: VanLeeuwen & VanLeeuwen
- Agent: Diana R. Gerhardt
- Main IPC: G06F3/00
- IPC: G06F3/00; G06F17/22

Abstract:
An approach is provided in which a number of sections from a sequence of characters included in a Portable Document Format (PDF) file are identified. Each of the identified sections includes a unique set of coordinate positions. The approach builds links between the sections based on a relative position of each of the sections in relation to the other sections along an axis. The approach repeatedly merges sections based on the links that were built to form increasingly larger sections until a final larger section is generated with the characters appearing in a manner consistent with human reading of the rendered PDF document rather than the placement of the characters found within the original PDF file.
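The link-and-merge idea in the abstract can be illustrated with a minimal sketch. This is not the patented method: the abstract does not state the link rules, so everything here is an assumption made for illustration, including the `Section` type, the helper names, the rule "link to the nearest following neighbour that overlaps on the other axis", the choice to try the horizontal axis before the vertical one, and a coordinate system in which y grows downward. The sketch takes sections that already carry coordinates (it does not parse a PDF) and does not handle multi-column layouts.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """Hypothetical section: a run of text plus its bounding box."""
    text: str
    x0: float  # left edge
    y0: float  # top edge (assumption: y grows downward in this sketch)
    x1: float  # right edge
    y1: float  # bottom edge


def overlaps(a_lo, a_hi, b_lo, b_hi):
    """True when two 1-D intervals share some extent."""
    return min(a_hi, b_hi) - max(a_lo, b_lo) > 0


def build_links(sections, axis):
    """Link each section to its nearest following neighbour along `axis`.

    Returns (i, j) index pairs meaning section j directly follows section i.
    Two sections are candidates only if they overlap on the other axis.
    """
    links = []
    for i, a in enumerate(sections):
        best = None  # (gap, index) of the closest follower found so far
        for j, b in enumerate(sections):
            if i == j:
                continue
            if axis == "x":
                aligned = overlaps(a.y0, a.y1, b.y0, b.y1)
                gap = b.x0 - a.x1
            else:
                aligned = overlaps(a.x0, a.x1, b.x0, b.x1)
                gap = b.y0 - a.y1
            if aligned and gap >= 0 and (best is None or gap < best[0]):
                best = (gap, j)
        if best is not None:
            links.append((i, best[1]))
    return links


def merge(a, b, sep):
    """Join two linked sections into one larger section."""
    return Section(
        a.text + sep + b.text,
        min(a.x0, b.x0), min(a.y0, b.y0),
        max(a.x1, b.x1), max(a.y1, b.y1),
    )


def reorder(sections):
    """Merge one link at a time until no more links can be built."""
    sections = list(sections)
    while len(sections) > 1:
        # Try horizontal links first (words into lines), then vertical ones.
        for axis, sep in (("x", " "), ("y", "\n")):
            links = build_links(sections, axis)
            if links:
                break
        else:
            break  # nothing left to link; stop rather than guess an order
        i, j = links[0]
        merged = merge(sections[i], sections[j], sep)
        sections = [s for k, s in enumerate(sections) if k not in (i, j)]
        sections.append(merged)
    return sections


# Words stored out of reading order, as they might be in a source file.
words = [
    Section("world", 60, 0, 100, 10),
    Section("Hello", 0, 0, 50, 10),
    Section("flow", 70, 20, 100, 30),
    Section("Reading", 0, 20, 60, 30),
]
result = reorder(words)
print(result[0].text)
```

With the sample input, the four words merge first into two lines and then into a single section whose text reads "Hello world" followed by "Reading flow" on the next line. Merging one link per pass keeps the sketch short; a real implementation would need a more careful link policy to keep columns separate.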
Public/Granted literature
- US20160085731A1, Reordering Text from Unstructured Sources to Intended Reading Flow. Public/Granted day: 2016-03-24