Invention Grant
- Patent Title: Document heading detection
- Application No.: US16212907
- Application Date: 2018-12-07
- Publication No.: US10885282B2
- Publication Date: 2021-01-05
- Inventor: Andreja Ilić, Katarina Jovanović, Miloš Rašković, Vladimir Ranković
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Agency: Merchant & Gould
- Main IPC: G06F17/20
- IPC: G06F17/20 ; G06F40/30 ; G06F40/211

Abstract:
Document heading detection includes performing a classification on each of a plurality of paragraphs of a document to identify each paragraph as either a heading or non-heading paragraph. The classification is based on one or more pre-established values corresponding to one or more pre-established formatting features that are indicative of a heading paragraph relative to currently established values for each of the one or more pre-established formatting features in each of the plurality of paragraphs. Document heading detection further includes determining a strength of each of the one or more heading paragraphs by performing a linear regression on each heading paragraph and assigning each of the one or more heading paragraphs a heading level within a hierarchy of heading levels based on the determined strength.
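The two stages the abstract describes (classify each paragraph as heading or non-heading, then score heading strength and map it to a level) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the `Paragraph` fields, the threshold values `BODY_FONT_SIZE` and `MAX_HEADING_WORDS`, and the `WEIGHTS` and `BIAS` coefficients, which stand in for a fitted linear regression model.

```python
from dataclasses import dataclass


@dataclass
class Paragraph:
    text: str
    font_size: float
    bold: bool


# Hypothetical pre-established values for the formatting features.
BODY_FONT_SIZE = 11.0
MAX_HEADING_WORDS = 12

# Hypothetical coefficients standing in for a fitted linear regression.
WEIGHTS = {"font_size": 1.0, "bold": 2.0}
BIAS = 0.0


def is_heading(p: Paragraph) -> bool:
    """Classify by comparing the paragraph's feature values to the thresholds."""
    is_short = len(p.text.split()) <= MAX_HEADING_WORDS
    return is_short and (p.bold or p.font_size > BODY_FONT_SIZE)


def strength(p: Paragraph) -> float:
    """Score a heading with a linear model over its formatting features."""
    return BIAS + WEIGHTS["font_size"] * p.font_size + WEIGHTS["bold"] * float(p.bold)


def assign_levels(paragraphs: list[Paragraph]) -> list[tuple[str, int]]:
    """Return (text, level) for each heading; the strongest score gets level 1."""
    headings = [p for p in paragraphs if is_heading(p)]
    distinct = sorted({strength(p) for p in headings}, reverse=True)
    level_of = {score: rank + 1 for rank, score in enumerate(distinct)}
    return [(p.text, level_of[strength(p)]) for p in headings]


doc = [
    Paragraph("Introduction", 20.0, True),
    Paragraph("This body paragraph uses the default size and is not bold.", 11.0, False),
    Paragraph("Background", 14.0, True),
    Paragraph("Details", 14.0, True),
]
print(assign_levels(doc))
```

In this sketch, headings with equal strength share a level, so the hierarchy depth equals the number of distinct scores. A real system would presumably learn the coefficients from labeled documents rather than fix them by hand.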
Public/Granted literature
- US20200184013A1 DOCUMENT HEADING DETECTION (Public/Granted day: 2020-06-11)