Identifying section headings in a document
Abstract:
A method, non-transitory computer readable medium, and system for inferring certain texts as stylized section headings in an electronic document (ED). Stylized section headings are section headings that have unique styling distinct from the body of text below each stylized heading. In particular, the stylized section headings are identified based on styling information in the ED. Identifying stylized section headings includes grouping candidate headings based on identification of dominant styling, locating high level fragments, and repeatedly locating nested fragments from within higher level fragments. The ED may or may not include explicitly identified headings in the document.
Public/Granted literature
Information query
Patent Agency Ranking
0/0