Invention Grant
- Patent Title: Fast identification of text intensive pages from photographs
- Application No.: US17542856
- Application Date: 2021-12-06
- Publication No.: US11715316B2
- Publication Date: 2023-08-01
- Inventor: Alexander Pashintsev, Boris Gorbatov, Eugene Livshitz, Vitaly Glazkov
- Applicant: EVERNOTE CORPORATION
- Applicant Address: US CA Redwood City
- Assignee: Evernote Corporation
- Current Assignee: Evernote Corporation
- Current Assignee Address: US CA San Diego
- Agency: Morgan, Lewis & Bockius LLP
- Original application of the division: US15272744 (2016-09-22)
- Main IPC: G06T3/40
- IPC: G06T3/40; G06T7/60; G06V30/413; G06V30/414

Abstract:
Methods and systems for training a neural network to distinguish between text documents and image documents are described. A corpus of text and image documents is obtained. A page of a text document is scanned by shifting a text window to a plurality of locations. In accordance with a determination that the text in the window at a respective location meets text line criteria, the text in the window is stored as a respective text snippet. A plurality of image windows are superimposed over at least one page of an image document. In accordance with a determination that the content of a respective image window meets image criteria, content of the image window is stored as a respective image snippet. The respective text snippet and the respective image snippet are provided to a classifier.
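
The snippet-collection steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not say what the text line criteria or image criteria are, so `looks_like_text_line` and `looks_like_image` are hypothetical placeholder tests, and the page representation (a grid of grayscale values), the window size, the step, and all function names are assumptions made for the example.

```python
from typing import Callable, List, Tuple

# Assumption: a page is a grid of grayscale pixels (0 = black ink, 255 = white).
Page = List[List[int]]


def crop(page: Page, top: int, left: int, height: int, width: int) -> Page:
    """Return the height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in page[top:top + height]]


def ink_fraction(window: Page, threshold: int = 128) -> float:
    """Fraction of pixels in the window darker than the threshold."""
    total = sum(len(row) for row in window)
    dark = sum(1 for row in window for px in row if px < threshold)
    return dark / total if total else 0.0


def looks_like_text_line(window: Page) -> bool:
    # Hypothetical stand-in for the "text line criteria":
    # accept windows with a moderate amount of ink.
    return 0.05 <= ink_fraction(window) <= 0.5


def looks_like_image(window: Page) -> bool:
    # Hypothetical stand-in for the "image criteria":
    # reject windows that are nearly uniform in brightness.
    pixels = [px for row in window for px in row]
    return bool(pixels) and max(pixels) - min(pixels) >= 64


def collect_snippets(page: Page, height: int, width: int, step: int,
                     criteria: Callable[[Page], bool]) -> List[Page]:
    """Shift a window over the page and keep the contents that meet the criteria."""
    rows = len(page)
    cols = len(page[0]) if page else 0
    snippets = []
    for top in range(0, rows - height + 1, step):
        for left in range(0, cols - width + 1, step):
            window = crop(page, top, left, height, width)
            if criteria(window):
                snippets.append(window)
    return snippets


def build_training_set(text_pages: List[Page], image_pages: List[Page],
                       height: int, width: int, step: int
                       ) -> List[Tuple[Page, str]]:
    """Label the accepted snippets so they can be handed to a classifier."""
    samples: List[Tuple[Page, str]] = []
    for page in text_pages:
        for snippet in collect_snippets(page, height, width, step,
                                        looks_like_text_line):
            samples.append((snippet, "text"))
    for page in image_pages:
        for snippet in collect_snippets(page, height, width, step,
                                        looks_like_image):
            samples.append((snippet, "image"))
    return samples
```

The labeled pairs returned by `build_training_set` correspond to the text and image snippets that the abstract says are provided to a classifier; the classifier itself is outside the scope of this sketch.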
Public/Granted literature
- US20220270386A1 FAST IDENTIFICATION OF TEXT INTENSIVE PAGES FROM PHOTOGRAPHS (Public/Granted day: 2022-08-25)