Template-free extraction of data from documents
Abstract:
The disclosed embodiments provide a system that processes data. One example embodiment is a computer-implemented method for processing data. The computer-implemented method includes obtaining text from a document associated with a user, wherein the document was generated based on a template, and, with the obtained text intact, applying a set of rules to each term in the obtained text to determine a broad category of a plurality of terms associated with the term. The computer-implemented method further includes applying an additional set of rules to refine the broad category associated with the term to a refined category of fewer terms, based on a location in the document of at least one term in the broad category of the plurality of terms; extracting a term from the obtained text using template-independent code developed to process documents generated based on a plurality of templates; and enabling use of the term with an application.
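The two-stage categorization described in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation. The rule patterns, the category names (`amount`, `identifier`, `wages`, `tax_withheld`), the use of a line number as the "location", and the sample layouts are all hypothetical and chosen only to show how broad rules followed by location-based refinement might extract the same terms from documents produced by different templates.

```python
import re
from dataclasses import dataclass


@dataclass
class Term:
    text: str
    line: int              # location of the term in the document (line index)
    broad: str = "label"   # broad category; default when no broad rule matches
    refined: str = ""      # refined category, filled in by refine()


# Hypothetical first rule set: assigns a broad category to each term.
BROAD_RULES = [
    ("amount", re.compile(r"^\$?\d[\d,]*\.\d{2}$")),
    ("identifier", re.compile(r"^\d{2}-\d{7}$")),
]

# Hypothetical second rule set: a label keyword on the same line as an
# amount narrows "amount" to a more specific category.
LOCATION_RULES = {"wages": "wages", "tax": "tax_withheld"}


def tokenize(text):
    """Split the text into terms, leaving the text itself unmodified."""
    return [
        Term(token, line_no)
        for line_no, line in enumerate(text.splitlines())
        for token in line.split()
    ]


def assign_broad(terms):
    """Apply the first rule set to every term."""
    for term in terms:
        for name, pattern in BROAD_RULES:
            if pattern.match(term.text):
                term.broad = name
                break


def refine(terms):
    """Narrow broad categories using the location of neighboring terms."""
    for term in terms:
        term.refined = term.broad
        if term.broad != "amount":
            continue
        labels_on_line = {
            other.text.lower()
            for other in terms
            if other.line == term.line and other.broad == "label"
        }
        for keyword, refined in LOCATION_RULES.items():
            if keyword in labels_on_line:
                term.refined = refined
                break


def extract(text):
    """Return {refined category: term text} without any per-template code."""
    terms = tokenize(text)
    assign_broad(terms)
    refine(terms)
    return {t.refined: t.text for t in terms if t.broad != "label"}


# Two made-up layouts standing in for documents from different templates.
LAYOUT_A = "Employer ID 12-3456789\nWages 50,000.00\nFederal tax withheld 7,500.00"
LAYOUT_B = "Wages 50,000.00\nID 12-3456789\n7,500.00 tax withheld"

if __name__ == "__main__":
    print(extract(LAYOUT_A))
    print(extract(LAYOUT_B))
```

In this sketch the same `extract` function handles both layouts because the rules depend only on each term's shape and on which labels share its line, not on fixed field positions. A real system would need far richer rules and a more precise notion of location.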