Method for optimized parsing of unstructured and garbled text lacking whitespace
Abstract:
A system, method, and computer-readable medium for performing a text parsing operation. The text parsing operation includes: receiving a corpus of text, at least a portion of the corpus of text comprising garbled text; parsing characters within the corpus of text to provide parsed characters from the corpus of text; parsing the parsed characters to generate recognized words from the parsed characters; generating semi-structured text from the recognized words; and, calculating a distribution of recognized words from the semi-structured text.
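The steps named in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how words are recognized, so this example assumes a greedy longest-match lookup against a small hypothetical vocabulary (`VOCAB`). The function names, the character filter, and the sample input are likewise illustrative assumptions.

```python
from collections import Counter

# Hypothetical vocabulary (an assumption); a real system would load a large dictionary.
VOCAB = {"error", "disk", "full", "on", "server", "restart"}
MAX_WORD_LEN = max(len(w) for w in VOCAB)


def parse_characters(corpus: str) -> str:
    """Character pass: lowercase the text and keep only the letters a-z."""
    return "".join(ch for ch in corpus.lower() if "a" <= ch <= "z")


def recognize_words(chars: str) -> list:
    """Word pass: greedy longest-match segmentation (assumed technique)."""
    words = []
    i = 0
    while i < len(chars):
        # Try the longest candidate first, shrinking until a vocabulary hit.
        for length in range(min(MAX_WORD_LEN, len(chars) - i), 0, -1):
            candidate = chars[i:i + length]
            if candidate in VOCAB:
                words.append(candidate)
                i += length
                break
        else:
            i += 1  # no known word starts here; skip the character as noise
    return words


def parse_corpus(corpus: str) -> tuple:
    """Return (semi-structured text, distribution of recognized words)."""
    words = recognize_words(parse_characters(corpus))
    semi_structured = " ".join(words)  # whitespace restored between words
    distribution = Counter(words)
    return semi_structured, distribution


text, dist = parse_corpus("ERRORdiskfullxx#onserver42restartzzerror")
print(text)
print(dist.most_common())
```

Greedy matching is chosen here only for brevity. It can mis-segment ambiguous input, so a practical system would more likely score alternative segmentations, for example with word frequencies.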