Compositional model for text recognition
Abstract:
Embodiments relate to a two-stage end-to-end text recognition system. The text recognition system includes a text detection stage and a text recognition stage. Images input to the text recognition system are provided to both the text detection stage and the text recognition stage. The text detection stage detects text regions in the images and provides the detected regions to the text recognition stage. The text recognition stage is trained to perform geometric rectification on the text regions using the images. There is end-to-end alignment between the text detection stage and the text recognition stage. Additionally, the text detection stage and the text recognition stage are each trained independently of the other.
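The data flow described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the names `detect_text_regions`, `recognize_text`, and `TwoStagePipeline` are hypothetical, the "image" is a toy grid of characters rather than pixels, and a plain crop stands in for the learned geometric rectification. The only points it shows are that the image is passed to both stages and that the two stages are separate components joined only by the detected regions.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Toy stand-ins (assumptions for illustration only):
# an "image" is a grid of characters, a region is a half-open box.
Image = List[List[str]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right)


def detect_text_regions(image: Image) -> List[Box]:
    """Hypothetical detection stage: one box per row that has non-blank cells."""
    boxes: List[Box] = []
    for r, row in enumerate(image):
        cols = [c for c, ch in enumerate(row) if ch != " "]
        if cols:
            boxes.append((r, min(cols), r + 1, max(cols) + 1))
    return boxes


def recognize_text(image: Image, boxes: Sequence[Box]) -> List[str]:
    """Hypothetical recognition stage: receives the full image and the regions.

    The crop below is a placeholder for geometric rectification; a real
    recognizer would learn a transform rather than slice a rectangle.
    """
    results: List[str] = []
    for top, left, bottom, right in boxes:
        crop = [row[left:right] for row in image[top:bottom]]
        results.append("".join("".join(r) for r in crop))
    return results


@dataclass
class TwoStagePipeline:
    """Composes two independently built stages; the image goes to both."""
    detector: Callable[[Image], List[Box]]
    recognizer: Callable[[Image, Sequence[Box]], List[str]]

    def __call__(self, image: Image) -> List[str]:
        boxes = self.detector(image)          # stage 1 sees the image
        return self.recognizer(image, boxes)  # stage 2 sees image + regions


image = [list("  STOP  "), list("        "), list(" AHEAD  ")]
pipeline = TwoStagePipeline(detect_text_regions, recognize_text)
print(pipeline(image))
```

Because the pipeline only depends on the two call signatures, either stage could be replaced or retrained without touching the other, which mirrors the independent training the abstract describes.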