Automatically predicting text in images
Abstract:
Systems and methods for detecting and predicting text within images. An image is passed to a feature-extraction module. Each image typically contains at least one text object, and each text object contains at least one character. Based on the image, the feature-extraction module generates at least one feature map indicating the text object(s) in the image. The feature map(s) are then passed to a decoder module. In some implementations, the decoder module applies a weighted mask to the feature map(s). Based on the feature map(s), the decoder module predicts a sequence of characters in the text object(s). In some embodiments, that prediction is based on previously known data. The decoder module is directed by a query that indicates at least one desired characteristic of the text object(s). An output module then refines the predicted content. At least one neural network may be used.
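The decoding step described above can be illustrated with a minimal sketch. This is not the claimed implementation: the abstract does not specify how the weighted mask is computed or how characters are scored, so everything here is an assumption made for illustration. The character set `ALPHABET`, the Gaussian-shaped `weighted_mask`, the `sharpness` parameter, the greedy `decode` loop, and the hand-made `feature_map` (standing in for the output of the feature-extraction module) are all hypothetical. The query and the output module are omitted.

```python
import math

ALPHABET = "abc"  # hypothetical character set, for illustration only


def softmax(xs):
    """Normalize raw scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_mask(num_positions, step, sharpness=4.0):
    """Soft mask over feature-map positions, peaked at the current step.

    Assumption: the mask is a softmax over negative squared distance.
    The abstract only says that a weighted mask is applied.
    """
    return softmax([-sharpness * (pos - step) ** 2 for pos in range(num_positions)])


def decode(feature_map):
    """Greedily predict one character per step from the masked feature map."""
    chars = []
    for step in range(len(feature_map)):
        mask = weighted_mask(len(feature_map), step)
        # Collapse the masked feature map to one score per character.
        scores = [
            sum(w * row[k] for w, row in zip(mask, feature_map))
            for k in range(len(ALPHABET))
        ]
        chars.append(ALPHABET[scores.index(max(scores))])
    return "".join(chars)


# Hand-made stand-in for the feature-extraction output: one row per
# horizontal position, one score per character in ALPHABET.
feature_map = [
    [0.1, 0.8, 0.1],    # position 0 looks like "b"
    [0.9, 0.05, 0.05],  # position 1 looks like "a"
    [0.1, 0.1, 0.8],    # position 2 looks like "c"
]

print(decode(feature_map))  # prints "bac"
```

In a system of the kind described, a neural network would presumably produce the feature map and learn the mask, and the query would steer which text objects are decoded. The sketch only shows how a per-step weighting over positions can turn a feature map into a character sequence.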