Cyclical object segmentation neural networks

    Publication number: US11741611B2

    Publication date: 2023-08-29

    Application number: US17584988

    Filing date: 2022-01-26

    Applicant: Adobe Inc.

    Inventor: Ning Xu

    CPC classification number: G06T7/10 G06T2207/20081 G06T2207/20084

    Abstract: Introduced here are computer programs and associated computer-implemented techniques for training and then applying computer-implemented models designed for segmentation of an object in the frames of a video. By training and then applying the segmentation model in a cyclical manner, the errors encountered when performing segmentation can be eliminated rather than propagated. In particular, the approach to segmentation described herein allows the relationship between a reference mask and each target frame for which a mask is to be produced to be explicitly bridged or established. Such an approach ensures that masks are accurate, which in turn means that the segmentation model is less prone to distractions.

    EVENT UNDERSTANDING WITH DEEP LEARNING

    Publication number: US20230127652A1

    Publication date: 2023-04-27

    Application number: US17452143

    Filing date: 2021-10-25

    Applicant: ADOBE INC.

    Abstract: Systems and methods for natural language processing are described. One or more embodiments of the present disclosure generate a word representation vector for each word of a text comprising an event trigger word and an argument candidate word; generate a dependency tree based on the text and the word representation vector; determine that at least one word of the text is independent of a relationship between the event trigger word and the argument candidate word; remove the at least one word from the dependency tree based on the determination to obtain a pruned dependency tree; generate a modified representation vector for each word of the pruned dependency tree using a graph convolutional network (GCN); and identify the relationship between the event trigger word and the argument candidate word based on the modified representation vector for each word of the pruned dependency tree.
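The pruning step in this abstract can be approximated as keeping only the words that lie on the dependency path between the event trigger word and the argument candidate word, then discarding every edge that touches a removed word. This is a minimal illustrative sketch of that idea, not the patented criterion; the function name and the path-only heuristic are assumptions.

```python
from collections import deque

def prune_dependency_tree(edges, trigger, argument):
    """Keep only words on the dependency path between the trigger
    and the argument candidate; all other words are treated as
    independent of their relationship and removed.
    `edges` is a list of (head, dependent) pairs over word indices."""
    # Build an undirected adjacency map over the dependency tree.
    adj = {}
    for head, dep in edges:
        adj.setdefault(head, set()).add(dep)
        adj.setdefault(dep, set()).add(head)
    # Breadth-first search from the trigger toward the argument.
    parent = {trigger: None}
    queue = deque([trigger])
    while queue:
        node = queue.popleft()
        if node == argument:
            break
        for nxt in adj.get(node, ()):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    # Walk back from the argument to recover the kept path.
    keep, node = set(), argument
    while node is not None:
        keep.add(node)
        node = parent[node]
    return [(h, d) for h, d in edges if h in keep and d in keep]
```

In the full system the pruned tree would then be encoded with a GCN; here the pruning alone is shown because it is the step the abstract makes explicit.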

    Image object segmentation based on temporal information

    Publication number: US11379987B2

    Publication date: 2022-07-05

    Application number: US17020023

    Filing date: 2020-09-14

    Applicant: ADOBE INC.

    Abstract: A temporal object segmentation system determines a location of an object depicted in a video. In some cases, the temporal object segmentation system determines the object's location in a particular frame of the video based on information indicating a previous location of the object in a previous video frame. For example, an encoder neural network in the temporal object segmentation system extracts features describing image attributes of a video frame. A convolutional long-short term memory neural network determines the location of the object in the frame, based on the extracted image attributes and information indicating a previous location in a previous frame. A decoder neural network generates an image mask indicating the object's location in the frame. In some cases, a video editing system receives multiple generated masks for a video, and modifies one or more video frames based on the locations indicated by the masks.
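The encoder / recurrent-cell / decoder flow described above can be sketched as a per-frame recurrence in which the hidden state carries the object's previous location forward. The callables below are toy stand-ins for the encoder, ConvLSTM, and decoder networks; none of the names come from the patent.

```python
def segment_video(frames, encode, recur, decode, h0):
    """Structural sketch: for each frame, extract features, fold them
    into recurrent state that remembers the previous location, and
    decode that state into an image mask."""
    masks, h = [], h0
    for frame in frames:
        feats = encode(frame)   # image attributes of this frame
        h = recur(feats, h)     # combine with previous-location state
        masks.append(decode(h)) # mask indicating the object's location
    return masks
```

With real networks, `encode`, `recur`, and `decode` would be the encoder CNN, the convolutional LSTM cell, and the decoder; here simple arithmetic functions can be substituted to see the data flow.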

    Patch-based image matting using deep learning

    Publication number: US11308628B2

    Publication date: 2022-04-19

    Application number: US16847819

    Filing date: 2020-04-14

    Applicant: ADOBE INC.

    Inventor: Ning Xu

    Abstract: Methods and systems are provided for generating mattes for input images. A neural network system is trained to generate a matte for an input image utilizing contextual information within the image. Patches from the image and a corresponding trimap are extracted, and alpha values for each individual image patch are predicted based on correlations of features in different regions within the image patch. Predicting alpha values for an image patch may also be based on contextual information from other patches extracted from the same image. This contextual information may be determined by determining correlations between features in the query patch and context patches. The predicted alpha values for an image patch form a matte patch, and all matte patches generated for the patches are stitched together to form an overall matte for the input image.
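The extract-then-stitch framing of this abstract is easy to make concrete. The sketch below tiles an image into non-overlapping patches and reassembles per-patch mattes into one overall matte; the actual system predicts alpha values per patch with a neural network and may use overlapping patches and context patches, which this illustration omits.

```python
def split_into_patches(image, size):
    """Tile a 2-D image (list of rows) into non-overlapping
    size x size patches, left-to-right, top-to-bottom."""
    h, w = len(image), len(image[0])
    patches = []
    for r in range(0, h, size):
        for c in range(0, w, size):
            patches.append([row[c:c + size] for row in image[r:r + size]])
    return patches

def stitch_patches(patches, h, w, size):
    """Reassemble matte patches, in the same tiling order,
    into a single h x w matte."""
    matte = [[0] * w for _ in range(h)]
    i = 0
    for r in range(0, h, size):
        for c in range(0, w, size):
            patch = patches[i]; i += 1
            for dr, prow in enumerate(patch):
                for dc, val in enumerate(prow):
                    matte[r + dr][c + dc] = val
    return matte
```

A per-patch alpha predictor would be applied between the two calls; with an identity predictor, stitching the split patches reproduces the input, which is the round-trip property the stitching step must preserve.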

    Space-time memory network for locating target object in video content

    Publication number: US11200424B2

    Publication date: 2021-12-14

    Application number: US16293126

    Filing date: 2019-03-05

    Applicant: Adobe Inc.

    Abstract: Certain aspects involve using a space-time memory network to locate one or more target objects in video content for segmentation or other object classification. In one example, a video editor generates a query key map and a query value map by applying a space-time memory network to features of a query frame from video content. The video editor retrieves a memory key map and a memory value map that are computed, with the space-time memory network, from a set of memory frames from the video content. The video editor computes memory weights by applying a similarity function to the memory key map and the query key map. The video editor classifies content in the query frame as depicting the target feature using a weighted summation that includes the memory weights applied to memory locations in the memory value map.
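The memory-read step the abstract walks through — similarity between query key and memory keys, then a weighted summation over memory values — is a standard attention read. A minimal sketch over flat vectors, assuming dot-product similarity with a softmax (the patent's similarity function may differ):

```python
import math

def memory_read(query_key, memory_keys, memory_values):
    """Compute memory weights from key similarity, then read the
    memory value map as a weighted summation over memory locations."""
    # Dot-product similarity between the query key and each memory key.
    scores = [sum(q * k for q, k in zip(query_key, mk)) for mk in memory_keys]
    # Softmax turns similarities into normalized memory weights.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted summation of the memory values.
    dim = len(memory_values[0])
    read = [sum(w * mv[d] for w, mv in zip(weights, memory_values))
            for d in range(dim)]
    return read, weights
```

In the real network this read happens per spatial location of the query key map, and the result feeds the classification of query-frame content as the target object.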

    Interactive image matting using neural networks

    Publication number: US11004208B2

    Publication date: 2021-05-11

    Application number: US16365213

    Filing date: 2019-03-26

    Applicant: Adobe Inc.

    Abstract: Techniques are disclosed for deep neural network (DNN) based interactive image matting. A methodology implementing the techniques according to an embodiment includes generating, by the DNN, an alpha matte associated with an image, based on user-specified foreground region locations in the image. The method further includes applying a first DNN subnetwork to the image, the first subnetwork trained to generate a binary mask based on the user input, the binary mask designating pixels of the image as background or foreground. The method further includes applying a second DNN subnetwork to the generated binary mask, the second subnetwork trained to generate a trimap based on the user input, the trimap designating pixels of the image as background, foreground, or uncertain status. The method further includes applying a third DNN subnetwork to the generated trimap, the third subnetwork trained to generate the alpha matte based on the user input.
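In the patent, the binary-mask-to-trimap step is a learned subnetwork; to illustrate what a trimap is, the hand-coded stand-in below marks every pixel within a fixed band of the foreground/background boundary as uncertain. The band heuristic is purely illustrative, not the trained behavior.

```python
def binary_mask_to_trimap(mask, band=1):
    """Label pixels background (0), foreground (1), or uncertain (2):
    a pixel becomes uncertain if any pixel within `band` steps of it
    has the opposite binary label."""
    h, w = len(mask), len(mask[0])
    trimap = [row[:] for row in mask]
    for r in range(h):
        for c in range(w):
            label = mask[r][c]
            for dr in range(-band, band + 1):
                for dc in range(-band, band + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w and mask[rr][cc] != label:
                        trimap[r][c] = 2  # near the boundary: uncertain
    return trimap
```

The third subnetwork then only needs to resolve alpha values inside the uncertain band, which is what makes the trimap a useful intermediate.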

    DEEP LEARNING TAG-BASED FONT RECOGNITION UTILIZING FONT CLASSIFICATION

    Publication number: US20210103783A1

    Publication date: 2021-04-08

    Application number: US17101778

    Filing date: 2020-11-23

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to a tag-based font recognition system that utilizes a multi-learning framework to develop and improve tag-based font recognition using deep learning neural networks. In particular, the tag-based font recognition system jointly trains a font tag recognition neural network with an implicit font classification attention model to generate font tag probability vectors that are enhanced by implicit font classification information. Indeed, the font recognition system weights the hidden layers of the font tag recognition neural network with implicit font information to improve the accuracy and predictability of the font tag recognition neural network, which results in improved retrieval of fonts in response to a font tag query. Accordingly, using the enhanced tag probability vectors, the tag-based font recognition system can accurately identify and recommend one or more fonts in response to a font tag query.
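The weighting of hidden layers by implicit font information can be pictured as scaling a hidden activation vector by an attention vector before scoring each font tag. This is a minimal sketch assuming elementwise attention and a sigmoid per tag (multi-label scoring); the names and the exact weighting scheme are illustrative, not taken from the patent.

```python
import math

def font_tag_scores(hidden, attention, tag_weights):
    """Scale the hidden vector by the implicit-font attention vector,
    then score each font tag with a sigmoid over a linear projection."""
    weighted = [h * a for h, a in zip(hidden, attention)]
    scores = []
    for w_row in tag_weights:  # one weight row per font tag
        z = sum(w * x for w, x in zip(w_row, weighted))
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores
```

The resulting tag probability vector is what a font tag query would be matched against when retrieving fonts.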

    Line drawing generation

    Publication number: US10922860B2

    Publication date: 2021-02-16

    Application number: US16410854

    Filing date: 2019-05-13

    Applicant: Adobe Inc.

    Abstract: Computing systems and computer-implemented methods can be used for automatically generating a digital line drawing of the contents of a photograph. In various examples, these techniques include use of a neural network, referred to as a generator network, that is trained on a dataset of photographs and human-generated line drawings of those photographs. The training dataset teaches the neural network to trace the edges and features of objects in the photographs, as well as which edges or features can be ignored. The output of the generator network is a two-tone digital image, where the background of the image is one tone, and the contents of the input photograph are represented by lines drawn in the second tone. In some examples, a second neural network, referred to as a restorer network, can further process the output of the generator network to remove visual artifacts and clean up the lines.

    Segmenting objects in video sequences

    Publication number: US10810435B2

    Publication date: 2020-10-20

    Application number: US16183560

    Filing date: 2018-11-07

    Applicant: Adobe Inc.

    Abstract: In implementations of segmenting objects in video sequences, user annotations designate an object in any image frame of a video sequence, without requiring user annotations for all image frames. An interaction network generates a mask for an object in an image frame annotated by a user, and is coupled both internally and externally to a propagation network that propagates the mask to other image frames of the video sequence. Feature maps are aggregated for each round of user annotations and couple the interaction network and the propagation network internally. The interaction network and the propagation network are trained jointly using synthetic annotations in a multi-round training scenario, in which weights of the interaction network and the propagation network are adjusted after multiple synthetic annotations are processed, resulting in a trained object segmentation system that can reliably generate realistic object masks.
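The multi-round loop the abstract describes — user annotates one frame, the interaction network refines that frame's mask, the propagation network carries it to the remaining frames — can be sketched structurally. The callables below are toy stand-ins for the user and the two networks; the internal feature-map coupling between the networks is not modeled here.

```python
def interactive_segmentation(frames, annotate, interact, propagate, rounds):
    """Multi-round sketch: each round, the user annotates one frame,
    the interaction network produces that frame's mask, and the
    propagation network spreads it to the other frames."""
    masks = [None] * len(frames)
    for _ in range(rounds):
        idx, annotation = annotate(masks)                  # user picks a frame
        masks[idx] = interact(frames[idx], annotation, masks[idx])
        for i in range(len(frames)):
            if i != idx:                                   # propagate outward
                masks[i] = propagate(frames[i], masks[idx], masks[i])
    return masks
```

In training, per the abstract, the annotations are synthetic and the two networks' weights are updated jointly only after several such rounds have been processed.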