PERFORMING GLOBAL IMAGE EDITING USING EDITING OPERATIONS DETERMINED FROM NATURAL LANGUAGE REQUESTS

    Publication No.: US20220399017A1

    Publication Date: 2022-12-15

    Application No.: US17374103

    Filing Date: 2021-07-13

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that utilize a neural network having a long short-term memory encoder-decoder architecture to progressively modify a digital image in accordance with a natural language request. For example, in one or more embodiments, the disclosed systems utilize a language-to-operation decoding cell of a language-to-operation neural network to sequentially determine one or more image-modification operations to perform to modify a digital image in accordance with a natural language request. In some cases, the decoding cell determines an image-modification operation to perform partly based on the previously used image-modification operations. The disclosed systems further utilize the decoding cell to determine one or more operation parameters for each selected image-modification operation. The disclosed systems utilize the image-modification operation(s) and operation parameter(s) to modify the digital image (e.g., by generating one or more modified digital images) via the decoding cell.
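
The sequential decode-and-apply loop described in this abstract can be illustrated with a toy sketch. It is not the patented network: the keyword table `KEYWORDS`, the two operations, and the function names are hypothetical stand-ins for the learned LSTM decoding cell. The sketch only shows the control flow, in which each step picks an operation and a parameter conditioned on the request and on the operations already used, applies it, and records the intermediate image.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]  # grayscale pixels in [0, 1]


def clamp(v: float) -> float:
    return max(0.0, min(1.0, v))


def brightness(img: Image, amount: float) -> Image:
    # Global edit: shift every pixel by the same amount.
    return [[clamp(p + amount) for p in row] for row in img]


def contrast(img: Image, factor: float) -> Image:
    # Global edit: scale every pixel's distance from mid-gray.
    return [[clamp((p - 0.5) * factor + 0.5) for p in row] for row in img]


OPERATIONS: Dict[str, Callable[[Image, float], Image]] = {
    "brightness": brightness,
    "contrast": contrast,
}

# Hypothetical lookup standing in for the learned decoder:
# request keyword -> (operation name, operation parameter).
KEYWORDS: Dict[str, Tuple[str, float]] = {
    "brighter": ("brightness", 0.2),
    "darker": ("brightness", -0.2),
    "contrast": ("contrast", 1.5),
}


def decode_step(request: str, used: List[str]) -> Tuple[str, float]:
    """Choose the next operation and parameter, skipping operations already used."""
    for word in request.lower().split():
        choice = KEYWORDS.get(word.strip(".,"))
        if choice is not None and choice[0] not in used:
            return choice
    return ("END", 0.0)


def edit(img: Image, request: str, max_steps: int = 5) -> Tuple[List[str], List[Image]]:
    """Apply operations one at a time, keeping every intermediate image."""
    used: List[str] = []
    history: List[Image] = [img]
    for _ in range(max_steps):
        op, param = decode_step(request, used)
        if op == "END":
            break
        img = OPERATIONS[op](img, param)
        used.append(op)
        history.append(img)
    return used, history


used, history = edit([[0.5, 0.25]], "Make it brighter with more contrast")
print(used)
```

In a learned system the keyword lookup would be replaced by a decoder that scores operations from an encoded request and its own hidden state; the loop structure is the part this sketch is meant to convey.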

    TRAINING NEURAL NETWORKS TO PERFORM TAG-BASED FONT RECOGNITION UTILIZING FONT CLASSIFICATION

    Publication No.: US20220148325A1

    Publication Date: 2022-05-12

    Application No.: US17584962

    Filing Date: 2022-01-26

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to a tag-based font recognition system that utilizes a multi-learning framework to develop and improve tag-based font recognition using deep learning neural networks. In particular, the tag-based font recognition system jointly trains a font tag recognition neural network with an implicit font classification attention model to generate font tag probability vectors that are enhanced by implicit font classification information. Indeed, the font recognition system weights the hidden layers of the font tag recognition neural network with implicit font information to improve the accuracy and predictability of the font tag recognition neural network, which results in improved retrieval of fonts in response to a font tag query. Accordingly, using the enhanced tag probability vectors, the tag-based font recognition system can accurately identify and recommend one or more fonts in response to a font tag query.

    CYCLICAL OBJECT SEGMENTATION NEURAL NETWORKS

    Publication No.: US20220148183A1

    Publication Date: 2022-05-12

    Application No.: US17584988

    Filing Date: 2022-01-26

    Applicant: Adobe Inc.

    Inventor: Ning Xu

    Abstract: Introduced here are computer programs and associated computer-implemented techniques for training and then applying computer-implemented models designed for segmentation of an object in the frames of video. By training and then applying the segmentation model in a cyclical manner, the errors encountered when performing segmentation can be eliminated rather than propagated. In particular, the approach to segmentation described herein allows the relationship between a reference mask and each target frame for which a mask is to be produced to be explicitly bridged or established. Such an approach ensures that masks are accurate, which in turn means that the segmentation model is less prone to distractions.

    Image Object Segmentation Based on Temporal Information

    Publication No.: US20200034971A1

    Publication Date: 2020-01-30

    Application No.: US16047492

    Filing Date: 2018-07-27

    Applicant: Adobe Inc.

    Abstract: A temporal object segmentation system determines a location of an object depicted in a video. In some cases, the temporal object segmentation system determines the object's location in a particular frame of the video based on information indicating a previous location of the object in a previous video frame. For example, an encoder neural network in the temporal object segmentation system extracts features describing image attributes of a video frame. A convolutional long short-term memory neural network determines the location of the object in the frame, based on the extracted image attributes and information indicating a previous location in a previous frame. A decoder neural network generates an image mask indicating the object's location in the frame. In some cases, a video editing system receives multiple generated masks for a video, and modifies one or more video frames based on the locations indicated by the masks.
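
The idea of using the previous frame's location to segment the current frame can be sketched without any neural network. In this toy version, which is an assumption-laden illustration rather than the patented system, a threshold plays the role of the encoder, a one-pixel dilation of the previous mask plays the role of the recurrent state, and frames are one-dimensional lists of intensities. All function names are hypothetical.

```python
from typing import List

Frame = List[float]
Mask = List[int]


def encode(frame: Frame, threshold: float = 0.5) -> Mask:
    """Stand-in for the encoder: mark pixels that look like the object."""
    return [1 if p > threshold else 0 for p in frame]


def dilate(mask: Mask, radius: int = 1) -> Mask:
    """Grow the previous mask so the object may move by up to `radius` pixels."""
    n = len(mask)
    return [1 if any(mask[max(0, i - radius): i + radius + 1]) else 0 for i in range(n)]


def segment_video(frames: List[Frame], initial_mask: Mask) -> List[Mask]:
    """Produce one mask per frame, each conditioned on the previous mask."""
    masks: List[Mask] = []
    prev = initial_mask
    for frame in frames:
        candidates = encode(frame)
        near_prev = dilate(prev)
        # Keep only object-like pixels close to the previous location.
        # If this ever comes out empty, the toy tracker has lost the object.
        mask = [c & n for c, n in zip(candidates, near_prev)]
        masks.append(mask)
        prev = mask
    return masks


frames = [
    [0.0, 0.9, 0.0, 0.0, 0.0, 0.9],  # object at index 1, distractor at index 5
    [0.0, 0.0, 0.9, 0.0, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.9, 0.0, 0.9],
]
for mask in segment_video(frames, [0, 1, 0, 0, 0, 0]):
    print(mask)
```

The bright distractor at index 5 passes the threshold in every frame but is never selected, because it is never adjacent to the previously predicted location.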

    Modifying digital images utilizing a language guided image editing model

    Publication No.: US12248796B2

    Publication Date: 2025-03-11

    Application No.: US17384109

    Filing Date: 2021-07-23

    Applicant: Adobe Inc.

    Inventor: Ning Xu, Zhe Lin

    Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that perform language guided digital image editing utilizing a cycle-augmentation generative-adversarial neural network (CAGAN) that is augmented using a cross-modal cyclic mechanism. For example, the disclosed systems generate an editing description network that generates language embeddings which represent image transformations applied between a digital image and a modified digital image. The disclosed systems can further train a GAN to generate modified images by providing an input image and natural language embeddings generated by the editing description network (representing various modifications to the digital image from a ground truth modified image). In some instances, the disclosed systems also utilize an image request attention approach with the GAN to generate images that include adaptive edits in different spatial locations of the image.

    DIGITAL IMAGE INPAINTING UTILIZING GLOBAL AND LOCAL MODULATION LAYERS OF AN INPAINTING NEURAL NETWORK

    Publication No.: US20250054116A1

    Publication Date: 2025-02-13

    Application No.: US18929330

    Filing Date: 2024-10-28

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that generate inpainted digital images utilizing a cascaded modulation inpainting neural network. For example, the disclosed systems utilize a cascaded modulation inpainting neural network that includes cascaded modulation decoder layers. In one or more decoder layers, the disclosed systems start with global code modulation that captures the global-range image structures, followed by an additional modulation that refines the global predictions. Accordingly, in one or more implementations, the image inpainting system provides a mechanism to correct distorted local details. Furthermore, in one or more implementations, the image inpainting system leverages fast Fourier convolution blocks within different resolution layers of the encoder architecture to expand the receptive field of the encoder and to allow the network encoder to better capture global structure.

    Event understanding with deep learning

    Publication No.: US12019982B2

    Publication Date: 2024-06-25

    Application No.: US17452143

    Filing Date: 2021-10-25

    Applicant: ADOBE INC.

    Abstract: Systems and methods for natural language processing are described. One or more embodiments of the present disclosure generate a word representation vector for each word of a text comprising an event trigger word and an argument candidate word; generate a dependency tree based on the text and the word representation vector; determine that at least one word of the text is independent of a relationship between the event trigger word and the argument candidate word; remove the at least one word from the dependency tree based on the determination to obtain a pruned dependency tree; generate a modified representation vector for each word of the pruned dependency tree using a graph convolutional network (GCN); and identify the relationship between the event trigger word and the argument candidate word based on the modified representation vector for each word of the pruned dependency tree.
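
The pruning step in this abstract can be illustrated with a small sketch. The abstract does not say how a word is judged independent of the trigger-argument relationship, so the rule used here is an assumption: keep only the words on the dependency path between the trigger and the argument candidate. The tree is represented as a list of head indices, and the function names are hypothetical. The GCN stage that follows pruning is not shown.

```python
from typing import Dict, List, Optional


def path_to_root(heads: List[Optional[int]], node: int) -> List[int]:
    """Return the chain of nodes from `node` up to the root (head is None)."""
    path = [node]
    while heads[path[-1]] is not None:
        path.append(heads[path[-1]])
    return path


def prune_tree(heads: List[Optional[int]], trigger: int, argument: int) -> Dict[int, Optional[int]]:
    """Keep only words on the trigger-argument dependency path.

    Returns a mapping from each kept word index to its head index; the
    lowest common ancestor becomes the root of the pruned tree.
    """
    up_trigger = path_to_root(heads, trigger)
    up_argument = path_to_root(heads, argument)
    shared = set(up_argument)
    # First node on the trigger's upward path that is also above the argument.
    lca = next(n for n in up_trigger if n in shared)
    keep = set(up_trigger[: up_trigger.index(lca) + 1])
    keep |= set(up_argument[: up_argument.index(lca) + 1])
    return {i: (None if i == lca else heads[i]) for i in sorted(keep)}


# "The police arrested the suspect yesterday"
words = ["The", "police", "arrested", "the", "suspect", "yesterday"]
heads = [1, 2, None, 4, 2, 2]  # head index per word; "arrested" is the root

pruned = prune_tree(heads, trigger=2, argument=4)
print([words[i] for i in pruned])
```

Here the trigger is "arrested" and the argument candidate is "suspect", so the determiners, the subject, and the adverb are dropped before any graph-based encoding would run.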
