-
Publication No.: US11670023B2
Publication date: 2023-06-06
Application No.: US17007693
Filing date: 2020-08-31
Applicant: Adobe Inc.
Inventor: Ning Xu, Trung Bui, Jing Shi, Franck Dernoncourt
CPC classification number: G06T11/60, G10L15/16, G10L15/22, G10L2015/223
Abstract: This disclosure involves executing artificial intelligence models that infer image editing operations from natural language requests spoken by a user. Further, this disclosure performs the inferred image editing operations using inferred parameters for the image editing operations. Systems and methods may be provided that infer one or more image editing operations from a natural language request associated with a source image, locate areas of the source image that are relevant to the one or more image editing operations to generate image masks, and perform the one or more image editing operations to generate a modified source image.
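As a rough illustration of the mask-then-edit flow this abstract describes, the sketch below applies one already-inferred operation only inside a given mask. The function name `apply_masked_brightness`, the choice of a brightness operation, and the grayscale nested-list image format are assumptions made for illustration, not details taken from the patent.

```python
def apply_masked_brightness(image, mask, delta):
    """Add `delta` to every pixel whose mask flag is 1; leave the rest unchanged.

    image: rows of grayscale ints in 0..255 (hypothetical format)
    mask:  rows of 0/1 flags with the same shape as `image`
    """
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    # Build a new image so the source image is not modified in place.
    return [
        [min(255, max(0, px + delta)) if flag else px for px, flag in zip(row, mrow)]
        for row, mrow in zip(image, mask)
    ]


source = [[10, 20], [250, 40]]
mask = [[1, 0], [1, 0]]  # only the left column is relevant to the request
modified = apply_masked_brightness(source, mask, 30)
```

In a full system the mask and the value of `delta` would come from the inference models; here they are supplied by hand.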
-
42.
Publication No.: US20220399017A1
Publication date: 2022-12-15
Application No.: US17374103
Filing date: 2021-07-13
Applicant: Adobe Inc.
Inventor: Ning Xu, Jing Shi, Franck Dernoncourt, Trung Bui
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that utilize a neural network having a long short-term memory encoder-decoder architecture to progressively modify a digital image in accordance with a natural language request. For example, in one or more embodiments, the disclosed systems utilize a language-to-operation decoding cell of a language-to-operation neural network to sequentially determine one or more image-modification operations to perform to modify a digital image in accordance with a natural language request. In some cases, the decoding cell determines an image-modification operation to perform partly based on the previously used image-modification operations. The disclosed systems further utilize the decoding cell to determine one or more operation parameters for each selected image-modification operation. The disclosed systems utilize the image-modification operation(s) and operation parameter(s) to modify the digital image (e.g., by generating one or more modified digital images) via the decoding cell.
-
43.
Publication No.: US20220148325A1
Publication date: 2022-05-12
Application No.: US17584962
Filing date: 2022-01-26
Applicant: Adobe Inc.
Inventor: Zhaowen Wang, Tianlang Chen, Ning Xu, Hailin Jin
IPC: G06V30/244, G06K9/62, G06F16/906, G06N3/08, G06F16/903, G06F40/109, G06V10/40
Abstract: The present disclosure relates to a tag-based font recognition system that utilizes a multi-learning framework to develop and improve tag-based font recognition using deep learning neural networks. In particular, the tag-based font recognition system jointly trains a font tag recognition neural network with an implicit font classification attention model to generate font tag probability vectors that are enhanced by implicit font classification information. Indeed, the font recognition system weights the hidden layers of the font tag recognition neural network with implicit font information to improve the accuracy and predictability of the font tag recognition neural network, which results in improved retrieval of fonts in response to a font tag query. Accordingly, using the enhanced tag probability vectors, the tag-based font recognition system can accurately identify and recommend one or more fonts in response to a font tag query.
-
Publication No.: US20220148183A1
Publication date: 2022-05-12
Application No.: US17584988
Filing date: 2022-01-26
Applicant: Adobe Inc.
Inventor: Ning Xu
IPC: G06T7/10
Abstract: Introduced here are computer programs and associated computer-implemented techniques for training and then applying computer-implemented models designed for segmentation of an object in the frames of video. By training and then applying the segmentation model in a cyclical manner, the errors encountered when performing segmentation can be eliminated rather than propagated. In particular, the approach to segmentation described herein allows the relationship between a reference mask and each target frame for which a mask is to be produced to be explicitly bridged or established. Such an approach ensures that masks are accurate, which in turn means that the segmentation model is less prone to distractions.
-
Publication No.: US20200034971A1
Publication date: 2020-01-30
Application No.: US16047492
Filing date: 2018-07-27
Applicant: Adobe Inc.
Inventor: Ning Xu, Brian Price, Scott Cohen
Abstract: A temporal object segmentation system determines a location of an object depicted in a video. In some cases, the temporal object segmentation system determines the object's location in a particular frame of the video based on information indicating a previous location of the object in a previous video frame. For example, an encoder neural network in the temporal object segmentation system extracts features describing image attributes of a video frame. A convolutional long short-term memory neural network determines the location of the object in the frame, based on the extracted image attributes and information indicating a previous location in a previous frame. A decoder neural network generates an image mask indicating the object's location in the frame. In some cases, a video editing system receives multiple generated masks for a video, and modifies one or more video frames based on the locations indicated by the masks.
-
Publication No.: US12248796B2
Publication date: 2025-03-11
Application No.: US17384109
Filing date: 2021-07-23
Applicant: Adobe Inc.
Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that perform language guided digital image editing utilizing a cycle-augmentation generative-adversarial neural network (CAGAN) that is augmented using a cross-modal cyclic mechanism. For example, the disclosed systems generate an editing description network that generates language embeddings which represent image transformations applied between a digital image and a modified digital image. The disclosed systems can further train a GAN to generate modified images by providing an input image and natural language embeddings generated by the editing description network (representing various modifications to the digital image from a ground truth modified image). In some instances, the disclosed systems also utilize an image request attention approach with the GAN to generate images that include adaptive edits in different spatial locations of the image.
-
47.
Publication No.: US20250054116A1
Publication date: 2025-02-13
Application No.: US18929330
Filing date: 2024-10-28
Applicant: Adobe Inc.
Inventor: Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Elya Shechtman, Connelly Barnes, Jianming Zhang, Ning Xu, Sohrab Amirghodsi
IPC: G06T5/77, G06T3/4046, G06V10/40
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that generate inpainted digital images utilizing a cascaded modulation inpainting neural network. For example, the disclosed systems utilize a cascaded modulation inpainting neural network that includes cascaded modulation decoder layers. Specifically, in one or more decoder layers, the disclosed systems start with global code modulation that captures the global-range image structures followed by an additional modulation that refines the global predictions. Accordingly, in one or more implementations, the image inpainting system provides a mechanism to correct distorted local details. Furthermore, in one or more implementations, the image inpainting system leverages fast Fourier convolution blocks within different resolution layers of the encoder architecture to expand the receptive field of the encoder and to allow the network encoder to better capture global structure.
-
Publication No.: US12079901B2
Publication date: 2024-09-03
Application No.: US17649101
Filing date: 2022-01-27
Applicant: ADOBE INC.
Inventor: Ning Xu
CPC classification number: G06T11/00, G06T5/50, G06T5/77, G06T7/11, G06V10/267, G06V10/82, G06T2207/10016, G06T2207/20081, G06T2207/20132, G06T2207/20221, G06T2210/22
Abstract: Systems and methods for image processing are described. Embodiments of the present disclosure identify a first image depicting a first object; identify a plurality of candidate images depicting a second object; select a second image from the plurality of candidate images depicting the second object based on the second image and a sequence of previous images including the first image using a crop selection network trained to select a next compatible image based on the sequence of previous images; and generate a composite image depicting the first object and the second object based on the first image and the second image.
-
Publication No.: US12019982B2
Publication date: 2024-06-25
Application No.: US17452143
Filing date: 2021-10-25
Applicant: ADOBE INC.
Inventor: Amir Pouran Ben Veyseh, Franck Dernoncourt, Ning Xu
IPC: G06F40/211, G06F40/166, G06F40/279, G06F40/30, G06N3/04, G06N3/045
CPC classification number: G06F40/211, G06F40/166, G06F40/279, G06N3/04, G06N3/045, G06F40/30
Abstract: Systems and methods for natural language processing are described. One or more embodiments of the present disclosure generate a word representation vector for each word of a text comprising an event trigger word and an argument candidate word; generate a dependency tree based on the text and the word representation vector; determine that at least one word of the text is independent of a relationship between the event trigger word and the argument candidate word; remove the at least one word from the dependency tree based on the determination to obtain a pruned dependency tree; generate a modified representation vector for each word of the pruned dependency tree using a graph convolutional network (GCN); and identify the relationship between the event trigger word and the argument candidate word based on the modified representation vector for each word of the pruned dependency tree.
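The prune-then-aggregate idea in this abstract can be sketched as follows. The abstract does not say how a word is judged independent of the trigger-argument relationship, so this sketch assumes a simple stand-in rule: keep only the words on the dependency path between the two. The unweighted neighbour averaging is likewise a simplified stand-in for a GCN layer, which would use learned weights. All function names and the example sentence are hypothetical.

```python
from collections import deque


def path_between(heads, a, b):
    """Return the token indices on the tree path from a to b.

    heads[i] is the index of token i's syntactic head, or -1 for the root.
    """
    # Build an undirected adjacency list from the head array.
    adj = {i: set() for i in range(len(heads))}
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child].add(head)
            adj[head].add(child)
    # Breadth-first search from a, recording each node's predecessor.
    prev = {a: None}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            break
        for nxt in adj[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    if b not in prev:
        raise ValueError("tokens are not connected in the dependency tree")
    # Walk back from b to a, then reverse to get the path in a-to-b order.
    path, node = [], b
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def prune_and_aggregate(heads, vectors, trigger, argument):
    """Prune the tree to the trigger-argument path, then average neighbours."""
    kept = path_between(heads, trigger, argument)
    kept_set = set(kept)
    updated = {}
    for i in kept:
        # Neighbourhood in the pruned tree: the node itself, its head,
        # and its children, restricted to the kept words.
        neighbours = [i]
        if heads[i] in kept_set:
            neighbours.append(heads[i])
        neighbours += [c for c in kept if heads[c] == i]
        dim = len(vectors[i])
        updated[i] = [
            sum(vectors[n][d] for n in neighbours) / len(neighbours)
            for d in range(dim)
        ]
    return kept, updated


# "The rebels attacked the village yesterday": trigger = attacked, argument = village
heads = [1, 2, -1, 4, 2, 2]
vectors = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
kept, updated = prune_and_aggregate(heads, vectors, trigger=2, argument=4)
```

A relation classifier would then read the updated vectors of the trigger and argument words; that step is omitted here.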
-
Publication No.: US12008739B2
Publication date: 2024-06-11
Application No.: US17452529
Filing date: 2021-10-27
Applicant: ADOBE INC.
Inventor: Ning Xu, Zhe Lin, Franck Dernoncourt
CPC classification number: G06T5/77, G06N3/08, G06T5/50, G06T5/90, G06T11/60, G10L15/22, G06T2207/20081, G10L2015/223
Abstract: The present disclosure relates to systems and methods for automatically processing images based on a user request. In some examples, a request is divided into a retouching command (e.g., a global edit) and an inpainting command (e.g., a local edit). A retouching mask and an inpainting mask are generated to indicate areas where the edits will be applied. A photo-request attention and a multi-modal modulation process are applied to features representing the image, and a modified image that incorporates the user's request is generated using the modified features.
-