-
Publication No.: US11709873B2
Publication date: 2023-07-25
Application No.: US16741625
Application date: 2020-01-13
Applicant: Adobe Inc.
Inventor: Jinfeng Xiao, Lidan Wang, Franck Dernoncourt, Trung Bui, Tong Sun
IPC: G06F16/33, G06F16/953
CPC classification number: G06F16/3347, G06F16/953
Abstract: Techniques and systems are provided for predicting answers in response to one or more input queries. For instance, text from a corpus of text can be processed by a reader to generate one or multiple question and answer spaces. A question and answer space can include answerable questions and the answers associated with the questions (referred to as “question and answer pairs”). A query defining a question can be received (e.g., from a user input device) and processed by a retriever portion of the system. The retriever portion of the system can retrieve an answer to the question from the one or more pre-constructed question and answer spaces, and/or can determine an answer by comparing one or more answers retrieved from the one or more pre-constructed question and answer spaces to an answer generated by a retriever-reader system.
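The retrieval step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how a query is matched against a pre-constructed question and answer space, so the token-overlap (Jaccard) similarity, the `min_score` cutoff, and the toy `QA_SPACE` data below are all assumptions made for illustration, not the patented method.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def retrieve_answer(query, qa_space, min_score=0.3):
    """Return (answer, score) for the best-matching stored question.

    The answer is None when no stored question reaches min_score, which is
    where a fuller system could fall back to a retriever-reader pipeline.
    """
    query_tokens = tokenize(query)
    best_answer, best_score = None, 0.0
    for question, answer in qa_space:
        score = jaccard(query_tokens, tokenize(question))
        if score > best_score:
            best_answer, best_score = answer, score
    if best_score < min_score:
        return None, best_score
    return best_answer, best_score

# Hypothetical pre-constructed question and answer space (toy data).
QA_SPACE = [
    ("What is the return window?", "30 days"),
    ("Where is the support office located?", "Building 4, second floor"),
]

if __name__ == "__main__":
    print(retrieve_answer("Where is the support office located?", QA_SPACE))
    print(retrieve_answer("How tall is the mountain?", QA_SPACE))
```

A learned embedding similarity would be the more likely choice in practice; Jaccard overlap is used here only to keep the sketch dependency-free.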
-
Publication No.: US20220382975A1
Publication date: 2022-12-01
Application No.: US17333892
Application date: 2021-05-28
Applicant: Adobe Inc.
Inventor: Jiuxiang Gu, Vlad Morariu, Varun Manjunatha, Tong Sun, Rajiv Jain, Peizhao Li, Jason Kuen, Handong Zhao
IPC: G06F40/279, G06N3/04, G06N3/08, G06F16/93, G06F40/30, G06F40/205
Abstract: One example method involves operations for a processing device that include receiving, by a machine learning model trained to generate a search result, a search query for a text input. The machine learning model is trained by receiving pre-training data that includes multiple documents. Pre-training the machine learning model includes generating, using an encoder, feature embeddings for each of the documents included in the pre-training data. The feature embeddings are generated by applying a masking function to visual and textual features in the documents. Training the machine learning model also includes generating, using the feature embeddings, output features for the documents by concatenating the feature embeddings and applying a non-linear mapping to the feature embeddings. Training the machine learning model further includes applying a linear classifier to the output features. Additionally, the operations include generating, for display, a search result using the machine learning model based on the input.
-
Publication No.: US20210303779A1
Publication date: 2021-09-30
Application No.: US16834940
Application date: 2020-03-30
Applicant: Adobe Inc.
Inventor: Tong Sun, Qi Sun, Jing Qian, Curtis Michael Wigington
IPC: G06F40/171, G06K9/00, G06F3/0484, G06F3/0488, G06F40/169, G06T7/20, G06K9/20, G06F40/197
Abstract: Techniques are disclosed for sharing user markings between digital documents and corresponding physically printed documents. The sharing is facilitated using an Augmented Reality (AR) device, such as a smartphone or a tablet. The device streams images of a page of a book on a display. The device accesses a corresponding digital document that is a digital version of content printed on the book. In an example, the digital document has a digital user marking, e.g., a comment associated with a paragraph of the digital document, wherein a corresponding paragraph of the physical book lacks any such comment. When the device streams the images of the page of the book on the display, the device appends the digital comment on the paragraph of the page of the book within the image stream. Thus, the user can view the digital comment in the AR environment, while reading the physical book.
-
Publication No.: US20210216577A1
Publication date: 2021-07-15
Application No.: US16741625
Application date: 2020-01-13
Applicant: Adobe Inc.
Inventor: Jinfeng Xiao, Lidan Wang, Franck Dernoncourt, Trung Bui, Tong Sun
IPC: G06F16/33, G06F16/953
Abstract: Techniques and systems are provided for predicting answers in response to one or more input queries. For instance, text from a corpus of text can be processed by a reader to generate one or multiple question and answer spaces. A question and answer space can include answerable questions and the answers associated with the questions (referred to as “question and answer pairs”). A query defining a question can be received (e.g., from a user input device) and processed by a retriever portion of the system. The retriever portion of the system can retrieve an answer to the question from the one or more pre-constructed question and answer spaces, and/or can determine an answer by comparing one or more answers retrieved from the one or more pre-constructed question and answer spaces to an answer generated by a retriever-reader system.
-
Publication No.: US20240104951A1
Publication date: 2024-03-28
Application No.: US17947737
Application date: 2022-09-19
Applicant: ADOBE INC.
Inventor: Jiuxiang Gu, Vlad Morariu, Tong Sun, Jason Wen Yong Kuen, Ani Nenkova
IPC: G06V30/412, G06V30/262, G06V30/414
CPC classification number: G06V30/412, G06V30/262, G06V30/414
Abstract: In various examples, a table recognition machine learning model receives an image of a table and generates, using a first encoder of the table recognition machine learning model, an image feature vector including features extracted from the image of the table; generates, using a first decoder of the table recognition machine learning model and the image feature vector, a set of coordinates within the image representing rows and columns associated with the table; generates, using a second decoder of the table recognition machine learning model and the image feature vector, a set of bounding boxes and semantic features associated with cells of the table; and then determines, using a third decoder of the table recognition machine learning model, a table structure associated with the table using the image feature vector, the set of coordinates, the set of bounding boxes, and the semantic features.
-
Publication No.: US20240056309A1
Publication date: 2024-02-15
Application No.: US17819540
Application date: 2022-08-12
Applicant: Adobe Inc.
Inventor: Songlin He, Tong Sun, Nedim Lipka, Curtis Wigington, Rajiv Jain, Anindo Roy
IPC: H04L9/32, G06F21/31, G06F40/174
CPC classification number: H04L9/3247, G06F21/31, G06F40/174
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that fill in digital documents using user identity models of client devices. For instance, in one or more embodiments, the disclosed systems receive a digital document comprising a digital fillable field. The disclosed systems further retrieve, for a client device associated with the digital document, a decentralized identity credential comprising a user attribute established under a decentralized identity framework. Using the user attribute of the decentralized identity credential, the disclosed systems modify the digital document by filling in the digital fillable field.
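The field-filling step in this abstract can be sketched as a simple lookup. The abstract does not define the credential format or how fields are matched to attributes, so the plain-dictionary representation, the exact-name matching, and the function name `fill_fields` are hypothetical. Retrieval and signature verification of the decentralized identity credential are left out entirely.

```python
def fill_fields(document_fields, credential_attributes):
    """Return a copy of the form with empty fields filled from the credential.

    document_fields maps field names to values, with None marking an unfilled
    field. A field is filled only when the credential holds an attribute with
    exactly the same name (an assumption of this sketch).
    """
    filled = dict(document_fields)
    for name, value in document_fields.items():
        if value is None and name in credential_attributes:
            filled[name] = credential_attributes[name]
    return filled

if __name__ == "__main__":
    form = {"full_name": None, "email": None, "signature_date": "2024-01-01"}
    credential = {"full_name": "Jane Doe", "email": "jane@example.com"}
    print(fill_fields(form, credential))
```

Fields that already hold a value are left untouched, so user-entered data is never overwritten by the credential.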
-
Publication No.: US20230368003A1
Publication date: 2023-11-16
Application No.: US17740497
Application date: 2022-05-10
Applicant: ADOBE INC.
Inventor: Jiuxiang Gu, Zihan Wang, Jason Wen Yong Kuen, Handong Zhao, Vlad Ion Morariu, Ruiyi Zhang, Ani Nenkova, Tong Sun
IPC: G06N3/04, G06F40/284
CPC classification number: G06N3/0481, G06F40/284
Abstract: The technology described herein is directed to an adaptive sparse attention pattern that is learned during fine-tuning and deployed in a machine-learning model. In aspects, a row or a column in an attention matrix with an importance score for a task that is above a threshold importance score is identified. The important row or the column is included in an adaptive attention pattern used with a machine-learning model having a self-attention operation. In response to an input, a task-specific inference is generated for the input using the machine-learning model with the adaptive attention pattern.
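The row and column selection described in this abstract can be sketched as building a boolean attention mask. The abstract does not say how importance scores are computed, so they are taken as given inputs here; the function name `adaptive_pattern` and the rule that a position is kept when either its row or its column is important are assumptions of this sketch.

```python
def adaptive_pattern(row_scores, col_scores, threshold):
    """Build a boolean attention mask from per-row and per-column importance.

    Position (i, j) is kept (True) when row i or column j has an importance
    score strictly above the threshold; all other positions are masked out.
    """
    keep_rows = {i for i, s in enumerate(row_scores) if s > threshold}
    keep_cols = {j for j, s in enumerate(col_scores) if s > threshold}
    return [
        [(i in keep_rows) or (j in keep_cols) for j in range(len(col_scores))]
        for i in range(len(row_scores))
    ]

if __name__ == "__main__":
    mask = adaptive_pattern([0.9, 0.1, 0.2], [0.05, 0.8, 0.1], threshold=0.5)
    for row in mask:
        print(["x" if keep else "." for keep in row])
```

In a self-attention operation, a mask like this would typically be applied before the softmax so that masked positions receive no attention weight.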
-
Publication No.: US11544503B2
Publication date: 2023-01-03
Application No.: US16885168
Application date: 2020-05-27
Applicant: Adobe Inc.
Inventor: Christopher Tensmeyer, Vlad Ion Morariu, Varun Manjunatha, Tong Sun, Nikolaos Barmpalios, Kai Li, Handong Zhao, Curtis Wigington
Abstract: A domain alignment technique for cross-domain object detection tasks is introduced. During a preliminary pretraining phase, an object detection model is pretrained to detect objects in images associated with a source domain using a source dataset of images associated with the source domain. After completing the pretraining phase, a domain adaptation phase is performed using the source dataset and a target dataset to adapt the pretrained object detection model to detect objects in images associated with the target domain. The domain adaptation phase may involve the use of various domain alignment modules that, for example, perform multi-scale pixel/path alignment based on input feature maps or perform instance-level alignment based on input region proposals.
-
Publication No.: US20220391768A1
Publication date: 2022-12-08
Application No.: US17883811
Application date: 2022-08-09
Applicant: Adobe Inc.
Inventor: Kai Li, Christopher Alan Tensmeyer, Curtis Michael Wigington, Handong Zhao, Nikolaos Barmpalios, Tong Sun, Varun Manjunatha, Vlad Ion Morariu
Abstract: Adapting a machine learning model to process data that differs from training data used to configure the model for a specified objective is described. A domain adaptation system trains the model to process new domain data that differs from a training data domain by using the model to generate a feature representation for the new domain data, which describes different content types included in the new domain data. The domain adaptation system then generates a probability distribution for each discrete region of the new domain data, which describes a likelihood of the region including different content described by the feature representation. The probability distribution is compared to ground truth information for the new domain data to determine a loss function, which is used to refine model parameters. After determining that model outputs achieve a threshold similarity to the ground truth information, the model is output as a domain-agnostic model.
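The comparison between per-region probability distributions and ground truth can be illustrated with a small loss computation. The abstract does not name the loss, so cross-entropy is an assumption here, as are the function names and the dictionary representation of each region's distribution over content types.

```python
import math

def region_loss(predicted, true_label):
    """Cross-entropy for one region: negative log-probability of the true label.

    The probability is clamped away from zero so the logarithm stays finite.
    """
    return -math.log(max(predicted[true_label], 1e-12))

def mean_loss(predictions, labels):
    """Average the per-region cross-entropy over all regions."""
    return sum(region_loss(p, y) for p, y in zip(predictions, labels)) / len(labels)

if __name__ == "__main__":
    # Hypothetical distributions over content types for two regions.
    predictions = [
        {"text": 0.5, "table": 0.5},
        {"text": 0.0, "table": 1.0},
    ]
    labels = ["text", "table"]
    print(mean_loss(predictions, labels))
```

The loss shrinks toward zero as the predicted distributions concentrate on the ground-truth content types, which is the quantity a training loop would minimize when refining model parameters.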
-
Publication No.: US20250078200A1
Publication date: 2025-03-06
Application No.: US18952023
Application date: 2024-11-19
Applicant: Adobe Inc.
Inventor: Ruiyi Zhang, Yufan Zhou, Christopher Tensmeyer, Jiuxiang Gu, Tong Yu, Tong Sun
Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods that implement a neural network framework for interactive multi-round image generation from natural language inputs. Specifically, the disclosed systems provide an intelligent framework (i.e., a text-based interactive image generation model) that facilitates a multi-round image generation and editing workflow that comports with arbitrary input text and synchronous interaction. In particular embodiments, the disclosed systems utilize natural language feedback for conditioning a generative neural network that performs text-to-image generation and text-guided image modification. For example, the disclosed systems utilize a trained model to inject textual features from natural language feedback into a unified joint embedding space for generating text-informed style vectors. In turn, the disclosed systems can generate an image with semantically meaningful features that map to the natural language feedback. Moreover, the disclosed systems can persist these semantically meaningful features throughout a refinement process and across generated images.