UTILIZING A JOINT-LEARNING SELF-DISTILLATION FRAMEWORK FOR IMPROVING TEXT SEQUENTIAL LABELING MACHINE-LEARNING MODELS

    Publication number: US20220114476A1

    Publication date: 2022-04-14

    Application number: US17070568

    Application date: 2020-10-14

    Applicant: Adobe Inc.

Abstract: This disclosure describes one or more implementations of a text sequence labeling system that accurately and efficiently utilizes a joint-learning self-distillation approach to improve text sequence labeling machine-learning models. For example, in various implementations, the text sequence labeling system trains a text sequence labeling machine-learning teacher model to generate text sequence labels. The text sequence labeling system then creates and trains a text sequence labeling machine-learning student model utilizing the training and the output of the teacher model. Upon the student model achieving improved results over the teacher model, the text sequence labeling system re-initializes the teacher model with the learned model parameters of the student model and repeats the above joint-learning self-distillation framework. The text sequence labeling system then utilizes a trained text sequence labeling model to generate text sequence labels from input documents.
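The teacher/student loop described in the abstract can be sketched in miniature. This is a toy illustration, not the patented implementation: the "models" are per-token label-score tables, and the names `train_step`, `accuracy`, `self_distill`, and the blend weight are illustrative assumptions.

```python
import copy

def train_step(student, teacher, batch, lr=0.1, distill_weight=0.5):
    """Move student scores toward a blend of the gold (hard) labels and the
    teacher's soft labels -- the distillation signal."""
    for tok, gold in batch:
        for label in student[tok]:
            hard = 1.0 if label == gold else 0.0
            soft = teacher[tok][label]  # teacher's soft label
            target = (1 - distill_weight) * hard + distill_weight * soft
            student[tok][label] += lr * (target - student[tok][label])

def accuracy(model, data):
    correct = sum(1 for tok, gold in data
                  if max(model[tok], key=model[tok].get) == gold)
    return correct / len(data)

def self_distill(teacher, data, rounds=3, steps=50):
    """Train a student against the current teacher; when the student improves
    on the teacher, re-initialize the teacher from the student's parameters."""
    for _ in range(rounds):
        student = copy.deepcopy(teacher)
        for _ in range(steps):
            train_step(student, teacher, data)
        if accuracy(student, data) > accuracy(teacher, data):
            teacher = copy.deepcopy(student)  # teacher re-initialization
    return teacher

# Toy usage: three tokens, BIO-style labels, an untrained (uniform) teacher.
labels = ["B", "I", "O"]
data = [("apple", "B"), ("pie", "I"), ("the", "O")]
teacher0 = {tok: {lab: 1 / 3 for lab in labels} for tok, _ in data}
final = self_distill(teacher0, data)
```

After a few rounds the re-initialized teacher labels the toy data correctly, mirroring the framework's intent that each generation of teacher starts from a stronger student.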

    ANSWER SELECTION USING A COMPARE-AGGREGATE MODEL WITH LANGUAGE MODEL AND CONDENSED SIMILARITY INFORMATION FROM LATENT CLUSTERING

    Publication number: US20200372025A1

    Publication date: 2020-11-26

    Application number: US16420764

    Application date: 2019-05-23

    Applicant: ADOBE INC.

Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for identifying textual similarity and performing answer selection. A textual-similarity computing model can use a pre-trained language model to generate vector representations of a question and a candidate answer from a target corpus. The target corpus can be clustered into latent topics (or other latent groupings), and probabilities of a question or candidate answer being in each of the latent topics can be calculated and condensed (e.g., downsampled) to improve performance and focus on the most relevant topics. The condensed probabilities can be aggregated and combined with a downstream vector representation of the question (or answer) so the model can use focused topical and other categorical information as auxiliary information in a similarity computation. In training, transfer learning may be applied from a large-scale corpus, and the conventional list-wise approach can be replaced with point-wise learning.
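The "condensed similarity information" step can be illustrated as follows. This is a hedged sketch: it scores a vector against latent-topic centroids, keeps only the top-k probabilities (the condensing/downsampling), and appends them to the vector as auxiliary features. The centroids, the softmax-over-distances scoring, and `k` are toy assumptions, not values from the patent.

```python
import math

def topic_probs(vec, centroids):
    """Softmax over negative squared distances to each latent-topic centroid."""
    dists = [sum((v - c) ** 2 for v, c in zip(vec, cen)) for cen in centroids]
    exps = [math.exp(-d) for d in dists]
    total = sum(exps)
    return [e / total for e in exps]

def condense(probs, k=2):
    """Keep the k most relevant topic probabilities, renormalized."""
    top = sorted(probs, reverse=True)[:k]
    s = sum(top)
    return [p / s for p in top]

def augment(vec, centroids, k=2):
    """Concatenate condensed topic information onto the sentence vector,
    giving the downstream similarity computation auxiliary topical features."""
    return list(vec) + condense(topic_probs(vec, centroids), k)

# Toy usage: three latent topics in 2-D, one sentence vector.
centroids = [[0.0, 0.0], [10.0, 10.0], [5.0, 0.0]]
features = augment([0.1, 0.0], centroids)
```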

    Semantic Analysis-Based Query Result Retrieval for Natural Language Procedural Queries

    Publication number: US20190392066A1

    Publication date: 2019-12-26

    Application number: US16019152

    Application date: 2018-06-26

    Applicant: Adobe Inc.

    Abstract: Various embodiments describe techniques for retrieving query results for natural language procedural queries. A query answering (QA) system generates a structured semantic representation of a natural language query. The structured semantic representation includes terms in the natural language query and the relationship between the terms. The QA system retrieves a set of candidate query results for the natural language query from a repository, generates a structured semantic representation for each candidate query result, and determines a match score between the natural language query and each respective candidate query result based on the similarity between the structured semantic representations for the natural language query and each respective candidate query result. A candidate query result having the highest match score is selected as the query result for the natural language query. In some embodiments, paraphrasing rules are generated from user interaction data and are used to determine the match score.
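The matching step described in the abstract can be sketched simply. Here the structured semantic representation is modeled as a set of (subject, relation, object) triples and the match score as their Jaccard overlap; the triple extraction from natural language is assumed done upstream, and both the representation and the scoring function are illustrative choices, not the patent's.

```python
def match_score(query_rep, candidate_rep):
    """Jaccard similarity between two sets of semantic triples."""
    if not query_rep and not candidate_rep:
        return 0.0
    inter = len(query_rep & candidate_rep)
    union = len(query_rep | candidate_rep)
    return inter / union

def best_result(query_rep, candidates):
    """Select the candidate query result whose representation scores highest
    against the query's representation."""
    return max(candidates, key=lambda c: match_score(query_rep, c[1]))

# Toy usage: a procedural query and two candidate results (id, representation).
query = {("crop", "action", "image"), ("image", "in", "document")}
candidates = [
    ("result_a", {("crop", "action", "image")}),
    ("result_b", {("resize", "action", "image")}),
]
chosen = best_result(query, candidates)
```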

    ABSTRACTIVE SUMMARIZATION OF LONG DOCUMENTS USING DEEP LEARNING

    Publication number: US20190278835A1

    Publication date: 2019-09-12

    Application number: US15915775

    Application date: 2018-03-08

    Applicant: Adobe Inc.

Abstract: Techniques are disclosed for an abstractive summarization process for summarizing documents, including long documents. A document is encoded using an encoder-decoder architecture with attentive decoding. In particular, an encoder for modeling documents generates both word-level and section-level representations of a document. A discourse-aware decoder then captures the information flow from all discourse sections of a document. In order to extend the robustness of the generated summarization, a neural attention mechanism considers both word-level as well as section-level representations of a document. The neural attention mechanism may utilize a set of weights that are applied to the word-level representations and section-level representations.
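The combined word-level/section-level attention can be sketched as a toy: each word's attention logit is modulated by the score of the discourse section it belongs to before normalization. The additive combination and the scalar scores are illustrative assumptions; the actual mechanism uses learned weights over vector representations.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def discourse_attention(word_scores, section_of, section_scores):
    """Combine word-level and section-level scores into a single attention
    distribution over words, as in discourse-aware hierarchical attention.

    word_scores[i]   -- word-level score for word i
    section_of[i]    -- index of the section word i belongs to
    section_scores[s] -- section-level score for section s
    """
    combined = [w + section_scores[section_of[i]]
                for i, w in enumerate(word_scores)]
    return softmax(combined)

# Toy usage: three words in two sections; section 1 is more salient.
weights = discourse_attention(
    word_scores=[1.0, 1.0, 1.0],
    section_of=[0, 0, 1],
    section_scores=[0.0, 2.0],
)
```

A word in a salient section receives more attention than an identically scored word in a less salient one, which is the point of mixing the two granularities.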

    SYSTEMS AND METHODS FOR COREFERENCE RESOLUTION

    Publication number: US20230403175A1

    Publication date: 2023-12-14

    Application number: US17806751

    Application date: 2022-06-14

    Applicant: ADOBE INC.

    CPC classification number: H04L12/1831 G06F40/284 G06N3/04

    Abstract: Systems and methods for coreference resolution are provided. One aspect of the systems and methods includes inserting a speaker tag into a transcript, wherein the speaker tag indicates that a name in the transcript corresponds to a speaker of a portion of the transcript; encoding a plurality of candidate spans from the transcript based at least in part on the speaker tag to obtain a plurality of span vectors; extracting a plurality of entity mentions from the transcript based on the plurality of span vectors, wherein each of the plurality of entity mentions corresponds to one of the plurality of candidate spans; and generating coreference information for the transcript based on the plurality of entity mentions, wherein the coreference information indicates that a pair of candidate spans of the plurality of candidate spans corresponds to a pair of entity mentions that refer to a same entity.
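The speaker-tag insertion step can be sketched directly. The tag format (`<spk> ... </spk>`) is an illustrative assumption rather than the patent's exact tokens; the idea is that an explicit marker lets the span encoder tie a speaker's name to first-person mentions in that speaker's utterances.

```python
def insert_speaker_tags(turns):
    """Prepend an explicit speaker tag to each utterance of a transcript.

    turns -- list of (speaker_name, utterance) pairs
    Returns a single tagged transcript string ready for span encoding.
    """
    tagged = []
    for speaker, utterance in turns:
        # The tag signals that `speaker` names the speaker of this portion.
        tagged.append(f"<spk> {speaker} </spk> {utterance}")
    return " ".join(tagged)

# Toy usage on a two-turn transcript.
transcript = insert_speaker_tags([
    ("Alice", "I think we should ship on Friday."),
    ("Bob", "I agree with Alice."),
])
```

Downstream, candidate spans over this tagged text ("Alice", "I", "we", "Bob") are encoded and scored so that, e.g., the "I" in Alice's turn and the name "Alice" can be grouped as mentions of the same entity.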

    Learning to fuse sentences with transformers for summarization

    Publication number: US11620457B2

    Publication date: 2023-04-04

    Application number: US17177372

    Application date: 2021-02-17

    Applicant: ADOBE INC.

    Abstract: Systems and methods for sentence fusion are described. Embodiments receive coreference information for a first sentence and a second sentence, wherein the coreference information identifies entities associated with both a term of the first sentence and a term of the second sentence, apply an entity constraint to an attention head of a sentence fusion network, wherein the entity constraint limits attention weights of the attention head to terms that correspond to a same entity of the coreference information, and predict a fused sentence using the sentence fusion network based on the entity constraint, wherein the fused sentence combines information from the first sentence and the second sentence.
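The entity constraint on an attention head can be sketched as a mask construction. This is a minimal sketch under the assumption that an upstream coreference system has already assigned an entity id (or `None`) to each token; the 0/1 matrix limits which token pairs the constrained head may attend between.

```python
def entity_attention_mask(entity_ids):
    """Build an NxN mask for an entity-constrained attention head.

    entity_ids[i] -- entity id of token i, or None if the token is not part
                     of any coreferent entity mention.
    mask[i][j] == 1 only when tokens i and j refer to the same entity, so the
    constrained head's attention weights are limited to same-entity terms.
    """
    n = len(entity_ids)
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if (entity_ids[i] is not None
                    and entity_ids[i] == entity_ids[j]):
                mask[i][j] = 1
    return mask

# Toy usage: tokens 0 and 2 mention entity 0 (one from each source sentence),
# token 3 mentions entity 1, token 1 mentions nothing.
mask = entity_attention_mask([0, None, 0, 1])
```

In a transformer this mask would typically be applied by setting disallowed logits to negative infinity before the softmax, steering the fused sentence to combine information through shared entities.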

    Utilizing logical-form dialogue generation for multi-turn construction of paired natural language queries and query-language representations

    Publication number: US11561969B2

    Publication date: 2023-01-24

    Application number: US16834850

    Application date: 2020-03-30

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for generating pairs of natural language queries and corresponding query-language representations. For example, the disclosed systems can generate a contextual representation of a prior-generated dialogue sequence to compare with logical-form rules. In some implementations, the logical-form rules comprise trigger conditions and corresponding logical-form actions for constructing a logical-form representation of a subsequent dialogue sequence. Based on the comparison to logical-form rules indicating satisfaction of one or more trigger conditions, the disclosed systems can perform logical-form actions to generate a logical-form representation of a subsequent dialogue sequence. In turn, the disclosed systems can apply a natural-language-to-query-language (NL2QL) template to the logical-form representation to generate a natural language query and a corresponding query-language representation for the subsequent dialogue sequence.
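The trigger-condition/action pairing of the logical-form rules can be sketched as a rule loop. The rule contents, the context dictionary, and the `FILTER`/`COUNT` actions are invented for illustration; only the structure (rules fire when their trigger matches the dialogue context, and their actions extend the logical form) comes from the abstract.

```python
def apply_logical_form_rules(context, rules):
    """Fire every rule whose trigger condition matches the dialogue context;
    the fired actions build up the logical-form representation."""
    logical_form = []
    for trigger, action in rules:
        if trigger(context):  # trigger condition satisfied by the context
            logical_form.append(action(context))
    return logical_form

# Hypothetical rules: a trigger predicate paired with a logical-form action.
rules = [
    (lambda ctx: "filter" in ctx["intents"],
     lambda ctx: ("FILTER", ctx["field"])),
    (lambda ctx: "aggregate" in ctx["intents"],
     lambda ctx: ("COUNT", ctx["field"])),
]

# Toy context from a prior-generated dialogue sequence.
lf = apply_logical_form_rules({"intents": {"filter"}, "field": "city"}, rules)
```

In the full system, the resulting logical form would then be passed through an NL2QL template to emit a paired natural language query and query-language representation.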
