-
Publication Number: US11210470B2
Publication Date: 2021-12-28
Application Number: US16368334
Application Date: 2019-03-28
Applicant: ADOBE INC.
Inventor: Seokhwan Kim , Walter W. Chang , Nedim Lipka , Franck Dernoncourt , Chan Young Park
Abstract: Methods and systems are provided for identifying subparts of a text. A neural network system can receive a set of sentences that includes context sentences and target sentences that indicate a decision point in a text. The neural network system can generate context sentence vectors and target sentence vectors by encoding context from the set of sentences. These context sentence vectors can be weighted to focus on relevant information. The weighted context sentence vectors and the target sentence vectors can then be used to output a label for the decision point in the text.
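Read as an algorithm, this abstract describes attention-weighted pooling of context sentence vectors followed by label scoring. A minimal NumPy sketch, assuming dot-product attention and a linear label scorer (all names, shapes, and the scoring scheme are illustrative, not taken from the patent):

```python
import numpy as np

def label_decision_point(context_vecs, target_vec, label_matrix):
    # Dot-product relevance of each context sentence to the target sentence.
    scores = context_vecs @ target_vec
    # Softmax so the weights focus on the most relevant context sentences.
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()
    # Attention-pooled context vector.
    pooled = weights @ context_vecs
    # Score each label from the pooled context and the target sentence.
    logits = label_matrix @ np.concatenate([pooled, target_vec])
    return int(np.argmax(logits))
```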
-
Publication Number: US11630952B2
Publication Date: 2023-04-18
Application Number: US16518894
Application Date: 2019-07-22
Applicant: Adobe Inc.
Inventor: Sean MacAvaney , Franck Dernoncourt , Walter Chang , Seokhwan Kim , Doo Soon Kim , Chen Fang
IPC: G06F17/15 , G06F17/16 , G06N3/045 , G06V10/80 , G06V10/82 , G06F40/279 , G06F18/2431 , G06V10/764 , G06V10/70
Abstract: This disclosure relates to methods, non-transitory computer readable media, and systems that can classify term sequences within a source text based on textual features analyzed by both an implicit-class-recognition model and an explicit-class-recognition model. For example, by applying machine-learning models for both implicit and explicit class recognition, the disclosed systems can determine a class corresponding to a particular term sequence within a source text and identify the particular term sequence reflecting the class. The dual-model architecture can equip the disclosed systems to apply (i) the implicit-class-recognition model to recognize implicit references to a class in source texts and (ii) the explicit-class-recognition model to recognize explicit references to the same class in source texts.
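The dual-model fusion can be sketched as follows, assuming a trivial keyword matcher stands in for the explicit-class-recognition model and an arbitrary scoring callable stands in for the implicit one (the patent's actual models are learned; everything here is illustrative):

```python
def classify_term_sequence(terms, class_keywords, implicit_scorer):
    """class_keywords: {class_name: [keywords]} for explicit references.
    implicit_scorer: callable returning {class_name: score} for the terms.
    Fusion rule (an assumption): an explicit keyword hit wins outright,
    otherwise fall back to the implicit model's best-scoring class."""
    text = " ".join(terms).lower()
    for cls, keywords in class_keywords.items():
        if any(kw in text for kw in keywords):
            return cls                       # explicit reference found
    scores = implicit_scorer(terms)          # implicit reference only
    return max(scores, key=scores.get)
```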
-
Publication Number: US20220122357A1
Publication Date: 2022-04-21
Application Number: US17563901
Application Date: 2021-12-28
Applicant: Adobe Inc.
Inventor: Wentian Zhao , Seokhwan Kim , Ning Xu , Hailin Jin
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for generating a response to a question received from a user during display or playback of a video segment by utilizing a query-response-neural network. The disclosed systems can extract a query vector from a question corresponding to the video segment using the query-response-neural network. The disclosed systems further generate context vectors representing both visual cues and transcript cues corresponding to the video segment using context encoders or other layers from the query-response-neural network. By utilizing additional layers from the query-response-neural network, the disclosed systems generate (i) a query-context vector based on the query vector and the context vectors, and (ii) candidate-response vectors representing candidate responses to the question from a domain-knowledge base or other source. To respond to a user's question, the disclosed systems further select a response from the candidate responses based on a comparison of the query-context vector and the candidate-response vectors.
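The final selection step, comparing the query-context vector against the candidate-response vectors, can be sketched as cosine-similarity ranking (the comparison function is an assumption; the abstract does not fix it):

```python
import numpy as np

def select_response(query_context_vec, candidate_vecs, candidates):
    # Normalize so the dot products below are cosine similarities.
    q = query_context_vec / np.linalg.norm(query_context_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    sims = c @ q                            # one similarity per candidate
    return candidates[int(np.argmax(sims))]
```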
-
Publication Number: US20210118430A1
Publication Date: 2021-04-22
Application Number: US17135629
Application Date: 2020-12-28
Applicant: Adobe Inc.
Inventor: Seokhwan Kim , Walter Chang
IPC: G10L15/16 , G06F16/9032 , G10L15/22 , H04L12/58
Abstract: The present disclosure relates to generating digital responses based on digital dialog states generated by a neural network having a dynamic memory network architecture. For example, in one or more embodiments, the disclosed system provides a digital dialog having one or more segments to a dialog state tracking neural network having a dynamic memory network architecture that includes a set of multiple memory slots. In some embodiments, the dialog state tracking neural network further includes update gates and reset gates used in modifying the values stored in the memory slots. For instance, the disclosed system can utilize cross-slot interaction update/reset gates to accurately generate a digital dialog state for each of the segments of digital dialog. Subsequently, the system generates a digital response for each segment of digital dialog based on the digital dialog state.
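The update/reset-gate mechanism described here is GRU-like; a single-slot update can be sketched as follows (the cross-slot interaction is simplified away, and all names and weight shapes are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def update_memory_slot(memory, segment_vec, Wz, Wr, Wh):
    """One GRU-style update of a memory slot from a dialog-segment vector."""
    x = np.concatenate([memory, segment_vec])
    z = sigmoid(Wz @ x)                                    # update gate
    r = sigmoid(Wr @ x)                                    # reset gate
    h = np.tanh(Wh @ np.concatenate([r * memory, segment_vec]))
    return (1.0 - z) * memory + z * h                      # gated blend
```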
-
Publication Number: US20200312298A1
Publication Date: 2020-10-01
Application Number: US16366904
Application Date: 2019-03-27
Applicant: Adobe Inc.
Inventor: Trung Bui , Zahra Rahimi , Yinglan Ma , Seokhwan Kim , Franck Dernoncourt
IPC: G10L15/06 , G06F3/16 , G06F3/0482 , G06F3/0484 , G06F17/24 , G10L15/22 , G10L15/18 , G06N20/00
Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods that generate ground truth annotations of target utterances in digital image editing dialogues in order to create a state-driven training data set. In particular, in one or more embodiments, the disclosed systems utilize machine and user defined tags, machine learning model predictions, and user input to generate a ground truth annotation that includes frame information in addition to intent, attribute, object, and/or location information. In at least one embodiment, the disclosed systems generate ground truth annotations in conformance with an annotation ontology that results in fast and accurate digital image editing dialogue annotation.
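The annotation-ontology idea can be illustrated with a simple record type holding the frame, intent, attribute, object, and location slots named in the abstract (the field names and types are assumptions, not the patent's actual ontology):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UtteranceAnnotation:
    """Ground-truth annotation for one target utterance in a digital
    image editing dialogue; optional slots may be absent for a given
    utterance."""
    utterance: str
    intent: str
    frame_id: int
    attribute: Optional[str] = None
    object: Optional[str] = None
    location: Optional[str] = None
```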
-
Publication Number: US20200380030A1
Publication Date: 2020-12-03
Application Number: US16428308
Application Date: 2019-05-31
Applicant: ADOBE INC.
Inventor: Anthony Michael Colas , Doo Soon Kim , Franck Dernoncourt , Seokhwan Kim
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for in-app video navigation in which videos containing answers to user-provided queries are presented within an application, and the portions of the videos that specifically include the answer to the query are highlighted to allow for efficient and effective tutorial utilization. Upon receipt of a text or verbal query, top candidate videos including an answer to the query are determined. Within the top candidate videos, a video span with a starting location and an ending location is identified based on the query and contextual information within each candidate video. The video span with the highest overall score, calculated from a video score and a span score, is presented to the user.
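The final ranking step can be sketched as follows, using a plain sum of the video score and the span score (the abstract leaves the exact combination unspecified, so the sum is a placeholder):

```python
def best_video_span(spans):
    """spans: list of dicts, each with 'video_score', 'span_score',
    and 'start'/'end' sentence locations. Returns the span whose
    combined score is highest."""
    return max(spans, key=lambda s: s["video_score"] + s["span_score"])
```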
-
Publication Number: US10783314B2
Publication Date: 2020-09-22
Application Number: US16024212
Application Date: 2018-06-29
Applicant: Adobe Inc.
Inventor: Franck Dernoncourt , Walter Wei-Tuh Chang , Seokhwan Kim , Sean Fitzgerald , Ragunandan Rao Malangully , Laurie Marie Byrum , Frederic Thevenet , Carl Iwan Dockhorn
IPC: G06F40/10 , G06F40/106 , G10L15/26 , G06F40/14 , G06F40/166
Abstract: Techniques are disclosed for generating a structured transcription from a speech file. In an example embodiment, a structured transcription system receives a speech file comprising speech from one or more people and generates a navigable structured transcription object. The navigable structured transcription object may comprise one or more data structures representing multimedia content with which a user may navigate and interact via a user interface. Text and/or speech relating to the speech file can be selectively presented to the user (e.g., the text can be presented via a display, and the speech can be aurally presented via a speaker).
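A navigable structured transcription object of the kind described might look like the following sketch (the class layout and method names are assumptions, not the patent's actual data structures):

```python
from dataclasses import dataclass, field

@dataclass
class Utterance:
    speaker: str
    text: str
    start_s: float   # offset into the speech file, in seconds
    end_s: float

@dataclass
class Section:
    title: str
    utterances: list = field(default_factory=list)

@dataclass
class StructuredTranscription:
    sections: list = field(default_factory=list)

    def search(self, keyword):
        """Navigate to every utterance mentioning the keyword."""
        kw = keyword.lower()
        return [u for s in self.sections for u in s.utterances
                if kw in u.text.lower()]
```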
-
Publication Number: US20200004803A1
Publication Date: 2020-01-02
Application Number: US16024212
Application Date: 2018-06-29
Applicant: Adobe Inc.
Inventor: Franck Dernoncourt , Walter Wei-Tuh Chang , Seokhwan Kim , Sean Fitzgerald , Ragunandan Rao Malangully , Laurie Marie Byrum , Frederic Thevenet , Carl Iwan Dockhorn
Abstract: Techniques are disclosed for generating a structured transcription from a speech file. In an example embodiment, a structured transcription system receives a speech file comprising speech from one or more people and generates a navigable structured transcription object. The navigable structured transcription object may comprise one or more data structures representing multimedia content with which a user may navigate and interact via a user interface. Text and/or speech relating to the speech file can be selectively presented to the user (e.g., the text can be presented via a display, and the speech can be aurally presented via a speaker).
-
Publication Number: US11657802B2
Publication Date: 2023-05-23
Application Number: US17135629
Application Date: 2020-12-28
Applicant: Adobe Inc.
Inventor: Seokhwan Kim , Walter Chang
CPC classification number: G10L15/16 , G06F16/90332 , G10L15/22 , H04L51/02
Abstract: The present disclosure relates to generating digital responses based on digital dialog states generated by a neural network having a dynamic memory network architecture. For example, in one or more embodiments, the disclosed system provides a digital dialog having one or more segments to a dialog state tracking neural network having a dynamic memory network architecture that includes a set of multiple memory slots. In some embodiments, the dialog state tracking neural network further includes update gates and reset gates used in modifying the values stored in the memory slots. For instance, the disclosed system can utilize cross-slot interaction update/reset gates to accurately generate a digital dialog state for each of the segments of digital dialog. Subsequently, the system generates a digital response for each segment of digital dialog based on the digital dialog state.
-
Publication Number: US11544590B2
Publication Date: 2023-01-03
Application Number: US16510491
Application Date: 2019-07-12
Applicant: Adobe Inc.
Inventor: Seokhwan Kim
IPC: G06N5/04 , G06N3/04 , G06F16/783 , G06V10/75
Abstract: In implementations of answering questions during video playback, a video system can receive a question related to a video at a timepoint of the video during playback, and determine audio sentences of the video that occur within a segment of the video that includes the timepoint. The video system can generate a classification vector from words of the question and the audio sentences, and determine an answer to the question utilizing the classification vector. The video system can obtain answer candidates encoded as answer vectors, and the answer to the question can be selected as the answer candidate whose answer vector best matches the classification vector.
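The classification-vector construction and answer matching can be sketched with a bag-of-words stand-in for the learned encoder (purely illustrative; in the patent the vectors come from a trained model):

```python
import numpy as np

def classification_vector(question, sentences, vocab):
    """Bag-of-words stand-in for the learned classification vector,
    built from the question and the audio sentences in the segment.
    vocab: {word: index}."""
    words = question.lower().split()
    for s in sentences:
        words += s.lower().split()
    vec = np.zeros(len(vocab))
    for w in words:
        if w in vocab:
            vec[vocab[w]] += 1.0
    return vec

def answer_question(cls_vec, answer_vecs, candidates):
    # Highest dot-product match between the classification vector
    # and each encoded answer candidate wins.
    return candidates[int(np.argmax(answer_vecs @ cls_vec))]
```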