Training a joint many-task neural network model using successive regularization

Invention Grant

US11042796B2 Training a joint many-task neural network model using successive regularization (Active)

Patent Title: Training a joint many-task neural network model using successive regularization
Application No.: US15421431

Application Date: 2017-01-31
Publication No.: US11042796B2

Publication Date: 2021-06-22
Inventor: Kazuma Hashimoto, Caiming Xiong, Richard Socher
Applicant: salesforce.com, inc.
Applicant Address: US CA San Francisco
Assignee: salesforce.com, inc.
Current Assignee: salesforce.com, inc.
Current Assignee Address: US CA San Francisco
Agency: Haynes and Boone, LLP
Main IPC: G06N3/04
IPC: G06N3/04; G06N3/08; G06F40/30; G06F40/205; G06F40/216; G06F40/253; G06F40/284; G06N3/063; G10L15/18; G10L25/30; G10L15/16; G06F40/00

Abstract:

The technology disclosed provides a so-called “joint many-task neural network model” that solves a variety of increasingly complex natural language processing (NLP) tasks using layers of growing depth in a single end-to-end model. The model is trained successively by considering linguistic hierarchies, directly connecting word representations to all model layers, explicitly using the predictions of lower tasks, and applying a so-called “successive regularization” technique to prevent catastrophic forgetting. Three examples of lower-level model layers are a part-of-speech (POS) tagging layer, a chunking layer, and a dependency parsing layer. Two examples of higher-level model layers are a semantic relatedness layer and a textual entailment layer. The model achieves state-of-the-art results on chunking, dependency parsing, semantic relatedness, and textual entailment.
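The abstract names successive regularization but does not give its form. The following is a minimal sketch, assuming the regularizer is a squared L2 penalty on how far the lower-layer parameters drift from a snapshot taken after the previous task was trained. The function names, the parameter names (`pos_layer.w`, `chunk_layer.w`), the `delta` coefficient, and all numeric values are hypothetical and chosen only for illustration; they are not taken from the patent text.

```python
from typing import Dict, List

# Assumption: parameters are stored as named lists of floats.
Params = Dict[str, List[float]]


def successive_regularization_penalty(
    current: Params, snapshot: Params, delta: float
) -> float:
    """Return delta * squared L2 distance between current and snapshot values.

    Only parameters present in the snapshot (the lower-task parameters)
    are penalized; parameters of the new, higher layer are left free.
    """
    total = 0.0
    for name, old_values in snapshot.items():
        new_values = current[name]
        total += sum((n - o) ** 2 for n, o in zip(new_values, old_values))
    return delta * total


def regularized_loss(
    task_loss: float, current: Params, snapshot: Params, delta: float = 0.01
) -> float:
    """Add the drift penalty to the loss of the task currently being trained."""
    return task_loss + successive_regularization_penalty(current, snapshot, delta)


# Hypothetical toy values: snapshot of the POS layer after POS training,
# and the current parameters while the chunking layer is being trained.
pos_snapshot: Params = {"pos_layer.w": [1.0, 2.0]}
current_params: Params = {"pos_layer.w": [1.5, 2.0], "chunk_layer.w": [0.3]}

penalty = successive_regularization_penalty(current_params, pos_snapshot, delta=2.0)
```

In this sketch the chunking-layer weights contribute nothing to the penalty, so the higher task can still learn freely while the lower-task weights are discouraged from moving far from their earlier values, which is one plausible way to limit catastrophic forgetting.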

Public/Granted literature

US20180121799A1 Training a Joint Many-Task Neural Network Model using Successive Regularization. Public/Granted day: 2018-05-03

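IPC classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computer systems based on specific computational models
G06N3/00	Computer systems based on biological models
G06N3/02	.Using neural network models
G06N3/04	..Architecture, e.g. interconnection topology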