-
Publication No.: US20250094806A1
Publication Date: 2025-03-20
Application No.: US18967167
Filing Date: 2024-12-03
Inventor: Junyuan Shang , Yilong Chen , Zhenyu Zhang , Shuohuan Wang , Yu Sun , Hua Wu
IPC: G06N3/082 , G06N3/0475
Abstract: Provided are a large language model training method, an electronic device and a storage medium, relating to the field of artificial intelligence technologies, and in particular to the fields of deep learning, natural language processing and large models. The method includes: performing dimension reduction parameter fusion on the two-dimensional parameter matrix of each channel in each network layer of a first large language model to obtain a second large language model; performing layer reduction parameter fusion on the network layers of the second large language model, based on the three-dimensional parameter matrix of each network layer, to obtain a third large language model; and training the third large language model to obtain a target large language model, under the condition that a target loss function determined from the first and third large language models meets a preset first function condition.
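The abstract names the stages but not the fusion operators. The following is a minimal, hypothetical PyTorch sketch of the pipeline, assuming dimension reduction via a truncated-SVD projection, layer reduction via pairwise averaging of adjacent layers, and a KL-style distillation loss as the target loss; `reduce_width`, `reduce_depth`, `compress` and `target_loss` are illustrative names, not the patented method.

```python
import torch

def reduce_width(weight: torch.Tensor, rank: int) -> torch.Tensor:
    # Dimension-reduction parameter fusion for one two-dimensional matrix
    # (assumed form): project the input-channel axis onto its top-`rank`
    # right singular directions, (out, in) @ (in, rank) -> (out, rank).
    _, _, Vh = torch.linalg.svd(weight, full_matrices=False)
    return weight @ Vh[:rank].T

def reduce_depth(stacked: torch.Tensor) -> torch.Tensor:
    # Layer-reduction parameter fusion (assumed form): fuse adjacent layers
    # of the (num_layers, out, rank) tensor by averaging pairs; num_layers
    # is assumed even here.
    return 0.5 * (stacked[0::2] + stacked[1::2])

def compress(first_weights: list[torch.Tensor], rank: int) -> torch.Tensor:
    # Stage 1: per-layer dimension reduction -> the "second" model.
    second = torch.stack([reduce_width(w, rank) for w in first_weights])
    # Stage 2: fuse network layers -> the "third" model (half the layers).
    return reduce_depth(second)

def target_loss(first_logits: torch.Tensor,
                third_logits: torch.Tensor) -> torch.Tensor:
    # Target loss tying the compressed model to the original, sketched as
    # KL distillation; training the third model would continue until this
    # loss meets the preset condition (e.g. falls below a threshold).
    return torch.nn.functional.kl_div(
        third_logits.log_softmax(dim=-1),
        first_logits.softmax(dim=-1),
        reduction="batchmean")

# Example: eight 64x64 layer matrices -> four fused layers of shape (64, 16).
weights = [torch.randn(64, 64) for _ in range(8)]
third_model_params = compress(weights, rank=16)
print(third_model_params.shape)  # torch.Size([4, 64, 16])
```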
-
Publication No.: US12131728B2
Publication Date: 2024-10-29
Application No.: US17828773
Filing Date: 2022-05-31
Inventor: Siyu Ding , Chao Pang , Shuohuan Wang , Yanbin Zhao , Junyuan Shang , Yu Sun , Shikun Feng , Hao Tian , Hua Wu , Haifeng Wang
CPC classification number: G10L15/063 , G10L15/02 , G10L15/18
Abstract: The present application provides a method of training a natural language processing model, relating to the field of artificial intelligence, and in particular to the field of natural language processing. A specific implementation scheme includes: performing semantic learning for multi-tasks on an input text, so as to obtain a semantic feature shared by the multi-tasks, wherein the multi-tasks include a plurality of branch tasks; performing feature learning for each branch task based on the semantic feature, so as to obtain a first output result for the branch task; calculating a loss for each branch task according to its first output result; and adjusting a parameter of the natural language processing model according to the loss for each branch task. The present application further provides a method of processing a natural language, an electronic device, and a storage medium.
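Sketched below is one way the described scheme could look in PyTorch: a shared encoder performs the semantic learning, one head per branch task performs the feature learning, and the per-branch losses jointly drive a single parameter update. The architecture, sizes, and the two example branch tasks are assumptions for illustration, not the patented design.

```python
import torch
from torch import nn

class MultiTaskNLPModel(nn.Module):
    def __init__(self, vocab_size=30000, hidden=256,
                 branch_output_sizes=(2, 9)):  # e.g. sentiment, topic labels
        super().__init__()
        # Shared semantic learning used by all branch tasks.
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=hidden, nhead=4,
                                       batch_first=True),
            num_layers=2)
        # One feature-learning head per branch task.
        self.heads = nn.ModuleList(
            nn.Linear(hidden, n) for n in branch_output_sizes)

    def forward(self, token_ids):
        semantic = self.encoder(self.embed(token_ids))  # shared feature
        pooled = semantic.mean(dim=1)
        # First output result for each branch task.
        return [head(pooled) for head in self.heads]

model = MultiTaskNLPModel()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
tokens = torch.randint(0, 30000, (8, 32))  # dummy batch of token ids
labels = [torch.randint(0, 2, (8,)), torch.randint(0, 9, (8,))]

outputs = model(tokens)
# Per-branch losses, summed, then one parameter adjustment.
loss = sum(nn.functional.cross_entropy(o, y)
           for o, y in zip(outputs, labels))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Summing the branch losses is the simplest joint-update choice; weighted or alternating per-branch updates would fit the same description.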
-