Method and apparatus for learning low-precision neural network that combines weight quantization and activation quantization

Invention Grant

US11270187B2 Method and apparatus for learning low-precision neural network that combines weight quantization and activation quantization (Status: In force)

Patent Title: Method and apparatus for learning low-precision neural network that combines weight quantization and activation quantization
Application No.: US15914229

Application Date: 2018-03-07
Publication No.: US11270187B2

Publication Date: 2022-03-08
Inventor: Yoo Jin Choi, Mostafa El-Khamy, Jungwon Lee
Applicant: Samsung Electronics Co., Ltd.
Applicant Address: KR Gyeonggi-do
Assignee: Samsung Electronics Co., Ltd.
Current Assignee: Samsung Electronics Co., Ltd.
Current Assignee Address: KR Gyeonggi-do
Agency: The Farrell Law Firm, P.C.
Main IPC: G06N3/04
IPC: G06N3/04; G06N3/08

Abstract:

A method is provided. The method includes selecting a neural network model, wherein the neural network model includes a plurality of layers, and wherein each of the plurality of layers includes weights and activations; modifying the neural network model by inserting a plurality of quantization layers within the neural network model; associating a cost function with the modified neural network model, wherein the cost function includes a first coefficient corresponding to a first regularization term, and wherein an initial value of the first coefficient is pre-defined; and training the modified neural network model to generate quantized weights for a layer by increasing the first coefficient until all weights are quantized and the first coefficient satisfies a pre-defined threshold, further including optimizing a weight scaling factor for the quantized weights and an activation scaling factor for quantized activations, and wherein the quantized weights are quantized using the optimized weight scaling factor.
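
The training procedure in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the model is a single linear layer, the regularization term is taken to be the mean squared quantization error of the weights, the coefficient grows by a fixed factor per step, and the weight scaling factor is refit by least squares. The names `quantize`, `train_low_precision`, `alpha0`, `alpha_max` and `growth` are hypothetical. Activation quantization and its scaling factor are omitted to keep the example short.

```python
import numpy as np


def quantize(w, scale, bits):
    """Uniform symmetric quantizer (assumed form): scale times an integer code."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    codes = np.clip(np.round(w / scale), qmin, qmax)
    return scale * codes, codes


def train_low_precision(x, y, bits=3, alpha0=1e-2, alpha_max=100.0,
                        growth=1.02, lr=0.01, tol=1e-4, max_steps=5000, seed=0):
    """Gradient descent on task loss + alpha * quantization-error regularizer.

    alpha starts at a pre-defined value (alpha0) and is increased until it
    reaches a pre-defined threshold (alpha_max) and the weights sit on the
    quantization grid. All hyperparameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.5, size=x.shape[1])
    scale, alpha = 0.25, alpha0  # hypothetical initial scaling factor and coefficient
    for _ in range(max_steps):
        w_q, _ = quantize(w, scale, bits)
        # Gradient of the task loss (mean squared error of a linear model).
        grad_task = 2.0 * x.T @ (x @ w - y) / len(y)
        # Gradient of the regularizer, treating the quantized values as constants.
        grad_reg = 2.0 * (w - w_q) / w.size
        w = w - lr * (grad_task + alpha * grad_reg)
        # Refit the weight scaling factor by least squares for the current codes.
        _, codes = quantize(w, scale, bits)
        if codes @ codes > 0:
            scale = float(w @ codes / (codes @ codes))
        # Increase the regularization coefficient, capped at the threshold.
        alpha = min(alpha * growth, alpha_max)
        w_q, _ = quantize(w, scale, bits)
        if alpha >= alpha_max and np.max(np.abs(w - w_q)) < tol:
            break
    # Final weights are expressed with the optimized scaling factor.
    return quantize(w, scale, bits)[0], scale, alpha


# Toy usage on synthetic data (the target weights are made up for the demo).
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 3))
y = x @ np.array([0.5, -0.25, 0.75])
w_q, scale, alpha = train_low_precision(x, y)
print("quantized weights:", w_q, "scale:", scale, "alpha:", alpha)
```

Growing the coefficient gradually lets the task loss dominate early in training, while the regularizer increasingly pulls each weight toward its nearest quantization level as training proceeds.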

Published/Granted Documents

US20190138882A1 Method and apparatus for learning low-precision neural network that combines weight quantization and activation quantization. Publication date: 2019-05-09

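IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computer systems based on specific computational models
G06N3/00	Computer systems based on biological models
G06N3/02	.Using neural network models
G06N3/04	..Architectures, e.g. interconnection topology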