Patent search ap:("IBM") AND inv:"KAILASH GOPALAKRISHNAN" Page 1

1.

发明专利
Enhanced low precision binary floating-point formatting 未知

公开(公告)号：GB2586559A

公开(公告)日：2021-02-24

申请号：GB202019054

申请日：2019-05-30

Applicant: IBM

Inventor： SILVIA MELITTA MUELLER , ANKUR AGRAWAL , BRUCE FLEISCHER , KAILASH GOPALAKRISHNAN , DONGSOO LEE

IPC: G06F7/483 , G06F7/499

Abstract: Techniques for operating on and calculating binary floating-point numbers using an enhanced floating-point number format are presented. The enhanced format can comprise a single sign bit, six bits for the exponent, and nine bits for the fraction. Using six bits for the exponent can provide an enhanced exponent range that facilitates desirably fast convergence of computing-intensive algorithms and low error rates for computing-intensive applications. The enhanced format can employ a specified definition for the lowest binade that enables the lowest binade to be used for zero and normal numbers; and a specified definition for the highest binade that enables it to be structured to have one data point used for a merged Not-a-Number (NaN)/infinity symbol and remaining data points used for finite numbers. The signs of zero and merged NaN/infinity can be "don't care" terms. The enhanced format employs only one rounding mode, which is for rounding toward nearest up.

2.

发明专利
Facilitating neural network efficiency 未知

公开(公告)号：GB2581728A

公开(公告)日：2020-08-26

申请号：GB202006969

申请日：2018-10-04

Applicant: IBM

Inventor： ZHUO WANG , JUNGWOOK CHOI , KAILASH GOPALAKRISHNAN , SWAGATH VENKATARAMANI , CHARBEL SAKR

IPC: G06N3/08

Abstract: Techniques that facilitate improving an efficiency of a neural network are described. In one embodiment, a system is provided that comprises a memory that stores computer-executable components and a processor that executes computer-executable components stored in the memory. In one implementation, the computer-executable components comprise an initialization component that selects an initial value of an output limit, wherein the output limit indicates a range for an output of an activation function of a neural network. The computer-executable components further comprise a training component that modifies the initial value of the output limit during training to a second value of the output limit, the second value of the output limit being provided as a parameter to the activation function. The computer-executable components further comprise an activation function component that determines the output of the activation function based on the second value of the output limit as the parameter.

3.

发明专利
Enhanced low precision binary floating-point formatting 未知

公开(公告)号：GB2586559B

公开(公告)日：2021-07-14

申请号：GB202019054

申请日：2019-05-30

Applicant: IBM

Inventor： SILVIA MELITTA MUELLER , ANKUR AGRAWAL , BRUCE FLEISCHER , KAILASH GOPALAKRISHNAN , DONGSOO LEE

IPC: G06F7/483 , G06F7/499

Abstract: Techniques for operating on and calculating binary floating-point numbers using an enhanced floating-point number format are presented. The enhanced format can comprise a single sign bit, six bits for the exponent, and nine bits for the fraction. Using six bits for the exponent can provide an enhanced exponent range that facilitates desirably fast convergence of computing-intensive algorithms and low error rates for computing-intensive applications. The enhanced format can employ a specified definition for the lowest binade that enables the lowest binade to be used for zero and normal numbers; and a specified definition for the highest binade that enables it to be structured to have one data point used for a merged Not-a-Number (NaN)/infinity symbol and remaining data points used for finite numbers. The signs of zero and merged NaN/infinity can be “don't care” terms. The enhanced format employs only one rounding mode, which is for rounding toward nearest up.

4.

发明专利
Robust Gradient weight compression schemes for deep learning applications 未知

公开(公告)号：GB2582232A

公开(公告)日：2020-09-16

申请号：GB202009717

申请日：2018-11-30

Applicant: IBM

Inventor： CHIA-YU CHEN , ANKUR AGRAWAL , DANIEL BRAND , KAILASH GOPALAKRISHNAN , JUNGWOOK CHOI

IPC: G06N3/08 , G06N3/04

Abstract: Embodiments of the present invention provide a computer-implemented method for adaptive residual gradient compression for training of a deep learning neural network (DNN). The method includes obtaining, by a first learner, a current gradient vector for a neural network layer of the DNN, in which the current gradient vector includes gradient weights of parameters of the neural network layer that are calculated from a mini-batch of training data. A current residue vector is generated that includes residual gradient weights for the mini-batch. A compressed current residue vector is generated based on dividing the residual gradient weights of the current residue vector into a plurality of bins of a uniform size and quantizing a subset of the residual gradient weights of one or more bins of the plurality of bins. The compressed current residue vector is then transmitted to a second learner of the plurality of learners or to a parameter server.

5.

发明专利
Padding input data for artificial intelligence accelerators 未知

公开(公告)号：GB2630701A

公开(公告)日：2024-12-04

申请号：GB202411765

申请日：2023-02-20

Applicant: IBM

Inventor： CEDRIC LICHTENAU , VIJAYALAKSHMI SRINIVASAN , SUNIL K SHUKLA , SWAGATH VENKATARAMANI , KAILASH GOPALAKRISHNAN , HOLGER HORBACH , RAZVAN PETER FIGULI , WEI WANG , YULONG LI , MARTIN LUTZ

IPC: G06F9/30 , G06F9/38 , G06F9/50

Abstract: Processing input data for transmittal to a data consumer such as an artificial intelligence engine is performed by arranging the input data into a uniform structure made up of sticks of data combined to form pages of sticks. A stick is any well-sized set of input data elements whereby the size of the stick is fixed. A masking pattern is established for sticks of data having certain ranges of invalid data for consumption of partial sticks while maintaining validity of the input data being transferred. The mask pattern is derived based on set-active-mask-and-value (SAMV) instructions. The derived mask pattern is carried forward for subsequent load instructions to the data consumer.

6.

发明专利
System-aware selective quantization for performance optimized distributed deep learning 未知

公开(公告)号：GB2600872A

公开(公告)日：2022-05-11

申请号：GB202201906

申请日：2020-07-17

Applicant: IBM

Inventor： JUNGWOOK CHOI , SWAGATH VENKATARAMANI , VIJAYALAKSHMI SRINIVASAN , KAILASH GOPALAKRISHNAN

IPC: G06K9/00

Abstract: A convolutional neural network includes a front layer, a back layer, and a plurality of other layers that are connected between the front layer and the back layer. One of the other layers is a transition layer. A first precision is assigned to activations of neurons from the front layer back to the transition layer and a second precision is assigned to activations of the neurons from the transition layer back to the back layer. A third precision is assigned to weights of inputs to neurons from the front layer back to the transition layer and a fourth precision is assigned to weights of inputs to the neurons from the transition layer back to the back layer. In some embodiments the layers forward of the transition layer have a different convolutional kernel than the layers rearward of the transition layer.

7.

发明专利
Machine learning hardware having reduced precision parameter components for efficient parameter update 未知

公开(公告)号：GB2600871A

公开(公告)日：2022-05-11

申请号：GB202201893

申请日：2020-08-17

Applicant: IBM

Inventor： XIAO SUN , JUNGWOOK CHOI , NAIGANG WANG , CHIA-YU CHEN , KAILASH GOPALAKRISHNAN

IPC: G06N3/08

Abstract: An apparatus for training and inferencing a neural network includes circuitry that is configured to generate a first weight having a first format including a first number of bits based at least in part on a second weight having a second format including a second number of bits and a residual having a third format including a third number of bits. The second number of bits and the third number of bits are each less than the first number of bits. The circuitry is further configured to update the second weight based at least in part on the first weight and to update the residual based at least in part on the updated second weight and the first weight. The circuitry is further configured to update the first weight based at least in part on the updated second weight and the updated residual.

8.

发明专利
Compression of fully connected/recurrent layers of deep network(s) through enforcing spatial locality to weight matrices and effecting frequency compression 未知

公开(公告)号：GB2582233A

公开(公告)日：2020-09-16

申请号：GB202009750

申请日：2018-11-30

Applicant: IBM

Inventor： JUNGWOOK CHOI , PRITISH NARAYANAN , CHIA-YU CHEN , KAILASH GOPALAKRISHNAN , SUYOG GUPTA

IPC: G06N3/08

Abstract: A system, having a memory that stores computer executable components, and a processor that executes the computer executable components, reduces data size in connection with training a neural network by exploiting spatial locality to weight matrices and effecting frequency transformation and compression. A receiving component receives neural network data in the form of a compressed frequency-domain weight matrix. A segmentation component segments the initial weight matrix into original sub-components, wherein respective original sub-components have spatial weights. A sampling component applies a generalized weight distribution to the respective original sub-components to generate respective normalized sub-components. A transform component applies a transform to the respective normalized sub-components. A cropping component crops high-frequency weights of the respective transformed normalized sub-components to yield a set of low-frequency normalized sub-components to generate a compressed representation of the original sub-components.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification