-
Publication No.: GB2581728A
Publication Date: 2020-08-26
Application No.: GB202006969
Application Date: 2018-10-04
Applicant: IBM
Inventor: ZHUO WANG , JUNGWOOK CHOI , KAILASH GOPALAKRISHNAN , SWAGATH VENKATARAMANI , CHARBEL SAKR
IPC: G06N3/08
Abstract: Techniques that facilitate improving the efficiency of a neural network are described. In one embodiment, a system is provided that comprises a memory that stores computer-executable components and a processor that executes the computer-executable components stored in the memory. In one implementation, the computer-executable components comprise an initialization component that selects an initial value of an output limit, wherein the output limit indicates a range for an output of an activation function of a neural network. The computer-executable components further comprise a training component that modifies the initial value of the output limit during training to a second value of the output limit, the second value of the output limit being provided as a parameter to the activation function. The computer-executable components further comprise an activation function component that determines the output of the activation function based on the second value of the output limit as the parameter.
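The mechanism in this abstract (a clipped activation whose upper limit is itself a trainable parameter) can be sketched as follows. This is a minimal illustration, not the patented implementation: the learning rate, example inputs, and the gradient rule (upstream gradient flowing to the limit wherever the activation saturates) are assumptions chosen for clarity.

```python
import numpy as np

def clipped_activation(x, alpha):
    """Clip pre-activations to the range [0, alpha]; alpha is the output limit."""
    return np.minimum(np.maximum(x, 0.0), alpha)

def alpha_gradient(x, alpha, upstream_grad):
    """Gradient of the loss w.r.t. alpha: the upstream gradient flows into
    alpha wherever the activation saturated at the limit."""
    return float(np.sum(upstream_grad * (x >= alpha)))

# One illustrative update step: the initial limit is modified during training.
alpha = 10.0                          # initial value of the output limit
lr = 0.01                             # assumed learning rate
x = np.array([-1.0, 2.0, 12.0])       # example pre-activations
grad_out = np.ones_like(x)            # placeholder upstream gradients
alpha -= lr * alpha_gradient(x, alpha, grad_out)   # second value of the limit
y = clipped_activation(x, alpha)      # output determined by the new limit
```

Training the limit lets the network trade clipping error against quantization resolution, since a smaller output range can be represented with fewer bits.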
-
Publication No.: GB2590000A
Publication Date: 2021-06-16
Application No.: GB202100363
Application Date: 2019-06-13
Applicant: IBM
Inventor: SWAGATH VENKATARAMANI , SHUBHAM JAIN , VIJAYALAKSHMI SRINIVASAN , LELAND CHANG , JUNGWOOK CHOI
IPC: G06N3/02
Abstract: A compensated deep neural network (compensated-DNN) is provided. A first vector having a set of components and a second vector having a set of corresponding components are received. A component of the first vector includes a first quantized value and a first compensation instruction, and a corresponding component of the second vector includes a second quantized value and a second compensation instruction. The first quantized value is multiplied with the second quantized value to compute a raw product value. The raw product value is compensated for a quantization error according to the first and second compensation instructions to produce a compensated product value. The compensated product value is added into an accumulated value for the dot product. The accumulated value is converted into an output vector of the dot product. The output vector includes an output quantized value and an output compensation instruction.
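The compensated multiply-accumulate step described above can be sketched with a toy scheme. Everything specific here is an assumption: the patent does not disclose its instruction encoding, so this sketch uses a simple sign-of-rounding-error "instruction" and a first-order error correction of half a quantization step per operand.

```python
def quantize(x, step=0.25):
    """Quantize x to the nearest multiple of `step`, and keep a compensation
    instruction: the sign of the rounding error (assumed encoding)."""
    q = round(x / step) * step
    comp = 1 if x > q else (-1 if x < q else 0)    # direction of the error
    return q, comp

def compensated_mac(a, b, acc, step=0.25):
    """Multiply two quantized components, compensate the raw product for
    quantization error, and add into the running dot-product accumulator."""
    qa, ca = a
    qb, cb = b
    raw = qa * qb                                   # raw product value
    # first-order correction q_a*e_b + q_b*e_a, with |e| taken as step/2
    raw += (qa * cb + qb * ca) * (step / 2)         # compensated product value
    return acc + raw

acc = 0.0
acc = compensated_mac(quantize(0.30), quantize(0.60), acc)
```

The point of the design is that compensation happens inside the MAC datapath, so the stored vectors stay in low-precision form while the accumulator recovers part of the lost accuracy.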
-
Publication No.: GB2590000B
Publication Date: 2022-12-07
Application No.: GB202100363
Application Date: 2019-06-13
Applicant: IBM
Inventor: SWAGATH VENKATARAMANI , SHUBHAM JAIN , VIJAYALAKSHMI SRINIVASAN , LELAND CHANG , JUNGWOOK CHOI
IPC: G06N3/063
Abstract: A compensated deep neural network (compensated-DNN) is provided. A first vector having a set of components and a second vector having a set of corresponding components are received. A component of the first vector includes a first quantized value and a first compensation instruction, and a corresponding component of the second vector includes a second quantized value and a second compensation instruction. The first quantized value is multiplied with the second quantized value to compute a raw product value. The raw product value is compensated for a quantization error according to the first and second compensation instructions to produce a compensated product value. The compensated product value is added into an accumulated value for the dot product. The accumulated value is converted into an output vector of the dot product. The output vector includes an output quantized value and an output compensation instruction.
-
Publication No.: GB2630701A
Publication Date: 2024-12-04
Application No.: GB202411765
Application Date: 2023-02-20
Applicant: IBM
Inventor: CEDRIC LICHTENAU , VIJAYALAKSHMI SRINIVASAN , SUNIL K SHUKLA , SWAGATH VENKATARAMANI , KAILASH GOPALAKRISHNAN , HOLGER HORBACH , RAZVAN PETER FIGULI , WEI WANG , YULONG LI , MARTIN LUTZ
Abstract: Processing input data for transmittal to a data consumer such as an artificial intelligence engine is performed by arranging the input data into a uniform structure made up of sticks of data combined to form pages of sticks. A stick is a fixed-size set of input data elements. A masking pattern is established for sticks of data having certain ranges of invalid data, allowing consumption of partial sticks while maintaining the validity of the input data being transferred. The mask pattern is derived from set-active-mask-and-value (SAMV) instructions, and the derived mask pattern is carried forward for subsequent load instructions to the data consumer.
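The stick arrangement with a validity mask for the final partial stick can be sketched as below. The stick size of 8 elements, the padding value, and the boolean-list mask representation are illustrative assumptions; the actual SAMV instruction format is not described in the abstract.

```python
STICK_SIZE = 8           # assumed fixed number of elements per stick

def to_sticks(data, pad=0.0):
    """Arrange input data into fixed-size sticks, padding any final partial
    stick and recording a per-element validity mask for it."""
    sticks, masks = [], []
    for i in range(0, len(data), STICK_SIZE):
        chunk = list(data[i:i + STICK_SIZE])
        valid = len(chunk)
        chunk += [pad] * (STICK_SIZE - valid)            # pad to full stick
        mask = [j < valid for j in range(STICK_SIZE)]    # True = valid element
        sticks.append(chunk)
        masks.append(mask)
    return sticks, masks

sticks, masks = to_sticks(list(range(10)))
```

Here ten elements produce two sticks; the second stick carries a mask marking only its first two elements valid, so the consumer can load the full stick while ignoring the padding.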
-
Publication No.: GB2604060A
Publication Date: 2022-08-24
Application No.: GB202206096
Application Date: 2020-09-29
Applicant: IBM
Inventor: SWAGATH VENKATARAMANI , VIJAYALAKSHMI SRINIVASAN , PHILIP HEIDELBERGER
IPC: G06N3/063
Abstract: Hybrid parallelism techniques, in which a mix of data and model parallelism techniques is used to split the workload of a layer across an array of processors, are disclosed. When configuring the array, the bandwidth of the processors in one direction may be greater than the bandwidth in the other direction. Each layer is characterized according to whether it is more feature-heavy or weight-heavy. Depending on this characterization, the workload of an NN layer can be assigned to the array using a hybrid parallelism technique rather than solely the data parallelism technique or solely the model parallelism technique. For example, if an NN layer is more weight-heavy than feature-heavy, data parallelism is used in the direction with the greater bandwidth (to minimize the negative impact of weight reduction) while model parallelism is used in the direction with the smaller bandwidth.
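The per-layer assignment rule in this example can be sketched as a simple decision function. The byte-count comparison used as the feature-heavy/weight-heavy test is an assumed proxy; the patent only requires that layers be characterized one way or the other.

```python
def assign_parallelism(weight_bytes, feature_bytes):
    """Pick per-dimension parallelism for a 2D processor array whose
    fast dimension is assumed to have the greater link bandwidth."""
    if weight_bytes > feature_bytes:
        # Weight-heavy: weight reduction traffic dominates, so place data
        # parallelism on the fast dimension, model parallelism on the slow one.
        return {"fast_dim": "data", "slow_dim": "model"}
    # Feature-heavy: feature exchange dominates, so flip the assignment.
    return {"fast_dim": "model", "slow_dim": "data"}
```

A fully-connected layer with a large weight matrix and a small activation tensor would thus get data parallelism on the fast links, while an early convolutional layer with large feature maps would get the opposite.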
-
Publication No.: GB2600872A
Publication Date: 2022-05-11
Application No.: GB202201906
Application Date: 2020-07-17
Applicant: IBM
IPC: G06K9/00
Abstract: A convolutional neural network includes a front layer, a back layer, and a plurality of other layers that are connected between the front layer and the back layer. One of the other layers is a transition layer. A first precision is assigned to activations of neurons from the front layer back to the transition layer and a second precision is assigned to activations of the neurons from the transition layer back to the back layer. A third precision is assigned to weights of inputs to neurons from the front layer back to the transition layer and a fourth precision is assigned to weights of inputs to the neurons from the transition layer back to the back layer. In some embodiments the layers forward of the transition layer have a different convolutional kernel than the layers rearward of the transition layer.
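The two-region precision assignment around the transition layer can be sketched as a schedule builder. The specific bit widths (8-bit front, 4-bit back) are illustrative assumptions; the abstract only requires that the regions before and after the transition layer receive distinct activation and weight precisions.

```python
def precision_schedule(num_layers, transition,
                       front_act_bits=8, back_act_bits=4,
                       front_wt_bits=8, back_wt_bits=4):
    """Assign one (activation_bits, weight_bits) pair to every layer up to
    and including the transition layer, and a second pair to the rest."""
    schedule = []
    for layer in range(num_layers):
        if layer <= transition:
            schedule.append((front_act_bits, front_wt_bits))
        else:
            schedule.append((back_act_bits, back_wt_bits))
    return schedule

# 5-layer network with the transition at layer index 2
schedule = precision_schedule(5, 2)
```

Splitting the network at a transition layer lets the front layers, which tend to be more sensitive to quantization noise, keep higher precision while the back layers run cheaper.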