-
Publication No.: EP3295383A1
Publication date: 2018-03-21
Application No.: EP16719637
Filing date: 2016-04-14
Applicant: QUALCOMM INC
Inventor: LIN DEXU , BADIN MATTHEW , HOWARD DAVID EDWARD , DIJKMAN DANIEL HENDRICUS FRANCISCUS , TREMAINE MICHAEL COLIN , SARAH ANTHONY
IPC: G06N3/063
CPC classification number: G06N3/08 , G06N3/063 , G06N99/005
Abstract: A method of reducing computational complexity for a fixed point neural network operating in a system having a limited bit width in a multiplier-accumulator (MAC) includes reducing a number of bit shift operations when computing activations in the fixed point neural network. The method also includes balancing an amount of quantization error and an overflow error when computing activations in the fixed point neural network.
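Illustration (not part of the patent record): the trade-off between quantization error and overflow error named in the abstract can be sketched as a search over the number of fractional bits at a fixed bit width. More fractional bits shrink rounding error but narrow the representable range, so large values saturate. The function names, the squared-error metric, and the exhaustive search are assumptions made for this example; the abstract does not specify how the balance is achieved, and the sketch does not cover the reduction of bit shift operations.

```python
def fixed_point_error(values, total_bits, frac_bits):
    """Total squared error when storing `values` as signed fixed point.

    Hypothetical helper: the error combines rounding (quantization) error
    and saturation (overflow) error for a given number of fractional bits.
    """
    scale = 2 ** frac_bits
    q_min = -(2 ** (total_bits - 1))
    q_max = 2 ** (total_bits - 1) - 1
    error = 0.0
    for v in values:
        q = round(v * scale)           # quantize to the nearest step
        q = max(q_min, min(q_max, q))  # saturate instead of wrapping on overflow
        error += (v - q / scale) ** 2
    return error


def choose_frac_bits(values, total_bits=8):
    """Pick the fractional bit count with the smallest combined error (assumed criterion)."""
    return min(range(total_bits),
               key=lambda f: fixed_point_error(values, total_bits, f))
```

With small activations such as `[0.1, -0.2, 0.3]`, the search favors many fractional bits. With large ones such as `[100.0, -50.0, 3.0]`, it gives up fractional precision to avoid saturation.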
-
Publication No.: EP3295385A1
Publication date: 2018-03-21
Application No.: EP16718825
Filing date: 2016-04-14
Applicant: QUALCOMM INC
Inventor: LIN DEXU , ANNAPUREDDY VENKATA SREEKANTA REDDY , HOWARD DAVID EDWARD , JULIAN DAVID JONATHAN , MAJUMDAR SOMDEB , BELL II WILLIAM RICHARD
Abstract: A method of quantizing a floating point machine learning network to obtain a fixed point machine learning network using a quantizer may include selecting at least one moment of an input distribution of the floating point machine learning network. The method may also include determining quantizer parameters for quantizing values of the floating point machine learning network based at least in part on the at least one selected moment of the input distribution of the floating point machine learning network to obtain corresponding values of the fixed point machine learning network.
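Illustration (not part of the patent record): one way to derive quantizer parameters from moments of an input distribution is to use the mean as an offset and a multiple of the standard deviation as the covered range. This is a minimal sketch under those assumptions. The function names, the `num_std` range factor, and the uniform signed quantizer are hypothetical choices for the example, not the method claimed in the application.

```python
import math


def quantizer_params_from_moments(values, bit_width=8, num_std=3.0):
    """Return (step, offset) from the first two moments of `values`.

    Assumption: the quantizer covers mean +/- num_std * std with
    2**bit_width uniform steps.
    """
    n = len(values)
    mean = sum(values) / n                               # first moment
    variance = sum((v - mean) ** 2 for v in values) / n  # second central moment
    std = math.sqrt(variance)
    levels = 2 ** bit_width
    step = (2.0 * num_std * std) / levels if std > 0 else 1.0
    return step, mean


def quantize(value, step, offset, bit_width=8):
    """Map a float to a signed integer code, saturating at the range limits."""
    q_min = -(2 ** (bit_width - 1))
    q_max = 2 ** (bit_width - 1) - 1
    q = round((value - offset) / step)
    return max(q_min, min(q_max, q))


def dequantize(code, step, offset):
    """Recover the approximate float value for an integer code."""
    return code * step + offset
```

Values within `num_std` standard deviations of the mean are represented with an error of at most half a step, apart from the topmost edge of the range where the code saturates. Values further out saturate to the nearest end of the integer range.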
-