EFFICIENT SOFTMAX COMPUTATION
    Invention Application

    Publication No.: US20220067513A1

    Publication Date: 2022-03-03

    Application No.: US17112795

    Filing Date: 2020-12-04

    Applicant: NVIDIA Corp.

    Abstract: Solutions that improve the efficiency of Softmax computation for deep learning inference in transformers and other neural networks. The solutions use a reduced-precision implementation of the operations within Softmax, replace e^x with 2^x to reduce the instruction overhead associated with computing e^x, and replace the floating-point max computation with an integer max computation. Further described is a scalable implementation that decomposes Softmax into UnNormalized Softmax and Normalization operations.
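
    The arithmetic behind the abstract can be sketched as follows. This is a minimal NumPy sketch of the general technique, not the patented implementation; the function names, the float32 precision, and the use of NumPy are illustrative assumptions. It relies on the identity e^x = 2^(x * log2(e)), so a base-2 exponential can stand in for e^x, and it splits Softmax into an un-normalized pass followed by a normalization pass.

    import numpy as np

    LOG2_E = np.float32(1.4426950408889634)  # log2(e); folds e^x into a base-2 exponential

    def unnormalized_softmax(x):
        """First pass: subtract the max and exponentiate with base 2.

        2^((x - m) * log2(e)) equals e^(x - m), so exp2 replaces the more
        expensive e^x evaluation. The abstract's integer max replacement is
        not shown; a plain floating-point max is used here.
        """
        x = np.asarray(x, dtype=np.float32)
        m = x.max()                          # max reduction for numerical stability
        return np.exp2((x - m) * LOG2_E)     # un-normalized probabilities

    def normalize(u):
        """Second pass: divide by the sum to obtain the final Softmax."""
        return u / u.sum()

    def softmax(x):
        """Softmax decomposed into UnNormalized Softmax followed by Normalization."""
        return normalize(unnormalized_softmax(x))

    # Agrees with an e^x-based softmax to within float32 rounding.
    x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
    print(softmax(x))

    The appeal of the base-2 form is that 2^x typically maps to a single hardware exponential instruction, whereas e^x needs additional scaling work; splitting the computation into the two passes above is one way the un-normalized results can be produced independently and normalized afterwards, in the spirit of the decomposition the abstract describes.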
