SELF-TUNING MODEL COMPRESSION METHODOLOGY FOR RECONFIGURING DEEP NEURAL NETWORK AND ELECTRONIC DEVICE

    Publication number: US20240078432A1

    Publication date: 2024-03-07

    Application number: US18508248

    Application date: 2023-11-14

    Applicant: Kneron Inc.

    CPC classification number: G06N3/082 G06N3/04

    Abstract: A self-tuning model compression methodology for reconfiguring a Deep Neural Network (DNN) includes: receiving a pre-trained DNN model and a data set; performing an inter-layer sparsity analysis to generate a first sparsity result; and performing an intra-layer sparsity analysis to generate a second sparsity result, including: defining a plurality of sparsity metrics for the network; performing forward and backward passes to collect data corresponding to the sparsity metrics; using the collected data to calculate values for the defined sparsity metrics; and visualizing the calculated values using at least a histogram. The methodology further includes: according to the first and second sparsity results, performing low-rank approximation on the pre-trained DNN; pruning the represented DNN model according to the first and second sparsity results; performing quantization on the pruned DNN model according to the first and second sparsity results; and executing the reconfigured model on a user terminal for an end-user application.
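The intra-layer analysis and the later compression steps named in the abstract can be illustrated with a minimal sketch. This is not the applicant's implementation. The sparsity metric (fraction of near-zero weights), the threshold `eps`, the 8-bit symmetric quantization scheme, the layer names, and all function names are assumptions chosen for illustration. The forward and backward passes and the low-rank approximation step are omitted, and the histogram is returned as bin counts rather than drawn.

```python
def sparsity(weights, eps=1e-3):
    """Assumed sparsity metric: fraction of weights with magnitude below eps."""
    return sum(1 for w in weights if abs(w) < eps) / len(weights)


def histogram(values, bins=4, lo=0.0, hi=1.0):
    """Count values per equal-width bin over [lo, hi]; hi falls in the last bin."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def prune(weights, threshold=1e-3):
    """Magnitude pruning: zero out weights whose magnitude is below threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


def quantize(weights, num_bits=8):
    """Symmetric uniform quantization; returns (integer codes, scale)."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    qmax = 2 ** (num_bits - 1) - 1
    scale = max_abs / qmax
    return [round(w / scale) for w in weights], scale


# Hypothetical per-layer weights standing in for a pre-trained DNN.
layers = {
    "conv1": [0.5, -0.0002, 0.0, 0.2],
    "fc1": [0.0001, 0.0, -0.0004, 1.0],
}

# Per-layer metric values, then a histogram of those values across layers.
per_layer = {name: sparsity(w) for name, w in layers.items()}
sparsity_hist = histogram(list(per_layer.values()))

# Prune, then quantize each layer.
compressed = {name: quantize(prune(w)) for name, w in layers.items()}
```

In this sketch the same threshold drives both the metric and the pruning step. The abstract only says that pruning and quantization are performed according to the two sparsity results, so the exact coupling between analysis and compression is left open here.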
