-
1.
公开(公告)号:US20240078432A1
公开(公告)日:2024-03-07
申请号:US18508248
申请日:2023-11-14
Applicant: Kneron Inc.
Inventor: JIE WU , JUNJIE SU , BIKE XIE , Chun-Chen Liu
Abstract: A self-tuning model compression methodology for reconfiguring a Deep Neural Network (DNN) includes: receiving a pre-trained DNN model and a data set; performing an inter-layer sparsity analysis to generate a first sparsity result; and performing an intra-layer sparsity analysis to generate a second sparsity result, including: defining a plurality of sparsity metrics for the network; performing forward and backward passes to collect data corresponding to the sparsity metrics; using the collected data to calculate values for the defined sparsity metrics; and visualizing the calculated values using at least a histogram. The methodology further includes: according to the first and second sparsity results, performing low-rank approximation on the pre-trained DNN; pruning the represented DNN model according to the first and second sparsity results; performing quantization on the pruned DNN model according to the first and second sparsity results; and executing the reconfigured model on a user terminal for an end-user application.