-
1.
Publication number: US20230196030A1
Publication date: 2023-06-22
Application number: US17557219
Filing date: 2021-12-21
Applicant: GENESYS CLOUD SERVICES, INC.
Inventor: PAVAN BUDUGUPPA, RAMASUBRAMANIAN SUNDARAM, VEERA RAGHAVENDRA ELLURU
CPC classification number: G06F40/40, G06F40/30, G06N3/082, G06N3/0454, G06N5/022
Abstract: A method for creating a student model from a teacher model for knowledge distillation. The method includes: providing a first model; using a first instance of the first model to create the teacher model by training the first instance of the first model on a training dataset; using a second instance of the first model to create the student model by training the second instance of the first model on a subset of the training dataset; identifying corresponding layers in the teacher model and the student model; for each of the corresponding layers, computing a weight similarity criterion; ranking the corresponding layers according to the weight similarity criterion; selecting, based on the ranking, one or more of the corresponding layers for designation as one or more discard layers; and removing from the student model the one or more discard layers.
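Illustrative sketch (not part of the patent record): the ranking and removal steps of this abstract could look roughly like the following. The abstract does not name the weight similarity criterion or say which end of the ranking is discarded, so cosine similarity and "discard the most similar layers" are assumptions made here for illustration only. The function names and the dictionary-of-flattened-weights model representation are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two flattened weight vectors (assumed criterion)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_discard_layers(teacher, student, num_discard):
    """Rank corresponding layers by similarity and pick the top ones.

    Assumption: the layers whose weights are most similar between
    teacher and student are the ones designated as discard layers.
    """
    shared = [name for name in student if name in teacher]
    ranked = sorted(
        shared,
        key=lambda name: cosine_similarity(teacher[name], student[name]),
        reverse=True,
    )
    return ranked[:num_discard]


def prune_student(student, discard):
    """Return a copy of the student without the discard layers."""
    return {name: w for name, w in student.items() if name not in discard}


# Toy weights: each layer is a flattened list of floats (hypothetical data).
teacher = {"layer0": [1.0, 0.0], "layer1": [0.0, 1.0], "layer2": [1.0, 1.0]}
student = {"layer0": [1.0, 0.0], "layer1": [1.0, 0.0], "layer2": [1.0, 0.0]}

discard = select_discard_layers(teacher, student, num_discard=1)
pruned = prune_student(student, discard)
print(discard, list(pruned))
```

In this toy data only `layer0` has identical weights in both models, so it ranks first and is the one removed.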
-
2.
Publication number: US20230196024A1
Publication date: 2023-06-22
Application number: US17557245
Filing date: 2021-12-21
Applicant: GENESYS CLOUD SERVICES, INC.
Inventor: PAVAN BUDUGUPPA, RAMASUBRAMANIAN SUNDARAM, VEERA RAGHAVENDRA ELLURU
CPC classification number: G06F40/30, G06N5/02, G06N3/0454
Abstract: A method for creating a student model from a teacher model for knowledge distillation. The method may include: providing the teacher model trained on a first training dataset; generating candidate student models, wherein each of the candidate student models is a model having a unique permutation of layers derived by randomly selecting one or more layers of the plurality of layers of the teacher model for removal; generating a second training dataset; for each of the candidate student models: providing the second training dataset as inputs; recording outputs generated; and based on the recorded outputs, evaluating a performance according to a predetermined model evaluation criterion; determining which of the candidate student models performed best among the candidate student models based on the predetermined model evaluation criterion; and identifying a preferred candidate student model.
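Illustrative sketch (not part of the patent record): the candidate generation and selection steps of this abstract could be approximated as below. The abstract does not state the model evaluation criterion, so mean squared error against the teacher's own outputs is an assumption. The layers are modeled as plain Python callables, and all names (`generate_candidates`, `pick_best_candidate`, and so on) are hypothetical.

```python
import random


def run_model(layers, x):
    """Apply the layers in order to a scalar input."""
    for layer in layers:
        x = layer(x)
    return x


def generate_candidates(num_layers, num_candidates, rng):
    """Randomly remove layers, returning unique tuples of kept layer indices."""
    max_unique = 2 ** num_layers - 2  # non-empty, proper subsets of layers
    if num_candidates > max_unique:
        raise ValueError("not enough distinct layer subsets")
    seen = set()
    while len(seen) < num_candidates:
        k = rng.randint(1, num_layers - 1)  # remove at least 1, keep at least 1
        removed = set(rng.sample(range(num_layers), k))
        seen.add(tuple(i for i in range(num_layers) if i not in removed))
    return sorted(seen)


def mean_squared_error(layers, reference_layers, inputs):
    """Assumed criterion: squared gap to the reference (teacher) outputs."""
    errors = [
        (run_model(layers, x) - run_model(reference_layers, x)) ** 2
        for x in inputs
    ]
    return sum(errors) / len(errors)


def pick_best_candidate(teacher_layers, candidates, inputs):
    """Return the kept-index tuple with the lowest error on the inputs."""
    def score(kept):
        student_layers = [teacher_layers[i] for i in kept]
        return mean_squared_error(student_layers, teacher_layers, inputs)
    return min(candidates, key=score)


# Toy teacher: the last layer is an identity, so dropping it costs nothing.
teacher_layers = [lambda x: x + 1.0, lambda x: x * 2.0, lambda x: x + 0.0]
second_dataset = [0.0, 1.0, 2.0]  # stand-in for the second training dataset

candidates = generate_candidates(len(teacher_layers), 6, random.Random(0))
best = pick_best_candidate(teacher_layers, candidates, second_dataset)
print(best)
```

With three layers there are exactly six possible candidates, so requesting all six makes the outcome independent of the random seed. The candidate that keeps layers 0 and 1 reproduces the teacher exactly and is selected.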
-