Publication No.: GB2585616A
Publication Date: 2021-01-13
Application No.: GB202016400
Application Date: 2019-04-10
Applicant: IBM
Inventor: TAESUNG LEE, IAN MICHAEL MOLLOY, WILKA CARVALHO, BENJAMIN JAMES EDWARDS, JIALONG ZHANG, BRYANT CHEN
Abstract: Mechanisms are provided for evaluating a trained machine learning model to determine whether the machine learning model has a backdoor trigger. The mechanisms process a test dataset to generate output classifications for the test dataset, and generate, for the test dataset, gradient data indicating a degree of change of elements within the test dataset based on the output generated by processing the test dataset. The mechanisms analyze the gradient data to identify a pattern of elements within the test dataset indicative of a backdoor trigger. The mechanisms generate, in response to the analysis identifying the pattern of elements indicative of a backdoor trigger, an output indicating the existence of the backdoor trigger in the trained machine learning model.
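Illustrative sketch (not taken from the patent text): the steps in the abstract can be approximated on a toy model to show the general idea. Everything here is an assumption made for illustration: the linear softmax classifier, the helper names (`input_gradients`, `find_trigger_patterns`, `make_toy_model`), the median/MAD outlier rule, and the threshold `k`. The patent does not specify any of these. The sketch computes, for each candidate target class, the gradient of the loss with respect to each input element over a test dataset, aggregates the gradient magnitudes, and flags elements whose aggregate stands far above the rest as a possible trigger pattern.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def input_gradients(W, b, X, target):
    """Gradient of the cross-entropy loss toward class `target` with respect
    to every input element, for the assumed toy model logits = X @ W + b."""
    P = softmax(X @ W + b)                  # (n_samples, n_classes)
    onehot = np.zeros_like(P)
    onehot[:, target] = 1.0
    return (P - onehot) @ W.T               # (n_samples, n_features)

def find_trigger_patterns(W, b, X, k=6.0):
    """Return {target_class: suspicious element indices}.
    An element is suspicious when its mean absolute gradient exceeds
    median + k * MAD over all elements (an assumed outlier rule)."""
    findings = {}
    for target in range(W.shape[1]):
        G = input_gradients(W, b, X, target)
        agg = np.abs(G).mean(axis=0)        # aggregate over the test dataset
        med = np.median(agg)
        mad = np.median(np.abs(agg - med)) + 1e-12
        suspects = np.flatnonzero(agg > med + k * mad)
        if suspects.size:
            findings[target] = suspects
    return findings

def make_toy_model(n_features=64, n_classes=3, trigger=(0, 1, 8, 9),
                   target=2, boost=3.0, seed=0):
    """Hypothetical backdoored model: small random weights, plus a large
    weight from each trigger element to the attacker's target class."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-0.05, 0.05, size=(n_features, n_classes))
    W[list(trigger), target] += boost
    b = np.zeros(n_classes)
    X = rng.uniform(0.0, 1.0, size=(200, n_features))  # stand-in test dataset
    return W, b, X
```

A possible usage: `W, b, X = make_toy_model()` followed by `findings = find_trigger_patterns(W, b, X)`; `bool(findings)` then plays the role of the output indicating that a backdoor trigger may exist. In this toy setup the planted elements 0, 1, 8 and 9 are reported for class 2; other elements or classes may be flagged as well, so a real evaluation would need a more careful pattern analysis than this outlier rule.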