Building a Machine Learning model without compromising data privacy
Abstract:
Systems and methods include obtaining file identifiers associated with files in production data; obtaining lab data from one or more public repositories of malware samples based on those file identifiers; and using the lab data to train a machine learning process for classifying malware in the production data. The file identifiers can be obtained by monitoring users associated with the files, with only the file identifiers retained from that monitoring. The lab data can include samples from the one or more public repositories that match the corresponding file identifiers of the production data. The lab data can also include samples from the one or more public repositories whose features are closely related to features of the production data.
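The identifier-matching step described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the use of SHA-256 digests as file identifiers, the in-memory `public_repository` dictionary, and the function names `file_identifier`, `collect_identifiers`, and `build_lab_data` are all assumptions introduced here for illustration. The abstract does not specify the identifier type or any repository interface.

```python
import hashlib


def file_identifier(content: bytes) -> str:
    """Return a SHA-256 hex digest (assumed identifier type, not from the source)."""
    return hashlib.sha256(content).hexdigest()


def collect_identifiers(observed_files):
    """Keep only identifiers from monitoring; file contents are not retained."""
    return {file_identifier(content) for content in observed_files}


def build_lab_data(identifiers, public_repository):
    """Select public samples whose identifier matches one seen in production."""
    return [
        public_repository[identifier]
        for identifier in sorted(identifiers)
        if identifier in public_repository
    ]


# Hypothetical stand-in for a public malware repository:
# identifier -> (sample bytes, label).
repo_samples = [
    (b"malicious-payload", "malware"),
    (b"harmless-installer", "benign"),
]
public_repository = {
    file_identifier(content): (content, label) for content, label in repo_samples
}

# Files seen in production; only their identifiers leave this step.
production_files = [b"malicious-payload", b"private-user-document"]
identifiers = collect_identifiers(production_files)

# Lab data holds public samples only, so private content is never used for training.
lab_data = build_lab_data(identifiers, public_repository)
```

In this sketch the private document has no match in the public repository, so it contributes nothing to `lab_data`; only the publicly available sample and its label would be passed on to whatever training routine is used.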