Overly optimistic data patterns and learned adversarial latent features
Abstract:
Systems for improving the security of a computer-implemented artificial intelligence by: monitoring one or more transactions received by a machine learning decision model; receiving a first score generated by the machine learning decision model in association with a first transaction; identifying the first transaction as belonging to a first class, in response to the first score being lower than a certain score threshold and the first transaction having a low occurrence likelihood; receiving a second score in association with the first transaction, based on one or more adversarial latent features associated with the first transaction as detectable by an adversary detection model; and determining at least one adversarial latent transaction feature being exploited by the first transaction, in response to determining that the second score falls above the certain score threshold.
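The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function and parameter names (`screen_transaction`, `decision_model`, `likelihood`, `adversary_model`, `latent_feature_scores`), the threshold values, and the per-feature scoring used to pick out exploited features are all assumptions made for illustration. The abstract does not say how occurrence likelihood is estimated or how an exploited latent feature is identified.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical representation: a transaction is a mapping of field name to value.
Transaction = Dict[str, float]


@dataclass
class Finding:
    first_score: float             # score from the decision model
    second_score: float            # score from the adversary detection model
    exploited_features: List[str]  # latent features flagged as exploited


def screen_transaction(
    txn: Transaction,
    decision_model: Callable[[Transaction], float],
    likelihood: Callable[[Transaction], float],
    adversary_model: Callable[[Transaction], float],
    latent_feature_scores: Callable[[Transaction], Dict[str, float]],
    score_threshold: float = 0.5,        # assumed value
    likelihood_threshold: float = 0.01,  # assumed cutoff for "low likelihood"
) -> Optional[Finding]:
    """Return a Finding if the transaction looks adversarial, else None."""
    first_score = decision_model(txn)

    # "First class": scored low by the decision model AND unlikely to occur.
    in_first_class = (
        first_score < score_threshold
        and likelihood(txn) < likelihood_threshold
    )
    if not in_first_class:
        return None

    # Second opinion from the adversary detection model.
    second_score = adversary_model(txn)
    if second_score <= score_threshold:
        return None

    # Assumption: each latent feature has its own score, and a feature counts
    # as exploited when that score exceeds the same threshold.
    per_feature = latent_feature_scores(txn)
    exploited = sorted(
        name for name, score in per_feature.items() if score > score_threshold
    )
    return Finding(first_score, second_score, exploited)
```

Each model is passed in as a plain callable, so the sketch can be exercised with stand-in functions before any trained model exists. A transaction that fails either the first-class test or the second-score test yields `None`.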