Security for generative models using attention analysis

    Publication (announcement) number: US12292915B1

    Publication (announcement) date: 2025-05-06

    Application number: US18527696

    Application date: 2023-12-04

    Abstract: Devices and techniques are generally described for mitigating security threats to generative machine learning models. In some examples, first prompt data may be determined that includes first data associated with a first natural language input and a first span. A large language model (LLM) may determine first plan data using the first prompt data, and the first plan data may include a call to a first API. A first classifier model may determine a first trust score for the first span. A first attention score may be determined for the first span and the first plan data. Second plan data may be generated based on at least one of the first trust score and the first attention score, or a second trust score and a second attention score.
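The flow in the abstract can be illustrated with a minimal sketch. The abstract does not say how the attention score is computed or how the trust and attention scores are combined, so everything here is an assumption for illustration: the names `SpanScore`, `span_attention`, and `should_replan`, the averaging rule, the two thresholds, and the "low trust and high attention" condition are all hypothetical. The trust values stand in for a classifier output, and the attention matrix stands in for weights read out of an LLM.

```python
from dataclasses import dataclass


@dataclass
class SpanScore:
    span_id: str
    trust: float      # assumed classifier output in [0, 1]; higher means more trusted
    attention: float  # assumed share of plan attention on the span, in [0, 1]


def span_attention(attn_rows, span_cols):
    """Average, over plan tokens (rows), of the attention mass on the span's token columns."""
    if not attn_rows or not span_cols:
        return 0.0
    per_row = [sum(row[c] for c in span_cols) for row in attn_rows]
    return sum(per_row) / len(per_row)


def should_replan(scores, trust_floor=0.5, attention_ceiling=0.3):
    """Return ids of spans that are both low-trust and heavily attended (hypothetical rule)."""
    return [s.span_id for s in scores
            if s.trust < trust_floor and s.attention > attention_ceiling]


# Toy data: two plan tokens (rows) attending over three prompt tokens (columns).
attn = [[0.1, 0.7, 0.2],
        [0.2, 0.6, 0.2]]
scores = [
    SpanScore("user_request", trust=0.9, attention=span_attention(attn, [0])),
    SpanScore("retrieved_doc", trust=0.2, attention=span_attention(attn, [1, 2])),
]
flagged = should_replan(scores)  # spans whose influence might justify generating second plan data
```

In this toy case only the low-trust span that dominates the plan's attention is flagged. A real system would presumably tune the thresholds and might combine the two scores differently.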
