Invention Grant
US08271483B2 Method and apparatus for detecting sensitive content in a document (status: in force)

Method and apparatus for detecting sensitive content in a document
Abstract:
One embodiment of the present invention provides a system that detects sensitive content in a document. The system receives a document, identifies a set of terms in the document that are candidate sensitive terms, and, based on the identified terms, generates a combination of terms that is associated with a semantic meaning. Next, the system searches a corpus based on the combination of terms and determines the hit counts returned for each individual term in the combination and for the combination as a whole. The system then determines whether the combination of terms is sensitive based on the hit count for the combination and the hit counts for the individual terms, and generates a result that indicates which portions of the document contain sensitive combinations.
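The flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the hit counts are compared, so everything specific here is an assumption made for illustration: the in-memory list `corpus` stands in for a searchable corpus, `hit_count` is a hypothetical helper that counts documents containing every queried term, and the scoring rule (combination hits divided by the hits of the rarest individual term, compared against a hypothetical `threshold`) is one plausible choice, not the patented method. The sketch returns the sensitive combinations themselves rather than marking portions of the document.

```python
from itertools import combinations


def hit_count(corpus, terms):
    """Count corpus documents that contain every term (whole-word, case-insensitive)."""
    wanted = {t.lower() for t in terms}
    return sum(1 for doc in corpus if wanted <= set(doc.lower().split()))


def sensitive_combinations(document, candidates, corpus, size=2, threshold=0.6):
    """Return (combination, combination hits, per-term hits, score) for flagged combinations.

    Assumed rule (not taken from the patent): a combination is flagged when its
    terms co-occur in the corpus nearly as often as the rarest term occurs alone.
    """
    words = set(document.lower().split())
    # Keep only the candidate sensitive terms that actually appear in the document.
    present = sorted({t.lower() for t in candidates} & words)
    flagged = []
    for combo in combinations(present, size):
        combo_hits = hit_count(corpus, combo)
        term_hits = [hit_count(corpus, [t]) for t in combo]
        rarest = min(term_hits)
        if rarest == 0:
            continue  # a term the corpus never mentions gives no evidence
        score = combo_hits / rarest
        if score >= threshold:
            flagged.append((combo, combo_hits, term_hits, score))
    return flagged


# Toy data, invented purely for this example.
corpus = [
    "acme merger announced today",
    "acme merger talks continue",
    "acme releases new product",
    "weather today is sunny",
    "merger of two banks",
]
document = "the acme merger closes today"
candidates = ["acme", "merger", "today"]

for combo, combo_hits, term_hits, score in sensitive_combinations(document, candidates, corpus):
    print(combo, combo_hits, term_hits, round(score, 2))
```

A real deployment would presumably replace `hit_count` with queries against a search index and tune both the combination size and the decision rule; the sketch only shows how per-term and per-combination hit counts can feed a single sensitivity decision.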