De-identification of electronic records
Abstract:
A system is provided for de-identifying electronic records. The system may be configured to tokenize an electronic record to produce a plurality of tokens including a first token. The system may determine whether the first token is part of one of a plurality of expressions known to include protected health information. In response to determining that the first token is not part of any one of the plurality of expressions, the system may determine, based on a blacklist of tokens known to include protected health information, whether the first token includes protected health information. In response to determining that the first token includes protected health information, the system may generate a de-identified electronic record by replacing the first token with a second token obfuscating the protected health information. Related methods and computer program products are also provided.
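The flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `PHI_EXPRESSIONS`, `BLACKLIST`, `PLACEHOLDER`, `tokenize`, and `deidentify` are hypothetical, the two regular expressions and the blacklist entries are made-up examples, and whitespace tokenization is assumed. The abstract does not state what happens to a token that is part of a known expression; the sketch assumes such tokens are also replaced.

```python
import re

# Hypothetical stand-ins for the "expressions known to include PHI".
PHI_EXPRESSIONS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # SSN-like number (assumed example)
    re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),  # calendar date (assumed example)
]

# Hypothetical blacklist of tokens known to include PHI (lower-cased).
BLACKLIST = {"smith", "boston"}

# The "second token" that obfuscates the PHI (assumed placeholder text).
PLACEHOLDER = "[REDACTED]"


def tokenize(record):
    """Split a record on whitespace, keeping each token's character span."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", record)]


def deidentify(record):
    """Return a copy of the record with PHI tokens replaced by PLACEHOLDER."""
    # Character spans of every match of a known PHI expression.
    expr_spans = [m.span() for p in PHI_EXPRESSIONS for m in p.finditer(record)]

    out = []
    for text, start, end in tokenize(record):
        # Step 1: is the token part of a known PHI expression?
        in_expression = any(s < end and start < e for s, e in expr_spans)
        if in_expression:
            # Assumption: tokens inside a known expression are obfuscated too.
            out.append(PLACEHOLDER)
        # Step 2: only if not, consult the blacklist (ignoring case and
        # surrounding punctuation).
        elif text.strip(".,;:").lower() in BLACKLIST:
            out.append(PLACEHOLDER)
        else:
            out.append(text)
    return " ".join(out)
```

For example, `deidentify("Patient Smith seen 01/02/2020 in Boston.")` replaces the name, the date, and the city with the placeholder while leaving the remaining tokens unchanged. Checking expressions before the blacklist mirrors the order given in the abstract.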