GENERATING AN IMPROVED NAMED ENTITY RECOGNITION MODEL USING NOISY DATA WITH A SELF-CLEANING DISCRIMINATOR MODEL

    Publication No.: US20250103813A1

    Publication Date: 2025-03-27

    Application No.: US18472746

    Filing Date: 2023-09-22

    Applicant: Adobe Inc.

    Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that train a named entity recognition (NER) model on noisy training data using a self-cleaning discriminator model. For example, the disclosed systems utilize a self-cleaning guided denoising framework to improve NER learning on noisy training data via a guidance training set. In one or more implementations, the disclosed systems utilize, within the denoising framework, an auxiliary discriminator model to correct noise in the noisy training data while training an NER model on that data. In particular, while the NER model is trained to predict labels from the noisy training data, the discriminator model detects noisy NER labels and reweights them before they are used to train the NER model.
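
    The reweighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give a weighting formula, so everything below is an assumption made for illustration: the function name `reweighted_ner_loss`, the use of a per-token discriminator score in [0, 1] as a loss weight, and the weighted cross-entropy form are all hypothetical and are not taken from the patent.

```python
import math


def reweighted_ner_loss(token_probs, noisy_labels, clean_scores, eps=1e-12):
    """Weighted token-level cross-entropy (illustrative assumption only).

    token_probs  -- per-token class probabilities predicted by the NER model
    noisy_labels -- per-token label indices from the noisy training data
    clean_scores -- per-token discriminator scores in [0, 1]; a higher score
                    means the discriminator trusts the noisy label more
    """
    if not (len(token_probs) == len(noisy_labels) == len(clean_scores)):
        raise ValueError("all inputs must have the same length")

    total_weight = sum(clean_scores)
    if total_weight <= 0:
        # Every label was judged noisy, so no token contributes to the loss.
        return 0.0

    weighted_sum = 0.0
    for probs, label, weight in zip(token_probs, noisy_labels, clean_scores):
        # Negative log-likelihood of the (possibly noisy) label,
        # scaled by how much the discriminator trusts that label.
        weighted_sum += weight * -math.log(max(probs[label], eps))

    return weighted_sum / total_weight


# Hypothetical three-token example with label set {0: O, 1: PER, 2: ORG}.
token_probs = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1], [0.7, 0.2, 0.1]]
noisy_labels = [0, 1, 2]          # the third label disagrees with the model
trusting = [1.0, 1.0, 1.0]        # no denoising: every label counts fully
denoised = [1.0, 1.0, 0.0]        # discriminator flags the third label

loss_plain = reweighted_ner_loss(token_probs, noisy_labels, trusting)
loss_clean = reweighted_ner_loss(token_probs, noisy_labels, denoised)
```

    In this sketch, down-weighting the flagged label removes its contribution to the loss, so the NER model is not pushed toward a label the discriminator considers noisy. How the disclosed systems actually compute and apply the weights is not stated in the abstract.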
