Invention Grant
- Patent Title: Keyword data augmentation tool for natural language processing
- Application No.: US17452742
- Application Date: 2021-10-28
- Publication No.: US12153881B2
- Publication Date: 2024-11-26
- Inventor: Elias Luqman Jalaluddin, Vishal Vishnoi, Thanh Long Duong, Mark Edward Johnson, Poorya Zaremoodi, Gautam Singaraju, Ying Xu, Vladislav Blinov
- Applicant: Oracle International Corporation
- Applicant Address: US CA Redwood Shores
- Assignee: Oracle International Corporation
- Current Assignee: Oracle International Corporation
- Current Assignee Address: US CA Redwood Shores
- Agency: Kilpatrick Townsend & Stockton LLP
- Main IPC: G06F40/279
- IPC: G06F40/279; G06F40/35; G06N20/00; H04L51/02; G06F40/205; G06F40/284; G06F40/289

Abstract:
Techniques for keyword data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, and augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: identifying keywords within utterances of the training set of utterances, generating a set of OOD examples with the identified keywords, filtering out OOD examples from the set of OOD examples that have a context substantially similar to the context of the utterances of the training set of utterances, and incorporating the set of OOD examples without the filtered OOD examples into the training set of utterances to generate an augmented training set of utterances. Thereafter, the machine-learning model is trained using the augmented training set of utterances.
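
The augmentation steps named in the abstract (identify keywords, generate OOD examples containing them, filter out examples whose context is too close to the training utterances, merge the rest) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: keywords are picked by document frequency, "generation" is simulated by selecting sentences from a hypothetical candidate pool, context similarity is bag-of-words cosine over the non-keyword tokens, and the `STOPWORDS` list, the `threshold` value, and the `"OOD"` label are placeholders.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the patent).
STOPWORDS = {"a", "an", "the", "i", "to", "my", "with", "want", "of", "in", "is"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def identify_keywords(utterances, top_k=2):
    """Return the top_k non-stopword tokens, ranked by how many utterances contain them."""
    counts = Counter()
    for utterance in utterances:
        # dict.fromkeys de-duplicates tokens while keeping a deterministic order.
        for token in dict.fromkeys(tokenize(utterance)):
            if token not in STOPWORDS:
                counts[token] += 1
    return [token for token, _ in counts.most_common(top_k)]


def context_vector(text, keywords):
    """Bag of words for everything in the text except stopwords and the keywords."""
    return Counter(
        t for t in tokenize(text) if t not in STOPWORDS and t not in keywords
    )


def cosine(a, b):
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def augment(training_set, candidate_pool, top_k=2, threshold=0.3, ood_label="OOD"):
    """Return training_set plus keyword-bearing OOD examples with dissimilar context."""
    utterances = [text for text, _ in training_set]
    keywords = set(identify_keywords(utterances, top_k))
    in_domain_contexts = [context_vector(u, keywords) for u in utterances]

    # Stand-in for OOD generation: keep pool sentences that contain a keyword.
    ood_examples = [s for s in candidate_pool if keywords & set(tokenize(s))]

    kept = []
    for sentence in ood_examples:
        ctx = context_vector(sentence, keywords)
        similarity = max((cosine(ctx, c) for c in in_domain_contexts), default=0.0)
        # Drop examples whose surrounding context resembles an in-domain utterance.
        if similarity < threshold:
            kept.append(sentence)
    return training_set + [(s, ood_label) for s in kept]


# Hypothetical toy data for a pizza-ordering bot.
training = [
    ("order a pizza with extra cheese", "order_pizza"),
    ("i want to order a large pizza", "order_pizza"),
    ("cancel my pizza order", "cancel_order"),
]
pool = [
    "the history of pizza began in naples",
    "the court issued an order to close the road",
    "order a large pizza with cheese",
    "the weather is sunny today",
]
augmented = augment(training, pool)
for text, label in augmented:
    print(f"{label}: {text}")
```

In this toy run the pool sentence that merely rephrases an in-domain request is filtered out, while sentences that reuse a keyword in an unrelated context are kept as OOD examples. A production system would likely replace the bag-of-words similarity with a stronger context representation; that choice is outside what the abstract specifies.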
Public/Granted literature
- US20220171930A1 KEYWORD DATA AUGMENTATION TOOL FOR NATURAL LANGUAGE PROCESSING, Public/Granted day: 2022-06-02