Computer-implemented method of preparing a training dataset for a natural language processing or natural language understanding machine learning algorithm
Abstract:
Described and claimed is a computer-implemented method of preparing a training dataset for a natural language processing (NLP) or natural language understanding (NLU) machine learning algorithm from an original text dataset, the method comprising the steps of: selecting one or more sentences from the original text dataset as selected sentences; determining, for each selected sentence, one or more grammatical elements of the selected sentence that can be negated as negatable elements; determining, for one or more negatable words in each negatable element, one or more antonyms; creating, based on each determined antonym, a negated sentence by replacing the respective negatable element in the selected sentence for which the negatable element was determined with the determined antonym; and adding the negated sentences to the training dataset. Further, a computer-implemented method of training a word embedding or an NLP or NLU machine learning algorithm, a system, and a computer program product are described and claimed.
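The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the antonym lexicon `ANTONYMS`, the function names, and the `select` predicate are all hypothetical, and a "negatable element" is simplified here to any single word that has an entry in the lexicon. The sketch also assumes that the training dataset starts as a copy of the original sentences, which the abstract does not state.

```python
import re

# Hypothetical toy antonym lexicon (an assumption for illustration only);
# a real system would more likely draw on a lexical resource.
ANTONYMS = {
    "good": ["bad"],
    "open": ["closed", "shut"],
    "increase": ["decrease"],
}


def negatable_elements(sentence):
    """Return (start, end, word) spans for words that have a known antonym."""
    return [
        (m.start(), m.end(), m.group(0))
        for m in re.finditer(r"[A-Za-z]+", sentence)
        if m.group(0).lower() in ANTONYMS
    ]


def negate(sentence):
    """Create one negated sentence per (negatable word, antonym) pair."""
    negated = []
    for start, end, word in negatable_elements(sentence):
        for antonym in ANTONYMS[word.lower()]:
            # Preserve the capitalization of a sentence-initial word.
            if word[0].isupper():
                antonym = antonym.capitalize()
            negated.append(sentence[:start] + antonym + sentence[end:])
    return negated


def prepare_training_dataset(original, select=lambda s: True):
    """Select sentences, negate them, and add the results to the dataset.

    Assumption: the dataset begins as a copy of the original sentences.
    """
    dataset = list(original)
    for sentence in original:
        if select(sentence):
            dataset.extend(negate(sentence))
    return dataset


if __name__ == "__main__":
    for line in prepare_training_dataset(["The door is open.", "Good food."]):
        print(line)
```

Each selected sentence yields one negated sentence per antonym found, so a sentence with several negatable words or several antonyms per word contributes several new training examples.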