Intelligent text cleaning method and apparatus, and computer-readable storage medium
Abstract:
An intelligent text cleaning method includes: acquiring a text set and preprocessing it to obtain a word vector text set; performing full-text matrix numeralization on the word vector text set to generate a principal word vector matrix and a text word vector matrix; inputting the principal word vector matrix into a BiLSTM model to generate an intermediate text vector; inputting the text word vector matrix into a convolutional neural network model to generate a target text vector; concatenating the intermediate text vector and the target text vector to obtain combined text vectors; inputting the combined text vectors into a pre-constructed semantic recognition classifier model, which outputs an aggregated text vector; and performing reverse recovery on the aggregated text vector using a word2vec reverse algorithm to output a standard text. The present application thereby achieves accurate text cleaning.
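The two-branch encoding and concatenation step described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the weights are random and untrained, biases are omitted, the two input matrices are random placeholders (the abstract does not say how the principal and text word vector matrices differ), and all names and dimensions (LSTMCell, bilstm_encode, cnn_encode, embed_dim and so on) are hypothetical. The semantic recognition classifier and the word2vec reverse recovery are left out.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """One LSTM cell with random, untrained weights and no biases (assumption)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # each applied to the concatenation [x; h_prev].
        self.W = {g: rand_matrix(hidden_dim, input_dim + hidden_dim, rng)
                  for g in "ifoc"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        i = [sigmoid(a) for a in matvec(self.W["i"], z)]
        f = [sigmoid(a) for a in matvec(self.W["f"], z)]
        o = [sigmoid(a) for a in matvec(self.W["o"], z)]
        g = [math.tanh(a) for a in matvec(self.W["c"], z)]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

    def run(self, seq):
        """Return the final hidden state after reading the whole sequence."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in seq:
            h, c = self.step(x, h, c)
        return h


def bilstm_encode(seq, fwd, bwd):
    """Concatenate the final states of a forward and a backward pass."""
    return fwd.run(seq) + bwd.run(list(reversed(seq)))


def cnn_encode(seq, filters, width):
    """1-D convolution over word positions, ReLU, then max-over-time pooling."""
    pooled = []
    for filt in filters:
        acts = []
        for start in range(len(seq) - width + 1):
            window = [x for vec in seq[start:start + width] for x in vec]
            acts.append(max(0.0, sum(w * x for w, x in zip(filt, window))))
        pooled.append(max(acts))
    return pooled


rng = random.Random(0)
embed_dim, hidden_dim, n_filters, width = 4, 3, 5, 2  # illustrative sizes

# Placeholder inputs: 6 words, each a 4-dimensional word vector.
principal_matrix = rand_matrix(6, embed_dim, rng)
text_matrix = rand_matrix(6, embed_dim, rng)

fwd = LSTMCell(embed_dim, hidden_dim, rng)
bwd = LSTMCell(embed_dim, hidden_dim, rng)
filters = rand_matrix(n_filters, width * embed_dim, rng)

intermediate = bilstm_encode(principal_matrix, fwd, bwd)  # BiLSTM branch
target = cnn_encode(text_matrix, filters, width)          # CNN branch
combined = intermediate + target                          # concatenation
print(len(intermediate), len(target), len(combined))
```

In this sketch the combined vector has length 2 * hidden_dim + n_filters; in the described method it would then be passed to the classifier model.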