Invention Grant
- Patent Title: Intelligent text cleaning method and apparatus, and computer-readable storage medium
- Application No.: US17613942
- Application Date: 2019-08-23
- Publication No.: US11599727B2
- Publication Date: 2023-03-07
- Inventor: Ziou Zheng, Wei Wang
- Applicant: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.
- Applicant Address: CN Guangdong
- Assignee: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.
- Current Assignee: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.
- Current Assignee Address: CN Guangdong
- Priority: CN201910601253.7 20190703
- International Application: PCT/CN2019/102204 WO 20190823
- International Announcement: WO2021/000391 WO 20210107
- Main IPC: G06F17/00
- IPC: G06F17/00; G06F40/30; G06F16/35; G06F40/279; G06F40/166

Abstract:
An intelligent text cleaning method includes: acquiring a text set and preprocessing it to obtain a word vector text set; subjecting the word vector text set to full-text matrix numeralization to generate a principal word vector matrix and a text word vector matrix; inputting the principal word vector matrix to a BiLSTM model to generate an intermediate text vector; inputting the text word vector matrix to a convolutional neural network model to generate a target text vector; concatenating the intermediate text vector and the target text vector to obtain combined text vectors; inputting the combined text vectors to a pre-constructed semantic recognition classifier model, which outputs an aggregated text vector; and subjecting the aggregated text vector to reverse recovery using a word2vec reverse algorithm to output a standard text. The present application realizes accurate text cleaning.
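The last two steps of the abstract (concatenating the two encoder outputs, then mapping a vector back to text) can be illustrated with a minimal Python sketch. This is not the patented implementation. The BiLSTM, the convolutional network, and the classifier are omitted, the embedding table `EMBEDDINGS` holds made-up values, and `nearest_word` is only one plausible reading of "reverse recovery" (a cosine nearest-neighbour lookup in embedding space); the abstract does not specify the actual word2vec reverse algorithm. All names below are hypothetical.

```python
import math

# Hypothetical toy embedding table; a real system would use trained word2vec vectors.
EMBEDDINGS = {
    "clean": [1.0, 0.0, 0.0],
    "text": [0.0, 1.0, 0.0],
    "noise": [0.0, 0.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def concatenate(intermediate_vec, target_vec):
    """Join the two encoder outputs end to end into one combined vector."""
    return list(intermediate_vec) + list(target_vec)


def nearest_word(vector, embeddings=EMBEDDINGS):
    """Assumed stand-in for reverse recovery: return the word whose embedding is most similar."""
    return max(embeddings, key=lambda word: cosine(vector, embeddings[word]))


# Stand-in vectors in place of real BiLSTM and CNN outputs.
combined = concatenate([0.2, 0.4], [0.6])

# Recover words from a short sequence of (assumed) cleaned word vectors.
recovered = [nearest_word(v) for v in ([0.9, 0.1, 0.0], [0.1, 0.8, 0.1])]
```

Here `combined` has length 3 (2 + 1), and `recovered` lists the closest vocabulary word for each input vector. In the method described above, the vectors passed to the recovery step would come from the semantic recognition classifier rather than being written by hand.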
Public/Granted literature
- US20220318515A1 INTELLIGENT TEXT CLEANING METHOD AND APPARATUS, AND COMPUTER-READABLE STORAGE MEDIUM Public/Granted day: 2022-10-06