Method and system for classifying word as obscene word

Invention Grant

US12026465B2 Method and system for classifying word as obscene word (In force)

Patent Title: Method and system for classifying word as obscene word
Application No.: US17553800

Application Date: 2021-12-17
Publication No.: US12026465B2

Publication Date: 2024-07-02
Inventor: Mikhail Borisovich Libman
Applicant: YANDEX EUROPE AG
Applicant Address: CH Lucerne
Assignee: Direct Cursus Technology L.L.C
Current Assignee: Direct Cursus Technology L.L.C
Current Assignee Address: AE Dubai
Agency: BCF LLP
Priority: RU 2020142418 2020.12.22
Main IPC: G06F40/232
IPC: G06F40/232 ; G06F40/279 ; G06F40/40 ; G06V30/19

Abstract:

There is disclosed a method and system for classifying a word as an obscene word, the method comprising, at a training phase: acquiring a first word, the first word corresponding to a given obscene word; generating a first set of misspelled words, the first set of misspelled words comprising a plurality of misspelled variations of the first word; generating training pairs, the training pairs comprising a set of positive training pairs comprising the first word paired with each misspelled variation of the first word; training a machine learning algorithm, the training comprising: determining, for each training pair, a set of features representative of a property of the training pair; generating an inferred function based on the set of features, the inferred function being configured to assign, in use, an indecency score, the indecency score being indicative of a likelihood of the word being obscene.
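
The training-phase steps named in the abstract (misspelled variations, positive training pairs, per-pair features) can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration and is not taken from the patent: the one-edit misspelling generator, the three features, the function names, and the placeholder word "darn". The `indecency_score` function is a plain string-similarity stand-in for the trained inferred function, not a machine learning model.

```python
from difflib import SequenceMatcher

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def misspellings(word):
    """Return every string exactly one edit away from `word`.

    Assumed generator: deletions, adjacent transpositions,
    substitutions and insertions over a lowercase Latin alphabet.
    """
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    substitutes = {a + c + b[1:] for a, b in splits if b for c in ALPHABET}
    inserts = {a + c + b for a, b in splits for c in ALPHABET}
    return (deletes | transposes | substitutes | inserts) - {word}


def positive_pairs(word):
    """Pair the first word with each of its misspelled variations."""
    return [(word, variant) for variant in sorted(misspellings(word))]


def pair_features(first, second):
    """Hypothetical feature set describing one training pair."""
    return {
        "similarity": SequenceMatcher(None, first, second).ratio(),
        "length_diff": abs(len(first) - len(second)),
        "same_first_char": first[:1] == second[:1],
    }


def indecency_score(candidate, known_words):
    """Stand-in for the inferred function: highest string similarity
    between `candidate` and any known word (no model is trained here)."""
    return max(SequenceMatcher(None, w, candidate).ratio() for w in known_words)


if __name__ == "__main__":
    pairs = positive_pairs("darn")  # "darn" is a harmless placeholder word
    print(len(pairs), "positive pairs, e.g.", pairs[:3])
    print(pair_features("darn", "dran"))
    print(indecency_score("d4rn", ["darn"]), indecency_score("table", ["darn"]))
```

In a fuller pipeline the feature dictionaries would be fed to an actual learner rather than the similarity heuristic used here.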

Public/Granted literature

US20220198143A1 METHOD AND SYSTEM FOR CLASSIFYING WORD AS OBSCENE WORD Public/Granted day:2022-06-23

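IPC classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: see G06N)
G06F40/00	Handling natural language data (speech analysis or synthesis, speech recognition: see G10L)
G06F40/20	.Natural language analysis (semantic analysis of natural language: see G06F40/30)
G06F40/232	..Orthographic correction, e.g. spell checking or vowelisation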