SYSTEM AND METHOD FOR CLASSIFYING DATA SAMPLES

    Publication number: US20240211799A1

    Publication date: 2024-06-27

    Application number: US18087958

    Application date: 2022-12-23

    Inventor: IGAL MAZOR

    CPC classification number: G06N20/00 G06N5/04

    Abstract: A method and a system for improving classification of data samples, which may be considered class outliers, are claimed. The method includes inferring a pretrained classifying ML-based model on an incoming data sample, to assign thereto a particular class of a plurality of classes; calculating a similarity metric value representing a degree of similarity between the incoming data sample and one or more previously classified data samples of the particular class; and validating the assignment of the particular class to the incoming data sample, based on the calculated similarity metric value.
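
The three steps named in this abstract (classify, measure similarity to earlier members of the assigned class, validate) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the cosine metric, the 0.9 threshold, the use of the best-matching peer, and the names `classify_and_validate`, `toy_predict`, and `history`.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def classify_and_validate(sample, predict, history, threshold=0.9):
    """Assign a class, then check it against previously classified samples.

    predict   -- stand-in for the pretrained classifier (hypothetical)
    history   -- dict mapping class label to earlier samples of that class
    threshold -- assumed cut-off; the abstract does not specify one
    Returns (label, similarity, is_validated).
    """
    label = predict(sample)
    peers = history.get(label, [])
    if not peers:
        return label, 0.0, False  # nothing to compare against
    # Assumption: use the best match among the class's earlier samples.
    similarity = max(cosine_similarity(sample, peer) for peer in peers)
    return label, similarity, similarity >= threshold


def toy_predict(sample):
    """Hypothetical classifier: picks the class of the larger coordinate."""
    return "A" if sample[0] >= sample[1] else "B"


history = {"A": [[1.0, 0.0], [0.9, 0.1]], "B": [[0.0, 1.0]]}
typical = classify_and_validate([1.0, 0.05], toy_predict, history)
outlier = classify_and_validate([0.5, 0.5], toy_predict, history)
```

In this toy setup both samples receive class "A", but only the first is validated; the second sits too far from every earlier "A" sample and would be flagged as a possible class outlier.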

SYSTEM AND METHOD OF TRAINING MACHINE-LEARNING-BASED MODEL

    Publication number: US20240119370A1

    Publication date: 2024-04-11

    Application number: US17962729

    Application date: 2022-10-10

    Inventor: IGAL MAZOR

    CPC classification number: G06N20/20 G06K9/6257

    Abstract: A system and method of training a machine-learning (ML) based model by at least one processor may include receiving an initial dataset, including a plurality of annotated data samples; based on the initial dataset, training at least one ML-based first-level model to perform a first-level task; based on training the at least one ML-based first-level model, calculating at least one characteristic representing, for each data sample, a value of its contribution to the training of an ML-based second-level model to perform a second-level task; omitting a subset of data samples from the initial dataset based on the at least one characteristic, to obtain a target dataset; and training the ML-based second-level model, to perform the second-level task, based on the target dataset.
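
The pipeline in this abstract (train a first-level model, score each sample, omit a subset, train the second-level model on what remains) can be sketched as follows. This is only an illustration under loud assumptions: a per-class centroid model stands in for both levels, the characteristic is the distance of a sample to its own class centroid, and the samples farthest from their centroid are assumed to contribute least. None of these choices, nor the names `train_centroids`, `contribution_scores`, and `prune`, come from the patent.

```python
import math


def train_centroids(dataset):
    """Toy model: the mean feature vector of each class."""
    sums, counts = {}, {}
    for features, label in dataset:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def contribution_scores(dataset, centroids):
    """Assumed characteristic: distance to the sample's own class centroid."""
    return [math.dist(features, centroids[label]) for features, label in dataset]


def prune(dataset, scores, keep_fraction):
    """Keep the samples with the smallest scores, preserving original order."""
    keep = max(1, round(len(dataset) * keep_fraction))
    ranked = sorted(range(len(dataset)), key=lambda i: scores[i])
    return [dataset[i] for i in sorted(ranked[:keep])]


initial_dataset = [
    ([0.0, 0.0], "a"),
    ([0.2, 0.0], "a"),
    ([5.0, 5.0], "a"),  # far from the other "a" samples, possibly mislabeled
    ([10.0, 10.0], "b"),
    ([10.2, 10.0], "b"),
]

first_level = train_centroids(initial_dataset)            # first-level model
scores = contribution_scores(initial_dataset, first_level)
target_dataset = prune(initial_dataset, scores, keep_fraction=0.8)
second_level = train_centroids(target_dataset)            # second-level model
```

With these toy values the suspicious third sample is omitted, so the second-level centroid for class "a" is computed from the two tightly grouped samples only.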

SYSTEM AND METHOD FOR CLASSIFYING TEXTUAL DATA BLOCKS

    Publication number: US20240211503A1

    Publication date: 2024-06-27

    Application number: US18087069

    Application date: 2022-12-22

    Inventor: JOY CHEN, IGAL MAZOR

    CPC classification number: G06F16/353 G06N3/08

    Abstract: A method and a system of classifying textual data blocks are claimed. The method includes receiving at least one textual data block in an original version, including a plurality of textual data elements; performing a preprocessing procedure on the at least one textual data block in the original version, wherein the preprocessing procedure includes replacing the textual data elements characterized by pertinence to at least one specific part-of-speech (POS) category with a respective POS token, thereby obtaining the at least one textual data block in a preprocessed version; and inferring a pretrained ML-based model on the at least one textual data block in the preprocessed version, to classify the at least one textual data block by pertinence to at least one class.
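
The preprocessing step described in this abstract, replacing words of selected POS categories with a POS token before classification, can be sketched as below. A real system would use a POS tagger; here a tiny hand-written lexicon stands in for one. The lexicon, the `<TAG>` token format, the tokenizer regex, and the name `replace_pos` are all assumptions made for illustration.

```python
import re

# Hypothetical stand-in for a POS tagger: lowercase word -> POS category.
TOY_POS_LEXICON = {
    "alice": "PROPN",
    "paris": "PROPN",
    "visited": "VERB",
    "in": "ADP",
}


def replace_pos(text, lexicon, categories):
    """Replace every token whose POS category is selected with a POS token."""
    # Split into runs of word characters and single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    output = []
    for token in tokens:
        pos = lexicon.get(token.lower())  # None for unknown tokens
        output.append(f"<{pos}>" if pos in categories else token)
    return " ".join(output)


original = "Alice visited Paris in 2021."
preprocessed = replace_pos(original, TOY_POS_LEXICON, {"PROPN"})
# preprocessed == "<PROPN> visited <PROPN> in 2021 ."
```

The preprocessed string, rather than the original, would then be passed to the pretrained classifier, so that the model sees the grammatical role of the masked words instead of their specific identity.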
