Invention Grant
- Patent Title: Clustering-based data selection for optimization of risk predictive machine learning models
- Application No.: US17334743
- Application Date: 2021-05-30
- Publication No.: US12141806B2
- Publication Date: 2024-11-12
- Inventor: Danny Butvinik, Maria Zatsepin, Yoav Avneon
- Applicant: Actimize LTD.
- Applicant Address: IL Ra'anana
- Assignee: Actimize LTD.
- Current Assignee: Actimize LTD.
- Current Assignee Address: IL Ra'anana
- Agency: SOROKER AGMON NORDMAN RIBA
- Main IPC: G06N20/00
- IPC: G06N20/00 ; G06N5/04 ; G06Q20/40 ; G06F18/214 ; G06F18/23 ; G06F18/24

Abstract:
A risk-prediction-preparation module for generating a risk-prediction-model is provided herein. The module accesses a data-storage of transactions and applies a group-by operation to the transactions related to data-points, grouping them into entities according to a logical-entity. It then clusters the entities of a clean-financial dataset into clusters and selects data-points of (a) entities from the clusters for a first dataset and (b) a preconfigured number of randomly chosen entities for a second dataset. All entities that have at least one related data-point labeled ‘fraudulent’ are added to both the first dataset and the second dataset. Using vectorized and scaled extracted features, a first machine-learning-model of fraud detection is trained on the first dataset and a second machine-learning-model of fraud detection is trained on the second dataset, and the results are collected. The results are then used to combine the first machine-learning-model and the second machine-learning-model into an ensemble machine-learning-model for risk-prediction.
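The data-selection steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The field names (`account_id`, `amount`, `label`), the single mean-amount feature, the 1-D k-means stand-in, and all parameter values are assumptions made for illustration, since the abstract names no clustering algorithm or feature set. The sketch also assumes that "clean" means entities with no fraudulent data-point and that the random selection draws from those clean entities. Model training and ensembling are omitted.

```python
import random
from collections import defaultdict


def group_by_entity(transactions, key="account_id"):
    """Group transaction data-points into entities (hypothetical key: account id)."""
    entities = defaultdict(list)
    for tx in transactions:
        entities[tx[key]].append(tx)
    return dict(entities)


def entity_feature(txs):
    """Toy one-dimensional feature (assumption): mean transaction amount."""
    return sum(tx["amount"] for tx in txs) / len(txs)


def kmeans_1d(values, k, iters=20):
    """Minimal 1-D k-means, used only as a stand-in clustering method."""
    if not values:
        return []
    ordered = sorted(values)
    # Spread the initial centroids across the sorted values.
    centroids = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest centroid, then recompute centroids.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels


def select_datasets(transactions, k=2, per_cluster=1, random_count=2, seed=0):
    """Return (first, second): entity ids chosen per cluster and at random."""
    rng = random.Random(seed)
    entities = group_by_entity(transactions)
    # Entities with at least one data-point labeled 'fraudulent'.
    fraud_ids = {
        e for e, txs in entities.items()
        if any(tx["label"] == "fraudulent" for tx in txs)
    }
    clean_ids = sorted(e for e in entities if e not in fraud_ids)
    labels = kmeans_1d([entity_feature(entities[e]) for e in clean_ids], k)
    clusters = defaultdict(list)
    for e, lab in zip(clean_ids, labels):
        clusters[lab].append(e)
    # (a) first dataset: a few entities drawn from every cluster.
    first = set()
    for members in clusters.values():
        first.update(rng.sample(members, min(per_cluster, len(members))))
    # (b) second dataset: a preconfigured number of entities drawn at random.
    second = set(rng.sample(clean_ids, min(random_count, len(clean_ids))))
    # Every fraudulent entity is added to both datasets.
    return first | fraud_ids, second | fraud_ids


# Made-up sample data: A and B are low-value, C and D high-value, E has fraud.
transactions = [
    {"account_id": "A", "amount": 10, "label": "legit"},
    {"account_id": "B", "amount": 12, "label": "legit"},
    {"account_id": "C", "amount": 1000, "label": "legit"},
    {"account_id": "D", "amount": 1100, "label": "legit"},
    {"account_id": "E", "amount": 500, "label": "fraudulent"},
    {"account_id": "E", "amount": 20, "label": "legit"},
]
first, second = select_datasets(transactions)
```

With this sample, the first dataset holds one entity from each of the two clusters plus the fraudulent entity, while the second holds two randomly chosen clean entities plus the same fraudulent entity. Each dataset would then feed its own fraud-detection model before the two models are combined.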
Public/Granted literature
- US20220383322A1 CLUSTERING-BASED DATA SELECTION FOR OPTIMIZATION OF RISK PREDICTIVE MACHINE LEARNING MODELS Public/Granted day: 2022-12-01