-
Publication number: US20240143641A1
Publication date: 2024-05-02
Application number: US18049958
Application date: 2022-10-26
Applicant: SAP SE
Inventor: Lev Sigal, Anna Fishbein, Anton Ioffe, Iryna Butselan
Abstract: Some embodiments provide a non-transitory machine-readable medium that stores a program. The program may receive a plurality of string data. The program may determine an embedding for each string data in the plurality of string data. The program may cluster the embeddings into groups of embeddings. The program may determine a plurality of labels for the plurality of string data based on the groups of embeddings. The program may use the plurality of labels and the plurality of string data to train a classifier model. The program may provide a particular string data as an input to the trained classifier model, wherein the classifier model is configured to determine, based on the particular string data, a classification for the particular string data.
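The pipeline in this abstract (embed each string, cluster the embeddings, derive labels from the clusters, train a classifier, classify a new string) can be sketched roughly as follows. This is only an illustrative sketch and not the method claimed in the application: the character-bigram embedding, the small k-means routine, the nearest-centroid classifier, the sample strings, and all function names are assumptions chosen to keep the example self-contained.

```python
from collections import Counter

def bigrams(text):
    # Character bigrams of a string, e.g. "abc" -> ["ab", "bc"].
    return [text[i:i + 2] for i in range(len(text) - 1)]

def build_vocab(strings):
    # Assumed embedding space: one dimension per bigram seen in the input.
    return sorted({g for s in strings for g in bigrams(s)})

def embed(text, vocab):
    # Bigram-count vector; bigrams outside the vocabulary are ignored.
    counts = Counter(bigrams(text))
    return [float(counts[g]) for g in vocab]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def nearest(vec, centroids):
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))

def kmeans(vectors, k, iterations=20):
    # Deterministic farthest-point initialisation, then standard k-means.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        new_labels = [nearest(v, centroids) for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean(members)
    return labels

def train_classifier(vectors, labels):
    # Nearest-centroid classifier (a stand-in): one mean vector per label.
    return {
        c: mean([v for v, lab in zip(vectors, labels) if lab == c])
        for c in sorted(set(labels))
    }

def classify(text, vocab, model):
    vec = embed(text, vocab)
    return min(model, key=lambda c: sq_dist(vec, model[c]))

# Hypothetical unlabeled input strings.
strings = ["invoice 001", "invoice 002", "invoice 003",
           "receipt 17", "receipt 18", "receipt 19"]
vocab = build_vocab(strings)
vectors = [embed(s, vocab) for s in strings]
labels = kmeans(vectors, k=2)               # cluster ids serve as labels
model = train_classifier(vectors, labels)   # train on (string, label) pairs
prediction = classify("invoice 999", vocab, model)
```

Here the cluster index doubles as the training label, so no manual annotation is needed; a production system would presumably use a learned embedding and a stronger classifier.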
-
Publication number: US12293601B2
Publication date: 2025-05-06
Application number: US17897022
Application date: 2022-08-26
Applicant: SAP SE
Inventor: Lev Sigal, Anna Fishbein, Anton Ioffe, Iryna Butselan
IPC: G06V30/418, G06V30/19, G06V30/413
Abstract: Some embodiments provide a non-transitory machine-readable medium that stores a program. The program receives an image of a document, the document comprising a set of text. The program further provides the set of text to a machine learning model configured to determine, based on the set of text, a plurality of probabilities for a plurality of defined types of documents. Based on the plurality of probabilities for the plurality of defined types of documents, the program also determines a type of the document from the plurality of defined types of documents.
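The final two steps of this abstract (a model producing one probability per defined document type, then selecting a type from those probabilities) can be illustrated with a small sketch. It assumes the text has already been extracted from the document image, and it replaces the machine learning model with a hypothetical keyword-count scorer followed by a softmax; the document types, cue words, and function names are all invented for illustration and do not come from the patent.

```python
import math

# Hypothetical document types with assumed cue words for each.
DOCUMENT_TYPES = {
    "invoice": {"invoice", "amount", "due", "total"},
    "receipt": {"receipt", "paid", "cash", "change"},
    "contract": {"agreement", "party", "term", "hereby"},
}

def type_probabilities(text):
    # Stand-in for the model: count cue words per type, then apply a
    # softmax so the scores form a probability distribution.
    tokens = text.lower().split()
    scores = {
        doc_type: sum(tok in cues for tok in tokens)
        for doc_type, cues in DOCUMENT_TYPES.items()
    }
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - peak) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

def document_type(text):
    # Choose the defined type with the highest probability.
    probs = type_probabilities(text)
    return max(probs, key=probs.get)

# Text assumed to have been extracted from a document image beforehand.
ocr_text = "Invoice 4471 total amount due 120.00"
probs = type_probabilities(ocr_text)
doc_type = document_type(ocr_text)
```

Taking the highest-probability type is the simplest selection rule; the abstract only says the type is determined "based on" the probabilities, so thresholds or fallbacks would also fit that wording.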
-
Publication number: US20240071121A1
Publication date: 2024-02-29
Application number: US17897022
Application date: 2022-08-26
Applicant: SAP SE
Inventor: Lev Sigal, Anna Fishbein, Anton Ioffe, Iryna Butselan
IPC: G06V30/418, G06V30/19, G06V30/413
CPC classification number: G06V30/418, G06V30/19147, G06V30/413
Abstract: Some embodiments provide a non-transitory machine-readable medium that stores a program. The program receives an image of a document, the document comprising a set of text. The program further provides the set of text to a machine learning model configured to determine, based on the set of text, a plurality of probabilities for a plurality of defined types of documents. Based on the plurality of probabilities for the plurality of defined types of documents, the program also determines a type of the document from the plurality of defined types of documents.
-