-
Publication No.: US20250086392A1
Publication Date: 2025-03-13
Application No.: US18463509
Application Date: 2023-09-08
Applicant: SAP SE
Inventor: Jianglei Han, Qisheng Hu, My Hoa Ha, Yue Yang
IPC: G06F40/284, G06F16/34, G06F40/169, G06F40/40, G06N3/0455, G06N3/048, G06N3/08
Abstract: Methods, systems, and computer-readable storage media for receiving a document provided as a computer-readable file, receiving a set of questions, for each question in the set of questions, generating an inference input including a question, at least a portion of text of the document, and multiple tokens, processing, by a PLM, the inference input to generate a set of text embeddings, processing, by a neural network, the set of text embeddings to provide sets of tokens, each set of tokens being specific to a segment of the document and including a start token and an end token respectively identifying a start position and an end position of the segment, determining, from the sets of tokens, a segment for display, and displaying at least a portion of the document in a UI and an annotation indicating the segment within the at least a portion of the document.
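
The start-token/end-token idea in this abstract can be illustrated with a minimal sketch. It assumes the neural network has already produced one start score and one end score per text token; the function names `best_span` and `annotate`, the `max_len` limit, the `[[ ]]` markers, and the example scores are all hypothetical and are not taken from the patent.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 30) -> Tuple[int, int, float]:
    """Pick the (start, end) pair with the highest combined score.

    Only spans with start <= end and at most max_len tokens are considered.
    """
    if not start_scores or len(start_scores) != len(end_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    best = (0, 0, float("-inf"))
    for s, s_score in enumerate(start_scores):
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best[2]:
                best = (s, e, score)
    return best


def annotate(tokens: List[str], start: int, end: int) -> str:
    """Mark the chosen segment inline, standing in for a UI annotation."""
    marked = list(tokens)
    marked[start] = "[[" + marked[start]
    marked[end] = marked[end] + "]]"
    return " ".join(marked)


# Hypothetical scores for the question "How long is the term?"
tokens = "The term of this agreement is five years from signing".split()
start_scores = [0.1, 0.2, 0.0, 0.0, 0.1, 0.3, 2.5, 0.4, 0.0, 0.1]
end_scores = [0.0, 0.1, 0.0, 0.0, 0.2, 0.1, 0.3, 2.2, 0.5, 0.6]

start, end, _ = best_span(start_scores, end_scores)
print(annotate(tokens, start, end))
```

The exhaustive search over (start, end) pairs is chosen here for clarity; the abstract does not say how the segment for display is determined from the sets of tokens.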
-
Publication No.: US11461552B2
Publication Date: 2022-10-04
Application No.: US16920916
Application Date: 2020-07-06
Applicant: SAP SE
Inventor: Jianglei Han, Traci Zheng Wen Lim, Juanlei Rocco Hu, Lijie Quan, Wei Jin, Lingxiao Liang
IPC: G06F40/289, G06N20/00, G06F40/284, G06Q50/18, G06V30/148
Abstract: Methods, systems, and computer-readable storage media for receiving, by an automated review system, a legal document as a computer-readable file, and determining, by the automated review system, that the legal document is of a first type, and in response: converting the legal document to a set of images, extracting text data from one or more images in the set of images, the text data including sub-sets of text data, each sub-set of text data representing text in a respective clause of a set of clauses of the legal document, for each sub-set of text data receiving a prediction from a machine learning (ML) model in a set of ML models, the ML model being specific to a clause in the set of clauses, and outputting a set of predictions and respective prediction values for display in a user interface (UI).
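
The per-clause routing described in this abstract (one ML model per clause, each returning a prediction and a prediction value) might be sketched as below. This is an illustration under stated assumptions only: `keyword_model` is a toy stand-in for a trained clause-specific model, and the clause names, labels, and values are hypothetical. The image conversion and text extraction steps are assumed to have already produced the per-clause text.

```python
from typing import Callable, Dict, List, Tuple

Prediction = Tuple[str, float]           # (label, prediction value)
ClauseModel = Callable[[str], Prediction]


def keyword_model(keyword: str) -> ClauseModel:
    """Toy stand-in for a trained clause-specific model (hypothetical)."""
    def predict(text: str) -> Prediction:
        hit = keyword.lower() in text.lower()
        return ("standard", 0.9) if hit else ("non-standard", 0.6)
    return predict


def review(clauses: Dict[str, str],
           models: Dict[str, ClauseModel]) -> List[Tuple[str, str, float]]:
    """Send each clause's text to the model registered for that clause."""
    results = []
    for name, text in clauses.items():
        model = models.get(name)
        if model is None:
            continue  # no model available for this clause; skip it
        label, value = model(text)
        results.append((name, label, value))
    return results


# Hypothetical extracted clause text and model registry
clauses = {
    "governing_law": "This agreement is governed by the laws of Singapore.",
    "termination": "Either party may walk away at any time.",
}
models = {
    "governing_law": keyword_model("governed by"),
    "termination": keyword_model("written notice"),
}

for row in review(clauses, models):
    print(row)
```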
-
Publication No.: US20220237397A1
Publication Date: 2022-07-28
Application No.: US17160041
Application Date: 2021-01-27
Applicant: SAP SE
Inventor: Jianglei Han
Abstract: Technologies are described for automatically identifying handwritten signatures within digital images using OCR residues. For example, a digital image of a scanned document is received. The scanned document comprises typewritten content and handwritten content. Optical character recognition (OCR) is performed on the digital image to identify typewritten text within the digital image. Pixel areas containing the identified typewritten text are removed from the digital image. Density-based clustering is performed on the digital image to cluster remaining pixel data and generate candidate segments. The candidate segments are then processed using a trained image classifier to determine if they contain handwritten signatures.
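
The residue-then-cluster steps in this abstract can be sketched in plain Python. The sketch assumes the image has already been binarized into a set of ink pixel coordinates and that OCR has returned bounding boxes for typewritten text; the DBSCAN implementation, the function names, and the `eps`/`min_pts` values are illustrative choices, not details from the patent. The final image-classifier step is omitted.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]           # (row, col)
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), inclusive


def remove_text_boxes(ink: Set[Pixel], boxes: List[Box]) -> Set[Pixel]:
    """Drop ink pixels that fall inside any OCR text box (the 'residue' step)."""
    def inside(p: Pixel, b: Box) -> bool:
        return b[0] <= p[0] <= b[2] and b[1] <= p[1] <= b[3]
    return {p for p in ink if not any(inside(p, b) for b in boxes)}


def dbscan(points: Set[Pixel], eps: float, min_pts: int) -> List[Set[Pixel]]:
    """Minimal O(n^2) DBSCAN; returns clusters and discards noise points."""
    pts = sorted(points)

    def neighbors(p: Pixel) -> List[Pixel]:
        return [q for q in pts
                if (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= eps * eps]

    visited: Set[Pixel] = set()
    assigned: Set[Pixel] = set()
    clusters: List[Set[Pixel]] = []
    for p in pts:
        if p in visited:
            continue
        visited.add(p)
        seeds = neighbors(p)
        if len(seeds) < min_pts:
            continue  # not a core point; treated as noise unless claimed later
        cluster = {p}
        assigned.add(p)
        queue = list(seeds)
        while queue:
            q = queue.pop()
            if q not in visited:
                visited.add(q)
                q_neighbors = neighbors(q)
                if len(q_neighbors) >= min_pts:
                    queue.extend(q_neighbors)  # q is a core point; keep growing
            if q not in assigned:
                assigned.add(q)
                cluster.add(q)
        clusters.append(cluster)
    return clusters


def bounding_box(cluster: Set[Pixel]) -> Box:
    """Bounding box of a cluster, i.e. one candidate segment."""
    rows = [r for r, _ in cluster]
    cols = [c for _, c in cluster]
    return (min(rows), min(cols), max(rows), max(cols))


# Hypothetical page: a typed line, a signature-like blob, and a stray speck
typed = {(r, c) for r in range(0, 2) for c in range(0, 10)}
blob = {(r, c) for r in range(10, 13) for c in range(20, 25)}
ink = typed | blob | {(30, 0)}

residue = remove_text_boxes(ink, [(0, 0, 1, 9)])
candidates = [bounding_box(c) for c in dbscan(residue, eps=1.5, min_pts=4)]
print(candidates)
```

Each candidate box would then be cropped and passed to a trained image classifier, which is outside the scope of this sketch.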
-
Publication No.: US20220004713A1
Publication Date: 2022-01-06
Application No.: US16920916
Application Date: 2020-07-06
Applicant: SAP SE
Inventor: Jianglei Han, Traci Zheng Wen Lim, Juanlei Rocco Hu, Lijie Quan, Wei Jin, Lingxiao Liang
IPC: G06F40/289, G06N20/00, G06F40/284, G06Q50/18, G06K9/34
Abstract: Methods, systems, and computer-readable storage media for receiving, by an automated review system, a legal document as a computer-readable file, and determining, by the automated review system, that the legal document is of a first type, and in response: converting the legal document to a set of images, extracting text data from one or more images in the set of images, the text data including sub-sets of text data, each sub-set of text data representing text in a respective clause of a set of clauses of the legal document, for each sub-set of text data receiving a prediction from a machine learning (ML) model in a set of ML models, the ML model being specific to a clause in the set of clauses, and outputting a set of predictions and respective prediction values for display in a user interface (UI).
-