Method for determining output data for a plurality of text documents

Invention Grant

US11263251B2 Method for determining output data for a plurality of text documents (Active)

Patent Title: Method for determining output data for a plurality of text documents
Application No.: US16385054

Application Date: 2019-04-16
Publication No.: US11263251B2

Publication Date: 2022-03-01
Inventor: Mark Buckley
Applicant: Siemens Aktiengesellschaft
Applicant Address: DE Munich
Assignee: Siemens Aktiengesellschaft
Current Assignee: Siemens Aktiengesellschaft
Current Assignee Address: DE Munich
Agency: Schmeiser, Olsen & Watts LLP
Priority: EP18168202 20180419
Main IPC: G06F16/35
IPC: G06F16/35 ; G06F40/30 ; G06F40/284

Abstract:

Provided is a method for determining output data for a plurality of text documents, including the steps of: providing a feature matrix as input data, wherein the feature matrix includes information about frequencies of a plurality of features within the plurality of text documents; clustering the feature matrix using a clustering algorithm into at least one clustering matrix, wherein the at least one clustering matrix includes information about the cluster membership of each document of the plurality of documents or each feature of the plurality of features; assigning at least one score to each feature of the plurality of features based on the at least one clustering matrix; ranking the plurality of features based on their assigned scores; and outputting the ranked features as output data. A corresponding computer program product and system are also provided.
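
The steps named in the abstract (feature matrix, clustering, scoring, ranking, output) can be illustrated with a minimal sketch. The abstract does not name a clustering algorithm or a scoring formula, so everything specific here is an assumption made for illustration: whitespace tokens as features, a tiny deterministic k-means over documents, and a score equal to the spread of a feature's mean frequency across clusters. The function names and the sample documents are hypothetical and are not taken from the patent.

```python
from collections import Counter


def build_feature_matrix(docs):
    """Rows are documents, columns are features (assumed: lowercase tokens), cells are counts."""
    vocab = sorted({tok for doc in docs for tok in doc.lower().split()})
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def cluster_documents(rows, k, iterations=10):
    """Assumed clustering step: a small k-means seeded with the first k rows.

    Returns one cluster label per document, i.e. the document-membership
    information that the abstract calls a clustering matrix.
    """
    centroids = [[float(x) for x in row] for row in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assign each document to its nearest centroid (squared Euclidean distance).
        for i, row in enumerate(rows):
            dists = [sum((x - c) ** 2 for x, c in zip(row, cen)) for cen in centroids]
            labels[i] = dists.index(min(dists))
        # Move each centroid to the mean of its current members.
        for j in range(k):
            members = [rows[i] for i in range(len(rows)) if labels[i] == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def score_features(vocab, rows, labels, k):
    """Assumed score: highest minus lowest per-cluster mean frequency of a feature."""
    scores = {}
    for col, term in enumerate(vocab):
        means = []
        for j in range(k):
            values = [rows[i][col] for i in range(len(rows)) if labels[i] == j]
            means.append(sum(values) / len(values) if values else 0.0)
        scores[term] = max(means) - min(means)
    return scores


def rank_features(docs, k=2):
    """Run all steps and return (feature, score) pairs, best first, ties alphabetical."""
    vocab, rows = build_feature_matrix(docs)
    labels = cluster_documents(rows, k)
    scores = score_features(vocab, rows, labels, k)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample documents, invented for this sketch.
docs = [
    "pump pressure alarm pump",
    "valve leak alarm valve",
    "pump pressure pump low",
    "valve leak valve stuck",
]
ranked = rank_features(docs, k=2)
for term, score in ranked:
    print(f"{term}\t{score:.2f}")
```

In this toy run, terms that occur in only one cluster ("pump", "valve") rank highest, while "alarm", which is spread evenly over both clusters, ranks last. Other clustering algorithms or scoring rules would fit the same outline.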

Published/Granted Documents

US20190325026A1 METHOD FOR DETERMINING OUTPUT DATA FOR A PLURALITY OF TEXT DOCUMENTS Publication date: 2019-10-24

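IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: see G06N)
G06F16/00	Information retrieval; database structures therefor; file system structures therefor
G06F16/30	. of unstructured textual data (document management systems: see G06F 16/93)
G06F16/35	.. Clustering; classification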