Invention Grant
- Patent Title: System and method for the indexing of organic chemical structures mined from text documents
- Patent Title (中): 从文本文件开采有机化学结构索引的系统和方法
- Application No.: US10797359
- Application Date: 2004-03-09
- Publication No.: US07899827B2
- Publication Date: 2011-03-01
- Inventor: Stephen Boyer, Anna Rosa Coden, James William Cooper
- Applicant: Stephen Boyer, Anna Rosa Coden, James William Cooper
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Harrington & Smith
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
Disclosed is a method, a computer program product and a system for processing documents that contain chemical names. The system has a unit to partition document text and to assign semantic meaning to words; a unit to recognize any substructures present in the chemical name fragments; and a unit to determine structural connectivity information of the chemical name fragments and recognized substructures and to store the determined structural connectivity information in a searchable index. The system further includes a unit to search a text index using at least one of a fragment name and a substructure name and to search the structure index by at least one of fragment connectivity and substructure connectivity. At an intersection of the search results from the structure index and the text index, the system operates to identify at least one document that contains a reference to a corresponding chemical compound.
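The two-index lookup described in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the class `DualIndex`, its methods, and the connectivity keys (opaque placeholder strings such as `"benzene-1-methyl"`) are hypothetical, and the step that parses chemical names into fragments and connectivity is assumed to have already run. The sketch only shows a text index and a structure index being queried separately and the results intersected to find candidate documents.

```python
from collections import defaultdict


class DualIndex:
    """Toy pair of inverted indexes (hypothetical, for illustration only)."""

    def __init__(self):
        # fragment or substructure name -> ids of documents mentioning it
        self.text_index = defaultdict(set)
        # connectivity key (placeholder string) -> ids of documents containing it
        self.structure_index = defaultdict(set)

    def add_document(self, doc_id, fragment_names, connectivity_keys):
        """Record already-extracted names and connectivity keys for one document."""
        for name in fragment_names:
            self.text_index[name.lower()].add(doc_id)
        for key in connectivity_keys:
            self.structure_index[key].add(doc_id)

    def search(self, fragment_name, connectivity_key):
        """Return documents found by BOTH the text and the structure lookup."""
        text_hits = self.text_index.get(fragment_name.lower(), set())
        structure_hits = self.structure_index.get(connectivity_key, set())
        return text_hits & structure_hits


# Example data: the connectivity keys are made-up placeholders, not real notation.
index = DualIndex()
index.add_document("doc1", ["methyl", "benzene"], ["benzene-1-methyl"])
index.add_document("doc2", ["methyl", "propane"], ["propane-2-methyl"])
index.add_document("doc3", ["benzene"], ["benzene-1-chloro"])

result = index.search("methyl", "benzene-1-methyl")
print(result)
```

Here only `doc1` matches both the fragment name and the connectivity key, so it is the single document in the intersection. A text-only search for "methyl" would also return `doc2`, which is why intersecting with the structure index narrows the result.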
Public/Granted literature
- US20050203898A1: System and method for the indexing of organic chemical structures mined from text documents (Public/Granted day: 2005-09-15)