Invention Grant
- Patent Title: Textual analysis system for automatic content extraction
- Application No.: US14009027
- Application Date: 2012-03-29
- Publication No.: US10545928B2
- Publication Date: 2020-01-28
- Inventor: Hamid Gharib, Simon Thompson, Duong Nguyen, Marcus Thint
- Applicant: Hamid Gharib, Simon Thompson, Duong Nguyen, Marcus Thint
- Applicant Address: GB London
- Assignee: BRITISH TELECOMMUNICATIONS public limited company
- Current Assignee: BRITISH TELECOMMUNICATIONS public limited company
- Current Assignee Address: GB London
- Agency: Nixon & Vanderhye P.C.
- Priority: EP11250404 20110330
- International Application: PCT/GB2012/000296 WO 20120329
- International Announcement: WO2012/131310 WO 20121004
- Main IPC: G06F16/21
- IPC: G06F16/21 ; G06F17/27 ; G06F17/22

Abstract:
The present invention provides a method, and an associated apparatus configured to implement such a method, for analysing mark-up language text content, such as might be found on a website or within online user-generated content. The method comprises a training phase, in which a plurality of schemas are automatically generated from a specified text and a final schema is compiled. This final schema can then be compared with other online text content so that content which matches the final schema can be identified, for example for further analysis and comparison.
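The two phases described in the abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes, purely for illustration, that a "schema" is the chain of tag names (plus class attributes) enclosing a piece of text, and that the final schema is the candidate seen in the most training examples. The names `PathCollector`, `candidate_schemas`, `compile_final_schema` and `extract_matching` are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Tags that never enclose text, so they are kept out of the path (assumption).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PathCollector(HTMLParser):
    """Record a (tag path, text) pair for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, outermost first
        self.nodes = []   # collected (path, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            cls = dict(attrs).get("class") or ""
            self.stack.append(f"{tag}.{cls}" if cls else tag)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; tolerate sloppy markup.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i].split(".")[0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes.append((tuple(self.stack), text))


def collect(markup):
    parser = PathCollector()
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return parser.nodes


def candidate_schemas(markup, specified_text):
    """Training step: one candidate schema per node containing the text."""
    return [path for path, text in collect(markup) if specified_text in text]


def compile_final_schema(examples):
    """Pick the candidate schema found in the most training examples."""
    counts = Counter()
    for markup, specified_text in examples:
        counts.update(set(candidate_schemas(markup, specified_text)))
    return counts.most_common(1)[0][0] if counts else None


def extract_matching(markup, schema):
    """Application step: return text whose tag path equals the final schema."""
    return [text for path, text in collect(markup) if path == schema]


# Made-up training pages, each paired with the text a user specified.
training = [
    ('<div class="post"><h2>Title A</h2>'
     '<p class="body">Great phone, poor battery.</p></div>', "poor battery"),
    ('<div class="post"><h2>Title B</h2>'
     '<p class="body">Screen is sharp.</p></div>', "Screen is sharp"),
]
final_schema = compile_final_schema(training)

new_page = ('<div class="post"><h2>Title C</h2>'
            '<p class="body">Arrived late.</p></div>'
            '<div class="ad"><p>Buy now</p></div>')
matches = extract_matching(new_page, final_schema)
```

With these made-up pages, `final_schema` is `("div.post", "p.body")` and `matches` holds only the review text from the new page; the heading and the advertisement are left out because their tag paths differ.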
Public/Granted literature
- US20140025698A1 TEXTUAL ANALYSIS SYSTEM Public/Granted day: 2014-01-23