Invention Grant
- Patent Title: Apparatus and method for accessing and indexing dynamic web pages
- Application No.: US12122696
- Application Date: 2008-05-18
- Publication No.: US08131753B2
- Publication Date: 2012-03-06
- Inventor: Ilya Rybak, Lior Harsat, Yael Schuldenfrei, Sagi Kariv
- Applicant: Ilya Rybak, Lior Harsat, Yael Schuldenfrei, Sagi Kariv
- Main IPC: G06F7/00
- IPC: G06F7/00 ; G06F17/30

Abstract:
A method and apparatus for enabling an external application, such as a web crawler, to access dynamic web pages associated with a primary application, such as a portal page. The primary application addresses each component associated with it and requests a list of resource identifiers. Each component implements an interface and provides such a list of resource identifiers. The list is returned to the external application, which then optionally requests the contents of the page associated with each resource identifier. The component provides the content of the page, which is then parsed by a parsing module associated with the primary application. The parsing module transforms the content into a data structure such as a Document Object Model, and then extracts text or Hypertext Markup Language code from the data structure. The text is then returned to the external application for searching, indexing or other purposes.
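
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Component`, `NewsComponent`, `Portal`, `list_resource_ids`, `get_content`, `get_text` and `extract_text` are hypothetical, the sample page is invented, and the standard-library `xml.dom.minidom` parser stands in for the parsing module, which assumes the component returns well-formed markup.

```python
from typing import Protocol
from xml.dom import Node, minidom


class Component(Protocol):
    """Hypothetical interface that each component of the portal implements."""

    def list_resource_ids(self) -> list[str]: ...

    def get_content(self, resource_id: str) -> str: ...


class NewsComponent:
    """Illustrative component serving one dynamic page (invented sample data)."""

    def __init__(self) -> None:
        self._pages = {
            "news/1": "<html><body><h1>Launch</h1>"
                      "<p>Product released.</p></body></html>",
        }

    def list_resource_ids(self) -> list[str]:
        return list(self._pages)

    def get_content(self, resource_id: str) -> str:
        return self._pages[resource_id]


def extract_text(markup: str) -> str:
    """Parse markup into a DOM tree, then collect its non-empty text nodes."""
    document = minidom.parseString(markup)
    chunks: list[str] = []

    def walk(node) -> None:
        if node.nodeType == Node.TEXT_NODE:
            text = node.data.strip()
            if text:
                chunks.append(text)
        for child in node.childNodes:
            walk(child)

    walk(document)
    return " ".join(chunks)


class Portal:
    """Stand-in for the primary application that owns the components."""

    def __init__(self, components: list[Component]) -> None:
        self._components = components
        self._owners: dict[str, Component] = {}

    def list_resource_ids(self) -> list[str]:
        # Ask every component for its identifiers and remember the owner.
        self._owners = {
            resource_id: component
            for component in self._components
            for resource_id in component.list_resource_ids()
        }
        return list(self._owners)

    def get_text(self, resource_id: str) -> str:
        # Fetch the page from its component, then run it through the parser.
        if resource_id not in self._owners:
            self.list_resource_ids()
        return extract_text(self._owners[resource_id].get_content(resource_id))
```

An external application such as a crawler would call `list_resource_ids()` on the portal first and then, optionally, `get_text()` for each identifier it wants to index. Real portal pages are often not well-formed XML, so a tolerant HTML parser would be needed in place of `minidom` in practice.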
Public/Granted literature
- US20090288099A1 APPARATUS AND METHOD FOR ACCESSING AND INDEXING DYNAMIC WEB PAGES Public/Granted day: 2009-11-19