Invention Grant
US08196037B2 Method and device for extracting web information (in force)

Method and device for extracting web information
Abstract:
A method for extracting web information includes: selecting a number of Hypertext Markup Language (HTML) tags from the HTML text of a web page as tag ruler elements, and generating a tag ruler in which the elements follow the order in which they appear in the HTML text; matching the HTML text against the tag ruler elements in the order in which they appear in the tag ruler, segmenting the web information according to the matched HTML tags, and saving the web information segments together with the location, in the HTML text, of the HTML tags enclosing each segment; and determining the location in the HTML text of the HTML tags that contain the web information needed by a user, and extracting the corresponding web information segments from the saved segments.
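The three steps of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how ruler tags are chosen or how a tag's location is represented, so the sketch assumes the caller supplies a set of tag names (`ruler_tags`), that a location is the character offset of the opening tag, and that ruler tags of the same name are not nested. All function and class names here (`build_tag_ruler`, `segment_by_ruler`, `extract_by_location`, `Segment`) are hypothetical.

```python
import re
from dataclasses import dataclass

# Matches an opening or closing tag; group 1 is "/" for closing tags.
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


@dataclass
class Segment:
    tag: str    # name of the enclosing HTML tag
    start: int  # offset of the opening tag in the HTML text (assumed "location")
    end: int    # offset just past the closing tag
    text: str   # web information enclosed by the tag, inner markup stripped


def build_tag_ruler(html, ruler_tags):
    """Step 1: collect selected opening tags in the order they appear."""
    return [
        m.group(2).lower()
        for m in TAG_RE.finditer(html)
        if not m.group(1) and m.group(2).lower() in ruler_tags
    ]


def segment_by_ruler(html, ruler):
    """Step 2: match the text against the ruler in order and save segments."""
    segments = []
    pos = 0
    for name in ruler:
        open_re = re.compile(rf"<{re.escape(name)}\b[^>]*>", re.I)
        close_re = re.compile(rf"</{re.escape(name)}\s*>", re.I)
        open_m = open_re.search(html, pos)
        if open_m is None:
            break
        close_m = close_re.search(html, open_m.end())
        if close_m is None:
            break
        inner = html[open_m.end():close_m.start()]
        text = re.sub(r"<[^>]+>", "", inner).strip()
        segments.append(Segment(name, open_m.start(), close_m.end(), text))
        # Continue after the opening tag so nested ruler tags are still found.
        pos = open_m.end()
    return segments


def extract_by_location(segments, wanted_offsets):
    """Step 3: return the saved segments whose enclosing tag is at a wanted offset."""
    by_start = {s.start: s.text for s in segments}
    return [by_start[o] for o in wanted_offsets if o in by_start]
```

As a usage example, for the page `<html><body><h1>News</h1><p>First story</p><p>Second story</p></body></html>` and `ruler_tags = {"h1", "p"}`, the ruler is `["h1", "p", "p"]`, and passing the offset of the first `<p>` tag to `extract_by_location` returns the segment enclosed by that tag. Regular expressions are used only to keep the sketch short; a real page with comments, scripts, or nested same-name tags would need a proper HTML parser.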