Abstract:
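The present invention relates to a web crawling system that can drastically reduce the time required for web crawling. The web crawling system disclosed herein comprises: a seed page prioritization unit that sets the reference web pages (seed pages) for web crawling, calculates the access probability (importance) of each seed page (p_i) among the seed pages discovered through web crawling, and assigns a priority to each seed page (p_i); a download unit that extracts the seed page (p_{i,max}) having the highest of the assigned priorities and downloads it first, while also downloading in a single batch the outlink pages linked from that seed page (p_{i,max}); and an outlink page prioritization unit that calculates, for each link page (p_j) among the downloaded outlink pages, its access probability (importance) within the seed page (p_{i,max}) and assigns a priority to each link page (p_j). The system thereby solves the problem addressed by the invention.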
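The ordering described in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say how access probabilities are computed, so the scores are passed in by the caller, the link graph is an in-memory dictionary rather than real downloads, and the names `crawl_order`, `seed_scores`, `outlinks`, and `link_score` are hypothetical.

```python
import heapq


def crawl_order(seed_scores, outlinks, link_score):
    """Return a list of (seed, ranked_outlinks) in download order.

    seed_scores: dict mapping seed page -> access probability (assumed given).
    outlinks:    dict mapping page -> list of pages it links to.
    link_score:  function (seed, link) -> access probability of the link
                 within that seed page (assumed given).
    """
    # heapq is a min-heap, so scores are negated to pop the highest first.
    heap = [(-score, page) for page, score in seed_scores.items()]
    heapq.heapify(heap)

    seen = set()
    order = []
    while heap:
        _, seed = heapq.heappop(heap)
        if seed in seen:
            continue  # already fetched as an outlink of an earlier seed
        seen.add(seed)

        # Fetch the seed's outlinks in the same batch, skipping known pages
        # and duplicate links (dict.fromkeys keeps first-seen order).
        batch = [p for p in dict.fromkeys(outlinks.get(seed, [])) if p not in seen]
        seen.update(batch)

        # Rank each outlink by its access probability within this seed.
        ranked = sorted(batch, key=lambda p: link_score(seed, p), reverse=True)
        order.append((seed, ranked))
    return order
```

What the system does with the ranked outlinks afterwards is not stated in the abstract, so the sketch simply returns them alongside their seed.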
Abstract:
PURPOSE: A portable communication terminal capable of extracting a user's themes of interest, and a method thereof, are provided to identify the user's themes of interest from text data included in data generated by the portable communication terminal. CONSTITUTION: A word vector creation unit (210) creates a word vector representing each item of text data, according to the kind of text data stored in the data generated by the portable communication terminal. A theme classification tree storage unit (230) stores a theme classification tree in which a plurality of nodes are connected, each node representing a theme and including one or more items of learning data. A similarity output unit (220) calculates the similarity between the word vector and the learning data of each node in the theme classification tree.
Abstract:
PURPOSE: A topic classification module and a contextual advertisement system using the same are provided to minimize the cost of building the classification module by using open directory data. CONSTITUTION: A topic classification tree generator (132) processes the open directory data and creates a topic classification tree. A training data generator (134) creates training data representing each directory, based on the text information of the web sites included in the open directory. A classification unit (136) maps the training data to the directories. The classification unit determines the topic of a web page or an advertisement by calculating the similarity between its word vector and the vector representing each directory.
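The similarity step of the classification unit might look like the following Python sketch. It assumes a plain term-frequency bag of words and cosine similarity; the abstract names neither the weighting scheme nor the similarity measure, so both are illustrative choices, and `word_vector`, `cosine`, and `classify` are hypothetical names.

```python
import math
from collections import Counter


def word_vector(text):
    """Term-frequency vector: lowercase the text and split on whitespace."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(text, directory_vectors):
    """Return the directory whose representative vector is most similar."""
    vec = word_vector(text)
    return max(directory_vectors, key=lambda name: cosine(vec, directory_vectors[name]))
```

Here each entry of `directory_vectors` would be built by applying `word_vector` to the concatenated text of the sites listed under that directory, which stands in for the training data generator.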
Abstract:
PURPOSE: A customized advertisement providing method and a device thereof are provided to supply an advertisement that matches the subject matter related to a web page, where the related subject matter is identified from the history information of the web page. CONSTITUTION: One or more visited web pages that are semantically related to an advertisement-posting web page are extracted (S410). The subject matter of each web page is determined through subject classification of the visited web pages and the advertisement-posting web page (S420). Advertisement contents corresponding to the web page subject matter are extracted from advertisement contents whose advertisement subject matter has been determined. [Reference numerals] (S410) Extracting visited web pages related to an advertisement-posting web page; (S420) Classifying the subjects of the extracted visited web pages; (S430) Generating a class view according to the subject classification; (S440) Matching an advertisement according to the class view; (S450) Providing a customized advertisement
Abstract:
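The present invention relates to a technique for processing skyline queries in a wireless broadcast environment. The index configuration apparatus according to the invention comprises: an index table configuration unit that builds an index table including a DSI (Distributed Spatial Index) structure indexed based on SWEEP order and the NDP (Nearest Dominating Point) information of the data object corresponding to the DSI structure; and a broadcast unit that associates the index table with the data objects and broadcasts them. The apparatus thereby improves the energy efficiency of mobile clients during skyline query processing.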
Abstract:
PURPOSE: A contextual advertisement system utilizing a similarity graph is provided to match advertisements that are more semantically relevant to a web page, by generating a similarity graph based on a weight graph and calculating semantic relevance from the similarity graph. CONSTITUTION: A web page set manager (222) manages one or more web pages on which advertisements will be published. An advertisement set manager (224) manages at least one advertisement to be published on the web pages. Based on the similarity between a web page and an advertisement, an advertisement matching unit (226) matches the advertisement to the web page and displays the advertisement on the web page. The advertisement matching unit generates the similarity graph, calculates the semantic relevance from it, and matches the advertisement to the web page accordingly.
Abstract:
PURPOSE: An apparatus and a method for configuring an index in a wireless broadcast environment, and a system and a method for processing skyline queries using the same, are provided to accurately determine whether a currently read data object is a skyline point by using a DSI structure based on SWEEP order and the NDP information of each data object. CONSTITUTION: An index table configuring unit (310) configures an index table. The index table includes a DSI structure indexed based on SWEEP order and the NDP (Nearest Dominating Point) information of the data object corresponding to the DSI structure. A broadcasting unit (320) associates the index table with the data objects and broadcasts them together.
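For readers unfamiliar with skyline queries, the following Python sketch shows the dominance test that decides whether a data object is a skyline point. It is a naive quadratic baseline and does not model the DSI structure, the SWEEP order, or the NDP information described in the abstract. It also assumes that smaller values are better in every dimension, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every dimension and strictly better
    in at least one (assuming smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

A client using this baseline must compare each object against all others, which is the kind of work an index carrying dominance information is intended to reduce.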