APPARATUS AND METHODS FOR CLASSIFICATION OF WEB SITES
    1.
    Invention application
    APPARATUS AND METHODS FOR CLASSIFICATION OF WEB SITES (under examination, published)

    Publication number: WO2004053726A2

    Publication date: 2004-06-24

    Application number: PCT/EP0315017

    Filing date: 2003-11-14

    Applicant: IBM; IBM FRANCE

    CPC classification number: G06F17/3071

    Abstract: Apparatus and methods for classifying web sites are provided. With the apparatus and methods, traffic data is obtained for a plurality of web sites. Patterns, or templates, for each web site are generated based on this traffic data, and the patterns are clustered into classes of web sites using a clustering algorithm. The clusters, or classes, are then profiled to generate a template for each class. The template for each class is generated by first shifting the patterns for each web site that is part of the class to compensate for effects like time zone differences, if any, and then identifying a pattern that is most similar to all of the patterns in the class. Once the template for each class is generated, this template is then used with traffic data from a new web site to classify the new web site into one of the existing classes. In other words, when traffic data for a new web site is received, a pattern for the traffic data of the new web site is generated and compared to the templates for the various classes. If a matching class template is identified, the new web site is classified into the corresponding class. If the pattern for the new web site does not match any of the existing templates, a new template and class may be generated based on the pattern for the new web site.
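The classification step for a new web site can be illustrated with a minimal sketch. The abstract does not specify how patterns are represented or compared, so everything here is an assumption made for illustration: a pattern is taken to be the site's hit counts per time bin normalized to sum to 1, similarity is Euclidean distance, and `max_distance` is a hypothetical match threshold.

```python
import math

def to_pattern(hits_per_bin):
    """Normalize raw hit counts so the pattern sums to 1 (assumed representation)."""
    total = sum(hits_per_bin)
    if total == 0:
        raise ValueError("no traffic recorded")
    return [h / total for h in hits_per_bin]

def distance(a, b):
    """Euclidean distance between two patterns of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(pattern, templates, max_distance=0.1):
    """Return the name of the matching class, creating a new class if none matches.

    `templates` maps class names to template patterns; `max_distance` is a
    hypothetical threshold, not a value taken from the patent.
    """
    best = min(templates, key=lambda name: distance(pattern, templates[name]),
               default=None)
    if best is not None and distance(pattern, templates[best]) <= max_distance:
        return best
    # No existing template matches: the new pattern seeds a new class.
    new_name = f"class_{len(templates)}"
    templates[new_name] = pattern
    return new_name

templates = {
    "daytime": to_pattern([1, 1, 5, 5]),
    "night": to_pattern([5, 5, 1, 1]),
}
matched = classify(to_pattern([2, 2, 9, 11]), templates)
unmatched = classify(to_pattern([3, 3, 3, 3]), templates)
print(matched, unmatched)
```

In this example the first site falls within the threshold of the "daytime" template, while the flat pattern matches neither template and becomes the seed of a new class.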

    APPARATUS AND METHODS FOR CLASSIFICATION OF WEB SITES

    Publication number: AU2003296756A1

    Publication date: 2004-06-30

    Application number: AU2003296756

    Filing date: 2003-11-14

    Applicant: IBM

    Abstract: Apparatus and methods for classifying web sites are provided. With the apparatus and methods, traffic data is obtained for a plurality of web sites. Patterns, or templates, for each web site are generated based on this traffic data, and the patterns are clustered into classes of web sites using a clustering algorithm. The clusters, or classes, are then profiled to generate a template for each class. The template for each class is generated by first shifting the patterns for each web site that is part of the class to compensate for effects like time zone differences, if any, and then identifying a pattern that is most similar to all of the patterns in the class. Once the template for each class is generated, this template is then used with traffic data from a new web site to classify the new web site into one of the existing classes. In other words, when traffic data for a new web site is received, a pattern for the traffic data of the new web site is generated and compared to the templates for the various classes. If a matching class template is identified, the new web site is classified into the corresponding class. If the pattern for the new web site does not match any of the existing templates, a new template and class may be generated based on the pattern for the new web site.
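The template-generation step (shift the patterns, then pick the pattern most similar to all others) might look roughly like the sketch below. It assumes, purely for illustration, that time zone compensation is a circular shift of the time bins, that each pattern is aligned against the first pattern in the class, and that "most similar to all of the patterns" means the medoid under squared Euclidean distance. None of these choices is stated in the abstract.

```python
def shift(pattern, k):
    """Circularly shift a pattern right by k bins."""
    k %= len(pattern)
    return pattern[-k:] + pattern[:-k] if k else list(pattern)

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def align(pattern, reference):
    """Return the circular shift of `pattern` that lies closest to `reference`."""
    return min((shift(pattern, k) for k in range(len(pattern))),
               key=lambda p: sq_dist(p, reference))

def class_template(patterns):
    """Align every pattern to the first one, then return the medoid.

    The medoid is the aligned pattern with the smallest total distance
    to all aligned patterns in the class.
    """
    reference = patterns[0]
    aligned = [align(p, reference) for p in patterns]
    return min(aligned, key=lambda c: sum(sq_dist(c, p) for p in aligned))

# Three sites with the same daily shape; the second is offset by one bin.
patterns = [[0, 1, 4, 1], [1, 0, 1, 4], [0, 1, 5, 1]]
template = class_template(patterns)
print(template)
```

Choosing a medoid rather than a mean keeps the template equal to an actually observed pattern, which is one reading of "identifying a pattern" in the abstract.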

    APPARATUS AND METHODS FOR CO-LOCATION AND OFFLOADING OF WEB SITE TRAFFIC BASED ON TRAFFIC PATTERN RECOGNITION

    Publication number: CA2508047A1

    Publication date: 2004-06-24

    Application number: CA2508047

    Filing date: 2003-11-14

    Applicant: IBM

    Abstract: Apparatus and methods for identifying traffic patterns to web sites based on templates that characterize the arrival of traffic to the web sites are provided. Based on these templates, determinations are made as to which web sites should be co-located so as to optimize resource allocation. Specifically, web sites whose templates are complementary, i.e. a first web site having a peak in arrival traffic at time t1 and a second web site that has a trough in arrival traffic at time t1, are designated as being candidates for co-location. In addition, the present invention uses the templates identified for the traffic patterns of web sites to determine thresholds for offloading traffic to other servers. These thresholds include a first threshold at which offloading should be performed, a second threshold that takes into consideration the lead time needed to begin offloading, and a third threshold that takes into consideration a lag time needed to stop all offloading of traffic to the other servers.
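One way to find complementary templates is to compare the peak load of two sites hosted together with the sum of their separate peaks. The sketch below ranks site pairs by that saving. The scoring rule, the function names, and the sample templates are assumptions for illustration; the abstract only states that a peak in one template should coincide with a trough in the other.

```python
from itertools import combinations

def combined_peak(a, b):
    """Peak load if two sites with templates a and b share one server."""
    return max(x + y for x, y in zip(a, b))

def colocation_candidates(templates):
    """Rank site pairs by how much co-location lowers the required peak capacity.

    The saving is the sum of the separate peaks minus the combined peak;
    pairs whose peaks coincide save nothing and are dropped.
    """
    ranked = []
    for (n1, t1), (n2, t2) in combinations(templates.items(), 2):
        saving = max(t1) + max(t2) - combined_peak(t1, t2)
        ranked.append((saving, n1, n2))
    ranked.sort(reverse=True)
    return [(n1, n2, s) for s, n1, n2 in ranked if s > 0]

# Hypothetical arrival templates (load per time bin).
templates = {
    "news": [9, 5, 1, 1],
    "games": [1, 2, 8, 9],
    "mail": [8, 6, 2, 1],
}
candidates = colocation_candidates(templates)
print(candidates)
```

Here "news" and "mail" peak in the same bin, so pairing them saves nothing, whereas "news" and "games" peak at opposite times and rank first.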

    APPARATUS AND METHODS FOR CO-LOCATION AND OFFLOADING OF WEB SITE TRAFFIC BASED ON TRAFFIC PATTERN RECOGNITION

    Publication number: AU2003292269A1

    Publication date: 2004-06-30

    Application number: AU2003292269

    Filing date: 2003-11-14

    Applicant: IBM

    Abstract: Identifying traffic patterns to web sites based on templates that characterize the arrival of traffic to the web sites is provided. Based on these templates, determinations are made as to which web sites should be co-located so as to optimize resource allocation. Web sites whose templates are complementary, i.e. a first web site having a peak in arrival traffic at time t1 and a second web site that has a trough in arrival traffic at time t1, are designated as being candidates for co-location. In addition, the templates identified for the traffic patterns of web sites are used to determine thresholds for offloading traffic to other servers. These thresholds include a first threshold at which offloading should be performed, a second threshold that takes into consideration the lead time needed to begin offloading, and a third threshold that takes into consideration a lag time needed to stop offloading of traffic.
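The three offloading thresholds can be sketched as follows, under one possible interpretation that the abstract does not confirm. The first threshold is assumed to be the server capacity. The second is assumed to be the capacity minus the largest rise the template shows within the lead time, so that offloading is ready before capacity is reached. The third is assumed to be the capacity minus the largest rise within the lag time, so that offloading is stopped only when a rebound during the wind-down cannot exceed capacity. All names and numbers are hypothetical.

```python
def max_rise(template, steps):
    """Largest increase in load the template shows within `steps` time bins.

    The template is treated as cyclic (e.g. a daily pattern).
    """
    n = len(template)
    rise = max((template[(i + j) % n] - template[i]
                for i in range(n) for j in range(1, steps + 1)),
               default=0)
    return max(0, rise)

def offload_thresholds(template, capacity, lead_steps, lag_steps):
    """Derive three load thresholds from a traffic template (assumed rule)."""
    return {
        # Load at which traffic must actually be offloaded.
        "offload": capacity,
        # Load at which to begin setting up offloading, given the lead time.
        "begin": capacity - max_rise(template, lead_steps),
        # Load below which offloading can be wound down, given the lag time.
        "stop": capacity - max_rise(template, lag_steps),
    }

# Hypothetical template: expected load per time bin for one class of sites.
template = [10, 20, 50, 90, 60, 30]
thresholds = offload_thresholds(template, capacity=100, lead_steps=1, lag_steps=2)
print(thresholds)
```

With this template the steepest one-bin rise is 40 and the steepest two-bin rise is 70, which places the begin and stop thresholds at 60 and 30 respectively.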
