METHODS AND APPARATUS FOR SIMILARITY TEXT SEARCH BASED ON CONCEPTUAL INDEXING

    Publication No.: CA2329558C

    Publication date: 2006-09-19

    Application No.: CA2329558

    Filing date: 2000-12-22

    Applicant: IBM

    Abstract: In one aspect of the invention, a method of performing a conceptual similarity search comprises the steps of generating one or more conceptual word-chains from one or more documents to be used in the conceptual similarity search; building a conceptual index of documents with the one or more word-chains; and evaluating a similarity query using the conceptual index. The evaluating step preferably returns one or more of the closest documents resulting from the search; one or more matching word-chains in the one or more documents; and one or more matching topical words of the one or more documents.

    Arrangements and methods for latency-sensitive hashing for collaborative web caching

    Publication No.: AU782314B2

    Publication date: 2005-07-21

    Application No.: AU7217300

    Filing date: 2000-12-11

    Applicant: IBM

    Abstract: Systems and methods for collaborative web caching among geographically distributed cache servers, particularly, latency-sensitive hashing systems and methods for collaborative web caching among geographically distributed proxy caches. Network latency delays as well as proxy load conditions are taken into consideration during hashing. As a result, requests can be hashed into geographically closer proxy caches if the load conditions permit. Otherwise, requests will be hashed into geographically distant proxy caches to better balance the load among the caches.
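The selection rule in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function name `pick_proxy`, the load threshold `max_load`, the candidate-set size, and the use of a per-proxy hash score to choose candidates are all assumptions made for illustration.

```python
import hashlib


def _score(url, proxy):
    # Hypothetical per-(proxy, URL) hash score used to choose candidates.
    digest = hashlib.sha256(f"{proxy}|{url}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_proxy(url, latency_ms, load, max_load=0.8, candidates=3):
    """Pick a proxy cache for url, weighing latency and load.

    latency_ms: proxy name -> measured latency from the requester (assumed input)
    load:       proxy name -> utilisation in [0, 1] (assumed input)
    """
    # Hashing narrows the choice to a small candidate set per URL.
    ranked = sorted(latency_ms, key=lambda p: _score(url, p), reverse=True)
    ranked = ranked[:candidates]
    # Prefer the closest candidate whose load permits it.
    eligible = [p for p in ranked if load[p] <= max_load]
    if eligible:
        return min(eligible, key=lambda p: latency_ms[p])
    # Otherwise fall back to the least-loaded candidate, however distant.
    return min(ranked, key=lambda p: load[p])


latency = {"near": 5, "mid": 40, "far": 120}
print(pick_proxy("http://example.com/a", latency, {"near": 0.3, "mid": 0.2, "far": 0.1}))
print(pick_proxy("http://example.com/a", latency, {"near": 0.95, "mid": 0.5, "far": 0.1}))
```

With three proxies and three candidates, the first call picks the closest proxy and the second skips it because its load exceeds the assumed threshold.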

    Title unknown (invention patent)

    Publication No.: DE69530556T2

    Publication date: 2004-04-08

    Application No.: DE69530556

    Filing date: 1995-09-15

    Applicant: IBM

    Abstract: A communications system and method include an efficient cache invalidation technique which allows a computer to relocate and to disconnect without informing the server. The server partitions the entire database into a number of groups. The server also dynamically identifies recently updated objects in a group and excludes them from the group when checking the validity of the group. If these objects have already been included in the most recent invalidation broadcast, the remote computer can invalidate them in its cache before checking the group validity with the server. With the recently updated objects excluded from a group, the server can conclude that the cold objects in the group can be retained in the cache, and validate the rest of the group.
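A minimal sketch of the group validity check described in this abstract follows. Integer timestamps, the function names, and the `recent_since` cut-off (the start of the period covered by the most recent invalidation broadcast) are assumptions for illustration, not details taken from the patent.

```python
def client_prepare(cache, broadcast_ids):
    """Client side: drop cached objects named in the latest invalidation broadcast."""
    return {obj: val for obj, val in cache.items() if obj not in broadcast_ids}


def group_is_valid(last_update, client_seen_at, recent_since):
    """Server side: validate a group while ignoring recently updated objects.

    last_update:    object id -> time of its last update, for one group
    client_seen_at: last time the client's copy of the group was known valid
    recent_since:   updates at or after this time count as "recent" (assumed to
                    be covered by the most recent invalidation broadcast)
    """
    # Exclude recently updated objects; the client has invalidated them already.
    cold = {obj: t for obj, t in last_update.items() if t < recent_since}
    # The rest of the group is valid if no cold object changed after the client left.
    return all(t <= client_seen_at for t in cold.values())


last_update = {"a": 10, "b": 95}
cache = client_prepare({"a": "A-data", "b": "B-data"}, broadcast_ids={"b"})
print(cache, group_is_valid(last_update, client_seen_at=50, recent_since=90))
```

Here object `b` was updated recently and announced in the broadcast, so the client drops it and the server can still validate the rest of the group.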

    Optimisation of system performance based on communication relationship

    Publication No.: GB2352542B

    Publication date: 2003-11-19

    Application No.: GB0006497

    Filing date: 2000-03-20

    Applicant: IBM

    Abstract: A method and apparatus for optimizing information-retrieval related system performance based on users' communication relationships. Users' interactions and relationships with each other are tracked by a 'relationship analyzer' that queries multiple heterogeneous information sources, such as e-mail logs, organization charts, calendar entries, phone logs, etc. A data structure is created for each user reflecting the intensity of communication relationship with other users, and modified over time as the data in the information sources change. A relationship group is defined based on the data structure and preference or importance ratings for each type of communication relationship that includes each user's group of highest-priority other users. A derived relationship group may also be defined based on high-priority users of a user's highest-intensity relationships. The relationship analyzer then acts as a proxy for user queries, and may modify queries and create persistent data stores or store the results of queries or sub-queries in order to improve system performance in a variety of ways: for example, to shorten retrieval time, to resolve missing or ambiguous results, to prioritize information for downloading to limited-resource computing devices, or to propagate updated information among closely related users. A way to derive a relationship group based on subject lines of communications, or other text-based content of communication-related information, is also described.
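As a rough illustration of the relationship group idea, the sketch below sums weighted interaction counts per contact and keeps the highest-scoring ones. The event tuple format, the weight values, and the function name `relationship_group` are hypothetical; the abstract does not specify how intensity is computed.

```python
from collections import defaultdict


def relationship_group(events, weights, user, k=2):
    """Return the k contacts with the highest weighted interaction score.

    events:  iterable of (from_user, to_user, kind) tuples (assumed log format)
    weights: kind -> importance rating for that type of communication (assumed)
    """
    intensity = defaultdict(float)
    for src, dst, kind in events:
        if src == user:
            intensity[dst] += weights[kind]
        elif dst == user:
            intensity[src] += weights[kind]
    # Sort by descending intensity, breaking ties by name for determinism.
    return sorted(intensity, key=lambda u: (-intensity[u], u))[:k]


events = [
    ("ann", "bob", "email"),
    ("ann", "bob", "email"),
    ("cat", "ann", "phone"),
    ("ann", "dan", "calendar"),
]
weights = {"email": 1.0, "phone": 3.0, "calendar": 0.5}
print(relationship_group(events, weights, "ann"))
```

A proxy could then use such a group to decide whose query results to cache or propagate first.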

    Title unknown (invention patent)

    Publication No.: DE69530556D1

    Publication date: 2003-06-05

    Application No.: DE69530556

    Filing date: 1995-09-15

    Applicant: IBM

    Abstract: A communications system and method include an efficient cache invalidation technique which allows a computer to relocate and to disconnect without informing the server. The server partitions the entire database into a number of groups. The server also dynamically identifies recently updated objects in a group and excludes them from the group when checking the validity of the group. If these objects have already been included in the most recent invalidation broadcast, the remote computer can invalidate them in its cache before checking the group validity with the server. With the recently updated objects excluded from a group, the server can conclude that the cold objects in the group can be retained in the cache, and validate the rest of the group.

    Generating decision trees with discriminants and employing the same in data classification

    Publication No.: GB2369697A

    Publication date: 2002-06-05

    Application No.: GB0109736

    Filing date: 2001-04-20

    Applicant: IBM

    Abstract: At least a portion of a decision tree structure is generated from one or more multidimensional data objects by representing data associated with one or more of the data objects as a node, determining a condition for dividing the data at the node into at least two subsequent nodes based on a discriminant measure which maximises the separation between classes associated with the data, and dividing the data according to the condition. The multidimensional objects may be data records including feature variables and class variables and the method comprises splitting a decision tree, recursively, such that the greatest amount of separation among the class values of the data is achieved. The discriminant measure is preferably determined in accordance with Fisher's discriminant technique and the data is divided at a split plane determined to be perpendicular to a direction determined according to said technique and where an entropy measure is substantially optimised, as determined in accordance with a gini index.
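The split described in this abstract can be sketched for two classes in two dimensions. This is an illustrative reading, not the patented procedure: it computes the textbook Fisher direction w = Sw^-1 (m1 - m0) and then picks the threshold along w that minimises the weighted gini index. All function names are assumptions.

```python
def mean(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter(points, m):
    # 2x2 scatter matrix of one class around its mean m.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class0, class1):
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dm = [m1[0] - m0[0], m1[1] - m0[1]]
    # w = Sw^-1 * (m1 - m0), using the closed-form 2x2 inverse.
    return [(sw[1][1] * dm[0] - sw[0][1] * dm[1]) / det,
            (-sw[1][0] * dm[0] + sw[0][0] * dm[1]) / det]


def gini(labels):
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def best_split(class0, class1):
    """Return (w, threshold, weighted_gini); the split plane is w . x = threshold."""
    w = fisher_direction(class0, class1)
    proj = sorted([(w[0] * p[0] + w[1] * p[1], 0) for p in class0] +
                  [(w[0] * p[0] + w[1] * p[1], 1) for p in class1])
    best = None
    for k in range(1, len(proj)):
        if proj[k - 1][0] == proj[k][0]:
            continue  # no threshold fits between equal projections
        left = [label for _, label in proj[:k]]
        right = [label for _, label in proj[k:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(proj)
        if best is None or score < best[0]:
            best = (score, (proj[k - 1][0] + proj[k][0]) / 2)
    if best is None:
        raise ValueError("all points project to the same value")
    return w, best[1], best[0]


class0 = [(0, 0), (1, 0.5), (0, 1), (1, 1.5)]
class1 = [(4, 4), (5, 4.5), (4, 5), (5, 5.5)]
w, threshold, score = best_split(class0, class1)
print(w, threshold, score)
```

A recursive tree builder would apply `best_split` at each node and recurse on the two sides of the plane; that part is omitted here.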

    LOAD BALANCING COOPERATING CACHE SERVERS

    Publication No.: HU0104250A2

    Publication date: 2002-02-28

    Application No.: HU0104250

    Filing date: 1999-10-08

    Applicant: IBM

    Abstract: In a system including a collection of cooperating cache servers, such as proxy cache servers, a request can be forwarded to a cooperating cache server if the requested object cannot be found locally. An overload condition is detected if, for example, due to reference skew, some objects are in high demand by all the clients and the cache servers that contain those hot objects become overloaded due to forwarded requests. In response, the load is balanced by shifting some or all of the forwarded requests from an overloaded cache server to a less loaded one. Both centralized and distributed load balancing environments are described.
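A centralized variant of this load shifting might look like the sketch below. It is a simplification under stated assumptions: forwarded requests are tracked per object, an object's forwarded requests are shifted as a whole, the target server is assumed able to serve the object, and the names `rebalance`, `owner`, `demand` and `threshold` are hypothetical.

```python
def rebalance(servers, owner, demand, threshold):
    """Shift forwarded requests for hot objects away from overloaded servers.

    servers:   list of cache server names
    owner:     object id -> server that currently receives its forwarded requests
    demand:    object id -> forwarded requests per interval
    threshold: forwarded-request load above which a server counts as overloaded
    """
    owner = dict(owner)  # work on a copy; the caller's mapping is left untouched
    load = {s: 0 for s in servers}
    for obj, s in owner.items():
        load[s] += demand[obj]
    for s in servers:
        # Consider this server's hottest objects first.
        hot = sorted((o for o in owner if owner[o] == s),
                     key=lambda o: demand[o], reverse=True)
        for obj in hot:
            if load[s] <= threshold:
                break  # no longer overloaded
            target = min(servers, key=lambda t: load[t])
            if target == s or load[target] + demand[obj] > threshold:
                continue  # moving this object would only shift the overload
            owner[obj] = target
            load[s] -= demand[obj]
            load[target] += demand[obj]
    return owner, load


servers = ["A", "B", "C"]
owner = {"x": "A", "y": "A", "z": "B"}
demand = {"x": 60, "y": 50, "z": 10}
print(rebalance(servers, owner, demand, threshold=80))
```

In this example server A starts with a load of 110, so the forwarded requests for its hottest object `x` are shifted to the idle server C.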
