IDENTIFYING REQUESTS THAT INVALIDATE USER SESSIONS

    Publication No.: CA2762544A1

    Publication date: 2013-06-20

    Application No.: CA2762544

    Filing date: 2011-12-20

    Applicant: IBM CANADA

    Abstract: An illustrative embodiment of a computer-implemented process for identifying a request that invalidates a session excludes all marked logout requests of a Web application, crawls an identified next portion of the Web application and, responsive to a determination in one instance that the state of the crawl is out of session, logs in to the Web application. The process then selects all crawl requests sent since the crawl was last in session, excluding all marked logout requests, and, responsive to a determination that requests remain, crawls a selected next unprocessed request. Responsive to a determination in the next instance that the state of the crawl is out of session and that the selected request meets logout request criteria, the process marks the selected request as a logout request.
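
The replay loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `find_logout_requests`, the callback parameters, and the in-memory `FakeApp` are all hypothetical names, and "meets logout request criteria" is reduced here to "the session is gone right after the request".

```python
from typing import Callable, Iterable, Set


def find_logout_requests(
    requests_since_in_session: Iterable[str],
    send: Callable[[str], None],
    is_in_session: Callable[[], bool],
    login: Callable[[], None],
    known_logouts: Set[str],
) -> Set[str]:
    """Replay candidate requests one at a time and mark those that end the session."""
    marked = set(known_logouts)
    for request in requests_since_in_session:
        if request in marked:
            continue  # marked logout requests are excluded from the replay
        if not is_in_session():
            login()  # restore the session before testing the next candidate
        send(request)
        if not is_in_session():
            # Assumption: losing the session stands in for the logout criteria.
            marked.add(request)
    return marked


class FakeApp:
    """Hypothetical in-memory application used only to exercise the sketch."""

    def __init__(self) -> None:
        self.in_session = False

    def login(self) -> None:
        self.in_session = True

    def send(self, url: str) -> None:
        if url.endswith("/signout"):
            self.in_session = False

    def is_in_session(self) -> bool:
        return self.in_session


app = FakeApp()
candidates = ["/home", "/account/signout", "/cart"]
logouts = find_logout_requests(
    candidates, app.send, app.is_in_session, app.login, set()
)
print(logouts)
```

A real crawler would issue HTTP requests and use an application-specific in-session check; the callbacks keep the sketch independent of any HTTP library.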

    TRACKING JAVASCRIPT ACTION

    Publication No.: CA2838911A1

    Publication date: 2015-07-09

    Application No.: CA2838911

    Filing date: 2014-01-09

    Applicant: IBM CANADA

    Abstract: An illustrative embodiment of a computer-implemented process for tracking JavaScript actions in a rich Internet application receives a document object model (DOM) representative of a particular page of an application at a particular time and analyzes the received DOM to identify each JavaScript action on the page; for each JavaScript action identified, a JavaScript action characteristics ID is calculated and stored. Responsive to a determination that multiple instances of a same ID exist, the process collects a list of the JavaScript actions corresponding to each such ID and removes from memory the JavaScript action entries for the multiple instances of the same ID. A neighbor influence is computed for a remaining member of the list of JavaScript actions, and the JavaScript action ID calculated for that member is stored. Responsive to a determination that there are no more multiple JavaScript actions, the process returns all stored JavaScript action IDs.

    CRAWLING RICH INTERNET APPLICATIONS

    Publication No.: CA2790379A1

    Publication date: 2014-03-20

    Application No.: CA2790379

    Filing date: 2012-09-20

    Applicant: IBM CANADA

    Abstract: An illustrative embodiment of a computer-implemented process for crawling rich Internet applications executes sets of events discovered in a state exploration phase according to a predetermined priority of each set of events, wherein events of a higher priority are exhausted before an event of a lower priority is executed, and, responsive to a determination that transitions remain, executes a set of events in a transition exploration phase. The process further determines whether a new state exists as a result of executing an event in the set of events and, responsive to a determination that a new state exists, returns to the state exploration phase.
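
The two-phase loop can be sketched on a toy state graph as below. Everything here is an assumption made for illustration: the application is modeled as a dictionary of `state -> {event: (priority, next_state)}`, priority 0 is highest, and the hypothetical marker `TRANSITION_ONLY` tags events that are deferred to the transition exploration phase. The sketch also ignores the cost of navigating back to the state an event belongs to.

```python
import heapq
from typing import Dict, List, Set, Tuple

TRANSITION_ONLY = 99  # hypothetical marker for events left to the second phase
App = Dict[str, Dict[str, Tuple[int, str]]]
Transition = Tuple[str, str, str]  # (source state, event, target state)


def crawl(app: App, start: str) -> Tuple[Set[str], List[Transition]]:
    seen: Set[str] = {start}
    transitions: List[Transition] = []
    state_queue: List[Tuple[int, str, str]] = []  # state exploration phase
    deferred: List[Tuple[str, str]] = []          # transition exploration phase

    def enqueue(state: str) -> None:
        for event, (priority, _) in sorted(app[state].items()):
            if priority == TRANSITION_ONLY:
                deferred.append((state, event))
            else:
                heapq.heappush(state_queue, (priority, state, event))

    def execute(state: str, event: str) -> bool:
        """Run one event; return True when it reveals a new state."""
        target = app[state][event][1]
        transitions.append((state, event, target))
        if target in seen:
            return False
        seen.add(target)
        enqueue(target)
        return True

    enqueue(start)
    while state_queue or deferred:
        # State exploration: higher priorities are exhausted first.
        while state_queue:
            _, state, event = heapq.heappop(state_queue)
            execute(state, event)
        # Transition exploration: run remaining events until a new state appears.
        while deferred:
            state, event = deferred.pop(0)
            if execute(state, event):
                break  # new state found: return to state exploration
    return seen, transitions


toy_app: App = {
    "home": {"menu": (0, "menu"), "help": (1, "help"), "refresh": (99, "home")},
    "menu": {"close": (0, "home"), "more": (99, "extra")},
    "help": {},
    "extra": {"back": (0, "menu")},
}
states, executed = crawl(toy_app, "home")
print(sorted(states))
print(executed)
```

In this toy run the deferred `more` event reveals the `extra` state, so the loop switches back to state exploration before finishing.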

    IDENTIFYING EQUIVALENT JAVASCRIPT EVENTS

    Publication No.: CA2786418A1

    Publication date: 2014-02-16

    Application No.: CA2786418

    Filing date: 2012-08-16

    Applicant: IBM CANADA

    Abstract: An illustrative embodiment of a computer-implemented process for identifying equivalent JavaScript events receives source code containing two JavaScript events for equivalency analysis, extracts an HTML element containing an event from each JavaScript event and analyzes the extracted HTML elements. Responsive to a determination that the HTML elements are of a same type according to equivalency criteria B, and responsive to a determination that the HTML elements have a same number of attributes according to equivalency criteria C, the process determines whether the JavaScript function calls of each JavaScript event are similar according to equivalency criteria A. Responsive to a determination that the JavaScript function calls are similar according to equivalency criteria A, and responsive to a determination that the other attributes of the HTML elements satisfy equivalency criteria D, the process identifies the JavaScript events as equivalent.
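
The order of checks (B, C, A, D) can be illustrated with the sketch below, which uses Python's standard `html.parser`. The abstract does not spell out the four criteria, so the concrete tests are assumptions: A is taken to mean "same called function name, arguments ignored" and D "same set of attribute names". The function names are hypothetical.

```python
import re
from html.parser import HTMLParser
from typing import Dict, Optional, Tuple

Element = Tuple[str, Dict[str, Optional[str]]]


class _FirstElement(HTMLParser):
    """Captures the tag name and attributes of the first start tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.element: Optional[Element] = None

    def handle_starttag(self, tag, attrs):
        if self.element is None:
            self.element = (tag, dict(attrs))


def parse_element(html: str) -> Element:
    parser = _FirstElement()
    parser.feed(html)
    if parser.element is None:
        raise ValueError("no HTML element found")
    return parser.element


def called_function(handler: Optional[str]) -> str:
    """Name of the first function called in an inline handler, or ''."""
    match = re.match(r"\s*([A-Za-z_$][\w$.]*)\s*\(", handler or "")
    return match.group(1) if match else ""


def equivalent_events(html_a: str, html_b: str, event: str = "onclick") -> bool:
    tag_a, attrs_a = parse_element(html_a)
    tag_b, attrs_b = parse_element(html_b)
    if tag_a != tag_b:                      # criteria B: same element type
        return False
    if len(attrs_a) != len(attrs_b):        # criteria C: same attribute count
        return False
    name_a = called_function(attrs_a.get(event))
    name_b = called_function(attrs_b.get(event))
    if not name_a or name_a != name_b:      # criteria A (assumed): same function
        return False
    return set(attrs_a) == set(attrs_b)     # criteria D (assumed): same names


row_1 = '<a href="#" onclick="showItem(1)" class="row">One</a>'
row_2 = '<a href="#" onclick="showItem(2)" class="row">Two</a>'
delete = '<a href="#" onclick="deleteItem(2)" class="row">Delete</a>'
button = '<button onclick="showItem(3)">Three</button>'
print(equivalent_events(row_1, row_2))
print(equivalent_events(row_1, delete))
```

Running the cheap structural checks (B and C) before comparing function calls mirrors the ordering in the abstract and avoids parsing handlers for elements that cannot match.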

    IDENTIFICATION OF SEQUENTIAL BROWSING OPERATIONS

    Publication No.: CA2789936A1

    Publication date: 2014-03-14

    Application No.: CA2789936

    Filing date: 2012-09-14

    Applicant: IBM CANADA

    Abstract: An illustrative embodiment of a computer-implemented process for identifying sequential browsing operations receives session data associated with a plurality of sessions, creates a reduced page for each page in a series of pages associated with a first session in the plurality of sessions and creates a hash value associated with each reduced page. Responsive to a determination that a hash value of the first session is equivalent to a hash value of a second session, the process identifies the associated page as an equivalent page and merges equivalent pages to create a common sequence, without a need to resend requests associated with the session data to a server.
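
A rough sketch of the reduce, hash, and merge steps follows. The reduction rule is an assumption: a page is reduced to its sequence of tag names, discarding text and attribute values. Aligning the two hash sequences with the standard library's `difflib.SequenceMatcher` is also a choice made for this illustration, not something stated in the abstract.

```python
import hashlib
from difflib import SequenceMatcher
from html.parser import HTMLParser
from typing import List, Tuple


class _Skeleton(HTMLParser):
    """Collects tag names only, ignoring text and attribute values."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: List[str] = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def reduced_page(html: str) -> str:
    parser = _Skeleton()
    parser.feed(html)
    parser.close()
    return ">".join(parser.tags)


def page_hash(html: str) -> str:
    return hashlib.sha256(reduced_page(html).encode("utf-8")).hexdigest()


def common_sequence(session_a: List[str], session_b: List[str]) -> List[Tuple[int, int]]:
    """Pairs of page indexes (a, b) whose reduced pages hash identically, in order."""
    hashes_a = [page_hash(page) for page in session_a]
    hashes_b = [page_hash(page) for page in session_b]
    matcher = SequenceMatcher(a=hashes_a, b=hashes_b, autojunk=False)
    pairs = []
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            pairs.append((block.a + offset, block.b + offset))
    return pairs


first = [
    "<html><body><h1>Login</h1><form><input></form></body></html>",
    "<html><body><ul><li>Alice</li></ul></body></html>",
    "<html><body><p>Bye</p></body></html>",
]
second = [
    "<html><body><h1>Sign in</h1><form><input></form></body></html>",
    "<html><body><table><tr><td>x</td></tr></table></body></html>",
    "<html><body><p>Goodbye Bob</p></body></html>",
]
print(common_sequence(first, second))
```

Because only recorded pages are compared, the alignment runs offline and no request is replayed against the server.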

    IDENTIFYING EQUIVALENT LINKS ON A PAGE

    Publication No.: CA2781391A1

    Publication date: 2013-12-26

    Application No.: CA2781391

    Filing date: 2012-06-26

    Applicant: IBM CANADA

    Abstract: An illustrative embodiment of a computer-implemented process for identifying equivalent links on a page, responsive to a determination that the crawler has not visited all required universal resource locators (URLs), locates a next URL to be crawled to form a current URL and processes the current URL to identify equivalent URLs. Responsive to a determination that the crawler has not visited the current URL, the process determines whether it is necessary to crawl all identified equivalent URLs and, responsive to a determination that it is necessary, adds all equivalent URLs to a list of URLs to be crawled.
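
The decision described above can be sketched as follows. The equivalence test is an assumption made for illustration: two URLs are treated as equivalent when they share host, path, and query parameter names, whatever the parameter values. `equivalence_key` and `plan_crawl` are hypothetical names.

```python
from typing import Iterable, List, Tuple
from urllib.parse import parse_qsl, urlsplit

Key = Tuple[str, str, Tuple[str, ...]]


def equivalence_key(url: str) -> Key:
    """Assumed equivalence class: host, path, and sorted query parameter names."""
    parts = urlsplit(url)
    names = {name for name, _ in parse_qsl(parts.query, keep_blank_values=True)}
    return (parts.netloc.lower(), parts.path, tuple(sorted(names)))


def plan_crawl(links: Iterable[str], crawl_all_equivalents: bool = False) -> List[str]:
    """Return the URLs to crawl, skipping equivalents unless all are required."""
    to_crawl: List[str] = []
    visited = set()
    seen_keys = set()
    for url in links:
        if url in visited:
            continue  # exact URL already handled
        visited.add(url)
        key = equivalence_key(url)
        if key in seen_keys and not crawl_all_equivalents:
            continue  # an equivalent URL is already scheduled
        seen_keys.add(key)
        to_crawl.append(url)
    return to_crawl


page_links = [
    "http://shop.example/item?id=1",
    "http://shop.example/item?id=2",
    "http://shop.example/about",
    "http://shop.example/item?id=1",
]
print(plan_crawl(page_links))
print(plan_crawl(page_links, crawl_all_equivalents=True))
```

With the flag off, one representative per equivalence class is crawled; with it on, every distinct URL is queued.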

    IDENTIFYING UNVISITED PORTIONS OF VISITED INFORMATION

    Publication No.: CA2779235A1

    Publication date: 2013-12-06

    Application No.: CA2779235

    Filing date: 2012-06-06

    Applicant: IBM CANADA

    Abstract: An illustrative embodiment for identifying unvisited portions of visited information receives information to crawl, wherein the information is representative of one of web-based information and non-web-based information, computes a locality sensitive hash (LSH) value for the received information and identifies the most similar information visited thus far. The illustrative embodiment determines whether the LSH of the received information is equivalent to that of the most similar information visited thus far and, responsive to a determination that it is not equivalent, identifies a visited portion of the received information using information for the most similar information visited thus far and crawls only the unvisited portions of the received information.
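
The sketch below illustrates the idea with SimHash, one well-known locality sensitive hash; the abstract does not name a specific LSH scheme, so this choice is an assumption. Documents are modeled as lists of text blocks, and "visited portion" is approximated as "blocks that also appear in the nearest visited document". All function names are hypothetical.

```python
import hashlib
from typing import List


def simhash(text: str, bits: int = 64) -> int:
    """SimHash fingerprint over whitespace-separated tokens."""
    weights = [0] * bits
    for token in text.split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        for bit in range(bits):
            weights[bit] += 1 if (value >> bit) & 1 else -1
    return sum(1 << bit for bit in range(bits) if weights[bit] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def unvisited_portions(blocks: List[str], visited: List[List[str]]) -> List[str]:
    """Blocks of the received document that still need to be crawled."""
    if not visited:
        return list(blocks)  # nothing seen yet: crawl everything
    fingerprint = simhash(" ".join(blocks))
    nearest = min(
        visited, key=lambda doc: hamming(fingerprint, simhash(" ".join(doc)))
    )
    if simhash(" ".join(nearest)) == fingerprint:
        return []  # treated as equivalent to something already visited
    known = set(nearest)
    return [block for block in blocks if block not in known]


history = [
    ["header nav links", "article about cats", "footer"],
    ["login form"],
]
received = ["header nav links", "article about dogs", "footer"]
print(unvisited_portions(received, history))
```

Similar documents yield fingerprints that differ in few bits, which is what makes the nearest-neighbor lookup meaningful; a production system would index fingerprints rather than scan every visited document.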

    IDENTIFYING UNIVERSAL RESOURCE LOCATOR REWRITING RULES

    Publication No.: CA2702351A1

    Publication date: 2010-10-07

    Application No.: CA2702351

    Filing date: 2010-05-14

    Applicant: IBM CANADA

    Inventor: IONESCU PAUL

    Abstract: An illustrative embodiment of a computer-implemented process for identifying universal resource locator rewriting rules receives input of universal resource locators of an application to form received universal resource locators, represents the received universal resource locators in a specialized graph and applies analysis algorithms and heuristics to properties of the specialized graph. The process further identifies universal resource locator rewriting patterns using the specialized graph to form detected patterns and generates rewrite rules corresponding to the detected patterns.
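
As a loose illustration of graph-based pattern detection, the sketch below stores URL paths in a trie and applies one crude heuristic: a node with several children that all have the same sub-structure probably holds a rewritten parameter. The trie, the heuristic, the `min_variants` threshold, and the `{param}` notation are all assumptions for this example; the abstract does not describe the specialized graph or its heuristics, and the sketch stops at detected patterns rather than generating rewrite rules.

```python
from typing import Dict, Iterable, List
from urllib.parse import urlsplit

Trie = Dict[str, "Trie"]


def build_trie(urls: Iterable[str]) -> Trie:
    """Insert each URL path into a nested dictionary, one level per segment."""
    trie: Trie = {}
    for url in urls:
        node = trie
        for segment in filter(None, urlsplit(url).path.split("/")):
            node = node.setdefault(segment, {})
    return trie


def detect_patterns(trie: Trie, min_variants: int = 3, prefix: str = "") -> List[str]:
    """Report path prefixes whose next segment looks like a parameter value."""
    shapes = {tuple(sorted(child)) for child in trie.values()}
    if len(trie) >= min_variants and len(shapes) == 1:
        # Many siblings with identical sub-structure: likely a parameter slot.
        return [prefix + "/{param}"]
    patterns: List[str] = []
    for segment, child in sorted(trie.items()):
        patterns.extend(detect_patterns(child, min_variants, prefix + "/" + segment))
    return patterns


site_urls = [
    "http://shop.example/products/101",
    "http://shop.example/products/102",
    "http://shop.example/products/103",
    "http://shop.example/about",
    "http://shop.example/contact",
]
print(detect_patterns(build_trie(site_urls)))
```

The heuristic is easy to fool (three unrelated leaf pages under one folder would also match), which is why a practical detector would combine several signals before emitting a rule.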
