-
Publication No.: CA2786418C
Publication date: 2020-04-14
Application No.: CA2786418
Application date: 2012-08-16
Applicant: IBM CANADA LTD IBM CANADA LIMITEE
Inventor: ONUT IOSIF VIOREL , IONESCU PAUL , BRAKE NEVON CHRISTOPHER , SMITH WAYNE DUNCAN , DINCTURK MUSTAFA EMRE , TAHERI SEYED M MIR , JOURDAN GUY-VINCENT , BOCHMANN GREGOR VON
Abstract: An illustrative embodiment of a computer-implemented process for identifying equivalent JavaScript events receives source code containing two JavaScript events for equivalency analysis, extracts an HTML element containing an event from each JavaScript event and analyzes the extracted HTML elements. Responsive to a determination that the HTML elements are of a same type according to equivalency criteria B, and responsive to a determination that the HTML elements have a same number of attributes according to equivalency criteria C, determines whether JavaScript function calls of each JavaScript event are similar according to equivalency criteria A. Responsive to a determination that the JavaScript function calls are similar according to equivalency criteria A, and responsive to a determination that the other attributes of the HTML elements satisfy equivalency criteria D, identifies the JavaScript events as equivalent.
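The order of the checks in this abstract can be sketched in Python as below. The abstract names criteria A through D without defining A and D, so the rules used here are assumptions for illustration only: A is taken to mean "same called function name, arguments ignored" and D to mean "same set of other attribute names". The element representation (a dict with `tag` and `attrs`) and the function names are hypothetical as well.

```python
import re

def called_function(handler):
    """Return the name of the function called at the start of a handler string, or None."""
    match = re.match(r"\s*([A-Za-z_$][\w$.]*)\s*\(", handler)
    return match.group(1) if match else None

def events_equivalent(a, b, event="onclick"):
    """Compare two elements given as {"tag": str, "attrs": dict} (hypothetical shape)."""
    # Criteria B: the HTML elements are of the same type.
    if a["tag"].lower() != b["tag"].lower():
        return False
    # Criteria C: the HTML elements have the same number of attributes.
    if len(a["attrs"]) != len(b["attrs"]):
        return False
    # Criteria A (assumed rule): both handlers call the same function.
    fa = called_function(a["attrs"].get(event, ""))
    fb = called_function(b["attrs"].get(event, ""))
    if fa is None or fa != fb:
        return False
    # Criteria D (assumed rule): the remaining attribute names match.
    return set(a["attrs"]) - {event} == set(b["attrs"]) - {event}
```

For example, two `a` elements whose handlers are `showItem(1)` and `showItem(2)` compare as equivalent under these assumed rules, while `showItem(1)` and `deleteItem(1)` do not.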
-
Publication No.: CA2789936C
Publication date: 2020-02-18
Application No.: CA2789936
Application date: 2012-09-14
Applicant: IBM CANADA LTD IBM CANADA LIMITEE
Inventor: IONESCU PAUL , ONUT IOSIF VIOREL
Abstract: An illustrative embodiment of a computer-implemented process for identifying sequential browsing operations receives session data associated with a plurality of sessions, creates a reduced page for each page in a series of pages associated with a first session in the plurality of sessions and creates a hash value associated with each reduced page for each page in the series of pages associated with the first session of the plurality of sessions. Responsive to a determination that the hash value of the first session is equivalent to the hash value of the second session, the computer-implemented process identifies an associated page as an equivalent page and merges equivalent pages to create a common sequence without a need to resend requests associated with the session data to a server.
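A minimal Python sketch of the reduce, hash, and merge steps follows. The abstract does not say how a page is reduced, so `reduce_page` assumes a reduction that keeps only the sequence of tag names; the merge shown is a simple order-preserving de-duplication by hash, which is one possible reading of "merges equivalent pages". All function names are hypothetical.

```python
import hashlib
import re

def reduce_page(html):
    """Assumed reduction: keep only the sequence of tag names, dropping text and attributes."""
    tags = re.findall(r"<\s*(/?[a-zA-Z][a-zA-Z0-9]*)", html)
    return " ".join(tag.lower() for tag in tags)

def page_hash(html):
    """Hash of the reduced page."""
    return hashlib.sha256(reduce_page(html).encode("utf-8")).hexdigest()

def merge_sessions(first, second):
    """Merge two page sequences, treating pages with equal hashes as equivalent."""
    merged, seen = [], set()
    for page in list(first) + list(second):
        digest = page_hash(page)
        if digest not in seen:  # first page seen with this reduced form
            seen.add(digest)
            merged.append(page)
    return merged
```

Because the comparison works on recorded session data only, no request has to be sent to a server to decide that two pages are equivalent.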
-
Publication No.: CA3120833A1
Publication date: 2013-12-26
Application No.: CA3120833
Application date: 2012-06-26
Applicant: IBM CANADA LTD IBM CANADA LIMITEE
Inventor: ONUT IOSIF VIOREL , IONESCU PAUL , AYOUB KHALIL ANDREW , SMITH WAYNE DUNCAN
Abstract: A computer-implemented process for identifying equivalent links on a page, responsive to a determination that the crawler has not visited all required universal resource locators, locates a next URL to be crawled to form a current URL and processes the current URL to identify equivalent URLs. Responsive to a determination that the crawler has not visited the current URL, the process determines whether it is necessary to crawl all identified equivalent URLs and, responsive to a determination that it is necessary to crawl all identified equivalent URLs, adds all equivalent URLs to a list of URLs to be crawled.
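The decision described here can be illustrated with a short Python sketch. The abstract does not define when two URLs are equivalent, so `equivalence_key` assumes one plausible rule: same scheme, host, and path, and the same set of query parameter names. The names `equivalence_key`, `plan_crawl`, and `crawl_all_equivalents` are hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl

def equivalence_key(url):
    """Assumed rule: URLs are equivalent if they differ only in query parameter values."""
    parts = urlsplit(url)
    names = {name for name, _ in parse_qsl(parts.query, keep_blank_values=True)}
    return (parts.scheme, parts.netloc, parts.path, tuple(sorted(names)))

def plan_crawl(urls, crawl_all_equivalents=False):
    """Return the URLs to crawl, keeping equivalents only when that is required."""
    to_crawl, seen_keys = [], set()
    for url in urls:
        key = equivalence_key(url)
        if key in seen_keys and not crawl_all_equivalents:
            continue  # an equivalent URL is already scheduled
        seen_keys.add(key)
        to_crawl.append(url)
    return to_crawl
```

With the default setting, `item?id=1` and `item?id=2` on the same host collapse to a single crawl target; passing `crawl_all_equivalents=True` schedules both.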
-
Publication No.: CA2789909C
Publication date: 2019-09-10
Application No.: CA2789909
Application date: 2012-09-14
Applicant: IBM CANADA LTD IBM CANADA LIMITEE
Inventor: IONESCU PAUL , ONUT IOSIF VIOREL , AYOUB KHALIL ANDRES , MIRMOVITCH GIL
IPC: H04L29/06 , G06F16/83 , G06F16/958 , H04L12/16
Abstract: An illustrative embodiment of a computer-implemented process for synchronizing requests with a respective context, responsive to a determination that there are more pages to explore, performs regular crawling operations for a current page, records the current page in a list of explored pages and extracts links from the current page. Responsive to a determination that there are more links to extract, the process selects a next link to analyze to form a selected link and, responsive to a determination that there is a new request associated with the selected link, creates a new request identifier and saves an entry in a hashmap. Responsive to a determination that there is not a new request associated with the selected link, the process updates a request associated with the selected link with a new link value when the link value differs.
-
Publication No.: CA2816781C
Publication date: 2022-07-05
Application No.: CA2816781
Application date: 2013-05-28
Applicant: IBM CANADA LTD IBM CANADA LIMITEE
Inventor: ONUT IOSIF VIOREL , IONESCU PAUL , TRIPP OMER , BYOOKI SEYED ALI MOOSAVI , JOURDAN GUY-VINCENT , BOCHMANN GREGOR VON
IPC: G06F17/00 , G06F16/951
Abstract: An illustrative embodiment of a method for identifying client states receives a set of paths representative of a document object model (DOM) associated with a web page of a rich Internet application and, for each path in the set of paths received, extracts a subtree, as Subtree X, for a current path. The method traverses all known sub-paths under the current path, deletes corresponding subtrees from Subtree X, and reads contents of and determines states of Subtree X to form a State X. The State X is added to a set of current states and, responsive to a determination that no more paths exist, the method returns the set of current states of the rich Internet application.
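The per-path loop can be sketched in Python as follows. Several details are assumptions made for illustration: the DOM is parsed with `xml.etree.ElementTree`, a path is a tuple of child indexes from the root, and a state is the SHA-256 digest of the serialized, pruned subtree. The function names are hypothetical.

```python
import copy
import hashlib
import xml.etree.ElementTree as ET

def node_at(root, path):
    """Follow a tuple of child indexes down from root."""
    node = root
    for index in path:
        node = node[index]
    return node

def client_states(root, paths):
    """Return one state per path, each computed on a subtree with known sub-paths removed."""
    paths = [tuple(p) for p in paths]
    states = set()
    for path in paths:
        subtree = copy.deepcopy(node_at(root, path))  # Subtree X
        # Known sub-paths strictly under the current path, relative to Subtree X.
        subs = [p[len(path):] for p in paths
                if len(p) > len(path) and p[:len(path)] == path]
        # Reverse lexicographic order removes deeper nodes and later siblings
        # first, so the indexes of the remaining sub-paths stay valid.
        for rel in sorted(subs, reverse=True):
            parent = node_at(subtree, rel[:-1])
            parent.remove(parent[rel[-1]])
        states.add(hashlib.sha256(ET.tostring(subtree)).hexdigest())  # State X
    return states
```

Pruning the known sub-paths means a change inside one component of the page alters only that component's state, not the state of the enclosing subtree.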
-
Publication No.: CA2779235C
Publication date: 2019-05-07
Application No.: CA2779235
Application date: 2012-06-06
Applicant: IBM CANADA LTD IBM CANADA LIMITEE
Inventor: ISLAM OBIDUL , ONUT IOSIF VIOREL , IONESCU PAUL , KONDRATOVA EUGENIA
IPC: H04L12/26 , G06F7/00 , G06F16/951 , G06F17/00
Abstract: An illustrative embodiment for identifying unvisited portions of visited information to visit receives information to crawl, wherein the information is representative of one of web based information and non-web based information, computes a locality sensitive hash (LSH) value for the received information and identifies the most similar information visited thus far. The illustrative embodiment determines whether the LSH of the received information is equivalent to that of the most similar information visited thus far and, responsive to a determination that it is not equivalent, identifies a visited portion of the received information using information for the most similar information visited thus far and crawls only unvisited portions of the received information.
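As a rough illustration, the Python sketch below uses SimHash as the locality sensitive hash; the abstract does not name a specific LSH scheme, so that choice is an assumption. Documents are modeled as lists of text portions, "equivalent" is taken to mean a Hamming distance at or below a caller-supplied `threshold`, and all function names are hypothetical.

```python
import hashlib

def simhash(text, bits=64):
    """SimHash over whitespace tokens (one possible LSH, assumed here)."""
    counts = [0] * bits
    for token in text.split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            counts[i] += 1 if (value >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if counts[i] > 0)

def hamming(a, b):
    """Number of differing bits between two hash values."""
    return bin(a ^ b).count("1")

def unvisited_portions(received, visited, threshold=0):
    """received: list of portions; visited: list of earlier documents (lists of portions)."""
    if not visited:
        return list(received)  # nothing seen yet, crawl everything
    value = simhash(" ".join(received))
    nearest = min(visited, key=lambda doc: hamming(value, simhash(" ".join(doc))))
    if hamming(value, simhash(" ".join(nearest))) <= threshold:
        return []  # equivalent to visited information, nothing new to crawl
    known = set(nearest)  # portions already visited in the most similar document
    return [portion for portion in received if portion not in known]
```

A page that shares its header and footer with a previously visited page would, under this sketch, be crawled only for the portions that the most similar visited page does not contain.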
-
Publication No.: CA2738289C
Publication date: 2018-05-29
Application No.: CA2738289
Application date: 2011-04-28
Applicant: IBM CANADA LTD IBM CANADA LIMITEE
Inventor: ONUT IOSIF VIOREL , IONESCU PAUL , SEGAL ORY , SMITH WAYNE DUNCAN , JOURDAN GUY-VINCENT , BOCHMANN GREGOR VON
IPC: H04L12/26
Abstract: A computer-implemented process, computer program product, and apparatus for identifying session identification information. A recording is initiated and an operation sequence of interest is performed while recording and the recording ceases. Responsive to a determination that the operation sequence of interest was successful, information from the operation sequence of interest is saved as recorded information and responsive to a determination that a same operation sequence of interest was recorded, the recorded information from each operation sequence of interest is compared. Differences in the recorded information are identified to form identified differences and a session identifier is constructed using the identified differences.
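The comparison step can be illustrated with a few lines of Python. The sketch assumes each successful recording of the same operation sequence has been flattened into a dict of parameter names to values (cookies, form fields, and so on); that representation and the function name are hypothetical.

```python
def find_session_identifiers(recording_a, recording_b):
    """Return the names of parameters whose values differ between two recordings."""
    shared = recording_a.keys() & recording_b.keys()
    # Values that change between otherwise identical runs are candidate
    # session identification information.
    return sorted(name for name in shared if recording_a[name] != recording_b[name])
```

For two recordings of the same login that differ only in the value of a `JSESSIONID` entry, the function returns `["JSESSIONID"]`.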
-
Publication No.: CA3120755C
Publication date: 2022-12-06
Application No.: CA3120755
Application date: 2012-06-26
Applicant: IBM CANADA LTD IBM CANADA LIMITEE
Inventor: ONUT IOSIF VIOREL , IONESCU PAUL , AYOUB KHALIL ANDREW , SMITH WAYNE DUNCAN
IPC: H04L12/16 , G06F16/951 , G06F16/955
Abstract: An illustrative embodiment of a computer-implemented process for identifying equivalent links on a page, responsive to a determination that the crawler has not visited all required universal resource locators, locates a next URL to be crawled to form a current URL and processes the current URL to identify equivalent URLs. Responsive to a determination that the crawler has not visited the current URL, the process determines whether it is necessary to crawl all identified equivalent URLs and, responsive to a determination that it is necessary to crawl all identified equivalent URLs, adds all equivalent URLs to a list of URLs to be crawled.
-
Publication No.: CA2788100C
Publication date: 2022-07-05
Application No.: CA2788100
Application date: 2012-08-28
Applicant: IBM CANADA LTD IBM CANADA LIMITEE
Inventor: MATHIEU JEROME SIMON , ONUT IOSIF VIOREL
IPC: G06F17/00
Abstract: An illustrative embodiment of a computer-implemented process for selective processing of items having embedded delay actions, receives an item to process containing a delay action, processes the item using a delay action process, wherein the delay action process comprises exploring dynamically generated server-side content of the item received, by recognizing when a wait occurs for a server process, and performing one of a wait for a predetermined period of time, or circumventing an actual wait, to generate a result and returns the result to a requester.
-
Publication No.: CA2781391C
Publication date: 2021-08-03
Application No.: CA2781391
Application date: 2012-06-26
Applicant: IBM CANADA LTD IBM CANADA LIMITEE
Inventor: ONUT IOSIF VIOREL , IONESCU PAUL , AYOUB KHALIL ANDREW , SMITH WAYNE DUNCAN
IPC: H04L12/16 , G06F16/951 , G06F16/955
Abstract: An illustrative embodiment of a computer-implemented process for identifying equivalent links on a page, responsive to a determination that the crawler has not visited all required universal resource locators, locates a next URL to be crawled to form a current URL and processes the current URL to identify equivalent URLs. Responsive to a determination that the crawler has not visited the current URL, the process determines whether it is necessary to crawl all identified equivalent URLs and, responsive to a determination that it is necessary to crawl all identified equivalent URLs, adds all equivalent URLs to a list of URLs to be crawled.