-
Publication No.: US20200250179A1
Publication date: 2020-08-06
Application No.: US16826113
Filing date: 2020-03-20
Applicant: Cloudera, Inc.
Inventor: Rituparna Agrawal, Anupam Singh, Prithviraj Pandian
IPC: G06F16/248, G06F16/2455, G06F16/28, G06F16/21, G06F16/84
Abstract: Systems and methods for very fast grouping of “similar” SQL queries according to user-supplied similarity criteria are disclosed. The user-supplied similarity criteria include a threshold quantifying the degree of similarity between SQL queries and common artifacts included in the queries. A similarity-characterizing data structure is disclosed that allows for the very fast grouping of “similar” SQL queries. Because the computation is distributed among multiple compute nodes, a small cluster of compute nodes takes a short time to compute the similarity-characterizing data on a workload of tens of millions of queries. The user can supply the similarity criteria through a UI or a command line tool. Furthermore, in some embodiments, the user can adjust the degree of similarity by supplying new similarity criteria. Accordingly, the system can display, in real time or near real time, updated SQL groupings corresponding to the newly supplied similarity criteria using the originally computed similarity-characterizing data structure.
-
Publication No.: US10599664B2
Publication date: 2020-03-24
Application No.: US15495397
Filing date: 2017-04-24
Applicant: Cloudera, Inc.
Inventor: Rituparna Agrawal, Anupam Singh, Prithviraj Pandian
IPC: G06F16/248, G06F16/84, G06F16/21, G06F16/28, G06F16/2455
Abstract: Systems and methods for very fast grouping of “similar” SQL queries according to user-supplied similarity criteria. The user-supplied similarity criteria include a threshold quantifying the degree of similarity between SQL queries and common artifacts included in the queries. A similarity-characterizing data structure allows for the very fast grouping of “similar” SQL queries. Because the computation is distributed among multiple compute nodes, a small cluster of compute nodes takes a short time to compute the similarity-characterizing data on a workload of tens of millions of queries. The user can supply the similarity criteria through a UI or a command line tool. Furthermore, the user can adjust the degree of similarity by supplying new similarity criteria. Accordingly, the system can display in real time or near real time, updated SQL groupings corresponding to the newly supplied similarity criteria using the originally computed similarity-characterizing data structure.
-
Publication No.: US20230350906A1
Publication date: 2023-11-02
Application No.: US18127322
Filing date: 2023-03-28
Applicant: Cloudera, Inc.
Inventor: Rituparna Agrawal, Anupam Singh, Prithviraj Pandian
IPC: G06F16/248, G06F16/84, G06F16/21, G06F16/28, G06F16/2455
CPC classification number: G06F16/248, G06F16/86, G06F16/211, G06F16/285, G06F16/2455
Abstract: Systems and methods for very fast grouping of “similar” SQL queries according to user-supplied similarity criteria. The user-supplied similarity criteria include a threshold quantifying the degree of similarity between SQL queries and common artifacts included in the queries. A similarity-characterizing data structure allows for the very fast grouping of “similar” SQL queries. Because the computation is distributed among multiple compute nodes, a small cluster of compute nodes takes a short time to compute the similarity-characterizing data on a workload of tens of millions of queries. The user can supply the similarity criteria through a UI or a command line tool. Furthermore, the user can adjust the degree of similarity by supplying new similarity criteria. Accordingly, the system can display in real time or near real time, updated SQL groupings corresponding to the newly supplied similarity criteria using the originally computed similarity-characterizing data structure.
-
Publication No.: US20170308592A1
Publication date: 2017-10-26
Application No.: US15495397
Filing date: 2017-04-24
Applicant: Cloudera, Inc.
Inventor: Rituparna Agrawal, Anupam Singh, Prithviraj Pandian
IPC: G06F17/30
CPC classification number: G06F16/248, G06F16/211, G06F16/2455, G06F16/285, G06F16/86
Abstract: Systems and methods for very fast grouping of “similar” SQL queries according to user-supplied similarity criteria are disclosed. The user-supplied similarity criteria include a threshold quantifying the degree of similarity between SQL queries and common artifacts included in the queries. A similarity-characterizing data structure is disclosed that allows for the very fast grouping of “similar” SQL queries. Because the computation is distributed among multiple compute nodes, a small cluster of compute nodes takes a short time to compute the similarity-characterizing data on a workload of tens of millions of queries. The user can supply the similarity criteria through a UI or a command line tool. Furthermore, in some embodiments, the user can adjust the degree of similarity by supplying new similarity criteria. Accordingly, the system can display, in real time or near real time, updated SQL groupings corresponding to the newly supplied similarity criteria using the originally computed similarity-characterizing data structure.
-
Publication No.: US11645294B2
Publication date: 2023-05-09
Application No.: US16826113
Filing date: 2020-03-20
Applicant: Cloudera, Inc.
Inventor: Rituparna Agrawal, Anupam Singh, Prithviraj Pandian
IPC: G06F16/248, G06F16/84, G06F16/21, G06F16/28, G06F16/2455
CPC classification number: G06F16/248, G06F16/211, G06F16/2455, G06F16/285, G06F16/86
Abstract: Systems and methods for very fast grouping of “similar” SQL queries according to user-supplied similarity criteria. The user-supplied similarity criteria include a threshold quantifying the degree of similarity between SQL queries and common artifacts included in the queries. A similarity-characterizing data structure allows for the very fast grouping of “similar” SQL queries. Because the computation is distributed among multiple compute nodes, a small cluster of compute nodes takes a short time to compute the similarity-characterizing data on a workload of tens of millions of queries. The user can supply the similarity criteria through a UI or a command line tool. Furthermore, the user can adjust the degree of similarity by supplying new similarity criteria. Accordingly, the system can display in real time or near real time, updated SQL groupings corresponding to the newly supplied similarity criteria using the originally computed similarity-characterizing data structure.
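Illustrative note: The abstracts above describe computing similarity data once and then regrouping queries whenever the user supplies a new threshold. The following is a minimal Python sketch of that general idea only, not the patented method. Everything in it is an assumption made for illustration: the crude `artifacts` extractor (table names after FROM/JOIN), the choice of Jaccard similarity, the edge list standing in for the "similarity-characterizing data structure", and the union-find grouping. It is also single-process, whereas the abstracts describe distributing the computation across compute nodes.

```python
from itertools import combinations


def artifacts(sql):
    """Hypothetical, crude artifact extraction: the identifier after FROM or JOIN."""
    tokens = sql.replace(",", " ").split()
    return {
        tokens[i + 1].lower()
        for i, tok in enumerate(tokens[:-1])
        if tok.upper() in ("FROM", "JOIN")
    }


def build_similarity(queries):
    """Compute pairwise Jaccard similarity once; returns (similarity, i, j) edges."""
    sets = [artifacts(q) for q in queries]
    edges = []
    for i, j in combinations(range(len(queries)), 2):
        union = sets[i] | sets[j]
        if union:
            edges.append((len(sets[i] & sets[j]) / len(union), i, j))
    return edges


def group(n, edges, threshold):
    """Group query indices by linking pairs whose similarity meets the threshold."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for sim, i, j in edges:
        if sim >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for q in range(n):
        groups.setdefault(find(q), []).append(q)
    return sorted(groups.values())


queries = [
    "SELECT a FROM orders JOIN customers ON x",
    "SELECT b FROM orders JOIN customers ON y",
    "SELECT c FROM orders",
    "SELECT d FROM logs",
]
edges = build_similarity(queries)       # computed once
strict = group(len(queries), edges, 0.9)  # regrouping reuses the same edges
loose = group(len(queries), edges, 0.5)
```

In this sketch, changing the threshold only reruns `group` over the stored edges, which is the property the abstracts attribute to reusing the originally computed similarity data.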