Abstract:
The present disclosure relates to methods and systems for compressing workloads for use with index tuning. The methods and systems receive a workload with a plurality of queries. The methods and systems represent each query using query features and a utility. The methods and systems select a query for a query subset based on a benefit of the query determined using the query features and the utility. The methods and systems update the features and the utility of the remaining queries in the workload and select another query to add to the query subset based on an updated benefit determined using the updated features and utilities. The methods and systems select queries for the query subset equal to a received query subset size. The methods and systems use the query subset in index tuning to provide one or more indexes to recommendations.
Abstract:
The present invention extends to methods, systems, and computer program products for understanding tables for search. Aspects of the invention include identifying a subject tuple (e.g., a subject column) for a table, detecting a tuple header (e.g., a column header) using other tables, and detecting a tuple header (e.g., a column header) using a knowledge base. Implementations can be utilized in a structured data search system (SDSS) that indexes structured information, such as, tables in a relational database or html tables extracted from web pages. The SDSS allows users to search over the structured information (tables) using different mechanisms including keyword search and data finding data.
Abstract:
Systems, methods, and computer-executable instructions for creating a query execution plan for a query of a database includes receiving, from the database, a set of previously executed query execution plans for the query. Each previously-executed query execution plans includes subplans. Each subplan indicates a tree of physical operators. Physical operators that executed in the set of previously-executed query execution plans are determined. For each physical operator, an execution cost based is determined. Invalid physical operators from the previously-executed query execution plans that are invalid for the database are removed. Equivalent subplans from the previously-executed query execution plans are identified based on physical properties and logical expressions of the subplans. A constrained search space is created based on the equivalent subplans. A query execution plan for the query is constructed from the constrained search space based on the execution cost. The constructed query execution plan is not within the previously-executed query execution plans.
Abstract:
Systems and techniques for leveraging query executions to improve index recommendations are described herein. In an example, a machine learning model is adapted to receive a first query plan and a second query plan for performing a query with a database, where the first query plan is different from the second query plan. The machine learning model may be further adapted to determine execution cost efficiency between the first query plan and the second query plan. The machine learning model is trained using relative execution cost comparisons between a set of pairs of query plans for the database. The machine learning model is further adapted to output a ranking of the first query plan and second query plan, where the first query plan and second query plan are ranked based on execution cost efficiency.
Abstract:
In one embodiment, datasets are stored in a catalog. The datasets are enriched by establishing relationships among the domains in different datasets. A user searches for relevant datasets by providing examples of the domains of interest. The system identifies datasets corresponding to the user-provided examples. The system them identifies connected subsets of the datasets that are directly linked or indirectly linked through other domains. The user provides known relationship examples to filter the connected subsets and to identify the connected subsets that are most relevant to the user's query. The selected connected subsets may be further analyzed by business intelligence/analytics to create pivot tables or to process the data.
Abstract:
Methods and systems for joining two tables are provided. At least two tables to be joined are received. A joinable row pair between the at least two tables is determined. The determined joinable row pair includes a first row from a first table having a common string value with a second row from a second table of the at least two tables. A transformation model is generated from the determined joinable row pair. A column of the first table is transformed based on the generated transformation model. The transformed first table is joined with the second table.
Abstract:
Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a transformation function is executed using an example input value to obtain an initial output value. Thereafter, a plurality of supplemental transformation tools is applied to the initial output value to generate a plurality of intermediary output values. Based on a comparison of each of the intermediary output values to an example output value, the supplemental transformation tool that generated an intermediary output value having a greatest extent of similarity to the example output values is identified. The identified supplemental transformation tool and the transformation function are used to generate a transformation program that transforms the example input values to the desired form in which to transform data.
Abstract:
Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a set of example values are received. A repository of transformation tools is searched to identify a new transformation tool as relevant to a data transformation associated with the received set of example values. The repository includes annotations associated with the new transformation tool. The new transformation tool is used to generate a transformation program that produces transformed output values. Additional annotations are generated for the new transformation tool based on the transformed output values.
Abstract:
Techniques for tenant performance isolation in a multiple-tenant database management system are described. These techniques may include providing a reservation of server resources. The server resources reservation may include a reservation of a central processing unit (CPU), a reservation of Input/Ouput throughput, and/or a reservation of buffer pool memory or working memory. The techniques may also include a metering mechanism that determines whether the resource reservation is satisfied. The metering mechanism may be independent of an actual resource allocation mechanism associated with the server resource reservation.
Abstract:
In one embodiment, datasets are stored in a catalog. The datasets are enriched by establishing relationships among the domains in different datasets. A user searches for relevant datasets by providing examples of the domains of interest. The system identifies datasets corresponding to the user-provided examples. The system them identifies connected subsets of the datasets that are directly linked or indirectly linked through other domains. The user provides known relationship examples to filter the connected subsets and to identify the connected subsets that are most relevant to the user's query. The selected connected subsets may be further analyzed by business intelligence/analytics to create pivot tables or to process the data.