Abstract:
A computer-implemented method and system are provided, including executing an application programming interface (API) in a network switch to define at least one of one or more database functions, performing, using one or more processors, the one or more database functions on at least a portion of data contained in a data message received at the switch, to generate result data, and routing the result data to one or more destination nodes. A database function-defined network switch includes a network switch and one or more processors to perform a pre-defined database function on query data contained in data messages received at the switch, to produce result data, wherein the pre-defined database function is performed on the query data in a first mode of operation to a state of full completion, generating complete result data and no skipped query data, or to a state of partial completion, generating partially completed result data and skipped query data.
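The full and partial completion states can be illustrated with a minimal sketch. It assumes a filter as the database function and a hypothetical per-message row `budget` as the reason for partial completion; the abstract does not say what causes partial completion, and `SwitchResult` and `apply_in_switch` are illustrative names only.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SwitchResult:
    result: List[int]                                   # rows that passed the function
    skipped: List[int] = field(default_factory=list)    # rows left unprocessed

    @property
    def complete(self) -> bool:
        # Full completion means no query data was skipped.
        return not self.skipped


def apply_in_switch(rows: List[int],
                    predicate: Callable[[int], bool],
                    budget: int) -> SwitchResult:
    """Apply a filter to at most `budget` rows (hypothetical limit).

    Rows beyond the budget are returned as skipped query data so that a
    destination node could finish the work.
    """
    processed, skipped = rows[:budget], rows[budget:]
    return SwitchResult([r for r in processed if predicate(r)], list(skipped))


def is_even(r: int) -> bool:
    return r % 2 == 0


full = apply_in_switch([1, 2, 3, 4], is_even, budget=10)
partial = apply_in_switch([1, 2, 3, 4], is_even, budget=2)
print(full.complete, full.result)
print(partial.complete, partial.result, partial.skipped)
```

In this sketch the destination node would receive both the partial result and the skipped rows and apply the same function to the remainder.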
Abstract:
The present technology relates to managing data caching in processing nodes of a massively parallel processing (MPP) database system. A directory is maintained that includes a list and a storage location of the data pages in the MPP database system. The processing nodes monitor memory usage by exchanging memory usage information with each other. Each of the processing nodes manages a list and a corresponding amount of available memory in each of the processing nodes based on the memory usage information. Data pages are read from a memory of the processing nodes in response to receiving a request to fetch the data pages, and a remote memory manager is queried for available memory in each of the processing nodes in response to receiving the request. The data pages are distributed to the memory of the processing nodes having sufficient space available for storage during data processing.
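As a rough illustration of the distribution step, the sketch below places each page on a node that has enough free memory and records the choice in a directory. The "most free memory first" policy, the function name `place_pages`, and the sizes are assumptions for illustration; the abstract only requires that the chosen node have sufficient space.

```python
from typing import Dict, Tuple


def place_pages(pages: Dict[str, int],
                free_memory: Dict[str, int]) -> Tuple[Dict[str, str], Dict[str, int]]:
    """Return (directory, remaining free memory).

    pages:       page id -> size
    free_memory: node id -> available memory, as learned from the
                 exchanged memory usage information
    """
    free = dict(free_memory)          # work on a copy
    directory: Dict[str, str] = {}    # page id -> node holding it
    for page_id, size in pages.items():
        candidates = [n for n, avail in free.items() if avail >= size]
        if not candidates:
            continue                  # no node has room; page stays unplaced
        node = max(candidates, key=lambda n: free[n])  # assumed policy
        directory[page_id] = node
        free[node] -= size
    return directory, free


directory, remaining = place_pages(
    {"p1": 40, "p2": 40, "p3": 40},
    {"n1": 50, "n2": 100},
)
print(directory, remaining)
```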
Abstract:
Embodiments are provided herein for using parameterized Intermediate Representation (IR) for just-in-time (JIT) compilation in database query execution engines. In an embodiment, a method supporting query JIT compilation and execution in a database management system includes identifying a central processing unit (CPU) intensive function in a query, and identifying, in the CPU intensive function, one or more parameters. The one or more parameters represent variables with values changeable at different query instances. The CPU intensive function is compiled to a parameterized IR including the one or more parameters. The parameterized IR of the CPU intensive function is saved in a catalog of parameterized IRs.
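The idea of compiling once and reusing the result across query instances can be sketched as follows. Python's built-in `compile` and a code object stand in for a real IR here, which is a deliberate simplification; `catalog`, `get_parameterized`, and the example expression are hypothetical names.

```python
from typing import Callable, Dict, Tuple

# Catalog of parameterized compiled functions, keyed by expression template.
catalog: Dict[str, Callable] = {}


def get_parameterized(template: str, params: Tuple[str, ...]) -> Callable:
    """Compile `template` once, leaving `params` as run-time arguments.

    Values that change between query instances (the parameters) are not
    baked into the compiled object, so one catalog entry serves them all.
    """
    if template not in catalog:
        source = f"lambda row, {', '.join(params)}: {template}"
        catalog[template] = eval(compile(source, "<parameterized-ir>", "eval"))
    return catalog[template]


expr = "row['price'] * rate > threshold"
f1 = get_parameterized(expr, ("rate", "threshold"))
f2 = get_parameterized(expr, ("rate", "threshold"))  # served from the catalog

print(f1({"price": 10}, rate=2, threshold=15))
print(f2({"price": 10}, rate=1, threshold=15))
```

Two query instances that differ only in `rate` and `threshold` share one catalog entry, so the compilation cost is paid once.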
Abstract:
System and method embodiments are provided for using different storage formats for a primary database and its replicas in a database managed replication (DMR) system. As such, the advantages of both formats can be combined with suitable design complexity and implementation. In an embodiment, data is arranged in a sequence of rows and stored in a first storage format at the primary database. The data arranged in the sequence of rows is also stored in a second storage format at the replica database. The sequence of rows is determined according to the first storage format or the second storage format. The first storage format is a row store (RS) and the second storage format is a column store (CS), or vice versa. In an embodiment, the sequence of rows is determined to improve compression efficiency at the CS.
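A small sketch can show why the row sequence matters for the column store. It assumes run-length encoding as the compression scheme and a plain sort as the way the sequence is determined; the abstract names neither, and the helper names are illustrative.

```python
from itertools import groupby
from typing import List, Tuple

Row = Tuple[str, int]


def to_columns(rows: List[Row]) -> List[list]:
    """Pivot a sequence of rows (row store) into per-column lists (column store)."""
    return [list(col) for col in zip(*rows)]


def rle(column: list) -> List[tuple]:
    """Run-length encode a column as (value, run length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(column)]


rows = [("DE", 3), ("US", 1), ("DE", 2), ("US", 4)]

# Both copies use the same row sequence, chosen here to suit the column store.
ordered = sorted(rows)
primary_rs = ordered                # primary: row store
replica_cs = to_columns(ordered)    # replica: column store

print(rle(to_columns(rows)[0]))     # runs in arrival order
print(rle(replica_cs[0]))           # runs after ordering
```

Keeping the same sequence in both formats means row position `i` identifies the same logical row in the primary and in the replica.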
Abstract:
System and method embodiments are provided for multi-version support in indexes in a database. The embodiments enable substantially optimized multi-version support in indexes and avoid backfill of a commit log sequence number (LSN) for a transaction identifier (TxID). In an embodiment, a method in a data processing system for managing a database includes determining with the data processing system whether a record is deleted according to a delete indicator in an index leaf page record corresponding to the record; and determining with the data processing system, when the record is not deleted, whether the record is visible according to a new record indicator in the index leaf page record and according to a comparison of a system commit TxID at the transaction start with a record commit TxID obtained from the index leaf page record.
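The two-step check can be sketched as below. The exact semantics of the new record indicator are not given in the abstract, so this sketch assumes that a record without the indicator is old enough to be visible to every reader, and that a record with it is visible only if its commit TxID does not exceed the system commit TxID captured at transaction start. The field and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeafRecord:
    key: int
    deleted: bool       # delete indicator
    is_new: bool        # new record indicator
    commit_txid: int    # record commit TxID stored in the leaf record


def record_visible(rec: LeafRecord, system_commit_txid_at_start: int) -> bool:
    # Step 1: a deleted record is never returned.
    if rec.deleted:
        return False
    # Step 2 (assumed semantics): records not flagged as new are visible.
    if not rec.is_new:
        return True
    # Flagged records are compared against the reader's starting point.
    return rec.commit_txid <= system_commit_txid_at_start


start_txid = 100
print(record_visible(LeafRecord(1, deleted=True, is_new=False, commit_txid=50), start_txid))
print(record_visible(LeafRecord(2, deleted=False, is_new=True, commit_txid=90), start_txid))
print(record_visible(LeafRecord(3, deleted=False, is_new=True, commit_txid=120), start_txid))
```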
Abstract:
A method includes receiving, by a database system, a query statement and forming a runtime plan tree in accordance with the query statement. The method also includes traversing the runtime plan tree, including determining whether a function node of the runtime plan tree is qualified for just-in-time (JIT) compilation. Additionally, the method includes, upon determining that the function node is qualified for JIT compilation, producing a string key in accordance with a function of the function node and determining whether a compiled object corresponding to the string key is stored in a compiled object cache.
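The traversal and cache lookup can be sketched as follows. The key format (function name plus argument types), the `jit_ok` flag, and the stand-in `compile_function` are assumptions for illustration; the abstract only states that the key is produced in accordance with the node's function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class PlanNode:
    func: str
    arg_types: Tuple[str, ...] = ()
    jit_ok: bool = False                       # qualified for JIT compilation?
    children: List["PlanNode"] = field(default_factory=list)


compiled_cache: Dict[str, Callable] = {}       # string key -> compiled object
compiled_log: List[str] = []                   # records each real compilation


def make_key(node: PlanNode) -> str:
    return f"{node.func}({','.join(node.arg_types)})"


def compile_function(node: PlanNode) -> Callable:
    """Stand-in for a real JIT compiler; just returns a Python callable."""
    compiled_log.append(make_key(node))
    return lambda *args: (node.func, args)


def traverse(node: PlanNode) -> None:
    if node.jit_ok:
        key = make_key(node)
        if key not in compiled_cache:          # compile only on a cache miss
            compiled_cache[key] = compile_function(node)
    for child in node.children:
        traverse(child)


plan = PlanNode("scan", children=[
    PlanNode("add", ("int", "int"), jit_ok=True),
    PlanNode("add", ("int", "int"), jit_ok=True),
])
traverse(plan)
print(compiled_log)
```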
Abstract:
System and method embodiments are provided for consistent read in record-based multi-version concurrency control (MVCC) in database (DB) management systems. In an embodiment, a method in a record-based MVCC DB management system for a snapshot consistent read includes copying a system commit transaction identifier (TxID) and a current log record sequence number (LSN) from a transaction log at a start of a reader without backfilling of a commit LSN of a transaction to records that are changed and without copying an entire transaction table by the reader; and determining whether a record is visible according to a record TxID, the commit TxID and a current LSN, wherein a transaction table is consulted only when the record TxID is equal to or larger than a commit TxID at a transaction start.
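A minimal sketch of the visibility test follows. It assumes that every transaction with a TxID below the copied system commit TxID had already committed when the reader started, and that the transaction table maps a TxID to its commit LSN (or has no entry while the transaction is uncommitted). Those details, and all names, are assumptions rather than the claimed method.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Snapshot:
    commit_txid: int   # system commit TxID copied at reader start
    lsn: int           # current LSN copied at reader start


def snapshot_visible(record_txid: int,
                     snap: Snapshot,
                     txn_table: Dict[int, int]) -> bool:
    # Fast path: no transaction table lookup is needed.
    if record_txid < snap.commit_txid:
        return True
    # Slow path: consult the transaction table only for recent TxIDs.
    commit_lsn = txn_table.get(record_txid)    # None while uncommitted
    return commit_lsn is not None and commit_lsn <= snap.lsn


snap = Snapshot(commit_txid=100, lsn=5000)
txn_table = {101: 4900, 102: 5200}             # TxID -> commit LSN; 103 still open

for txid in (42, 101, 102, 103):
    print(txid, snapshot_visible(txid, snap, txn_table))
```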
Abstract:
Data messages having different priorities may be stored in different communication buffers of a network node. The data messages may then be forwarded from the communication buffers to working buffers as space becomes available in the working buffers. After being forwarded to the working buffers, the data messages may be available to be processed by upper-layer operations of the network node. Priorities may be assigned to the data messages based on a priority level of a query associated with the data messages, a priority level of an upper-layer operation assigned to process the data messages, or combinations thereof.
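The buffering scheme can be sketched as a set of per-priority communication buffers feeding one bounded working buffer. The sketch assumes that lower numbers mean higher priority and that higher-priority buffers are drained first; the abstract does not fix the forwarding order, and the class and method names are illustrative.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Optional


class NetworkNode:
    def __init__(self, priorities: Iterable[int], working_capacity: int) -> None:
        # One communication buffer per priority level, in priority order.
        self.comm: Dict[int, Deque[str]] = {p: deque() for p in sorted(priorities)}
        self.working: Deque[str] = deque()
        self.capacity = working_capacity

    def receive(self, message: str, priority: int) -> None:
        self.comm[priority].append(message)

    def forward(self) -> None:
        """Move messages to the working buffer while space is available."""
        for queue in self.comm.values():       # highest priority first (assumed)
            while queue and len(self.working) < self.capacity:
                self.working.append(queue.popleft())

    def process_one(self) -> Optional[str]:
        """Hand one message to an upper-layer operation, freeing a slot."""
        return self.working.popleft() if self.working else None


node = NetworkNode(priorities=[0, 2], working_capacity=2)
node.receive("low1", 2)
node.receive("high1", 0)
node.receive("low2", 2)
node.forward()
first_batch = list(node.working)
processed = node.process_one()
node.forward()
second_batch = list(node.working)
print(first_batch, processed, second_batch)
```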
Abstract:
A method includes dividing a dataset into partitions by hashing a specified key, selecting a set of distributed file system nodes as a primary node group for storage of the partitions, and causing a primary copy of the partitions to be stored on the primary node group by a distributed storage system file server such that the location of each partition is known by hashing of the specified key.
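The property that a partition's location follows from hashing the key can be sketched as below. It assumes one partition per primary node and SHA-256 as a stable hash; the abstract specifies neither, and the function and node names are hypothetical.

```python
import hashlib
from typing import Dict, List


def partition_of(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def locate(key: str, primary_nodes: List[str]) -> str:
    """Find the node holding a key without consulting any directory."""
    return primary_nodes[partition_of(key, len(primary_nodes))]


def store_primary_copy(records: List[dict], key_field: str,
                       primary_nodes: List[str]) -> Dict[str, List[dict]]:
    """Place each record on the primary node chosen by hashing its key."""
    layout: Dict[str, List[dict]] = {node: [] for node in primary_nodes}
    for record in records:
        layout[locate(record[key_field], primary_nodes)].append(record)
    return layout


nodes = ["node-a", "node-b", "node-c"]
records = [{"id": f"user{i}", "value": i} for i in range(10)]
layout = store_primary_copy(records, "id", nodes)

# Any reader can recompute where a given key lives.
print(locate("user7", nodes))
```

Because placement depends only on the key and the node group, a query that filters or joins on the key can be sent directly to the node that holds the matching partition.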