Abstract:
Techniques are provided for maintaining data persistently in one format, but making that data available to a database server in more than one format. For example, one of the formats in which the data is made available for query processing is based on the on-disk format, while another of the formats in which the data is made available for query processing is independent of the on-disk format. Data that is in the format that is independent of the disk format may be maintained exclusively in volatile memory to reduce the overhead associated with keeping the data in sync with the on-disk format copies of the data.
Abstract:
Techniques are provided for storing in in-memory unit (IMU) in a lower-storage tier and copying the IMU to DRAM when needed for query processing. Techniques are also provided for copying IMUs to lower tiers of storage when evicted from the cache of higher tiers of storage. Techniques are provided for implementing functionality of IMUs within a storage system, to enable database servers to push tasks, such as filtering, to the storage system where the storage system may access IMUs within its own memory to perform the tasks. Metadata associated with a set of data may be used to indicate whether an IMU for the data should be created by the database server machine or within the storage system.
Abstract:
Techniques are provided for maintaining data persistently in one format, but making that data available to a database server in more than one format. For example, one of the formats in which the data is made available for query processing is based on the on-disk format, while another of the formats in which the data is made available for query processing is independent of the on-disk format. Data that is in the format that is independent of the disk format may be maintained exclusively in volatile memory to reduce the overhead associated with keeping the data in sync with the on-disk format copies of the data. Selection of data to be maintained in the volatile memory may be based on various factors. Once selected the data may also be compressed to save space in the volatile memory. The compression level may depend on one or more factors that are evaluated for the selected data. The factors for the selection and compression level of data may be periodically evaluated, and based on the evaluation, the selected data may be removed from the volatile memory or its compression level changed accordingly.
Abstract:
Techniques for activity tracking, data classification, and in-database archiving are described. Activity tracking refers to techniques that collect statistics related to user access patterns, such as the frequency or recency with which users access particular database elements. The statistics gathered through activity tracking can be supplied to data classification techniques to automatically classify the database elements or to assist users with manually classifying the database elements. Then, once the database elements have been classified, in-database archiving techniques can be employed to move database elements to different storage tiers based on the classifications. However, although the techniques related to activity tracking, data classification, and in-database archiving may be used together as described above; each technique may also be practiced separately.
Abstract:
Techniques for activity tracking, data classification, and in-database archiving are described. Activity tracking refers to techniques that collect statistics related to user access patterns, such as the frequency or recency with which users access particular database elements. The statistics gathered through activity tracking can be supplied to data classification techniques to automatically classify the database elements or to assist users with manually classifying the database elements. Then, once the database elements have been classified, in-database archiving techniques can be employed to move database elements to different storage tiers based on the classifications. However, although the techniques related to activity tracking, data classification, and in-database archiving may be used together as described above; each technique may also be practiced separately.
Abstract:
Techniques are provided for more efficiently using the bandwidth of the I/O path between a CPU and volatile memory during the performance of database operation. Relational data from a relational table is stored in volatile memory as column vectors, where each column vector contains values for a particular column of the table. A binary-comparable format may be used to represent each value within a column vector, regardless of the data type associated with the column. The column vectors may be compressed and/or encoded while in volatile memory, and decompressed/decoded on-the-fly within the CPU. Alternatively, the CPU may be designed to perform operations directly on the compressed and/or encoded column vector data. In addition, techniques are described that enable the CPU to perform vector processing operations on the column vector values.
Abstract:
Techniques are provided for providing a data item to a transaction in a multi-versioning system in which the data item may exist on multiple versions of a data block, and were versioning is performed at the granularity of the data block. According to one aspect of the invention, the technique involves locating, within volatile memory, a first version of a data block that includes a first version of the data item. It is then determined whether the first version of the data item is usable by the transaction without respect to whether the first version of the data block is generally usable by the transaction. If the first version of the data item is usable by the transaction, then the data item is established as a candidate that can be provided to the transaction. Thus, the data item within a block may be considered a candidate to be provided to a transaction even when the version of the data block on which the data item resides would otherwise disqualify the data block from being seen by that transaction. If the first version of the data item is not usable by the transaction, then a version of the data item that is usable by the transaction is obtained from a second version of the data block that is different from the first version.
Abstract:
Techniques are provided for determining which data item version to supply to a query. According to the techniques, the determination is made by associating a new field, which indicates the time a data item version was current, with each data item version; associating a new field with each query, which indicates the last change that the query must see made by the transaction to which the query belongs; and determining which data item version to use to answer the query based, in part, on a comparison between the values of the two new fields.