SELECTIVELY READING DATA FROM CACHE AND PRIMARY STORAGE
    31.
    Patent application
    Status: in force

    Publication No.: US20130212332A1

    Publication date: 2013-08-15

    Application No.: US13839251

    Filing date: 2013-03-15

    Abstract: Techniques are provided for using an intermediate cache to provide some of the items involved in a scan operation, while other items involved in the scan operation are provided from primary storage. Techniques are also provided for determining whether to service an I/O request for an item with a copy of the item that resides in the intermediate cache based on factors such as a) an identity of the user for whom the I/O request was submitted, b) an identity of a service that submitted the I/O request, c) an indication of a consumer group to which the I/O request maps, or d) whether the intermediate cache is overloaded. Techniques are also provided for determining whether to store items in an intermediate cache in response to the items being retrieved, based on logical characteristics associated with the requests that retrieve the items.
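
    As an illustration of the per-request decision this abstract describes, the following is a minimal sketch and not the patented implementation. The class names, the bypass sets, and the use of a pending-read count as the overload signal are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IORequest:
    # All field names are hypothetical; the abstract only names the factors.
    item_id: str
    user: str
    service: str
    consumer_group: str


class IntermediateCache:
    """Toy intermediate cache that sits in front of primary storage."""

    def __init__(self, max_pending, bypass_users=(), bypass_services=(),
                 bypass_groups=()):
        self.items = {}                 # item_id -> cached copy
        self.pending = 0                # outstanding cache reads (assumed load signal)
        self.max_pending = max_pending
        self.bypass_users = set(bypass_users)
        self.bypass_services = set(bypass_services)
        self.bypass_groups = set(bypass_groups)

    def overloaded(self):
        return self.pending >= self.max_pending

    def should_serve_from_cache(self, req):
        """Apply factors (a) to (d) from the abstract to one request."""
        if req.item_id not in self.items:
            return False
        if req.user in self.bypass_users:              # (a) user identity
            return False
        if req.service in self.bypass_services:        # (b) submitting service
            return False
        if req.consumer_group in self.bypass_groups:   # (c) consumer group
            return False
        if self.overloaded():                          # (d) cache overloaded
            return False
        return True


def scan(requests, cache, primary):
    """Serve each request from the cache or from primary storage."""
    results = []
    for req in requests:
        if cache.should_serve_from_cache(req):
            results.append((req.item_id, cache.items[req.item_id], "cache"))
        else:
            results.append((req.item_id, primary[req.item_id], "primary"))
    return results
```

    A single scan can therefore mix sources: items whose requests pass every check come from the cache, and the rest come from primary storage.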

    HYPER-SCALE, ELASTIC, SMART, SHARED, DATABASE AWARE STORAGE

    Publication No.: US20250094385A1

    Publication date: 2025-03-20

    Application No.: US18885394

    Filing date: 2024-09-13

    Abstract: Herein is an accelerated interface between a database server and a storage area network (SAN). Persistent storage being managed for a database is spread across a number of storage buckets. Global distributed storage metadata is used only for tracking the location of storage buckets on different storage servers. With this approach, a very small amount of memory is needed at a global distributed level to maintain the map. Each storage bucket can have any number of mirrored replicas for further increasing speed and reliability. A database server contains a storage bucket map in memory, and uses the map to do database online transaction processing (OLTP) I/O and smart (i.e. offloaded) database operations on storage. This allows for direct I/O between database server and storage server with lower latency and without using slow and remote middleware such as a logical unit number (LUN) metadata server on a separate network element.
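
    The in-memory bucket map can be pictured with a small sketch. It assumes fixed-size buckets addressed by integer division of a byte offset and a simple "first live replica" choice; neither detail is stated in the abstract, and all names are hypothetical.

```python
class BucketMap:
    """Toy in-memory map from storage bucket to the servers holding its replicas."""

    def __init__(self, bucket_size):
        self.bucket_size = bucket_size   # assumed: fixed-size buckets
        self.replicas = {}               # bucket_id -> list of server names

    def place(self, bucket_id, servers):
        """Record which storage servers hold mirrored replicas of a bucket."""
        self.replicas[bucket_id] = list(servers)

    def locate(self, offset, down=frozenset()):
        """Resolve a byte offset to (bucket_id, server, offset_in_bucket).

        The lookup is purely local, so the caller can issue I/O straight to
        the returned server without consulting a metadata service.
        """
        bucket_id = offset // self.bucket_size
        for server in self.replicas[bucket_id]:
            if server not in down:
                return bucket_id, server, offset % self.bucket_size
        raise LookupError(f"no live replica for bucket {bucket_id}")
```

    Because the map stores one short entry per bucket rather than per block, its memory footprint stays small even for large databases.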

    PMEM cache RDMA security
    33.
    Granted patent

    Publication No.: US11573719B2

    Publication date: 2023-02-07

    Application No.: US16831337

    Filing date: 2020-03-26

    Abstract: Techniques are described for providing one or more clients with direct access to cached data blocks within a persistent memory cache on a storage server. In an embodiment, a storage server maintains a persistent memory cache comprising a plurality of cache lines, each of which represents an allocation unit of block-based storage. The storage server maintains an RDMA table that includes a plurality of table entries, each of which maps a respective client to one or more cache lines and a remote access key. An RDMA access request to access a particular cache line is received from a storage server client. The storage server identifies access credentials for the client and determines whether the client has permission to perform the RDMA access on the particular cache line. Upon determining that the client has permission, the cache line is accessed from the persistent memory cache and sent to the storage server client.
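
    The permission check can be simulated in a few lines. This is a plain in-process model under assumed names (StorageServer, TableEntry, grant, handle_rdma_read); it does not use any real RDMA library, and in real hardware the key check is typically enforced by the network adapter rather than by server code.

```python
from dataclasses import dataclass


@dataclass
class TableEntry:
    cache_lines: set     # cache line ids the client may access
    access_key: int      # remote access key issued to the client


class StorageServer:
    """Toy model of a storage server with a persistent memory cache."""

    def __init__(self):
        self.pmem_cache = {}   # cache line id -> bytes
        self.rdma_table = {}   # client id -> TableEntry

    def grant(self, client, lines, key):
        """Map a client to a set of cache lines and a remote access key."""
        self.rdma_table[client] = TableEntry(set(lines), key)

    def handle_rdma_read(self, client, key, line):
        """Return the cache line only if the table entry authorizes it."""
        entry = self.rdma_table.get(client)
        if entry is None or entry.access_key != key:
            raise PermissionError("unknown client or wrong access key")
        if line not in entry.cache_lines:
            raise PermissionError("client not mapped to this cache line")
        return self.pmem_cache[line]
```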

    Consistently enforcing I/O resource constraints for workloads on parallel-access storage devices

    Publication No.: US11132131B2

    Publication date: 2021-09-28

    Application No.: US16730608

    Filing date: 2019-12-30

    Abstract: The techniques described herein limit client utilization of a parallel-access storage device. Specifically, client utilization of a particular storage device is estimated using I/O cost metrics to estimate the costs of I/O requests from the client to the particular storage device. The I/O cost metrics are determined based on calibration-based system performance data, which represents a system-wide measure of storage device performance for a system in which the particular storage device resides. The calibration-based system performance data includes one or both of composite throughput data and composite IOPS data for multiple parallel-access devices in the system. The cost estimates for I/O requests issued from a client to a parallel-access device are tracked in a total cost estimate for the client. Client utilization of the storage device, as tracked by the total cost estimate for the client, is limited to a percentage of the total estimated bandwidth of the storage device.
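
    One way to picture the cost tracking is the sketch below. The cost formula (a fixed per-request term from IOPS plus a size term from throughput), the even split of composite figures across devices, and the fixed accounting window are assumptions made for illustration; the abstract does not give the actual metric.

```python
class IOCostLimiter:
    """Toy per-client I/O cost accounting for one parallel-access device."""

    def __init__(self, composite_iops, composite_bytes_per_s, num_devices,
                 window_s=1.0):
        # Assumption: per-device capability is the system-wide calibration
        # figure divided evenly across the devices.
        self.iops = composite_iops / num_devices
        self.bytes_per_s = composite_bytes_per_s / num_devices
        self.window_s = window_s
        self.limits = {}    # client -> allowed fraction of the device
        self.totals = {}    # client -> total cost estimate in this window

    def cost(self, nbytes):
        """Estimated device-seconds consumed by one request (assumed formula)."""
        return 1.0 / self.iops + nbytes / self.bytes_per_s

    def set_limit(self, client, fraction):
        self.limits[client] = fraction

    def try_issue(self, client, nbytes):
        """Admit the request only if the client stays within its share."""
        request_cost = self.cost(nbytes)
        budget = self.limits.get(client, 1.0) * self.window_s
        used = self.totals.get(client, 0.0)
        if used + request_cost > budget:
            return False
        self.totals[client] = used + request_cost
        return True

    def new_window(self):
        """Reset the running totals at the start of each accounting window."""
        self.totals.clear()
```

    A caller would check try_issue before dispatching each request and queue or delay the request when it returns False.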

    Secondary storage server caching
    36.
    Granted patent

    Publication No.: US10831666B2

    Publication date: 2020-11-10

    Application No.: US16153674

    Filing date: 2018-10-05

    Abstract: Techniques related to failover to the secondary storage server from a primary storage server of a database server without degrading the performance of servicing storage requests for client applications are provided. In an embodiment, the secondary storage server receives, from the database server, an eviction notification indicating that a set of data blocks has been evicted from a cache. The secondary storage server's memory hierarchy includes a secondary cache and a secondary persistent storage that stores a second copy of the set of data blocks. The secondary storage server persistently stores a copy of data, which is also persistently stored on the primary storage server, which includes a first copy of the set of data blocks. In an embodiment, upon receiving the eviction notification, the secondary storage server retrieves the second copy of the set of data blocks from the secondary persistent storage of the secondary storage server and loads the second copy of the set of data blocks into the secondary cache. After an interruption event, the secondary storage server receives a request for a subset of the set of data blocks based on a request for data, at the database server. Upon receiving the request for the subset of the set of data blocks, the secondary storage server retrieves the subset of the set of data blocks from the second copy of the set of data blocks stored on the secondary cache of the secondary storage server without retrieving any of such data blocks from the second copy of the set of data blocks stored on the persistent storage of the secondary storage server. The secondary storage server sends the subset of the set of data blocks to the database server.
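
    The cache-warming flow can be sketched as follows. The class and method names are hypothetical, persistent storage is modeled as a dictionary, and a read counter stands in for disk I/O so the effect of warming is visible.

```python
class SecondaryStorageServer:
    """Toy secondary server that warms its cache on eviction notifications."""

    def __init__(self, persistent):
        self.persistent = dict(persistent)   # block id -> data (second copy)
        self.cache = {}                      # secondary cache
        self.disk_reads = 0                  # counts persistent-storage reads

    def _read_persistent(self, block_id):
        self.disk_reads += 1
        return self.persistent[block_id]

    def on_eviction_notification(self, block_ids):
        """Load blocks the database server just evicted into the secondary cache."""
        for block_id in block_ids:
            if block_id not in self.cache:
                self.cache[block_id] = self._read_persistent(block_id)

    def read(self, block_ids):
        """Serve a request after failover, preferring the warmed cache."""
        result = {}
        for block_id in block_ids:
            if block_id in self.cache:
                result[block_id] = self.cache[block_id]
            else:
                result[block_id] = self._read_persistent(block_id)
        return result
```

    Blocks named in an earlier notification are served after failover without touching persistent storage, which is the behavior the abstract describes.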

    Remote one-sided persistent writes
    37.
    Granted patent

    Publication No.: US10732836B2

    Publication date: 2020-08-04

    Application No.: US15720949

    Filing date: 2017-09-29

    Abstract: A shared storage architecture persistently stores database files in non-volatile random access memories (NVRAMs) of computing nodes of a multi-node DBMS. The computing nodes of the multi-node DBMS not only collectively store database data on NVRAMs of the computing nodes, but also host database server instances that process queries in parallel, host database sessions and database processes, and together manage access to a database stored on the NVRAMs of the computing nodes. To perform a data block read operation from persistent storage, a data block may be transferred directly over a network between NVRAM of a computing node that persistently stores the data block to a database buffer in non-volatile RAM of another computing node that requests the data block. The transfer is accomplished using remote direct memory access ("RDMA"). In addition to techniques for performing a data block read operation to NVRAM, computing nodes perform a data block write operation to data blocks stored in NVRAM of the NVRAM shared storage architecture. The data block write operation is referred to herein as a one-sided write because only one database process needs to participate in the writing of a data block to NVRAM in order to successfully commit the write.

    Detection of avoidable cache thrashing for OLTP and DW workloads

    Publication No.: US10331573B2

    Publication date: 2019-06-25

    Application No.: US15687296

    Filing date: 2017-08-25

    Abstract: Techniques are provided to adjust the behavior of a cache based on a count of cache misses for items recently evicted. In an embodiment, a computer responds to evicting a particular item (PI) from a cache by storing a metadata entry for the PI into memory. In response to a cache miss for the PI, the computer detects whether or not the metadata entry for the PI resides in memory. When the metadata entry for the PI is detected in memory, the computer increments a victim hit counter (VHC) that may be used to calculate how much avoidable thrashing the cache is experiencing, which is how much thrashing would be reduced if the cache were expanded. Either immediately or arbitrarily later, the computer adjusts a policy of the cache based on the VHC's value. For example, the computer may adjust the capacity of the cache based on the VHC.
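
    The victim hit counter can be illustrated with a small LRU cache that keeps a bounded list of recently evicted keys. The LRU policy, the bounded metadata list, and the threshold-based growth rule are assumptions chosen for the sketch; the abstract does not fix any of them.

```python
from collections import OrderedDict


class LRUCacheWithVictimCounter:
    """Toy LRU cache that counts misses on recently evicted items."""

    def __init__(self, capacity, ghost_capacity):
        self.capacity = capacity
        self.ghost_capacity = ghost_capacity
        self.data = OrderedDict()    # key -> value, oldest first
        self.ghost = OrderedDict()   # metadata entries for evicted keys
        self.victim_hits = 0         # the victim hit counter (VHC)
        self.misses = 0

    def get(self, key, loader):
        if key in self.data:
            self.data.move_to_end(key)
            return self.data[key]
        self.misses += 1
        if key in self.ghost:
            # This miss would have been a hit in a larger cache.
            self.victim_hits += 1
            del self.ghost[key]
        value = loader(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            victim, _ = self.data.popitem(last=False)
            self.ghost[victim] = None
            if len(self.ghost) > self.ghost_capacity:
                self.ghost.popitem(last=False)
        return value

    def maybe_grow(self, threshold, step):
        """Assumed policy: grow the cache once the VHC reaches a threshold."""
        if self.victim_hits >= threshold:
            self.capacity += step
            self.victim_hits = 0
            return True
        return False
```

    Comparing victim_hits with misses gives a rough estimate of how much of the thrashing is avoidable: a cyclic workload slightly larger than the cache drives the ratio toward one, while a pure scan of never-repeated keys leaves it at zero.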
