Abstract:
Techniques for a firewall to determine access to a portion of memory are provided. In one aspect, an access request to access a portion of memory within a pool of shared memory may be received at a firewall. The firewall may determine whether the access request to access the portion of memory is allowed. The access request may be allowed to proceed based on the determination. The operation of the firewall may not utilize address translation.
Abstract:
Apertures of a first size in a first physical address space of at least one processor are mapped to respective blocks of the first size in a second address space of a storage medium. Apertures of a second size in the first physical address space are mapped to respective blocks of the second size in the second address space, the second size being different from the first size.
Abstract:
Techniques for writing data to a subset of memory devices are described. In one aspect, a block of data to be written to a line in a rank of memory may be received. The rank of memory may comprise a set of memory devices. The block of data may be compressed. The compressed block of data may be written to a subset of the memory devices that comprise the line. The unwritten portions of the line may not be used to store valid data.
Abstract:
A computing device includes a coherence controller and memory comprising a coherent memory region and a non-coherent memory region. The coherence controller may: determine a coherent region of the memory, determine a non-coherent region of the memory, and responsive to receiving a memory allocation request for a block of memory in the memory: allocate, based on a received memory allocation request for a memory block, the requested block of memory in the non-coherent memory region or the coherent memory region based on whether the memory allocation request indicates the requested block is to be coherent or non-coherent.
Abstract:
In one example, a memory network may control access to a shared memory that is by multiple compute nodes. The memory network may control the access to the shared memory by receiving a memory access request originating from an application executing on the multiple compute nodes and determining a priority for processing the memory access request. The priority determined by the memory network may correspond to a memory address range in the memory that is specifically used by the application.
Abstract:
A crossbar array includes a number of memory elements. An analog-to-digital converter (ADC) is electronically coupled to the vector output register. A digital-to-analog converter (DAC) is electronically coupled to the vector input register. A processor is electronically coupled to the ADC and to the DAC. The processor may be configured to determine whether division of input vector data by output vector data from the crossbar array is within a threshold value, and if not within the threshold value, determine changed data values as between the output vector data and the input vector data, and write the changed data values to the memory elements of the crossbar array.
Abstract:
In an example, an apparatus is described that includes a memory array. The memory array includes a volatile memory, a first non-volatile memory, and a second non-volatile memory. The memory array further includes a cache manager that controls access by a computer system to the memory array. For instance, the cache manager may carry out memory operations, including read operations, write operations, and cache evictions, in conjunction with at least one of the volatile memory, the first non-volatile memory, or the second non-volatile memory.
Abstract:
Techniques for retrieving data blocks from memory devices are provided. In one aspect, a request to retrieve a block of data may be received. The block of data may be in a line in a rank of memory. The rank of memory may include multiple devices. The devices used to store the line in the rank of memory may be determined. The determined devices may be read.
Abstract:
In one example, a central processing unit (CPU) with dynamic thread mapping includes a set of multiple cores each with a set of multiple threads. A set of registers for each of the multiple threads monitors for in-flight memory requests the number of loads from and stores to at least a first memory interface and a second memory interface by each respective thread. The second memory interface has a greater latency than the first memory interface. The CPU further has logic to map and migrate each thread to respective CPU cores where the number of cores accessing only one of the at least first and second memory interfaces is maximized.
Abstract:
A technique includes, in response to a cache miss occurring with a given processing node of a plurality of processing nodes, using a directory-based coherence system for the plurality of processing nodes to regulate snooping of an address that is associated with the cache miss. Using the directory-based coherence system to regulate whether the address is included in a snooping domain is based at least in part on a number of cache misses associated with the address.