SHARING AWARE SNOOP FILTER APPARATUS AND METHOD

    Publication No.: US20170286299A1

    Publication Date: 2017-10-05

    Application No.: US15088921

    Application Date: 2016-04-01

    CPC classification number: G06F12/0831 G06F12/0811 G06F2212/283 G06F2212/621

    Abstract: An apparatus and method are described for a sharing aware snoop filter. For example, one embodiment of a processor comprises: a plurality of caches, each of the caches comprising a plurality of cache lines, at least some of which are to be shared by two or more of the caches; a snoop filter to monitor accesses to the plurality of cache lines shared by the two or more caches, the snoop filter comprising: a primary snoop filter comprising a first plurality of entries, each entry associated with one of the plurality of cache lines and comprising N unique identifiers to uniquely identify up to N of the plurality of caches currently storing the cache line; an auxiliary snoop filter comprising a second plurality of entries, each entry associated with one of the plurality of cache lines, wherein once a particular cache line has been shared by more than N caches, an entry for that cache line is allocated in the auxiliary snoop filter to uniquely identify one or more additional caches storing the cache line.
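    A minimal sketch of the primary/auxiliary split described above, assuming N = 4 and purely illustrative names (SharingAwareSnoopFilter, record_access); this is a software model of the allocation policy, not the patented hardware:

        # Illustrative model: a primary filter tracks up to N sharer caches per
        # line; once a line is shared by more than N caches, the extra sharers
        # are recorded in an auxiliary filter entry.
        N = 4  # assumed number of unique identifiers per primary entry

        class SharingAwareSnoopFilter:
            def __init__(self):
                self.primary = {}    # line address -> set of up to N cache ids
                self.auxiliary = {}  # line address -> overflow cache ids

            def record_access(self, line_addr, cache_id):
                sharers = self.primary.setdefault(line_addr, set())
                if cache_id in sharers or cache_id in self.auxiliary.get(line_addr, set()):
                    return  # cache already tracked for this line
                if len(sharers) < N:
                    sharers.add(cache_id)
                else:
                    # More than N sharers: allocate/extend an auxiliary entry.
                    self.auxiliary.setdefault(line_addr, set()).add(cache_id)

            def sharers(self, line_addr):
                # Caches that must be snooped for this line.
                return self.primary.get(line_addr, set()) | self.auxiliary.get(line_addr, set())

        # A fifth sharer of the same line spills into the auxiliary filter.
        sf = SharingAwareSnoopFilter()
        for cache_id in range(5):
            sf.record_access(0x1000, cache_id)
        assert sf.sharers(0x1000) == {0, 1, 2, 3, 4}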

    PROCESSORS, METHODS, AND SYSTEMS WITH A CONFIGURABLE SPATIAL ACCELERATOR

    Publication No.: US20180189231A1

    Publication Date: 2018-07-05

    Application No.: US15396402

    Application Date: 2016-12-30

    Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform a second operation when an incoming operand set arrives at the plurality of processing elements.
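    A minimal sketch of the dataflow-firing rule in the abstract, using hypothetical names (ProcessingElement, receive); it only models the idea that a graph node mapped onto a processing element executes once its full incoming operand set has arrived:

        # Illustrative model: each dataflow node is a processing element that
        # buffers operands and fires only when its operand set is complete.
        class ProcessingElement:
            def __init__(self, name, op, n_inputs):
                self.name = name
                self.op = op              # dataflow operator for this node
                self.n_inputs = n_inputs
                self.operands = []

            def receive(self, value):
                self.operands.append(value)
                if len(self.operands) == self.n_inputs:
                    result = self.op(*self.operands)
                    self.operands = []
                    return result         # node fires
                return None               # still waiting for operands

        # Overlay a tiny graph, (a + b) * c, onto two elements.
        add_pe = ProcessingElement("add", lambda x, y: x + y, n_inputs=2)
        mul_pe = ProcessingElement("mul", lambda x, y: x * y, n_inputs=2)
        add_pe.receive(2)
        partial_sum = add_pe.receive(3)   # fires, returns 5
        mul_pe.receive(partial_sum)
        print(mul_pe.receive(4))          # prints 20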

    MULTICAST TREE-BASED DATA DISTRIBUTION IN DISTRIBUTED SHARED CACHE

    Publication No.: US20160170880A1

    Publication Date: 2016-06-16

    Application No.: US14567026

    Application Date: 2014-12-11

    Abstract: Systems and methods for multicast tree-based data distribution in a distributed shared cache. An example processing system comprises: a plurality of processing cores, each processing core communicatively coupled to a cache; a tag directory associated with caches of the plurality of processing cores; a shared cache associated with the tag directory; a processing logic configured, responsive to receiving an invalidate request with respect to a certain cache entry, to: allocate, within the shared cache, a shared cache entry corresponding to the certain cache entry; transmit, to at least one of: a tag directory or a processing core that last accessed the certain entry, an update read request with respect to the certain cache entry; and responsive to receiving an update of the certain cache entry, broadcast the update to at least one of: one or more tag directories or one or more processing cores identified by a tag corresponding to the certain cache entry.

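    A minimal sketch of the invalidate/update flow in the abstract, with hypothetical names (SharedCacheController, on_invalidate, on_update) and print stubs standing in for the on-die network; the actual multicast tree structure is not modeled here:

        # Illustrative model: on an invalidate, allocate a shared-cache entry and
        # request the current data from the last accessor; on receiving the
        # update, multicast it to the directories/cores named by the line's tag.
        class SharedCacheController:
            def __init__(self):
                self.shared_cache = {}   # line address -> data
                self.sharer_tags = {}    # line address -> ids to notify
                self.last_accessor = {}  # line address -> core that last accessed it

            def on_invalidate(self, line_addr, send_update_read):
                self.shared_cache.setdefault(line_addr, None)
                send_update_read(self.last_accessor[line_addr], line_addr)

            def on_update(self, line_addr, data, multicast):
                self.shared_cache[line_addr] = data
                multicast(self.sharer_tags.get(line_addr, []), line_addr, data)

        ctrl = SharedCacheController()
        ctrl.last_accessor[0x40] = 2
        ctrl.sharer_tags[0x40] = [0, 1, 3]
        ctrl.on_invalidate(0x40, lambda core, a: print(f"update-read to core {core} for {hex(a)}"))
        ctrl.on_update(0x40, b"new-data", lambda ids, a, d: print(f"multicast {d!r} to {ids}"))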

    APPARATUS, METHODS, AND SYSTEMS FOR REMOTE MEMORY ACCESS IN A CONFIGURABLE SPATIAL ACCELERATOR

    Publication No.: US20190303297A1

    Publication Date: 2019-10-03

    Application No.: US15943608

    Application Date: 2018-04-02

    Abstract: Systems, methods, and apparatuses relating to remote memory access in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first memory interface circuit coupled to a first processing element and a cache, the first memory interface circuit to issue a memory request to the cache, the memory request comprising a field to identify a second memory interface circuit as a receiver of data for the memory request; and the second memory interface circuit coupled to a second processing element and the cache, the second memory interface circuit to send a credit return value to the first memory interface circuit, to cause the first memory interface circuit to mark the memory request as complete, when the data for the memory request arrives at the second memory interface circuit and a completion configuration register of the second memory interface circuit is set to a remote response value.
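    A minimal sketch of the credit-return completion handshake in the abstract, with hypothetical names (MemoryInterface, issue_request, credit_return); it models only the control flow, not the claimed circuits:

        # Illustrative model: an issuing memory interface names a peer interface
        # as the data receiver; when the data arrives at that peer and its
        # completion configuration is "remote response", the peer returns a
        # credit so the issuer can mark the request complete.
        REMOTE_RESPONSE = "remote_response"

        class MemoryInterface:
            def __init__(self, name, completion_config=None):
                self.name = name
                self.completion_config = completion_config
                self.outstanding = {}  # request id -> completion state
                self.peers = {}        # interface name -> MemoryInterface

            def issue_request(self, req_id, addr, receiver_name):
                # "receiver" is the field identifying another interface as the
                # receiver of the requested data.
                self.outstanding[req_id] = "pending"
                return {"id": req_id, "addr": addr,
                        "receiver": receiver_name, "issuer": self.name}

            def on_data_arrival(self, request, data):
                if self.completion_config == REMOTE_RESPONSE:
                    self.peers[request["issuer"]].credit_return(request["id"])
                return data

            def credit_return(self, req_id):
                self.outstanding[req_id] = "complete"

        # mif0 issues a load whose data is delivered to mif1, which credits mif0.
        mif0 = MemoryInterface("mif0")
        mif1 = MemoryInterface("mif1", completion_config=REMOTE_RESPONSE)
        mif1.peers["mif0"] = mif0
        req = mif0.issue_request(req_id=7, addr=0x200, receiver_name="mif1")
        mif1.on_data_arrival(req, data=b"cacheline")
        assert mif0.outstanding[7] == "complete"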
