METHOD AND APPARATUS FOR SMART STORE OPERATIONS WITH CONDITIONAL OWNERSHIP REQUESTS

    Publication number: US20210279175A1

    Publication date: 2021-09-09

    Application number: US17328366

    Application date: 2021-05-24

    Abstract: Method and apparatus implementing smart store operations with conditional ownership requests. One aspect includes a method implemented in a multi-core processor, the method comprising: receiving a conditional read for ownership (CondRFO) from a requester in response to an execution of an instruction to modify a target cache line (CL) with a new value, the CondRFO identifying the target CL and the new value; determining from a local cache a local CL corresponding to the target CL; determining a local value from the local CL; comparing the local value with the new value; setting a coherency state of the local CL to (S)hared when the local value is the same as the new value; setting the coherency state of the local CL to (I)nvalid when the local value is different from the new value; and sending a response and a copy of the local CL to the requester. Other embodiments include an apparatus configured to perform the actions of the methods.
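The responder-side steps listed in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented design: the names `State`, `CacheLine`, and `handle_cond_rfo` are hypothetical, the local cache is modeled as a plain dictionary keyed by address, a cache line holds a single integer, and the miss behavior (returning no data) is an assumption the abstract does not cover.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    """MESI-style coherency states (hypothetical modeling choice)."""
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


@dataclass
class CacheLine:
    value: int
    state: State


def handle_cond_rfo(local_cache, target_addr, new_value):
    """Respond to a CondRFO that names target_addr and carries new_value.

    Returns (hit, copy_of_line). A miss returns (False, None); that miss
    handling is an assumption, not something the abstract specifies.
    """
    line = local_cache.get(target_addr)
    if line is None or line.state is State.INVALID:
        return False, None

    # Snapshot the line first so the requester receives the data as it was.
    copy = CacheLine(line.value, line.state)

    if line.value == new_value:
        # The store would not change the data, so the local copy stays valid.
        line.state = State.SHARED
    else:
        # The store changes the data, so the local copy must be dropped.
        line.state = State.INVALID
    return True, copy
```

With a line holding the value 7, a CondRFO carrying 7 leaves the local line in the Shared state, while a CondRFO carrying any other value invalidates it.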

    TECHNOLOGIES FOR PERFORMING SWITCH-BASED COLLECTIVE OPERATIONS IN DISTRIBUTED ARCHITECTURES

    Publication number: US20200304425A1

    Publication date: 2020-09-24

    Application number: US16895539

    Application date: 2020-06-08

    Abstract: Technologies for performing switch-based collective operations in a fabric architecture include a network switch communicatively coupled to a plurality of computing nodes. The network switch is configured to identify sub-operations of a collective operation specified in a collective operation request received from one of the computing nodes and identify a plurality of operands for each of the sub-operations. The network switch is additionally configured to request a value for each of the operands from a corresponding target computing node at which the respective value is stored, determine a result of the collective operation as a function of the requested operand values, and transmit the result to the requesting computing node. Other embodiments are described herein.
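The flow described in this abstract (split a collective into sub-operations, fetch each operand from the node that stores it, then combine) might look roughly like the sketch below. Everything here is an assumption made for illustration: the request layout, the `run_collective` name, and the `fetch_operand` callback that stands in for a fabric request to a target node are hypothetical and do not come from the patent.

```python
from functools import reduce
import operator


def run_collective(request, fetch_operand):
    """Evaluate a collective operation the way a switch might.

    request is a hypothetical structure:
      {"combine": callable,
       "sub_operations": [{"op": callable,
                           "operands": [(node, key), ...]}, ...]}
    fetch_operand(node, key) stands in for asking a target node for a value.
    """
    partial_results = []
    for sub in request["sub_operations"]:
        # Ask each target node for the operand value it stores.
        values = [fetch_operand(node, key) for node, key in sub["operands"]]
        # Fold the operand values with the sub-operation's operator.
        partial_results.append(reduce(sub["op"], values))
    # The final result is a function of all requested operand values.
    return reduce(request["combine"], partial_results)


# Usage: two nodes hold the operands; the "switch" computes (a * c) + (b + c).
nodes = {"n1": {"a": 2, "b": 3}, "n2": {"c": 4}}
request = {
    "combine": operator.add,
    "sub_operations": [
        {"op": operator.mul, "operands": [("n1", "a"), ("n2", "c")]},
        {"op": operator.add, "operands": [("n1", "b"), ("n2", "c")]},
    ],
}
result = run_collective(request, lambda node, key: nodes[node][key])
```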

    Apparatuses and methods to translate a logical thread identification to a physical thread identification

    Publication number: US09886318B2

    Publication date: 2018-02-06

    Application number: US15055234

    Application date: 2016-02-26

    CPC classification number: G06F9/5027 G06F9/3851 G06F9/455 G06F9/4552 G06F9/46

    Abstract: Methods and apparatuses relating to translating a logical thread identification to a physical thread identification. A processor may include a plurality of cores that include a buffer, and a thread mapping hardware unit to: return a physical thread identification in response to a logical thread identification sent to a buffer of a first core when the buffer includes a logical to physical thread mapping for the logical thread identification, and send a request to the buffers of the other cores when the first core's buffer does not include the logical to physical thread mapping for the logical thread identification, wherein each of the other cores is to send an unknown identification response if its buffer does not include the logical thread identification and at least one of the other cores is to send the physical thread identification to the first core if its buffer includes the logical thread identification.
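The lookup order in this abstract (check the first core's buffer, then ask the other cores, each of which answers with either a physical ID or an "unknown" response) can be modeled in software as follows. This is a sketch under assumptions: the `Core` class, the `translate` function, and the use of `None` as the "unknown identification" response are hypothetical, and the other cores are polled one after another here, whereas hardware would likely query them in parallel.

```python
UNKNOWN = None  # hypothetical stand-in for the "unknown identification" response


class Core:
    """A core with a buffer of logical-to-physical thread mappings."""

    def __init__(self, mappings=None):
        self.buffer = dict(mappings or {})

    def lookup(self, logical_id):
        # Return the physical ID if the mapping is buffered, else UNKNOWN.
        return self.buffer.get(logical_id, UNKNOWN)


def translate(cores, first_core_index, logical_id):
    """Translate logical_id, starting with the buffer of the first core."""
    physical = cores[first_core_index].lookup(logical_id)
    if physical is not UNKNOWN:
        return physical  # hit in the first core's own buffer

    # Miss: send the request to the buffers of the other cores.
    for index, core in enumerate(cores):
        if index == first_core_index:
            continue
        response = core.lookup(logical_id)
        if response is not UNKNOWN:
            return response  # another core holds the mapping
    return UNKNOWN  # every other core answered "unknown"


cores = [Core({10: 0}), Core({11: 5}), Core()]
```

Here `translate(cores, 0, 10)` is served by core 0 itself, while `translate(cores, 0, 11)` misses locally and is answered by core 1.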

    Technologies for dynamically sharing remote resources across remote computing nodes

    Publication number: US11216306B2

    Publication date: 2022-01-04

    Application number: US15636969

    Application date: 2017-06-29

    Abstract: Technologies for dynamically sharing remote resources include a computing node that sends a resource request for remote resources to a remote computing node in response to a determination that additional resources are required by the computing node. The computing node configures a mapping of a local address space of the computing node to the remote resources of the remote computing node in response to sending the resource request. In response to generating an access to the local address, the computing node identifies the remote computing node based on the local address with the mapping of the local address space to the remote resources of the remote computing node and performs a resource access operation with the remote computing node over a network fabric. The remote computing node may be identified with system address decoders of a caching agent and a host fabric interface. Other embodiments are described and claimed.
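The address-mapping step in this abstract, where a local address is resolved to the remote node that backs it, resembles a range lookup. The sketch below is a simplified software analogy, not the system address decoder hardware the abstract mentions: `Mapping`, `AddressDecoder`, `map_remote`, and `decode` are hypothetical names, and the node labels and address values are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    local_base: int   # first local address covered by this mapping
    size: int         # number of addresses covered
    remote_node: str  # node that owns the backing resource
    remote_base: int  # matching base address on the remote node


class AddressDecoder:
    """Maps ranges of the local address space to remote resources."""

    def __init__(self):
        self.mappings = []

    def map_remote(self, local_base, size, remote_node, remote_base):
        # Configured after the resource request has been sent.
        self.mappings.append(Mapping(local_base, size, remote_node, remote_base))

    def decode(self, local_addr):
        """Return (remote_node, remote_addr), or None for purely local addresses."""
        for m in self.mappings:
            if m.local_base <= local_addr < m.local_base + m.size:
                return m.remote_node, m.remote_base + (local_addr - m.local_base)
        return None


decoder = AddressDecoder()
decoder.map_remote(0x1000, 0x100, "node-b", 0x8000)
target = decoder.decode(0x1010)  # -> ("node-b", 0x8010)
```

An access that decodes to a remote node would then be carried out over the network fabric; that transport step is left out of the sketch.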

    Method and apparatus for smart store operations with conditional ownership requests

    Publication number: US11016893B2

    Publication date: 2021-05-25

    Application number: US16329595

    Application date: 2016-09-30

    Abstract: Method and apparatus implementing smart store operations with conditional ownership requests. One aspect includes a method implemented in a multi-core processor, the method comprising: receiving a conditional read for ownership (CondRFO) from a requester in response to an execution of an instruction to modify a target cache line (CL) with a new value, the CondRFO identifying the target CL and the new value; determining from a local cache a local CL corresponding to the target CL; determining a local value from the local CL; comparing the local value with the new value; setting a coherency state of the local CL to (S)hared when the local value is the same as the new value; setting the coherency state of the local CL to (I)nvalid when the local value is different from the new value; and sending a response and a copy of the local CL to the requester. Other embodiments include an apparatus configured to perform the actions of the methods.

    Loop nest parallelization without loop linearization

    Publication number: US09760356B2

    Publication date: 2017-09-12

    Application number: US14493640

    Application date: 2014-09-23

    CPC classification number: G06F8/452

    Abstract: Systems and methods may provide for identifying a nested loop iteration space in user code, wherein the nested loop iteration space includes a plurality of outer loop iterations, and distributing iterations from the nested loop iteration space across a plurality of threads, wherein each thread is assigned a group of outer loop iterations. Additionally, a compiler output may be automatically generated, wherein the compiler output contains serial code corresponding to each group of outer loop iterations and de-linearization code to be executed outside the plurality of outer loop iterations. In one example, the de-linearization code includes index recovery code that is positioned before one or more instances of the serial code in the compiler output.
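As a loose illustration of the idea in this abstract, the sketch below assigns each thread a contiguous group of outer-loop iterations and recovers that group's bounds once, before the serial loop nest runs, instead of recomputing indices from a linearized counter on every iteration. It is not the compiler transformation itself: `outer_chunks` and `run_group` are hypothetical helper names, the even split is an assumed scheduling policy, and a two-level nest is assumed.

```python
def outer_chunks(n_outer, n_threads):
    """Split range(n_outer) into contiguous (start, stop) groups, one per thread."""
    base, extra = divmod(n_outer, n_threads)
    chunks, start = [], 0
    for t in range(n_threads):
        size = base + (1 if t < extra else 0)  # spread the remainder evenly
        chunks.append((start, start + size))
        start += size
    return chunks


def run_group(chunk, n_inner, body):
    """Serial code for one thread's group of outer iterations."""
    # Index recovery happens once here, outside the outer loop iterations.
    lo, hi = chunk
    for i in range(lo, hi):
        for j in range(n_inner):
            body(i, j)  # original (i, j) indices, no per-iteration div/mod


# Usage: run the groups one after another; a real runtime would hand each
# group to its own thread.
visited = []
for chunk in outer_chunks(10, 3):
    run_group(chunk, 2, lambda i, j: visited.append((i, j)))
```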

    TECHNOLOGIES FOR DYNAMICALLY SHARING REMOTE RESOURCES ACROSS REMOTE COMPUTING NODES

    Publication number: US20230047886A1

    Publication date: 2023-02-16

    Application number: US17978788

    Application date: 2022-11-01

    Abstract: Technologies for dynamically sharing remote resources include a computing node that sends a resource request for remote resources to a remote computing node in response to a determination that additional resources are required by the computing node. The computing node configures a mapping of a local address space of the computing node to the remote resources of the remote computing node in response to sending the resource request. In response to generating an access to the local address, the computing node identifies the remote computing node based on the local address with the mapping of the local address space to the remote resources of the remote computing node and performs a resource access operation with the remote computing node over a network fabric. The remote computing node may be identified with system address decoders of a caching agent and a host fabric interface. Other embodiments are described and claimed.

    TECHNOLOGIES FOR PERFORMING SWITCH-BASED COLLECTIVE OPERATIONS IN DISTRIBUTED ARCHITECTURES

    Publication number: US20180077226A1

    Publication date: 2018-03-15

    Application number: US15260638

    Application date: 2016-09-09

    CPC classification number: H04L49/15 H04L29/10

    Abstract: Technologies for performing switch-based collective operations in a fabric architecture include a network switch communicatively coupled to a plurality of computing nodes. The network switch is configured to identify sub-operations of a collective operation specified in a collective operation request received from one of the computing nodes and identify a plurality of operands for each of the sub-operations. The network switch is additionally configured to request a value for each of the operands from a corresponding target computing node at which the respective value is stored, determine a result of the collective operation as a function of the requested operand values, and transmit the result to the requesting computing node. Other embodiments are described herein.
