-
Publication No.: US12204471B2
Publication Date: 2025-01-21
Application No.: US18243493
Filing Date: 2023-09-07
Applicant: Intel Corporation
Inventor: Francesc Guim Bernat , Daniel Rivas Barragan , Kshitij A. Doshi , Mark A. Schmisseur
IPC: G06F12/00 , C07F15/00 , G06F12/0817 , G06F12/0831 , G06F12/1018 , G06F13/16 , H04L12/46 , H04L49/90
Abstract: In an example, there is disclosed a host-fabric interface (HFI), including: an interconnect interface to communicatively couple the HFI to an interconnect; a network interface to communicatively couple the HFI to a network; network interface logic to provide communication between the interconnect and the network; a coprocessor configured to provide an offloaded function for the network; a memory; and a caching agent configured to: designate a region of the memory as a shared memory between the HFI and a core communicatively coupled to the HFI via the interconnect; receive a memory operation directed to the shared memory; and issue a memory instruction to the memory according to the memory operation.
-
Publication No.: US12032977B2
Publication Date: 2024-07-09
Application No.: US17440701
Filing Date: 2020-05-11
Applicant: Intel Corporation
Inventor: Bryan J. Rodriguez , Kshitij A. Doshi , Ned M. Smith , Michael G. Millsap
CPC classification number: G06F9/455 , G06F9/45533 , G06F9/45558 , G06F9/5077 , G06F2009/45562 , G06F2009/45583
Abstract: In one embodiment, a computing device comprises memory circuitry and processing circuitry. The memory circuitry is to store a plurality of container images, comprising: a first container image comprising a first set of applications; and a second container image comprising a virtual machine, a guest operating system, and a second set of applications. The processing circuitry is to: instantiate a plurality of containers on a host operating system, wherein the plurality of containers comprises a first container and a second container; execute the first set of applications in the first container, wherein the first set of applications is to be executed on the host operating system; and execute the virtual machine in the second container, wherein the guest operating system is to be executed on the virtual machine and the second set of applications is to be executed on the guest operating system.
-
Publication No.: US20240211310A1
Publication Date: 2024-06-27
Application No.: US18598837
Filing Date: 2024-03-07
Applicant: Intel Corporation
Inventor: Francesc Guim Bernat , Kshitij A. Doshi , Daniel Rivas Barragan , Alejandro Duran Gonzalez , Harald Servat
IPC: G06F9/50 , H04L45/745 , H04L61/103 , H04L67/1004 , H04L67/1097 , H04L67/51 , H04L67/566 , H04W8/22
CPC classification number: G06F9/5005 , G06F9/5016 , G06F9/5022 , H04L67/1097 , H04L67/51 , H04W8/22 , G06F2209/463 , H04L45/745 , H04L61/103 , H04L67/1004 , H04L67/566
Abstract: Technologies for dynamically sharing remote resources include a computing node that sends a resource request for remote resources to a remote computing node in response to a determination that additional resources are required by the computing node. The computing node configures a mapping of a local address space of the computing node to the remote resources of the remote computing node in response to sending the resource request. In response to generating an access to the local address, the computing node identifies the remote computing node based on the local address with the mapping of the local address space to the remote resources of the remote computing node and performs a resource access operation with the remote computing node over a network fabric. The remote computing node may be identified with system address decoders of a caching agent and a host fabric interface. Other embodiments are described and claimed.
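The lookup step in this abstract (resolving a local address to a remote node through a configured mapping) can be pictured with a small software sketch. This is only an illustration under stated assumptions, not the patented hardware: the `Mapping` and `AddressDecoder` names, the node label, and the address values are all hypothetical, and overlapping ranges are not checked.

```python
import bisect
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Mapping:
    """Hypothetical record: a local address window backed by a remote node."""
    base: int         # first local address of the window
    size: int         # window length in bytes
    node: str         # identifier of the remote computing node
    remote_base: int  # address of the window's start on the remote node


class AddressDecoder:
    """Toy stand-in for a system address decoder: a sorted range table."""

    def __init__(self) -> None:
        self._bases: List[int] = []
        self._maps: List[Mapping] = []

    def add(self, mapping: Mapping) -> None:
        # Keep both lists sorted by base so resolve() can binary-search.
        i = bisect.bisect_left(self._bases, mapping.base)
        self._bases.insert(i, mapping.base)
        self._maps.insert(i, mapping)

    def resolve(self, addr: int) -> Optional[Tuple[str, int]]:
        # Find the last window whose base is <= addr, then bounds-check it.
        i = bisect.bisect_right(self._bases, addr) - 1
        if i >= 0:
            m = self._maps[i]
            if addr < m.base + m.size:
                return m.node, m.remote_base + (addr - m.base)
        return None  # address is not backed by a remote resource


decoder = AddressDecoder()
decoder.add(Mapping(base=0x1000, size=0x1000, node="node-b", remote_base=0x8000))
hit = decoder.resolve(0x1010)
miss = decoder.resolve(0x2000)
```

In this sketch, an access that resolves to a `(node, remote_address)` pair would be forwarded over the fabric, while `None` means the address is served locally.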
-
Publication No.: US20230195835A1
Publication Date: 2023-06-22
Application No.: US18083012
Filing Date: 2022-12-16
Applicant: Intel Corporation
Inventor: Dmitry Y. Babokin , Kshitij A. Doshi , Vadim Sukhomlinov
CPC classification number: G06F17/16 , G06F7/5443 , G06F9/3001 , G06F9/30029 , G06F9/30036
Abstract: Detailed are embodiments related to bit matrix multiplication in a processor. For example, in some embodiments, a processor is described comprising: decode circuitry to decode an instruction having fields for an opcode, an identifier of a first source bit matrix, an identifier of a second source bit matrix, an identifier of a destination bit matrix, and an immediate; and execution circuitry to execute the decoded instruction to perform a multiplication of a matrix of S-bit elements of the identified first source bit matrix with S-bit elements of the identified second source bit matrix, wherein the multiplication and accumulation operations are selected by the operation selector, and store a result of the matrix multiplication into the identified destination bit matrix, wherein S indicates a plural bit size.
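A scalar sketch may help show what a matrix multiplication with selectable multiply and accumulate operations looks like. It is an illustration only: the `OPS` table, the selector encoding (0 and 1), and the function name are assumptions made here, not the instruction's actual encoding, and elements are plain Python integers rather than packed S-bit fields.

```python
import operator
from functools import reduce

# Hypothetical selector encoding: selector -> (multiply op, accumulate op).
OPS = {
    0: (operator.and_, operator.xor),  # arithmetic over GF(2): AND, then XOR
    1: (operator.and_, operator.or_),  # boolean product: AND, then OR
}


def bit_matrix_multiply(a, b, selector=0):
    """Multiply matrices a (n x k) and b (k x m) with the selected operations."""
    mul, acc = OPS[selector]
    k = len(b)
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    m = len(b[0])
    return [
        [reduce(acc, (mul(row[t], b[t][j]) for t in range(k))) for j in range(m)]
        for row in a
    ]


a = [[1, 1], [0, 1]]
b = [[1, 0], [1, 1]]
gf2_product = bit_matrix_multiply(a, b, selector=0)
bool_product = bit_matrix_multiply(a, b, selector=1)
```

The two selectors differ only in the accumulate step: with XOR, an even number of set partial products cancels to 0, while with OR any set partial product yields 1.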
-
Publication No.: US11604889B2
Publication Date: 2023-03-14
Application No.: US15777721
Filing Date: 2015-12-22
Applicant: Intel Corporation
Inventor: Ajith K. Illendula , Kshitij A. Doshi , Vincent J. Zimmer
Abstract: Systems, apparatuses and methods may provide for a memory apparatus that includes a client-side address space dedicated to an accessor of obfuscated multi-tenant data, wherein an executable view generation library is stored to the client-side address space. In one example, the executable view generation library is to receive a request to access at least a portion of the obfuscated multi-tenant data, convert the obfuscated multi-tenant data to deobfuscated multi-tenant data based on metadata associated with the executable view generation library and generate a single-tenant view based on the deobfuscated multi-tenant data.
-
Publication No.: US20230047886A1
Publication Date: 2023-02-16
Application No.: US17978788
Filing Date: 2022-11-01
Applicant: Intel Corporation
Inventor: Francesc Guim Bernat , Kshitij A. Doshi , Daniel Rivas Barragan , Alejandro Duran Gonzalez , Harald Servat
IPC: G06F9/50 , H04W8/22 , H04L67/1097 , H04L67/51
Abstract: Technologies for dynamically sharing remote resources include a computing node that sends a resource request for remote resources to a remote computing node in response to a determination that additional resources are required by the computing node. The computing node configures a mapping of a local address space of the computing node to the remote resources of the remote computing node in response to sending the resource request. In response to generating an access to the local address, the computing node identifies the remote computing node based on the local address with the mapping of the local address space to the remote resources of the remote computing node and performs a resource access operation with the remote computing node over a network fabric. The remote computing node may be identified with system address decoders of a caching agent and a host fabric interface. Other embodiments are described and claimed.
-
Publication No.: US11522682B2
Publication Date: 2022-12-06
Application No.: US17332733
Filing Date: 2021-05-27
Applicant: Intel Corporation
Inventor: Francesc Guim Bernat , Suraj Prabhakaran , Kshitij A. Doshi , Timothy Verrall
IPC: H04L9/08 , G06F3/06 , G06F9/50 , H04L69/12 , H04L69/32 , G06F16/25 , G06F16/2453 , H04L49/9005 , G11C8/12 , G11C29/02 , H04L41/0896 , G06F30/34 , B25J15/00 , G06F1/18 , G06F1/20 , G06F11/34 , G06F15/78 , H04L41/5025 , H04L67/1008 , H05K7/14 , H05K7/18 , H05K7/20 , H04L67/1001 , G11C29/36 , G11C29/38 , G11C29/44 , G06F16/22 , G06F16/2455 , G06F12/02 , G06F12/14 , G06F13/16 , G06F15/173 , G06F13/40 , G06F13/42 , G06F9/448 , G06F9/28 , G06F15/16 , H04L41/0893 , H04L69/22 , H04L69/321 , H04L41/0213 , H04L41/0668 , H04L41/0677 , H04L45/28 , H04L45/7453 , H04L47/11 , H04L47/125 , H04L49/00 , H04L49/351 , G06F9/4401 , G06F9/445 , G06F12/06 , G06F16/23 , G06F16/248 , G06F16/901 , G06F16/11 , G06F9/44 , G06F9/48 , G06F21/10 , G06N3/063 , G06Q10/06 , G06Q30/02 , H04L41/14 , H04L41/5019 , H04L49/40 , H04L9/40 , G06F12/0802 , G06F12/1045
Abstract: Technologies for providing streamlined provisioning of accelerated functions in a disaggregated architecture include a compute sled. The compute sled includes a network interface controller and circuitry to determine whether to accelerate a function of a workload executed by the compute sled, and send, to a memory sled and in response to a determination to accelerate the function, a data set on which the function is to operate. The circuitry is also to receive, from the memory sled, a service identifier indicative of a memory location independent handle for data associated with the function, send, to a compute device, a request to schedule acceleration of the function on the data set, receive a notification of completion of the acceleration of the function, and obtain, in response to receipt of the notification and using the service identifier, a resultant data set from the memory sled. The resultant data set was produced by an accelerator device during acceleration of the function on the data set. Other embodiments are also described and claimed.
-
Publication No.: US11494633B2
Publication Date: 2022-11-08
Application No.: US15859472
Filing Date: 2017-12-30
Applicant: Intel Corporation
Inventor: Francesc Guim Bernat , Suraj Prabhakaran , Kshitij A. Doshi , Da-Ming Chiang
Abstract: Examples include techniques to manage training or trained models for deep learning applications. Examples include routing commands to configure a training model to be implemented by a training module or configure a trained model to be implemented by an inference module. The commands are routed via an out-of-band (OOB) link, while training data for the training models or input data for the trained models are routed via in-band links.
-
Publication No.: US11269801B2
Publication Date: 2022-03-08
Application No.: US17125439
Filing Date: 2020-12-17
Applicant: Intel Corporation
Inventor: Francesc Guim Bernat , Da-Ming Chiang , Kshitij A. Doshi , Suraj Prabhakaran , Mark A. Schmisseur
Abstract: There is disclosed an example of an artificial intelligence (AI) system, including: a first hardware platform; a fabric interface configured to communicatively couple the first hardware platform to a second hardware platform; a processor hosted on the first hardware platform and programmed to operate on an AI problem; and a first training accelerator, including: an accelerator hardware; a platform inter-chip link (ICL) configured to communicatively couple the first training accelerator to a second training accelerator on the first hardware platform without aid of the processor; a fabric ICL to communicatively couple the first training accelerator to a third training accelerator on a second hardware platform without aid of the processor; and a system decoder configured to operate the fabric ICL and platform ICL to share data of the accelerator hardware between the first training accelerator and second and third training accelerators without aid of the processor.
-
Publication No.: US10540177B2
Publication Date: 2020-01-21
Application No.: US15438712
Filing Date: 2017-02-21
Applicant: Intel Corporation
Inventor: Elmoustapha Ould-Ahmed-Vall , Suleyman Sair , Kshitij A. Doshi , Charles R. Yount , Bret L. Toll
Abstract: A processor core including a hardware decode unit to decode vector instructions for decompressing a run length encoded (RLE) set of source data elements and an execution unit to execute the decoded instructions. The execution unit generates a first mask by comparing the set of source data elements with a set of zeros and then counts the trailing zeros in the mask. A second mask is made based on the count of trailing zeros. The execution unit then copies the set of source data elements to a buffer using the second mask and then reads the number of RLE zeros from the set of source data elements. The buffer is shifted and copied to a result, and the set of source data elements is shifted to the right. If more valid data elements remain in the set of source data elements, this is repeated until all valid data is processed.
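The loop described in this abstract can be approximated in scalar form. The sketch below is a loose software analogue, not the vector instruction sequence itself, and it assumes an encoding the abstract does not spell out: non-zero elements are literals, and each 0 acts as a marker followed by one element giving the number of zeros in the run. The function name is hypothetical.

```python
def rle_zero_decompress(src):
    """Expand zero runs, assuming each 0 marker is followed by its run length."""
    result = []
    pos = 0
    while pos < len(src):
        window = src[pos:]
        # Analogue of the first mask: which elements compare equal to zero.
        zero_mask = [element == 0 for element in window]
        # Analogue of the trailing-zero count: literals before the first marker.
        literal_count = zero_mask.index(True) if True in zero_mask else len(window)
        # Analogue of the masked copy: move the literals to the output.
        result.extend(window[:literal_count])
        pos += literal_count
        if pos < len(src):
            if pos + 1 >= len(src):
                raise ValueError("zero marker without a run length")
            # Read the run length that follows the marker and emit the zeros.
            result.extend([0] * src[pos + 1])
            pos += 2  # step past the marker and its run length
    return result


decoded = rle_zero_decompress([5, 0, 3, 7])
```

Advancing `pos` plays the role of shifting the source elements; the hardware described in the abstract works on whole vector registers per iteration rather than one marker at a time.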