Patent search ap:("Intel Corporation") AND inv:"Zeev Sperber" Page 7

61.

Granted invention patent
Systems and methods to load a tile register pair
Legal status: In force

Publication (announcement) number: US11093247B2

Publication (announcement) date: 2021-08-17

Application number: US15858932

Application date: 2017-12-29

Applicant: Intel Corporation

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman

IPC: G06F15/00, G06F9/30

Abstract: Embodiments detailed herein relate to systems and methods to load a tile register pair. In one example, a processor includes: decode circuitry to decode a load matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded load matrix pair instruction to load every element of left and right tiles of the identified destination matrix from corresponding element positions of left and right tiles of the identified source matrix, respectively, wherein the executing operates on one row of the identified destination matrix at a time, starting with the first row.
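
The row-at-a-time behavior described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the TilePair class, its field names, and the load_tile_pair function are hypothetical and are not taken from the patent, and tiles are modeled as plain lists of rows.

```python
from dataclasses import dataclass
from typing import List

Tile = List[List[int]]


@dataclass
class TilePair:
    """Hypothetical model of a matrix whose PAIR parameter is TRUE."""
    left: Tile
    right: Tile
    pair: bool = True


def load_tile_pair(dst: TilePair, src: TilePair) -> None:
    """Copy every element of src into dst, one row at a time, first row first."""
    if not (dst.pair and src.pair):
        raise ValueError("both matrices must have PAIR == TRUE")
    for r in range(len(src.left)):
        # Row r of the left tile, then row r of the right tile.
        dst.left[r] = list(src.left[r])
        dst.right[r] = list(src.right[r])


src = TilePair(left=[[1, 2], [3, 4]], right=[[5, 6], [7, 8]])
dst = TilePair(left=[[0, 0], [0, 0]], right=[[0, 0], [0, 0]])
load_tile_pair(dst, src)
```

Each destination row receives a copy of the source row, so later changes to the source do not affect the destination.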

62.

Granted invention patent
Systems and methods for performing instructions to convert to 16-bit floating-point format
Legal status: In force

Publication (announcement) number: US11068262B2

Publication (announcement) date: 2021-07-20

Application number: US17133078

Application date: 2020-12-23

Applicant: Intel Corporation

Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich

IPC: G06F9/30, G06F9/38

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
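
The per-element conversion can be sketched in software. The abstract does not name the 16-bit format or the rounding mode, so this example makes two assumptions: the target is the bfloat16 layout (the upper 16 bits of an IEEE 754 single-precision value) and rounding is round-to-nearest-even. The function names are hypothetical.

```python
import struct
from typing import List


def f32_to_bf16_bits(x: float) -> int:
    """Convert one single-precision value to a 16-bit pattern (assumed bfloat16)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        # Truncate and set a mantissa bit so the result stays a NaN.
        return (bits >> 16) | 0x0040
    # Round to nearest, ties to even, before dropping the low 16 bits.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def convert_vector(src: List[float]) -> List[int]:
    """Convert each of the N source elements to the corresponding destination slot."""
    return [f32_to_bf16_bits(v) for v in src]
```

For example, convert_vector([1.0, -2.0]) yields the bit patterns 0x3F80 and 0xC000.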

63.

Granted invention patent
Apparatus and method for detecting and recovering from data fetch errors
Legal status: In force

Publication (announcement) number: US11048587B2

Publication (announcement) date: 2021-06-29

Application number: US16292085

Application date: 2019-03-04

Applicant: Intel Corporation

Inventors: Theodros Yigzaw, Geeyarpuram N. Santhanakrishnan, Ganapati N. Srinivasa, Jose A. Vargas, Hisham Shafi, Michael Mishaeli, Ehud Cohen, Zeev Sperber, Shlomo Raikin, Mohan J. Kumar, Julius Y. Mandelblat

IPC: G06F11/14, G06F11/10

Abstract: An apparatus and method are described for detecting and correcting data fetch errors within a processor core. For example, one embodiment of an instruction processing apparatus for detecting and recovering from data fetch errors comprises: at least one processor core having a plurality of instruction processing stages including a data fetch stage and a retirement stage; and error processing logic in communication with the processing stages to perform the operations of: detecting an error associated with data in response to a data fetch operation performed by the data fetch stage; and responsively performing one or more operations to ensure that the error does not corrupt an architectural state of the processor core within the retirement stage.

64.

Invention patent application
Technology For Dynamically Tuning Processor Features
Legal status: In force

Publication (announcement) number: US20210109839A1

Publication (announcement) date: 2021-04-15

Application number: US17128291

Application date: 2020-12-21

Applicant: Intel Corporation

Inventors: Adarsh Chauhan, Jayesh Gaur, Franck Sala, Lihu Rappoport, Zeev Sperber, Adi Yoaz, Sreenivas Subramoney

IPC: G06F11/34, G06F11/30, G06F9/24, G06F9/38, G06F15/78

Abstract: A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state. In response to the usefulness state denoting the confirmed bad state, the DTU circuitry automatically disables the microarchitectural feature for the selected address for execution windows after the second execution window. Other embodiments are described and claimed.
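
The usefulness-state tracking in this abstract resembles a small per-address state machine, sketched below. Several details are assumptions made for illustration: performance is measured in cycles, two consecutive worse results confirm the bad state, and a window that is not worse resets the count. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict

# Assumption: number of consecutive "worse" results that confirm a bad state.
CONFIRM_THRESHOLD = 2


@dataclass
class UsefulnessEntry:
    consecutive_worse: int = 0
    confirmed_bad: bool = False


@dataclass
class DynamicTuningUnit:
    table: Dict[int, UsefulnessEntry] = field(default_factory=dict)

    def record_windows(self, address: int, cycles_disabled: int,
                       cycles_enabled: int) -> None:
        """Compare a feature-disabled window with a feature-enabled window."""
        entry = self.table.setdefault(address, UsefulnessEntry())
        if entry.confirmed_bad:
            return
        if cycles_enabled > cycles_disabled:
            # Worse performance with the feature enabled.
            entry.consecutive_worse += 1
            if entry.consecutive_worse >= CONFIRM_THRESHOLD:
                entry.confirmed_bad = True
        else:
            # Assumption: a good window resets the streak.
            entry.consecutive_worse = 0

    def feature_enabled(self, address: int) -> bool:
        """The feature stays enabled unless the address is confirmed bad."""
        entry = self.table.get(address)
        return entry is None or not entry.confirmed_bad


dtu = DynamicTuningUnit()
dtu.record_windows(0x400, cycles_disabled=100, cycles_enabled=120)
dtu.record_windows(0x400, cycles_disabled=100, cycles_enabled=130)
```

After the two consecutive worse windows above, dtu.feature_enabled(0x400) returns False, while addresses with no recorded history keep the feature enabled.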

65.

Granted invention patent
Method and apparatus for performing logical compare operations
Legal status: In force

Publication (announcement) number: US10572251B2

Publication (announcement) date: 2020-02-25

Application number: US16184994

Application date: 2018-11-08

Applicant: INTEL CORPORATION

Inventors: Rajiv Kapoor, Ronen Zohar, Mark J. Buxton, Zeev Sperber, Koby Gottlieb

IPC: G06F9/30, G06F7/02, G06F9/38, G06F12/0875

Abstract: A method and apparatus for including, in a processor, instructions for performing logical-comparison and branch support operations on packed or unpacked data. In one embodiment, instruction decode logic decodes instructions for an execution unit to operate on packed data elements including logical comparisons. A register file including 128-bit packed data registers stores packed single-precision floating point (SPFP) and packed integer data elements. The logical comparisons may include comparison of SPFP data elements and comparison of integer data elements and setting at least one bit to indicate the results. Based on these comparisons, branch support actions are taken. Such branch support actions may include setting the at least one bit, which in turn may be utilized by a branching unit in response to a branch instruction. Alternatively, the branch support actions may include branching to an indicated target code location.
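
One way to picture a logical comparison that sets a bit for later branching is a bitwise test over a 128-bit packed value. The sketch below is an illustration only: it assumes the comparison is a bitwise AND of two packed operands and that the result bit is a zero flag. The helper names are hypothetical and are not drawn from the patent.

```python
from typing import List

MASK_128 = (1 << 128) - 1


def pack_u32x4(elems: List[int]) -> int:
    """Pack four 32-bit integers into one 128-bit value, element 0 lowest."""
    if len(elems) != 4:
        raise ValueError("expected exactly four elements")
    return sum((e & 0xFFFFFFFF) << (32 * i) for i, e in enumerate(elems))


def logical_compare(a: int, b: int) -> bool:
    """Return the assumed zero flag: True when (a AND b) has no bits set."""
    return ((a & b) & MASK_128) == 0


def branch_target(zero_flag: bool, taken: str, fallthrough: str) -> str:
    """A branching unit consuming the flag set by the comparison."""
    return taken if zero_flag else fallthrough


data = pack_u32x4([0x0F, 0x00, 0xF0, 0x00])
test_mask = pack_u32x4([0xF0, 0xFF, 0x0F, 0xFF])
flag = logical_compare(data, test_mask)
```

Here data and test_mask share no set bits, so the flag is set and a branch conditioned on it would be taken.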

66.

Granted invention patent
Vector multiplication with accumulation in large register space
Legal status: In force

Publication (announcement) number: US10514912B2

Publication (announcement) date: 2019-12-24

Application number: US16133269

Application date: 2018-09-17

Applicant: Intel Corporation

Inventors: Shay Gueron, Vlad Krasnov, Robert Valentine, Zeev Sperber, Amit Gradstein, Simon Rubanovich

IPC: G06F9/30, G06F7/52

Abstract: An apparatus is described having an instruction execution pipeline that has a vector functional unit to support a vector multiply add instruction. The vector multiply add instruction is to multiply respective K-bit elements of two vectors and accumulate a portion of each of their respective products with another respective input operand in an X-bit accumulator, where X is greater than K.
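
The arithmetic can be sketched per element. The abstract leaves K and X open, so this example assumes K = 52 and X = 64 and treats the "portion" of each product as either its low K bits or the bits above them. These choices and the function names are assumptions for illustration.

```python
from typing import List

K = 52  # assumed element width in bits
X = 64  # assumed accumulator width in bits, X > K

K_MASK = (1 << K) - 1
X_MASK = (1 << X) - 1


def vmadd_lo(acc: List[int], a: List[int], b: List[int]) -> List[int]:
    """Accumulate the low K bits of each K-bit by K-bit product."""
    out = []
    for c, x, y in zip(acc, a, b):
        product = (x & K_MASK) * (y & K_MASK)
        out.append((c + (product & K_MASK)) & X_MASK)
    return out


def vmadd_hi(acc: List[int], a: List[int], b: List[int]) -> List[int]:
    """Accumulate the bits of each product above the low K bits."""
    out = []
    for c, x, y in zip(acc, a, b):
        product = (x & K_MASK) * (y & K_MASK)
        out.append((c + (product >> K)) & X_MASK)
    return out
```

Because the accumulator is wider than the accumulated portion, several partial products can be summed before any carry has to be propagated out of the element.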

67.

Granted invention patent
Apparatus and method of improved permute instructions
Legal status: In force

Publication (announcement) number: US10474459B2

Publication (announcement) date: 2019-11-12

Application number: US15808788

Application date: 2017-11-09

Applicant: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

IPC: G06F9/30

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector element routing circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.
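
The two stages in this abstract, element routing followed by a masking layer, can be modeled on lists of elements. In this sketch the mask holds one bit per element, which stands in for masking at the element's granularity. The zeroing and merging behaviors for masked-out positions are assumptions, as is the function name.

```python
from typing import List


def masked_permute(src: List[int], indices: List[int], mask: int,
                   dst: List[int], zeroing: bool = False) -> List[int]:
    """Route src[indices[i]] to output position i, then apply a per-element mask."""
    n = len(src)
    out = []
    for i, idx in enumerate(indices):
        if (mask >> i) & 1:
            # Routing stage: any input location may source this output element.
            out.append(src[idx % n])
        elif zeroing:
            # Assumed zeroing behavior for masked-out positions.
            out.append(0)
        else:
            # Assumed merging behavior: keep the old destination element.
            out.append(dst[i])
    return out


reversed_masked = masked_permute([10, 20, 30, 40], [3, 2, 1, 0],
                                 mask=0b0101, dst=[-1, -1, -1, -1])
```

The same routine works for any element width because the mask is indexed by element position rather than by bit offset.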

68.

Invention patent application
CRITICALITY BASED PORT SCHEDULING
Legal status: Under examination (published)

Publication (announcement) number: US20190243684A1

Publication (announcement) date: 2019-08-08

Application number: US15890984

Application date: 2018-02-07

Applicant: Intel Corporation

Inventors: Pooja Roy, Jayesh Gaur, Sreenivas Subramoney, Zeev Sperber, Alexandr Titov, Lihu Rappoport, Stanislav Shwartsman, Hong Wang, Adi Yoaz, Ronak Singhal, Robert S. Chappell

IPC: G06F9/48, G06F9/38

Abstract: A processor including an execution unit, an instruction scheduler circuit to identify a first instruction of an instruction stream, identify a second instruction on which execution of the first instruction depends, and assign a first dispatch priority value to the first instruction and the second instruction, and a dispatch circuit to dispatch, based on the first dispatch priority value, the first instruction and the second instruction to an instruction execution circuit.
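
The priority assignment described here can be pictured with a toy scheduler. The sketch below assumes that the first instruction has already been identified as critical by some other means, gives it and its direct producers the same priority value, and dispatches higher priorities first. It ignores operand readiness and port constraints, and all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instr:
    name: str
    deps: List["Instr"] = field(default_factory=list)
    priority: int = 0


def assign_priority(first: Instr, priority: int) -> None:
    """Give the first instruction and the instructions it depends on one priority."""
    first.priority = priority
    for producer in first.deps:
        producer.priority = priority


def dispatch_order(ready: List[Instr]) -> List[Instr]:
    """Dispatch higher priority first; ties keep their original order."""
    return sorted(ready, key=lambda instr: -instr.priority)


load = Instr("load")
add = Instr("add")
branch = Instr("branch", deps=[load])
assign_priority(branch, priority=1)
order = [instr.name for instr in dispatch_order([add, load, branch])]
```

With these inputs the load and the branch that depends on it are dispatched ahead of the unrelated add.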

69.

Granted invention patent
Packed rotate processors, methods, systems, and instructions
Legal status: In force

Publication (announcement) number: US10324718B2

Publication (announcement) date: 2019-06-18

Application number: US15864158

Application date: 2018-01-08

Applicant: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal San Andrian, Suleyman Sair, Bret L. Toll, Zeev Sperber, Amit Gradstein, Asaf Rubinstein

IPC: G06F9/30

Abstract: A method of an aspect includes receiving a masked packed rotate instruction. The instruction indicates a first source packed data including a plurality of packed data elements, a packed data operation mask having a plurality of mask elements, at least one rotation amount, and a destination storage location. A result packed data is stored in the destination storage location in response to the instruction. The result packed data includes result data elements that each correspond to a different one of the mask elements in a corresponding relative position. Result data elements that are not masked out by the corresponding mask element include one of the data elements of the first source packed data in a corresponding position that has been rotated. Result data elements that are masked out by the corresponding mask element include a masked out value. Other methods, apparatus, systems, and instructions are disclosed.
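
A masked packed rotate can be modeled element by element. This sketch assumes 32-bit elements, a left rotation, a single rotation amount shared by all elements, and zero as the masked-out value. The abstract does not fix any of these, and the function names are hypothetical.

```python
from typing import List


def rotl(value: int, amount: int, width: int) -> int:
    """Rotate a width-bit value left by amount bits."""
    mask = (1 << width) - 1
    amount %= width
    value &= mask
    return ((value << amount) | (value >> (width - amount))) & mask


def masked_packed_rotate(src: List[int], mask: int, amount: int,
                         width: int = 32, masked_out_value: int = 0) -> List[int]:
    """Rotate elements whose mask bit is set; others get the masked-out value."""
    result = []
    for i, element in enumerate(src):
        if (mask >> i) & 1:
            result.append(rotl(element, amount, width))
        else:
            result.append(masked_out_value)
    return result


rotated = masked_packed_rotate([0x80000001, 0x00000001], mask=0b01, amount=1)
```

Element 0 is rotated so that its top bit wraps around to bit 0, while element 1 is masked out and replaced by the masked-out value.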

70.

Invention patent application
ACCELERATING MEMORY FAULT RESOLUTION BY PERFORMING FAST RE-FETCHING
Legal status: Under examination (published)

Publication (announcement) number: US20190171515A1

Publication (announcement) date: 2019-06-06

Application number: US15831195

Application date: 2017-12-04

Applicant: Intel Corporation

Inventors: Zeev Sperber, Stanislav Shwartsman, Jared W. Stark, IV, Lihu Rappoport, Igor Yanover, George Leifman

IPC: G06F11/07, G06F12/02, G06F9/38, G06F9/30

CPC classification number: G06F11/0793, G06F9/30043, G06F9/30058, G06F9/3802, G06F9/3855, G06F11/0721, G06F12/0215, G06F12/0253, G06F2212/654, G06F2212/702

Abstract: A method for handling load faults in an out-of-order processor is described. The method includes detecting, by a memory ordering buffer of the out-of-order processor, a load fault corresponding to a load instruction that was executed out-of-order by the out-of-order processor; determining, by the memory ordering buffer, whether instant reclamation is available for resolving the load fault of the load instruction; and performing, in response to determining that instant reclamation is available for resolving the load fault of the load instruction, instant reclamation to re-fetch the load instruction for execution prior to attempting to retire the load instruction.
