Abstract:
An apparatus and method for sub-page protection of extended page tables. For example, one embodiment of an apparatus comprises: a page miss handler to perform a page walk using a guest physical address (GPA) and to detect whether a page identified by the GPA is mapped with sub-page permissions; a sub-page control storage to store at least one GPA and other sub-page-related data; wherein the page miss handler is to determine whether the GPA is programmed in the sub-page control storage; and the page miss handler is to send a translation to a translation lookaside buffer (TLB) with a sub-page protection indication set to cause a matching of the sub-page-related data when there is a TLB hit.
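A hedged, software-level sketch of the flow this abstract describes: on a page walk, the page miss handler checks whether the guest physical address (GPA) is programmed in the sub-page control storage and, if so, installs the TLB entry with its sub-page protection indication set. All names, the 4 KiB page size, and the identity translation are illustrative assumptions, not the hardware design.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages

def translate(page):
    # Placeholder standing in for the real EPT page walk.
    return page

def page_walk(gpa, sub_page_control, tlb):
    """Walk for gpa, then insert a TLB entry whose 'spp' flag records
    whether the GPA's page is programmed in the sub-page control storage."""
    page = gpa >> PAGE_SHIFT
    spp = page in sub_page_control       # GPA programmed for sub-page protection?
    tlb[page] = {
        "translation": translate(page),
        "spp": spp,                      # forces sub-page matching on a TLB hit
    }
    return tlb[page]

tlb = {}
entry = page_walk(0x1234000, sub_page_control={0x1234}, tlb=tlb)
assert entry["spp"] is True
```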
Abstract:
A vector functional unit implemented on a semiconductor chip to perform vector operations of dimension N is described. The vector functional unit includes N functional units. Each of the N functional units has logic circuitry to perform: a first integer multiply-add instruction that presents the highest-ordered bits but not the lowest-ordered bits of a first integer multiply-add calculation, and a second integer multiply-add instruction that presents the lowest-ordered bits but not the highest-ordered bits of a second integer multiply-add calculation.
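An integer multiply-add produces a result wider than one element, which is why the abstract describes a pair of instructions that each return one half of it. A minimal sketch of that split, assuming a 16-bit element width (the function names and width are illustrative, not the patented circuit):

```python
ELEM_BITS = 16
MASK = (1 << ELEM_BITS) - 1

def madd_lo(a, b, c):
    """Low-half instruction: lowest ELEM_BITS of a*b + c."""
    return ((a * b) + c) & MASK

def madd_hi(a, b, c):
    """High-half instruction: highest ELEM_BITS of a*b + c."""
    return (((a * b) + c) >> ELEM_BITS) & MASK

def vector_madd(va, vb, vc, part):
    """Apply the chosen half across all N lanes of the vector unit."""
    f = madd_lo if part == "lo" else madd_hi
    return [f(a, b, c) for a, b, c in zip(va, vb, vc)]

# Recombining the two halves reconstructs the full-width result:
full = (300 * 400) + 5
lo = madd_lo(300, 400, 5)
hi = madd_hi(300, 400, 5)
assert (hi << ELEM_BITS) | lo == full
```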
Abstract:
A method of performing vector operations on a semiconductor chip is described. The method includes performing a first vector instruction with a vector functional unit implemented on the semiconductor chip and performing a second vector instruction with the vector functional unit. The first vector instruction is a vector multiply-add instruction. The second vector instruction is a vector leading zeros count instruction.
Abstract:
A semiconductor processor is described. The semiconductor processor includes logic circuitry to perform a logical reduction instruction. The logic circuitry has swizzle circuitry to swizzle a vector's elements so as to form a swizzle vector. The logic circuitry also has vector logic circuitry to perform a vector logic operation on said vector and said swizzle vector.
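A hedged sketch of the swizzle-then-combine reduction this abstract describes: each step swizzles the vector and combines it element-wise with the current vector, halving the distance each round until one element holds the reduction of all inputs. The rotation-style swizzle, the AND operation in the example, and the power-of-two vector length are illustrative assumptions.

```python
import operator

def swizzle_reduce(vec, op):
    """Logical reduction via repeated swizzle + element-wise op.
    Assumes len(vec) is a power of two."""
    v = list(vec)
    step = len(v) // 2
    while step >= 1:
        # Swizzle: rotate by `step` so v[i] lines up with v[(i+step) % n].
        swizzled = v[step:] + v[:step]
        v = [op(a, b) for a, b in zip(v, swizzled)]
        step //= 2
    return v[0]  # element 0 now holds the reduction of all inputs

# AND-reduce four bitmasks: 0b1111 & 0b1011 & 0b1110 & 0b0111 == 0b0010
assert swizzle_reduce([0b1111, 0b1011, 0b1110, 0b0111], operator.and_) == 0b0010
```

The same circuit structure reduces N elements in log2(N) swizzle/combine rounds instead of N-1 sequential steps, which is the usual motivation for a swizzle-based reduction.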
Abstract:
A method and system to optimize prefetching of cache memory lines in a processing unit. The processing unit has logic to determine whether a vector memory operand is cached in two or more adjacent cache memory lines. In one embodiment of the invention, the determination of whether the vector memory operand is cached in two or more adjacent cache memory lines is based on the size and the starting address of the vector memory operand. In one embodiment of the invention, the prefetching of the two or more adjacent cache memory lines that cache the vector memory operand is performed using a single instruction that uses one issue slot and one data cache memory execution slot. This avoids additional software prefetch instructions or operations when reading a single vector memory operand that is cached in more than one cache memory line.
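The spanning test the abstract describes can be sketched from the operand's starting address and size alone: compute the cache-line base of the first and last byte and check whether they differ. The 64-byte line size is an assumption; in the described hardware a single prefetch instruction then covers all lines the operand touches.

```python
LINE_SIZE = 64  # assumption: 64-byte cache lines

def lines_touched(start_addr, size):
    """Return the base addresses of every cache line the operand occupies."""
    first = start_addr & ~(LINE_SIZE - 1)
    last = (start_addr + size - 1) & ~(LINE_SIZE - 1)
    return list(range(first, last + LINE_SIZE, LINE_SIZE))

def spans_adjacent_lines(start_addr, size):
    """True when the vector memory operand is cached in 2+ adjacent lines."""
    return len(lines_touched(start_addr, size)) > 1

# A 64-byte vector starting 16 bytes into a line spans two lines;
# the same vector starting on a line boundary fits in one.
assert spans_adjacent_lines(0x1010, 64)
assert not spans_adjacent_lines(0x1000, 64)
```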
Abstract:
A method of performing vector operations on a semiconductor chip is described. The method includes performing a first vector instruction with a vector functional unit implemented on the semiconductor chip and performing a second vector instruction with the vector functional unit. The first vector instruction is a vector multiply add instruction. The second vector instruction is a vector leading zeros count instruction.
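The two instruction behaviors named in the abstract can be modeled per vector element as follows; the 32-bit element width and truncating multiply-add are illustrative assumptions, not the chip's specification.

```python
WIDTH = 32  # assumption: 32-bit vector elements

def lzcnt(x):
    """Count leading zero bits of a WIDTH-bit value."""
    for i in range(WIDTH - 1, -1, -1):
        if x & (1 << i):
            return WIDTH - 1 - i
    return WIDTH  # x == 0: all WIDTH bits are leading zeros

def vector_lzcnt(v):
    """Vector leading zeros count: one lzcnt per lane."""
    return [lzcnt(x) for x in v]

def vector_madd(va, vb, vc):
    """Vector multiply-add, element-wise, truncated to WIDTH bits."""
    mask = (1 << WIDTH) - 1
    return [(a * b + c) & mask for a, b, c in zip(va, vb, vc)]

assert vector_lzcnt([1, 0, 0x80000000]) == [31, 32, 0]
assert vector_madd([2, 3], [4, 5], [1, 1]) == [9, 16]
```

Sharing one vector functional unit between both instructions, as the abstract describes, amounts to routing each lane's inputs through either the multiply-add path or the count path.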