Abstract:
A method includes executing a workload on a graphics (GFX) core in a first mode, the GFX core comprising a plurality of Subslices, wherein each of the plurality of Subslices dissipates power. The method further includes calculating a number of clock cycles, T_first_mode, required for the GFX core to perform the workload in the first mode during a first decision window comprising a plurality of clock cycles, and calculating a number of clock cycles, T_second_mode, required for the GFX core to perform the workload in a second mode during the first decision window, wherein the second mode comprises executing the workload with fewer of the plurality of Subslices receiving power than when executing the workload in the first mode. It is then determined, based in part upon T_first_mode and T_second_mode, whether an energy savings is possible by transitioning the GFX core to the second mode.
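The decision step above can be illustrated with a minimal sketch. The abstract only says the determination is based "in part" on the two cycle counts, so the rest of the model here is an assumption: each mode is given a hypothetical average energy cost per clock cycle, and the second mode is considered worthwhile when its total energy over the decision window is lower. The function and parameter names are illustrative and do not come from the source.

```python
def window_energy(cycles: int, energy_per_cycle: float) -> float:
    """Total energy for a workload: cycle count times an assumed per-cycle cost."""
    return cycles * energy_per_cycle


def energy_savings_possible(
    t_first_mode: int,
    t_second_mode: int,
    energy_per_cycle_first: float,   # assumption: average cost with all Subslices powered
    energy_per_cycle_second: float,  # assumption: average cost with fewer Subslices powered
) -> bool:
    """Return True if running in the second mode would use less energy.

    Fewer powered Subslices lowers the per-cycle cost but may raise the
    cycle count, so both factors have to be compared together.
    """
    first = window_energy(t_first_mode, energy_per_cycle_first)
    second = window_energy(t_second_mode, energy_per_cycle_second)
    return second < first
```

For example, a workload that takes 1000 cycles at 3.0 units per cycle in the first mode and 1200 cycles at 2.0 units per cycle in the second mode would favor the second mode under this model (2400 versus 3000 units).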
Abstract:
The invention relates to performing adjustable time domain reflectometry (TDR). A TDR pulse count is set to a predetermined number (704). Next, a TDR pulse is transmitted through a cable (706). The width of the TDR pulse is a function of the multiplication of the TDR pulse count with the period of a TDR clock. It is then determined whether the TDR pulse has been reflected back (708). If the TDR pulse has not been reflected, the TDR pulse count is successively increased (714) to successively increase the width of the transmitted TDR pulse until a reflection is detected indicating an open in the cable (710). Furthermore, embodiments of the invention eliminate false detections of cable opens. Moreover, embodiments of the invention can be combined into a line interface unit (LIU) integrated circuit such that TDR functionality can be performed automatically without the use of a technician.
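The widening loop described above (steps 704 to 714) can be sketched as follows. This is only an illustration under stated assumptions: `reflection_detected` is a hypothetical callback standing in for the transmit and detect hardware, and `max_count` is an assumed upper bound added so that the loop terminates on a cable that never reflects. Neither appears in the abstract.

```python
from typing import Callable, Optional


def pulse_width(pulse_count: int, clock_period_s: float) -> float:
    """Pulse width is the TDR pulse count multiplied by the TDR clock period."""
    return pulse_count * clock_period_s


def find_reflecting_count(
    reflection_detected: Callable[[float], bool],  # hypothetical hardware hook
    clock_period_s: float,
    initial_count: int = 1,   # the "predetermined number" of step 704
    max_count: int = 64,      # assumption: bound so the search always ends
) -> Optional[int]:
    """Widen the pulse step by step until a reflection is seen.

    Returns the pulse count at which a reflection was detected,
    or None if no reflection was seen up to max_count.
    """
    count = initial_count
    while count <= max_count:
        width = pulse_width(count, clock_period_s)  # transmit a pulse this wide (706)
        if reflection_detected(width):              # reflected back? (708)
            return count                            # reflection indicates an open (710)
        count += 1                                  # widen the next pulse (714)
    return None
```

A caller would supply the real detection routine in place of the callback; in a test, a simple threshold function on the width is enough to exercise the loop.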
Abstract:
According to some embodiments, performance bottlenecks that arise in particular resources within a graphics processor unit may be alleviated by dynamically rebalancing workloads among the resources, with the goal of removing the current performance bottleneck while maintaining power dissipation within a currently allocated power budget. In some embodiments, this may be achieved by defining a separate clock domain for each of the plurality of graphics processor resources whose performance may then be rebalanced.
Abstract:
Power gating a portion of a graphics processor may be used to improve performance or to achieve a power budget. A processor granularity, such as a slice or subslice, may be gated.
Abstract:
An apparatus for managing a power consumption of processor or memory circuitry comprising a plurality of processing or memory functional units may be provided. The processor or memory circuitry is arranged to receive electrical power from an alternating current, AC, power source or a battery. The apparatus comprises processing circuitry to: based on an indication that the processor or memory circuitry is receiving electrical power from the AC power source, selectively cause operational electrical power to be provided to a first number of the functional units of the processor or memory circuitry. The processing circuitry is further to: based on an indication that the circuitry is receiving electrical power from the battery, selectively cause operational electrical power to be provided to a second number of the functional units of the processor or memory circuitry, the second number being less than the first number.
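The selection rule above can be sketched in a few lines. The enum, the function name, and the unit counts are hypothetical; the only constraint taken from the abstract is that the number of functional units powered on battery is less than the number powered on AC.

```python
from enum import Enum


class PowerSource(Enum):
    AC = "ac"
    BATTERY = "battery"


def units_to_power(source: PowerSource, ac_units: int, battery_units: int) -> int:
    """Return how many functional units should receive operational power.

    ac_units and battery_units are illustrative configuration values;
    the battery count must be strictly smaller than the AC count.
    """
    if battery_units >= ac_units:
        raise ValueError("battery_units must be less than ac_units")
    if source is PowerSource.AC:
        return ac_units
    return battery_units
```

With eight units on AC and four on battery, `units_to_power(PowerSource.BATTERY, 8, 4)` yields four powered units.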
Abstract:
In one embodiment, execution units, graphics cores, or graphics sub-cores can be dynamically scaled across a frame of graphics operations. Available execution units within each graphics core may be scaled using utilization metrics such as the current utilization rate of the execution units and the submission of new draw calls. In one embodiment, one or more of the sub-cores within each graphics core may be enabled or disabled based on current or past utilization of the sub-cores for a set of current graphics operations.
Abstract:
Described is an apparatus comprising a first circuitry and a second circuitry. The first circuitry may process a sequence of Graphics Processing Unit (GPU) commands including an instruction carrying a flag that indicates a workload characteristic corresponding with the sequence of GPU commands. The second circuitry may initiate a power-directed parameter adjustment based upon the flag.
Abstract:
A processor may operate at a first frequency level for a first time interval. The processor may automatically transition to a sleep state from the first frequency level after the first time interval. The processor then automatically transitions from the sleep state back to the first frequency level after a second time interval. As a result, the processor may operate at a reduced power consumption and higher performance.
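The effect of alternating between an active interval and a sleep interval on power can be shown with a time-weighted average. This is a generic duty-cycle calculation rather than anything stated in the abstract; the power figures and parameter names are assumptions chosen for illustration.

```python
def average_power(
    p_active_w: float,  # assumption: power drawn at the first frequency level
    t_active_s: float,  # first time interval (active)
    p_sleep_w: float,   # assumption: power drawn in the sleep state
    t_sleep_s: float,   # second time interval (asleep)
) -> float:
    """Time-weighted average power over one active-plus-sleep period."""
    period = t_active_s + t_sleep_s
    if period <= 0:
        raise ValueError("total period must be positive")
    return (p_active_w * t_active_s + p_sleep_w * t_sleep_s) / period
```

If the processor draws 10 W while active and nothing while asleep, equal active and sleep intervals give an average of 5 W under this model.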
Abstract:
Methods, systems and computer system products are disclosed that allow audio decryption and decoding to be performed on a graphics engine instead of on a host processor. This may be accomplished without having to modify media application software. A down codec function driver exposes a down codec to a media application, which may then send encrypted and encoded audio data to the down codec function driver. The down codec function driver may then redirect the audio data to a graphics driver. The graphics driver may then pass the audio data to a graphics engine. The graphics engine may then decrypt and decode the audio data. The decrypted and decoded audio data may be returned to the graphics driver, which may then send the decrypted and decoded audio data to the function driver. The function driver may then pass the decrypted and decoded audio data to the down codec for rendering.
Abstract:
Techniques are disclosed that involve the processing of audio streams. For instance, a host processing platform may receive a content stream that includes an encoded audio stream. In turn, a graphics engine produces from it a decoded audio stream. This producing may involve the graphics engine performing various operations, such as an entropy decoding operation, an inverse quantization operation, and an inverse discrete cosine transform operation. In embodiments, the content stream may further include an encoded video stream. Thus the graphics engine may produce from it a decoded video stream. This audio and video decoding may be performed in parallel.