-
Publication No.: US20190310854A1
Publication Date: 2019-10-10
Application No.: US15946719
Filing Date: 2018-04-05
Applicant: Apple Inc.
Inventor: Eric Bainville , Tal Uliel , Jeffry E. Gonion , Ali Sazegari , Erik K. Norden
Abstract: In an embodiment, a computation engine may perform computations on input vectors having vector elements of a first precision and data type. The computation engine may convert the vector elements from the first precision to a second precision and may also interleave the vector elements as specified by an instruction issued by the processor to the computation engine. The interleave may be based on a ratio of a result precision and the second precision. An extract instruction may be supported to extract results from the computations and convert and deinterleave the vector elements to provide a compact result in a desired order.
-
Publication No.: US20190034333A1
Publication Date: 2019-01-31
Application No.: US15663115
Filing Date: 2017-07-28
Applicant: Apple Inc.
Inventor: Ali Sazegari , Charles E. Tucker , Jeffry E. Gonion , Gerard R. Williams, III , Chris Cheng-Chieh Lee
CPC classification number: G06F12/08 , G06F12/00 , G06F12/0886 , G06F13/00 , G06F2212/1016 , G06F2212/401 , H03M7/30 , H03M7/3088
Abstract: Systems, apparatuses, and methods for efficiently moving data for storage and processing are described. In various embodiments, a compression unit within a processor includes multiple hardware lanes, selects two or more input words to compress, and assigns them to two or more of the multiple hardware lanes. As each assigned input word is processed, each word is compared to an entry of a plurality of entries of a table. If it is determined that each of the assigned input words indexes the same entry of the table, the hardware lane with the oldest input word generates a single read request for the table entry and the hardware lane with the youngest input word generates a single write request for updating the table entry upon completing compression. Each hardware lane generates a compressed packet based on its assigned input word.
-
Publication No.: US09792109B2
Publication Date: 2017-10-17
Application No.: US14941269
Filing Date: 2015-11-13
Applicant: Apple Inc.
Inventor: Eric Bainville , Ali Sazegari
Abstract: A novel method for updating a bundle of files from an update package that minimizes the free space requirement on disk is provided. The method segments the update of the entire package and performs the update in multiple passes. The method divides the archive payload of the entire update package into pieces and expands one piece of the archive in each pass. At the end of each pass, some embodiments remove from the disk the archive piece expanded in that pass in order to free additional space for the next pass.
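The space saving described in this abstract can be illustrated with a toy model. The sketch below is not the patented method: the function name `peak_extra_space`, the piece sizes, and the assumption that the whole archive is already on disk before expansion begins are all hypothetical, chosen only to show why deleting each archive piece after its pass lowers the peak disk requirement.

```python
def peak_extra_space(pieces, remove_after_each_pass):
    """Return the peak additional disk space needed while expanding an archive.

    pieces: list of (archived_size, expanded_size) tuples, one per pass.
    Assumption (hypothetical model): the full archive is already on disk,
    so usage is measured relative to that starting point.
    """
    used = 0   # net space consumed relative to the starting state
    peak = 0
    for archived_size, expanded_size in pieces:
        used += expanded_size          # expand one piece in this pass
        peak = max(peak, used)
        if remove_after_each_pass:
            used -= archived_size      # delete the piece just expanded
    return peak


# Three hypothetical pieces: 40 units archived, 100 units expanded each.
pieces = [(40, 100), (40, 100), (40, 100)]
single_pass_peak = peak_extra_space(pieces, remove_after_each_pass=False)
multi_pass_peak = peak_extra_space(pieces, remove_after_each_pass=True)
```

With these made-up sizes the peak drops from 300 to 220 units, because the space held by the first two archive pieces is reclaimed before the later pieces are expanded.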
-
Publication No.: US20240103858A1
Publication Date: 2024-03-28
Application No.: US18045928
Filing Date: 2022-10-12
Applicant: Apple Inc.
Inventor: Ali Sazegari , Matthew L. Badin
CPC classification number: G06F9/3001 , G06F9/3013 , G06F17/16
Abstract: Techniques are disclosed relating to instruction set architecture support for matrix manipulations. In disclosed embodiments, front-end circuitry is configured to fetch and decode a matrix multiply instruction for execution, including to encode a given matrix input operand of the matrix multiply instruction to identify one or more vector registers defined according to an instruction set architecture. In some embodiments, datapath circuitry is configured to execute the matrix multiply instruction, where during execution of the instruction, the one or more vector registers corresponding to the given matrix operand are mapped within the datapath circuitry to at least two dimensions of the given matrix operand. In some embodiments, power management circuitry is configured to, during execution of the instruction, operate at least a portion of the front-end circuitry in a reduced-power mode. Disclosed techniques may advantageously increase throughput and reduce power consumption, relative to traditional implementations using vector operations.
-
Publication No.: US20240094989A1
Publication Date: 2024-03-21
Application No.: US18045577
Filing Date: 2022-10-11
Applicant: Apple Inc.
Inventor: Ali Sazegari , Segev Elmalem , O-Cheng Chang , Jingwei Zhang , Ido Soffair , Aaftab A. Munshi
IPC: G06F7/487
CPC classification number: G06F7/4876
Abstract: Techniques are disclosed relating to dedicated power function circuitry for a floating-point power instruction. In some embodiments, execution circuitry is configured to execute a floating-point power instruction to evaluate the power function $x^y$ as $2^{y \log_2 x}$. In some embodiments, base-2 logarithm circuitry is configured to evaluate a base-2 logarithm for a first input (e.g., $\log_2 x$) by determining coefficients for a polynomial function and evaluating the polynomial function using the determined coefficients and the first input. In some embodiments, multiplication circuitry multiplies the base-2 logarithm result by a second input to generate a multiplication result. In some embodiments, base-2 power function circuitry is configured to evaluate a base-2 power function for the multiplication result. Disclosed techniques may advantageously increase performance and reduce power consumption of floating-point power function operations with reasonable area and accuracy, relative to traditional techniques.
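The three stages named in this abstract (base-2 logarithm, multiplication, base-2 power) can be mirrored in software. The following is a minimal sketch of the identity only, not of the circuitry: the function name `pow_via_log2` is hypothetical, the standard library's `math.log2` stands in for the polynomial-based logarithm the abstract describes, and only positive `x` is handled.

```python
import math


def pow_via_log2(x: float, y: float) -> float:
    """Evaluate x**y as 2**(y * log2(x)), for x > 0 only (sketch)."""
    if x <= 0.0:
        raise ValueError("this sketch only handles x > 0")
    log2_x = math.log2(x)    # stage 1: base-2 logarithm of the first input
    product = y * log2_x     # stage 2: multiply by the second input
    return 2.0 ** product    # stage 3: base-2 power of the product


result = pow_via_log2(3.0, 2.5)   # close to 3.0 ** 2.5, up to rounding
```

A real implementation would also need to handle zero, negative bases, infinities, and NaN, and the rounding error of the intermediate product grows with the magnitude of `y * log2(x)`.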
-
Publication No.: US11822921B2
Publication Date: 2023-11-21
Application No.: US18054017
Filing Date: 2022-11-09
Applicant: Apple Inc.
Inventor: Eric Bainville , Ali Sazegari
CPC classification number: G06F9/30145 , G06F9/30036 , G06F9/30098 , H03M7/3059 , H03M7/3082 , G06F9/30018
Abstract: In an embodiment, a processor supports one or more compression assist instructions which may be employed in compression software to improve the performance of the processor when performing compression/decompression. That is, the compression/decompression task may be performed more rapidly and consume less power when the compression assist instructions are employed than when they are not. In some cases, the cost of a more effective, more complex compression algorithm may be reduced to the cost of a less effective, less complex compression algorithm.
-
Publication No.: US20230078235A1
Publication Date: 2023-03-16
Application No.: US17588114
Filing Date: 2022-01-28
Applicant: Apple Inc.
Inventor: Sorin Constantin Cismas , Ali Sazegari , Christian Thomas Martelock , Guy Cote
Abstract: Compression techniques are described. In an embodiment, a first plane of sensor data is accessed, the first plane of sensor data is divided into a plurality of slices, and each sample in each slice from the plurality of slices is encoded, where encoding a sample includes computing a median-based prediction for the sample, computing an error for the sample comprising a difference between the sample and the computed median-based prediction, determining a context for the sample, selecting a model for the sample by using the determined context, and encoding the computed error by using the selected model.
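The prediction and error steps of this abstract can be sketched as follows. The abstract does not say which median predictor is used, so this example swaps in an assumption: the median of the left neighbor, the above neighbor, and `left + above - above_left`, a form used in some lossless image coders. The function names, the zero padding at the borders, and the sample values are hypothetical; context determination, model selection, and entropy coding are omitted.

```python
def median_prediction(left, above, above_left):
    """Median of three candidate predictions (assumed predictor, see lead-in)."""
    return sorted((left, above, left + above - above_left))[1]


def prediction_errors(plane):
    """Return per-sample errors (sample minus prediction) for a 2D plane.

    plane: list of rows of integers. Missing neighbors at the borders
    are treated as 0 (an assumption of this sketch).
    """
    errors = []
    for r, row in enumerate(plane):
        row_errors = []
        for c, sample in enumerate(row):
            left = row[c - 1] if c > 0 else 0
            above = plane[r - 1][c] if r > 0 else 0
            above_left = plane[r - 1][c - 1] if r > 0 and c > 0 else 0
            row_errors.append(sample - median_prediction(left, above, above_left))
        errors.append(row_errors)
    return errors


errors = prediction_errors([[10, 12], [11, 13]])
# errors == [[10, 2], [1, 1]]
```

Smooth data produces small errors away from the borders, which is what makes the later modeling and encoding stages effective.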
-
Publication No.: US11086625B2
Publication Date: 2021-08-10
Application No.: US16566344
Filing Date: 2019-09-10
Applicant: Apple Inc.
Inventor: Eric Bainville , Ali Sazegari
Abstract: In an embodiment, a processor supports one or more compression assist instructions which may be employed in compression software to improve the performance of the processor when performing compression/decompression. That is, the compression/decompression task may be performed more rapidly and consume less power when the compression assist instructions are employed than when they are not. In some cases, the cost of a more effective, more complex compression algorithm may be reduced to the cost of a less effective, less complex compression algorithm.
-
Publication No.: US10990401B2
Publication Date: 2021-04-27
Application No.: US16837631
Filing Date: 2020-04-01
Applicant: Apple Inc.
Inventor: Tal Uliel , Eric Bainville , Jeffry E. Gonion , Ali Sazegari
IPC: G06F9/302 , G06F9/312 , G06F15/76 , G06F17/16 , G06F7/52 , G06F9/38 , G06F15/80 , G06F9/30 , G06F7/544
Abstract: In an embodiment, a computation engine may perform dot product computations on input vectors. The dot product operation may have a first operand and a second operand, and the dot product may be performed on a subset of the vector elements in the first operand and each of the vector elements in the second operand. The subset of vector elements may be separated in the first operand by a stride that skips one or more elements between each element to which the dot product operation is applied. More particularly, in an embodiment, the input operands of the dot product operation may be a first vector having second vectors as elements, and the stride may select a specified element of each second vector.
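The strided selection described in this abstract can be sketched in a few lines. This is an illustrative software model, not the computation engine's instruction: the function name `strided_dot` and the `offset` parameter (which picks the "specified element" of each inner vector) are hypothetical, and the first operand is assumed to be stored as a flat list of equal-length inner vectors.

```python
def strided_dot(first, second, stride, offset=0):
    """Dot product of every `stride`-th element of `first` with all of `second`.

    first:  flat list holding inner vectors of length `stride` back to back.
    second: list with one element per inner vector of `first`.
    offset: index of the element selected from each inner vector.
    """
    selected = first[offset::stride]
    if len(selected) != len(second):
        raise ValueError("second operand must match the number of selected elements")
    return sum(a * b for a, b in zip(selected, second))


# Four inner vectors of two elements each: (1,2), (3,4), (5,6), (7,8).
first = [1, 2, 3, 4, 5, 6, 7, 8]
value = strided_dot(first, [1, 1, 1, 1], stride=2, offset=1)
# value == 20, i.e. 2 + 4 + 6 + 8
```

Changing `offset` selects a different element of each inner vector without rearranging the first operand in memory.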
-
Publication No.: US10769065B2
Publication Date: 2020-09-08
Application No.: US16436635
Filing Date: 2019-06-10
Applicant: Apple Inc.
Inventor: Ali Sazegari , Charles E. Tucker , Jeffry E. Gonion , Gerard R. Williams, III , Chris Cheng-Chieh Lee
Abstract: Systems, apparatuses, and methods for efficiently moving data for storage and processing are described. In various embodiments, a compression unit within a processor includes multiple hardware lanes, selects two or more input words to compress, and assigns them to two or more of the multiple hardware lanes. As each assigned input word is processed, each word is compared to an entry of a plurality of entries of a table. If it is determined that each of the assigned input words indexes the same entry of the table, the hardware lane with the oldest input word generates a single read request for the table entry and the hardware lane with the youngest input word generates a single write request for updating the table entry upon completing compression. Each hardware lane generates a compressed packet based on its assigned input word.