-
Publication No.: US10592239B2
Publication Date: 2020-03-17
Application No.: US16423702
Filing Date: 2019-05-28
Applicant: Apple Inc.
Inventor: Eric Bainville , Tal Uliel , Erik Norden , Jeffry E. Gonion , Ali Sazegari
Abstract: In an embodiment, a matrix computation engine is configured to perform matrix computations (e.g. matrix multiplications). The matrix computation engine may perform numerous matrix computations in parallel, in an embodiment. More particularly, the matrix computation engine may be configured to perform numerous multiplication operations in parallel on input matrix elements, generating resulting matrix elements. In an embodiment, the matrix computation engine may be configured to accumulate results in a result memory, performing multiply-accumulate operations for each matrix element of each matrix.
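The multiply-accumulate behavior described above can be sketched in a few lines. This is a minimal illustration, not the engine's actual hardware design; the function name `matmul_accumulate` and the square-matrix assumption are hypothetical.

```python
def matmul_accumulate(a, b, acc):
    """Multiply square matrices a and b element-by-element in parallel-style
    loops, accumulating each product into the result memory acc."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # multiply-accumulate: one product folded into the result
                acc[i][j] += a[i][k] * b[k][j]
    return acc
```

In hardware the inner products would execute in parallel; the serial loops here only show the arithmetic being accumulated.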
-
Publication No.: US10362319B2
Publication Date: 2019-07-23
Application No.: US15665404
Filing Date: 2017-07-31
Applicant: Apple Inc.
Inventor: Lars M. Lindberg , Paul S. Chang , Ali Sazegari
IPC: H04N19/186 , H04N19/182 , H04N19/60 , H04N19/61 , H04N19/593
Abstract: Disclosed are techniques for pre-processing an image for compression, e.g., one that includes a plurality of pixels, where each pixel is composed of sub-pixels that include at least an alpha sub-pixel. First, the alpha sub-pixels are separated into a first data stream. Next, invertible transformations are applied to the remaining sub-pixels to produce transformed sub-pixels. Next, for each row of pixels: (i) a predictive function is identified that yields a smallest prediction differential total for the row, (ii) an identifier of the predictive function is provided to a second data stream, and (iii) the transformed sub-pixels of the pixels in the row are converted into prediction differentials based on the predictive function. Additionally, the prediction differentials for each of the pixels are encoded into first and second bytes that are provided to third and fourth data streams, respectively. In turn, the various data streams are compressed into a compressed image.
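The per-row predictor selection in step (i) can be sketched as follows. This is an illustrative sketch only: the two predictors shown (no prediction, and left-neighbor prediction) and the function name `best_row_prediction` are assumptions, not the patent's actual predictor set.

```python
def best_row_prediction(row):
    """For one row of sub-pixel values, try each candidate predictor and
    return the id of the one with the smallest total absolute prediction
    differential, along with the row's differentials under it."""
    predictors = {
        0: lambda r, i: 0,                     # no prediction
        1: lambda r, i: r[i - 1] if i else 0,  # predict from left neighbor
    }
    best = None
    for pid, pred in predictors.items():
        diffs = [v - pred(row, i) for i, v in enumerate(row)]
        total = sum(abs(d) for d in diffs)
        if best is None or total < best[2]:
            best = (pid, diffs, total)
    return best[0], best[1]
```

A smooth ramp of values favors the left-neighbor predictor, whose differentials are small and compress well.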
-
Publication No.: US20180121199A1
Publication Date: 2018-05-03
Application No.: US15629126
Filing Date: 2017-06-21
Applicant: Apple Inc.
Inventor: Tal Uliel , Jeffry E. Gonion , Ali Sazegari , Eric Bainville
CPC classification number: G06F9/30036 , G06F7/483 , G06F7/485 , G06F7/4876 , G06F7/5443 , G06F9/30014
Abstract: In an embodiment, a processor may implement a fused multiply-add (FMA) instruction that accepts vector operands having vector elements with a first precision, and performs both the multiply and add operations at a higher precision. The add portion of the operation may add adjacent pairs of multiplication results from the multiply portion of the operation, which may allow the result to be stored in a vector register of the same overall length as the input vector registers but with fewer, higher precision vector elements, in an embodiment. Additionally, the overall operation may have high accuracy because of the higher precision throughout the operation.
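The pairwise-add structure of this FMA variant can be illustrated numerically. A minimal sketch, assuming an even-length input vector; the name `fma_pairwise` is hypothetical, and Python floats stand in for the "higher precision" intermediate format.

```python
def fma_pairwise(a, b, acc):
    """Multiply a and b element-wise at higher precision, then add each
    adjacent pair of products into the accumulator; the result has half
    as many (higher-precision) elements as the inputs."""
    prods = [float(x) * float(y) for x, y in zip(a, b)]
    return [acc[i] + prods[2 * i] + prods[2 * i + 1]
            for i in range(len(prods) // 2)]
```

Four input elements thus produce two output elements, matching a result register of the same overall bit length but half the element count.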
-
Publication No.: US20170357894A1
Publication Date: 2017-12-14
Application No.: US15619348
Filing Date: 2017-06-09
Applicant: Apple Inc.
Inventor: Eric Bainville , Ali Sazegari
CPC classification number: G06N3/063 , G06F17/153 , G06N3/0454
Abstract: Convolution processing performance in digital image processing is enhanced using a data packing process for convolutional layers in deep neural networks and corresponding computation kernel code. The data packing process includes input and weight packing of the input channels of data into a contiguous block of memory in preparation for convolution. In addition, the data packing process includes an output unpacking process for unpacking convolved data into output channel blocks of memory, where the input channel block and output channel block sizes are configured for efficient data transfer and data reuse during convolution. The input packing and output packing processes advantageously improve convolution performance and conserve power while satisfying the real-time demands of digital image processing.
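The channel-packing idea can be sketched as interleaving groups of input channels so that each position's block of channel values sits contiguously in memory. This is a simplified illustration under stated assumptions (all channels the same length, channel count divisible by the block size); the function name `pack_input_channels` and the exact layout are hypothetical, not the patented format.

```python
def pack_input_channels(channels, block):
    """Pack input channels into contiguous blocks of `block` channels:
    within each block, the values for each spatial position are
    interleaved across channels, so one position's channel values
    are adjacent in memory."""
    packed = []
    for start in range(0, len(channels), block):
        group = channels[start:start + block]
        for pos in range(len(group[0])):
            for ch in group:
                packed.append(ch[pos])
    return packed
```

A convolution kernel reading this layout streams one position's channel block with a single contiguous load instead of strided accesses across channel planes.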
-
Publication No.: US20170090902A1
Publication Date: 2017-03-30
Application No.: US14941229
Filing Date: 2015-11-13
Applicant: Apple Inc.
Inventor: Eric Bainville , Ali Sazegari
IPC: G06F9/445
Abstract: A novel software updating method is provided. A target file is divided into segments, where some segments are updated by patching, while other segments are updated by archiving. The segmentation of the update allows very large files such as DYLD shared caches to be patched in-place, i.e., by using free space available within the file to perform patching rather than requiring enough free space on disk to store both the new version and the old version of the file. The segmentation of the update also allows each segment to be updated individually by the optimal update method (copy, patch, or archive) so that the size of the update file can be minimized.
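The per-segment method selection can be sketched as a simple decision rule. A minimal sketch under assumptions of my own: the similarity threshold of 0.5 and the function name `choose_update_method` are illustrative, not from the patent.

```python
def choose_update_method(old_segment, new_segment):
    """Pick the cheapest update method for one segment: 'copy' if the
    segment is unchanged, 'patch' if it is mostly similar to the old
    version, and 'archive' (ship the new bytes) otherwise."""
    if old_segment == new_segment:
        return "copy"
    same = sum(1 for a, b in zip(old_segment, new_segment) if a == b)
    similarity = same / max(len(old_segment), len(new_segment))
    return "patch" if similarity >= 0.5 else "archive"
```

Choosing per segment rather than per file is what keeps the update payload small: unchanged segments cost nothing, and only heavily rewritten segments are archived whole.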
-
Publication No.: US20230121984A1
Publication Date: 2023-04-20
Application No.: US18054017
Filing Date: 2022-11-09
Applicant: Apple Inc.
Inventor: Eric Bainville , Ali Sazegari
Abstract: In an embodiment, a processor supports one or more compression assist instructions which may be employed in compression software to improve the performance of the processor when performing compression/decompression. That is, the compression/decompression task may be performed more rapidly and consume less power when the compression assist instructions are employed than when they are not. In some cases, the cost of a more effective, more complex compression algorithm may be reduced to the cost of a less effective, less complex compression algorithm.
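One kind of primitive such assist instructions could accelerate is the match-length scan at the heart of LZ-style compressors. The sketch below is an assumption about the style of operation involved, not the patent's actual instruction set; `match_length` is a hypothetical name.

```python
def match_length(data, pos, ref, limit):
    """Count how many bytes starting at `pos` match the earlier
    occurrence starting at `ref`, up to `limit` bytes -- the kind of
    inner loop a compression assist instruction could replace."""
    n = 0
    while n < limit and pos + n < len(data) and data[ref + n] == data[pos + n]:
        n += 1
    return n
```

In software this loop runs byte by byte; a single assist instruction doing the same comparison over a wide register is how the more complex algorithm's cost can shrink toward the simpler one's.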
-
Publication No.: US11042373B2
Publication Date: 2021-06-22
Application No.: US16928752
Filing Date: 2020-07-14
Applicant: Apple Inc.
Inventor: Eric Bainville , Jeffry E. Gonion , Ali Sazegari , Gerard R. Williams, III
Abstract: In an embodiment, a computation engine is configured to perform vector multiplications, producing either vector results or outer product (matrix) results. The instructions provided to the computation engine specify a matrix mode or a vector mode for the instructions. The computation engine performs the specified operation. The computation engine may perform numerous computations in parallel, in an embodiment. In an embodiment, the instructions may also specify an offset within the input memories, providing additional flexibility in the location of operands. More particularly, the computation engine may be configured to perform numerous multiplication operations in parallel and to accumulate results in a result memory, performing multiply-accumulate operations for each matrix/vector element in the targeted locations of the output memory.
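The two modes differ only in the shape of the accumulated result, which a short sketch makes concrete. The function name `compute` and the mode strings are illustrative assumptions, not the engine's instruction encoding.

```python
def compute(mode, x, y, acc):
    """Vector mode: element-wise multiply-accumulate into a vector.
    Matrix mode: accumulate the outer product of x and y into a matrix."""
    if mode == "vector":
        return [a + xi * yi for a, xi, yi in zip(acc, x, y)]
    # matrix mode: every (i, j) pair of input elements is multiplied
    return [[acc[i][j] + x[i] * y[j] for j in range(len(y))]
            for i in range(len(x))]
```

The same multiplier array can serve both modes; only the routing of products into the result memory changes.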
-
Publication No.: US20210132942A1
Publication Date: 2021-05-06
Application No.: US16949828
Filing Date: 2020-11-16
Applicant: Apple Inc.
Inventor: Eric Bainville , Ali Sazegari
Abstract: A novel software updating method is provided. A target file is divided into segments, where some segments are updated by patching, while other segments are updated by archiving. The segmentation of the update allows very large files such as DYLD shared caches to be patched in-place, i.e., by using free space available within the file to perform patching rather than requiring enough free space on disk to store both the new version and the old version of the file. The segmentation of the update also allows each segment to be updated individually by the optimal update method (copy, patch, or archive) so that the size of the update file can be minimized.
-
Publication No.: US20210072994A1
Publication Date: 2021-03-11
Application No.: US16566344
Filing Date: 2019-09-10
Applicant: Apple Inc.
Inventor: Eric Bainville , Ali Sazegari
Abstract: In an embodiment, a processor supports one or more compression assist instructions which may be employed in compression software to improve the performance of the processor when performing compression/decompression. That is, the compression/decompression task may be performed more rapidly and consume less power when the compression assist instructions are employed than when they are not. In some cases, the cost of a more effective, more complex compression algorithm may be reduced to the cost of a less effective, less complex compression algorithm.
-
Publication No.: US10809869B2
Publication Date: 2020-10-20
Application No.: US15700113
Filing Date: 2017-09-09
Applicant: Apple Inc.
Inventor: Lars M. Lindberg , Paul S. Chang , Ali Sazegari
IPC: G06F3/0481 , H04N19/17 , H04L29/06 , H04N19/167 , H04N19/117 , H04N19/23 , H04N19/174
Abstract: Disclosed are techniques for pre-processing layered images prior to compression and distribution. According to some embodiments, a technique can include accessing at least two images of a layered image: (i) a background image, and (ii) one or more layer images. Next, a flattened image is generated based on the at least two images. Next, respective one or more delta layer images are generated for the one or more layer images by: for at least one pixel of each layer image having (i) an alpha sub-pixel set to fully opaque, and (ii) a first color property equivalent to a second color property of a corresponding pixel within the flattened image: setting bits of the first color property of the pixel to the same value (e.g., zero (0) or one (1)). Finally, the one or more delta layer images are compressed and provided to a destination computing device.
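The delta-layer construction can be sketched per pixel: where a layer pixel is fully opaque and matches the flattened image, its color bits are set to a uniform value so the delta compresses well. A minimal sketch assuming 8-bit RGBA tuples; the function name `make_delta_layer` and the choice of zero as the uniform value are illustrative.

```python
def make_delta_layer(layer, flattened):
    """Build a delta layer image: any pixel that is fully opaque
    (alpha == 255) and whose color matches the corresponding flattened
    pixel gets its color bits zeroed; other pixels pass through."""
    delta = []
    for (r, g, b, a), (fr, fg, fb, fa) in zip(layer, flattened):
        if a == 255 and (r, g, b) == (fr, fg, fb):
            delta.append((0, 0, 0, a))  # redundant color -> uniform value
        else:
            delta.append((r, g, b, a))
    return delta
```

Long runs of identical zeroed pixels are exactly what downstream compressors exploit, which is why the redundant color data is normalized rather than left as-is.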