Abstract:
A processing system includes a memory and a processing logic operatively coupled to the memory. The processing logic includes a message scheduling module selectively operating in one of a SHA mode or an SM3 mode to generate a sequence of message words based on an incoming message. The processing logic also includes a round computation module selectively operating in one of the SHA mode or the SM3 mode to perform at least one of a message expansion or a message compression based on at least one message word of the sequence of message words.
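As a rough illustration of a message scheduler that switches between two modes, the sketch below expands one 16-word block into the full word sequence. It assumes the "SHA mode" means SHA-256 (the abstract does not name a variant) and uses the standard SHA-256 and SM3 expansion recurrences. The function names and the string `mode` argument are hypothetical, and SM3's second word sequence (W') and the round computation are omitted.

```python
MASK = 0xFFFFFFFF  # both schedules operate on 32-bit words


def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def sha256_schedule(block_words):
    """Expand 16 input words into the 64 SHA-256 message words."""
    w = list(block_words)
    for t in range(16, 64):
        s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
    return w


def sm3_schedule(block_words):
    """Expand 16 input words into the 68 SM3 message words W[0..67]."""
    w = list(block_words)
    for j in range(16, 68):
        x = w[j - 16] ^ w[j - 9] ^ rotl(w[j - 3], 15)
        p1 = x ^ rotl(x, 15) ^ rotl(x, 23)  # SM3 permutation P1
        w.append(p1 ^ rotl(w[j - 13], 7) ^ w[j - 6])
    return w


def message_schedule(block_words, mode):
    """Hypothetical mode switch: one entry point, two expansion rules."""
    if len(block_words) != 16:
        raise ValueError("expected one 512-bit block as 16 words")
    if mode == "SHA":
        return sha256_schedule(block_words)
    if mode == "SM3":
        return sm3_schedule(block_words)
    raise ValueError("mode must be 'SHA' or 'SM3'")
```

Both recurrences read from a sliding window of the 16 most recent words, which is one plausible reason a single datapath could serve both modes.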
Abstract:
Embodiments include apparatuses, methods, and systems for a physically unclonable function (PUF) circuit. The PUF circuit may include an array of PUF cells to generate respective PUF bits of an encryption code. Individual PUF cells may include first and second inverters cross-coupled between a bit node and a bit bar node. The individual PUF cells may further include a first pre-charge transistor coupled to the bit node and configured to receive a clock signal via a first clock path, and a second pre-charge transistor coupled to the bit bar node and configured to receive the clock signal via a second clock path. Features and techniques of the PUF cells are disclosed to improve the stability and/or bias strength of the PUF cells, to generate a dark bit mask for the array of PUF cells, and to improve resilience to probing attacks. Other embodiments may be described and claimed.
Abstract:
An apparatus and method for performing parallel decoding of prefix codes such as Huffman codes. For example, one embodiment of an apparatus comprises: a first decompression module to perform a non-speculative decompression of a first portion of a prefix code payload comprising a first plurality of symbols; and a second decompression module to perform speculative decompression of a second portion of the prefix code payload comprising a second plurality of symbols concurrently with the non-speculative decompression performed by the first decompression module.
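The idea of pairing a non-speculative decoder with a speculative one can be sketched in software as follows. This is only an illustration under stated assumptions: the four-symbol code table, the function names, and the way the speculative output is validated (accept it from the first bit position that the non-speculative decoder also reaches) are all hypothetical, since the abstract does not describe a validation scheme. The two decoders run one after the other here; in hardware they would run concurrently.

```python
# Hypothetical prefix code used only for this example.
CODE = {"0": "a", "10": "b", "110": "c", "111": "d"}
ENCODE = {sym: bits for bits, sym in CODE.items()}


def encode(message):
    """Encode a string of symbols into a string of '0'/'1' characters."""
    return "".join(ENCODE[sym] for sym in message)


def decode_from(bits, start, stop):
    """Decode symbols that begin at bit positions in [start, stop).

    Returns (list of (start_bit, symbol), position after the last symbol).
    """
    out, pos = [], start
    while pos < stop:
        end = pos + 1
        while end <= len(bits) and bits[pos:end] not in CODE:
            end += 1
        if end > len(bits):
            break  # incomplete trailing codeword
        out.append((pos, CODE[bits[pos:end]]))
        pos = end
    return out, pos


def parallel_decode(bits, split):
    """Decode `bits` with a speculative second decoder starting at `split`."""
    # Non-speculative decoder: starts at a known symbol boundary (bit 0).
    first, pos = decode_from(bits, 0, split)
    # Speculative decoder: `split` may fall in the middle of a codeword.
    spec, _ = decode_from(bits, split, len(bits))
    spec_index = {start: i for i, (start, _) in enumerate(spec)}

    symbols = [sym for _, sym in first]
    # Keep decoding non-speculatively until we land on a position the
    # speculative decoder also treated as a symbol start.
    while pos < len(bits) and pos not in spec_index:
        more, pos = decode_from(bits, pos, pos + 1)  # exactly one symbol
        if not more:
            break
        symbols += [sym for _, sym in more]
    if pos in spec_index:
        # From here on, the speculative output matches the true decode.
        symbols += [sym for _, sym in spec[spec_index[pos]:]]
    return symbols
```

For example, `parallel_decode(encode("abcdabcdab"), 7)` recovers the original symbols even though bit 7 is not a codeword boundary, because the speculative decoder resynchronizes with the true boundaries after a few symbols.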
Abstract:
In an embodiment, a processor includes a compression domain threshold filter coupled to a plurality of cores. The compression domain threshold filter is to: receive a sample vector of compressed data to be filtered; calculate, based at least on a first subset of the elements of the sample vector, an estimated upper bound value of a dot product of the sample vector and a steering vector; determine whether the estimated upper bound value of the dot product satisfies a filter threshold value; and in response to a determination that the estimated upper bound value of the dot product does not satisfy the filter threshold value, discard the sample vector without completion of a calculation of the dot product of the sample vector and the steering vector. Other embodiments are described and claimed.
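A minimal sketch of the early-discard logic, assuming that "satisfies the threshold" means the bound is at least the threshold and that every sample element has magnitude at most a known limit (for instance because the data is quantized). The bound used here, the partial dot product over the first `k` elements plus `sample_max_abs` times the L1 norm of the remaining steering elements, is one possible estimate chosen for illustration. The abstract does not specify how the bound is computed, and all names below are hypothetical.

```python
def filter_sample(sample, steering, threshold, k, sample_max_abs):
    """Return the full dot product, or None if the sample is discarded early.

    Assumes abs(x) <= sample_max_abs for every element x of `sample`.
    """
    # Exact contribution of the first k elements.
    partial = sum(x * s for x, s in zip(sample[:k], steering[:k]))
    # Largest possible contribution of the remaining elements; this part
    # depends only on the steering vector and could be precomputed.
    tail_bound = sample_max_abs * sum(abs(s) for s in steering[k:])
    upper_bound = partial + tail_bound

    if upper_bound < threshold:
        return None  # cannot reach the threshold: skip the full dot product
    return sum(x * s for x, s in zip(sample, steering))
```

With `sample = [0, 0, 1, 1]`, `steering = [1, 1, 1, 1]`, `k = 2`, and `sample_max_abs = 1`, the upper bound is 2, so a threshold of 3 discards the sample while a threshold of 2 lets the full product be computed.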
Abstract:
The response of physically unclonable functions (PUFs) in memory cells is improved through transistor sizing, transistor threshold voltage (VT) selection, and body bias in the memory cell, which improve the reproducibility of the memory cell, and through multiple sense amplifiers (SAs) per column, which further enhance PUF entropy. The PUF exploits the large number of read-sequence-order combinations available in a PUF memory array to generate an exponentially large challenge-response pair space, without incurring the area and energy costs of hosting and operating an exponentially large memory array.
Abstract:
Methods and apparatus to parallelize data decompression are disclosed. An example method includes selecting initial starting positions in a compressed data bitstream; adjusting a first one of the initial starting positions to determine a first adjusted starting position by decoding the bitstream starting at a training position in the bitstream, the decoding including traversing the bitstream from the training position as though first data located at the training position is a valid token; outputting first decoded data generated by decoding a first segment of the bitstream starting from the first adjusted starting position; and merging the first decoded data with second decoded data generated by decoding a second segment of the bitstream, the decoding of the second segment starting from a second position in the bitstream and being performed in parallel with the decoding of the first segment, and the second segment preceding the first segment in the bitstream.
Abstract:
A processing system includes a processing core and a hardware accelerator communicatively coupled to the processing core. The hardware accelerator includes a random number generator to generate a byte order indicator. The hardware accelerator also includes a first switching module communicatively coupled to the random number generator. The first switching module receives an input byte sequence in an encryption round of a cryptographic operation and feeds a portion of the input byte sequence to one of a first substitute box (S-box) module or a second S-box module in view of a byte order indicator value generated by the random number generator.
Abstract:
This application sets forth methods and apparatus to parallelize data decompression. An example method includes selecting initial starting positions in a compressed data bitstream; adjusting a first one of the initial starting positions to determine a first adjusted starting position by decoding the bitstream starting at a training position in the bitstream, the decoding including traversing the bitstream from the training position as though first data located at the training position is a valid token; outputting first decoded data generated by decoding a first segment of the bitstream starting from the first adjusted starting position; and merging the first decoded data with second decoded data generated by decoding a second segment of the bitstream, the decoding of the second segment starting from a second position in the bitstream and being performed in parallel with the decoding of the first segment, and the second segment preceding the first segment in the bitstream.
Abstract:
A processing system includes a processor to construct an input message comprising a target value and a nonce, and a hardware accelerator, communicatively coupled to the processor, implementing a plurality of circuits to perform a stage-1 secure hash algorithm (SHA) hash and a stage-2 SHA hash. To perform the stage-2 SHA hash, the hardware accelerator is to perform a plurality of rounds of compression on state data stored in a plurality of registers associated with a stage-2 SHA hash circuit using an input value, calculate a plurality of speculative computation bits using a plurality of bits of the state data, and transmit the plurality of speculative computation bits to the processor.
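The two-stage hashing of a message that carries a nonce can be sketched in software as below. This assumes SHA-256 for both stages and a Bitcoin-style check in which the stage-2 digest, read as a little-endian integer, must fall below the target; neither detail is stated in the abstract. The `header` bytes, the 4-byte nonce encoding, and the function names are hypothetical, and the speculative computation bits described in the abstract are not modeled.

```python
import hashlib
import struct


def double_sha256(message: bytes) -> bytes:
    """Stage-1 hash of the message, then stage-2 hash of the stage-1 digest."""
    stage1 = hashlib.sha256(message).digest()
    return hashlib.sha256(stage1).digest()


def find_nonce(header: bytes, target: int, max_nonce: int):
    """Search nonces in [0, max_nonce) for a stage-2 digest below `target`.

    Returns (nonce, digest) for the first hit, or None if none is found.
    """
    for nonce in range(max_nonce):
        # Hypothetical layout: header bytes followed by a 32-bit nonce.
        message = header + struct.pack("<I", nonce)
        digest = double_sha256(message)
        if int.from_bytes(digest, "little") < target:
            return nonce, digest
    return None
```

A target of `1 << 252` accepts roughly one digest in sixteen, so a call such as `find_nonce(b"example-header", 1 << 252, 10000)` should return quickly; smaller targets make the search exponentially longer, which is the motivation for hardware acceleration.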