Abstract:
A method, non-transitory computer readable medium, and device for prefetching include identifying a candidate data block from among one or more immediate successor data blocks. The identified candidate data block has a historical access probability value from an initially accessed data block that is higher than the historical access probability value of each of the other immediate successor data blocks and is above a prefetch threshold value. The identifying is repeated until a next identified candidate data block has a historical access probability value below the prefetch threshold value. In each repetition, the next immediate successor data blocks are identified from the previously identified candidate data block, and the historical access probability value for each of the next immediate successor data blocks is determined from the initially accessed data block. Each identified candidate data block with a historical access probability value above the prefetch threshold value is fetched.
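The repeated candidate selection described above can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes that the access history is available as a mapping from each block to its immediate successors and their observed transition probabilities, and that the probability "from the initially accessed data block" is the product of the transition probabilities along the chain. The names `prefetch_candidates`, `successors`, and `threshold` are hypothetical.

```python
def prefetch_candidates(successors, start, threshold):
    """Return the chain of blocks to prefetch after `start` is accessed.

    `successors` maps a block to {successor_block: transition_probability}.
    Assumption: the probability measured from `start` is the product of
    the transition probabilities along the chain followed so far.
    """
    chosen = []
    visited = {start}          # guards against cycles in the history graph
    current = start
    path_prob = 1.0            # probability of reaching `current` from `start`
    while True:
        edges = successors.get(current, {})
        if not edges:
            break
        # The most probable immediate successor is the candidate.
        best_block, best_edge = max(edges.items(), key=lambda kv: kv[1])
        prob_from_start = path_prob * best_edge
        # Stop once the candidate is no longer above the threshold.
        if prob_from_start <= threshold or best_block in visited:
            break
        chosen.append(best_block)
        visited.add(best_block)
        current = best_block
        path_prob = prob_from_start
    return chosen


# Hypothetical access history, used only for illustration.
history = {
    "A": {"B": 0.9, "C": 0.1},
    "B": {"D": 0.8, "E": 0.2},
    "D": {"F": 0.5},
}
to_fetch = prefetch_candidates(history, "A", threshold=0.5)
```

With this history, the chain stops at the third step because the probability of reaching F from A falls below the threshold, so only the earlier candidates are fetched.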
Abstract:
Fault isolation capabilities made available by user space can be provided for an embedded network storage system without sacrificing efficiency. By giving user space processes direct access to specific devices (e.g., network interface cards and storage adapters), processes in user space can initiate input/output requests without issuing system calls (and entering kernel mode). Multiple user space processes can initiate requests serviced by a user space device driver by sharing a read-only address space that maps the entire physical memory one-to-one. In addition, a user space process can initiate communication with another user space process by using transmit and receive queues similar to the transmit and receive queues used by hardware devices. A mechanism is also used to ensure that virtual addresses that work in one address space reference the same physical page in another address space.
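The transmit and receive queues mentioned above resemble the descriptor rings used by hardware devices. The sketch below models such a ring inside a single Python process, purely to show the head and tail bookkeeping. It does not use real shared memory or device access, and the class `DescriptorRing`, its methods, and the descriptor tuple layout are hypothetical.

```python
class DescriptorRing:
    """Fixed-size single-producer, single-consumer ring of descriptors."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._slots = [None] * capacity
        self._head = 0   # count of descriptors consumed so far
        self._tail = 0   # count of descriptors posted so far

    def post(self, descriptor):
        """Producer side: enqueue a descriptor, or return False if full."""
        if self._tail - self._head == len(self._slots):
            return False
        self._slots[self._tail % len(self._slots)] = descriptor
        self._tail += 1
        return True

    def poll(self):
        """Consumer side: dequeue the oldest descriptor, or None if empty."""
        if self._head == self._tail:
            return None
        descriptor = self._slots[self._head % len(self._slots)]
        self._head += 1
        return descriptor


# Hypothetical descriptors: (operation, physical address, length in bytes).
tx = DescriptorRing(capacity=2)
first_ok = tx.post(("read", 0x1000, 512))
second_ok = tx.post(("write", 0x2000, 4096))
third_ok = tx.post(("read", 0x3000, 512))   # ring is full at this point
oldest = tx.poll()
```

In a ring of this kind the sender only advances the tail and the receiver only advances the head, which is why one queue per direction (transmit and receive) is the usual arrangement.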
Abstract:
A novel technique is described for improving throughput in a multi-core system in which data is processed according to a producer-consumer relationship, by eliminating latencies caused by compulsory cache misses. The producer and consumer entities run as multiple slices of execution. Each such slice has an associated execution context that comprises the code and data that the slice would access. The execution contexts of the producer and consumer slices are small enough to fit in the processor caches simultaneously. When a producer entity scheduled on a first core completes production of data elements, as constrained by the size of the cache memories, a consumer entity is scheduled on that same core to consume the produced data elements. Meanwhile, a second slice of the producer entity is moved to another core, and a second slice of the consumer entity is scheduled to consume the elements produced by the second slice of the producer.
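The slice placement described above can be illustrated with a small planning sketch. It only computes which core runs which slice. It does not pin threads or measure cache behavior, and it assumes that successive producer slices move across cores in round-robin order, which the abstract does not state. The function `plan_slices` and its parameters are hypothetical.

```python
def plan_slices(total_elements, slice_capacity, num_cores):
    """Plan producer and consumer slices so each pair shares one core.

    `slice_capacity` stands in for the number of data elements that fit
    in a core's cache alongside both execution contexts (an assumption).
    Returns a list of (core, role, first_element, end_element) tuples,
    where end_element is exclusive.
    """
    if slice_capacity < 1 or num_cores < 1:
        raise ValueError("slice_capacity and num_cores must be at least 1")
    plan = []
    start = 0
    slice_index = 0
    while start < total_elements:
        end = min(start + slice_capacity, total_elements)
        core = slice_index % num_cores   # next producer slice moves to another core
        plan.append((core, "produce", start, end))
        # The consumer slice runs on the same core, so the produced
        # elements are still in that core's cache when it reads them.
        plan.append((core, "consume", start, end))
        start = end
        slice_index += 1
    return plan


# Hypothetical workload: 10 elements, 4 per slice, 2 cores.
schedule = plan_slices(total_elements=10, slice_capacity=4, num_cores=2)
```

The plan lists each pair in order for readability. In the scheme described above, the consumer slice on one core and the next producer slice on another core would run at the same time.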