-
Publication No.: US11048562B2
Publication Date: 2021-06-29
Application No.: US15836459
Application Date: 2017-12-08
Applicant: Apple Inc.
Inventor: Daniel A. Steffen , Pierre Habouzit , Daniel A. Chimene , Jeremy C. Andrus , James M. Magee , Puja Gupta
Abstract: Techniques are disclosed relating to efficiently handling execution of multiple threads to perform various actions. In some embodiments, an application instantiates a queue and a synchronization primitive. The queue maintains a set of work items to be operated on by a thread pool maintained by a kernel. The synchronization primitive controls access to the queue by a plurality of threads, including threads of the thread pool. In such an embodiment, a first thread of the application enqueues a first work item in the queue and issues a system call to the kernel to request that the kernel dispatch a thread of the thread pool to operate on the first work item. In various embodiments, the dispatched thread is executable to acquire the synchronization primitive, dequeue the first work item, and operate on it.
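A minimal user-space sketch of the flow this abstract describes, with a pthread mutex standing in for the synchronization primitive and a pthread_create call standing in for the dispatch system call; all names (work_item_t, request_dispatch, and so on) are invented for illustration, not the patent's interfaces.

```c
/* User-space model: the application owns the queue and the synchronization
 * primitive; a pool thread (a pthread standing in for a kernel-managed
 * thread) acquires the primitive, dequeues, and runs the item. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct work_item {
    void (*fn)(void *);
    void *arg;
    struct work_item *next;
} work_item_t;

typedef struct {
    pthread_mutex_t lock;          /* the synchronization primitive */
    work_item_t *head;             /* the application-owned queue */
} app_queue_t;

static void enqueue(app_queue_t *q, work_item_t *item) {
    pthread_mutex_lock(&q->lock);
    item->next = q->head;
    q->head = item;
    pthread_mutex_unlock(&q->lock);
}

/* Stand-in for the pool thread the kernel would dispatch. */
static void *pool_thread(void *ctx) {
    app_queue_t *q = ctx;
    pthread_mutex_lock(&q->lock);  /* acquire the primitive */
    work_item_t *item = q->head;   /* dequeue the work item */
    if (item) q->head = item->next;
    pthread_mutex_unlock(&q->lock);
    if (item) { item->fn(item->arg); free(item); }
    return NULL;
}

/* Stand-in for the system call requesting a thread-pool dispatch. */
static void request_dispatch(app_queue_t *q, pthread_t *t) {
    pthread_create(t, NULL, pool_thread, q);
}

static void say(void *msg) { puts(msg); }

int main(void) {
    app_queue_t q = { PTHREAD_MUTEX_INITIALIZER, NULL };
    work_item_t *item = malloc(sizeof *item);
    item->fn = say; item->arg = "work item ran"; item->next = NULL;
    enqueue(&q, item);             /* first thread enqueues */
    pthread_t t;
    request_dispatch(&q, &t);      /* "system call" to dispatch */
    pthread_join(t, NULL);
    return 0;
}
```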
-
Publication No.: US20210157748A1
Publication Date: 2021-05-27
Application No.: US17141914
Application Date: 2021-01-05
Applicant: Apple Inc.
Inventor: Jainam A. Shah , Jeremy C. Andrus , Daniel A. Chimene , Kushal Dalmia , Pierre Habouzit , James M. Magee , Marina Sadini , Daniel A. Steffen
Abstract: A turnstile OS primitive is provided that supports owner tracking and waiting. The turnstile primitive provides a common framework that can be adopted across multiple types of synchronization primitives to supply a shared service for priority boosting and wait queuing. A turnstile can also block on another turnstile, allowing multi-hop priority boosting through a chain of blocking turnstiles.
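A toy model of owner tracking and multi-hop boosting as described, under the assumption that each blocked thread records the turnstile it waits on; struct layout and field names are illustrative only, not the kernel's actual turnstile code.

```c
#include <stdio.h>
#include <stddef.h>

struct turnstile;

typedef struct thread {
    const char *name;
    int sched_pri;                   /* effective (possibly boosted) priority */
    struct turnstile *waiting_on;    /* turnstile this thread is blocked on */
} thread_t;

typedef struct turnstile {
    thread_t *owner;                 /* thread currently holding the primitive */
} turnstile_t;

/* Propagate a waiter's priority through a chain of blocking turnstiles. */
static void turnstile_boost(turnstile_t *ts, int waiter_pri) {
    while (ts && ts->owner) {
        if (ts->owner->sched_pri >= waiter_pri)
            break;                   /* owner already runs at least this high */
        ts->owner->sched_pri = waiter_pri;
        ts = ts->owner->waiting_on;  /* multi-hop: owner may itself be blocked */
    }
}

int main(void) {
    thread_t a = { "A", 20, NULL }, b = { "B", 31, NULL }, c = { "C", 47, NULL };
    turnstile_t ts1 = { &a }, ts2 = { &b };
    a.waiting_on = &ts2;                  /* A (owner of ts1) is blocked on ts2 */
    turnstile_boost(&ts1, c.sched_pri);   /* C blocks on ts1 at priority 47 */
    printf("A=%d B=%d\n", a.sched_pri, b.sched_pri);  /* both boosted to 47 */
    return 0;
}
```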
-
Publication No.: US20200379804A1
Publication Date: 2020-12-03
Application No.: US16882092
Application Date: 2020-05-22
Applicant: Apple Inc.
Inventor: Kushal Dalmia , Jeremy C. Andrus , Daniel A. Chimene , Nigel R. Gamble , James M. Magee , Daniel A. Steffen
Abstract: Embodiments described herein provide multi-level scheduling for threads in a data processing system. One embodiment provides a data processing system comprising one or more processors and a computer-readable memory coupled to the one or more processors, the computer-readable memory to store instructions which, when executed by the one or more processors, configure the one or more processors to receive execution threads for execution on the one or more processors, map the execution threads into a first plurality of buckets based at least in part on a quality of service class of the execution threads, schedule the first plurality of buckets for execution using a first scheduling algorithm, schedule a second plurality of thread groups within the first plurality of buckets for execution using a second scheduling algorithm, and schedule a third plurality of threads within the second plurality of thread groups using a third scheduling algorithm.
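A compact sketch of the three-level pick the abstract outlines: a bucket first, then a thread group inside the chosen bucket, then a thread inside the group, each level using its own policy. The policies shown (QoS-ordered buckets, round-robin groups, highest-priority thread) and all names are assumptions for illustration.

```c
#include <stdio.h>
#include <stddef.h>

typedef struct { const char *name; int pri; } thread_t;

typedef struct {
    const char *name;
    thread_t *threads; int nthreads;
} thread_group_t;

typedef struct {
    int qos;                        /* QoS class mapped to this bucket */
    thread_group_t *groups; int ngroups;
    int rr_cursor;                  /* round-robin position for level 2 */
} bucket_t;

static thread_t *pick_next(bucket_t *buckets, int nbuckets) {
    for (int b = 0; b < nbuckets; b++) {        /* level 1: buckets in QoS order */
        bucket_t *bk = &buckets[b];
        if (bk->ngroups == 0) continue;
        thread_group_t *g = &bk->groups[bk->rr_cursor++ % bk->ngroups]; /* level 2 */
        thread_t *best = NULL;
        for (int t = 0; t < g->nthreads; t++)   /* level 3: highest thread priority */
            if (!best || g->threads[t].pri > best->pri)
                best = &g->threads[t];
        if (best) return best;
    }
    return NULL;
}

int main(void) {
    thread_t ui[] = { {"render", 47}, {"event", 46} };
    thread_t bg[] = { {"indexer", 4} };
    thread_group_t app_grp = { "app", ui, 2 };
    thread_group_t daemon_grp = { "daemon", bg, 1 };
    bucket_t buckets[] = {
        { 2, &app_grp, 1, 0 },      /* user-interactive bucket */
        { 0, &daemon_grp, 1, 0 },   /* background bucket */
    };
    thread_t *next = pick_next(buckets, 2);
    printf("next: %s\n", next ? next->name : "(idle)");
    return 0;
}
```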
-
Publication No.: US10599481B2
Publication Date: 2020-03-24
Application No.: US15870770
Application Date: 2018-01-12
Applicant: Apple Inc.
Inventor: Constantin Pistol , Daniel A. Chimene , Jeremy C. Andrus , Russell A. Blaine , Kushal Dalmia
IPC: G06F9/50 , G06F9/48 , G06F1/3234 , G06F1/329 , G06F1/3296 , G06F9/38 , G06F9/26 , G06F9/54 , G06F1/20 , G06F1/324 , G06F1/3206 , G06F9/30
Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.
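A rough closed-loop sketch under stated assumptions: a single integral term stands in for the tunable controllers, a scalar limit stands in for the thermal/power manager, and the effort-to-core-type/DVFS mapping uses made-up thresholds and table sizes rather than anything from the patent.

```c
#include <stdio.h>

typedef enum { CORE_EFFICIENCY, CORE_PERFORMANCE } core_type_t;

typedef struct {
    double utilization;        /* accumulated execution metric, 0..1 */
    double target;             /* desired utilization set-point */
    double effort;             /* integrated control effort, 0..1 */
} clpc_t;

/* Simple integral controller standing in for the tunable controllers. */
static double clpc_update(clpc_t *c, double thermal_limit) {
    c->effort += 0.5 * (c->utilization - c->target);
    if (c->effort < 0.0) c->effort = 0.0;
    if (c->effort > thermal_limit) c->effort = thermal_limit;  /* power/thermal cap */
    return c->effort;
}

/* Map control effort to a recommended core type and DVFS state index. */
static void recommend(double effort, core_type_t *core, int *dvfs_state) {
    static const int n_states = 4;                  /* per-core-type DVFS table size */
    *core = (effort > 0.6) ? CORE_PERFORMANCE : CORE_EFFICIENCY;
    *dvfs_state = (int)(effort * (n_states - 1) + 0.5);
}

int main(void) {
    clpc_t c = { .utilization = 0.9, .target = 0.5, .effort = 0.4 };
    double e = clpc_update(&c, /*thermal_limit=*/0.8);
    core_type_t core; int state;
    recommend(e, &core, &state);
    printf("effort=%.2f core=%s dvfs_state=%d\n", e,
           core == CORE_PERFORMANCE ? "P" : "E", state);
    return 0;
}
```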
-
Publication No.: US20190155656A1
Publication Date: 2019-05-23
Application No.: US16195429
Application Date: 2018-11-19
Applicant: Apple Inc.
Inventor: Russell A. Blaine , Daniel A. Chimene , Shantonu Sen , James M. Magee
Abstract: Techniques for scheduling threads for execution in a data processing system are described herein. According to one embodiment, in response to a request for executing a thread, a scheduler of an operating system of the data processing system accesses a global run queue to identify a global run entry associated with the highest process priority. The global run queue includes multiple global run entries, each corresponding to one of a plurality of process priorities. A group run queue is identified based on the global run entry, where the group run queue includes multiple threads associated with one of the processes. The scheduler dispatches one of the threads that has the highest thread priority amongst the threads in the group run queue to one of the processor cores of the data processing system for execution.
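An illustrative layout of the two-level run queue the abstract describes, with a global run queue indexed by process priority and group run queues holding each process's runnable threads; the array size, field names, and example priorities are invented for the sketch.

```c
#include <stdio.h>
#include <stddef.h>

#define NPRI 128                       /* number of process priority levels */

typedef struct thread {
    const char *name;
    int thread_pri;
    struct thread *next;
} thread_t;

typedef struct group_runq {
    const char *process;
    thread_t *threads;                 /* runnable threads of this process */
    struct group_runq *next;
} group_runq_t;

typedef struct {
    group_runq_t *entries[NPRI];       /* one global run entry per priority */
    int highest;                       /* highest non-empty process priority */
} global_runq_t;

/* Pick the next thread: highest process priority first, then the highest
 * thread priority within the chosen group run queue. */
static thread_t *choose_thread(global_runq_t *g) {
    for (int p = g->highest; p >= 0; p--) {
        group_runq_t *grp = g->entries[p];
        if (!grp) continue;
        thread_t *best = NULL;
        for (thread_t *t = grp->threads; t; t = t->next)
            if (!best || t->thread_pri > best->thread_pri)
                best = t;
        return best;
    }
    return NULL;
}

int main(void) {
    thread_t t2 = { "worker", 30, NULL };
    thread_t t1 = { "main", 37, &t2 };
    group_runq_t grp = { "MyApp", &t1, NULL };
    global_runq_t g = { .entries = {0}, .highest = 47 };
    g.entries[47] = &grp;
    thread_t *next = choose_thread(&g);
    printf("dispatch: %s\n", next ? next->name : "(none)");
    return 0;
}
```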
-
Publication No.: US10140157B2
Publication Date: 2018-11-27
Application No.: US14290791
Application Date: 2014-05-29
Applicant: Apple Inc.
Inventor: Russell A. Blaine , Daniel A. Chimene , Shantonu Sen , James M. Magee
Abstract: Techniques for scheduling threads for execution in a data processing system are described herein. According to one embodiment, in response to a request for executing a thread, a scheduler of an operating system of the data processing system accesses a global run queue to identify a global run entry associated with the highest process priority. The global run queue includes multiple global run entries, each corresponding to one of a plurality of process priorities. A group run queue is identified based on the global run entry, where the group run queue includes multiple threads associated with one of the processes. The scheduler dispatches one of the threads that has the highest thread priority amongst the threads in the group run queue to one of the processor cores of the data processing system for execution.
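Since this abstract matches the one above, the sketch here covers the complementary enqueue path instead: a newly runnable thread is added to its process's group run queue, and the group is linked into the global run entry for the process priority if it is not already queued. The types mirror the illustrative ones shown earlier and are not the patent's actual code.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPRI 128

typedef struct thread { int thread_pri; struct thread *next; } thread_t;

typedef struct group_runq {
    int process_pri;
    bool queued;                       /* already linked into the global queue? */
    thread_t *threads;
    struct group_runq *next;
} group_runq_t;

typedef struct {
    group_runq_t *entries[NPRI];
    int highest;
} global_runq_t;

static void enqueue_thread(global_runq_t *g, group_runq_t *grp, thread_t *t) {
    t->next = grp->threads;            /* add thread to its group run queue */
    grp->threads = t;
    if (!grp->queued) {                /* link group under its process priority */
        grp->next = g->entries[grp->process_pri];
        g->entries[grp->process_pri] = grp;
        grp->queued = true;
    }
    if (grp->process_pri > g->highest) /* track the highest runnable priority */
        g->highest = grp->process_pri;
}

int main(void) {
    global_runq_t g = { .entries = {0}, .highest = 0 };
    group_runq_t grp = { .process_pri = 31, .queued = false, .threads = NULL, .next = NULL };
    thread_t t = { .thread_pri = 31, .next = NULL };
    enqueue_thread(&g, &grp, &t);
    return g.highest == 31 ? 0 : 1;    /* sanity check */
}
```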
-
Publication No.: US09582326B2
Publication Date: 2017-02-28
Application No.: US14292687
Application Date: 2014-05-30
Applicant: Apple Inc.
Inventor: Daniel A. Steffen , Matthew W. Wright , Russell A. Blaine, Jr. , Daniel A. Chimene , Kevin J. Van Vechten , Thomas B. Duffy
CPC classification number: G06F9/4893 , G06F9/4881 , G06F9/5038 , G06F2209/5021 , Y02D10/22
Abstract: In one embodiment, tasks executing on a data processing system can be associated with a Quality of Service (QoS) classification that is used to determine the priority values for multiple subsystems of the data processing system. The QoS classifications are propagated when tasks interact, and the QoS classes are interpreted at multiple levels of the system to determine the priority values to set for the tasks. In one embodiment, one or more sensors coupled with the data processing system monitor a set of system conditions that are used in part to determine the priority values to set for a QoS class.
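A hedged sketch of the idea: one QoS class is interpreted into different priority values for several subsystems, the class propagates when tasks interact, and a sensor-driven condition can shift the values. The class names echo public Darwin QoS tiers, but the numeric values and the thermal adjustment are invented.

```c
#include <stdio.h>

typedef enum { QOS_BACKGROUND, QOS_UTILITY, QOS_DEFAULT, QOS_USER_INTERACTIVE } qos_t;

typedef struct { int cpu_pri; int io_tier; int timer_leeway_ms; } subsystem_pri_t;

/* Each subsystem interprets the same QoS class differently. */
static subsystem_pri_t interpret_qos(qos_t q, int thermal_pressure) {
    static const subsystem_pri_t table[] = {
        [QOS_BACKGROUND]       = {  4, 3, 100 },
        [QOS_UTILITY]          = { 20, 1,  10 },
        [QOS_DEFAULT]          = { 31, 0,   5 },
        [QOS_USER_INTERACTIVE] = { 47, 0,   1 },
    };
    subsystem_pri_t p = table[q];
    if (thermal_pressure && q != QOS_USER_INTERACTIVE)
        p.cpu_pri -= 2;                    /* sensor input lowers the values */
    return p;
}

/* QoS propagation: work done on behalf of a requester runs at least as high. */
static qos_t propagate(qos_t servicer, qos_t requester) {
    return requester > servicer ? requester : servicer;
}

int main(void) {
    qos_t daemon = QOS_UTILITY;
    qos_t effective = propagate(daemon, QOS_USER_INTERACTIVE);
    subsystem_pri_t p = interpret_qos(effective, /*thermal_pressure=*/0);
    printf("cpu=%d io_tier=%d leeway=%dms\n", p.cpu_pri, p.io_tier, p.timer_leeway_ms);
    return 0;
}
```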
-
Publication No.: US11422857B2
Publication Date: 2022-08-23
Application No.: US16882092
Application Date: 2020-05-22
Applicant: Apple Inc.
Inventor: Kushal Dalmia , Jeremy C. Andrus , Daniel A. Chimene , Nigel R. Gamble , James M. Magee , Daniel A. Steffen
Abstract: Embodiments described herein provide multi-level scheduling for threads in a data processing system. One embodiment provides a data processing system comprising one or more processors and a computer-readable memory coupled to the one or more processors, the computer-readable memory to store instructions which, when executed by the one or more processors, configure the one or more processors to receive execution threads for execution on the one or more processors, map the execution threads into a first plurality of buckets based at least in part on a quality of service class of the execution threads, schedule the first plurality of buckets for execution using a first scheduling algorithm, schedule a second plurality of thread groups within the first plurality of buckets for execution using a second scheduling algorithm, and schedule a third plurality of threads within the second plurality of thread groups using a third scheduling algorithm.
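This abstract matches the earlier multi-level scheduling entry, so the sketch here shows only the first step it names: mapping threads into root-level buckets from their QoS class. The class list, bucket names, and mapping are illustrative assumptions.

```c
#include <stdio.h>

typedef enum {
    QOS_BACKGROUND, QOS_UTILITY, QOS_DEFAULT,
    QOS_USER_INITIATED, QOS_USER_INTERACTIVE
} qos_class_t;

typedef enum {
    BUCKET_FIXPRI,        /* e.g. real-time / fixed priority, outside QoS */
    BUCKET_FOREGROUND, BUCKET_DEFAULT, BUCKET_UTILITY, BUCKET_BACKGROUND,
    BUCKET_COUNT
} sched_bucket_t;

/* Map a thread's QoS class to the bucket the root scheduler will pick among. */
static sched_bucket_t bucket_for_qos(qos_class_t qos) {
    switch (qos) {
    case QOS_USER_INTERACTIVE:
    case QOS_USER_INITIATED:  return BUCKET_FOREGROUND;
    case QOS_DEFAULT:         return BUCKET_DEFAULT;
    case QOS_UTILITY:         return BUCKET_UTILITY;
    case QOS_BACKGROUND:      return BUCKET_BACKGROUND;
    }
    return BUCKET_DEFAULT;
}

int main(void) {
    printf("UI thread -> bucket %d\n", bucket_for_qos(QOS_USER_INTERACTIVE));
    printf("indexer   -> bucket %d\n", bucket_for_qos(QOS_BACKGROUND));
    return 0;
}
```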
-
Publication No.: US10956220B2
Publication Date: 2021-03-23
Application No.: US15870760
Application Date: 2018-01-12
Applicant: Apple Inc.
Inventor: Jeremy C. Andrus , John G. Dorsey , James M. Magee , Daniel A. Chimene , Cyril de la Cropte de Chanterac , Bryan R. Hinch , Aditya Venkataraman , Andrei Dorofeev , Nigel R. Gamble , Russell A. Blaine , Constantin Pistol , James S. Ismail
IPC: G06F1/3206 , G06F9/50 , G06F9/48 , G06F1/3234 , G06F1/329 , G06F1/3296 , G06F9/38 , G06F9/26 , G06F9/54 , G06F1/20 , G06F1/324 , G06F9/30
Abstract: Systems and methods are disclosed for scheduling threads on a processor that has at least two different core types, such as an asymmetric multiprocessing system. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers for the thread group. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Deferred interrupts can be used to increase performance.
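This abstract matches the earlier CLPC entry, so the sketch here covers the metric side instead: a per-thread-group accumulator whose sampled values would feed the tunable controllers. Field names and the derived quantities are assumptions for illustration, not the patent's definitions.

```c
#include <stdio.h>
#include <stdint.h>

typedef struct {
    uint64_t cycles;             /* accumulated on-core cycles for the group */
    uint64_t instructions;       /* accumulated retired instructions */
    uint64_t runnable_ns;        /* time threads waited while runnable */
    uint64_t interval_ns;        /* length of the sampling window */
} thread_group_metrics_t;

/* Threads add their per-quantum counts to their group's accumulator. */
static void account_on_core(thread_group_metrics_t *m,
                            uint64_t cycles, uint64_t instrs, uint64_t waited_ns) {
    m->cycles += cycles;
    m->instructions += instrs;
    m->runnable_ns += waited_ns;
}

/* At the end of the window, derive controller inputs such as a scheduler
 * latency ratio and instructions per cycle. */
static void sample(const thread_group_metrics_t *m) {
    double latency_ratio = (double)m->runnable_ns / (double)m->interval_ns;
    double ipc = m->cycles ? (double)m->instructions / (double)m->cycles : 0.0;
    printf("latency_ratio=%.2f ipc=%.2f\n", latency_ratio, ipc);
}

int main(void) {
    thread_group_metrics_t m = { 0, 0, 0, 10000000 };   /* 10 ms window */
    account_on_core(&m, 4000000, 6000000, 1500000);
    sample(&m);
    return 0;
}
```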
-
Publication No.: US20210019258A1
Publication Date: 2021-01-21
Application No.: US16513225
Application Date: 2019-07-16
Applicant: Apple Inc.
Inventor: John G. Dorsey , Aditya Venkataraman , Bryan R. Hinch , Daniel A. Chimene , Andrei Dorofeev , Constantin Pistol
IPC: G06F12/0802
Abstract: A processing system can include a plurality of processing clusters. Each processing cluster can include a plurality of processor cores and a last level cache. Each processor core can include one or more dedicated caches and a plurality of counters. The plurality of counters may be configured to count different types of cache fills. The plurality of counters may be configured to count different types of cache fills, including at least one counter configured to count total cache fills and at least one counter configured to count off-cluster cache fills. Off-cluster cache fills can include at least one of cross-cluster cache fills and cache fills from system memory. The processing system can further include one or more controllers configured to control performance of one or more of the clusters, the processor cores, the fabric, and the memory responsive to cache fill metrics derived from the plurality of counters.
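A rough sketch of the counter arrangement described: each core keeps a total-fill counter and an off-cluster-fill counter, and a controller reads the aggregated ratio to decide whether to adjust cluster, fabric, or memory performance. Counter names and the 25% threshold are illustrative assumptions.

```c
#include <stdio.h>
#include <stdint.h>

#define CORES_PER_CLUSTER 4

typedef struct {
    uint64_t total_fills;        /* all fills into this core's dedicated caches */
    uint64_t off_cluster_fills;  /* fills served cross-cluster or from DRAM */
} core_fill_counters_t;

typedef struct {
    core_fill_counters_t cores[CORES_PER_CLUSTER];
} cluster_t;

/* Derive the fraction of fills that left the cluster. */
static double off_cluster_ratio(const cluster_t *cl) {
    uint64_t total = 0, off = 0;
    for (int i = 0; i < CORES_PER_CLUSTER; i++) {
        total += cl->cores[i].total_fills;
        off   += cl->cores[i].off_cluster_fills;
    }
    return total ? (double)off / (double)total : 0.0;
}

int main(void) {
    cluster_t cl = { .cores = {
        { 1000, 120 }, { 800, 300 }, { 500, 20 }, { 0, 0 },
    } };
    double r = off_cluster_ratio(&cl);
    /* A controller might raise fabric/memory performance when the ratio is high. */
    printf("off-cluster fill ratio: %.2f -> %s\n",
           r, r > 0.25 ? "boost fabric/memory" : "steady");
    return 0;
}
```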
-