-
Publication (Announcement) No.: US10157045B2
Publication (Announcement) Date: 2018-12-18
Application No.: US15816606
Filing Date: 2017-11-17
Applicant: The MathWorks, Inc.
Inventor: Girish Venkataramani, Rama P. Kokku, Jayaprabha Shankar, James L. Brock, Chun-Yu Shei, Vijaya Raghavan, Yaohung Tsai
Abstract: Systems and methods may automatically generate code for deep learning networks. The systems and methods may provide a code generation framework for generating target specific code. The code generation framework may include one or more predefined class hierarchies for constructing objects of the generated code. The objects of the class hierarchies may provide an interface to predefined libraries of deep learning functions optimized for use on a target platform. The systems and methods may perform one or more optimizations on the code being generated.
-
Publication (Announcement) No.: US20180157471A1
Publication (Announcement) Date: 2018-06-07
Application No.: US15816377
Filing Date: 2017-11-17
Applicant: The MathWorks, Inc.
Inventor: Girish Venkataramani, Rama P. Kokku, Jayaprabha Shankar, James L. Brock, Chun-Yu Shei, Vijaya Raghavan
IPC: G06F8/41
CPC classification number: G06F8/452, G06F8/4434, G06F8/4441, G06F8/445, G06F8/456, G06F8/458
Abstract: Systems and methods generate code from a source program where the generated code may be compiled and executed on a Graphics Processing Unit (GPU). A parallel loop analysis check may be performed on regions of the source program identified for parallelization. One or more optimizations also may be applied to the source program that convert mathematical operations into a parallel form. The source program may be partitioned into segments for execution on a host and a device. Kernels may be created for the segments to be executed on the device. The size of the kernels may be determined, and memory transfers between the host and device may be optimized.
-
Publication (Announcement) No.: US10949182B2
Publication (Announcement) Date: 2021-03-16
Application No.: US15816377
Filing Date: 2017-11-17
Applicant: The MathWorks, Inc.
Inventor: Girish Venkataramani, Rama P. Kokku, Jayaprabha Shankar, James L. Brock, Chun-Yu Shei, Vijaya Raghavan
IPC: G06F8/41
Abstract: Systems and methods generate code from a source program where the generated code may be compiled and executed on a Graphics Processing Unit (GPU). A parallel loop analysis check may be performed on regions of the source program identified for parallelization. One or more optimizations also may be applied to the source program that convert mathematical operations into a parallel form. The source program may be partitioned into segments for execution on a host and a device. Kernels may be created for the segments to be executed on the device. The size of the kernels may be determined, and memory transfers between the host and device may be optimized.
-
Publication (Announcement) No.: US20180136912A1
Publication (Announcement) Date: 2018-05-17
Application No.: US15816606
Filing Date: 2017-11-17
Applicant: The MathWorks, Inc.
Inventor: Girish Venkataramani, Rama P. Kokku, Jayaprabha Shankar, James L. Brock, Chun-Yu Shei, Vijaya Raghavan, Yaohung Tsai
CPC classification number: G06F8/35, G06F8/20, G06F8/30, G06F9/44563, G06N3/04, G06N3/0454, G06N3/0481, G06N3/08, G06N3/10, G06N3/105
Abstract: Systems and methods may automatically generate code for deep learning networks. The systems and methods may provide a code generation framework for generating target specific code. The code generation framework may include one or more predefined class hierarchies for constructing objects of the generated code. The objects of the class hierarchies may provide an interface to predefined libraries of deep learning functions optimized for use on a target platform. The systems and methods may perform one or more optimizations on the code being generated.
-