Patent search ap:("The MathWorks, Inc.") AND inv:"Girish Venkataramani"

1.

Patent application
SYSTEMS AND METHODS FOR CONFIGURING PROGRAMMABLE LOGIC DEVICES FOR DEEP LEARNING NETWORKS (status: under examination, published)

Publication (announcement) number: US20200151088A1

Publication (announcement) date: 2020-05-14

Application number: US16270082

Application date: 2019-02-07

Applicant: The MathWorks, Inc.

Inventor: Yongfeng Gu, Girish Venkataramani, Wang Chen, Bharathi Yogaraj, Yuteng Zhou, Vibha Patil, Anusha Vasantala, Purshottam Vishwakarma

IPC: G06F11/36, G06F8/41, G06N3/063, G06N7/00, G06F15/78, G06F5/06

Abstract: Systems and methods may configure a programmable logic device to efficiently run a deep learning (DL) network. Architecture code and algorithmic code may be generated. The architecture code may define convolutional and fully connected processor cores structured to run the layers of a Deep Neural Network (DNN). The processor cores may be interconnected by a First In First Out (FIFO) memory. The architecture code may also define stride-efficient memories for implementing convolution. The algorithmic code may include configuration instructions for running the DNN's layers at the processor cores. The algorithmic code may also include a schedule for executing the configuration instructions on the processor cores, for moving network parameters to the processor cores, and for transferring outputs between the layers.
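
The architecture described in this abstract can be pictured with a small software analogy. The sketch below is not taken from the patent: `conv_core`, `fc_core`, and the tiny 1-D data are hypothetical stand-ins, and a Python `deque` plays the role of the FIFO that connects the two processor cores.

```python
from collections import deque
from typing import Deque, List


def conv_core(signal: List[float], kernel: List[float]) -> List[float]:
    """Stand-in for the convolutional core: 1-D valid cross-correlation."""
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel))) for i in range(n)]


def fc_core(features: List[float], weights: List[List[float]]) -> List[float]:
    """Stand-in for the fully connected core: one dot product per output."""
    return [sum(f * w for f, w in zip(features, row)) for row in weights]


fifo: Deque[List[float]] = deque()  # decouples the two cores

# Schedule: run the conv layer, push its output, then run the FC layer.
fifo.append(conv_core([1.0, 2.0, 3.0, 4.0], kernel=[1.0, -1.0]))
outputs = fc_core(fifo.popleft(), weights=[[1.0, 1.0, 1.0], [0.5, 0.0, -0.5]])
```

In this analogy the "algorithmic code" corresponds to the last two lines: which layer runs on which core, with which parameters, and in what order.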

2.

Granted patent
Systems and methods for hardware resource sharing (status: in force)

Publication (announcement) number: US09436441B1

Publication (announcement) date: 2016-09-06

Application number: US14098016

Application date: 2013-12-05

Applicant: The MathWorks, Inc.

Inventor: Girish Venkataramani

IPC: G06F9/44

CPC classification number: G06F8/34, G06F8/443

Abstract: A system and method automatically optimizes a hardware description generated from a graphical program or model having oversampling constraints. The system may include a streaming optimizer, a resource sharing optimizer, a delay balancing engine, and a global scheduler. The streaming optimizer may transform vector data paths to scalar or smaller-sized vector data paths. The resource sharing optimizer may replace multiple, functionally equivalent blocks with a single shared block. The delay balancing engine may insert one or more elements to correct for data path misalignment. The global scheduler may place portions of the program or model into conditional execution sections and create control logic that controls the model sample times or steps at which the portions are enabled. A validation model, a report, or hardware description code that utilizes fewer hardware resources may be generated from a modified version of the model that is created.
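
As a rough illustration of the resource sharing idea in this abstract, the sketch below replaces several functionally equivalent multiply blocks with one shared multiplier that is used once per fast-clock tick. The names `parallel_multiply` and `shared_multiply` are hypothetical and the code is not taken from the patent; it only shows why a shared block has to run N times faster than the N blocks it replaces.

```python
from typing import List, Tuple


def parallel_multiply(operand_pairs: List[Tuple[int, int]]) -> List[int]:
    """Reference behavior: one multiplier block per operand pair."""
    return [a * b for a, b in operand_pairs]


def shared_multiply(operand_pairs: List[Tuple[int, int]]) -> Tuple[List[int], int]:
    """One shared multiplier, used once per fast-clock tick.

    Returns the results plus the number of fast ticks consumed, which is
    the oversampling factor the shared block needs per sample period.
    """
    results = [0] * len(operand_pairs)
    ticks = 0
    for slot, (a, b) in enumerate(operand_pairs):  # multiplexer picks one pair per tick
        results[slot] = a * b                      # the single shared multiplier
        ticks += 1
    return results, ticks


pairs = [(2, 3), (4, 5), (6, 7)]
shared_results, oversampling = shared_multiply(pairs)
```

The shared version produces the same values as the parallel one, at the cost of three fast ticks per sample instead of one.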

3.

Patent application
MODEL-BASED RETIMING WITH FUNCTIONAL EQUIVALENCE CONSTRAINTS (status: in force)

Publication (announcement) number: US20150178418A1

Publication (announcement) date: 2015-06-25

Application number: US14640239

Application date: 2015-03-06

Applicant: The MathWorks, Inc.

Inventor: Yongfeng Gu, Girish Venkataramani

IPC: G06F17/50

CPC classification number: G06F17/505, G06F17/5022, G06F17/504, G06F17/5081

Abstract: A system and method tests for functional equivalence prior to automatically retiming a high-level specification. An Intermediate Representation (IR) includes one or more graphs or trees based on the high-level specification. A functional equivalence (FE) analyzer determines whether one or more components in the graph meet certain value and state conditions and thus are candidates for retiming. A scheduler can use components that fail FE as a retiming boundary.

4.

Granted patent
Systems and methods for sharing resources having different data types (status: in force)

Publication (announcement) number: US10423733B1

Publication (announcement) date: 2019-09-24

Application number: US15099111

Application date: 2016-04-14

Applicant: The MathWorks, Inc.

Inventor: Girish Venkataramani, Yongfeng Gu, Rama Kokku, Sanmukh Rao Kuppannagari

IPC: G06F17/50, G06F8/30

Abstract: A system and method generates optimized code for a source model. The system may include a resource sharing optimizer that evaluates the source model and replaces multiple model elements of the source model that are functionally equivalent with a single shared model element. The model elements replaced with the single shared model element may have different fixed point data types. The resource sharing optimizer may convert some of the fixed point data types to a common fixed point data type.
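
The abstract does not say how the common fixed point data type is chosen. One simple rule, shown below purely as an assumption, is to keep the largest fraction length and the largest number of integer bits among the inputs, and to make the result signed if any input is signed. `FixedPointType` and `common_type` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FixedPointType:
    signed: bool
    word_length: int      # total bits, including the sign bit if signed
    fraction_length: int  # bits to the right of the binary point

    @property
    def integer_bits(self) -> int:
        """Magnitude bits left of the binary point (sign bit excluded)."""
        return self.word_length - self.fraction_length - (1 if self.signed else 0)


def common_type(types: Iterable[FixedPointType]) -> FixedPointType:
    """A type that can hold every value of every input type (assumed rule)."""
    types = list(types)
    signed = any(t.signed for t in types)
    fraction = max(t.fraction_length for t in types)
    integer = max(t.integer_bits for t in types)
    return FixedPointType(signed, integer + fraction + (1 if signed else 0), fraction)


a = FixedPointType(signed=True, word_length=16, fraction_length=8)    # 7 integer bits
b = FixedPointType(signed=False, word_length=12, fraction_length=10)  # 2 integer bits
shared = common_type([a, b])
```

Here the shared element would operate on a signed 18-bit type with 10 fraction bits, which is wider than either input; that growth is the price of sharing one element across both.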

5.

Granted patent
Systems and methods for mapping executable models to programmable logic device resources (status: in force)

Publication (announcement) number: US10114917B1

Publication (announcement) date: 2018-10-30

Application number: US15225193

Application date: 2016-08-01

Applicant: The MathWorks, Inc.

Inventor: Girish Venkataramani, Purshottam Vishwakarma, Rama Kokku

IPC: G06F17/50

Abstract: Systems and methods automatically generate code from an executable model. The code may be generated from one or more in-memory representations constructed for the model. The in-memory representations may be analyzed, and portions that can be mapped to DSP slices of a programmable logic device may be identified. The portions may be modified based on information for a particular programmable logic device, such as the structure of the device's DSP slices. The modifications may ensure that elements of the generated code get mapped to DSP slices, when the generated code is used to synthesize the programmable logic device.

6.

Granted patent
Resource sharing workflows within executable graphical models (status: in force)

Publication (announcement) number: US09298862B1

Publication (announcement) date: 2016-03-29

Application number: US14245629

Application date: 2014-04-04

Applicant: The MathWorks, Inc.

Inventor: Girish Venkataramani, Kiran Kintali

IPC: G06F17/50

CPC classification number: G06F17/5009, G06F8/35, G06F17/5045, G06F17/5054

Abstract: A system and method automatically optimizes a hardware description generated from a graphical program or model. The system may include a streaming optimizer, a resource sharing optimizer and a delay balancing engine. The streaming optimizer transforms one or more vector data paths in the source model to scalar data paths or to smaller-sized vector data paths. The resource sharing optimizer may replace multiple blocks of the source model that are functionally equivalent with a single shared block. The streaming and resource sharing optimizers may also configure portions of the modified model to execute at a faster rate. The delay balancing engine may examine the modified model to determine whether any delays or latencies have been introduced. If so, the delay balancing engine may insert one or more blocks into the modified model to correct for any data path misalignment caused by the introduction of the delays or latencies. A validation model, a report, or hardware description code that utilizes fewer hardware resources may be generated from the modified model.
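
The delay balancing step in this abstract can be sketched as a pass over a block graph. The code below is an illustrative assumption, not the patented method: `balance_delays`, the block names, and the latencies are hypothetical. It walks the blocks in topological order (using the standard `graphlib` module, Python 3.9+) and pads every early-arriving input so that all inputs of a block line up.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Tuple


def balance_delays(
    preds: Dict[str, List[str]], latency: Dict[str, int]
) -> Dict[Tuple[str, str], int]:
    """Return the extra delay to insert on each edge (src, dst).

    preds maps each block to the blocks feeding it; latency is each
    block's own delay in samples. After insertion, all inputs of every
    block arrive at the same time step.
    """
    ready: Dict[str, int] = {}                 # step at which a block's output is valid
    inserted: Dict[Tuple[str, str], int] = {}
    for block in TopologicalSorter(preds).static_order():
        inputs = preds.get(block, [])
        latest = max((ready[p] for p in inputs), default=0)
        for p in inputs:
            inserted[(p, block)] = latest - ready[p]  # pad the early-arriving paths
        ready[block] = latest + latency[block]
    return inserted


# Hypothetical model: 'mul' was pipelined (latency 2), 'gain' is combinational.
preds = {"in": [], "mul": ["in"], "gain": ["in"], "add": ["mul", "gain"]}
latency = {"in": 0, "mul": 2, "gain": 0, "add": 0}
delays = balance_delays(preds, latency)
```

In this example the only non-zero entry is the edge from `gain` to `add`, which needs two delay elements to match the pipelined multiplier on the other input.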

7.

Granted patent
Systems and methods for generating code for parallel processing units (status: in force)

Publication (announcement) number: US10949182B2

Publication (announcement) date: 2021-03-16

Application number: US15816377

Application date: 2017-11-17

Applicant: The MathWorks, Inc.

Inventor: Girish Venkataramani, Rama P. Kokku, Jayaprabha Shankar, James L. Brock, Chun-Yu Shei, Vijaya Raghavan

IPC: G06F8/41

Abstract: Systems and methods generate code from a source program where the generated code may be compiled and executed on a Graphics Processing Unit (GPU). A parallel loop analysis check may be performed on regions of the source program identified for parallelization. One or more optimizations also may be applied to the source program that convert mathematical operations into a parallel form. The source program may be partitioned into segments for execution on a host and a device. Kernels may be created for the segments to be executed on the device. The size of the kernels may be determined, and memory transfers between the host and device may be optimized.

8.

Granted patent
Systems and methods for estimating performance characteristics of hardware implementations of executable models (status: in force)

Publication (announcement) number: US10078717B1

Publication (announcement) date: 2018-09-18

Application number: US14562647

Application date: 2014-12-05

Applicant: The MathWorks, Inc.

Inventor: Girish Venkataramani, Yongfeng Gu, Rama Kokku

IPC: G06F17/50

CPC classification number: G06F17/505, G06F17/5031, G06F17/5081, G06F2217/80, G06F2217/82, G06F2217/84

Abstract: Systems and methods automatically generate optimized hardware description language code for a model created in a modeling environment. A training tool selects and provides scripts to a hardware synthesis tool chain that direct the tool chain to synthesize hardware components for core components of the modeling environment. A report generated by the tool chain is evaluated to extract performance data for the core components, and the performance data is stored in a library. An optimization tool estimates the performance of the model using the performance data in the library. Based on the performance estimate and an analysis of the model, the optimization tool selects an optimization technique, which it applies to the model, generating a revised model. Estimating performance, and selecting and applying optimizations may be repeated until a performance constraint is satisfied or a termination criterion is met.
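
The estimate-then-optimize loop in this abstract can be sketched as follows. Everything here is a hypothetical simplification: the `LIBRARY` delays are made-up numbers standing in for data extracted from synthesis reports, the model is a list of register-to-register segments, and the only optimization is splitting the slowest segment with a pipeline register.

```python
from typing import Dict, List

# Hypothetical performance library: per-block delays (ns), values invented.
LIBRARY: Dict[str, float] = {"add": 1.0, "mul": 3.0, "gain": 2.0}


def estimate(segments: List[List[str]]) -> float:
    """Estimated critical path: the slowest register-to-register segment."""
    return max(sum(LIBRARY[b] for b in seg) for seg in segments)


def pipeline_worst(segments: List[List[str]]) -> List[List[str]]:
    """One optimization step: split the slowest segment with a register."""
    worst = max(range(len(segments)), key=lambda i: sum(LIBRARY[b] for b in segments[i]))
    seg = segments[worst]
    if len(seg) < 2:
        return segments  # nothing left to split
    mid = len(seg) // 2
    return segments[:worst] + [seg[:mid], seg[mid:]] + segments[worst + 1:]


def optimize(segments: List[List[str]], target_ns: float, max_iters: int = 10) -> List[List[str]]:
    for _ in range(max_iters):               # termination criterion
        if estimate(segments) <= target_ns:  # performance constraint satisfied
            break
        revised = pipeline_worst(segments)
        if revised == segments:              # no further improvement possible
            break
        segments = revised
    return segments


model = [["mul", "add", "gain", "add"]]
result = optimize(model, target_ns=4.0)
```

The point of the library is that `estimate` never calls a synthesis tool; each iteration is a cheap lookup, so the loop can try several revisions before any hardware is synthesized.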

9.

Patent application
SYSTEMS AND METHODS FOR AUTOMATICALLY GENERATING CODE FOR DEEP LEARNING SYSTEMS (status: under examination, published)

Publication (announcement) number: US20180136912A1

Publication (announcement) date: 2018-05-17

Application number: US15816606

Application date: 2017-11-17

Applicant: The MathWorks, Inc.

Inventor: Girish Venkataramani, Rama P. Kokku, Jayaprabha Shankar, James L. Brock, Chun-Yu Shei, Vijaya Raghavan, Yaohung Tsai

IPC: G06F8/35, G06N3/10, G06N3/04, G06N3/08, G06F9/445

CPC classification number: G06F8/35, G06F8/20, G06F8/30, G06F9/44563, G06N3/04, G06N3/0454, G06N3/0481, G06N3/08, G06N3/10, G06N3/105

Abstract: Systems and methods may automatically generate code for deep learning networks. The systems and methods may provide a code generation framework for generating target specific code. The code generation framework may include one or more predefined class hierarchies for constructing objects of the generated code. The objects of the class hierarchies may provide an interface to predefined libraries of deep learning functions optimized for use on a target platform. The systems and methods may perform one or more optimizations on the code being generated.

10.

Granted patent
Utilizing clock rate pipelining to generate code for multi-rate systems (status: in force)

Publication (announcement) number: US09846571B1

Publication (announcement) date: 2017-12-19

Application number: US14596443

Application date: 2015-01-14

Applicant: The MathWorks, Inc.

Inventor: Girish Venkataramani, Yongfeng Gu, Wang Chen

IPC: G06F9/44, G06F11/36

CPC classification number: G06F8/35, G06F11/3672

Abstract: A device generates a model associated with a multi-rate system. The multi-rate system includes a system associated with a clock rate and a sample rate, and the clock rate is greater than the sample rate. The device identifies the clock rate of the multi-rate system based on the model, and identifies a portion, of the model, associated with the sample rate. The device applies clock rate pipelining to adjust the sample rate associated with the portion of the model so that the sample rate substantially equals the clock rate, and generates code associated with the model and the applied clock rate pipelining.
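
A small calculation shows why moving a portion of the model from the sample rate to the clock rate matters. This is one reading of the abstract rather than the patented procedure, and the rates, the stage count, and the function names are hypothetical: pipeline registers clocked at the faster clock rate add far less wall-clock latency than the same registers clocked at the sample rate.

```python
from fractions import Fraction


def oversampling_factor(clock_hz: int, sample_hz: int) -> int:
    """Clock ticks available per sample; assumes the rates divide evenly."""
    if clock_hz % sample_hz != 0:
        raise ValueError("this sketch assumes an integer clock/sample ratio")
    return clock_hz // sample_hz


def pipeline_latency_s(stages: int, rate_hz: int) -> Fraction:
    """Latency in seconds of `stages` pipeline registers clocked at `rate_hz`."""
    return Fraction(stages, rate_hz)


clock_hz, sample_hz, stages = 100_000_000, 10_000_000, 4
factor = oversampling_factor(clock_hz, sample_hz)  # clock ticks per sample
slow = pipeline_latency_s(stages, sample_hz)       # registers at the sample rate
fast = pipeline_latency_s(stages, clock_hz)        # registers at the clock rate
fits_in_one_sample = stages <= factor
```

With these numbers, four pipeline stages at the clock rate finish inside a single sample period, whereas the same stages at the sample rate would add four sample periods of delay.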
