Processing computational models in parallel
Abstract:
The present disclosure relates to an artificial intelligence chip for processing machine learning model computations, and provides a compute node and a method of processing a computational model in parallel using a plurality of compute nodes. In some embodiments, the compute node comprises: a communication interface configured to communicate with one or more other compute nodes; a memory configured to store shared data that is shared with the one or more other compute nodes; and a processor configured to: determine an expected computational load for processing a computational model on input data; obtain the contributable computational load of the compute node and of the one or more other compute nodes; and select a master node to distribute the determined expected computational load based on the obtained contributable computational loads. Consequently, learning and inference can be performed efficiently on-device.
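The load-balancing flow the abstract describes (estimate the expected load for the input, gather each node's contributable load, select a master node, and distribute work) could be sketched as follows. This is a minimal illustration only, assuming a simple linear cost estimate and proportional distribution; all names here (ComputeNode, contributable_load, select_master, distribute) are hypothetical and do not appear in the patent.

```python
# Hypothetical sketch of the master-node selection and load distribution
# described in the abstract. Names and cost model are illustrative
# assumptions, not taken from the patent.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ComputeNode:
    node_id: str
    capacity: float          # total processing capacity (arbitrary units)
    current_load: float = 0.0  # load the node is already committed to

    def contributable_load(self) -> float:
        """Capacity still available for new work on this node."""
        return max(self.capacity - self.current_load, 0.0)


def expected_load(model_ops: float, input_size: int) -> float:
    """Estimate the computational load of running the model on the input.

    A stand-in cost model: assume cost scales linearly with input size.
    """
    return model_ops * input_size


def select_master(nodes: List[ComputeNode]) -> ComputeNode:
    """Pick the node with the most spare capacity as the master node.

    The master then distributes the expected load across all nodes.
    """
    return max(nodes, key=lambda n: n.contributable_load())


def distribute(load: float, nodes: List[ComputeNode]) -> Dict[str, float]:
    """Split the expected load in proportion to contributable capacity."""
    total = sum(n.contributable_load() for n in nodes)
    if total == 0:
        raise ValueError("no contributable capacity in the cluster")
    return {n.node_id: load * n.contributable_load() / total for n in nodes}


if __name__ == "__main__":
    cluster = [
        ComputeNode("a", capacity=10.0, current_load=8.0),
        ComputeNode("b", capacity=10.0, current_load=2.0),
        ComputeNode("c", capacity=6.0, current_load=1.0),
    ]
    load = expected_load(model_ops=0.5, input_size=16)  # 8.0 units of work
    master = select_master(cluster)                     # node "b" (most spare)
    print(f"master: {master.node_id}")
    print(distribute(load, cluster))
```

In this sketch the master node is simply the one with the most spare capacity; an actual embodiment could weigh other factors (communication cost, memory for the shared data), which the abstract does not specify.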