Method and apparatus for accelerating distributed training of a deep neural network

    Publication Number: US11514309B2

    Publication Date: 2022-11-29

    Application Number: US16215033

    Application Date: 2018-12-10

Abstract: Embodiments of the present invention provide a method and apparatus for accelerating distributed training of a deep neural network. In the method, the training is organized as a distributed, parallel training mode: the deep neural network to be trained is divided into multiple sub-networks, and the set of training samples is divided into multiple subsets of samples. The deep neural network is then trained with the multiple subsets of samples on a distributed cluster architecture under a preset scheduling method, and the multiple sub-networks are trained simultaneously so as to accomplish the distributed training of the deep neural network. Using the distributed cluster architecture and the preset scheduling method may reduce, through data localization, the effect of network delay on the sub-networks under distributed training; adapt the training strategy in real time; and synchronize the sub-networks trained in parallel. As such, the time required for the distributed training of the deep neural network may be reduced and the training efficiency of the deep neural network may be improved.
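The abstract describes dividing the training samples into subsets, training sub-networks on those subsets in parallel over a cluster, and periodically synchronizing them. The sketch below is only a minimal illustration of that general pattern, not the patented implementation: it assumes a toy linear "sub-network", process-level workers in place of a real cluster, and parameter averaging as the synchronization step; the names split_samples, train_subnetwork, and sync_parameters are illustrative and do not appear in the patent.

    # Minimal sketch of subset-parallel training with periodic synchronization.
    # Assumptions: a toy linear model stands in for a sub-network, local
    # processes stand in for cluster nodes, and averaging stands in for the
    # patent's scheduling/synchronization mechanism.
    import numpy as np
    from concurrent.futures import ProcessPoolExecutor

    def split_samples(x, y, num_subsets):
        """Divide the training set into roughly equal subsets, one per worker."""
        idx = np.array_split(np.arange(len(x)), num_subsets)
        return [(x[i], y[i]) for i in idx]

    def train_subnetwork(args):
        """Train one sub-network (here a linear model) on its local subset."""
        (x, y), w, lr, steps = args
        for _ in range(steps):
            grad = 2 * x.T @ (x @ w - y) / len(x)   # gradient of mean squared error
            w = w - lr * grad
        return w

    def sync_parameters(weights):
        """Synchronize the sub-networks trained in parallel by averaging weights."""
        return np.mean(weights, axis=0)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        x = rng.normal(size=(1024, 8))
        y = x @ rng.normal(size=(8, 1)) + 0.01 * rng.normal(size=(1024, 1))

        num_workers = 4
        subsets = split_samples(x, y, num_workers)      # each worker keeps its data local
        w = np.zeros((8, 1))

        for _ in range(10):                             # scheduling rounds
            jobs = [(s, w, 0.01, 20) for s in subsets]
            with ProcessPoolExecutor(max_workers=num_workers) as pool:
                local_weights = list(pool.map(train_subnetwork, jobs))
            w = sync_parameters(local_weights)          # synchronize parallel sub-networks
        print("final loss:", float(np.mean((x @ w - y) ** 2)))

In this toy setup each worker touches only its own subset (the "data localization" aspect), and the per-round averaging plays the role of the synchronization the abstract attributes to the preset scheduling method; the real apparatus would additionally adapt the training strategy in real time, which is not modeled here.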
