Near-memory processing of embeddings method and system for reducing memory size and energy in deep learning-based recommendation systems
Abstract:
Provided is a hybrid near-memory processing system including a GPU, a PIM-HBM, a CPU, and a main memory. Embedding vectors are loaded through the GPU and the PIM-HBM. In the training process of a recommendation system, the embedding table is divided and stored across the main memory and the HBM. In the inference process, an embedding lookup operation is performed in the main memory or the HBM according to the location of the required embedding vector, and an additional embedding manipulation operation is performed in the CPU and the PIM on the embedding vectors for which the lookup has completed. The manipulated embedding vectors are finally concatenated in the PIM to generate an embedding result, and the embedding result is transmitted to the GPU to derive a final inference result through a top multilayer perceptron (MLP) process.
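The sketch below is only an illustration of the data flow described in the abstract, not the patented implementation: the embedding table is modeled as two NumPy arrays standing in for main memory and HBM, lookups are dispatched by row location, a pooling step stands in for the CPU/PIM embedding manipulation, and the pooled vectors are concatenated before a top MLP. All names (HybridEmbedding, hot_rows, lookup) are assumptions introduced for this example.

```python
import numpy as np

# Illustrative sketch only: the real system performs lookups near PIM-HBM hardware;
# here two in-process containers stand in for the two memory regions.

class HybridEmbedding:
    def __init__(self, table: np.ndarray, hot_rows: np.ndarray):
        # Frequently accessed ("hot") rows are assumed to be placed in HBM;
        # the remaining rows stay in main memory.
        self.hot_rows = set(hot_rows.tolist())
        self.hbm_part = {r: table[r] for r in self.hot_rows}  # stands in for HBM/PIM
        self.mem_part = table                                  # stands in for main memory

    def lookup(self, indices: np.ndarray) -> np.ndarray:
        # Embedding lookup runs against whichever memory region holds the row.
        rows = [self.hbm_part[i] if i in self.hot_rows else self.mem_part[i]
                for i in indices]
        # Sum pooling stands in for the additional embedding manipulation step.
        return np.sum(rows, axis=0)

# Usage: pool vectors per sparse feature, concatenate (the PIM step),
# then hand the result to the top MLP on the GPU.
table = np.random.rand(1000, 16).astype(np.float32)
emb = HybridEmbedding(table, hot_rows=np.array([1, 7, 42]))
pooled = [emb.lookup(np.array([1, 7, 500])), emb.lookup(np.array([42, 900]))]
concatenated = np.concatenate(pooled)  # embedding result sent to the GPU
print(concatenated.shape)              # (32,) -> input to the top MLP
```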