- Patent Title: Systems and methods of data augmentation for pre-trained embeddings
- Application No.: US17898780
- Application Date: 2022-08-30
- Publication No.: US11809828B2
- Publication Date: 2023-11-07
- Inventor: Keld Lundgaard, Cameron Wolfe
- Applicant: Salesforce, Inc.
- Applicant Address: US CA San Francisco
- Assignee: Salesforce, Inc.
- Current Assignee: Salesforce, Inc.
- Current Assignee Address: US CA San Francisco
- Agency: Butzel Long
- Main IPC: G06F40/00
- IPC: G06F40/00 ; G06F40/30 ; G06F40/151 ; G06F17/18 ; G06N3/08 ; G06N20/10 ; G06N3/04 ; G06F18/214 ; G06F18/25 ; G06F18/2431 ; G06V10/764 ; G06V10/80 ; G06V10/82 ; G06V10/40 ; G06N20/00

Abstract:
Systems and methods are provided for generating textual embeddings by tokenizing text data and generating vectors to be provided to a transformer system, where the textual embeddings are vector representations of the semantic meaning of text that is part of the text data. The vectors may be averaged over every token of the generated textual embeddings, and the average output activations of two layers of the transformer system may be concatenated. Image embeddings may be generated with a convolutional neural network (CNN) from image data, where the image embeddings are vector representations of the images that are part of the image data. The textual embeddings and image embeddings may be combined to form combined embeddings to be provided to the transformer system.
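
The pooling and combination steps described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function names (`mean_pool`, `text_embedding`, `combine_embeddings`) are hypothetical, the per-token activations are toy values rather than real transformer or CNN outputs, and joining the text and image vectors by concatenation is an assumption, since the abstract only says the embeddings "may be combined".

```python
from statistics import fmean

Vector = list[float]


def mean_pool(token_activations: list[Vector]) -> Vector:
    """Average one layer's activations over all tokens (column-wise mean)."""
    return [fmean(column) for column in zip(*token_activations)]


def text_embedding(layer_a: list[Vector], layer_b: list[Vector]) -> Vector:
    """Concatenate the token-averaged activations of two layers."""
    return mean_pool(layer_a) + mean_pool(layer_b)


def combine_embeddings(text_vec: Vector, image_vec: Vector) -> Vector:
    """Join text and image embeddings (assumption: plain concatenation)."""
    return text_vec + image_vec


# Toy activations: 2 tokens, hidden size 2, for two hypothetical layers.
layer_a = [[1.0, 2.0], [3.0, 4.0]]
layer_b = [[0.0, 1.0], [2.0, 3.0]]
image_vec = [0.5, 0.25]  # stand-in for a CNN image embedding

text_vec = text_embedding(layer_a, layer_b)
combined = combine_embeddings(text_vec, image_vec)
print(text_vec)   # [2.0, 3.0, 1.0, 2.0]
print(combined)   # [2.0, 3.0, 1.0, 2.0, 0.5, 0.25]
```

Averaging over tokens yields a fixed-length vector regardless of how long the input text is, and concatenating two layers doubles that length; which two layers are used is not stated in the abstract.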
Public/Granted literature
- US20230039734A1 SYSTEMS AND METHODS OF DATA AUGMENTATION FOR PRE-TRAINED EMBEDDINGS Public/Granted day: 2023-02-09