Invention Grant
- Patent Title: Multi-scale transformer for image analysis
- Application No.: US17787699
- Application Date: 2021-07-01
- Publication No.: US11887270B2
- Publication Date: 2024-01-30
- Inventors: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar
- Applicant: Google LLC
- Applicant Address: Mountain View, CA, US
- Assignee: Google LLC
- Current Assignee: Google LLC
- Current Assignee Address: Mountain View, CA, US
- Agency: Botus Churchill IP Law LLP
- International Application: PCT/US2021/040111, filed 2021-07-01
- International Publication: WO2023/277919A, published 2023-01-05
- National Stage Entry Date: 2022-06-21
- Main IPC: G06K9/00
- IPC: G06K9/00; G06T3/00; G06T3/40; G06T7/00

Abstract:
The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids the constraint of a fixed input image size and predicts quality effectively on native-resolution images. A native-resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information from both fine-grained detailed patches and coarse-grained global patches. A spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multi-scale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
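The pipeline the abstract describes — patchifying a multi-scale representation, adding hash-based spatial embeddings shared across scales, adding a per-scale embedding, prepending a classification token, and running self-attention — can be sketched as follows. This is a minimal illustrative sketch, not the patented implementation: the patch size, grid size, embedding dimension, random-stand-in parameters, and single-head attention without projections are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

PATCH = 4        # patch size (illustrative assumption)
D = 8            # embedding dimension (illustrative assumption)
GRID = 10        # fixed hash-grid size shared by all scales
NUM_SCALES = 2   # native resolution plus one coarser scale

# Random stand-ins for learnable parameters.
W_patch = rng.normal(size=(PATCH * PATCH, D))   # patch projection
spatial_emb = rng.normal(size=(GRID, GRID, D))  # shared spatial grid table
scale_emb = rng.normal(size=(NUM_SCALES, D))    # one embedding per scale
cls_token = rng.normal(size=(1, D))             # learnable classification token

def embed_scale(img, scale_idx):
    """Split one grayscale image into patches; add spatial and scale
    embeddings. Patch positions are hashed onto the fixed GRID x GRID
    grid, so every scale indexes the same spatial table."""
    h, w = img.shape
    tokens = []
    for i in range(0, h - h % PATCH, PATCH):
        for j in range(0, w - w % PATCH, PATCH):
            patch = img[i:i + PATCH, j:j + PATCH].reshape(-1)
            tok = patch @ W_patch
            # Hash the patch location onto the fixed grid.
            gi = min(GRID - 1, int(i / h * GRID))
            gj = min(GRID - 1, int(j / w * GRID))
            tokens.append(tok + spatial_emb[gi, gj] + scale_emb[scale_idx])
    return np.stack(tokens)

def self_attention(x):
    """Single-head scaled dot-product self-attention (no learned
    query/key/value projections, for brevity)."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

# A "native resolution" image plus one crudely downsampled copy
# stand in for the multi-scale representation.
native = rng.normal(size=(20, 16))
coarse = native[::2, ::2]

tokens = np.concatenate([
    cls_token,                 # prepended classification token
    embed_scale(native, 0),    # fine-grained detailed patches
    embed_scale(coarse, 1),    # coarse-grained global patches
])
out = self_attention(tokens)
image_repr = out[0]            # classification-token output as the final image representation
```

Because the spatial table is indexed by relative position on a fixed grid rather than by absolute pixel coordinates, the same parameters serve images of any resolution, which is what removes the fixed-input-size constraint.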
Public/Granted literature
- US20230222623A1 MULTI-SCALE TRANSFORMER FOR IMAGE ANALYSIS Public/Granted day:2023-07-13