Patent search ap:("Google LLC") AND inv:"Qifei Wang" Page 1

1.

Invention grant
Multi-scale transformer for image analysis [In force]

Publication (announcement) No.: US12217382B2

Publication (announcement) date: 2025-02-04

Application No.: US18527528

Application date: 2023-12-04

Applicant: Google LLC

Inventor: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar

IPC: G06K9/00, G06T3/04, G06T3/40, G06T7/00

Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on image fixed input size and predicts the quality effectively on a native resolution image. A native resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multiscale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
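
Illustrative note (not part of the patent record): the abstract says patch locations at every scale are hashed to one fixed grid, with a separate scale embedding to tell scales apart. A minimal Python sketch of that indexing idea follows. The proportional floor-scaling formula, the function names `hash_to_grid` and `token_keys`, and the `grid_size` parameter are assumptions made for illustration; the abstract does not specify them.

```python
def hash_to_grid(row, col, rows, cols, grid_size):
    """Map patch (row, col) of a rows x cols patch layout to a cell of a
    fixed grid_size x grid_size grid (assumed proportional scaling)."""
    if not (0 <= row < rows and 0 <= col < cols):
        raise ValueError("patch index out of range")
    return (row * grid_size // rows, col * grid_size // cols)


def token_keys(scales, grid_size):
    """For each scale, given as (rows, cols), list one key per patch:
    (scale_index, grid_row, grid_col).

    grid_row/grid_col would select a spatial embedding shared by all scales;
    scale_index would select a separate per-scale embedding.
    """
    keys = []
    for scale_index, (rows, cols) in enumerate(scales):
        for r in range(rows):
            for c in range(cols):
                grid_row, grid_col = hash_to_grid(r, c, rows, cols, grid_size)
                keys.append((scale_index, grid_row, grid_col))
    return keys


if __name__ == "__main__":
    # A 4 x 6 native-resolution layout and a 2 x 3 downscaled layout share one 10 x 10 grid.
    for key in token_keys([(4, 6), (2, 3)], grid_size=10):
        print(key)
```

Because every layout is scaled onto the same grid, patches covering the same relative image region at different scales receive nearby spatial indices, while the scale index keeps them distinguishable.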

2.

Invention publication
Multi-scale Transformer for Image Analysis [Under examination, published]

Publication (announcement) No.: US20240119555A1

Publication (announcement) date: 2024-04-11

Application No.: US18527528

Application date: 2023-12-04

Applicant: Google LLC

Inventor: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar

IPC: G06T3/00, G06T3/40, G06T7/00

CPC classification number: G06T3/0012, G06T3/40, G06T7/0002, G06T2207/20016, G06T2207/20081, G06T2207/30168

Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on image fixed input size and predicts the quality effectively on a native resolution image. A native resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multiscale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.

3.

Invention application
CROSS-PLATFORM DISTILLATION FRAMEWORK [In force]

Publication (announcement) No.: US20240370717A1

Publication (announcement) date: 2024-11-07

Application No.: US18313189

Application date: 2023-05-05

Applicant: Google LLC

Inventor: Qifei Wang, Yicheng Fan, Wei Xu, Jiayu Ye, Lu Wang, Chuo-Ling Chang, Dana Alon, Erik Nathan Vee, Hongkun Yu, Matthias Grundmann, Shanmugasundaram Ravikumar, Andrew Stephen Tomkins

IPC: G06N3/08

Abstract: A method for a cross-platform distillation framework includes obtaining a plurality of training samples. The method includes generating, using a student neural network model executing on a first processing unit, a first output based on a first training sample. The method also includes generating, using a teacher neural network model executing on a second processing unit, a second output based on the first training sample. The method includes determining, based on the first output and the second output, a first loss. The method further includes adjusting, based on the first loss, one or more parameters of the student neural network model. The method includes repeating the above steps for each training sample of the plurality of training samples.
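
Illustrative note (not part of the patent record): the per-sample loop in the abstract (student output, teacher output, loss, parameter update) can be sketched in plain Python. Everything concrete here is an assumption for illustration: the one-parameter student `w * x`, the stand-in `teacher` function, the squared-error loss, and the learning rate. Placement of the two models on different processing units is not modelled.

```python
def teacher(x):
    # Stand-in for the teacher network (hypothetical); in the abstract it
    # executes on a second processing unit.
    return 3.0 * x


def distill(samples, lr=0.05, w=0.0):
    """Fit a one-parameter student y = w * x to the teacher, one sample at a time."""
    for x in samples:
        student_out = w * x                      # first output (student)
        teacher_out = teacher(x)                 # second output (teacher)
        loss = (student_out - teacher_out) ** 2  # loss from both outputs (shown for clarity)
        grad = 2.0 * (student_out - teacher_out) * x  # d(loss)/dw
        w -= lr * grad                           # adjust the student parameter
    return w


if __name__ == "__main__":
    print(distill([1.0] * 200))  # approaches the teacher's slope of 3.0
```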

4.

Invention grant
Learnable cost volume for determining pixel correspondence [In force]

Publication (announcement) No.: US11790550B2

Publication (announcement) date: 2023-10-17

Application No.: US17292647

Application date: 2020-07-08

Applicant: Google LLC

Inventor: Taihong Xiao, Deqing Sun, Ming-Hsuan Yang, Qifei Wang, Jinwei Yuan

IPC: G06K9/00, G06T7/593, G06T7/215

CPC classification number: G06T7/593, G06T7/215, G06T2207/10012, G06T2207/20081

Abstract: A method includes obtaining a first plurality of feature vectors associated with a first image and a second plurality of feature vectors associated with a second image. The method also includes generating a plurality of transformed feature vectors by transforming each respective feature vector of the first plurality of feature vectors by a kernel matrix trained to define an elliptical inner product space. The method additionally includes generating a cost volume by determining, for each respective transformed feature vector of the plurality of transformed feature vectors, a plurality of inner products, wherein each respective inner product of the plurality of inner products is between the respective transformed feature vector and a corresponding candidate feature vector of a corresponding subset of the second plurality of feature vectors. The method further includes determining, based on the cost volume, a pixel correspondence between the first image and the second image.
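
Illustrative note (not part of the patent record): the cost-volume construction in the abstract transforms each first-image feature vector by a kernel matrix and takes inner products with candidate second-image vectors. A minimal one-dimensional Python sketch follows. The 1-D layout, the `max_disp` search window, the function names, and the fixed matrix `W` are assumptions for illustration; training of the kernel matrix and any constraint that keeps it a valid inner product are omitted.

```python
def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cost_volume_1d(feats1, feats2, W, max_disp):
    """cost[i][k] = (W @ feats1[i]) . feats2[i + k - max_disp].

    Entries whose candidate index falls outside feats2 are None.
    """
    volume = []
    for i, feature in enumerate(feats1):
        transformed = matvec(W, feature)  # transformed feature vector
        row = []
        for offset in range(-max_disp, max_disp + 1):
            j = i + offset  # candidate position in the second image
            row.append(dot(transformed, feats2[j]) if 0 <= j < len(feats2) else None)
        volume.append(row)
    return volume


if __name__ == "__main__":
    feats = [[1, 0], [0, 1]]
    W = [[2, 0], [0, 1]]  # fixed example matrix, not a trained one
    print(cost_volume_1d(feats, feats, W, max_disp=1))
```

With `W` set to the identity matrix this reduces to an ordinary dot-product cost volume, which is one way to see what the learned kernel matrix changes.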

5.

Invention application
Multi-scale Transformer for Image Analysis [In force]

Publication (announcement) No.: US20250124537A1

Publication (announcement) date: 2025-04-17

Application No.: US18999336

Application date: 2024-12-23

Applicant: Google LLC

Inventor: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar

IPC: G06T3/04, G06T3/40, G06T7/00

Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on image fixed input size and predicts the quality effectively on a native resolution image. A native resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multiscale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.

6.

Invention application
HIGHLY EFFICIENT MODEL FOR VIDEO QUALITY ASSESSMENT [In force]

Publication (announcement) No.: US20240422369A1

Publication (announcement) date: 2024-12-19

Application No.: US18336577

Application date: 2023-06-16

Applicant: Google LLC

Inventor: Yilin Wang, Miao Yin, Qifei Wang, Boqing Gong, Neil Aylon Charles Birkbeck, Balineedu Chowdary Adsumilli

IPC: H04N21/2343, G06T7/00, H04N19/132, H04N21/466, H04N21/485

Abstract: A method includes generating, for a video stream of a first spatial resolution and a first temporal resolution, a first reduced-quality stream of a second spatial resolution and a second reduced-quality stream of a second temporal resolution. A first subset of STPs is sampled from the first reduced-quality stream and a second subset of STPs is sampled from the second reduced-quality stream. Using a machine learning model (MLM), the STPs are processed to identify a quality score for each of the quality-representative STPs that are representative of a quality of the video stream. One or more quality-improving actions for the video stream are identified using the quality scores of the quality-representative STPs.

7.

Invention publication
Systems and Methods for Generation of Machine-Learned Multitask Models [Under examination, published]

Publication (announcement) No.: US20230267307A1

Publication (announcement) date: 2023-08-24

Application No.: US18014314

Application date: 2020-07-23

Applicant: Google LLC

Inventor: Qifei Wang, Junjie Ke, Grace Chu, Gabriel Mintzer Bender, Luciano Sbaiz, Feng Yang, Andrew Gerald Howard, Alec Michael Go, Jeffrey M. Gilbert, Peyman Milanfar, Joshua William Charles Greaves

IPC: G06N3/045, G06N3/092, G06N3/084

CPC classification number: G06N3/045, G06N3/084, G06N3/092

Abstract: Systems and methods of the present disclosure are directed to a method for generating a machine-learned multitask model configured to perform tasks. The method can include obtaining a machine-learned multitask search model comprising candidate nodes. The method can include obtaining tasks and machine-learned task controller models associated with the tasks. As an example, for a task, the method can include using the task controller model to route a subset of the candidate nodes in a machine-learned task submodel for the corresponding task. The method can include inputting task input data to the task submodel to obtain a task output. The method can include generating, using the task output, a feedback value based on an objective function. The method can include adjusting parameters of the task controller model based on the feedback value.

8.

Invention publication
MULTI-SCALE TRANSFORMER FOR IMAGE ANALYSIS [Under examination, published]

Publication (announcement) No.: US20230222623A1

Publication (announcement) date: 2023-07-13

Application No.: US17787699

Application date: 2021-07-01

Applicant: Google LLC

Inventor: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar

IPC: G06T3/00, G06T3/40, G06T7/00

CPC classification number: G06T3/0012, G06T3/40, G06T7/0002, G06T2207/30168, G06T2207/20081, G06T2207/20016

Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on image fixed input size and predicts the quality effectively on a native resolution image. A native resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multiscale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.

9.

Invention grant
Multi-scale transformer for image analysis [In force]

Publication (announcement) No.: US11887270B2

Publication (announcement) date: 2024-01-30

Application No.: US17787699

Application date: 2021-07-01

Applicant: Google LLC

Inventor: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar

IPC: G06K9/00, G06T3/00, G06T3/40, G06T7/00

CPC classification number: G06T3/0012, G06T3/40, G06T7/0002, G06T2207/20016, G06T2207/20081, G06T2207/30168

Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on image fixed input size and predicts the quality effectively on a native resolution image. A native resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multiscale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.

10.

Invention application
Systems and Methods for Improved Computer Vision in On-Device Applications [In force]

Publication (announcement) No.: US20230091374A1

Publication (announcement) date: 2023-03-23

Application No.: US17802060

Application date: 2020-02-24

Applicant: Google LLC

Inventor: Qifei Wang, Alexander Kuznetsov, Alec Michael Go, Grace Chu, Eunyoung Kim, Feng Yang, Andrew Gerald Howard, Jeffrey M. Gilbert

IPC: G06V30/413, G06V10/22

Abstract: The present disclosure is directed to object and/or character recognition for use in applications such as computer vision. Advantages of the present disclosure include lightweight functionality that can be used on devices such as smart phones. Aspects of the present disclosure include a sequential architecture where a lightweight machine-learned model can receive an image, detect whether an object is present in one or more regions of the image, and generate an output based on the detection. This output can be applied as a filter to remove image data that can be neglected for more memory intensive machine-learned models applied downstream.
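
Illustrative note (not part of the patent record): the sequential architecture in the abstract uses a lightweight detector as a filter so that a heavier downstream model only sees regions worth processing. A minimal Python sketch follows. The mean-intensity detector, the 0.5 threshold, and the names `light_detector`, `filter_regions`, and `heavy_model` are hypothetical stand-ins, not the models described in the application.

```python
def light_detector(region, threshold=0.5):
    # Hypothetical lightweight check: flag a region whose mean value exceeds a threshold.
    return sum(region) / len(region) > threshold


def filter_regions(regions, detector=light_detector):
    """Keep (index, region) pairs that the detector flags; drop the rest."""
    return [(i, region) for i, region in enumerate(regions) if detector(region)]


def heavy_model(region):
    # Stand-in for a more memory-intensive downstream model.
    return max(region)


if __name__ == "__main__":
    regions = [[0.1, 0.2], [0.9, 0.8], [0.6, 0.7]]
    kept = filter_regions(regions)
    # Only the flagged regions are passed to the heavier model.
    print([(i, heavy_model(region)) for i, region in kept])
```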
