-
Publication No.: WO2022265573A2
Publication Date: 2022-12-22
Application No.: PCT/SG2022/050343
Application Date: 2022-05-23
Applicant: LEMON INC.
Inventor: JIN, Xiaojie; ZHOU, Daquan; LIAN, Xiaochen; YANG, Linjie; FENG, Jiashi
Abstract: A super-network comprising a plurality of layers may be generated. Each layer may comprise cells with different structures. A predetermined number of cells from each layer may be selected. A plurality of cells may be generated based on the selected cells using a local mutation model, wherein the local mutation model comprises a mutation window for removing redundant edges from each selected cell. Performance of the plurality of cells may be evaluated using a differentiable fitness scoring function. The operations of generating a plurality of cells using the local mutation model, evaluating their performance using the differentiable fitness scoring function, and selecting a subset of cells based on the evaluation results may be iteratively performed until the super-network converges. A search space for each layer may be generated based on a predetermined number of top cells with the largest fitness scores after the super-network converges.
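Illustrative sketch (not part of the patent record): the select, mutate, score loop described in the abstract above might look roughly like the following. Everything here is an assumption made for illustration: the encoding of a cell as a set of edges, the names `local_mutation`, `fitness` and `search_layer`, the window bounds, and the crude convergence test. In particular, the toy score is a plain, non-differentiable stand-in for the differentiable fitness scoring function the abstract mentions.

```python
import random

# Hypothetical encoding: a cell is a frozenset of directed edges (i, j), i < j.

def local_mutation(cell, window, rng):
    """Remove one edge whose endpoints both lie inside the mutation window."""
    lo, hi = window
    candidates = sorted(e for e in cell if lo <= e[0] and e[1] <= hi)
    if not candidates or len(cell) <= 1:
        return cell  # nothing to prune inside the window
    return cell - {rng.choice(candidates)}

def fitness(cell, useful_edges):
    """Toy stand-in score: reward useful edges, penalize every edge kept."""
    return 2.0 * len(cell & useful_edges) - 0.5 * len(cell)

def search_layer(initial_cells, useful_edges, keep=2, children_per_cell=3,
                 window=(0, 3), max_rounds=20, seed=0):
    """Iterate mutate -> score -> select until the selected set stops changing."""
    rng = random.Random(seed)
    selected = list(initial_cells)[:keep]
    for _round in range(max_rounds):
        children = [local_mutation(cell, window, rng)
                    for cell in selected
                    for _child in range(children_per_cell)]
        pool = list(dict.fromkeys(selected + children))  # dedupe, keep order
        pool.sort(key=lambda c: fitness(c, useful_edges), reverse=True)
        best = pool[:keep]
        if best == selected:  # crude convergence test (assumption)
            break
        selected = best
    return selected  # top-scoring cells, usable as the layer's search space
```

Because parents stay in the candidate pool each round, the best score in this sketch never decreases from one round to the next.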
-
Publication No.: WO2022260591A1
Publication Date: 2022-12-15
Application No.: PCT/SG2022/050296
Application Date: 2022-05-10
Applicant: LEMON INC.
Inventor: LIAN, Xiaochen; YANG, Linjie; WANG, Peng; JIN, Xiaojie; DING, Mingyu
Abstract: Systems and methods for searching a search space, specifically neural architecture search, are disclosed. Some examples may include using a first parallel module including a first plurality of stacked searching blocks and a second plurality of stacked searching blocks to output first feature maps of a first resolution and to output second feature maps of a second resolution. In some examples, a fusion module is configured to generate multiscale feature maps by fusing one or more feature maps of the first resolution received from the first parallel module with one or more feature maps of the second resolution received from the first parallel module, and wherein the fusion module is configured to output the multiscale feature maps and output third feature maps of a third resolution. The searching blocks can comprise a transformer with a projection function for learning self-attention in low-dimensional space.
-
Publication No.: WO2022260590A1
Publication Date: 2022-12-15
Application No.: PCT/SG2022/050295
Application Date: 2022-05-10
Applicant: LEMON INC.
Inventor: LIAN, Xiaochen; DING, Mingyu; YANG, Linjie; WANG, Peng; JIN, Xiaojie
Abstract: Systems and methods for obtaining attention features are described. Some examples may include: receiving, at a projector of a transformer, a plurality of tokens associated with image features of a first dimensional space; generating, at the projector of the transformer, projected features by concatenating the plurality of tokens with a positional map, the projected features having a second dimensional space that is less than the first dimensional space; receiving, at an encoder of the transformer, the projected features and generating encoded representations of the projected features using self-attention; decoding, at a decoder of the transformer, the encoded representations and obtaining a decoded output; and projecting the decoded output to the first dimensional space and adding the image features of the first dimensional space to obtain attention features associated with the image features.
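Illustrative sketch (not part of the patent record): the data flow in the abstract above (concatenate tokens with a positional map, project to a smaller dimension, apply self-attention there, project back, add the original features) could be approximated as below. This is a simplification under stated assumptions: the encoder and decoder are collapsed into a single unparameterized self-attention step, and `w_down`, `w_up`, `pos_map` and all function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Single-head scaled dot-product attention with identity Q/K/V maps."""
    scale = math.sqrt(len(x[0]))
    scores = [[sum(a * b for a, b in zip(q, k)) / scale for k in x] for q in x]
    weights = [softmax(r) for r in scores]
    return matmul(weights, x)

def attention_features(tokens, pos_map, w_down, w_up):
    """tokens: N x D, pos_map: N x P, w_down: (D+P) x d with d < D, w_up: d x D."""
    concat = [t + p for t, p in zip(tokens, pos_map)]  # per-token concatenation
    low = matmul(concat, w_down)                       # project to low-dim space
    attended = self_attention(low)                     # attention in low-dim space
    restored = matmul(attended, w_up)                  # project back to D dims
    # Residual connection: add the original image features.
    return [[r + t for r, t in zip(r_row, t_row)]
            for r_row, t_row in zip(restored, tokens)]
```

Running attention in the smaller space keeps the per-pair similarity cost proportional to d rather than D, which is presumably the motivation for the projection.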
-
Publication No.: WO2023014292A2
Publication Date: 2023-02-09
Application No.: PCT/SG2022/050530
Application Date: 2022-07-26
Applicant: LEMON INC.
Inventor: YANG, Linjie; LIN, Peter; SALEEMI, Imran
IPC: G06T7/194, H04N19/20, G06N3/044, G06N3/0455, G06T2207/10016, G06T2207/20081, G06T2207/30196, G06T3/40, G06T7/11, G06V10/454, G06V10/82, G06V20/46, G06V20/695
Abstract: The present disclosure describes techniques for improving video matting. The techniques comprise extracting features from each frame of a video by an encoder of a model, wherein the video comprises a plurality of frames; incorporating, by a decoder of the model, into any particular frame temporal information extracted from one or more frames previous to the particular frame, wherein the particular frame and the one or more previous frames are among the plurality of frames of the video, and the decoder is a recurrent decoder; and generating a representation of a foreground object included in the particular frame by the model, wherein the model is trained using a segmentation dataset and a matting dataset.
-
Publication No.: WO2022186780A1
Publication Date: 2022-09-09
Application No.: PCT/SG2022/050111
Application Date: 2022-03-03
Applicant: LEMON INC.
Inventor: YANG, Linjie; JIANG, Ziyu; LIU, Ding; WEN, Longyin
Abstract: Systems and methods directed to performing video object segmentation are provided. In examples, video data representing a sequence of image frames and video data representing an object mask may be received at a video object segmentation server. Image features may be generated based on a first image frame of the sequence of image frames and on a second image frame of the sequence of image frames, and object features may be generated based on the object mask. A transform matrix may be computed based on the image features of the first image frame and the image features of the second image frame; the transform matrix may be applied to the object features, resulting in transformed object features. A predicted object mask associated with the second image frame may be obtained by decoding the transformed object features.
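Illustrative sketch (not part of the patent record): the abstract above does not say how the transform matrix is computed. One common choice, assumed here purely for illustration, is a softmax-normalized similarity matrix between per-location features of the two frames, which is then used to carry the first frame's object features over to the second frame. The function names and the flat list-of-vectors layout are hypothetical, and the final decoding step is omitted.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transform_matrix(feats_first, feats_second):
    """Row i: normalized similarity of second-frame location i to every
    first-frame location (dot-product similarity is an assumption)."""
    return [softmax([sum(a * b for a, b in zip(q, k)) for k in feats_first])
            for q in feats_second]

def apply_transform(matrix, object_feats):
    """Each output row is a weighted mix of the first frame's object features."""
    dim = len(object_feats[0])
    return [[sum(w * f[d] for w, f in zip(row, object_feats))
             for d in range(dim)]
            for row in matrix]

# Usage: two locations whose appearance swaps between the frames, so the
# object feature attached to location 0 should move to location 1.
first = [[10.0, 0.0], [0.0, 10.0]]
second = [[0.0, 10.0], [10.0, 0.0]]
object_feats = [[1.0], [0.0]]
moved = apply_transform(transform_matrix(first, second), object_feats)
```

In this toy case `moved` is close to `[[0.0], [1.0]]`: the object feature follows the location whose appearance matches.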