Systems and methods for image feature recognition using a lensless camera

    Publication No.: US12272171B2

    Publication Date: 2025-04-08

    Application No.: US18371186

    Filing Date: 2023-09-21

    Inventor: Zhu Li

    Abstract: Systems and methods are described for generating pixel image data, using a lensless camera, based on light that travels through a mask with a pattern that masks the lensless camera. The system applies a transformation function to the pixel image data to generate frequency domain image data. The system inputs the frequency domain image data into a machine learning model, wherein the machine learning model does not have access to data that represents the pattern of the mask. The model is trained using a set of images that depict the image feature and are captured by the flat, lensless camera through the mask. The system processes the frequency domain image data using the machine learning model to determine whether the pixel image data depicts the image feature. The system further performs an action based on determining that the pixel image data depicts the image feature.
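
The transformation step in this abstract can be illustrated with a minimal sketch. The abstract does not name the transformation function, so the 2D discrete Fourier transform used here is an assumption, and the names `dft2` and `magnitude_features` are hypothetical. The machine learning model itself is not shown.

```python
import cmath

def dft2(pixels):
    """Naive 2D discrete Fourier transform of a rectangular grid of pixel values.

    Assumption: the abstract only says "a transformation function"; a DFT is
    one possible choice, used here for illustration.
    """
    rows, cols = len(pixels), len(pixels[0])
    spectrum = []
    for u in range(rows):
        spectrum_row = []
        for v in range(cols):
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2 * cmath.pi * (u * y / rows + v * x / cols)
                    total += pixels[y][x] * cmath.exp(1j * angle)
            spectrum_row.append(total)
        spectrum.append(spectrum_row)
    return spectrum

def magnitude_features(pixels):
    """Flatten the magnitude spectrum into a feature vector for a classifier."""
    return [abs(c) for row in dft2(pixels) for c in row]
```

The resulting feature vector would be passed to a trained classifier in place of the raw sensor pixels. The quadruple loop is only practical for tiny inputs; a real pipeline would likely use a fast transform.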

    Systems and methods for quality of experience computation

    Publication No.: US12192263B2

    Publication Date: 2025-01-07

    Application No.: US17890683

    Filing Date: 2022-08-18

    Inventors: Zhu Li, Tao Chen

    Abstract: The system trains a machine learning model using a loss function having a first part that penalizes overall signal loss and a second part that penalizes texture loss. The system computes a first neural feature of a first media frame stored by a media server using the trained machine learning model. The system causes a client device to receive a second media frame as part of a media stream from the media server, where the second media frame is a modified version of the first media frame. The system causes the client device to compute a second neural feature of the second media frame using the trained machine learning model, and to compute a QoE metric based on the first neural feature and the second neural feature. The system receives the QoE metric and uses it to modify at least one parameter of the media stream.
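
The feature comparison described in this abstract might be sketched as follows. The abstract does not say how the QoE metric is derived from the two neural features, so the Euclidean distance, the mapping to a score, the bitrate adjustment, and the names `qoe_metric` and `adjust_bitrate` are all assumptions made for illustration.

```python
import math

def qoe_metric(server_feature, client_feature):
    """Map the distance between two feature vectors to a score in (0, 1].

    Assumption: identical features give 1.0; larger distances give lower scores.
    """
    distance = math.sqrt(
        sum((a - b) ** 2 for a, b in zip(server_feature, client_feature))
    )
    return 1.0 / (1.0 + distance)

def adjust_bitrate(current_kbps, qoe, threshold=0.5, step_kbps=500):
    """Hypothetical stream-parameter update: raise the bitrate when QoE is low."""
    if qoe < threshold:
        return current_kbps + step_kbps
    return current_kbps
```

In this sketch the client would report `qoe_metric(...)` to the server, which would then call something like `adjust_bitrate` to change one parameter of the stream.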

    SYSTEMS AND METHODS FOR ENCODING THREE-DIMENSIONAL MEDIA CONTENT

    Publication No.: US20250056044A1

    Publication Date: 2025-02-13

    Application No.: US18924670

    Filing Date: 2024-10-23

    Inventor: Zhu Li

    Abstract: Systems and methods are provided for encoding a frame of 3D media content. The systems and methods may be configured to access a first frame of 3D media content and generate a data structure for the first frame based on color attributes information of the first frame, wherein each element of the data structure encodes a single color. The systems and methods may be configured to train a machine learning model based on the first frame of 3D media content, wherein the machine learning model is trained to receive as input a coordinate of a voxel of the first frame, and to output an identifier of a particular element in the generated data structure. The systems and methods may be configured to generate encoded data for the first frame based at least in part on weights of the trained machine learning model and the generated data structure.
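
The data structure in this abstract, in which each element encodes a single color, can be pictured as a color palette. The sketch below builds such a palette and the per-voxel element identifiers that a coordinate-to-identifier model would be trained to reproduce. Exact-match deduplication of colors and the name `build_palette` are assumptions; the model training itself is not shown.

```python
def build_palette(voxels):
    """Build a palette and per-voxel palette indices.

    voxels: dict mapping an (x, y, z) coordinate to an (r, g, b) color.
    Returns (palette, labels), where palette is a list of distinct colors and
    labels maps each coordinate to the index of its color in the palette.
    Assumption: colors are deduplicated by exact match, with no quantization.
    """
    palette = []
    index_of = {}
    labels = {}
    for coord, color in voxels.items():
        if color not in index_of:
            index_of[color] = len(palette)
            palette.append(color)
        labels[coord] = index_of[color]
    return palette, labels
```

Under these assumptions, the encoded frame would consist of the palette plus the weights of a model fitted to the `labels` mapping.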

    SYSTEMS AND METHODS FOR MESH GEOMETRY PREDICTION BASED ON A CENTROID-NORMAL REPRESENTATION

    Publication No.: US20240404121A1

    Publication Date: 2024-12-05

    Application No.: US18203897

    Filing Date: 2023-05-31

    Inventors: Zhu Li, Tao Chen

    Abstract: Systems and methods are provided for predictive mesh coding based on a centroid-normal (C-N) representation. An encoder generates C-N representations of a high-resolution (hi-res) mesh and a downscaling of the mesh (lo-res mesh), each representation having respective centroids and normals. The encoder generates predicted centroids corresponding to the hi-res mesh based on the lo-res centroids using a centroid prediction model. The encoder generates predicted normals corresponding to the hi-res mesh based on the predicted centroids and lo-res normals using a normal vector prediction model. Residuals are computed for the respective predicted geometry data. The encoder transmits encodings of the lo-res mesh and the residuals for decoding at a client device.
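
A centroid-normal pair for a single triangular face, and the residual between actual and predicted geometry, can be sketched as below. The abstract does not specify how centroids and normals are computed, so the per-triangle formulation and the names `centroid_normal` and `residual` are assumptions; the prediction models are not shown.

```python
import math

def centroid_normal(a, b, c):
    """Return (centroid, unit normal) for a triangle with 3D vertices a, b, c."""
    centroid = tuple((a[i] + b[i] + c[i]) / 3.0 for i in range(3))
    u = tuple(b[i] - a[i] for i in range(3))
    v = tuple(c[i] - a[i] for i in range(3))
    # The cross product of two edges is perpendicular to the face.
    n = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    length = math.sqrt(sum(x * x for x in n))
    if length == 0:
        raise ValueError("degenerate triangle has no normal")
    return centroid, tuple(x / length for x in n)

def residual(actual, predicted):
    """Component-wise difference that an encoder would transmit."""
    return tuple(x - p for x, p in zip(actual, predicted))
```

A decoder that runs the same prediction on the lo-res data could add the residual back to recover the hi-res centroid or normal.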

    Systems and methods for mesh geometry prediction for high efficiency mesh coding

    Publication No.: US12198273B2

    Publication Date: 2025-01-14

    Application No.: US17974863

    Filing Date: 2022-10-27

    Inventors: Zhu Li, Tao Chen

    Abstract: Systems and methods are provided for efficiently encoding geometry information for 3D media content. An illustrative system generates a low-resolution polygon mesh from a high-resolution polygon mesh. The system uses a vertex occupancy prediction network to generate, from vertices of the low-resolution polygon mesh, approximated vertices of the high-resolution polygon mesh. The system uses a connectivity prediction network to generate, from approximated vertices of the high-resolution polygon mesh, approximated connections of the high-resolution polygon mesh. The system computes vertex errors between the approximated vertices and the vertices of the high-resolution polygon mesh, and connectivity errors between the approximated connections and the connections of the high-resolution polygon mesh. The system transmits, to a receiver over a communication network, bitstreams of the low-resolution polygon mesh, the vertex errors, and the connectivity errors for reconstruction of the high-resolution polygon mesh and display of the 3D media content.

    Systems and methods for encoding three-dimensional media content

    Publication No.: US12160610B2

    Publication Date: 2024-12-03

    Application No.: US17896755

    Filing Date: 2022-08-26

    Inventor: Zhu Li

    Abstract: Systems and methods are provided for encoding a frame of 3D media content. The systems and methods may be configured to access a first frame of 3D media content and generate a data structure for the first frame based on color attributes information of the first frame, wherein each element of the data structure encodes a single color. The systems and methods may be configured to train a machine learning model based on the first frame of 3D media content, wherein the machine learning model is trained to receive as input a coordinate of a voxel of the first frame, and to output an identifier of a particular element in the generated data structure. The systems and methods may be configured to generate encoded data for the first frame based at least in part on weights of the trained machine learning model and the generated data structure.

    Predictive coding of lenslet images

    Publication No.: US12299945B2

    Publication Date: 2025-05-13

    Application No.: US17734611

    Filing Date: 2022-05-02

    Inventor: Zhu Li

    Abstract: Systems, methods and apparatuses are described herein for accessing image data, generated at least in part using a device comprising a lenslet array, determining a plurality of reference pixel blocks of the image data, and determining a prediction block in a vicinity of the reference pixel blocks. Implementing any of the technique(s) described herein, the system or systems may determine, based on the plurality of reference pixel blocks, a first component representing average pixel values of the prediction block, a second component representing low frequency pixel values of the prediction block, and a third component representing high frequency pixel values of the prediction block. The system(s) may determine predicted pixel values of the prediction block based on the first component, the second component and the third component, and encode the image data based at least in part on the predicted pixel values of the prediction block.
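
The three-component split named in this abstract (average, low frequency, high frequency) can be illustrated on a one-dimensional row of pixel values. This sketch only shows one way such components could be defined and recombined; the moving-average filter, the additive recombination, and the names `split_components` and `recombine` are assumptions, and the prediction from reference pixel blocks is not shown.

```python
def split_components(block):
    """Split a list of pixel values into (mean, low-frequency, high-frequency).

    Assumption: low frequency is a 3-tap moving average of the mean-removed
    signal, and high frequency is whatever remains.
    """
    n = len(block)
    mean = sum(block) / n
    centered = [p - mean for p in block]
    low = []
    for i in range(n):
        window = centered[max(0, i - 1): i + 2]
        low.append(sum(window) / len(window))
    high = [c - l for c, l in zip(centered, low)]
    return mean, low, high

def recombine(mean, low, high):
    """Rebuild pixel values by summing the three components."""
    return [mean + l + h for l, h in zip(low, high)]
```

Summing the three parts returns the original values up to floating-point rounding, which is why a predictor could estimate each component separately and add them.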

    SYSTEMS AND METHODS FOR QUALITY OF EXPERIENCE COMPUTATION

    Publication No.: US20250088554A1

    Publication Date: 2025-03-13

    Application No.: US18960888

    Filing Date: 2024-11-26

    Inventors: Zhu Li, Tao Chen

    Abstract: The system trains a machine learning model using a loss function having a first part that penalizes overall signal loss and a second part that penalizes texture loss. The system computes a first neural feature of a first media frame stored by a media server using the trained machine learning model. The system causes a client device to receive a second media frame as part of a media stream from the media server, where the second media frame is a modified version of the first media frame. The system causes the client device to compute a second neural feature of the second media frame using the trained machine learning model, and to compute a QoE metric based on the first neural feature and the second neural feature. The system receives the QoE metric and uses it to modify at least one parameter of the media stream.
