MULTIMODAL DIFFUSION MODELS
    Published invention application

    Publication number: US20240265505A1

    Publication date: 2024-08-08

    Application number: US18165141

    Application date: 2023-02-06

    Applicant: ADOBE INC.

    CPC classification number: G06T5/70 G06T2207/20081 G06T2207/20084

    Abstract: Systems and methods for image processing are described. Embodiments of the present disclosure obtain a noise image and guidance information for generating an image. A diffusion model generates an intermediate noise prediction for the image based on the noise image. A conditioning network generates noise modulation parameters. The intermediate noise prediction and the noise modulation parameters are combined to obtain a modified intermediate noise prediction. The diffusion model generates the image based on the modified intermediate noise prediction, wherein the image depicts a scene based on the guidance information.
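The combination step in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the intermediate noise prediction and the noise modulation parameters are combined, so the per-channel scale-and-shift form used here is an assumption, not the claimed method. The names `toy_diffusion_model`, `toy_conditioning_network`, and `modulate_noise_prediction` are hypothetical stand-ins for the trained networks.

```python
import numpy as np


def toy_diffusion_model(noise_image):
    # Hypothetical stand-in for the diffusion model: returns an
    # "intermediate noise prediction" with the same shape as its input.
    return 0.5 * noise_image


def toy_conditioning_network(guidance):
    # Hypothetical stand-in for the conditioning network: maps a guidance
    # vector to per-channel modulation parameters (scale, shift).
    scale = np.tanh(guidance[:3]).reshape(3, 1, 1)
    shift = np.tanh(guidance[3:6]).reshape(3, 1, 1)
    return scale, shift


def modulate_noise_prediction(noise_pred, scale, shift):
    # Assumed combination rule: an affine (scale-and-shift) modulation.
    # With scale = 0 and shift = 0 the prediction is left unchanged.
    return noise_pred * (1.0 + scale) + shift


rng = np.random.default_rng(0)
noise_image = rng.standard_normal((3, 8, 8))  # channels, height, width
guidance = np.linspace(-1.0, 1.0, 6)          # toy guidance information

intermediate = toy_diffusion_model(noise_image)
scale, shift = toy_conditioning_network(guidance)
modified = modulate_noise_prediction(intermediate, scale, shift)
```

In a full sampler, the modified prediction would take the place of the unmodified one in each denoising step; that loop is omitted here because the abstract does not describe it.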

    High resolution conditional face generation

    Publication number: US11887216B2

    Publication date: 2024-01-30

    Application number: US17455796

    Application date: 2021-11-19

    Applicant: ADOBE INC.

    CPC classification number: G06T11/00 G06N3/08 G06V40/168 G06V40/172

    Abstract: The present disclosure describes systems and methods for image processing. Embodiments of the present disclosure include an image processing apparatus configured to generate modified images (e.g., synthetic faces) by conditionally changing attributes or landmarks of an input image. A machine learning model of the image processing apparatus encodes the input image to obtain a joint conditional vector that represents attributes and landmarks of the input image in a vector space. The joint conditional vector is then modified, according to the techniques described herein, to form a latent vector used to generate a modified image. In some cases, the machine learning model is trained using a generative adversarial network (GAN) with a normalization technique, followed by joint training of a landmark embedding and attribute embedding (e.g., to reduce inference time).
