Training sequence generation neural networks using quality scores

    公开(公告)号:US11699074B2

    公开(公告)日:2023-07-11

    申请号:US16746654

    申请日:2020-01-17

    Applicant: Google LLC

    CPC classification number: G06N3/08

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a sequence generation neural network. One of the methods includes obtaining a batch of training examples; for each of the training examples: processing the training network input in the training example using the neural network to generate an output sequence; for each particular output position in the output sequence: identifying a prefix that includes the system outputs at positions before the particular output position in the output sequence, for each possible system output in the vocabulary, determining a highest quality score that can be assigned to any candidate output sequence that includes the prefix followed by the possible system output, and determining an update to the current values of the network parameters that increases a likelihood that the neural network generates a system output at the position that has a high quality score.

    PROCESSING TEXT SEQUENCES USING NEURAL NETWORKS

    公开(公告)号:US20200026765A1

    公开(公告)日:2020-01-23

    申请号:US16338174

    申请日:2017-10-03

    Applicant: Google LLC

    Abstract: A computer-implemented method for training a neural network that is configured to generate a score distribution over a set of multiple output positions. The neural network is configured to process a network input to generate a respective score distribution for each of a plurality of output positions including a respective score for each token in a predetermined set of tokens that includes n-grams of multiple different sizes. Example methods described herein provide trained neural networks which produce results with improved accuracy compared to the state of the art, e.g. translations that are more accurate compared to the state of the art, or more accurate speech recognition compared to the state of the art.

    Very deep convolutional neural networks for end-to-end speech recognition

    公开(公告)号:US10510004B2

    公开(公告)日:2019-12-17

    申请号:US16380101

    申请日:2019-04-10

    Applicant: Google LLC

    Abstract: A speech recognition neural network system includes an encoder neural network and a decoder neural network. The encoder neural network generates an encoded sequence from an input acoustic sequence that represents an utterance. The input acoustic sequence includes a respective acoustic feature representation at each of a plurality of input time steps, the encoded sequence includes a respective encoded representation at each of a plurality of time reduced time steps, and the number of time reduced time steps is less than the number of input time steps. The encoder neural network includes a time reduction subnetwork, a convolutional LSTM subnetwork, and a network in network subnetwork. The decoder neural network receives the encoded sequence and processes the encoded sequence to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings.

    Sequence modeling using imputation
    25.
    发明授权

    公开(公告)号:US12242818B2

    公开(公告)日:2025-03-04

    申请号:US17797872

    申请日:2021-02-08

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for sequence modeling. One of the methods includes receiving an input sequence having a plurality of input positions; determining a plurality of blocks of consecutive input positions; processing the input sequence using a neural network to generate a latent alignment, comprising, at each of a plurality of input time steps: receiving a partial latent alignment from a previous input time step; selecting an input position in each block, wherein the token at the selected input position of the partial latent alignment in each block is a mask token; and processing the partial latent alignment and the input sequence using the neural network to generate a new latent alignment, wherein the new latent alignment comprises, at the selected input position in each block, an output token or a blank token; and generating, using the latent alignment, an output sequence.

    Image Enhancement via Iterative Refinement based on Machine Learning Models

    公开(公告)号:US20230067841A1

    公开(公告)日:2023-03-02

    申请号:US17391150

    申请日:2021-08-02

    Applicant: Google LLC

    Abstract: A method includes receiving, by a computing device, training data comprising a plurality of pairs of images, wherein each pair comprises an image and at least one corresponding target version of the image. The method also includes training a neural network based on the training data to predict an enhanced version of an input image, wherein the training of the neural network comprises applying a forward Gaussian diffusion process that adds Gaussian noise to the at least one corresponding target version of each of the plurality of pairs of images to enable iterative denoising of the input image, wherein the iterative denoising is based on a reverse Markov chain associated with the forward Gaussian diffusion process. The method additionally includes outputting the trained neural network.

    Processing text sequences using neural networks

    公开(公告)号:US11182566B2

    公开(公告)日:2021-11-23

    申请号:US16338174

    申请日:2017-10-03

    Applicant: Google LLC

    Abstract: A computer-implemented method for training a neural network that is configured to generate a score distribution over a set of multiple output positions. The neural network is configured to process a network input to generate a respective score distribution for each of a plurality of output positions including a respective score for each token in a predetermined set of tokens that includes n-grams of multiple different sizes. Example methods described herein provide trained neural networks which produce results with improved accuracy compared to the state of the art, e.g. translations that are more accurate compared to the state of the art, or more accurate speech recognition compared to the state of the art.

Patent Agency Ranking