-
Publication No.: US20230325658A1
Publication date: 2023-10-12
Application No.: US18010426
Filing date: 2021-09-02
Applicant: Google LLC
Inventor: Nanxin Chen, Byungha Chun, William Chan, Ron J. Weiss, Mohammad Norouzi, Yu Zhang, Yonghui Wu
CPC classification number: G06N3/08, G06V10/26, G06V10/764, G06V10/82, G10L13/02, G10L25/18, G10L25/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating outputs conditioned on network inputs using neural networks. In one aspect, a method comprises obtaining the network input; initializing a current network output; and generating the final network output by updating the current network output at each of a plurality of iterations, wherein each iteration corresponds to a respective noise level, and wherein the updating comprises, at each iteration: processing a model input for the iteration comprising (i) the current network output and (ii) the network input using a noise estimation neural network that is configured to process the model input to generate a noise output, wherein the noise output comprises a respective noise estimate for each value in the current network output; and updating the current network output using the noise estimate and the noise level for the iteration.
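Illustrative sketch of the iterative update described in this abstract. The abstract does not state the update rule, so the code assumes a standard denoising-diffusion step; the names `refine`, `oracle_estimator`, and `betas` are hypothetical, and the "noise estimation network" is replaced by a toy oracle that treats the network input as the clean target.

```python
import math
import random


def refine(network_input, noise_estimator, betas, seed=0):
    """Iteratively denoise a randomly initialised output, conditioned on network_input."""
    rng = random.Random(seed)
    alphas = [1.0 - b for b in betas]
    # alpha_bars[t] is the product of alphas[0..t]; it encodes the noise level at step t.
    alpha_bars = []
    prod = 1.0
    for a in alphas:
        prod *= a
        alpha_bars.append(prod)

    # Assumption: the current network output is initialised with Gaussian noise.
    current = [rng.gauss(0.0, 1.0) for _ in network_input]

    # One iteration per noise level, from the noisiest to the cleanest.
    for t in reversed(range(len(betas))):
        # One noise estimate per value in the current output.
        noise = noise_estimator(current, network_input, alpha_bars[t])
        scale = (1.0 - alphas[t]) / math.sqrt(1.0 - alpha_bars[t])
        current = [(y - scale * e) / math.sqrt(alphas[t])
                   for y, e in zip(current, noise)]
        if t > 0:
            # Assumed stochastic step: re-inject a little noise except at the end.
            sigma = math.sqrt(betas[t])
            current = [y + sigma * rng.gauss(0.0, 1.0) for y in current]
    return current


def oracle_estimator(current, network_input, alpha_bar):
    """Toy stand-in for a trained network: treats network_input as the clean target."""
    return [(y - math.sqrt(alpha_bar) * x) / math.sqrt(1.0 - alpha_bar)
            for y, x in zip(current, network_input)]


betas = [0.02 * (i + 1) for i in range(10)]  # hypothetical noise schedule
result = refine([0.5, -1.0, 2.0], oracle_estimator, betas)
```

With the oracle in place of a trained network, `result` matches the conditioning values up to floating-point error, which only checks the bookkeeping of the loop, not the quality of any real model.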
-
Publication No.: US11699074B2
Publication date: 2023-07-11
Application No.: US16746654
Filing date: 2020-01-17
Applicant: Google LLC
Inventor: Mohammad Norouzi, William Chan, Sara Sabour Rouh Aghdam
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a sequence generation neural network. One of the methods includes obtaining a batch of training examples; for each of the training examples: processing the training network input in the training example using the neural network to generate an output sequence; for each particular output position in the output sequence: identifying a prefix that includes the system outputs at positions before the particular output position in the output sequence, for each possible system output in the vocabulary, determining a highest quality score that can be assigned to any candidate output sequence that includes the prefix followed by the possible system output, and determining an update to the current values of the network parameters that increases a likelihood that the neural network generates a system output at the position that has a high quality score.
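Illustrative sketch of the "highest quality score" step in this abstract. It assumes the quality score is negative edit distance to a reference sequence and finds the best completion by brute-force enumeration over a tiny vocabulary; neither choice is stated in the abstract, end-of-sequence handling is omitted, and `best_scores` and `max_extra` are hypothetical names.

```python
from itertools import product


def edit_distance(a, b):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def best_scores(prefix, target, vocab, max_extra):
    """For each possible next token, the best score of any completion.

    Candidates are the prefix, then the token, then up to max_extra more tokens.
    """
    scores = {}
    for token in vocab:
        start = list(prefix) + [token]
        best = None
        for extra in range(max_extra + 1):
            for suffix in product(vocab, repeat=extra):
                quality = -edit_distance(start + list(suffix), target)
                if best is None or quality > best:
                    best = quality
        scores[token] = best
    return scores


scores = best_scores(prefix=["c"], target=list("cat"),
                     vocab=["a", "c", "t"], max_extra=2)
# scores == {"a": 0, "c": -1, "t": -1}
```

Here only "a" can still lead to a perfect match, so a training update in the spirit of the abstract would raise the likelihood of "a" at this position.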
-
Publication No.: US20200026765A1
Publication date: 2020-01-23
Application No.: US16338174
Filing date: 2017-10-03
Applicant: Google LLC
Inventor: Navdeep Jaitly, Yu Zhang, Quoc V. Le, William Chan
IPC: G06F17/28, G10L15/16, G10L15/197, G06N3/08
Abstract: A computer-implemented method for training a neural network that is configured to generate a score distribution over a set of multiple output positions. The neural network is configured to process a network input to generate a respective score distribution for each of a plurality of output positions including a respective score for each token in a predetermined set of tokens that includes n-grams of multiple different sizes. Example methods described herein provide trained neural networks which produce results with improved accuracy compared to the state of the art, e.g. translations that are more accurate compared to the state of the art, or more accurate speech recognition compared to the state of the art.
-
Publication No.: US10510004B2
Publication date: 2019-12-17
Application No.: US16380101
Filing date: 2019-04-10
Applicant: Google LLC
Inventor: Navdeep Jaitly, Yu Zhang, William Chan
Abstract: A speech recognition neural network system includes an encoder neural network and a decoder neural network. The encoder neural network generates an encoded sequence from an input acoustic sequence that represents an utterance. The input acoustic sequence includes a respective acoustic feature representation at each of a plurality of input time steps, the encoded sequence includes a respective encoded representation at each of a plurality of time reduced time steps, and the number of time reduced time steps is less than the number of input time steps. The encoder neural network includes a time reduction subnetwork, a convolutional LSTM subnetwork, and a network in network subnetwork. The decoder neural network receives the encoded sequence and processes the encoded sequence to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings.
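Illustrative sketch of time reduction as described in this abstract. The abstract only says that the encoded sequence has fewer time steps than the input; concatenating adjacent frames and dropping any leftover frame is an assumption, and `time_reduce` and `factor` are hypothetical names.

```python
def time_reduce(frames, factor=2):
    """Merge every `factor` consecutive feature frames into one longer frame."""
    usable = len(frames) - len(frames) % factor  # assumption: drop the leftover tail
    reduced = []
    for start in range(0, usable, factor):
        merged = []
        for frame in frames[start:start + factor]:
            merged.extend(frame)
        reduced.append(merged)
    return reduced


frames = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]  # 5 time steps, 2 features each
reduced = time_reduce(frames)
# reduced == [[1, 2, 3, 4], [5, 6, 7, 8]]
```

Stacking several such layers would shorten the sequence geometrically, which lets the decoder attend over far fewer encoder states.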
-
Publication No.: US12242818B2
Publication date: 2025-03-04
Application No.: US17797872
Filing date: 2021-02-08
Applicant: Google LLC
Inventor: William Chan, Chitwan Saharia, Geoffrey E. Hinton, Mohammad Norouzi, Navdeep Jaitly
IPC: G06F40/47, G06F40/284
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for sequence modeling. One of the methods includes receiving an input sequence having a plurality of input positions; determining a plurality of blocks of consecutive input positions; processing the input sequence using a neural network to generate a latent alignment, comprising, at each of a plurality of input time steps: receiving a partial latent alignment from a previous input time step; selecting an input position in each block, wherein the token at the selected input position of the partial latent alignment in each block is a mask token; and processing the partial latent alignment and the input sequence using the neural network to generate a new latent alignment, wherein the new latent alignment comprises, at the selected input position in each block, an output token or a blank token; and generating, using the latent alignment, an output sequence.
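Illustrative sketch of the block-wise decoding loop in this abstract. Picking the highest-scoring masked position in each block and collapsing the final alignment by merging repeats and removing blanks are assumptions not stated in the abstract; `decode_in_blocks`, `collapse`, and `toy_predict` are hypothetical names, and the toy predictor simply echoes its input.

```python
MASK, BLANK = "<mask>", "_"


def decode_in_blocks(input_sequence, predict, block_size):
    """Fill a fully masked alignment, one position per block per step."""
    n = len(input_sequence)
    alignment = [MASK] * n
    blocks = [range(s, min(s + block_size, n)) for s in range(0, n, block_size)]
    steps = 0
    while MASK in alignment:
        # predict returns one (token, score) pair per position.
        proposals = predict(alignment, input_sequence)
        chosen = []
        for block in blocks:
            masked = [i for i in block if alignment[i] == MASK]
            if masked:
                # Assumption: commit the most confident masked position.
                chosen.append(max(masked, key=lambda i: proposals[i][1]))
        for i in chosen:
            alignment[i] = proposals[i][0]  # an output token or the blank token
        steps += 1
    return alignment, steps


def collapse(alignment):
    """Assumed collapse rule: merge repeated tokens, then drop blanks."""
    output, prev = [], None
    for token in alignment:
        if token != prev and token != BLANK:
            output.append(token)
        prev = token
    return output


def toy_predict(alignment, input_sequence):
    """Toy stand-in for the neural network: echoes the input with equal scores."""
    return [(token, 1.0) for token in input_sequence]


alignment, steps = decode_in_blocks(list("hh_e_ll_lo"), toy_predict, block_size=4)
output = collapse(alignment)
# steps == 4, output == ["h", "e", "l", "l", "o"]
```

The number of steps equals the block size rather than the sequence length, which is the point of decoding one position per block in parallel.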
-
Publication No.: US20230385990A1
Publication date: 2023-11-30
Application No.: US18227120
Filing date: 2023-07-27
Applicant: Google LLC
Inventor: Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David Fleet, Mohammad Norouzi
CPC classification number: G06T5/002, G06T5/50, G06T3/4007, G06N3/08, G06N3/045, G06T2207/20081, G06T2207/20016, G06T2207/20084
Abstract: A method includes receiving, by a computing device, training data comprising a plurality of pairs of images, wherein each pair comprises an image and at least one corresponding target version of the image. The method also includes training a neural network based on the training data to predict an enhanced version of an input image, wherein the training of the neural network comprises applying a forward Gaussian diffusion process that adds Gaussian noise to the at least one corresponding target version of each of the plurality of pairs of images to enable iterative denoising of the input image, wherein the iterative denoising is based on a reverse Markov chain associated with the forward Gaussian diffusion process. The method additionally includes outputting the trained neural network.
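Illustrative sketch of the forward Gaussian diffusion step used during training, as described in this abstract. The variance-preserving mixing formula and the squared-error loss on the noise are common choices assumed here, not details given in the abstract; `diffuse`, `denoising_loss`, and `gamma` are hypothetical names, and the target image is a flat list of pixel values.

```python
import math
import random


def diffuse(target, gamma, rng):
    """Mix a clean target with Gaussian noise; gamma near 1 means little noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in target]
    noisy = [math.sqrt(gamma) * y + math.sqrt(1.0 - gamma) * e
             for y, e in zip(target, noise)]
    return noisy, noise


def denoising_loss(predicted_noise, noise):
    """Mean squared error between predicted and true noise (assumed objective)."""
    return sum((p - e) ** 2 for p, e in zip(predicted_noise, noise)) / len(noise)


rng = random.Random(0)
target = [0.2, 0.8, 0.5]           # target version of an image, flattened
gamma = 0.9                        # hypothetical noise level for this example
noisy, noise = diffuse(target, gamma, rng)

# Knowing the exact noise lets the clean target be recovered, which is what a
# trained network approximates when it denoises iteratively.
recovered = [(n - math.sqrt(1.0 - gamma) * e) / math.sqrt(gamma)
             for n, e in zip(noisy, noise)]
```

In training, a network would receive the input image, `noisy`, and the noise level, and its noise prediction would be scored with `denoising_loss`.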
-
Publication No.: US11769228B2
Publication date: 2023-09-26
Application No.: US17391150
Filing date: 2021-08-02
Applicant: Google LLC
Inventor: Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David Fleet, Mohammad Norouzi
CPC classification number: G06T5/002, G06N3/045, G06N3/08, G06T3/4007, G06T5/50, G06T2207/20016, G06T2207/20081, G06T2207/20084
Abstract: A method includes receiving, by a computing device, training data comprising a plurality of pairs of images, wherein each pair comprises an image and at least one corresponding target version of the image. The method also includes training a neural network based on the training data to predict an enhanced version of an input image, wherein the training of the neural network comprises applying a forward Gaussian diffusion process that adds Gaussian noise to the at least one corresponding target version of each of the plurality of pairs of images to enable iterative denoising of the input image, wherein the iterative denoising is based on a reverse Markov chain associated with the forward Gaussian diffusion process. The method additionally includes outputting the trained neural network.
-
Publication No.: US11756166B2
Publication date: 2023-09-12
Application No.: US18155420
Filing date: 2023-01-17
Applicant: Google LLC
Inventor: Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David Fleet, Mohammad Norouzi
CPC classification number: G06T5/002, G06N3/045, G06N3/08, G06T3/4007, G06T5/50, G06T2207/20016, G06T2207/20081, G06T2207/20084
Abstract: A method includes receiving, by a computing device, training data comprising a plurality of pairs of images, wherein each pair comprises an image and at least one corresponding target version of the image. The method also includes training a neural network based on the training data to predict an enhanced version of an input image, wherein the training of the neural network comprises applying a forward Gaussian diffusion process that adds Gaussian noise to the at least one corresponding target version of each of the plurality of pairs of images to enable iterative denoising of the input image, wherein the iterative denoising is based on a reverse Markov chain associated with the forward Gaussian diffusion process. The method additionally includes outputting the trained neural network.
-
Publication No.: US20230067841A1
Publication date: 2023-03-02
Application No.: US17391150
Filing date: 2021-08-02
Applicant: Google LLC
Inventor: Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David Fleet, Mohammad Norouzi
Abstract: A method includes receiving, by a computing device, training data comprising a plurality of pairs of images, wherein each pair comprises an image and at least one corresponding target version of the image. The method also includes training a neural network based on the training data to predict an enhanced version of an input image, wherein the training of the neural network comprises applying a forward Gaussian diffusion process that adds Gaussian noise to the at least one corresponding target version of each of the plurality of pairs of images to enable iterative denoising of the input image, wherein the iterative denoising is based on a reverse Markov chain associated with the forward Gaussian diffusion process. The method additionally includes outputting the trained neural network.
-
Publication No.: US11182566B2
Publication date: 2021-11-23
Application No.: US16338174
Filing date: 2017-10-03
Applicant: Google LLC
Inventor: Navdeep Jaitly, Yu Zhang, Quoc V. Le, William Chan
IPC: G06F40/47, G06N3/08, G10L15/16, G10L15/197
Abstract: A computer-implemented method for training a neural network that is configured to generate a score distribution over a set of multiple output positions. The neural network is configured to process a network input to generate a respective score distribution for each of a plurality of output positions including a respective score for each token in a predetermined set of tokens that includes n-grams of multiple different sizes. Example methods described herein provide trained neural networks which produce results with improved accuracy compared to the state of the art, e.g. translations that are more accurate compared to the state of the art, or more accurate speech recognition compared to the state of the art.