Universal transformers
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a sequence-to-sequence model that is recurrent in depth while employing self-attention to combine information from different parts of sequences.
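
The abstract names two ideas: self-attention across positions, and recurrence over depth rather than over sequence positions. A minimal sketch of how they can combine follows, assuming single-head scaled dot-product attention and a tanh transition. The abstract does not specify normalization, position or step embeddings, multiple heads, or a decoder, so none are included. All function and parameter names here (`self_attention`, `depth_recurrent_encode`, `w_q`, `w_k`, `w_v`, `w_t`, `num_steps`) are hypothetical and chosen for illustration only.

```python
import math
import random


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    # Multiply a square matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def self_attention(states, w_q, w_k, w_v):
    # Each position attends over every position of the same sequence.
    d = len(states[0])
    queries = [matvec(w_q, h) for h in states]
    keys = [matvec(w_k, h) for h in states]
    values = [matvec(w_v, h) for h in states]
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d)]
        )
    return outputs


def depth_recurrent_encode(states, params, num_steps):
    # Recurrence in depth: the SAME parameters are reused at every step,
    # so "depth" is a loop count rather than a stack of distinct layers.
    w_q, w_k, w_v, w_t = params
    for _ in range(num_steps):
        attended = self_attention(states, w_q, w_k, w_v)
        # Residual connection around the attention output (an assumption).
        mixed = [[h + a for h, a in zip(hs, at)] for hs, at in zip(states, attended)]
        # Shared position-wise transition, also with a residual (an assumption).
        states = [
            [m + math.tanh(t) for m, t in zip(ms, matvec(w_t, ms))] for ms in mixed
        ]
    return states


def random_matrix(d, rng):
    # Small random square matrix standing in for learned weights.
    return [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(d)]


rng = random.Random(0)
d = 4
params = tuple(random_matrix(d, rng) for _ in range(4))
inputs = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(3)]
encoded = depth_recurrent_encode(inputs, params, num_steps=2)
```

Because the parameters are shared across steps, running two steps gives the same result as running one step twice. A stack of independently parameterized layers would not have this property.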