Uncovering mesa-optimization algorithms in Transformers

Arxiv Papers

The paper proposes that the strong performance of Transformers stems from an architectural bias towards mesa-optimization: a learned optimization algorithm that runs within the forward pass. The authors reverse-engineer Transformers trained on sequence prediction and find gradient-based optimizers operating on in-context data, showing that this learned optimization algorithm can be repurposed to solve few-shot tasks. They also propose the mesa-layer, a new self-attention layer that explicitly solves an internal optimization problem and improves performance.
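
As a rough illustration of the core construction, here is a minimal NumPy sketch (a toy illustration, not code from the paper): one step of gradient descent on an in-context linear-regression loss, started from zero weights, produces exactly the same prediction as an unnormalized linear self-attention readout whose keys are the context inputs, whose values are the scaled context targets, and whose query is the test input.

import numpy as np

rng = np.random.default_rng(0)

# Toy in-context regression task: the context holds (x_i, y_i) pairs
# generated by a hidden linear map W_true; the model must predict y for x_q.
d_in, d_out, n_ctx = 4, 2, 32
W_true = rng.normal(size=(d_out, d_in))
X = rng.normal(size=(n_ctx, d_in))   # context inputs  -> attention keys
Y = X @ W_true.T                     # context targets -> attention values
x_q = rng.normal(size=d_in)          # test input      -> attention query
lr = 1.0 / n_ctx

# (a) One gradient-descent step on L(W) = 0.5 * sum_i ||W x_i - y_i||^2,
#     starting from W = 0; the gradient there is -sum_i y_i x_i^T.
W_gd = lr * Y.T @ X
pred_gd = W_gd @ x_q

# (b) The identical prediction written as unnormalized linear self-attention:
#     output = sum_i value_i * <key_i, query>.
pred_attn = (lr * Y).T @ (X @ x_q)

assert np.allclose(pred_gd, pred_attn)
print(pred_gd, pred_attn)

The two expressions agree by associativity: (lr * Y.T @ X) @ x_q equals lr * Y.T @ (X @ x_q). This is the sense in which a linear self-attention layer can implement one step of gradient descent (chapter at 03:35).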

00:00 Section: 1 Introduction
03:35 Section: Linear self-attention can implement one step of gradient descent.
07:42 Section: Multi-layer mesa-optimizers (see the sketch after this list).
13:38 Section: 5.1 Prediction of linear dynamics by in-context learning
17:31 Section: Multiple self-attention layers.
22:22 Section: 5.2 Simple autoregressive models become few-shot learners
25:32 Section: A toy model for in-context learning.
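
The multi-layer chapters (07:42 and 17:31) extend the one-step construction: stacking such layers corresponds to taking several gradient steps. Below is a minimal NumPy sketch of that correspondence, under the same toy linear-regression assumptions as above (an illustration, not the authors' code): each layer attends with the context inputs as keys and the current prediction residuals as values, and the stack reproduces multi-step gradient descent exactly.

import numpy as np

rng = np.random.default_rng(1)
d_in, d_out, n_ctx, n_layers = 4, 2, 32, 5
W_true = rng.normal(size=(d_out, d_in))
X = rng.normal(size=(n_ctx, d_in))   # context inputs
Y = X @ W_true.T                     # context targets
x_q = rng.normal(size=d_in)          # test input
lr = 0.5 / n_ctx

# Reference: n_layers explicit gradient-descent steps on
# L(W) = 0.5 * sum_i ||W x_i - y_i||^2, starting from W = 0.
W = np.zeros((d_out, d_in))
for _ in range(n_layers):
    W += lr * (Y - X @ W.T).T @ X
pred_gd = W @ x_q

# The same computation, layer by layer, as linear attention: every layer
# attends with keys = x_i and values = the current residuals y_i - y_hat_i,
# then adds lr times the attention output to each token's running prediction.
preds_ctx = np.zeros((n_ctx, d_out))  # running predictions for context tokens
pred_q = np.zeros(d_out)              # running prediction for the test token
for _ in range(n_layers):
    resid = Y - preds_ctx             # this layer's values
    pred_q = pred_q + lr * (X @ x_q) @ resid
    preds_ctx = preds_ctx + lr * (X @ X.T) @ resid

assert np.allclose(pred_gd, pred_q)
print(pred_gd)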

https://arxiv.org/abs/2309.05858

YouTube: https://www.youtube.com/@ArxivPapers

PODCASTS:
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers