Super Data Science: ML & AI Podcast with Jon Krohn

759: Full Encoder-Decoder Transformers Fully Explained, with Kirill Eremenko

Update: 2024-02-20

Description

Encoders, cross-attention, and masking for LLMs: SuperDataScience founder Kirill Eremenko returns to the SuperDataScience podcast, where he speaks with Jon Krohn about transformer architectures and why they are a new frontier for generative AI. If you’re interested in applying LLMs to your business portfolio, you’ll want to pay close attention to this episode!

This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/), by Oracle NetSuite business software (netsuite.com/superdata), and by Intel and HPE Ezmeral Software Solutions (http://hpe.com/ezmeral/chatbots). Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information.

In this episode you will learn:
• How decoder-only transformers work [15:51]
• How cross-attention works in transformers [41:05]
• How encoders and decoders work together (an example) [52:46]
• How encoder-only architectures excel at understanding natural language [1:20:34]
• The importance of masking during self-attention [1:27:08]

Additional materials: www.superdatascience.com/759
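
For listeners who want a concrete reference point for the masking topic listed above, the sketch below illustrates causal masking in scaled dot-product self-attention. It is not code from the episode; the function name, shapes, and NumPy implementation are assumptions chosen for clarity.

```python
# Illustrative sketch only (not from the episode): causal masking in
# scaled dot-product self-attention, implemented with NumPy.
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q, w_k, w_v: (d_model, d_head)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_head = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_head)            # (seq_len, seq_len)
    # Causal mask: each position may attend only to itself and earlier
    # positions, so future tokens stay hidden during training.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Row-wise softmax over the masked scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                            # (seq_len, d_head)

# Toy usage: 4 tokens, model width 8, head width 4.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out = causal_self_attention(x,
                            rng.standard_normal((8, 4)),
                            rng.standard_normal((8, 4)),
                            rng.standard_normal((8, 4)))
print(out.shape)  # (4, 4)
```

In an encoder, the same computation would simply omit the mask, and in cross-attention the queries would come from the decoder while the keys and values come from the encoder output.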