Build an LLM from Scratch 2: Working with text data

01:28:01

Uploader: Sebastian Raschka

Published at: 3/2/2025

Views: 10.5K

Description:

This supplementary video, part of the "Build an LLM from Scratch" series, walks through the text data preparation steps for training large language models, including tokenization, byte pair encoding, data loaders, and more. The video covers tokenizing text (00:00), converting tokens into token IDs (14:02), adding special context tokens (23:56), byte pair encoding (30:26), data sampling with a sliding window (44:00), creating token embeddings (1:07:10), and encoding word positions (1:15:45).
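
As a rough illustration of three of the steps listed above (tokenizing text, converting tokens into token IDs, and data sampling with a sliding window), here is a minimal sketch in plain Python. It is not the code from the video: the function names (`tokenize`, `build_vocab`, `encode`, `sliding_windows`), the splitting regex, and the `<|unk|>` fallback token are assumptions chosen for illustration, and a simple regex split stands in for byte pair encoding.

```python
import re

def tokenize(text):
    # Split on whitespace and common punctuation, keeping punctuation as tokens.
    parts = re.split(r'([,.:;?_!"()\']|--|\s)', text)
    return [p.strip() for p in parts if p.strip()]

def build_vocab(tokens):
    # Map each unique token to an integer ID, with two special tokens appended.
    # "<|unk|>" is an illustrative fallback for words missing from the vocabulary.
    symbols = sorted(set(tokens)) + ["<|endoftext|>", "<|unk|>"]
    return {tok: i for i, tok in enumerate(symbols)}

def encode(text, vocab):
    # Convert text to token IDs, falling back to the unknown-token ID.
    return [vocab.get(tok, vocab["<|unk|>"]) for tok in tokenize(text)]

def sliding_windows(ids, max_length, stride):
    # Build (input, target) pairs; each target is the input shifted right by one token.
    pairs = []
    for start in range(0, len(ids) - max_length, stride):
        inputs = ids[start:start + max_length]
        targets = ids[start + 1:start + max_length + 1]
        pairs.append((inputs, targets))
    return pairs

text = "Hello, world. Is this a test?"
vocab = build_vocab(tokenize(text))
ids = encode(text, vocab)
pairs = sliding_windows(ids, max_length=4, stride=1)

for inputs, targets in pairs:
    print(inputs, "->", targets)
```

Each target window is the input window shifted by one position, which is what lets a model learn next-token prediction from raw text. A larger `stride` reduces the overlap between consecutive windows.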

Similar videos: Build an LLM from Scratch

Package Your n8n Workflows Into Full Web Apps (Step-By-Step)

Building LLMs from the Ground Up: A 3-hour Coding Workshop

ESP32 - CMake with ESP-IDF Tutorial

Part 7: Prediction Sense | "Alien: Isolation" Smart AI in UE5

Compositing in After Effects - Advanced Explosions Tutorial!

Part 2: The OpenCage Mod Toolkit | "Alien: Isolation" Smart AI in UE5