Building an LLM from Scratch in C++

This is part 1 of a series (number of parts TBD) on creating an LLM from scratch! Part 2 can be found here. In this post I’ll describe how I’m approaching the project. It is still in progress, so for now this post is mostly high-level; as I go, I’ll update it with more detail and links to relevant articles.

Topics:

- Math & Linear Algebra Foundations
- Lists, vectors, and matrices
- Randomness
- Neurons, Chain Rule, and Autodiff
- Optimization & Training Mechanics
- Probabilistic Language Modeling
- Embeddings and a (Tiny) Transformer

January 10, 2024 · Amitav Krishna