To understand the specifics of today’s AI, including the “attention mechanism” and the “transformer” architecture that underlie LLMs and much of the rest of the field, I recommend three posts from the Gradient Ascendant blog: