AI Literacy

How modern AI actually works — distilled from primary sources

The foundational ideas behind large language models, agents, and alignment, sourced from the original Transformer paper, Andrej Karpathy, and Anthropic. Just enough of a mental model to reason about AI capabilities, limits, and the trade-offs that product decisions touch.

4 documents · sourced from Vaswani et al. · Andrej Karpathy · Anthropic


What’s inside

Attention Is All You Need — The Transformer Architecture

Vaswani et al., NeurIPS 2017 (the original Transformer paper)

Before 2017, the dominant sequence models (RNNs and LSTMs) processed tokens one at a time, each step depending on the previous one. That made them slow to train, hard to parallelize, and weak at long-range dependencies. The Transformer's core insight: replace recurrence with attention. Self-attention lets every token in a sequence look directly at every other token in parallel. Each token's query vector is compared with every key vector by dot product, the scores are scaled and passed through a softmax, and the resulting weights are used to average the value vectors. Stacked layers of self-attention plus feed-forward networks build increasingly abstract representations. Multi-head attention runs the mechanism several times in parallel with different learned projections, so different heads can capture different kinds of relationships. Because attention by itself ignores token order, positional encodings are added to the inputs to supply the ordering that recurrence provided implicitly. Why it matters: this architecture is the foundation under GPT, Claude, Gemini, Llama, and nearly every other modern LLM. The shift to computation that parallelizes well on GPUs and similar accelerators is a large part of what made training on internet-scale corpora feasible. Most capability advances since 2017 have come from scaling and refining this architecture rather than replacing it.
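The query, key, and value weighting described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's implementation: it uses a single head, made-up 2-dimensional token vectors, and identity projections in place of the learned ones, all of which are assumptions made to keep the example short.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Relevance of every key to this query, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs

# Toy input: three tokens with made-up 2-dimensional vectors (assumption).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# In a real Transformer, Q, K and V are learned linear projections of x.
# Here the projections are the identity, so Q = K = V = x.
result = attention(x, x, x)
```

Each row of `result` is a blend of all three token vectors, with more weight on the tokens whose keys best match that row's query. Every query is handled independently of the others, which is why the computation parallelizes so well.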

How LLMs Actually Work — Andrej Karpathy's Mental Model

Andrej Karpathy, 'Intro to Large Language Models' (1hr talk, 2023) + 'State of GPT' (Microsoft Build, 2023)

Karpathy's distillation for non-researchers: a large language model is a giant lossy compression of the internet, trained to predict the next token given the previous tokens. The 'magic' decomposes into three stages: (1) Pretraining: feed trillions of tokens of text through the Transformer and minimize cross-entropy loss on next-token prediction; the result is a base model that has internalized patterns from web text, books, and code. (2) Supervised fine-tuning: continue training the base model on a much smaller set of high-quality (prompt, ideal response) pairs written by human labelers, from thousands up to around a hundred thousand; the result is an instruction-following assistant. (3) RLHF / RLAIF: train a reward model on human preferences (or AI-generated preferences) and use reinforcement learning to optimize the assistant toward higher-rated responses. The implications: LLMs hallucinate because they were trained to produce plausible continuations, not verified facts. They reason imperfectly because their reasoning was absorbed from patterns in text rather than taught from first principles. They get better with scale because, other things equal, more parameters trained on more data give lower next-token loss, which in practice tracks broader capability. Karpathy's shorthand is that the model is like a 'lossy zip file of the internet'.
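To make "cross-entropy loss on next-token prediction" concrete, here is a minimal sketch that swaps the Transformer for a far simpler stand-in: a bigram model that only counts which token follows which. The corpus, the add-one smoothing, and all function names are assumptions chosen for illustration. The objective is the same one used in pretraining, only the model is different.

```python
import math
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, prev, vocab):
    """Predicted distribution over the next token, with add-one smoothing."""
    total = sum(counts[prev].values()) + len(vocab)
    return {tok: (counts[prev][tok] + 1) / total for tok in vocab}

def cross_entropy(counts, tokens, vocab):
    """Average negative log-probability given to each actual next token."""
    losses = [
        -math.log(next_token_probs(counts, prev, vocab)[nxt])
        for prev, nxt in zip(tokens, tokens[1:])
    ]
    return sum(losses) / len(losses)

# Made-up toy corpus (assumption); real pretraining uses trillions of tokens.
corpus = "the cat sat on the mat the cat ate".split()
vocab = sorted(set(corpus))

model = train_bigram(corpus)
trained_loss = cross_entropy(model, corpus, vocab)

# Baseline: the loss of guessing uniformly at random over the vocabulary.
uniform_loss = math.log(len(vocab))
```

Here `trained_loss` comes out below `uniform_loss`, because the counts capture some of the structure in the text. Pretraining an LLM pushes the same number down with a vastly more expressive model and far more data.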

Constitutional AI — How Claude Is Trained To Be Helpful + Harmless

Anthropic, 'Constitutional AI: Harmlessness from AI Feedback' (Bai et al., 2022)

Constitutional AI is Anthropic's training method for producing an AI that is both helpful and harmless without requiring human labels for every kind of harmful output. The technique: (1) Write a 'constitution', a list of principles in natural language (for example: be helpful, honest, and harmless; refuse to give illegal advice; respect autonomy). (2) Supervised phase: have the AI critique its own responses against the constitution, revise them, and fine-tune on the revised responses. (3) Reinforcement learning phase: have the AI compare pairs of responses against the constitution, train a preference model on those AI-generated comparisons, and use it as the reward signal. This is reinforcement learning from AI feedback (RLAIF) rather than from human feedback (RLHF). For harmlessness, this replaces the bottleneck of human reviewers with the AI grading itself against written principles; in the paper, human feedback is still used for helpfulness. Claude is trained with this approach: it is meant to refuse genuinely harmful requests while remaining useful for legitimate work, explaining its objections instead of giving evasive refusals. Why this matters for product design: when integrating Claude into a workflow, the constitutional training means the assistant has relatively stable values, will push back on requests that conflict with them, and is steerable via system prompts without retraining. The trade-off versus pure helpfulness: occasional over-refusal in edge cases. The paper's stated goal is an assistant that is harmless but non-evasive, which is the gap this line of work tries to close while preserving the safety floor.
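The data flow behind the two phases in the Constitutional AI paper (self-critique and revision, then AI-generated preference labels) can be sketched as follows. Everything here is a simplified assumption for illustration: the prompt wording, the single example principle, and the `stub_model` function are hypothetical stand-ins, not Anthropic's actual prompts, constitution, or training code. The paper also samples a principle at random at each step, whereas this sketch simply loops over the list.

```python
# Hypothetical principle; not the text of any real constitution.
PRINCIPLES = [
    "Choose the response that is most helpful, honest, and harmless.",
]

def critique_and_revise(model, prompt, principles):
    """Supervised phase: draft, critique against each principle, revise."""
    response = model(f"Human: {prompt}\nAssistant:")
    for principle in principles:
        critique = model(
            f"Critique this response using the principle '{principle}':\n"
            f"{response}"
        )
        response = model(
            "Rewrite the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {response}"
        )
    # The (prompt, revised response) pair becomes a fine-tuning example.
    return prompt, response

def ai_preference(model, prompt, a, b, principle):
    """RL phase: the model picks which response better fits a principle."""
    verdict = model(
        f"Principle: {principle}\nPrompt: {prompt}\n"
        f"(A) {a}\n(B) {b}\nWhich is better? Answer A or B:"
    )
    picked_a = verdict.strip().upper().startswith("A")
    chosen, rejected = (a, b) if picked_a else (b, a)
    # Records like this train the preference model used as the RL reward.
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

def stub_model(text):
    """Hypothetical stand-in for a real language model call."""
    if text.startswith("Critique"):
        return "The response could explain its reasoning more clearly."
    if text.startswith("Rewrite"):
        return "Revised draft."
    if text.startswith("Principle"):
        return "B"
    return "First draft."

sft_example = critique_and_revise(stub_model, "How do I pick a lock?", PRINCIPLES)
pair = ai_preference(stub_model, "How do I pick a lock?", "x", "y", PRINCIPLES[0])
```

The point of the sketch is the shape of the outputs: the first phase yields ordinary fine-tuning pairs, and the second yields chosen/rejected records in the same form as human preference data, just produced by a model.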

Agents + Tool Use — The MCP Standard

Anthropic, Model Context Protocol spec (modelcontextprotocol.io)

MCP (Model Context Protocol) is an emerging open standard for connecting AI applications to external tools and data sources. Before MCP, every product integrating AI built its own ad-hoc function-calling layer, creating an N×M integration problem (N agents × M tools). MCP defines: (1) Servers: programs that expose tools, resources, and prompts over a standard protocol built on JSON-RPC 2.0. (2) Hosts and clients: the AI application (Claude Code, Cursor, a custom agent) is the host, and it runs one client per server connection to discover and call what that server offers. (3) Transports: stdio for local servers and HTTP for remote ones (originally HTTP with Server-Sent Events, replaced by Streamable HTTP in later spec revisions). Once an agent speaks MCP, it can in principle call any MCP-compliant tool. The implication for memory: a persistent knowledge graph exposed via MCP can serve as a cross-agent memory layer. The same MIND that you write to with Claude Code can be read by ChatGPT, Gemini, or any custom agent that supports MCP, so your knowledge graph stops being siloed inside one chat app and becomes infrastructure. This is the transition from 'AI as features inside products' to 'AI as a layer every product talks through'.
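To show what "a standard JSON-RPC protocol" looks like on the wire, here is a minimal sketch of two MCP message exchanges, `tools/list` and `tools/call`, handled by a toy in-process dispatcher. The `search_notes` tool, its schema, and its handler are hypothetical. The sketch also skips the `initialize` handshake and any real transport, so it illustrates the message shapes only and is not a working MCP server.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so replies can be matched

def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message

# Hypothetical tool registry; a real server would expose its own tools.
TOOLS = {
    "search_notes": {
        "description": "Search saved notes for a phrase (illustrative only).",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
        "handler": lambda args: f"No notes match '{args['query']}'.",
    }
}

def handle(message):
    """Toy server-side dispatch for two MCP methods."""
    method = message["method"]
    params = message.get("params", {})
    if method == "tools/list":
        result = {
            "tools": [
                {
                    "name": name,
                    "description": tool["description"],
                    "inputSchema": tool["inputSchema"],
                }
                for name, tool in TOOLS.items()
            ]
        }
    elif method == "tools/call":
        tool = TOOLS[params["name"]]
        text = tool["handler"](params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        # -32601 is the standard JSON-RPC "Method not found" error code.
        error = {"code": -32601, "message": "Method not found"}
        return {"jsonrpc": "2.0", "id": message["id"], "error": error}
    return {"jsonrpc": "2.0", "id": message["id"], "result": result}

def round_trip(request):
    """Serialize and parse the request, as a transport would, then handle it."""
    return handle(json.loads(json.dumps(request)))

listing = round_trip(jsonrpc_request("tools/list"))
call = round_trip(
    jsonrpc_request(
        "tools/call",
        {"name": "search_notes", "arguments": {"query": "attention"}},
    )
)
```

Because every server answers `tools/list` and `tools/call` in the same shape, a client written once can discover and invoke tools it has never seen before, which is how the N×M problem reduces to N+M.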

Your AI shouldn’t start from zero.

Install this pack and your MIND begins smart — then every answer is grounded in your own knowledge graph.
