Attention Dilution
Attention dilution (also called context dilution) is one of the fundamental limitations of transformer-based LLMs when dealing with long contexts or extended agent memory.
Bin Zhang's Field Notes
Practical essays, implementation notes, and mental models for LLMs, infrastructure, data platforms, and the craft of building durable software.
Fresh from the notebook
How many of these terms do you actually recognize?
From input to output, a prompt generally goes through seven steps: request packaging, tokenization, inference scheduling, prefill, and...
ChatGPT Stats ChatGPT Growth ChatGPT Revenue
Over the next 12 to 24 months, the differentiator among engineers will shift from mastery of programming languages like Rust, Go, or Python,...
Hyperparameters are external settings chosen before training, such as the learning rate or regularization strength.
As large language models (LLMs) scale up, researchers have begun to notice a growing imbalance between model size and the availability of...