Bin Zhang's Field Notes

Modern notes for data, AI, and engineering systems.

Practical essays, implementation notes, and mental models for LLMs, infrastructure, data platforms, and the craft of building durable software.

Recent writing

Jul 21, 2026

Standing on Open Source: How a Codex Edge Case Became Agentusage

AI ENGINEERING, SOFTWARE ENGINEERING

Agentusage would not exist without open source.

Jul 12, 2026

Statistical Tests for Data Analysis: A Practical Onboarding Guide

DATA SCIENCE

Learn how to choose and use statistical tests without turning analysis into a p-value checklist—from experimental design and assumptions to...

Jul 10, 2026

Use Local Models in VS Code Copilot with LM Studio and Unsloth Studio

AI ENGINEERING, SOFTWARE ENGINEERING

Before you begin: install VS Code with Copilot Chat, then download a model in LM Studio or Unsloth Studio.

Jul 5, 2026

KV-Centric LLM Serving: vLLM, SGLang, and Disaggregated Attention

AI ENGINEERING, LARGE LANGUAGE MODELS

The more I look at LLM serving, the more it feels like the main object is not the request, the model, or even the GPU.

Jul 3, 2026

Reproducing vLLM and LMCache KV Cache Reuse on a CPU-Only MacBook

AI ENGINEERING, SOFTWARE ENGINEERING

I became interested in LMCache because it sits in the part of LLM serving that feels both very practical and very under-discussed: KV cache...

Jun 23, 2026

Use Databricks Models with VS Code Copilot and Copilot CLI

AI ENGINEERING, DATA ENGINEERING

I wanted one Databricks-hosted model to work in two developer surfaces:

Featured article

From Prompt to Response: A Step-by-Step Walkthrough of LLM Inference
