Mar 15, 2026
Attention Dilution
Attention dilution (also called context dilution) is one of the fundamental limitations of transformer-based LLMs when dealing with long contexts or extended agent memory.
Blog
Notes on LLMs, machine learning, data engineering, and systems work.
Mar 15, 2026
How many of these terms do you actually recognize?
From input to output, a prompt generally goes through seven steps, among them request packaging, tokenization, inference scheduling, prefill, and decode, before the result is returned.
Jan 4, 2026
ChatGPT Stats ChatGPT Growth ChatGPT Revenue
Nov 27, 2025
Over the next 12 to 24 months, the differentiator among engineers will shift from mastery of programming languages like Rust, Go, or Python, or the volume of code produced, to the...
Oct 30, 2025
Hyperparameters are external settings chosen before training, such as the learning rate or regularization strength.
Oct 29, 2025
As large language models (LLMs) scale up, researchers have begun to notice a growing imbalance between model size and the availability of high-quality training tokens. The...
Oct 20, 2025
In large-language-model (LLM) inference serving contexts, once the model compute becomes sufficiently fast, the performance bottleneck often shifts to the key-value (KV) cache...
Oct 19, 2025
Reflection is related to agent self-improvement or reasoning feedback loops.
Oct 2, 2025
[x] Independently deployable services - Each agent can scale horizontally (e.g., analysisservice replicas) - You can version and deploy agents independently
Sep 29, 2025
Its advantages over traditional sequential chains are evident in two areas:
Aug 10, 2025
1. Objective 2. Environment Setup
Jul 16, 2025
MCP Server Hub Currently, our different projects are using various MCP servers. To streamline and unify the process, we plan to implement a HUB MCP server that can handle multiple...
Jul 11, 2025
Tools in Large Language Models (LLMs) Tools enable large language models (LLMs) to interact with external systems, APIs, or data sources, extending their capabilities beyond text...
Jul 1, 2025
LangChain Invoke Retry Logic: LLM calls are not always stable and may fail due to network issues or other reasons, so retry logic is necessary.
Jun 23, 2025
| Feature | stdio | sse (Server-Sent Events) | streamable-http | |--------------------------|------------------------------------------|--------------------------------------------...
May 4, 2025
Out: None [Step 1: Duration 146.87 seconds| Input tokens: 2,113 | Output tokens: 923] ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ─ Executing...
Apr 25, 2025
Step-by-Step Guide: Building an MCP Server using Python-SDK, AlphaVantage & Claude AI Model Context Protocol (MCP) lab
Apr 22, 2025
Retrieval-Augmented Generation (RAG) is a powerful approach that combines retrieval and generation to produce high-quality responses. However, the quality of the final response can...
Apr 21, 2025
You start by creating a Modelfile, which acts as a key to unlock any GGUF model you want to use.
Mar 29, 2025
Learning never exhausts the mind ― Leonardo da Vinci
Feb 16, 2025
Skyvern ScrapegraphAI Crawl4AI Reader Firecrawl Markdowner
Feb 9, 2025
|Feature|LangGraph|AutoGen|
|---|---|---|
|Core Concept|Graph-based workflow for LLM chaining|Multi-agent system with customizable agents|
|Architecture|Node-based computation...
Feb 8, 2025
AutoGen is a framework for creating multi-agent AI applications that can act autonomously or work alongside humans.
Feb 2, 2025
If you find this in your VSCode, congratulations! You have successfully set up Ollama for code generation and assistance in Visual Studio Code.
Dec 15, 2024
%%{init: { 'look':'handDrawn' } }%%
Nov 15, 2024
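```python
from pyspark.sql import SparkSession

# "local[*]" (use all local cores) is assumed here; the extracted source shows "local[]".
spark = (
    SparkSession.builder.master("local[*]").appName("test").getOrCreate()
)
# Event is defined elsewhere in the post (not shown in this excerpt).
d = [
    Event(1, "abc"),
    Event(2, "ddd"),
]
```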
Nov 1, 2024
My previous Spark project was Scala-based, and I used IDEA to compile and test it conveniently. 😄 The Databricks Jobs UI saves you time when creating a JAR job.
Oct 23, 2024
💡 It extends your function's behavior at runtime.
Oct 16, 2024
This video is helpful for understanding it.
Oct 13, 2024
Reflex (pynecone) Reflex is a library to build full-stack web apps in pure Python.
Oct 5, 2024
I have enrolled in a private Snowflake Data Science Training. Let me list what I learned from it.
Sep 8, 2024
We can use the built-in runpy module to execute different modules in our project.
Sep 8, 2024
```python linenums="1" title="myclient.py"
Aug 12, 2024
Problem: How to introduce ML-based products and features to cross-functional teams.
Jul 18, 2021
bin/spark-submit \ --master k8s://https://192.168.99.100:8443 \ --deploy-mode cluster \ --name spark-pi \ --class org.apache.spark.examples.SparkPi \ --conf spark.driver.cores=1 \ --conf...
Nov 18, 2020
Recently I have been working in Azure to implement ETL jobs. The main tool is ADF (Azure Data Factory). This post shows some solutions to issues I ran into in my work.
Mar 1, 2020
scala ref create dataframe
Feb 21, 2020
```txt --master MASTER_URL --> run mode, e.g. spark://host:port, mesos://host:port, yarn, or local.
Feb 21, 2020
PROCESS_LOCAL: data is in the same JVM as the running code. This is the best locality possible. NODE_LOCAL: data is on the same node. Examples might be in HDFS on the same node, or in...
Feb 11, 2020
import airflow
from airflow.models import DAG
from airflow.operators.python_operator import PythonOperator
Feb 11, 2020
Whitening Transformation
Feb 8, 2020
I recently read the blog post "Structured Streaming in PySpark", which is implemented on the Databricks platform. I then tried to implement it on my local Spark. Some tricky issues came up during my...
Feb 4, 2020
Batch normalization is one of the important parts of a neural network.
Feb 2, 2020
Vanilla gradient descent, aka batch gradient descent, computes the gradient of the cost function w.r.t. the parameters θ
Oct 15, 2012
Repos Repo List language link