Tacnode
Xiaowei Jiang

CEO & Chief Architect at Tacnode

Xiaowei Jiang is CEO and Chief Architect at Tacnode, where he designed the Context Lake architecture from first principles. He previously built distributed query engines at Meta and Microsoft, working at petabyte scale across some of the largest data systems in production. His formal analysis of decision coherence — the Composition Impossibility Theorem — is published on arXiv (2601.17019) and provides the theoretical foundation for the Context Lake as a system category. He writes about database architecture, AI agent infrastructure, and the structural limitations of composed data stacks.

Database Architecture · Distributed Systems · AI Infrastructure · Decision Coherence
LinkedIn

Posts by Xiaowei (20)

ACID for Agents: Why Database Consistency Is the Bottleneck for Production AI
AI Infrastructure

Oracle just validated what production agent teams already know: the agent data layer is broken. Here's why ACID compliance across retrieval patterns is the fix.

Xiaowei Jiang | Mar 27, 2026
OLTP vs OLAP: The False Choice for the Agentic Era
Data Engineering

Every architecture guide frames OLTP vs OLAP as a choice: optimize for transactions or optimize for analytics. But automated decision systems — fraud checks, credit approvals, agent actions — need both transactional consistency and analytical power at the same moment. The Composition Impossibility Theorem proves you can't stitch separate OLTP and OLAP systems together to get there. Here's what comes after the tradeoff.

Xiaowei Jiang | Mar 17, 2026
Apache Kafka vs Apache Flink: The Real Comparison Is Flink vs Kafka Streams
Data Engineering

Most people comparing Kafka and Flink are actually asking: which stream processing layer do I need? The real architectural choice is Apache Flink vs the Kafka Streams API — and understanding the difference changes how you build.

Xiaowei Jiang | Mar 2, 2026
What Retrieval Really Means for AI Agents
AI & Machine Learning

AI retrieval is not one operation. Production decisions require exact and semantic retrieval patterns used together: point lookups, range scans, filters, joins, aggregations, and similarity search.

Xiaowei Jiang | Feb 18, 2026
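The retrieval patterns that teaser enumerates can be sketched in a few lines. This is a toy, in-memory illustration of exact retrieval (point lookup, range filter, aggregation) alongside semantic retrieval (embedding similarity); the table, fields, and functions are illustrative, not Tacnode's API.

```python
# Toy "table" of orders with a 2-d embedding per row; illustrative data.
import math

ORDERS = [
    {"id": 1, "user": "ana", "amount": 120.0, "embedding": [0.9, 0.1]},
    {"id": 2, "user": "ben", "amount": 40.0,  "embedding": [0.2, 0.8]},
    {"id": 3, "user": "ana", "amount": 300.0, "embedding": [0.7, 0.3]},
]

def point_lookup(order_id):
    # Exact retrieval: fetch one row by key.
    return next(o for o in ORDERS if o["id"] == order_id)

def range_filter(min_amount):
    # Exact retrieval: range scan with a predicate.
    return [o for o in ORDERS if o["amount"] >= min_amount]

def aggregate_by_user():
    # Exact retrieval: group-by aggregation.
    totals = {}
    for o in ORDERS:
        totals[o["user"]] = totals.get(o["user"], 0.0) + o["amount"]
    return totals

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def similarity_search(query, k=1):
    # Semantic retrieval: nearest rows by embedding similarity.
    return sorted(ORDERS, key=lambda o: -cosine(o["embedding"], query))[:k]
```

A production decision typically composes several of these against the same snapshot of state, which is the point the post develops.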
What Is Derived Context?
Architecture

Why data freshness matters for AI decisions: derived context is state computed from events that must be current at decision time. When feature freshness degrades, decisions fail — not from bad models, but from stale context.

Xiaowei Jiang | Feb 13, 2026
Context Silos: When the System Knows But the Decision-Maker Doesn't
Architecture

Why AI agent memory fails even when data exists: context silos prevent agents from accessing knowledge computed elsewhere. The fraud pattern was detected—but the checkout agent couldn't see it. Stale context isn't always old. Sometimes it's just unreachable.

Xiaowei Jiang | Feb 6, 2026
What Is Context Engineering? The Discipline Behind Effective AI Agents
AI & Machine Learning

Context engineering is the discipline of designing how AI agents receive, manage, and act on information. It goes far beyond prompt engineering — covering context windows, tool calls, memory architecture, and the retrieval systems that determine whether an agent makes good decisions or bad ones.

Xiaowei Jiang | Feb 3, 2026
ClickHouse JOINs Are Slow: Here's Why (And What To Do About It)
Data Engineering

If your ClickHouse JOINs are killing query performance, you're not alone. Here's why columnar databases struggle with JOINs, what join algorithms are available, how to read the query plan, and when it's time to consider alternatives.

Xiaowei Jiang | Feb 5, 2026
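The join algorithm that post examines is, in ClickHouse's default configuration, a hash join: build a hash table on one side, probe it with the other. The need to hold the entire build side in memory is one reason large JOINs hurt in columnar engines. A minimal sketch in plain Python, with illustrative table names:

```python
# Minimal hash join: build a hash table on the right-hand table,
# then probe it while streaming the left-hand table.

def hash_join(left, right, key):
    # Build phase: index the right table by join key.
    # This whole index must fit in memory.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    # Probe phase: stream the left table and emit matching pairs.
    out = []
    for row in left:
        for match in index.get(row[key], []):
            out.append({**row, **match})
    return out

# Illustrative data: events joined to users on uid (inner join).
users  = [{"uid": 1, "name": "ana"}, {"uid": 2, "name": "ben"}]
events = [{"uid": 1, "type": "click"}, {"uid": 1, "type": "view"},
          {"uid": 3, "type": "click"}]

joined = hash_join(events, users, "uid")
```

The unmatched `uid: 3` event is dropped, as in an inner join; ClickHouse also offers other algorithms (merge-based variants) that trade memory for disk I/O, which the post covers.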
AI Agent Memory Architecture: The Three Layers Production Systems Need
AI Infrastructure

AI agents need more than a vector database. Production systems require three distinct memory layers — episodic, semantic, and state. Here's what each layer does and why it matters.

Xiaowei Jiang | Feb 4, 2026
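One way to picture the three layers that post distinguishes is as three differently-shaped stores behind one interface. This is a hypothetical sketch, not an API from the post; class and method names are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    episodic: list = field(default_factory=list)   # what happened: append-only event log
    semantic: dict = field(default_factory=dict)   # what is known: durable facts
    state: dict = field(default_factory=dict)      # what is true now: mutable state

    def record(self, event):
        # Episodic layer: history is never overwritten, only appended.
        self.episodic.append(event)

    def learn(self, key, fact):
        # Semantic layer: knowledge keyed for later retrieval.
        self.semantic[key] = fact

    def update_state(self, key, value):
        # State layer: must be current at decision time.
        self.state[key] = value

mem = AgentMemory()
mem.record({"t": 1, "action": "checkout_started"})
mem.learn("user_tier", "gold")
mem.update_state("cart_total", 120.0)
```

The distinction matters because each layer has different consistency and freshness requirements, which is the post's argument for why a vector database alone is not enough.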
Semantic Operators: Run LLM Queries Directly in SQL
AI Infrastructure

Classify, summarize, and extract data using LLM reasoning inside your database. No external pipelines, no data movement — just SQL.

Xiaowei Jiang | Jan 28, 2026
Join Tacnode at Current 2025: Putting Context in Motion
Company News

Context Lake comes to the Big Easy.

Xiaowei Jiang | Oct 14, 2025
Context Lake: The Infrastructure Imperative for Real-Time AI
AI & Agentic Systems

The next evolution from Data Lake to Context Lake.

Xiaowei Jiang | Aug 16, 2025
Tacnode Context Lake is now available in the new AWS Marketplace AI Agents and Tools category
Company News

Helping usher in a new category of real-time AI solutions.

Xiaowei Jiang | Jul 16, 2025
The Decision-Time System Model
AI & Machine Learning

Kafka + ClickHouse solves streaming analytics—but not AI decision-making. Here’s why teams searching for Kafka alternatives or a streaming database still hit walls: split state, temporal misalignment, and consistency gaps that break automated decisions.

Xiaowei Jiang | Feb 14, 2026
Agent Drift and AI Drift: Why Production AI Models Quietly Get Worse
AI & Machine Learning

AI drift is the umbrella term for the gradual degradation of a machine learning model's performance in production as data, relationships, or context diverge from training. Classical ML recognizes three types — data drift (covariate shift), concept drift, and label drift — detectable with statistical tests like the Kolmogorov-Smirnov test, Population Stability Index, and KL divergence. Agent systems introduce a fourth type the classical toolkit misses — agent drift, where the model is unchanged but the derived context the agent reads at decision time has gone stale. This guide covers all four types, how to detect model drift, and how to prevent agent drift with the right context infrastructure.

Xiaowei Jiang | Apr 22, 2026
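Of the statistical tests the teaser names, the Population Stability Index is the simplest to show end to end: bin a feature's training baseline, bin its production values the same way, and sum the weighted log-ratio of the bin fractions. A pure-Python sketch; the thresholds are the common rule of thumb (PSI below 0.1 is stable, above 0.25 signals significant drift), and the data is illustrative.

```python
import math

def psi(expected, actual, bins=4, eps=1e-6):
    # Bin edges come from the expected (training) distribution.
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]

    def fractions(values):
        counts = [0] * bins
        for v in values:
            i = sum(v > e for e in edges)   # index of the bin v falls in
            counts[i] += 1
        # Floor at eps so the log is defined for empty bins.
        return [max(c / len(values), eps) for c in counts]

    p, q = fractions(expected), fractions(actual)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

baseline = [1, 2, 3, 4, 5, 6, 7, 8]   # training distribution
same     = [1, 2, 3, 4, 5, 6, 7, 8]   # identical: PSI is zero
shifted  = [5, 6, 7, 8, 8, 8, 8, 8]   # mass moved to the top bins
```

The point the post makes is that this whole family of tests watches the *inputs'* distribution, so none of them fire on agent drift, where the distribution is fine but the value read at decision time is stale.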
Context Under Concurrency: Why Your Cache Collapses Under Load
Real-Time Architecture

Context under concurrency is the production failure mode where cached derived state goes stale faster than the system can refresh it, and parallel decisions commit against divergent snapshots. This post covers why high-velocity state plus concurrent decisions break the caching pattern, how the preparation gap and the retrieval gap compound under load, and what a serving layer has to do differently to keep decisions coherent when every millisecond of staleness has a business consequence.

Xiaowei Jiang | Apr 21, 2026
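The divergent-snapshot failure that teaser describes can be reduced to a deterministic toy: two concurrent decisions read the same cached velocity counter, both pass the limit check, and both commit, so the true count overshoots the limit. No real cache is involved; the names and the limit are illustrative.

```python
LIMIT = 3
source_of_truth = {"txn_count": 2}   # live state in the database
cache = dict(source_of_truth)        # snapshot taken moments earlier

def decide(cached):
    # Each decision checks the *cached* counter, not the live one.
    return cached["txn_count"] < LIMIT

# Two decisions run against the same stale snapshot...
a = decide(cache)
b = decide(cache)

# ...and both commit, even though together they cross the limit.
if a:
    source_of_truth["txn_count"] += 1
if b:
    source_of_truth["txn_count"] += 1
```

Each decision was individually correct against its snapshot; the pair is incoherent, which is why the post argues refresh rate alone cannot fix the pattern.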
Real-Time Fraud Detection Architecture: Where Coherence Breaks
Fraud Detection

Fraud detection architectures converge on the same canonical stack — Kafka → Flink → feature store → model serving → rules engine — and fail at three predictable seams under concurrent load: velocity counter staleness, feature-store / rules-engine divergence, and a cross-channel retrieval gap. Sub-50ms p99 on each component doesn't fix any of these.

Xiaowei Jiang | Apr 23, 2026
Real-Time Credit Decisioning Architecture
Financial Services

Real-time credit decisioning is not batch underwriting with a faster SLA. Every transaction reads three derived signals — exposure, velocity, and risk — from separate pipelines that drift under concurrent load. The composite a decision reads is a chimera, correct only in the sense that each part was correct against its own snapshot.

Xiaowei Jiang | Apr 23, 2026
Stateful Stream Processing for Decisions: Where Flink Stops Being Enough
Stream Processing

Flink gives you stateful stream processing. It does not give you a decision-coherent serving layer. The gap is what teams discover when they put Redis or Postgres in front of Flink to serve decisions — and hit the same split-state problem Flink was supposed to have solved.

Xiaowei Jiang | Apr 24, 2026
Real-Time ML: Architecture, Feature Freshness, and Where ML Models Make Bad Decisions
AI & Machine Learning

Real-time ML — the architecture that runs ML models against live requests for instant decisions — is bottlenecked by feature freshness, not model latency. The model serves in 8 milliseconds; the features it scored are 40 seconds old. For real-time machine learning systems committing against fresh state, the freshness budget is the binding constraint, and most stacks never measure it.

Xiaowei Jiang | Apr 24, 2026
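The freshness-budget arithmetic in that teaser is worth making concrete: what binds the decision is not the 8 ms the model takes to answer but the age of the features it scores at the moment the decision commits. A toy check, with illustrative names and numbers:

```python
def feature_age_ms(feature_computed_ms, decision_ms):
    # Age of the feature value at the moment the decision commits.
    return decision_ms - feature_computed_ms

def within_budget(age_ms, budget_ms):
    return age_ms <= budget_ms

# The teaser's numbers: model answers in 8 ms, but the features
# it scored were computed 40 seconds before the decision.
decision_ms = 100_000
feature_ms  = decision_ms - 40_000          # computed 40 s earlier
model_latency_ms = 8                        # irrelevant to freshness

age = feature_age_ms(feature_ms, decision_ms)
ok  = within_budget(age, budget_ms=500)     # a 500 ms budget fails
```

Measuring `age` per decision, rather than per-component p99 latency, is the instrumentation the post argues most stacks are missing.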