
Your LLM Agent Is Obsolete

Traditional AI tool-calling is slow, costly, and surprisingly inaccurate. TanStack's new 'Code Mode' solves this by letting your LLM write and execute its own TypeScript, leading to a 10x boost in speed and precision.

Stork.AI

TL;DR / Key Takeaways

- Traditional tool calling forces multi-turn round trips: four LLM calls, 9.8KB of context, and 27 seconds to mis-calculate a simple average ($134.50 instead of the correct $137.75).
- TanStack's Code Mode has the LLM write TypeScript that runs in a secure isolate: two LLM calls, 1.7KB of context, 8 seconds, and the correct answer.
- Code Mode Skills let the agent save and recall its own generated code, cutting repeat requests to 3 seconds and 0.5KB of context.

The Hidden Tax on Your AI Agent

AI agents struggle with a critical bottleneck: tool calling. While large language models excel at natural language understanding and generation, their inherent design as conversational systems makes them surprisingly inefficient when tasked with executing external functions. This fundamental mismatch forces a chat-based paradigm onto programmatic execution, creating a cascade of hidden costs that severely limit agent potential.

Standard tool calling operates through a series of laborious round-trips. An LLM might suggest a tool, requiring the application to execute it, then feed the result back to the model as a new conversational turn. This iterative, turn-based interaction, akin to a human repeatedly asking for updates or instructions, introduces significant overhead into what should be a straightforward, programmatic computation. This approach is fundamentally misaligned with the speed and precision required for real-world automation.
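The turn-based loop described above can be sketched in a few lines. `callLLM` and `runTool` here are hypothetical stand-ins (the real client API varies by provider), but the shape of the loop is the point:

```typescript
// Minimal sketch of the traditional tool-calling loop. callLLM and
// runTool are hypothetical stand-ins -- the exact client API varies by
// provider -- but the loop structure is what matters.
type Message = { role: "user" | "assistant" | "tool"; content: string };
type LLMReply = { toolCall?: { name: string; args: unknown }; text?: string };

async function agentLoop(
  history: Message[],
  callLLM: (history: Message[]) => Promise<LLMReply>,
  runTool: (name: string, args: unknown) => Promise<string>,
): Promise<string> {
  // Every iteration is a full round trip, and the ENTIRE history is
  // resent each time -- this is where latency and context bloat come from.
  for (;;) {
    const reply = await callLLM(history);
    if (!reply.toolCall) return reply.text ?? "";
    const result = await runTool(reply.toolCall.name, reply.toolCall.args);
    history.push({ role: "tool", content: result });
  }
}
```

Notice that the history array only ever grows: each tool result becomes another message the model must re-read on the next turn.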

These inefficiencies levy a steep tax on AI agent performance and viability, manifesting in three critical areas:

- Latency: Executing even a simple query, like calculating the average cost of shoes, demands multiple back-and-forth interactions. Traditional methods required four round trips between the client and server, culminating in a 27-second response time in one benchmark. This conversational overhead severely hampers real-time applications and user experience, making agents feel sluggish and unresponsive.
- Context Bloat: Each subsequent request includes the entire message history and tool outputs, rapidly expanding the model's context window. The shoe cost example saw context usage balloon to 9.8K tokens to answer a basic question, inflating API costs and increasing processing time. This constant re-transmission of data is financially unsustainable, especially for complex or long-running agent tasks.
- Inaccuracy: LLMs are not reliable computational engines. Despite their linguistic prowess, they often fail at precise mathematical operations. The same shoe cost query, using standard tools, incorrectly returned $134.50, while the correct average was $137.75. Relying on an LLM for exact calculations introduces critical errors into agent workflows, undermining trust and utility.

Collectively, these substantial costs prevent AI agents from achieving their full potential in demanding real-world applications. The current approach to tool calling transforms a promising technology into a slow, expensive, and often unreliable system, effectively placing a ceiling on what agents can truly accomplish. This hidden tax must be addressed for agents to move beyond novelty into indispensable utility, unlocking the next generation of intelligent automation.

Why Your LLM Fails at Simple Math


A seemingly straightforward query—calculating the average cost of shoes—exposes a critical flaw in standard LLM tool-calling. This isn’t about complex algorithms; it’s about basic arithmetic, and your agent is likely failing at it.

To answer "What is the average cost of our shoes?", a typical LLM agent orchestrates a series of tool calls. It first invokes `getProductListPage` to retrieve all product IDs and the total page count. Then, for each page or ID, it makes subsequent calls to `getProductByID` to fetch individual product details. This iterative, conversational approach forces the LLM into a programmatic loop.

This pattern, known as N+1 querying, is notoriously inefficient. Each tool call necessitates a full round trip: the LLM requests a tool, the tool executes, and its results, along with all prior context, are sent back to the LLM for the next step. For our simple shoe price average, this resulted in four full round trips between the client and server, ballooning the context payload to an astonishing 9.8KB.

Even after this resource-intensive, multi-step process, the LLM’s final calculation was shockingly inaccurate. It reported the average shoe price as $134.50. The actual, precise average, derived from direct programmatic calculation, stood at $137.75. This isn't a trivial difference; it's a fundamental miscalculation on a basic task.
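The "direct programmatic calculation" is trivial, which is exactly the point. The four prices below are hypothetical, chosen so the mean matches the $137.75 figure quoted above:

```typescript
// Deterministic average -- plain arithmetic, no LLM in the loop.
// These four prices are hypothetical; they are chosen so the mean
// matches the $137.75 figure quoted in the article.
const prices = [120.0, 135.0, 150.0, 146.0];

function average(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

console.log(average(prices)); // 137.75, every single time
```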

LLMs excel at pattern recognition and language generation, not deterministic computation. They are predictive text engines, not calculators. Asking an LLM to perform precise arithmetic or complex data aggregation is akin to using a paintbrush for brain surgery—it’s the wrong tool for the job. Their probabilistic nature makes them inherently unreliable for tasks demanding exact numerical accuracy.

If your AI agent cannot reliably calculate the average of a handful of numbers, its capability to handle more intricate business logic is severely compromised. Imagine the consequences for financial reporting, inventory management, or critical data analysis where precision is paramount. This simple example underscores a profound limitation, highlighting why traditional LLM agents are not built for reliable, programmatic execution.

The Paradigm Shift: From Calling to Coding

The answer to these pervasive inefficiencies arrives with TanStack AI Code Mode. This innovative approach fundamentally redefines how AI agents interact with external functionalities. Instead of prompting an LLM to merely identify *which tool to call*, Code Mode instructs the model to *write the code to solve the problem* directly. This shifts the LLM from a decision-maker to a highly capable software engineer, leveraging its innate strength in code generation.

Code Mode operates by having the LLM generate executable TypeScript code. This code, complete with access to defined tools, then runs within an isolated environment like QuickJS, Node, or a Cloudflare worker. This bypasses the traditional, multi-turn chat paradigm where each tool call necessitates a new request-response cycle, repeatedly sending context back and forth between the agent and the server.
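To make this concrete, here is a sketch of the kind of script the model might emit. The tool implementations are stubbed locally so the example runs standalone; in real Code Mode these functions are injected into the isolate, and their exact signatures here are assumptions:

```typescript
// Stubbed tool implementations standing in for the functions Code Mode
// injects into the isolate (their shapes here are assumptions).
type Product = { id: number; price: number };
const db: Product[] = [
  { id: 1, price: 120 }, { id: 2, price: 135 },
  { id: 3, price: 150 }, { id: 4, price: 146 },
];

async function getProductListPage(_page: number) {
  return { ids: db.map((p) => p.id), totalPages: 1 };
}
async function getProductByID(id: number): Promise<Product> {
  return db.find((p) => p.id === id)!;
}

// The kind of script the LLM might generate: the whole N+1 fetch and
// the arithmetic happen in ONE execution -- no per-call round trip
// back to the model.
async function generatedScript(): Promise<number> {
  const { ids } = await getProductListPage(1);
  const products = await Promise.all(ids.map(getProductByID));
  const total = products.reduce((sum, p) => sum + p.price, 0);
  return total / products.length;
}
```

The model is invoked once to write this script and once to phrase the answer; every intermediate tool result stays inside the isolate instead of round-tripping through the context window.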

The performance gains are stark. Consider the "average shoe price" example: a standard tool-calling agent required four LLM calls, consumed 9.8 kilobytes of context, and took 27 seconds to return an incorrect average of $134.50. Code Mode, by contrast, completed the task in just two LLM calls, using a mere 1.7 kilobytes of context, and finished in a blazing 8 seconds. Crucially, its TypeScript-driven calculation yielded the correct average of $137.75.

This isn't just an optimization; it's a paradigm shift in AI agent design. By delegating complex logic, state management, and precise mathematical operations to generated code, Code Mode mitigates the LLM's inherent weaknesses—latency, context bloat, and numerical inaccuracies. It transforms agents from chat interfaces into robust, programmatic entities, setting a new standard for efficiency and reliability in AI-driven applications.

Inside the Isolate: Secure & Powerful Execution

Code Mode fundamentally re-architects how LLMs interact with external systems, shifting from a conversational tool-calling paradigm to direct code execution. This innovative approach leverages an isolate, a secure, sandboxed environment where the LLM's generated TypeScript code runs. Instead of sequential, chat-driven tool invocations, the AI now directly orchestrates complex operations within a controlled execution context.

Developers define their existing functions and utilities, injecting them directly into this isolate. Functions like `getProductListPage` or `getProductByID` become available to the AI's generated script as native functions within its execution environment, not external API calls. This mechanism provides the LLM with direct, powerful, yet controlled access to application logic, all while remaining confined within the sandbox's strict boundaries, preventing arbitrary or malicious code execution.
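The injection idea can be illustrated with Node's built-in `vm` module. To be clear, this is a stand-in, not TanStack's actual driver, and `vm` is not a hardened security boundary the way a QuickJS isolate is; it only demonstrates how host functions become visible inside a scoped execution context:

```typescript
import vm from "node:vm";

// Host-side function the generated code should see as a native call.
// (getProductByID is named in the article; its return shape here is an
// assumption for illustration.)
const getProductByID = (id: number) => ({ id, price: 137.75 });

// The sandbox sees ONLY what we inject -- no require, no fs, no network.
const sandbox = { getProductByID, result: 0 };
vm.createContext(sandbox);

// Pretend this string came from the LLM.
const generated = `result = getProductByID(42).price;`;
vm.runInContext(generated, sandbox, { timeout: 1000 });

console.log(sandbox.result); // 137.75
```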

This secure sandbox can be instantiated using various drivers, offering flexibility based on deployment needs and existing infrastructure. Current options include:

- QuickJS: A lightweight, high-performance JavaScript engine ideal for edge environments and resource-constrained scenarios, ensuring minimal overhead.
- Node.js: A familiar, powerful JavaScript runtime for server-side execution, offering broad compatibility and access to a vast ecosystem.
- Cloudflare Workers: For serverless, globally distributed execution, leveraging Cloudflare's edge network to minimize latency and maximize scalability.

TanStack AI abstracts this entire complex architecture behind a single, special tool. From the model's perspective there is just one entry point for executing TypeScript; the library manages the driver, the sandbox, and the injected functions behind it.

Benchmarks Don't Lie: The 10x Performance Leap


The 'average shoe price' calculation, a seemingly simple task, starkly exposes the inefficiencies of traditional LLM tool calling. This real-world case study, documented with hard data, demonstrates TanStack AI Code Mode's transformative impact on agent performance. It moves beyond theoretical advantages, showcasing concrete, measurable gains that redefine agent capabilities.

Direct comparisons reveal dramatic improvements across multiple vectors. Standard LLM tool-calling required four separate LLM calls to resolve the average price query, each incurring latency, processing time, and API costs. Code Mode slashes this interaction to just two calls, effectively halving the computational overhead and streamlining the entire workflow.

This efficiency extends significantly to context management, a critical factor in LLM expenses. A conventional agent operation balloons the context window to 9.8KB, repeatedly sending redundant information back and forth. Code Mode, by executing code within a secure isolate, shrinks this footprint to a mere 1.7KB, an astounding 82.6% reduction in data transfer.

The most compelling metric is total execution time. The standard approach crawled to a 27-second completion for the average price query, bogged down by round trips and repeated context processing. Code Mode delivers the accurate answer in a blistering 8 seconds, representing a 3.4x speed increase and a profound shift in user experience.
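The headline ratios follow directly from the raw numbers quoted above, a quick check worth doing in code rather than in one's head:

```typescript
// Reproducing the headline figures from the raw benchmark numbers
// quoted in the article.
const standard = { llmCalls: 4, contextKB: 9.8, seconds: 27 };
const codeMode = { llmCalls: 2, contextKB: 1.7, seconds: 8 };

const speedup = standard.seconds / codeMode.seconds;            // 3.375, i.e. ~3.4x
const contextCut = 1 - codeMode.contextKB / standard.contextKB; // ~0.826, i.e. ~82.6%
const callsSaved = standard.llmCalls - codeMode.llmCalls;       // 2 fewer LLM calls
```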

These aren't abstract figures; they translate directly into tangible advantages for developers and end-users alike. Developers realize significantly lower API bills due to fewer calls and drastically reduced token usage, making agents far more economically viable. Users benefit from lightning-fast responses, transforming previously sluggish interactions into fluid, instantaneous experiences.

Beyond speed and cost, accuracy is paramount, especially for numerical tasks. The traditional LLM, prone to numerical hallucinations, incorrectly calculated the average shoe price as $134.50. TanStack AI Code Mode, leveraging native TypeScript execution, consistently yields the precise average of $137.75 every single time. This deterministic outcome eliminates guesswork, reinforces trust in AI-driven applications, and ensures computational tasks are handled with the unwavering reliability expected from standard programming, not the probabilistic nature of LLMs.

Unleashing 'Skills': Your AI's New Superpower

Beyond the immediate performance gains of Code Mode, TanStack AI introduces a profound evolution: Code Mode Skills. This additional library layers on top of Code Mode, empowering the LLM to transcend rote execution and actively learn from its own successful operations. The agent no longer simply generates and executes TypeScript code; it intelligently identifies viable snippets it creates and deems valuable for future use.

When the LLM successfully solves a problem by generating a specific piece of TypeScript, it can now recognize that generated code as a potential reusable skill. Each skill is meticulously defined with an input schema, an output schema, the executable TypeScript code itself, and a descriptive label. This allows the AI to catalog its own solutions, storing them in a persistent manner—whether on disk, in a database, or other custom storage solutions—making them readily available for recall.

Consider the 'average shoe price' calculation, a task previously highlighted for its inefficiency with traditional tool calling. While Code Mode initially reduced this from 27 seconds and 9.8KB of context to 8 seconds and 1.7KB, the impact of Skills is even more dramatic. The second time the agent is asked for the average cost, it doesn't regenerate the code. Instead, it instantly retrieves and executes its newly created `getAverageProductPrice` skill. This results in an astonishing 3-second execution time, utilizing a mere 0.5KB of context over just two LLM calls, a massive leap in efficiency.
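A stored skill and its recall path might look roughly like the sketch below. The field names follow the description above (input schema, output schema, code, label), but the exact types and the in-memory store are assumptions; Code Mode lets you back storage with disk, a database, or a custom solution:

```typescript
// Hypothetical shape for a stored skill -- field names follow the
// article's description, but the exact types are assumptions.
type Skill = {
  label: string;
  inputSchema: string;   // e.g. a JSON-schema or TypeScript type string
  outputSchema: string;
  code: string;          // the TypeScript the LLM generated
};

// Trivial in-memory store; a real deployment would persist this.
const skillStore = new Map<string, Skill>();

function saveSkill(skill: Skill): void {
  skillStore.set(skill.label, skill);
}

function recallSkill(label: string): Skill | undefined {
  return skillStore.get(label);
}

// First run: the agent recognizes its generated code as reusable.
saveSkill({
  label: "getAverageProductPrice",
  inputSchema: "{}",
  outputSchema: "{ average: number }",
  code: "/* TypeScript generated on the first run */",
});

// Second run: no regeneration -- just recall and re-execute.
const hit = recallSkill("getAverageProductPrice");
```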

This capability fundamentally transforms the nature of AI agents. Rather than remaining stateless executors, agents equipped with Code Mode Skills become dynamic, self-optimizing entities. They continuously build an internal library of proven solutions, learning and refining from every successful task. The agent evolves, accumulating a sophisticated, custom-built toolset over time, making future interactions progressively faster, more efficient, and inherently more intelligent. This paradigm shift enables truly adaptive AI agents that improve with every use.

Beyond APIs: Talking Directly to Your Database

Code Mode's capabilities extend far beyond orchestrating simple API calls and pre-defined tools. The architecture pushes the boundaries of LLM agents, enabling deeply integrated system interactions that were previously unachievable. This evolution positions Code Mode as a foundational layer for truly autonomous AI.

A compelling demonstration showcased Code Mode connected with direct database access. Within its secure, isolated execution environment, the LLM gained the unprecedented ability to interact with a live database instance, bypassing traditional ORM layers or rigid API endpoints. This represents a significant leap, granting AI agents granular control over data without requiring pre-defined interactions.

Empowered by this direct access, the AI generated both the necessary TypeScript logic and intricate SQL queries to fulfill complex data requests. The LLM dynamically constructs precise database operations on the fly, including complex joins, aggregations, and schema exploration. All operations execute directly against the live data source within the isolate's secure boundaries.
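A minimal sketch of the pattern: the host injects a query function into the isolate, and the "generated" artifacts are a SQL string plus TypeScript glue. The `executeSql` name and the tiny in-memory table below are assumptions standing in for a real database connection:

```typescript
// Hypothetical host function injected into the isolate; here it runs
// against a tiny in-memory table instead of a live database.
type Row = Record<string, number | string>;
const shoes: Row[] = [
  { name: "runner", price: 120 },
  { name: "boot", price: 150 },
];

function executeSql(query: string): Row[] {
  // Stand-in: only understands the one aggregate query this demo needs.
  if (/AVG\(price\)/i.test(query)) {
    const avg = shoes.reduce((s, r) => s + Number(r.price), 0) / shoes.length;
    return [{ avg }];
  }
  return shoes;
}

// What the LLM might emit: SQL plus TypeScript glue, executed together
// inside the isolate.
const generatedSql = "SELECT AVG(price) AS avg FROM products;";
const [row] = executeSql(generatedSql);
console.log(row.avg); // 135
```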

This paradigm shift transforms how organizations approach data analysis and reporting. Users can now pose nuanced, natural language questions, with Code Mode autonomously translating them into executable code and precise SQL queries. It delivers a powerful, out-of-the-box solution for building sophisticated natural language data analysis tools, democratizing access to data insights without manual coding, and empowering a new generation of data-driven applications.

The Ultimate Flex: AI-Generated User Interfaces


Dynamic interfaces represent the ultimate demonstration of TanStack AI Code Mode’s transformative power. Moving beyond data retrieval and complex computations, Code Mode enables large language models to construct entire user interfaces from scratch, on demand. This capability shifts the paradigm from merely processing information to generating interactive, client-side applications.

Here’s how it works: developers expose UI component functions as tools to the LLM. Functions like `createChart(data, type, options)` or `renderTable(data, columns)` become primitives. The LLM then leverages its inherent strength in generating TypeScript code to orchestrate these functions, writing a complete front-end application within the secure isolate. This code precisely defines the layout, data binding, and interactivity of the resulting interface.
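A rough sketch of the idea: the function names below follow the examples in the text, but the `UINode` return shape and the tree structure are assumptions for illustration. The generated code composes primitives into a tree the host application then renders:

```typescript
// UI primitives exposed as tools. The function names follow the
// article's examples; the UINode shape is an assumption.
type UINode =
  | { kind: "chart"; data: number[]; chartType: string }
  | { kind: "table"; data: object[]; columns: string[] };

const createChart = (data: number[], chartType: string): UINode =>
  ({ kind: "chart", data, chartType });
const renderTable = (data: object[], columns: string[]): UINode =>
  ({ kind: "table", data, columns });

// The kind of layout code the LLM might generate inside the isolate:
// it composes the primitives into a structure the host renders.
const dashboard: UINode[] = [
  createChart([120, 135, 150, 146], "bar"),
  renderTable([{ name: "runner", price: 120 }], ["name", "price"]),
];
```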

Traditional approaches often rely on rigid JSON-based UI schemas, which demand predefined structures and limit expressive power. Such schemas are inherently restrictive, struggling with dynamic layouts, conditional rendering, or complex user interactions. Code Mode bypasses these limitations by allowing the LLM to write actual, executable code, offering infinitely more flexibility and control over the final user experience.

The distinction is profound. Instead of receiving a static JSON object that a separate front-end layer must interpret and render, the LLM directly outputs the code that *builds* the UI. This allows for intricate logic, custom styling, and dynamic data visualizations that would be cumbersome or impossible with declarative schemas. The LLM becomes a front-end engineer, assembling components into a cohesive whole.

Imagine asking an agent for a "report on Q3 sales by region, showing trends and top performers." Instead of a raw data dump or a pre-templated output, Code Mode can generate a fully interactive, custom-built dashboard. This dashboard might feature multiple charts, sortable tables, and user-adjustable filters, all tailored to the specific request and generated in real-time.

This capability unlocks a future where users command their AI agents to produce bespoke analytical tools and visual reports instantaneously. The LLM’s role evolves from a conversational partner to a full-stack developer, delivering not just answers but complete, functional applications. This marks a monumental leap in agent autonomy and utility.

Ultimately, dynamic UI generation solidifies Code Mode’s position as a foundational technology for truly intelligent and autonomous agents. It moves beyond the limitations of chat-based interactions to create tangible, interactive experiences, showcasing the expansive potential when an LLM can write and execute code.

Getting Your Hands Dirty with Code Mode

Developers eager to harness Code Mode's power can immediately dive into the TanStack AI monorepo on GitHub. Locate the `examples/TSCodeModeWeb` directory, which hosts the exact demo application used in our benchmarks. This robust example provides a practical, real-world blueprint for integrating Code Mode's revolutionary approach into your own projects, demonstrating both the client-side and API implementations.

Implementation begins by establishing an isolate driver, the secure sandbox where your AI-generated TypeScript executes. Code Mode currently supports environments like QuickJS, Node.js, or Cloudflare Workers, offering flexibility for various deployment strategies. This driver is paramount for providing a high-performance, isolated runtime that ensures both execution safety and optimal resource utilization.

Next, define your AI tool using the `createCodeMode` function, a core component of the TanStack AI library. This function requires your configured isolate driver and a collection of functions explicitly injected into the isolate's scope. These injected functions become the direct, callable "tools" for your LLM, completely bypassing the inefficient, chat-based tool invocation patterns.
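The wiring described above might look roughly like the following. This is a local stub approximating the described shape, not the real TanStack AI API; the `Driver` type, the options object, and the `new Function` "driver" are all assumptions, and the `examples/TSCodeModeWeb` demo is the place to see the actual signatures:

```typescript
// Local stub approximating the shape described in the text -- NOT the
// real TanStack AI API; consult the examples/TSCodeModeWeb demo for
// the actual signatures.
type Driver = { run: (code: string, scope: Record<string, unknown>) => unknown };

// Toy driver: a real one would execute inside QuickJS, Node, or a
// Cloudflare Worker rather than via new Function.
const stubDriver: Driver = {
  run: (code, scope) =>
    new Function(...Object.keys(scope), code)(...Object.values(scope)),
};

function createCodeModeStub(opts: {
  driver: Driver;
  tools: Record<string, unknown>;
}) {
  return {
    // A real system prompt would also include TypeScript typings for
    // each injected tool.
    systemPrompt: `You may call executeTypeScript(code). Tools: ${Object.keys(opts.tools).join(", ")}`,
    executeTypeScript: (code: string) => opts.driver.run(code, opts.tools),
  };
}

const codeMode = createCodeModeStub({
  driver: stubDriver,
  tools: { getProductByID: (id: number) => ({ id, price: 137.75 }) },
});

// Simulate the LLM invoking executeTypeScript with generated code.
const price = codeMode.executeTypeScript("return getProductByID(1).price;");
console.log(price); // 137.75
```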

Crucially, integrate the specialized system prompt generated by `createCodeMode` with your existing LLM prompts. This prompt furnishes the LLM with comprehensive TypeScript typings for all your injected tools and precise instructions on how to invoke the `executeTypeScript` command. It empowers the model to accurately generate executable code, leveraging its strength in writing TypeScript.

With these foundational steps, you've successfully enabled Code Mode within your application, unlocking a new paradigm for AI agents. Explore the provided examples, modify the injected functions, and experiment with diverse prompts to witness the dramatic performance, accuracy, and context efficiency improvements firsthand. Developers are highly encouraged to contribute to the TanStack AI project, helping to shape the future of intelligent, code-centric AI development.

The Future is Code-First AI

The era of treating large language models as mere function callers, shackled by chat-based interaction loops and inefficient context management, is rapidly drawing to a close. TanStack AI's Code Mode ushers in a new paradigm, recognizing the LLM's true strength: its unparalleled ability to generate robust, executable code. This fundamental shift — from calling to coding — unlocks unprecedented performance, accuracy, and versatility for AI agents.

Developers have witnessed the stark difference. Code Mode agents perform tasks like calculating the average shoe price with a 10x performance leap, slashing LLM calls from four to two, and reducing context size from 9.8KB to 1.7KB. Critically, it delivers accurate results, leveraging TypeScript execution in secure isolates rather than relying on the LLM's often-flawed internal arithmetic. This isn't just an optimization; it's a redefinition of agent capability.

This code-first approach extends far beyond simple API orchestration. With Code Mode Skills, agents autonomously learn and store reusable code snippets, drastically improving efficiency for recurring tasks. Moreover, Code Mode directly interfaces with databases, generating both TypeScript and SQL to perform complex queries and reporting without intermediate layers. The ultimate expression of this power lies in dynamic UI generation, allowing agents to construct entire user interfaces on the fly.

Imagine a future where AI agents aren't just intelligent assistants but active participants in system development and maintenance. This code-centric foundation paves the way for:

  • Self-healing infrastructure that diagnoses and patches vulnerabilities with generated code.
  • Autonomous software development, where agents write, test, and deploy features.
  • Hyper-personalized applications that dynamically adapt their interfaces and logic to individual user needs.

The call to action is clear: stop building brittle, chat-based LLM agents that struggle with basic math and incur exorbitant context costs. Embrace the power of the LLM as a code generator. Start building dynamic, efficient, and truly intelligent code-first AI systems with TanStack AI Code Mode, and redefine what's possible in agent development.

Frequently Asked Questions

What is TanStack AI Code Mode?

It's a new paradigm for AI agents where the LLM writes and executes TypeScript code to interact with tools, rather than making inefficient, direct function calls. This drastically improves performance and accuracy.

How does Code Mode improve on traditional tool calling?

By generating code, it minimizes LLM round trips, shrinks context window usage, increases execution speed, and leverages the precision of TypeScript for tasks like mathematics, avoiding common LLM inaccuracies.

What are Code Mode 'Skills'?

Skills are reusable TypeScript functions generated by the LLM that can be saved and recalled for similar future tasks. This makes subsequent requests incredibly fast and efficient, essentially allowing the AI to learn and optimize its own tools.

Can I use Code Mode with my existing tools?

Yes, Code Mode is designed to work alongside traditional tools. It's implemented as a single, special tool, allowing for a flexible, hybrid approach to building powerful AI agents.


Topics Covered

#tanstack #ai #typescript #llm #agent-development #performance