You Don’t Know What You Want Until You Want It


Why Runtime Discovery Is the Future of the Agentic Web

Invoke Labs | June 2025


Abstract

As the agentic web emerges, one thing becomes clear: the interface between models and the world must be dynamic, lightweight, and open. While current approaches like MCP (Anthropic, 2024) attempt to standardize tool use through rigid schemas and compile-time bindings, they miss the core truth about agentic behavior: you don’t know what you want until you want it. This paper introduces Invoke — a lightweight execution layer for LLMs — and proposes a framework where tool discovery happens at runtime, not at compile time.


The Problem with Compile-Time Tools

Today, most “agents” are little more than glorified prompting layers wrapped around an LLM with some statically defined tools. Whether through MCP or custom plugins, these agents suffer from three core limitations:

  • Premature Tool Binding: Developers must guess in advance what tools the model might need — and hard-wire them in.
  • Context Bloat: Every added tool increases the prompt size, bloating memory and slowing inference.
  • Lack of Serendipity: Agents can’t discover new tools mid-chain. There’s no native way to say, “I don’t know what tool I need yet. Let me look.”

This is like forcing a browser to preload every website it might visit before the user types a URL.


The Invoke Model

Invoke flips the paradigm. Instead of treating tools as hardcoded runtime functions, it treats them the way the web treats documents: as discoverable resources. Any API that exposes a simple agents.json descriptor can be used by any model, on demand. In this way, Invoke avoids persistent connections, orchestration runtimes, and central registries; instead, it introduces one declarative JSON file and a single HTTP tool. The model does the rest.

1. The agents.json Schema

Every API exposes a file — typically hosted at /agents.json — describing:

  • Available tools (name, description, method, URL)
  • Required parameters (with types)
  • Example calls (for grounding)
  • Auth requirements (oauth:Bearer, query:key, or none)

It’s the robots.txt of the agentic web — not for crawlers, but for capable agents. It tells the model not what it can’t do, but what it can.

Figure 1. An example of an agents.json API manifest for OpenWeatherMap
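
For illustration, a minimal manifest along these lines might look like the sketch below. The field names follow the list above and are assumptions rather than the exact Invoke schema; the endpoint is OpenWeatherMap’s real current-weather API, and the auth value adapts the query:key scheme to OpenWeatherMap’s appid query parameter.

```json
{
  "name": "OpenWeatherMap",
  "description": "Current weather conditions and forecasts",
  "auth": "query:appid",
  "tools": [
    {
      "name": "get_current_weather",
      "description": "Get the current weather for a city",
      "method": "GET",
      "url": "https://api.openweathermap.org/data/2.5/weather",
      "parameters": {
        "q": { "type": "string", "required": true, "description": "City name, e.g. 'London,UK'" },
        "units": { "type": "string", "required": false, "description": "'metric' or 'imperial'" }
      },
      "examples": [
        {
          "call": "https://api.openweathermap.org/data/2.5/weather?q=London,UK&units=metric",
          "note": "Returns temperature, humidity, and conditions for London"
        }
      ]
    }
  ]
}
```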

Recent advances in large language models have highlighted the power of example-driven prompting to elicit reasoning capabilities. Zhou et al. (2022) demonstrated that least-to-most prompting, which uses examples to decompose a complex problem into simpler intermediate reasoning steps, significantly improves accuracy across a range of complex tasks. This builds on the foundational insight of in-context learning introduced by Brown et al. (2020), who showed that models can rapidly generalize to new tasks simply by conditioning on a few demonstrations. Extending this further, Chen et al. (2021) found that language models trained on source code—such as Codex—exhibit strong capabilities in tool use, particularly when provided with structured examples that mirror realistic schemas and payloads.

By embedding real examples—complete with URLs, parameters and notes—we give models the scaffolding they need to understand the intent, syntax, and semantics of any API.

2. The Universal Invoke Tool

Instead of hard-wiring tools into the model’s context, Invoke exposes just one: a universal router. That tool:

  • Accepts a tool name + parameters
  • Fetches the descriptor from the relevant agents.json
  • Executes the request (handling formatting, auth, and response parsing)
  • Returns the result to the model

This is like giving the model a browser, not a fixed dashboard. Tool discovery happens at inference time, not compile time.
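
As a rough sketch, the universal router can be very small. The invoke() signature and the manifest field names below are assumptions based on the description above, not the actual Invoke implementation:

```python
import requests

def invoke(manifest_url: str, tool_name: str, params: dict, token: str | None = None):
    """Fetch an agents.json manifest, look up the named tool, and execute it."""
    manifest = requests.get(manifest_url, timeout=10).json()
    tool = next(t for t in manifest["tools"] if t["name"] == tool_name)

    headers = {}
    if token and manifest.get("auth", "none").startswith("oauth:Bearer"):
        headers["Authorization"] = f"Bearer {token}"

    # GET tools pass parameters in the query string; other methods send a JSON body.
    if tool["method"].upper() == "GET":
        resp = requests.get(tool["url"], params=params, headers=headers, timeout=10)
    else:
        resp = requests.request(tool["method"], tool["url"], json=params, headers=headers, timeout=10)

    resp.raise_for_status()
    return resp.json()  # handed back to the model as the tool result
```

Called with a freshly discovered manifest URL, for example invoke("https://example.com/agents.json", "get_current_weather", {"q": "London,UK"}), it returns parsed JSON that goes straight back into the model’s context.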

Figure 2. Runtime discovery and invocation flow in the agentic web

3. Lightweight Auth

Tool use demands auth — but it shouldn’t demand friction. Invoke handles this with a federated, pluggable model:

  • APIs declare the auth scheme they require (e.g., auth: "oauth:Bearer", "query:key", or "none")
  • Users authenticate once per provider, cached locally
  • Tokens are scoped, secure, and only surfaced on demand

Invoke is the first provider — but not the only one. An open agentic web means that, in the future, users might choose among Google, Microsoft, Meta, or others when authenticating agents. Just as the web didn’t require a central login to take off, the agentic web needs only lightweight, pluggable auth that works at runtime.
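
A minimal sketch of how a router might apply a declared scheme at request time; the credential cache and helper below are hypothetical, not part of the actual Invoke implementation:

```python
# Hypothetical per-provider credential cache, filled once when the user authenticates.
TOKEN_CACHE: dict[str, str] = {}

def apply_auth(scheme: str, provider: str, headers: dict, params: dict) -> None:
    """Attach credentials according to the auth scheme declared in agents.json."""
    if scheme == "none":
        return
    credential = TOKEN_CACHE[provider]  # only surfaced at request time
    if scheme.startswith("oauth:Bearer"):
        headers["Authorization"] = f"Bearer {credential}"
    elif scheme.startswith("query:"):
        # e.g. "query:appid" appends ?appid=<credential> to the request
        params[scheme.split(":", 1)[1]] = credential
```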


Why This Matters

🔍 Runtime Discovery

When a user says, “Book me a restaurant,” the agent shouldn’t rely on a pre-defined list of food services. It should follow links, fetch agents.json files, and find the best option in real time. This mirrors how the web works — and how agents should behave.
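
For illustration only, the discovery step could be as simple as the sketch below, assuming each candidate origin hosts /agents.json; ranking and selection are left to the model rather than to hard-coded heuristics.

```python
import requests

def discover_tools(origins: list[str]) -> list[dict]:
    """Fetch /agents.json from each candidate origin and collect the tools it advertises."""
    tools = []
    for origin in origins:
        try:
            manifest = requests.get(f"{origin}/agents.json", timeout=5).json()
        except (requests.RequestException, ValueError):
            continue  # no usable manifest here; keep browsing
        for tool in manifest.get("tools", []):
            tools.append({"origin": origin, **tool})
    return tools

# The model, not this code, decides which discovered tool best matches
# "Book me a restaurant", then calls it through the universal invoke tool.
```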

🧠 Model-Native Simplicity

LLMs are excellent at navigating ambiguity. The more we delegate tool selection and error handling to the model, the less infrastructure we need. Invoke doesn’t scaffold the agent. It empowers it.

🚫 No Vendor Lock-in

MCP wants to be the USB-C of tools — but it’s a protocol that must be opted into and compiled against. Invoke is just HTTP and JSON. Open. Forkable. Crawlable. Forever.


A Tale of Two Futures

Feature             | MCP                               | Invoke
--------------------|-----------------------------------|------------------------------
Tool Discovery      | Compile-time only                 | Runtime (via agents.json)
Integration Effort  | High (protocol, state, hosting)   | Low (single config file)
Tool Usage          | Requires orchestration/runtime    | Just a single HTTP tool
Openness            | Semi-open (via protocol lock-in)  | Fully open (no dependencies)
Model Autonomy      | Constrained by preloaded schema   | Free to explore and fail


Conclusion

The agentic web will not be built on protocols. It will be built on links. The future isn’t a central registry or a giant YAML file. It’s a world where agents follow their curiosity — and every API is just one step away. Invoke makes this possible. Because you don’t know what you want… until you want it.


📚 References

  • Anthropic, 2024 — Anthropic (2024). Tool Use with Claude and the Model Context Protocol (MCP). https://docs.anthropic.com/claude/docs/tool-use
  • Zhou et al., 2022 — Zhou, D., Schärli, N., Hou, L., Wei, J., Scales, N., Wang, X., Schuurmans, D., et al. (2022). Least-to-Most Prompting Enables Complex Reasoning in Large Language Models. arXiv:2205.10625. https://arxiv.org/abs/2205.10625
  • Brown et al., 2020 — Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al. (2020). Language Models are Few-Shot Learners. NeurIPS 2020. https://arxiv.org/abs/2005.14165
  • Chen et al., 2021 — Chen, M., Tworek, J., Jun, H., Yuan, Q., Pinto, H. P. de O., Kaplan, J., Edwards, H., Burda, Y., Joseph, N., Brockman, G., et al. (2021). Evaluating Large Language Models Trained on Code. arXiv:2107.03374. https://arxiv.org/abs/2107.03374