Rather than building monolithic agents that try to do everything, build small, focused agents that do one thing well. Agents are just one building block in a larger, mostly deterministic system.
The key insight here is about LLM limitations: the bigger and more complex a task is, the more steps it takes, and every extra step adds to the context window. As context grows, LLMs are more likely to get lost or lose focus. By keeping each agent focused on a specific domain, with 3-10 steps (maybe 20 at most), we keep context windows manageable and LLM performance high.
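One way to make the step limit concrete is to enforce it in the agent loop itself. The following is a minimal sketch, not a reference implementation: `AgentRun`, `run_focused_agent`, `MAX_STEPS`, and the `choose_next_step` callable (a stand-in for whatever LLM call picks the next action) are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable

# Assumption: a budget chosen from the 3-10 step range discussed above.
MAX_STEPS = 10


@dataclass
class AgentRun:
    task: str
    events: list[str] = field(default_factory=list)  # grows with every step taken
    status: str = "running"


def run_focused_agent(
    task: str,
    choose_next_step: Callable[[list[str]], str],  # hypothetical stand-in for an LLM call
    max_steps: int = MAX_STEPS,
) -> AgentRun:
    """Run a small agent loop that stops once its step budget is spent."""
    run = AgentRun(task=task)
    for _ in range(max_steps):
        step = choose_next_step(run.events)
        if step == "done":
            run.status = "done"
            return run
        # Each recorded step lengthens the context the next call has to read.
        run.events.append(step)
    # Budget exhausted: hand control back to the surrounding deterministic
    # system instead of letting the context keep growing.
    run.status = "escalated"
    return run
```

With a loop like this, a task that outgrows the budget surfaces as an explicit `"escalated"` status, which the surrounding code can route to a human or to another small agent.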
Benefits of small, focused agents:
- Manageable Context: Smaller context windows mean better LLM performance
- Clear Responsibilities: Each agent has a well-defined scope and purpose
- Better Reliability: Less chance of getting lost in complex workflows
- Easier Testing: Simpler to test and validate specific functionality
- Improved Debugging: Easier to identify and fix issues when they occur
Do we still need this if LLMs get smart enough to handle 100-step+ workflows?
tl;dr yes. As agents and LLMs improve, they may naturally expand to handle longer context windows, which means handling MORE of a larger DAG. This small, focused approach ensures you can get results TODAY, while preparing you to slowly expand agent scope as LLM context windows become more reliable. (If you've refactored large deterministic code bases before, you may be nodding your head right now.)
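As a rough sketch of what "handling more of a larger DAG" could look like in code: the workflow stays a fixed sequence of steps, and the agent owns only a subset of them, which grows one step at a time when quality holds. Everything here is an illustrative assumption, including the `WORKFLOW` step names, the `plan` and `try_expand` helpers, and the `passes_evals` callback standing in for whatever evaluation suite you run.

```python
from typing import Callable

# Hypothetical workflow, modeled as a linear DAG of named steps.
WORKFLOW = ["merge_pr", "deploy_staging", "run_e2e_tests", "deploy_prod"]


def plan(workflow: list[str], agent_scope: frozenset[str]) -> list[tuple[str, str]]:
    """Label each step as owned by the agent or by deterministic code."""
    return [(step, "agent" if step in agent_scope else "code") for step in workflow]


def try_expand(
    agent_scope: frozenset[str],
    candidate: str,
    passes_evals: Callable[[frozenset[str]], bool],  # hypothetical quality gate
) -> frozenset[str]:
    """Grow the agent's scope by one step, but only if quality is maintained."""
    proposed = agent_scope | {candidate}
    return proposed if passes_evals(proposed) else agent_scope
```

Starting with `frozenset({"deploy_staging"})` and calling `try_expand` for `"run_e2e_tests"` moves that step from deterministic code to the agent only when the gate passes; otherwise the scope is returned unchanged and the rest of the DAG keeps running as plain code.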
Being intentional about size/scope of agents, and only growing in ways that allow you to maintain quality, is key here. As the team that built NotebookLM put it:
I feel like consistently, the most magical moments out of AI building come about for me when I'm really, really, really just close to the edge of the model capability
Regardless of where that boundary is, if you can find that boundary and get it right consistently, you'll be building magical experiences. There are many moats to be built here, but as usual, they take some engineering rigor.