The Bandwidth Gap.
A long-form web paper on attention bottlenecks in AI-native work.
What this brief says
AI capability is scaling much faster than deliberate human throughput. The bottleneck in modern AI workflows is increasingly the operator's attention, not model output volume.
Zeno Center treats this as a design and measurement problem: reduce supervisory noise, improve decision quality, and preserve cognitive sovereignty while using more capable systems.
- The key risk is cognitive debt: teams feel faster while decision quality quietly erodes.
- Research priority: make multi-agent supervision measurable instead of anecdotal.
- Product priority: ship careful primitives that filter and prioritize before operators must decide.
Frontier compute doubles every 5.2 months. Agent task horizons double every seven months and may now be doubling every four. Human conscious cognition, the part that decides, prioritises, and judges, has held steady at roughly ten bits per second for as long as it has been measurable. The capability stack now scales orders of magnitude faster than the people supervising it.
This paper makes four claims. First, the bottleneck of AI-augmented work has moved from the model to the operator, and the empirical record now supports this in a way it did not eighteen months ago. Second, the most consequential gap is not at the sensory or motor periphery but at what neuroscientists have begun calling the inner brain: the slow, deliberate layer where attention, judgment, and commitment happen. Third, closing this gap is an engineering problem, not a self-help one. It calls for a new class of artefact, the cognitive primitive, whose job is to filter, prioritise, and decide alongside the operator rather than for them. Fourth, the differentiator that will matter is not maximal capability per agent but maximal operator leverage per unit of attention spent. Zeno Center calls the property this protects cognitive sovereignty. It is the operating thesis of the lab.
Zeno's first deliverables are inline cognitive primitives for AI product teams, agent developers, and operator-heavy organisations, instrumented for multi-agent supervisor cognitive load. This paper sets out the evidence, the lineage, and the agenda.
The thesis in one sentence.
AI compounds. Human attention does not.
We build the cognitive layer that closes the gap.
This is a thesis about scarcity. For most of computing's history the scarce resource was machine cycles, and most of the field's progress can be read as a story about making them cheaper. That regime is ending. Frontier language model training compute has been doubling every 5.2 months since 2020 and overall growth is running at roughly 4.5x per year (Epoch AI, 2026). The length of tasks an autonomous agent can complete with reasonable reliability has been doubling on a seven-month cadence since 2019, with the trend appearing to accelerate to roughly four months in the most recent twelve (Kwa et al., 2025; METR Trend Tracker, 2026). The supply of capability is no longer the constraint on AI-augmented work. The supply of human attention is.
This claim is older than it looks. Herbert Simon stated it in 1971 in a now-famous passage on the design of organisations: a wealth of information creates a poverty of attention (Simon, 1971). What is new is that the wealth of information has been transformed into a wealth of agentic output. The artefact pressing against the operator is no longer the news feed. It is a workforce of legible, plausible, often wrong machine collaborators. The economic question of the next decade is whether the operator can keep up. The operating thesis of Zeno Center is that the tools that close this gap will define the next platform of knowledge work, and the companies that build them will hold a position structurally analogous to the one personal computing held in the 1980s and the cloud held in the 2010s.
Two curves.
A picture is sufficient. On one axis, time. On the other, capability. The AI curve is a near-vertical exponential. The human curve is a flat line, unchanged for fifty thousand years. The gap between them is the operating environment of every knowledge worker alive today.
The two curves are not an analogy. They are the result of the only two empirical regimes we have direct measurements of: the scaling laws of frontier models and the throughput of conscious human cognition.
On the AI side, three numbers anchor the trend. Training compute for notable models grows 4.5x per year, and for frontier language models it doubles every 5.2 months (Epoch AI, 2026). Inference cost at equivalent capability falls by approximately one order of magnitude per year, which works out to a halving roughly every three and a half months (Epoch AI, 2026). Task time horizon, the length of the longest task an agent can complete with 50% reliability, doubled every seven months from 2019 to 2025, with recent data suggesting acceleration to roughly four months (Kwa et al., 2025).
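The cadences above convert into annual growth factors with one line of arithmetic. A quick sketch (the doubling times are from the citations above; the conversion and the resulting multipliers are our own arithmetic):

```python
def annual_growth(doubling_months: float) -> float:
    """Annual multiplier implied by a fixed doubling cadence."""
    return 2 ** (12 / doubling_months)

# Frontier LM training compute, doubling every 5.2 months
# (slightly above the 4.5x figure for notable models overall):
print(f"{annual_growth(5.2):.1f}x per year")  # → 5.0x per year

# Agent task horizons: the 7-month cadence vs the possible 4-month one.
print(f"{annual_growth(7.0):.1f}x per year")  # → 3.3x per year
print(f"{annual_growth(4.0):.1f}x per year")  # → 8.0x per year
```

If the four-month cadence holds, agent task horizons grow roughly eightfold per year while the operator's channel stays at ten bits per second.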
On the human side, the numbers are older and more stable. Working memory holds approximately four chunks at a time, revised down from George Miller's classic seven (Miller, 1956; Cowan, 2001). Sustained attention on a single screen lasts an average of forty-seven seconds before a switch (Mark et al., 2016; Mark, 2023). Recovery from an interruption takes twenty-three minutes and fifteen seconds (Mark, Gudith and Klocke, 2008). And the load-bearing figure for this paper, recently re-derived in Neuron by Zheng and Meister: human conscious behavioural throughput averages around ten bits per second (Zheng and Meister, 2025).
The compression ratio is approximately 100,000,000 to 1: sensory intake (~10⁹ bits/s) versus deliberate cognition (~10¹ bits/s).
The paradox is not that we are slow. It is that we have built a peripheral nervous system, our senses and our motor systems, that handles roughly a billion bits per second, and a central deciding layer that handles ten. Whatever the inner layer is doing, it is not bandwidth-matching.
A second arithmetic is worth doing in plain sight. Microsoft's June 2025 Work Trend Index special report finds that the average knowledge worker now experiences 275 interruptions per day, roughly one every two minutes during the standard nine-to-five (Microsoft WTI, 2025). Mark, Gudith and Klocke have established that recovery from a single interruption takes twenty-three minutes (Mark, Gudith and Klocke, 2008). If both numbers held independently and additively, recovery alone would consume approximately 105 hours of working time per day. They cannot both be true at full strength. The implication is therefore the only available one: the contemporary knowledge worker is operating in a state of chronic partial recovery, never reaching baseline cognitive engagement before the next interruption arrives. This is the operating environment AI is being deployed into.
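The naive arithmetic behind that impossibility is worth showing in full. A sketch (both inputs are the published figures; the combination, which is deliberately naive, is ours):

```python
interruptions_per_day = 275   # Microsoft WTI, June 2025
recovery_minutes = 23         # Mark, Gudith and Klocke, 2008 (rounded)

# If every interruption demanded its full recovery period:
naive_recovery_hours = interruptions_per_day * recovery_minutes / 60
print(f"{naive_recovery_hours:.0f} hours of recovery demanded per day")  # → 105

# A working day supplies ~8 hours, so demand exceeds supply roughly 13x.
# Both figures cannot hold at full strength: recovery is chronically
# truncated, never completed, before the next interruption arrives.
```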
The inner brain.
Zheng and Meister's contribution is not the figure itself. The slowness of conscious cognition has been observed at least since Miller (1956). Their contribution is a clean conceptual split. The brain has an outer layer, which handles perception, motor control, and reflex, and an inner layer, which handles deliberate thought. The outer brain is fast, parallel, and subconscious. The inner brain is slow, serial, and the seat of attention.
This split clarifies the AI augmentation question. Most existing AI tooling targets the outer brain. Voice transcription, autocomplete, image recognition, retrieval, and search are all outer-brain prosthetics. They feed perception or extend motor output. They are not where the cognitive load of AI-augmented work lives.
The load lives in the inner layer. It lives in the moment when an operator reads three drafts from three agents and has to decide which one matters. It lives in the moment when a developer reviews a diff that compiles, looks reasonable, and is wrong in a way only the human can catch. It lives in the moment when a product manager has eight Slack threads, four agents running, two meetings queued, and forty seconds before the next interruption.
The inner brain is the bottleneck, and almost nothing in the current AI tool stack is designed for it. The dominant paradigm of 2024 to 2026 has been to give the operator more outputs to inspect, more agents to supervise, more drafts to choose between, more interfaces to switch across. Each of these moves expands outer-brain bandwidth and consumes inner-brain bandwidth. The category most analogous to the inner brain in the current stack is the prompt, which is also the category most under-engineered.
If this analysis is right, the next decade of AI-augmented productivity gains will not come primarily from larger models. It will come from artefacts that are designed for the ten-bit channel.
AI made experts slower, and lied to them about it.
The first wave of empirical work on generative AI showed dramatic productivity gains. A randomised trial of customer support agents found a 14% increase in issues resolved per hour, with novice agents seeing a 34% lift (Brynjolfsson, Li and Raymond, 2023). A controlled trial of writing tasks found 40% faster completion and 18% higher quality scores (Noy and Zhang, 2023). A controlled experiment with 758 BCG consultants found a 12.2% lift in tasks completed, a 25.1% lift in speed, and a 40% lift in human-rated quality on tasks within the AI's capability frontier (Dell'Acqua et al., 2023). GitHub Copilot users completed a controlled programming task 55.8% faster than the control, with a 95% confidence interval of 21% to 89% (Peng et al., 2023).
These numbers are real. They are also incomplete. The same Dell'Acqua paper that reported 12% to 40% gains within the AI frontier reported a 19% drop in correct solutions on a complex managerial task selected to lie outside the frontier. That drop is the pivot. Each of the celebrated productivity studies was generated under conditions that systematically favour AI: short-horizon tasks, in-frontier problems, novice-skewed populations, lab settings with low task ambiguity. None of them was a fair test of the population that does most of the world's high-value knowledge work.
The fair test arrived in July 2025. METR ran a randomised controlled trial on sixteen experienced open-source developers working on their own repositories. The repositories averaged ten years of development and more than one million lines of code, and each developer had on average five years and 1,500 commits of personal authorship on the codebase they worked in. Each task was randomly assigned an AI-allowed or AI-disallowed condition. The numbers are best read in order:
Before the study, developers forecast that AI would make them 24% faster.
After the study, developers reported feeling 20% faster.
The measured result was a 19% slowdown.
— Becker, Rush et al., 2025
The gap between perceived productivity and measured productivity was thirty-nine percentage points. The developers were not just slower. They believed they were faster while being slower. This is a metacognitive failure, not a productivity failure, and it is the single most important finding in the AI productivity literature to date.
Three observations from the METR data deserve to land. First, the effect was robust across twenty-one alternative explanations the authors tested. It is not an artefact of the specific tools (Cursor Pro and Claude 3.5/3.7 Sonnet at the time), the task selection, or the developer population. Second, the one outlier in the dataset, a developer with more than fifty hours of Cursor experience, did show a positive speedup. The implication is that there is a high skill ceiling on getting useful work out of these tools, and most operators are nowhere near it. Third, the dominant time sink reported by participants was cleaning up AI-generated code, not generating it.
The METR result does not refute the earlier productivity studies. It triangulates with them. The honest synthesis is this: AI substantially accelerates novices on short, in-frontier tasks; it can slow experts on long, ambiguous, off-frontier tasks; and across the population it does so in ways that the operators themselves cannot accurately perceive.
The metacognitive failure is the deeper finding, and it has been replicated across populations and tasks. Lee, Sarkar, Tankelevitch and colleagues at Microsoft Research and Carnegie Mellon surveyed 319 knowledge workers across 936 first-hand examples of AI use at work. The headline result: higher confidence in the AI predicted less critical thinking; higher confidence in oneself predicted more (Lee et al., 2025). Sarkar's companion framing put it in three words: when copilot becomes autopilot (Sarkar et al., 2024). Stadler and colleagues showed that LLM-assisted students achieve cognitive ease at the cost of depth (Stadler, Bannert and Sailer, 2024). Bastani and colleagues ran a randomised study of GPT-tutored students who scored worse than controls on subsequent unaided tests (Bastani et al., 2024). Kosmyna and colleagues at the MIT Media Lab measured EEG connectivity during AI-assisted essay writing and found brain network engagement weakest in the LLM-assisted group, persistently so even after the assistance was withdrawn (Kosmyna et al., 2025).
The MIT Media Lab paper, with senior author Pattie Maes, named this pattern cognitive debt. The Zeno thesis is that cognitive debt is not a side effect to be flagged and managed. It is the central design constraint of the next decade of knowledge work, and the artefact category that pays it down does not yet exist.
Cognitive debt is not an argument against AI. It is an argument for instrumentation, and for a new class of tool.
The babysitting tax.
If single-agent assistance is already producing measurable cognitive debt under controlled conditions, multi-agent supervision is producing something larger and almost entirely unmeasured.
The trajectory is established. Anthropic's Economic Index reports through 2025 and 2026 trace a clear migration of API workloads toward automation patterns, with enterprise API traffic running at roughly three-quarters automation by mid-2025, while consumer Claude.ai usage has oscillated between augmentation and automation majorities and trended back toward augmentation in the most recent reports (Anthropic Economic Index, September 2025; January 2026; March 2026). Microsoft's 2025 Work Trend Index Annual Report finds that 82% of leaders plan to use agents to expand workforce capacity in the next twelve to eighteen months (Microsoft WTI Annual Report, April 2025). The companion Infinite Workday special report, drawing on Microsoft 365 telemetry from 31,000 workers across 31 countries, documents the 275 daily interruptions cited above (Microsoft WTI, June 2025).
These numbers, taken together, describe the operating environment of the next two years. Each operator will be expected to supervise more agents, in shorter timeslices, against a baseline of attentional fragmentation that already exceeds anything in human work history.
The literature on what happens to cognitive load when a supervisor runs more than two or three concurrent agents is, at the time of writing, almost empty. The single most relevant empirical paper, Zhou and colleagues' OrchVis study at CHI 2025, demonstrates that hierarchical orchestration interfaces materially reduce supervisor load compared to flat task lists, but the absolute baseline of multi-agent supervisory load has never been normed (Zhou et al., 2025). NASA-TLX, the standard subjective load instrument since 1988, was developed for cockpit and process-control work and has not been recalibrated for parallel agent supervision (Hart and Staveland, 1988). Mozannar and colleagues have begun modelling interaction cost with single AI assistants (Mozannar et al., 2024); the multi-assistant case remains open.
We will call this gap the babysitting tax. It is the cognitive cost per unit of supervised agent capacity, and the inflection point at which adding another agent reduces total throughput rather than increasing it. The babysitting tax is currently invisible in vendor reporting, because every vendor of agent platforms has a commercial interest in treating agent count as additive. It will not be invisible for long. The first organisations that measure it credibly will gain a structural advantage in agent deployment. Zeno is committing to producing the first such measurement (see Section IX).
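The shape of the inflection point can be made concrete with a toy model. Everything below is illustrative: the functional form (linear value, superlinear supervision cost) and every parameter are assumptions, not measurements; producing the real curve is precisely the proposed research:

```python
def throughput(n_agents: int, value_per_agent: float = 1.0,
               supervision_base: float = 0.1, exponent: float = 2.0) -> float:
    """Net operator throughput for n concurrently supervised agents.
    Assumes each agent adds fixed value while supervision cost grows
    superlinearly with agent count (context switching compounds)."""
    gross = n_agents * value_per_agent
    supervision_cost = supervision_base * n_agents ** exponent
    return gross - supervision_cost

# Under these assumed parameters, net throughput peaks at five agents;
# a sixth agent makes the operator slower, not faster.
best = max(range(1, 21), key=throughput)
print(best, throughput(best))  # → 5 2.5
```

The empirical question is not whether such a peak exists but where it sits for real supervisors, real agents, and real tasks, and how steeply throughput falls past it.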
A sixty-year lineage.
Nothing in this thesis is original. The argument that humans need engineered artefacts to think well at scale was made in 1945 by Vannevar Bush, when he proposed the Memex as a personal associative memory (Bush, 1945). It was made in 1960 by J.C.R. Licklider in Man-Computer Symbiosis, the paper of which Zeno's thesis is a direct descendant (Licklider, 1960). It was made in 1962 by Douglas Engelbart, in the most ambitious technical document ever written on the subject, Augmenting Human Intellect: A Conceptual Framework (Engelbart, 1962). It was made by Alan Kay and Adele Goldberg in 1977 with the Dynabook (Kay and Goldberg, 1977). It was made in 1988 by John Sweller, who established that cognitive load is a measurable engineering quantity rather than a metaphor, the empirical foundation on which everything in this paper rests (Sweller, 1988). And it was made in 1994 by Pattie Maes in Agents That Reduce Work and Information Overload, a paper whose title alone stated the cognitive-load thesis thirty years before the LLM era (Maes, 1994).
The philosophical scaffolding is older still. Andy Clark and David Chalmers's Extended Mind hypothesis, published in Analysis in 1998, argued that cognition can constitutively include the artefacts in the environment, not just rely on them (Clark and Chalmers, 1998). The hypothesis has serious critics, principally Adams and Aizawa's coupling-constitution fallacy objection (Adams and Aizawa, 2008), and Zeno does not endorse the strong version. The defensible position, which Kim Sterelny has articulated as the scaffolded mind thesis, is more modest: cognition is reliably scaffolded by external structure, and the design of that scaffolding is a serious cognitive engineering problem (Sterelny, 2010). The strong claim is contested. The scaffolding claim is not.
Edwin Hutchins's Cognition in the Wild, his ethnography of navigation aboard a US Navy vessel, remains the canonical empirical demonstration that real cognitive work is distributed across people, instruments, and procedures, with no single operator holding the whole task in mind (Hutchins, 1995). The Navy has known for thirty years what AI agent platforms are now rediscovering: a competent supervisor of complex distributed work is mostly running the artefacts, not the cognition.
The contemporary lineage runs through Bret Victor, Andy Matuschak and Michael Nielsen, Geoffrey Litt, and Maggie Appleton. Matuschak and Nielsen's 2019 essay How Can We Develop Transformative Tools for Thought? defined the modern movement and named its central failure: the tools-for-thought industry has produced thousands of note-taking apps and zero genuine cognitive amplifiers (Matuschak and Nielsen, 2019). Litt and the Ink & Switch lab have demonstrated the technical preconditions of the next generation, particularly malleable software and local-first data (Litt et al., 2025). Appleton's correction is essential: tools for thought are cultural practices first and computational objects second, and the current movement has confused the two (Appleton, 2022).
The MIT Media Lab Fluid Interfaces group, under Pattie Maes, and the Advancing Humans with AI group under Pattaranutaporn, are currently the most active academic node of this work. Pattaranutaporn and colleagues' 2021 Nature Machine Intelligence work on AI-generated characters and personalised learning interventions is the most cited example of designing for the inner brain rather than the outer one (Pattaranutaporn et al., 2021).
Zeno's contribution to this lineage is not philosophical. It is operational. The thesis has been correct for sixty years. What has been missing is a research-driven product company that builds for the ten-bit channel as a primary engineering target rather than as a side effect.
Cognitive primitives.
A cognitive primitive is a reusable artefact whose primary engineering target is the operator's inner-brain bandwidth.
The category is defined by what it is not. A cognitive primitive is not a chatbot, because a chatbot consumes attention rather than conserving it. It is not an integration, because integrations multiply surface area. It is not an agent, because an agent is a producer of work to be supervised, not a structure for supervising work. It is not a productivity app, because productivity apps in the 2010s sense optimise output volume, and the bottleneck of 2026 is not output volume.
A cognitive primitive does one of three things, sometimes more than one. It filters: it reduces the number of artefacts the inner brain has to evaluate. It prioritises: it imposes a serial order on a parallel input stream. It commits: it converts an open decision into a closed one with a documented rationale.
These three functions are exactly the ones the inner brain is doing at ten bits per second. A well-designed primitive does not replace the operator's judgement. It pre-conditions the input stream so that the operator's judgement is applied to the right ten bits.
The product surface follows from the function. Zeno's first releases are expected to take some combination of these forms:
- A daily decision brief that consolidates the operator's open commitments, agent outputs, and asynchronous communications into a single artefact reviewed once at a fixed time, replacing the tens of context switches a knowledge worker currently makes to assemble the same picture.
- A supervisor console for multi-agent work that exposes only the decisions the operator must make, hides the agent traffic that is purely internal to the work, and produces a load-aware reading order rather than a chronological feed.
- A human context API that provides downstream agents with structured information about the operator's current focus, capacity, and commitments, so that agent-side scheduling can become attention-aware rather than calendar-aware.
- A cognitive sovereignty layer that gives the operator graduated control over how much agency is delegated to AI on which tasks, with audit trails sufficient for the operator to recover the reasoning of past delegated decisions.
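Of these, the human context API is the easiest to sketch concretely. A minimal shape it might take (every name and field below is a hypothetical illustration, not a shipped interface):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class OperatorContext:
    """Hypothetical payload an agent could consult before claiming attention."""
    focus: str                # what the operator is working on right now
    capacity: float           # rough fraction of attention available, 0.0-1.0
    open_commitments: List[str] = field(default_factory=list)
    next_review_at: str = ""  # when the next decision brief is scheduled

def should_interrupt(ctx: OperatorContext, urgency: float) -> bool:
    """Attention-aware scheduling: interrupt only when urgency clears a
    threshold that rises as the operator's spare capacity falls."""
    threshold = 1.0 - ctx.capacity
    return urgency > threshold

ctx = OperatorContext(focus="release review", capacity=0.2,
                      open_commitments=["ship v1.2", "partner call"])
print(should_interrupt(ctx, urgency=0.5))  # → False
```

The design point is that deferral becomes the default: an agent must justify an interruption against the operator's state, rather than the operator defending their state against every agent.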
Each is a hypothesis, not a roadmap commitment. The roadmap commitment is the methodology: every primitive Zeno ships is instrumented for cognitive load, evaluated on real operators in real workflows, and published with negative results when negative results are what we find.
Cognitive sovereignty.
The brand bears the name of Zeno of Citium, founder of the Stoic school, and the name is doing real work. The Stoic tradition is the one body of pre-modern philosophical practice most directly concerned with the disciplined management of attention under input overload. Pierre Hadot's reading of the tradition as philosophy as a way of life, embodied and protocolised rather than purely textual, is the position Zeno operates from (Hadot, 1995).
The Stoic discipline of attention, prosoche, was protocolised in the Discourses of Epictetus and the Meditations of Marcus Aurelius. The Zeno wager is that this protocol can be re-engineered for an environment Marcus could not have imagined and would have recognised instantly: an inbox of agents.
We are aware that Stoicism has been flattened into a productivity meme over the last decade and we will not contribute to that. The reason the Stoic tradition is load-bearing for Zeno is precise. It supplies a vocabulary, a body of practice, and an ethical commitment that ordinary productivity discourse does not. It says, in language we can borrow without translation, that attention is a virtue, that the operator's relation to their own mind is the central engineering problem, and that the discipline can be protocolised.
The term we use for the goal of the discipline, in twenty-first century language, is cognitive sovereignty. It is the property of an operator who retains meaningful authorship over their own attention, judgement, and decisions in an environment saturated with AI inputs. Cognitive sovereignty is not the rejection of AI. It is the architecture of one's relationship with AI, deliberately designed.
It is also, in practical terms, the differentiator that matters. The vendor universe is converging on a single product pattern: the maximally capable agent with the maximally low-friction interface. Zeno's wager is that the maximally capable agent is the wrong product. The right product is the one that gives the operator the maximum amount of leverage per unit of attention spent. These two design targets diverge, and they will diverge more sharply each year.
The research agenda.
The bibliography that informs this paper, available on request, contains 165 sources. It also reveals seven gaps where the published literature is conspicuously thin and where Zeno proposes to contribute primary research.
Flagship research directions.
Zeno is developing a NASA-TLX-equivalent normative dataset for cognitive load under multi-agent LLM supervision, in collaboration with Eindhoven-region and Dutch academic partners, using NASA-TLX supplemented with eye-tracking and EEG where possible. The intended output is a benchmark and measurement protocol that other organisations can reuse.
Zeno is also developing a randomized trial of a four-week prosoche-derived attention protocol for knowledge workers, with NASA-TLX, output quality, sustained attention, and self-reported flow as outcomes. The purpose is to convert a long philosophical lineage into a measurable intervention.
Additional research directions.
The babysitting tax. The supervisor-attention-per-agent function and the inflection point at which additional agents reduce total throughput. A defensible empirical answer would change how every agent platform on the market is sold.
Skill atrophy under sustained AI assistance. A six-to-twelve-month longitudinal study of professional knowledge workers under different AI-use protocols, with cognitive and biometric outcomes.
A defensible census of AI-tool count per knowledge worker. The figure most commonly cited in vendor sprawl reports conflates SaaS apps with AI tools. A clean methodology and an open dataset would be both a high-citation contribution and a piece of public infrastructure.
Cross-cultural replication of attention-residue and interruption findings in AI-saturated workflows. Most existing data is American or European. The Anthropic and World Economic Forum data show that adoption curves vary sharply across regions; the cognitive consequences will too.
Agent-orchestration cognitive load theory. Sweller's CLT framework has been formalised for instructional design, programming education, and pilot work. It has not been formalised for the case where the learner is also the supervisor of an autonomous workforce. Zeno proposes to do the work (Sweller, 2020).
Outputs that succeed will be published, peer-reviewed where possible, and released as open datasets. Outputs that fail will be published as well.
An invitation.
Zeno Center is based in Eindhoven. The choice of location is deliberate. Brainport is one of Europe's densest deep-tech corridors, and the cluster of TU Eindhoven (TU/e Innovation Space), the Eindhoven AI Systems Institute (EAISI), the Jheronimus Academy of Data Science (JADS), and MindLabs Tilburg maps closely to the disciplines this thesis draws from. We are set up to be embedded in that ecosystem rather than adjacent to it.
If you build AI products and you have noticed, against your own expectations, that your team is not faster, this paper is for you. If you ship under sustained pressure and your edge depends on judgement that does not yet have a tool, this paper is for you. If you are an academic working on cognitive load, human-AI interaction, or distributed cognition, and you would like access to the instrumentation and data we are building, this paper is also for you, and we would like to talk.
AI's bottleneck has moved from silicon to attention. The companies, teams, and operators who close that gap will define the next decade of knowledge work. Zeno is the lab and the toolkit for that transition.
Design partner program
We work with a small number of design partners who receive early access to cognitive primitives and shared measurement protocols. If this fits your team, join updates at zeno.center or write to admin@zeno.center.
The site is at zeno.center. The bibliography and full source list are available on request.
About Zeno.
Zeno Center is a research and product lab based in Eindhoven, NL. The legal entity is a Dutch BV.
The lab is founded by Mar Helali, an operator with a decade in site reliability engineering and active executive roles across an AI orchestration platform, an engineering services firm, and an AI-augmented enterprise software collaboration. The cognitive load thesis grew out of his own operating experience and an in-progress academic pivot toward philosophy and psychology research on human-AI cognitive integration.
The team is international by design. Engineering leadership is based in Tunisia. Design and cognitive UX are based in Korea. Operations and university partnerships are based in Eindhoven. The lab works with academic partners in the JADS, MindLabs, EAISI, and TU/e cluster.
Zeno is structured to fund its research programme through a focused set of cognitive primitives for AI developers and agent builders, with a research-collaboration track for academic and operator-heavy partner organisations.
Contact: admin@zeno.center.