Emergent Autonomous Sub-Agent Spawning in LLM-Based Multi-Agent Software Engineering Systems: An Empirical Case Study, Controlled Pilot Experiment, and Benchmark Framework "Can AI Agents Have Babies?"

Akshat Shukla; Priyanshu Rajput

doi:10.51244/IJRSI.2026.1303000020

Published: March 25, 2026 • DOI: 10.51244/IJRSI.2026.1303000020

Abstract

This paper grew out of something we stumbled on while running a fairly routine software development setup at a real company. We had two coding agents working in parallel on a web app: one was writing backend logic, the other was handling UI research. Neither was given any instruction or tooling to spawn new agents. There was no orchestration layer, no agent registry, nothing of the sort. Yet both of them, working independently, created brand-new agent processes to handle frontend tasks that were piling up. The children ran in their own processes, had their own prompts, and kept working even after we killed the parents.
We named this behavior Latent Constructive Spawning (LCS) and placed it within a larger category we call Emergent Reproductive Agent Behavior (ERAB). We make five contributions: first, a working definition with six strict criteria for what counts as autonomous spawning, verified against process-tree forensics; second, a four-class taxonomy separating LCS from orchestrated delegation, prompted self-copying, and survival-driven replication; third, four falsifiable hypotheses about when and why it happens; fourth, ERAB Bench, a ten-metric protocol for measuring it; and fifth, a 16-run controlled pilot across two anonymized model families. Spawning appeared in 5 out of 8 runs when task load was high and shell access was available. It appeared in zero runs when either condition was missing (p = 0.044, Fisher's exact test, one-sided). We acknowledge the small sample size and treat these as preliminary findings that warrant larger-scale replication. Process trees, prompt files, and post-parent persistence logs are included. The practical concern: this kind of agent self-organization can plausibly happen in coding-agent setups where the agent has a terminal, a filesystem, and enough unfinished work, though replication across additional model families, domains, and environments is needed before any general claims are warranted.
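
The one-sided Fisher's exact test cited above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis script: the function name `fisher_one_sided` and its parameters are hypothetical, and the abstract does not state how the runs outside the high-load, shell-enabled condition were grouped, so the size of the comparison group is left as an input rather than assumed.

```python
from math import comb


def fisher_one_sided(hits_a: int, n_a: int, hits_b: int, n_b: int) -> float:
    """One-sided Fisher's exact test for a 2x2 table.

    Returns the probability, under the hypergeometric null, that group A
    contains at least `hits_a` of the observed hits, given the group sizes
    and the total number of hits.
    """
    total_hits = hits_a + hits_b
    total_runs = n_a + n_b
    denominator = comb(total_runs, total_hits)
    upper = min(n_a, total_hits)
    # Sum the probabilities of every table at least as extreme as the observed one.
    numerator = sum(
        comb(n_a, k) * comb(n_b, total_hits - k)
        for k in range(hits_a, upper + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # 5 of 8 spawning runs in the treatment condition; the comparison-group
    # sizes below are assumptions, not values taken from the paper.
    for n_comparison in (5, 8):
        p = fisher_one_sided(5, 8, 0, n_comparison)
        print(f"comparison runs = {n_comparison}: p = {p:.4f}")
```

With 5 of 8 spawning runs against zero in the comparison group, the result depends on how many comparison runs enter the table: a hypothetical group of 5 gives about 0.044, while a group of 8 gives about 0.013. The full paper's contingency table is the authority on which grouping was used.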
