Auto Research: Multi-Agent Autonomous Scientific Discovery
Jingxuan Kang · March 2026
The Starting Point
Two heavyweight voices form the premise:
"...there is a very wide spread in capability (several orders of magnitude) depending on what resources and assistance one gives the tool." — Terence Tao, Mathstodon (July 2025)
"[LLMs] are extremely useful and I don't think the industry has realized anywhere near 10% of their potential even at present capability." — Andrej Karpathy, 2025 LLM Year in Review
The models are capable enough. The bottleneck is how we use them.
Five Core Problems in Auto Research
1. Where Do Ideas Come From?
2. How to Discover & Validate Ideas?
3. Process Stability
4. How to Evaluate Results?
5. How to Keep Running?
Human in the Loop
Humans aren't an optional final check — they're a core component of the system.
Direction: Which problems matter?
Judgment: Surprising or mundane?
Kill / Go: When to cut losses
Verification: Every claim, every number
reproduce.md: The Future of Reproducibility
Open-Source the Prompt, Not Just the Code
The Ultimate Vision
Every GPU contributes. Every cycle counts. Science never sleeps.
The Auto Research wave is unstoppable. In the final state, some GPUs drive LLM inference (agents that think, plan, write, and review), while others explore every sub-direction (continuous training, testing, and improvement across every domain).
Talk Complete
This talk was presented in March 2026. The slides above contain the full visual presentation with architecture diagrams, model comparisons, and the complete ARIS framework.
