Open-source DeepResearch – Freeing our search agents
Source: HuggingFace Date: 2025-02-04 URL: https://huggingface.co/blog/open-deep-research
Summary
Library update and research reproduction: Hugging Face's open-source reproduction of OpenAI Deep Research, built in a 24-hour sprint with smolagents. Architecture: code-based agents (each action is written as Python code) rather than JSON tool-calling, plus a text-based web browser (adapted from Microsoft's Magentic-One) and a file inspector. GAIA benchmark: 55.15% average, against 67.36% for OpenAI Deep Research and 33% for a JSON-based agent in the same setup, so switching to code actions alone accounts for roughly 22 points. Code agents also need about 30% fewer steps, and therefore fewer generated tokens, than JSON equivalents.
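To make the contrast between the two action formats concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the smolagents API or the code from the original post: the stub tools `search` and `visit` and the executors `run_json_action` and `run_code_action` are hypothetical names invented for this example.

```python
import json

# Hypothetical stand-in tools (assumptions for illustration only);
# the real agent uses a text-based browser and a file inspector.
def search(query: str) -> list[str]:
    """Pretend web search: returns three fixed-pattern URLs."""
    return [f"https://example.com/{query.replace(' ', '-')}/{i}" for i in range(3)]

def visit(url: str) -> str:
    """Pretend page fetch: returns a short fake page body."""
    return f"contents of {url}"

TOOLS = {"search": search, "visit": visit}

def run_json_action(action_json: str):
    """Execute one JSON tool call: exactly one tool with one set of arguments."""
    action = json.loads(action_json)
    return TOOLS[action["tool"]](**action["arguments"])

def run_code_action(code: str):
    """Execute one code action with the tools in scope and return `result`."""
    namespace = dict(TOOLS)
    exec(code, namespace)  # a real agent would run this in a sandbox
    return namespace["result"]

# JSON style: every tool call is its own step, and intermediate values
# (the URLs) have to pass back through the caller between steps.
json_steps = 0
urls = run_json_action('{"tool": "search", "arguments": {"query": "gaia benchmark"}}')
json_steps += 1
pages_json = []
for url in urls:
    call = json.dumps({"tool": "visit", "arguments": {"url": url}})
    pages_json.append(run_json_action(call))
    json_steps += 1

# Code style: a single action composes both tools with ordinary control flow.
code_action = """
urls = search("gaia benchmark")
result = [visit(u) for u in urls]
"""
pages_code = run_code_action(code_action)
code_steps = 1

print(json_steps, code_steps, pages_json == pages_code)
```

In this toy run both styles fetch the same three pages, but the JSON style takes four separate actions where the code style takes one. That is the kind of step reduction the summary refers to, although the exact ratio here is an artifact of the example.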
Implications
Model release cadence (agent reasoning). The 22-point GAIA improvement from code agents over JSON agents is a methodologically significant result: because the two runs used the same setup, the action representation format, rather than model size or browsing tools, explains the gap between the weak and the strong configuration. This shifts the frame for teams building research agents from “which model?” to “which action representation?”
HF as open-source ML hub. Releasing an open-source alternative to a proprietary system after a 24-hour sprint, and landing about 12 points behind OpenAI Deep Research on GAIA (55.15% vs 67.36%), demonstrates that the open ecosystem can rapidly replicate capabilities that closed systems position as differentiated. The smolagents code-agent pattern is likely to inform how subsequent search agents are built.
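The point gaps quoted in this note can be checked directly from the reported scores. The short sketch below only restates the arithmetic; the dictionary keys are labels chosen for this example, and the 33% JSON figure is rounded in the source, so the first gap is approximate.

```python
# GAIA scores as reported in the summary (percent).
scores = {
    "open_code_agent": 55.15,       # open reproduction with code actions
    "open_json_agent": 33.0,        # same setup with JSON actions (rounded in the source)
    "openai_deep_research": 67.36,  # OpenAI Deep Research
}

# Gain attributed to the action format alone.
gap_code_vs_json = round(scores["open_code_agent"] - scores["open_json_agent"], 2)

# Remaining distance to the proprietary system.
gap_to_openai = round(scores["openai_deep_research"] - scores["open_code_agent"], 2)

print(gap_code_vs_json)  # 22.15
print(gap_to_openai)     # 12.21
```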