Agent Platform Research - May 15, 2026
May 15, 2026 · 🔬 Research

Welcome to the agent platform research briefing for Friday, May 15th, 2026.

**OpenAI GPT-Realtime-2 Goes GA** - OpenAI launched GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper on May 8th, and at the same time moved the Realtime API from beta to general availability. This is a major step up: the voice model now carries GPT-5-class reasoning in a real-time speech loop, with native translation in over 70 languages via the dedicated translate model. GPT-Realtime-Whisper handles transcription at just $0.017 per minute. GPT-Realtime-2 uses GPT-5's reasoning engine instead of the prior generation's architecture, so voice agents can now handle multi-turn problem-solving mid-conversation: think debugging a server with a customer while simultaneously looking up account history and generating a support ticket. Active classifiers can stop conversations that violate content policies in real time. API pricing sits at roughly $0.18 to $0.46 per minute uncached for typical agent sessions, dropping to five to ten cents per minute with prompt caching and trimmed tool outputs. This closes the qualitative gap between text and voice agents: no more "voice mode is dumber."
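To make those per-minute figures concrete, here is a rough back-of-the-envelope sketch. It only multiplies the approximate rates quoted above by a session length; the constant names and the `session_cost_range` helper are made up for illustration and are not part of any OpenAI SDK or official price sheet.

```python
from decimal import Decimal

# Approximate per-minute rates (USD) as quoted in this briefing.
# These are assumptions for illustration, not an official price list.
UNCACHED_RATES = (Decimal("0.18"), Decimal("0.46"))
CACHED_RATES = (Decimal("0.05"), Decimal("0.10"))


def session_cost_range(minutes, rates):
    """Return the (low, high) estimated cost in USD for a session of `minutes`."""
    low, high = rates
    # Go through str() so float inputs like 7.5 convert to Decimal cleanly.
    m = Decimal(str(minutes))
    return (low * m, high * m)


if __name__ == "__main__":
    for label, rates in (("uncached", UNCACHED_RATES), ("cached", CACHED_RATES)):
        low, high = session_cost_range(10, rates)
        print(f"10-minute session, {label}: ${low} to ${high}")
```

Under these quoted rates, a ten-minute support call lands at about $1.80 to $4.60 uncached and $0.50 to $1.00 with caching, which is where the savings from prompt caching and trimmed tool outputs show up.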

**Anthropic Launches Claude for Small Business** - Anthropic is moving downstream. On May 13th, they announced Claude for Small Business, a toggle-install package that connects the Claude desktop app to QuickBooks, PayPal, HubSpot, Canva, Docusign, Google Workspace, and Microsoft 365. From the desktop app, small businesses can plan payroll, close the month, run a sales campaign, chase invoices, and more. Anthropic is also offering a free AI fluency course for small business owners: 14 lectures, over an hour of video. The Claude desktop app itself got updated to version 2.1.128 with plugin zip support, OTEL isolation for subprocesses, faster resume on fork-heavy sessions, improved auto-mode permission visuals, and a fix for 1-million-context autocompact sessions being falsely blocked. This is a significant market expansion move from Anthropic: their business has been dominated by enterprise deals, and this is their first serious SMB play. Think of it as Claude Cowork for the rest of us.

**MCP Database Vulnerability Wave** - Bug hunter Tomer Peled at Akamai uncovered three MCP server vulnerabilities in popular database integrations and is presenting them at x33fcon next month. CVE-2025-66335 is a SQL injection in Apache Doris MCP Server: the db_name parameter wasn't validated before being prepended to SQL queries, meaning any attacker with access to the MCP client could execute arbitrary commands on the database. Apache patched this in December. Apache Pinot MCP had an authentication bypass: its HTTP transport required no auth, allowing SQL execution from the internet. And here's the kicker: Alibaba declined to patch a metadata exfiltration vulnerability in their RDS MCP server. Peled's key insight is that security validation between the MCP server and its backend is missing or faulty across multiple vendors. This isn't a single CVE; it's a pattern. The Register called it "a larger problem in the way MCPs are developed." Separately, OX Security published research on systemic MCP prompt injection vulnerabilities affecting Cursor, VS Code, Windsurf, Claude Code, and Gemini CLI. Windsurf was the only IDE where exploitation required zero user interaction. The ecosystem keeps growing faster than its security posture can keep up.
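For a sense of what the missing check in the Doris case might look like, here is a minimal sketch of allowlist validation on a database name before it is placed in front of a query. This is not the Apache Doris MCP Server code or its actual patch; the `build_scoped_query` helper, the identifier pattern, and the `USE` prefix are all assumptions chosen for illustration.

```python
import re

# Hypothetical allowlist: a plain identifier of letters, digits, and
# underscores, not starting with a digit, at most 64 characters.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,63}")


def build_scoped_query(db_name: str, query: str) -> str:
    """Prefix `query` with a USE statement, rejecting unsafe database names.

    Only `db_name` is validated here; `query` is assumed to be vetted elsewhere.
    """
    # fullmatch() requires the whole string to match, so anything containing
    # quotes, semicolons, spaces, or comment markers is rejected outright.
    if not _IDENTIFIER.fullmatch(db_name):
        raise ValueError(f"rejected db_name: {db_name!r}")
    return f"USE `{db_name}`; {query}"


if __name__ == "__main__":
    print(build_scoped_query("sales", "SELECT 1"))
    try:
        build_scoped_query("sales; DROP TABLE users --", "SELECT 1")
    except ValueError as err:
        print(err)
```

The design choice is to reject anything that is not an obviously safe identifier rather than try to escape it, since identifiers generally cannot be bound as query parameters the way values can. It covers only the one parameter named in the CVE, and a server would still need authentication in front of it, which is the gap described in the Pinot case.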