Mira Murati spent more than two years as OpenAI's CTO before leaving in September 2024, watching AI models get smarter but not faster. Her new company, Thinking Machines, thinks it can fix that. The startup announced Monday it's developing "interaction models" that respond to users in real time — a direct attack on the latency problem that still plagues enterprise AI.
Key Takeaways
- Thinking Machines is developing AI "interaction models" that respond in real time
- This represents the first public announcement from Mira Murati's post-OpenAI venture
- Technical details about how these interaction models differ from existing AI remain limited
What Happened
Thinking Machines made its first substantive public move this week: demonstrating AI systems that respond instantly to user inputs. According to The Verge, the company is showing off what it calls "interaction models" — though technical specifics remain scarce.
The timing matters. Murati left OpenAI in September after helping build GPT-4 and ChatGPT. She's kept quiet about her next move until now. This announcement signals where she thinks the real opportunity lies: not making AI smarter, but making it faster.
The company is currently demonstrating functional prototypes, suggesting it has moved beyond research into working systems. But the reporting doesn't specify the technical architecture or performance benchmarks that would prove these models actually outperform existing systems.
Why Latency Is the Real Problem
What most coverage misses: this isn't about building better AI. It's about fixing the deployment problem that enterprise customers actually face.
Current AI systems — including those Murati helped develop at OpenAI — create noticeable delays between input and response. That lag kills conversational flow in customer service, slows collaborative workflows, and breaks the illusion of natural interaction that makes AI useful for end users.
For enterprise applications, even a delay of a few seconds matters enormously. Real-time responses could unlock customer service automation that doesn't feel robotic, collaborative AI that keeps pace with human thinking, and interactive decision-making systems that work at business speed.
Murati's OpenAI experience gives her unusual insight into why this problem persists: it's not just about faster chips. It requires fundamental optimization of how models process and generate responses.
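To make that concrete, here is a minimal sketch (not based on anything Thinking Machines has disclosed) of why pipeline design, not just hardware, drives perceived latency: a system that streams tokens as they are generated feels responsive after the first token, while a blocking call makes the user wait for the whole completion. The `slow_model` generator is a hypothetical stand-in for any token-by-token model interface.

```python
import time

def slow_model(prompt, n_tokens=40, seconds_per_token=0.05):
    """Hypothetical stand-in for any autoregressive model: yields one token at a time."""
    for i in range(n_tokens):
        time.sleep(seconds_per_token)  # simulated per-token compute
        yield f"tok{i} "

def blocking_latency(prompt):
    """User sees nothing until the full completion has been assembled."""
    start = time.perf_counter()
    _ = "".join(slow_model(prompt))
    return time.perf_counter() - start

def streaming_first_token_latency(prompt):
    """User sees output as soon as the first token arrives."""
    start = time.perf_counter()
    next(iter(slow_model(prompt)))
    return time.perf_counter() - start

print(f"blocking:  response visible after {blocking_latency('hi'):.2f}s")
print(f"streaming: response visible after {streaming_first_token_latency('hi'):.2f}s")
```

The total compute is identical in both cases; only the order in which work reaches the user changes, which is exactly the kind of optimization faster chips alone don't buy you.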
What Remains Unproven
The available reports don't specify how Thinking Machines achieves real-time responses. Are they using novel architectures? Optimized inference techniques? Specialized hardware? The technical approach matters because it determines whether this solution can scale.
The company hasn't disclosed target markets, pricing models, or integration approaches. More importantly, no performance comparisons with existing systems have been provided. Without benchmarks, "real-time" remains a marketing claim rather than a technical achievement.
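Until numbers appear, it's worth being clear about what a convincing benchmark would report. A minimal sketch, assuming some streaming endpoint (the `query_model` generator below is purely a placeholder), would time both the first token and the full response across repeated runs and report medians rather than a single best case.

```python
import statistics
import time

def query_model(prompt):
    """Placeholder: swap in a real streaming client for the system under test."""
    for _ in range(20):
        time.sleep(0.03)  # simulated per-token latency
        yield "token "

def benchmark(prompt, runs=10):
    ttft, total = [], []  # time-to-first-token and total latency, in seconds
    for _ in range(runs):
        start = time.perf_counter()
        first = None
        for _tok in query_model(prompt):
            if first is None:
                first = time.perf_counter() - start
        ttft.append(first)
        total.append(time.perf_counter() - start)
    print(f"median time to first token: {statistics.median(ttft):.3f}s")
    print(f"median total latency:       {statistics.median(total):.3f}s")

benchmark("example prompt")
```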
The computational requirements also remain unclear. Real-time AI typically demands significant infrastructure investment, which affects both cost and accessibility for potential customers.
What To Watch Next
The next signal will be technical documentation. If Thinking Machines publishes research papers or detailed performance metrics, that would suggest a genuine technical breakthrough rather than merely an optimization of existing approaches.
Enterprise pilot programs would provide stronger evidence of commercial readiness. Given Murati's network, partnerships with major customers could emerge quickly if the technology delivers on its promises.
The broader question: whether Thinking Machines can prove that real-time interaction represents a genuine advance over current AI capabilities, or whether it's solving a problem that better hardware would fix anyway. That distinction will determine whether this becomes a product category or just another optimization story.