On-Demand macOS Instances for Agents
The most natural interface for humans is also the most natural interface for AI agents. We're making that interface accessible.
There's a pattern you see repeated throughout the history of computing: the most powerful interfaces tend to be the most general ones. Unix pipes. The web browser. And perhaps most fundamentally, the graphical user interface itself.
When we started building our platform for providing remote macOS instances, we had a thesis: the same interface that works best for humans would work best for AI agents. Not specialized APIs. Not custom protocols. Just pure, unfettered access to macOS through the same visual interface humans use.
This might seem counterintuitive. Surely AI agents would prefer a programmatic interface? But that's looking at it backwards. The GUI isn't just an accommodation for human limitations—it's an elegant abstraction layer that's evolved over decades to be the most effective way to interact with computers.
Think about what happens when an AI agent needs to use Final Cut Pro. You could try to create a specialized API for video editing operations. But that API would always lag behind the actual software's capabilities. Instead, why not let the agent interact with Final Cut Pro directly, just like a human would?
This is what our platform enables. We provide on-demand macOS instances with full VNC access over WebSocket connections. AI agents can see the screen and interact with it just like humans do. They can learn from watching real users. They can operate any application, not just ones with automation APIs.
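To make that concrete, here is a rough sketch of the agent side of such a session. It assumes a hypothetical WebSocket endpoint that tunnels raw RFB (VNC) traffic; the input message layouts follow the RFB specification (RFC 6143), but the URL, instance ID, and the skipped handshake are illustrative placeholders, not a documented API.

```python
# Sketch only: assumes a hypothetical WebSocket endpoint that tunnels raw RFB
# (VNC) bytes. Message formats follow RFC 6143; the URL and the omitted
# handshake are illustrative placeholders, not a documented API.
import asyncio
import struct

import websockets  # pip install websockets


def pointer_event(x: int, y: int, buttons: int = 0) -> bytes:
    """RFB PointerEvent: type 5, button mask, then 16-bit x/y (big-endian)."""
    return struct.pack("!BBHH", 5, buttons, x, y)


def key_event(keysym: int, down: bool) -> bytes:
    """RFB KeyEvent: type 4, down flag, 2 bytes padding, 32-bit X keysym."""
    return struct.pack("!BBxxI", 4, int(down), keysym)


def framebuffer_update_request(w: int, h: int, incremental: bool = True) -> bytes:
    """RFB FramebufferUpdateRequest covering the region (0, 0, w, h)."""
    return struct.pack("!BBHHHH", 3, int(incremental), 0, 0, w, h)


async def click(url: str, x: int, y: int) -> None:
    async with websockets.connect(url) as ws:
        # The RFB version/security/ServerInit handshake is omitted here for
        # brevity; a real client must complete it before sending input.
        await ws.send(pointer_event(x, y, buttons=0b001))  # press left button
        await ws.send(pointer_event(x, y, buttons=0b000))  # release
        await ws.send(framebuffer_update_request(1920, 1080))  # request pixels


# Hypothetical instance URL -- substitute whatever address your provisioner returns.
asyncio.run(click("wss://instances.example.test/abc123/vnc", x=640, y=400))
```

In practice a full client also completes the RFB handshake and decodes framebuffer updates, so the agent can actually see the screen it is clicking on.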
The implications are profound:
- AI assistants can learn creative software by actually using it, not through limited APIs
- Test automation can work with any application because it operates at the OS level
- AI agents can maintain persistent state across sessions, just like human users
- Training data can be collected from real OS interactions, not simulated environments
What's particularly interesting is how this approach scales. When you give AI agents access to the standard human interface, they can adapt to new applications and workflows without requiring new APIs or integrations. The same interface that lets them use Final Cut Pro today will let them use whatever new applications emerge tomorrow.
This matters because the future of AI isn't just about processing data—it's about interacting with the real world. And for many professional workflows, that real world runs on macOS. By providing AI agents with genuine OS access, we're enabling a new generation of assistants that can truly understand and participate in human workflows.
The technical challenges here weren't trivial. We needed to make instance provisioning instant and reliable. We had to ensure VNC connections were fast enough for real-time interaction. We had to persist user state across sessions. But the result is transformative: AI agents can now interact with macOS as naturally as humans do.
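As a rough illustration of what the provisioning flow could look like from a client's point of view, the sketch below requests an instance and polls until it reports ready. The base URL, endpoints, authorization header, and response fields (`status`, `vnc_url`) are all assumptions made for the example, not a published API.

```python
# Minimal sketch of instance provisioning from the client side. The base URL,
# endpoints, credential, and response fields (status, vnc_url) are hypothetical.
import time

import requests  # pip install requests

BASE = "https://api.example.test"              # hypothetical control-plane URL
HEADERS = {"Authorization": "Bearer <token>"}  # placeholder credential


def provision_instance() -> str:
    """Request a macOS instance and block until it reports ready."""
    resp = requests.post(f"{BASE}/instances", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    instance_id = resp.json()["id"]

    while True:
        state = requests.get(
            f"{BASE}/instances/{instance_id}", headers=HEADERS, timeout=30
        ).json()
        if state["status"] == "ready":
            return state["vnc_url"]   # WebSocket endpoint for the VNC session
        time.sleep(2)                 # poll until the instance finishes booting


vnc_url = provision_instance()
print("Connect the agent to:", vnc_url)
```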
This is just the beginning. As AI models become more sophisticated in understanding visual interfaces, having access to real OS environments will become increasingly crucial. We're not just providing infrastructure—we're enabling a fundamental shift in how AI agents interact with computers.
The next generation of AI assistants won't be limited to chat interfaces or specialized APIs. They'll be able to use computers just like we do, learning and adapting to new applications naturally. And that's going to change everything.
The future of AI automation isn't about replacing human interfaces—it's about making them accessible to artificial agents. Because in the end, the best interface for humans turned out to be the best interface for AI too.