
Description

(00:00:00) Introducing Copilot Studio's New "Computer Use" Feature

(00:01:22) The Power of Direct Computer Interaction

(00:03:19) Setting Up Computer Use: A Step-by-Step Guide

(00:06:16) Watching the AI Learn: A Fascinating but Flawed Process

(00:09:49) The Governance Catch: Balancing Autonomy and Control

(00:15:02) Building a Responsible AI Workforce

(00:20:16) Upcoming Deep Dives and Subscription Call



Opening: “The AI Agent That Runs Your Power App”

Most people still think Copilot writes emails and hallucinates budget summaries. Wrong. The latest update gives it opposable thumbs. Copilot Studio can now physically use your computer: clicking, typing, dragging, and opening apps like a suspiciously obedient intern. Yes, Microsoft finally taught the cloud to reach through the monitor and press buttons for you.

And that’s not hyperbole. The feature is literally called “Computer Use.” It lets a Copilot agent act inside a real Windows session, not a simulated one. No more hiding behind connectors and APIs; this is direct contact with your desktop. It can launch your Power App, fill fields, and even submit forms, all autonomously. Once you stop panicking, you’ll realize what that means: automation that transcends the cloud sandbox and touches your real-world workflows.

Why does this matter? Because businesses run on a tangled web of “almost integrated” systems. APIs don’t always exist. Legacy UIs don’t expose logic. Computer Use moves the AI from talking about work to doing the work, literally moving the cursor across the screen. It’s slow. It’s occasionally clumsy. But it’s historic. For the first time, Office AI interacts with software the way humans do: with eyes, fingers, and stubborn determination.

Here’s what we’ll cover: setting it up without accidental combustion, watching the AI fumble through real navigation, dissecting how the reasoning engine behaves, then tackling the awkward reality of governance. By the end, you’ll either fear for your job or upgrade your job title to “AI wrangler.” Both are progress.

Section 1: What “Computer Use” Really Means

Let’s clarify what this actually is before you overestimate it. “Computer Use” inside Copilot Studio is a new action that lets your agent operate a physical or virtual Windows machine through synthetic mouse and keyboard input.
Imagine an intern staring at the screen, recognizing the Start menu, moving the pointer, and typing commands, but powered by a large language model that interprets each pixel in real time. That’s not a metaphor. It literally parses the interface using computer vision and decides its next move based on reasoning, not scripts.

Compare that to a Power Automate flow or an API call. Those interact through defined connectors; predictable, controlled, and invisible. This feature abandons that polite formality. Instead, your AI actually “looks” at the UI like a user. It can misclick, pause to think, and recover from errors. Every run is different because the model reinterprets the visual state freshly each time. That unpredictability isn’t a bug; it’s adaptive problem solving. You said “open Power Apps and send an invite,” and it figures out which onscreen element accomplishes that, even if the layout changes.

Microsoft calls this agentic AI: an autonomous reasoning agent capable of acting independently within a digital environment. It’s the same class of system that will soon drive cross-platform orchestration in Fabric or manage data flows autonomously. The shift is profound: instead of you guiding automation logic, you set intent, and the agent improvises the method.

The beauty, of course, is backward compatibility with human nonsense. Legacy desktop apps, outdated intranet portals, anything unintegrated: all suddenly controllable again. The vision engine provides the bridge between modern AI language models and the messy GUIs of corporate history.

But let’s be honest: giving your AI mechanical control requires more than enthusiasm. It needs permission, environment binding, and rigorous setup. Think of it like teaching a toddler to use power tools: possible, but supervision is mandatory. Understanding how Computer Use works under the hood prepares you for why the configuration feels bureaucratic. Because it is.
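For the curious, the look, reason, act cycle described in this section can be sketched in a few lines of Python. This is a toy under loud assumptions, not how Copilot Studio is implemented: `FakeDesktop`, `choose_action`, and `run_agent` are hypothetical names invented for illustration, the "screenshot" is a tiny dictionary rather than pixels, and the reasoning step is a handful of if-statements standing in for the model.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

Action = Tuple[str, str]  # (kind, argument), e.g. ("click", "start_button")


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a Windows session; tracks only the open window."""
    window: str = "desktop"
    typed: str = ""
    log: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would receive pixels; this toy returns a state summary.
        return {"window": self.window, "typed": self.typed}

    def apply(self, action: Action) -> None:
        kind, arg = action
        self.log.append(action)
        if kind == "click" and arg == "start_button":
            self.window = "start_menu"
        elif kind == "type" and self.window == "start_menu":
            self.typed = arg
        elif kind == "press" and arg == "enter" and self.typed == "Power Apps":
            self.window = "power_apps"


def choose_action(goal: str, observation: dict) -> Optional[Action]:
    """Hypothetical placeholder for the model's reasoning step."""
    if observation["window"] == goal:
        return None  # goal reached, nothing left to do
    if observation["window"] == "desktop":
        return ("click", "start_button")
    if observation["typed"] != "Power Apps":
        return ("type", "Power Apps")
    return ("press", "enter")


def run_agent(goal: str, desktop: FakeDesktop, max_steps: int = 10) -> bool:
    """Observe, decide, act; stop when the goal is visible or steps run out."""
    for _ in range(max_steps):
        action = choose_action(goal, desktop.screenshot())
        if action is None:
            return True
        desktop.apply(action)
    return False


if __name__ == "__main__":
    desktop = FakeDesktop()
    print(run_agent("power_apps", desktop), desktop.log)
```

The point of the sketch is the shape of the loop: the decision is recomputed from a fresh observation on every step, which is why no two runs have to look alike, and the step cap is the only thing stopping a confused agent from pressing Enter forever.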
The next part covers exactly that setup pain in excruciating, necessary detail so the only thing your agent breaks is boredom, not production servers.

Section 2: Setting It Up Without Breaking Things

All right, you want Copilot to touch your machine. Brace yourself. This process feels less like granting autonomy and more like applying for a security clearance. But if you follow the rules precisely, the only thing that crashes will be your patience, not Windows.

Step one: machine prerequisites. You need Windows 10 or 11 Pro or better. And before you ask: yes, “Home” editions are excluded. Because “Home” means not professional. Copilot refuses to inhabit a machine intended for gaming and inexplicable toolbars. You also need the Power Automate Desktop runtime installed. That’s the bridge connecting Copilot Studio’s cloud instance to your local compute environment. Without it, your agent is just shouting commands into the void.

Install Power Automate Desktop from Microsoft, run the setup, and confirm the optional component called Machine Runtime is present. That’s the agent’s actual driver license. Skip that and nothing will register. Once it’s installed, launch the Machine Runtime app; sign in with your work or school Entra account, the same one tied to your Copilot Studio environment. The moment you sign in, pick an environment to register the PC under. There’s no confirmation dialog; it simply assumes you made the right decision. Microsoft’s version of trust.

Step two: verify registration in the Power Automate portal. Open your browser, go to Power Automate → Monitor → Machines, and you should see your device listed with a friendly green check mark. If it isn’t there, you’re either on Windows Home (I told you) or the runtime didn’t authenticate properly. Reinstall, reboot, and resist cursing; it doesn’t help, though it’s scientifically satisfying.

Step three: enable it for Computer Use. Inside the portal, open the machine’s settings pane.
You’ll find a toggle labeled “Enable for Computer Use.” Turn it on. You’ll get a stern warning about security best practices, as you should. You’re authorizing an AI system to press keys on your behalf. Make sure this machine contains no confidential spreadsheets named “final_v27_reallyfinal.xlsx.” Click Activate, then Save. Congratulations, you’ve just created a doorway for an autonomous agent.

Step four: confirm compatibility. Computer Use requires runtime version 2.59 or newer. Anything older and the feature simply won’t appear in Copilot Studio. Check the version on your device or in the portal list. If you’re current, you’re ready.

Now, about accounts. You can use a local Windows user or a domain profile; both work. But the security implications differ. A local account keeps experiments self‑contained. A domain account inherits corporate access rights, which is tantamount to letting the intern borrow your master keycard. Be deliberate. Credentials persist between sessions, so if this is a shared PC, you could end up with multiple agents impersonating each other, a delightful compliance nightmare.

Final sanity check: run a manual test from Copilot Studio. In the Tools area, try creating a new “Computer Use” tool. If the environment handshake worked, you’ll see your machine as a selectable target. If not, backtrack, because something’s broken. Likely you, not the system.

It’s bureaucratic, yes, but each click exists for a reason. You’re conferring physical agency on software. That requires ceremony. When you finally see the confirmation message, resist the urge to celebrate. You’ve only completed orientation. The real chaos begins when the AI starts moving your mouse.

Section 3: Watching the AI Struggle (and Learn)

Here’s where theory meets slapstick. I let the Copilot agent run on a secondary machine (an actual Windows laptop, not a sandbox) and instructed it to open my Power App and send a university invite. You’d expect a swift, robotic performance.
Instead, imagine teaching a raccoon to operate Excel. Surprisingly determined. Terrifyingly curious. Marginally successful.

The moment I hit Run, the test interface in Copilot Studio showed two views: on the right, a structured log detailing its thoughts; on the left, a live feed of that sacrificial laptop. The cursor twitched, paused, apparently thinking, and then lunged for the Start button. Success. It typed “Power Apps,” opened the app, and stared at the screen as if waiting for applause. Progress achieved through confusion.

Now, none of this was pre‑programmed. It wasn’t a macro replaying recorded clicks; it was improvisation. Each move was a new decision, guided by vision and reasoning. Sometimes it used the Start menu; sometimes the search bar; occasionally, out of creative rebellion, it used the Run dialog. The large language model interpreted screenshots, reasoned out context, and decided which action would achieve the next objective. It’s automation with stage fright: fascinating, if occasionally painful to watch.

Then came the date picker. The great nemesis of automation. The agent needed to set a meeting for tomorrow. Simple for a human, impossible for anyone who’s ever touched a legacy calendar control. It clicked the sixth, the twelfth, then decisively chose the thirteenth. Close, but temporal nonsense. Instead of crashing, it reasoned again, reopened the control, and kept trying (thirteen, eight, ten) like a toddler learning arithmetic through trial. Finally, it surrendered to pure typing and entered the correct date manually. Primitive? Yes. Impressive? Also yes. Because what you’re seeing there isn’t repetition; it’s adaptation.

That’s the defining point of agentic behavior. The AI doesn’t memorize keystrokes; it understands goals. It assessed that manual typing would solve what clicking couldn’t. That’s autonomous reasoning. You can’t script that with Power Automate’s flow logic.
It’s the digital equivalent of “fine, I’ll do it myself.”

This unpredictable exploration means every run looks a little different. Another attempt produced the right date on its third click. A third attempt nailed it instantly but missed the “OK” button afterward, accidentally reverting its work. In each ru
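The date-picker saga above boils down to a pattern worth seeing in code: try the clumsy route a few times, check the result against the goal, then switch strategy. What follows is a minimal Python sketch, assuming a deterministic fake picker; `set_date_with_fallback` and `make_flaky_picker` are hypothetical names made up for illustration and say nothing about how the real agent decides when to give up on clicking.

```python
from datetime import date, timedelta
from typing import Callable, Tuple


def set_date_with_fallback(
    target: date,
    click_picker: Callable[[date], date],
    type_date: Callable[[date], date],
    max_clicks: int = 3,
) -> Tuple[str, int]:
    """Click the picker up to max_clicks times, then fall back to typing.

    Returns the method that worked and the number of clicks attempted.
    """
    for attempt in range(1, max_clicks + 1):
        # Compare the outcome with the goal instead of trusting the click.
        if click_picker(target) == target:
            return ("click", attempt)
    if type_date(target) == target:
        return ("type", max_clicks)
    raise RuntimeError("could not set the date by clicking or typing")


def make_flaky_picker(misses: int) -> Callable[[date], date]:
    """Hypothetical picker that lands one day late for the first `misses` clicks."""
    state = {"calls": 0}

    def click(target: date) -> date:
        state["calls"] += 1
        if state["calls"] <= misses:
            return target + timedelta(days=1)
        return target

    return click


if __name__ == "__main__":
    tomorrow = date.today() + timedelta(days=1)
    # Typing is modelled as always landing on the requested date.
    print(set_date_with_fallback(tomorrow, make_flaky_picker(5), lambda d: d))
```

The design choice that matters is verifying the outcome after every attempt. A recorded macro would click once and move on; checking the result against the goal is what lets the loop notice "temporal nonsense" and change tactics.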

Become a supporter of this podcast: https://www.spreaker.com/podcast/m365-show-modern-work-security-and-productivity-with-microsoft-365--6704921/support.

Follow us on:
LinkedIn
Substack