A practical first move. Pick the right task, give the agent what it needs to learn your work, and ship a result you can measure this month.
Disclosure: I build in this space at Personize. I use it once below as a concrete example of the memory step, not as the point of the piece.
The capability is here. Agents now handle hour-long, multi-step tasks across every department, and the productivity gains in independent studies are real. (I laid out that evidence in Agents Are Eating the Hour-Long Task.) So the question is no longer whether to use them. It's where to start so your first attempt actually works.
Here's the encouraging part. MIT's 2025 study of enterprise AI found that roughly 95 percent of pilots delivered no measurable impact, but the cause was not the models. It was a learning gap: generic tools that never adapt to a team's actual workflow. The 5 percent that succeeded share a pattern you can copy. This guide is that pattern, turned into a starting move.
1. Start with a task, not a chatbot
The old instinct is to roll out a chat assistant and hope people find uses. Skip it. Pick one concrete, repeatable task that currently eats real hours: triaging inbound leads, drafting first-pass contracts, reconciling reports, qualifying support tickets. The agent era rewards delegated tasks, so delegate a task. A good first task is high-volume, well-defined, and currently done by hand.
2. Aim at the back office and the newest people
Two findings point in the same direction. MIT found the clearest ROI in back-office automation, not flashy customer-facing demos. And the productivity studies found the largest gains go to the least experienced workers (a 34 percent jump for the newest support reps, versus 15 percent on average). So your highest-odds first deployment is an internal, high-volume process, put in the hands of your newer staff. That is where the numbers are friendliest and the risk is lowest.
A quick litmus test for a first task: you could write a one-page checklist for it, it happens many times a week, and a wrong answer is recoverable. If it passes all three, start there.
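If you are comparing several candidate tasks, the litmus test can be written down as a simple filter. The sketch below is illustrative only: the task names, the `CandidateTask` fields, and the threshold of five runs a week (my reading of "many times a week") are all assumptions, not figures from the studies.

```python
from dataclasses import dataclass


@dataclass
class CandidateTask:
    name: str
    fits_one_page_checklist: bool  # could you write the steps on one page?
    runs_per_week: int             # how often the task happens
    mistakes_recoverable: bool     # can a wrong answer be caught and undone?


# Assumption: "many times a week" is read here as at least 5 runs.
MIN_RUNS_PER_WEEK = 5


def passes_litmus_test(task: CandidateTask) -> bool:
    """Return True only if the task meets all three criteria."""
    return (
        task.fits_one_page_checklist
        and task.runs_per_week >= MIN_RUNS_PER_WEEK
        and task.mistakes_recoverable
    )


# Hypothetical candidates, for illustration.
candidates = [
    CandidateTask("triage inbound leads", True, 40, True),
    CandidateTask("negotiate enterprise renewal", False, 1, False),
]
shortlist = [t.name for t in candidates if passes_litmus_test(t)]
print(shortlist)
```

The point of writing it down is less the code than the discipline: a task that fails any one of the three checks is a second project, not a first one.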
3. Give the agent memory, or it relearns everything every run
This is the step most pilots skip, and it is why they stall. An agent with no memory re-reads the same context on every run, never gets better, and never learns that your company reports revenue net of refunds or that this customer churned last year. MIT named this exact failure: tools that "don't learn from or adapt to workflows." Before you scale, give the agent a place to accumulate what it learns about your entities and your rules between runs. (This is the layer I work on at Personize, but the principle holds whatever you use: memory is what turns a clever demo into a system that compounds.)
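To make "memory between runs" concrete, here is a deliberately small sketch: a file-backed store that one run writes to and the next run reads from. It is not how Personize or any particular product works. The `AgentMemory` class, the JSON file, and the entity keys such as `customer:acme` are hypothetical, and a real system would need more (search, conflict handling, access control).

```python
import json
import tempfile
from pathlib import Path


class AgentMemory:
    """Tiny file-backed store of facts the agent has learned, keyed by entity."""

    def __init__(self, path: Path):
        self.path = path
        # Load what earlier runs saved, or start empty on the first run.
        self.facts = json.loads(path.read_text()) if path.exists() else {}

    def remember(self, entity: str, fact: str) -> None:
        known = self.facts.setdefault(entity, [])
        if fact not in known:  # skip facts that are already stored
            known.append(fact)
            self.path.write_text(json.dumps(self.facts, indent=2))

    def recall(self, entity: str) -> list[str]:
        return list(self.facts.get(entity, []))


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "memory.json"

    # Run one learns two things about the business.
    run_one = AgentMemory(store)
    run_one.remember("company", "revenue is reported net of refunds")
    run_one.remember("customer:acme", "churned last year")

    # Run two starts fresh but reads what run one left behind.
    run_two = AgentMemory(store)
    recalled = run_two.recall("company")

print(recalled)
```

The second run never had to be told how revenue is reported. That carry-over, however you implement it, is the property to look for.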
4. Buy and adapt before you build from scratch
MIT's data is blunt here: buying from specialized vendors and adapting succeeded about 67 percent of the time, while internal builds succeeded roughly a third as often. Building your own agent stack from zero feels like control; statistically it's the slower road to value. Start by adapting something proven to your workflow, and reserve custom building for the parts that are genuinely yours.
5. Rework the workflow, don't bolt the agent on
McKinsey found that the high performers, the small share actually capturing EBIT impact, were far more likely to redesign the workflow around the agent rather than paste it onto the old one. The cautionary flip side is real: in one trial, experienced developers were 19 percent slower when an agent was bolted onto work they already did fluently. Ask "what would this process look like if the agent did the first 80 percent," not "where can I insert AI into what we already do."
6. Measure outcomes, then govern what you trust
Pick one metric that maps to money or time before you start: tickets resolved per hour, hours saved on reconciliation, cycle time on contracts. Activity ("we ran 10,000 agent calls") is not impact. Once you see a real result, add the guardrails that let you trust it without watching every action: clear rules on what the agent may write and where, and an audit trail of every write it makes. That is what turns a successful pilot into something you can expand.
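The guardrail idea is simple enough to sketch: an allowlist of places the agent may write, plus a log entry for every attempt, allowed or not. Everything named here (the target names, `guarded_write`, the in-memory `audit_log`) is a hypothetical stand-in. In practice the log would go to durable storage and the rules would live in your own systems.

```python
from datetime import datetime, timezone

# Hypothetical rule set: the only places the agent may write.
ALLOWED_TARGETS = {"crm.lead_notes", "helpdesk.ticket_tags"}

# In-memory audit trail, for illustration only.
audit_log: list[dict] = []


def guarded_write(target: str, value: str, sink: dict) -> bool:
    """Write only to allowed targets, and record every attempt."""
    allowed = target in ALLOWED_TARGETS
    if allowed:
        sink[target] = value
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "target": target,
        "value": value,
        "allowed": allowed,
    })
    return allowed


system_of_record: dict = {}  # stand-in for the real systems the agent touches
ok = guarded_write("crm.lead_notes", "qualified, follow up Tuesday", system_of_record)
blocked = guarded_write("billing.invoices", "refund 500", system_of_record)
print(ok, blocked, len(audit_log))  # True False 2
```

Logging the blocked attempt matters as much as blocking it: the attempts an agent was denied tell you where your rules and its behavior disagree.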
Your move this week
You don't need a strategy deck. Pick one back-office task that passes the litmus test, hand it to an adaptable agent with a memory of your context, put a newer team member on it, and measure one number for two weeks. That single loop teaches you more than a quarter of planning, and it puts you on the road the 5 percent took.
The agents are ready. The only thing between you and the gains is choosing the first task. Choose it today.
Want help scoping a first task or wiring the memory layer so your agent actually learns? Email me, or subscribe for the next pieces on getting real value from agents.