
Last updated: June 2026
If you run a $10M to $200M company and your leadership team keeps asking "where are we on generative AI?" with no shared answer, the budget conversation that follows will pick a vendor before it picks a stage, and the next twelve months will produce a stalled pilot, a quietly canceled subscription, and an awkward note in the board pack. Most operators conflate "we use ChatGPT" with "we are doing generative AI" and skip the question of whether any GenAI output reaches a real operating workflow. In this guide, you will get the five-stage generative AI maturity model, the operator-grade self-assessment behind it, and the decision rule for what to build next, so you can place your business honestly and choose the right move instead of the loudest one.
According to the Stanford HAI 2025 AI Index, 78% of organizations reported using AI in 2024, up from 55% in 2023, the largest year-over-year jump in the Index's history. Adoption is no longer the question. Whether your specific use of generative AI has crossed the line from prompt-pasting into operating leverage is. Arkeo has spent three years deploying GenAI agents on its own operations and on mid-market client engagements, and the recurring pattern is not model failure. It is misplaced confidence: a team that scores itself at Stage 4 and is actually at Stage 1, because nothing produced by a model ever lands in the system where the decision gets made.
Quick Answer
• What it is: A five-stage diagnostic for placing your business on a generative AI maturity curve: Curious, Sanctioned, Workflow-Integrated, Agentic, and Compounding.
• How to use it: Score each candidate workload, not the company. The same business will sit at Stage 1 for finance and Stage 3 for marketing on the same Monday.
• Why it matters: Stage 3 is the line between using GenAI and operating GenAI. Below it, the spend is overhead; above it, the spend is leverage.
• Cost of skipping it: A purchase order for a platform license signed before the workflow audit, and a written-off pilot twelve months later (Deloitte 2024 reports more than two-thirds of enterprises expect 30% or fewer of their GenAI experiments to be fully scaled in the next 3 to 6 months).
• Next step: Book a free AI Assessment. Arkeo will audit your workflows to see if you are ready for custom agents.

A generative AI maturity model is a diagnostic grid that scores a specific workload, not a whole company, against the operating requirements of GenAI: policy and access, data reach, workflow integration, action and approval design, and ownership after ship. It is the answer to "where do we stand?" expressed as a stage-per-workload picture, not a single corporate score. Treat it as the same kind of instrument the NIST AI Risk Management Framework uses when it splits AI risk into Govern, Map, Measure, and Manage: a structured way to see the work that has actually been done, separated from the noise of headcount, hype, and platform licenses.
The workload framing matters because GenAI maturity is not evenly distributed inside a business. Marketing might be running a sanctioned copilot at Stage 2 with an acceptable-use policy and paid seats, sales might have one engineer plumbing GenAI into the CRM at Stage 3, finance might still be on personal ChatGPT accounts at Stage 1, and HR might have banned the tool outright. The shorthand "we are a Stage 2 company" hides all four pictures and produces the worst kind of strategy decision: an average that nobody is actually living. For the broader, non-GenAI AI readiness picture, including data, infrastructure, and culture, run this exercise alongside the cluster's parent assessment.
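To make the averaging problem concrete, here is a minimal sketch using the four-department example above. The dictionary layout, the function names, and the choice to mark a banned tool as `None` are assumptions for illustration; the article's scale does not assign a stage to an outright ban.

```python
from statistics import mean

# Stages taken from the example in the text. None marks a department where
# the tool is banned, which the five-stage scale does not place (assumption).
workload_stage = {
    "marketing": 2,
    "sales": 3,
    "finance": 1,
    "hr": None,
}


def company_average(stages):
    """The single corporate score: mean of the workloads that have a stage."""
    scored = [s for s in stages.values() if s is not None]
    return mean(scored)


def per_workload_table(stages):
    """One row per workload, which is the picture the average hides."""
    rows = []
    for name, stage in stages.items():
        label = "banned" if stage is None else f"Stage {stage}"
        rows.append(f"{name:<10} {label}")
    return "\n".join(rows)
```

In this sketch `company_average(workload_stage)` comes out at 2, which matches "we are a Stage 2 company" while describing only one of the four departments. `per_workload_table(workload_stage)` keeps all four rows visible.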
78% of organizations used AI in 2024, up from 55% in 2023, the largest year-over-year jump in the Index's history.
Source: Stanford HAI 2025 AI Index
The five stages below are GenAI-specific. They focus on the path a piece of generative output takes from a chat window into a workflow where money, contracts, or customer outcomes move. The shorthand: at Stage 1 the output is a draft a human retypes; at Stage 5 the output is an action the business booked while leadership was asleep.
THE FIVE STAGES
Score each candidate workload against the same five stages. Move the workload, not the company.
STAGE 01: Curious
Employees use ChatGPT, Claude, or Gemini on personal accounts. No acceptable-use policy, no inventory, no audit trail. Output is copy-pasted manually. Shadow GenAI is the dominant mode and the IT team often does not know the full footprint.
STAGE 02: Sanctioned
Leadership has named GenAI a priority. A written acceptable-use policy is in place, paid enterprise seats for one or two copilots exist (Microsoft 365 Copilot, ChatGPT Enterprise, or similar), and a single department is running a structured pilot. Output is still pasted manually into the workflow.
STAGE 03: Workflow-Integrated
At least one GenAI step lives inside a real operating workflow, reading from the ERP or CRM, writing a draft back, and tracked against a baseline. The data pipeline is documented. A human approves the material output. The first ROI number sits on a board slide and the workload has a named operator.
STAGE 04: Agentic
GenAI does not just suggest. It calls tools, queries systems, writes records, and routes work, gated by human-in-the-loop approvals for material actions. Three or more agents run across two or more departments, integrated with the ERP or CRM and monitored daily. Failure modes are catalogued and tested.
STAGE 05: Compounding
GenAI is part of the operating system. Agents are first-class workers with scoped permissions, audit logs, private or on-premise deployment where required, and a managed lifecycle. New use cases stand up in days, not quarters, and the marginal cost of the next agent is hours of configuration rather than another six-month project.
Stage 3 is the line between using GenAI and operating GenAI. Reach it in one workflow first.
Most mid-market operators score themselves at Stage 2 or Stage 3 and are honestly at Stage 1 for every workload that matters. The PwC AI Agent Survey of 300 senior US executives in May 2025 reports that 79% of US businesses say AI agents are already being adopted and 88% plan to increase AI-related budgets in the next 12 months. The budget intent is universal. The integration evidence is not. The maturity model exists to keep those two numbers from being confused.
Measurement happens per workload, scored against five operator dimensions. Treat any score below 3 out of 5 in any single dimension as a no-go for that workload until the gap is closed. The dimensions are deliberately operating-flavored, not technical, because every GenAI failure Arkeo has seen in three years of deployments has been an operating failure dressed in a technology costume.
DIMENSION 01: Policy and access
Is there a written acceptable-use policy, paid enterprise seats with data-retention guarantees, and a named owner for the policy? Personal accounts score zero.
DIMENSION 02: Data reach
Can the GenAI step actually see the data the workflow needs? PDFs in a shared drive count as zero until parsed. A copilot that cannot read the CRM is a brochure tool, not a workflow tool.
DIMENSION 03: Workflow integration
Does the output land inside the system where work moves (ERP, CRM, ticketing, approval queue), or does it stop at a chat window the user must retype? Copy-paste keeps a workload below Stage 3, by definition.
DIMENSION 04: Action and approval design
Is there a designed approval point for material output (an invoice over $5K, a customer refund, a contract clause), or is the agent allowed to act unsupervised on things it should not?
DIMENSION 05: Ownership after ship
Does a named operator own the workload the day after launch? The IBM 2025 IBV CEO Study of 2,000 CEOs across 33 countries names "lack of expertise and knowledge" as the top barrier to AI innovation. Unowned workloads are how that barrier shows up in operations.
A workload that scores 3 or higher across all five dimensions is Stage 3 or above. A workload with any single dimension below 3 stays at Stage 1 or Stage 2 regardless of how impressive the demo looked. The math is unforgiving on purpose, because the alternative is signing a build contract for a workflow nobody can actually reach.
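The scoring rule above is simple enough to write down directly. The following is a minimal sketch, not Arkeo's assessment tool: the dimension names follow the article, while the integer 0 to 5 scale, the field names, and the `assess` function are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Dimension names follow the article; the identifiers are assumptions.
DIMENSIONS = (
    "policy_and_access",
    "data_reach",
    "workflow_integration",
    "action_and_approval",
    "ownership_after_ship",
)
THRESHOLD = 3  # any single dimension below this is a no-go for the workload


@dataclass
class Assessment:
    workload: str
    gaps: List[str]   # dimensions that scored below THRESHOLD
    stage_band: str   # "Stage 3 or above" or "Stage 1 or Stage 2"
    go: bool


def assess(workload: str, scores: Dict[str, int]) -> Assessment:
    """Apply the rule: every dimension must reach THRESHOLD, no averaging."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"unscored dimensions: {missing}")
    gaps = [d for d in DIMENSIONS if scores[d] < THRESHOLD]
    go = not gaps
    band = "Stage 3 or above" if go else "Stage 1 or Stage 2"
    return Assessment(workload, gaps, band, go)
```

Note that the function never averages. A workload that scores 5 on four dimensions and 0 on policy (personal accounts) still returns a no-go with `policy_and_access` listed as the gap to close.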
Place your business on the maturity curve in 60 minutes
The free AI Assessment audits one of your workflows against the five GenAI dimensions, places it on the stage curve, and tells you the shortest path to Stage 3. No pitch deck.
Book Your Free AI Assessment →
Asked where the business actually stands, the honest answer is almost always "lower than the leadership team thinks." Across mid-market deployments, the same self-assessment gap repeats. Marketing has a copilot, finance has shadow ChatGPT, sales has a half-built integration, HR has a ban, and the executive summary reads "we are operationalizing AI across the business." The summary is the problem. Replace it with a per-workload table and the next move becomes obvious within an hour.
Paid copilot seats without integration are a faster typewriter, not an operating system.
Two patterns are worth naming explicitly because they show up in nearly every assessment. The first is the copilot-equals-maturity confusion: a company purchases enterprise copilot seats, declares itself at Stage 2 or Stage 3, and never measures whether any output reaches the workflow it was sold to improve. Paid seats are necessary but not sufficient; without the integration step, the copilot is a faster typewriter. The second is the pilot-purgatory loop: a single department runs a structured pilot that produces a measured uplift, leadership celebrates, the pilot does not generalize, and twelve months later the budget is reallocated. The Deloitte State of Generative AI Wave 4 survey of 2,773 C-suite and director-level leaders across 14 countries quantifies this exactly: more than two-thirds of enterprises expect 30% or fewer of their GenAI experiments to be fully scaled within the next three to six months. The shorthand: most pilots will not survive a budget cycle.

Each transition has a single dominant move. Naming the move is more useful than naming the stage, because the move is what the team actually does on Monday.
STAGE-TO-STAGE MOVES
Name the move, not the stage. The move is what the team does on Monday.
STAGE 1 → 2
Publish an acceptable-use policy, buy enterprise seats with no-training data terms, and inventory which teams are using what. Off-the-shelf copilots typically land in days and run about $20 to $30 per user per month at the enterprise tier.
STAGE 2 → 3
Pick one workflow, map it on paper, audit the data, design the approval points, and connect GenAI to the system where the work moves. A scoped single-workflow build typically runs $15K to $40K and 6 to 10 weeks to production.
STAGE 3 → 4
Move from drafting to action: let the agent call APIs, query systems, and write records under designed approvals. Stand up a second workflow in a second department to prove the pattern generalizes.
STAGE 4 → 5
Move agents under a managed lifecycle with shared permissions, audit logs, and a Manage phase. For regulated or sensitive data, private or on-premise deployment runs about 8 to 12 weeks. The marginal cost of agent number four becomes days, not quarters.
The fastest credible path is Stage 1 to Stage 3 in one workflow. Skip steps, stall the program.
Those cost and timeline ranges are operator estimates from Arkeo's own builds, not sourced benchmarks. The first quick win typically lands inside 30 to 90 days when Stage 2 is reached cleanly. Arkeo has been deploying GenAI agents on this pattern since 2023, including the agents that run Arkeo itself: we use what we sell.
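For a rough budget line, the Stage 1 to 2 seat range and the Stage 2 to 3 build range quoted above can be combined with simple arithmetic. This is a sketch only: the seat count is a hypothetical input, and adding twelve months of seats to one build to get a first-year figure is an assumption of this example, not a pricing model from the article.

```python
# Ranges quoted in the text (USD). They are operator estimates, not benchmarks.
SEAT_COST_PER_USER_MONTH = (20, 30)   # enterprise copilot tier, low and high
BUILD_COST = (15_000, 40_000)         # scoped single-workflow build, low and high


def first_year_range(seats: int, months: int = 12):
    """Return (low, high) for seats over `months` plus one workflow build."""
    seat_low = seats * SEAT_COST_PER_USER_MONTH[0] * months
    seat_high = seats * SEAT_COST_PER_USER_MONTH[1] * months
    return (seat_low + BUILD_COST[0], seat_high + BUILD_COST[1])


# Hypothetical example: 50 paid seats for a full year plus one build.
low, high = first_year_range(50)
```

With 50 seats, the seat line runs from $12,000 to $18,000 for the year, so the combined range is $27,000 to $58,000. The build dominates at this size, which is why the workflow audit should come before the purchase order.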
Picture a 200-person specialty manufacturer where one GenAI step now lives inside the quoting workflow. The model reads incoming RFQs, extracts the spec data, queries the ERP for stock and lead time, drafts a quote, routes the draft to the sales engineer for a 30-second review, and writes the approved quote back to the CRM. The data pipeline is documented. The integration uses the ERP's REST API plus a Slack-based approval queue. A named internal owner runs a 30-minute morning check; Arkeo runs a quarterly review under the Manage phase. The agent has a runbook, an owner, an SLA, and a row on the operating P&L. The company can describe in one paragraph why this workflow was first, what it cost, what it saves, and what is shipping next.
That is what Stage 3 looks like in practice. It is not a moonshot. It is a normal operating system upgrade with a generative model inside one well-chosen workflow. The companies that get there first are not the largest in their sector. They are the ones that ran the audit before the purchase order and operated one workflow in production for a full quarter before scaling.
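The quoting workflow described above can be sketched as a short pipeline with one designed approval point. Everything here is illustrative: the `Quote` fields, the `run_quoting_step` function, and the four injected callables are hypothetical stand-ins for the model call, the ERP query, the Slack-based approval queue, and the CRM write. None of them is a real vendor API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Quote:
    rfq_id: str
    part: str
    quantity: int
    in_stock: int
    lead_time_days: int
    approved: bool = False


def run_quoting_step(
    rfq_text: str,
    extract_spec: Callable[[str], dict],         # model reads the RFQ (stand-in)
    query_erp: Callable[[str], dict],            # stock and lead time (stand-in)
    request_approval: Callable[[Quote], bool],   # sales engineer review (stand-in)
    write_to_crm: Callable[[Quote], None],       # CRM write-back (stand-in)
) -> Optional[Quote]:
    """Draft a quote from an RFQ and write it back only after human approval."""
    spec = extract_spec(rfq_text)
    erp = query_erp(spec["part"])
    draft = Quote(
        rfq_id=spec["rfq_id"],
        part=spec["part"],
        quantity=spec["quantity"],
        in_stock=erp["in_stock"],
        lead_time_days=erp["lead_time_days"],
    )
    # Designed approval point: nothing reaches the CRM without a human yes.
    if not request_approval(draft):
        return None
    draft.approved = True
    write_to_crm(draft)
    return draft
```

Passing the integrations in as callables is a design choice for the sketch. It keeps the approval gate in one visible place and lets the step be exercised with stubs before any real system is connected.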
How the generative AI maturity model differs from its neighbors is the question that breaks most maturity conversations. The generative AI maturity model owns the diagnosis of the current state for a GenAI workload. The broader AI maturity model covers the full data, infrastructure, and culture readiness picture across all AI workloads. Strategy, sequencing, and the 30/90/12-month plan live in the strategy cluster, not here. ROI math and use-case prioritization live in the ROI cluster. Keeping the lines clean keeps the assessment honest.
The right reading order for a mid-market operator: place each workload on the GenAI maturity curve (this article), score the broader readiness picture against the parent AI readiness framework, decide the sequence and timeline in the strategy cluster, then justify the build with the ROI math. Skipping the diagnosis is how a perfectly logical roadmap crashes at month three on a data integration nobody priced.
Audit your workflows to see if you are ready for custom agents
Arkeo's free AI Assessment scores one of your workflows against the five GenAI maturity dimensions and gives you a go, fix-first, or no-go decision you can take to the board.
Book Your Free AI Assessment →