From AI activity to meaningful, scaled impact

By Diana Spiridon · 6 June 2026

Johnson & Johnson ran around nine hundred GenAI projects across the business before CIO Jim Swanson sat down with his team and audited the portfolio. The result was awkward: 10 to 15 per cent of those projects were carrying about 80 per cent of the value, and a lot of work had quietly accumulated that wasn’t going anywhere useful. The decision, reported in April 2025, was to narrow the work to the high-impact set and stop investing in the rest.

Dell did something similar at smaller scale around the same time. CTO John Roese described starting with around 800 GenAI ideas circulating inside the business, narrowing first to 132 use cases, then to eight priority projects across services, supply chain, engineering and sales (reported January 2025).

I keep thinking about both of those audits, because I think they describe more than a J&J or Dell problem. Most large enterprises I work with are running somewhere between twenty initiatives and several hundred, and the leadership teams know they should be doing this kind of audit but haven’t found the time, the discipline, or the framework to do it properly.

Deloitte’s State of AI in the Enterprise 2026 — surveyed across 3,235 leaders in 24 countries, fieldwork August to September 2025 — found 74 per cent of organisations expect AI to drive revenue growth in the next year, with 20 per cent currently seeing it. The same study reports 34 per cent actively reimagining the business with AI, and 37 per cent using it at surface level — productivity gains without meaningful process change. Older surveys point the same way: Gartner’s May 2025 CIO survey (n=506) had 72 per cent of CIOs breaking even or losing money on AI; IBM’s 2025 CEO study (n=2,000, with Oxford Economics) found roughly 25 per cent of AI initiatives delivering the expected ROI and 16 per cent scaling enterprise-wide. The numbers vary by definition. The pattern doesn’t: the gap is less about awareness — leadership teams know AI matters — and more about translation.

When I sit with leadership teams trying to make sense of all this, the diagnosis usually comes from one of two places: the skills gap, or the state of the data. Both are real constraints. Neither, in my experience, is what’s actually holding ROI back. The binding constraint is focus. Most enterprises aren’t under-investing in AI — they’re over-spreading it. Twenty, fifty, sometimes hundreds of initiatives, each carrying a bit of attention and a bit of budget, with no honest sort between the ones that will drive return and the ones that are absorbing capital without changing the P&L.

HBR (April 2026, Bain–OpenAI co-authors) has a phrase for one piece of this: the micro-productivity trap. A company runs twenty or fifty AI initiatives, each one improving a task within the existing operating model, and somehow the firm-level numbers don’t move. Task-level productivity gains inside functions that account for only a small share of the company’s value rarely add up to a firm-level return. To get firm-level return you need focus: fewer bets, deeper investment, and at least one that changes something structural about what the customer is buying, what the business model is, or what new lines of business become possible.

The same squeeze is visible in corporate venturing. Frank Kumli’s review of new business building across the DACH market (June 2026) describes units recalibrating from broad exploration to leaner, strategically relevant setups — measurable impact is now the mandate, and weak relevance risks shutdown. Two of his observations carry straight over to AI portfolios. The bottleneck is not a shortage of ideas; it is a shortage of strategically aligned opportunities — winning teams work from clear search fields and disciplined selection, not open calls for use cases. And ‘strategic relevance is a cliff, not a hurdle’: a misaligned initiative doesn’t move slowly, it loses funding, commitment and survival. He makes the AI point too — AI-native models are an enabler only when tied to a clear strategic advantage, which is another way of saying that AI bolted onto a workflow is not a bet.

What follows is the method I use when a leadership team asks for help identifying the most significant, strategically aligned AI initiatives in a portfolio — and translating them into impact the business can actually scale. It runs in four phases — Scope, Shape, Validate, Back — drawn from eighteen years of running customer discovery, concept design, commercial validation and Lean Business Case work inside enterprises, adapted with the instrumentation that AI specifically requires. The sequence is plain: Scope is strategic alignment — where the organisation places its biggest attention. Shape is business design — each initiative given a form: process, product, service or new line. Validate is rapid validation and prioritisation. Back is planning — the case, the resource, the first hundred days. The phases describe the shape of the work, not a timetable; how long each takes depends on the size of the portfolio, the maturity of the customer evidence already in the building, and the seriousness with which the leadership team is willing to sort the existing initiatives. What’s set out below is detailed enough to function as a playbook on its own, if you want to run a version of this work without me.

Four shapes of AI bet

Every AI bet I look at takes one of four shapes. The mix in the portfolio tends to decide whether the work is going anywhere or just generating activity reports.

Process improvement. Internal workflows get faster, cheaper or more accurate. Copilots, summarisation, automated routing, AI-enhanced back-office work. Easy to start, visible early wins, probably necessary at this point. Almost never changes a company’s competitive position.

Product improvement. An existing product gets a new AI feature customers feel. A recommendation engine that genuinely changes what people buy. A pricing model that captures more margin without losing volume. A risk model that lets you underwrite a customer segment your competitors can’t. Extends the existing competitive edge without usually changing what the business fundamentally is.

Service improvement. An existing service is delivered at a higher standard or lower cost. AI-driven customer service. Advisory at scale. Moves margin, churn, satisfaction.

New line of business. A new revenue line opens because AI made some unit-economic constraint go away. A repositioning around a service model that wasn’t economic at human scale. A frontier-lab partnership that buys capabilities you’d otherwise spend five years building. The rarest shape, and the hardest case to write — time-to-revenue is longer and internal incentives rarely reward it.

Most portfolios are heavy on the first two, have some of the third, and almost none of the fourth. Quite a lot of what’s labelled ‘transformation’ inside the company is really process improvement promoted somewhere in the deck. There is no shame in a portfolio weighted toward the operational shapes: in most enterprises that is where the realised AI value sits, and a Core-heavy answer (Core being the tier of existing processes and products, defined in Phase 1) is a correct answer when the evidence says so.

One bet in the backed portfolio earns a different name: the AI business shifter — the bet that changes the company’s operating economics or competitive position. Sometimes that’s a new revenue line. More often it’s a deep process or service play: an agentic claims operation that settles in a day what took eleven, a quote turnaround competitors can’t match without rebuilding their operation. The shifter is held to a different evidentiary standard, because the bets with the hardest cases to write get killed by the same screen that passes the easy ones — which is how portfolios end up with activity instead of advantage.

Phase 1: Scope

The question. Where could AI matter for this specific business — across customer-facing arenas and internal CX, Ops, processes and capabilities — and which of the initiatives already in motion belong inside those territories?

Phase 1 runs in three steps: define the territories worth playing in (1A), align the in-flight initiatives against them (1B), then read the balance of the whole portfolio before any new ideation begins (1C).

1A: Territories

What you do. Map value pools across the enterprise — recurring leaks where money, time, capacity or risk are being spent — on both axes: external (customer-facing markets, segments, value streams) and internal (CX, Ops, processes, capabilities, infrastructure). Frame each as a strategic arena with a named buyer or operator. Gather Jobs-to-be-Done evidence — sampled, not uniform: deep (20+ conversations) on the two or three territories with the largest value pools and most strategic weight, moderate (8–10) on the next tier, desk-only on well-understood efficiency territories that don’t need fresh customer evidence. For each territory, capture two or three desired-outcome statements in Ulwick’s format — a direction, a metric and an object: ‘minimise the time from claim notification to settlement’ — rated for importance and current satisfaction. Those statements are what make a leak sized rather than asserted, and they travel with the bet all the way to the day-100 milestone. Reconcile every territory to the existing strategy: a stated priority it serves, or a gap the leadership team accepts in writing — alignment decides funding, commitment and survival, so it’s a gate condition, not a footnote. The surviving territories then double as the clear search fields disciplined venture teams work from: anything new gets generated inside them, never from an open call for use cases. Score each for right to play (capability, data, distribution, brand permission). Tier-classify each as Core (existing process or product), Adjacent (new incremental), or Transformative (substantially new). Deduplicate. Name the surviving 4–7.

Tools. Value-pool mapping (McKinsey’s value-pool lens, extended from markets to internal operations). Strategic arena framing (Roger Martin’s Strategic Choice Cascade — where to play, how to win). JTBD discovery interview guide (Christensen, Competing Against Luck, 2016). Desired-outcome statements (Ulwick, Outcome-Driven Innovation). Right-to-play capability lens (BCG, adapted). Tier classification grid (Nagji-Tuff Innovation Ambition Matrix; McKinsey Three Horizons). Portfolio Innovation Model (Accenture) — reads each territory against business maturity (legacy / growth / emerging) and innovation type (incremental / breakthrough / disruptive); in legacy businesses, non-incremental AI bets carry roughly twice the growth potential of incremental ones, which is the evidence for not over-weighting Scope toward process improvement.

Canvases. A3 Scoping Canvas (Board of Innovation) per candidate territory — challenge, why, customer segment, current situation, assumptions, questions, goal, related initiatives. Territory Map — one row per territory: value pool, JTBD and desired outcomes, strategic anchor, right-to-play note, tier, business maturity.

What comes out. Territory Map with 4–7 named territories. JTBD evidence log per territory. Right-to-play assessment per territory. Tier classification per territory.

Gate — a territory survives if: a specific value pool is named with leakage quantified or directionally bounded; at least one JTBD is evidenced in customer or operator language; it maps to a stated strategic priority (or surfaces a strategy gap leadership has accepted); there’s a defensible right to play, not just adjacency; the implied investment posture for the tier has been accepted.
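One way to make the importance and satisfaction ratings from 1A comparable across territories is Ulwick's opportunity score, importance plus the unmet gap. The sketch below is illustrative only: the 0–10 rating scale, the class and field names, the territory names and every number are assumptions, not part of the method as written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesiredOutcome:
    """One desired-outcome statement: a direction, a metric and an object."""
    territory: str
    statement: str
    importance: float    # assumed 0-10 scale, averaged across interviews
    satisfaction: float  # assumed 0-10 scale, current satisfaction

    def opportunity(self) -> float:
        # Ulwick's opportunity score: importance + max(importance - satisfaction, 0).
        # Over-served outcomes (satisfaction above importance) earn no bonus.
        return self.importance + max(self.importance - self.satisfaction, 0.0)


def rank_outcomes(outcomes):
    """Return outcomes with the largest under-served opportunity first."""
    return sorted(outcomes, key=lambda o: o.opportunity(), reverse=True)


# Illustrative ratings only; territory names and numbers are made up.
outcomes = [
    DesiredOutcome("Back office", "minimise the effort to reconcile supplier invoices", 5.0, 7.0),
    DesiredOutcome("Claims", "minimise the time from claim notification to settlement", 9.0, 3.0),
    DesiredOutcome("Broking", "minimise the time from broker submission to quote", 8.0, 6.0),
]

for o in rank_outcomes(outcomes):
    print(f"{o.opportunity():5.1f}  {o.territory}: {o.statement}")
```

A score like this helps size a leak rather than assert it, but it ranks outcomes, not territories; the strategic anchor and right-to-play checks in the gate still decide what survives.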

1B: Initiative alignment

Most enterprises arrive at this stage with hundreds of initiatives, most of them a single line in a tracker: loose, unsponsored, half-formed. You cannot classify hundreds of loose items one at a time; at fifteen minutes each, 400 items would take 100 hours, which is two and a half working weeks of triage alone. So 1B runs in two passes.

Pass 1 — Bulk triage. Normalise every initiative onto a one-line Initiative Intake card — name, sponsor (or “none”), territory guess, one-line description, status. Most cards come back 80% empty; that emptiness is the first signal. Then cut in bulk before any careful work: bulk-park the shadow activity (individuals using ChatGPT or Copilot for emails and summaries — an enablement and governance matter, not a portfolio decision); cluster the duplicates (eleven regional “predictive maintenance” pilots become one candidate); bulk-stop the orphans that map to no surviving territory and have no sponsor, under a single reason code. Hundreds of lines reduce to a few dozen distinct candidates.
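The three bulk cuts in Pass 1 can be sketched as a single pass over the intake cards. This is a minimal illustration, not a prescribed tool: the `IntakeCard` fields beyond the five named above (`shadow`, `cluster_key`), the reason code and the sample cards are all hypothetical, and it assumes someone has already tagged shadow activity and duplicate clusters by hand.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class IntakeCard:
    name: str
    sponsor: Optional[str]       # None stands for "none" on the card
    territory: Optional[str]     # territory guess; None if nothing fits
    description: str
    status: str
    shadow: bool = False         # hypothetical flag: individual ChatGPT/Copilot use
    cluster_key: Optional[str] = None  # hypothetical hand-assigned duplicate label


def bulk_triage(cards, surviving_territories):
    """Split cards into parked shadow activity, stopped orphans and clustered candidates."""
    parked, stopped = [], []
    clusters = defaultdict(list)
    for card in cards:
        if card.shadow:
            parked.append(card)  # enablement and governance matter, not a portfolio decision
        elif card.sponsor is None and card.territory not in surviving_territories:
            stopped.append((card, "ORPHAN_NO_TERRITORY_NO_SPONSOR"))  # single reason code
        else:
            # Duplicates share a cluster key; everything else is its own candidate.
            clusters[card.cluster_key or card.name].append(card)
    return parked, stopped, dict(clusters)


cards = [
    IntakeCard("Copilot for email drafts", None, None, "", "ad hoc", shadow=True),
    IntakeCard("Predictive maintenance (North)", "Plant ops lead", "Asset uptime", "", "pilot",
               cluster_key="predictive-maintenance"),
    IntakeCard("Predictive maintenance (South)", None, "Asset uptime", "", "pilot",
               cluster_key="predictive-maintenance"),
    IntakeCard("Chatbot experiment", None, None, "", "idea"),
]
parked, stopped, candidates = bulk_triage(cards, {"Asset uptime"})
print(len(parked), len(stopped), list(candidates))
```

Note that an unsponsored card inside a surviving territory is kept as a candidate here; only cards with neither a sponsor nor a territory are bulk-stopped, which matches the orphan definition above.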

Pass 2 — Detailed classification. Now the per-initiative work is affordable. For each surviving candidate, classify across three dimensions: output shape (Process / Product / Service / New line of business); tier (Core / Adjacent / Transformative); and AI embedment locus — ingredient (a model inside an existing product, workflow or service) or tool (an assistant people use directly). The locus matters: an AI tool used by people in certain areas is properly shaped as an internal product for the team using it, or as a process improvement for the process it improves — not as a customer-facing product. Apply the four AI-specific failure-mode tests. Issue a Stop / Pause / Proceed call per candidate with a written reason. An initiative too loose to classify is Paused, not Stopped — assigned back to a named owner with a two-week deadline to complete the Intake card, or it auto-stops. This neither kills good-but-unarticulated ideas nor waves through fog.

Tools. Initiative Intake card (Board of Innovation’s Innovation Project Template, adapted) — normalises each initiative to one line for bulk triage. A ten-question Stop / Pause / Proceed gate per candidate, in the Stage-Gate tradition (Cooper). The four AI-specific failure-mode tests: value-pool absence / AI bolted on; promoted productivity (process work labelled as new line of business); capability/data/talent gap; defensibility under commoditising APIs. AI embedment locus classifier.

Canvases. Initiative-to-Territory Grid — one row per surviving candidate: territory, shape, tier, embedment locus, spend and FTE, failure-mode test results, Stop/Pause/Proceed call, reason code. Board of Innovation’s AI Opportunity Radar, arena-adapted, as a coverage check.

What comes out. Initiative-to-Territory Grid for the surviving candidates. Stop/Pause/Proceed list with reason codes — every initiative accounted for, including the bulk-parked and bulk-stopped, so nothing returns under a new name next quarter. Orphan list. Phase 1 Decision Memo for sponsor and leadership team.

Gate — an initiative is stopped if: no surviving territory accepts it; it fails any of the four AI-specific failure-mode tests without credible remediation; it duplicates an initiative already further advanced in the same territory; it can’t be expressed as one of the four output shapes; or the AI embedment locus can’t be named. An initiative too loose to classify is Paused with an owner and a deadline, not stopped outright.
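The 1B gate reads naturally as an ordered set of checks. The sketch below is one possible encoding under stated assumptions: "too loose to classify" is modelled as an incomplete intake card and checked first, the stop conditions are tested in the order listed above, and the `Candidate` fields, reason strings and sample data are hypothetical. The ten-question gate itself is not reproduced here.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from enum import Enum
from typing import Dict, Optional, Set, Tuple


class Call(Enum):
    STOP = "Stop"
    PAUSE = "Pause"
    PROCEED = "Proceed"


SHAPES = {"Process", "Product", "Service", "New line of business"}
LOCI = {"ingredient", "tool"}


@dataclass
class Candidate:
    name: str
    owner: str
    card_complete: bool                      # False = too loose to classify
    territory: Optional[str] = None
    shape: Optional[str] = None
    locus: Optional[str] = None
    duplicates_more_advanced: bool = False
    # failure-mode test name -> True if credible remediation exists
    failed_tests: Dict[str, bool] = field(default_factory=dict)


def gate(c: Candidate, territories: Set[str], today: date) -> Tuple[Call, str]:
    """Return a Stop / Pause / Proceed call with a written reason."""
    if not c.card_complete:
        deadline = today + timedelta(days=14)  # the two-week deadline
        return Call.PAUSE, f"Intake card incomplete; {c.owner} to complete by {deadline}, else auto-stop"
    if c.territory not in territories:
        return Call.STOP, "No surviving territory accepts it"
    for test, remediated in c.failed_tests.items():
        if not remediated:
            return Call.STOP, f"Fails failure-mode test without remediation: {test}"
    if c.duplicates_more_advanced:
        return Call.STOP, "Duplicates a more advanced initiative in the same territory"
    if c.shape not in SHAPES:
        return Call.STOP, "Cannot be expressed as one of the four output shapes"
    if c.locus not in LOCI:
        return Call.STOP, "AI embedment locus cannot be named"
    return Call.PROCEED, "Passes the 1B gate"


today = date(2026, 6, 6)
territories = {"Claims"}
loose = Candidate("Chat assistant idea", "J. Doe", card_complete=False)
good = Candidate("Agentic claims triage", "Claims COO", True, "Claims", "Process", "ingredient")
bolted = Candidate("Claims summary bot", "IT lead", True, "Claims", "Process", "tool",
                   failed_tests={"AI bolted on": False})
for c in (loose, good, bolted):
    call, reason = gate(c, territories, today)
    print(f"{c.name}: {call.value} ({reason})")
```

Returning the reason alongside the call is the point of the design: every line in the Stop / Pause / Proceed list carries its own written justification.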

1C: Portfolio balance

What you do. Step back from the individual initiatives and read the sorted portfolio as a whole, across four views (Board of Innovation’s Innovation Portfolio Management mappings). Strategic alignment: group every surviving initiative under the strategic objective it serves — the orphans, the ones that serve no stated objective, are the grey zone. Ambition: plot the portfolio across Core / Adjacent / Transformative; almost every enterprise portfolio over-clusters in Core. Resource allocation: size each initiative by the investment behind it and check it against its tier — a small bet in a Transformative territory or a large bet stuck in Core is, in Board of Innovation’s phrase, bound to fail. Blockers: a funnel view of where initiatives are stalling, and why.

The output is the one slide a leadership team rarely has: not a list of initiatives, but the shape of the whole portfolio — where it’s concentrated, where it’s thin, and where the resourcing contradicts the ambition. Accenture’s portfolio research frames the same choice as Innovation for Longevity (most investment defends the legacy business) versus Innovation for Balance (investment spread across legacy, growth and emerging) — and its core rule is to allocate to a business’s future potential, not today’s revenue.

Two things this view almost always surfaces. First, the transformative bet is nearly invisible at scale: in a portfolio of dozens or hundreds, the one genuinely transformative candidate is usually starved — a couple of people, no budget — while Core dashboards carry large teams. The ambition mapping and the resource-ambition check are what find it. If no transformative candidate survives at all, the method triggers a deliberate ‘generate one’ workstream rather than accepting a portfolio with no direction-changer in it. Second, blockers are usually portfolio-wide, not per-initiative: forty ‘data gap’ pauses are one data-readiness problem, and it’s escalated as a single programme, not forty fixes.
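A rough version of the four views can be computed straight from the Initiative-to-Territory Grid. The sketch below is an assumption-laden illustration: the two thresholds are placeholders a leadership team would set for itself, the field names are hypothetical, and the portfolio is invented to show the pattern described above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

TIERS = ("Core", "Adjacent", "Transformative")

# ASSUMED thresholds for illustration; the method does not prescribe numbers.
MIN_TRANSFORMATIVE_FTE = 5.0   # below this a Transformative bet counts as starved
MAX_SINGLE_CORE_SHARE = 0.25   # one Core bet above this share of spend counts as oversized


@dataclass
class Initiative:
    name: str
    tier: str
    spend: float
    fte: float
    objective: Optional[str] = None  # strategic objective served; None = grey-zone orphan
    blocker: Optional[str] = None    # why it is stalling, if it is


def balance_read(portfolio):
    """Summarise ambition spread, orphans, resource mismatches and shared blockers."""
    total = sum(i.spend for i in portfolio)
    if total <= 0:
        raise ValueError("portfolio has no spend to analyse")
    ambition = {t: sum(i.spend for i in portfolio if i.tier == t) / total for t in TIERS}
    orphans = [i.name for i in portfolio if i.objective is None]
    mismatches = []
    for i in portfolio:
        if i.tier == "Transformative" and i.fte < MIN_TRANSFORMATIVE_FTE:
            mismatches.append((i.name, "starved Transformative bet"))
        elif i.tier == "Core" and i.spend / total > MAX_SINGLE_CORE_SHARE:
            mismatches.append((i.name, "oversized Core bet"))
    blockers = Counter(i.blocker for i in portfolio if i.blocker)
    return {"ambition": ambition, "orphans": orphans, "mismatches": mismatches, "blockers": blockers}


portfolio = [
    Initiative("Ops dashboards", "Core", 2_000_000, 20, "Cost"),
    Initiative("Invoice copilot", "Core", 400_000, 4, "Cost", blocker="data gap"),
    Initiative("Agentic claims", "Transformative", 100_000, 2, "Growth", blocker="data gap"),
    Initiative("Voice experiments", "Adjacent", 500_000, 3),
]
read = balance_read(portfolio)
print(read["ambition"])
print(read["orphans"], read["mismatches"], dict(read["blockers"]))
```

Counting blockers across the whole portfolio, rather than per initiative, is what turns many separate 'data gap' pauses into a single data-readiness programme.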

Tools. Innovation Portfolio Management mappings — Strategic Alignment, Innovation Ambition, Resource Allocation, Blocker Identification (Board of Innovation). Innovation for Longevity vs Innovation for Balance (Accenture). Resource-ambition mismatch check (Board of Innovation).

Canvases. Portfolio Balance Map — Board of Innovation’s four mappings on a single view: strategic alignment, ambition spread, resource-to-ambition fit, funnel blockers.

What comes out. A one-page portfolio balance read. A named list of grey-zone orphans (no strategic objective). A named list of resource-ambition mismatches. A stated portfolio posture — Longevity or Balance — the leadership team has chosen on purpose, rather than drifted into.

Gate — Phase 1 closes when: the leadership team has seen the whole-portfolio shape, accepted (or corrected) the ambition spread and the resource-to-ambition fit, named the orphans to stop, and signed the Phase 1 Decision Memo — strategic alignment on the opportunity areas and the direction for the high-priority initiatives. Only then does new ideation begin in Shape.

Phase 2: Shape

The question. For each surviving initiative, what’s the specific problem, who’s the customer or operator, what’s the business rationale, and what shape does the solution take?

What you do. Run per-initiative problem-and-customer discovery — 5–8 conversations per initiative, sharper than the territory-level JTBD work of Phase 1A. Then write one desired-outcome statement per initiative — direction, metric, object, carried down from the territory work: ‘minimise the time from broker submission to quote’. That statement becomes the Why in the framing canvas and the evaluation metric in the concept; problem, concept and test stay on one thread. Frame the problem using the AI Problem Framing Canvas across four lenses: Why (the outcome and its metric), What (solution shape), Who (human), How (data and feasibility). Run a half-day convergent-divergent ideation workshop per initiative — diverge into possible shapes, converge on the most defensible. Complete a Concept Canvas per surviving idea. Name the customer or operator, the business rationale, and the sponsor.

Prototyping, user testing, DVF balance and commercial sizing don’t happen here. Phase 2’s job is to produce concepts clear enough to be tested, not to test them.

Tools. Per-initiative discovery interview guide (IDEO discovery interviewing). Desired-outcome statements (Ulwick). Convergent-divergent workshop kit — Crazy 8s, Rose-Thorn-Bud, How-Might-We reframing, dot voting, speed critique (Design Council Double Diamond; Design Sprint pattern, Knapp). The four-shape typology used as design discipline. Three AI-specific concept fields. Bain’s AI Deployment Matrix (Personal Productivity / Amplified Intelligence / Embedded Assistant / Digital Worker) as a sanity-check on the embedment and autonomy of each concept.

Canvases. AI Problem Framing Canvas (Design Sprint Academy, A3) — four lenses capturing business domain, KPI, status quo, customer in their own words, data feasibility, technical feasibility, legal and compliance risk. Concept Canvas (Strategyzer Business Model Canvas and Lean Canvas base, with three AI-specific fields added) — six base fields (job-to-be-done, user, value, channel, cost, revenue) plus data dependency, model/vendor posture, evaluation approach; the evaluation field references the desired-outcome metric rather than inventing a new one. Value Proposition Canvas (Strategyzer) inside the workshop. Concept Register carrying every concept across phases.

What comes out. 5–8 shaped opportunity concepts. A completed Concept Canvas, AI Problem Framing Canvas and Value Proposition Canvas per concept. A named business owner per concept. Updated Concept Register.

Gate — a concept survives if: the specific problem is named and defensible (not “we want to use AI”); the customer or operator is named (specific, addressable, not a category); the business rationale is clear (revenue, cost, capacity or risk, named with directional magnitude); the output shape is unambiguous and accepted by the sponsor; the Concept Canvas is complete with no “TBD” in the three AI-specific fields. A new-line-of-business concept is held to a higher bar: the business rationale must be specific to a JTBD the organisation doesn’t serve today.
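The completeness condition on the Concept Canvas is easy to check mechanically. A minimal sketch follows, assuming the canvas is held as a simple dictionary: the key names, the placeholder list and the sample canvas are hypothetical, and the substring test for the outcome metric is deliberately crude.

```python
BASE_FIELDS = ("job_to_be_done", "user", "value", "channel", "cost", "revenue")
AI_FIELDS = ("data_dependency", "model_vendor_posture", "evaluation_approach")
PLACEHOLDERS = {"", "tbd", "n/a", "?"}  # assumed list of non-answers


def canvas_gaps(canvas):
    """Return the canvas fields that are missing or still hold a placeholder."""
    gaps = []
    for name in BASE_FIELDS + AI_FIELDS:
        value = str(canvas.get(name, "")).strip().lower()
        if value in PLACEHOLDERS:
            gaps.append(name)
    return gaps


def ai_fields_complete(canvas):
    """Gate condition: no placeholder in any of the three AI-specific fields."""
    return not any(name in AI_FIELDS for name in canvas_gaps(canvas))


def evaluation_reuses_outcome_metric(canvas, outcome_metric):
    """Crude check that the evaluation field cites the desired-outcome metric."""
    return outcome_metric.lower() in str(canvas.get("evaluation_approach", "")).lower()


# Hypothetical canvas for illustration.
canvas = {
    "job_to_be_done": "Get a firm quote back to the broker the same day",
    "user": "Commercial lines underwriter",
    "value": "Faster quote turnaround",
    "channel": "Underwriting workbench",
    "cost": "Inference plus integration",
    "revenue": "Higher quote-to-bind conversion",
    "data_dependency": "Five years of broker submissions",
    "model_vendor_posture": "TBD",
    "evaluation_approach": "Golden set of past submissions; track time from broker submission to quote",
}
print(canvas_gaps(canvas), ai_fields_complete(canvas))
```

A check like this only catches empty fields; whether the problem is defensible and the customer specific remains a judgement for the sponsor.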

Phase 3: Validate

The question. Which shaped concepts hold up against real customers or operators, real economics, real feasibility and AI-specific risk — and how do they rank when leadership trades them off?

What you do. This is the phase teams skip most often. They build a prototype, it works, they call it validation. A prototype only tells you something is technically possible. It tells you nothing about whether anyone will pay for it, whether it survives at production scale, or whether it stays useful when the model is updated and the behaviour changes.

Build a rough prototype per concept — fake landing page, clickable mock-up, Wizard-of-Oz front-end, video storyboard or paper prototype. Run 15–30 user or operator conversations against it; log what people did, not what they predicted. Run willingness-to-pay probes for paid bets, saved-cost / saved-time probes against a measured two-week baseline for internal bets. Run a competitor and alternatives feature grid, including the cheap substitute (paper, phone, Excel, incumbent). Check delivery readiness while you check the money: does the data, talent and platform the bet needs actually exist, or does the first tranche have to fund a data product before anything else? Apply DVF balance — Desirable, Viable, Feasible.

Alongside the commercial work, the AI-specific risk is sized — not solved, sized — across seven areas. Evaluation design: golden dataset, evaluation harness, documented thresholds for accuracy, hallucination and latency, with regression evals re-run on every model or prompt change. Data and drift: data product readiness; distribution drift in the inputs, concept drift in the input-output relationship, and version drift when the vendor silently updates the endpoint. Inference economics: unit cost, latency and cost variance at production volume — where most 2024–25 cases broke. Regulatory posture: EU AI Act risk tier, GPAI obligations, UK sectoral regulators where relevant (ICO, FCA, PRA, MHRA, CMA). Security: prompt injection and the attack surface of tool-calling agents — a claimant-submitted document that manipulates a settlement agent is a 2026 risk, not a hypothetical (OWASP’s LLM Top 10 is the working reference). Autonomy, oversight and accountability: for any agentic bet, what it may do unsupervised, where the human sits in or on the loop, the blast radius, the kill-switch, the action log — and who answers for an autonomous action, named rather than assumed. Vendor and data-rights lock-in: switchability, data portability, and what happens to your data in the vendor’s training pipeline. These checks don’t end at the gate — evaluation, drift and security monitoring carry into production as a lifecycle, which is how the NIST AI Risk Management Framework intends it.

Underneath, the riskiest-assumption work, in the Discovery-Driven Planning tradition (McGrath & MacMillan): score the top 5–8 assumptions per concept on Risk × Impact (each rated 1–5, giving a product of 1–25); design the smallest test for the top three. Label every claim Strong, Moderate or Weak, where Strong means observed: someone did the thing, with money or time, rather than saying they would. Score every concept on the DVF+ scorecard and the Investability Scorecard. Rank on the combined strategic + commercial value mix.
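The Risk × Impact scoring and the top-three selection can be sketched in a few lines. This assumes each factor is rated on a 1–5 scale (which is what a 1–25 product implies); the class name, the sample assumptions and their ratings are invented for illustration.

```python
from dataclasses import dataclass

EVIDENCE_LABELS = ("Strong", "Moderate", "Weak")


@dataclass
class Assumption:
    text: str
    risk: int               # assumed 1-5: how likely the assumption is to be wrong
    impact: int             # assumed 1-5: how much damage if it is wrong
    evidence: str = "Weak"  # Strong means observed behaviour, not stated intent

    def __post_init__(self):
        if not (1 <= self.risk <= 5 and 1 <= self.impact <= 5):
            raise ValueError("risk and impact must each be between 1 and 5")
        if self.evidence not in EVIDENCE_LABELS:
            raise ValueError(f"evidence must be one of {EVIDENCE_LABELS}")

    @property
    def score(self) -> int:
        return self.risk * self.impact  # 1-25


def riskiest(assumptions, n=3):
    """Return the n highest-scoring assumptions: the ones that get a test designed."""
    return sorted(assumptions, key=lambda a: a.score, reverse=True)[:n]


# Hypothetical assumption map for one concept.
assumptions = [
    Assumption("Underwriters accept model suggestions", 3, 4),
    Assumption("Brokers will pay for same-day quotes", 5, 5),
    Assumption("Submission data is clean enough to use", 4, 4, "Moderate"),
    Assumption("Vendor API stays stable", 2, 3),
    Assumption("Inference cost per quote stays below margin", 4, 5),
]
for a in riskiest(assumptions):
    print(f"{a.score:2d}  [{a.evidence}]  {a.text}")
```

Keeping the evidence label on the same record as the score means the Bet Card and the Lean Business Case can quote both without re-keying.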

Phase 3 ends with a leadership alignment session. Surface trade-offs across the ranked portfolio. Confirm the top 3. Name the AI business shifter — the bet that changes the operating economics or competitive position, whichever shape it takes. A Transformative candidate that isn’t top-3-ready doesn’t have to be killed or crowned: it can be held as a small staged option with a named re-decision date. Agree the kill list with reasons.

Tools. Rough prototyping kit (Bland & Osterwalder, Testing Business Ideas). Customer Development testing discipline (Blank). WTP three-question probe sheet — what did you spend last time, what’s a fair price, whose budget. Rapid quant survey at n ≈ 1,000 where feasible. Competitor and alternatives feature grid. Readiness checklist — data, talent, platform. R-W-W screen as the gate logic — is it real, can we win, is it worth it (George Day, Wharton). DVF balance grid (IDEO, Tim Brown, Change by Design). Evaluation design template (NIST AI Risk Management Framework; Google’s ML Test Score for production readiness). Security checklist (OWASP LLM Top 10). Assumption Map (Bland & Osterwalder; McGrath & MacMillan, Discovery-Driven Planning). Evidence labels. Investability Scorecard.

Canvases. Service Blueprint and Customer Journey Map (service design standards; IDEO). Bet Card — one page per concept, updated each phase: problem, user, shape, desired outcome, evidence labels per claim, AI risk note, top assumptions, owner. Portfolio Ranking Sheet — two-axis: strategic value (strategic fit + capability/data readiness + defensibility + regulatory exposure) and commercial value (market size + time-to-revenue).

What comes out. A ranked portfolio of all Phase 2 survivors. Top 3 named, each with claim-level evidence labels, AI-specific risk note, Assumption Map, Investability Scorecard score. Kill list with reason codes. Leadership Alignment Outcome — trade-offs surfaced, top 3 confirmed, the AI business shifter named, any staged option held with its re-decision date.

Gate — a concept enters Phase 4 if: it places in the top 3 on combined strategic + commercial value; the strongest commercial claim is Strong — observed, not stated; WTP is bounded for paid bets, or saved-cost / saved-time bounded against a real baseline for internal bets; the top three Risk × Impact assumptions have defined success thresholds and a viable test design; AI-specific risk is documented across all seven areas — gaps flagged, not hidden; leadership has aligned at the closing session. The shifter runs on its own standard: directional on value, Strong on desirability — looser on size, never on truth.
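The two-axis ranking behind the top-3 condition can be sketched from the six scorecard dimensions. How the axes are combined is not specified above, so this sketch makes loud assumptions: every dimension is scored 1–5 with 5 always the favourable end (so 5 on regulatory exposure means least exposed), each axis is the mean of its dimensions, and the two axes are weighted equally. Concept names and scores are invented.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Scorecard:
    concept: str
    # Strategic-value dimensions (assumed 1-5, 5 = most favourable)
    strategic_fit: int
    readiness: int             # capability + data readiness
    defensibility: int
    regulatory_exposure: int   # 5 = least exposed
    # Commercial-value dimensions (assumed 1-5, 5 = most favourable)
    market_size: int
    time_to_revenue: int       # 5 = fastest

    @property
    def strategic(self) -> float:
        return mean([self.strategic_fit, self.readiness,
                     self.defensibility, self.regulatory_exposure])

    @property
    def commercial(self) -> float:
        return mean([self.market_size, self.time_to_revenue])

    @property
    def combined(self) -> float:
        # ASSUMPTION: equal weight per axis, so four strategic dimensions
        # do not outvote two commercial ones.
        return 0.5 * self.strategic + 0.5 * self.commercial


def top_three(cards):
    return sorted(cards, key=lambda c: c.combined, reverse=True)[:3]


cards = [
    Scorecard("Agentic claims", 5, 3, 4, 2, 4, 3),
    Scorecard("Same-day quotes", 4, 4, 3, 4, 4, 4),
    Scorecard("Invoice copilot", 3, 5, 2, 5, 2, 5),
    Scorecard("Voice concierge", 2, 2, 2, 3, 3, 2),
]
for c in top_three(cards):
    print(f"{c.combined:.3f}  {c.concept}  (strategic {c.strategic:.2f}, commercial {c.commercial:.2f})")
```

Rank alone does not pass the gate: a top-3 concept still needs a Strong commercial claim, bounded WTP or savings, defined tests and documented AI risk, and the shifter is judged on its own standard.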

Phase 4: Back

The question. For the top 3 from Phase 3, what’s the best approach to move forward, and what’s the 100-day plan to back it?

What you do. Run best-approach analysis per top-3 concept — build internally, buy or acquire, partner, joint venture, or hybrid, with sequencing. The answer depends on Phase 3 evidence, which is why it’s settled here, not earlier. And for most AI bets in 2026 the durable answer has a shape: buy the model, own the data product, the evaluation harness and the orchestration layer, and design for model swap — the model layer is commoditising fast enough that what you build should assume the underlying model changes within a year. Part of the analysis is asking what the frontier will hand you for free next quarter, so you don’t build it. Write a Lean Business Case V1 per backed bet — bounded sizing, buyer evidence, cost-to-serve, gross-margin shape, three- and five-year projections, assumptions confidence-tagged with the evidence labels from Phase 3, and one named outcome target per bet — the desired-outcome metric the day-100 milestone is measured against.

Stage a capital plan against decision gates the CFO has accepted — optionality, not instalments: each tranche names its gate, its decision right and its kill trigger, and buys the right, not the obligation, to continue. Write the 100-Day Plan for the AI business shifter — day-30 first decision gate, day-100 milestone gate, named kill conditions, GTM steps, and the evaluation harness and production monitoring stood up as day-one deliverables, not afterthoughts. Write lighter commitment cards for the other top-3 bets — focused execution beats spread attention. Fund any held Transformative option at its agreed small scale, with its re-decision date in the diary. Publish the portfolio kill list across all four phases with reason codes. Name a business owner per backed bet, accountable to a P&L line, not a project sponsor. Pre-agree kill conditions in writing.

End with a governance hand-off note: the cadence the leadership team adopts after the engagement ends.

Tools. Best-approach analysis. Lean Business Case V1, T-shirt sized with confidence-tagged assumptions. Staged capital in the Stage-Gate tradition (Cooper), priced as real options (Myers, MIT) — fund to the next gate, not the whole journey. 100-Day Plan template. Commitment cards. Portfolio Kill List. Governance cadence — quarterly portfolio review using Board of Innovation’s four Innovation Portfolio Management disciplines (Strategic Alignment, Innovation Ambition, Resource Allocation, Blocker Identification); annual Investability Scorecard re-score; continuous evaluation, drift and security monitoring on every live bet; a quarterly frontier re-test on anything being built — would we still build this, or has it become a buy; Stop / Pause / Proceed on missed gates; annual kill-list review.

Canvases. Lean Business Case Canvas V1 — revenue, GM, opex, EBITDA, 3- and 5-year totals, capex required, core assumptions with confidence levels, sensitivity per assumption, one outcome target. 100-Day Plan Canvas — milestones, decision gates, owner, kill conditions, GTM steps, eval harness and monitoring plan, weekly evidence ask. Bet Card, final version.

What comes out. Best-approach decision per top-3 concept. Lean Business Case V1 per backed bet (internal working instrument). Staged capital plan (internal working instrument). 100-Day Plan for the AI business shifter (walk-away). Lighter commitment cards for the other top-3 bets (walk-away). Any held Transformative option, funded small with a named re-decision date. Portfolio Kill List with reason codes (walk-away). Governance Hand-off Note (walk-away).

Gate — a bet is backed if: the best-approach analysis is settled with rationale grounded in Phase 3 evidence; the Lean Business Case V1 stands up to finance scrutiny with evidence labels visible per assumption; the capital plan stages commitment against decision gates the CFO has accepted; a named business owner exists; kill conditions are explicit and pre-agreed in writing.
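The staged capital plan can be held as a short list of tranches, each carrying its gate, decision right and kill trigger. The sketch below is illustrative: the amounts, the third "Scale gate", the decision-holders and the kill triggers are invented, and it assumes tranche i funds the work up to gate i, with passing that gate releasing the next tranche.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tranche:
    amount: float
    gate: str            # the gate this tranche funds the work up to
    decision_right: str  # who decides at the gate
    kill_trigger: str    # pre-agreed condition that ends funding


def released(plan, gates_passed):
    """Capital released so far: the first tranche plus one more per gate passed."""
    return sum(t.amount for t in plan[: gates_passed + 1])


def still_optional(plan, gates_passed):
    """Capital not yet committed: the right, not the obligation, to continue."""
    return sum(t.amount for t in plan) - released(plan, gates_passed)


def next_gate(plan, gates_passed):
    """The tranche whose gate is decided next, or None once every gate is passed."""
    return plan[gates_passed] if gates_passed < len(plan) else None


# Hypothetical plan for one backed bet.
plan = [
    Tranche(150_000, "Day-30 decision gate", "CFO and business owner",
            "golden-dataset accuracy below the documented threshold"),
    Tranche(400_000, "Day-100 milestone gate", "CFO and business owner",
            "outcome metric not moving against the measured baseline"),
    Tranche(1_200_000, "Scale gate", "Investment committee",
            "unit cost at production volume above margin"),
]

print(released(plan, 0), still_optional(plan, 0), next_gate(plan, 0).gate)
```

Seen this way, the exposure before the day-30 gate is the first tranche only; the rest of the plan is an option that lapses the moment a kill trigger fires.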

Tools that run across all four phases

Five artefacts travel through the method, accumulating evidence as concepts move from phase to phase.

The Concept Register is the single source of truth across Phases 2–4 — one row per concept: status, output shape, territory, owner, evidence labels, Bet Card link.

The Assumption Log captures every assumption as it surfaces, ranked Risk × Impact, tagged with an evidence label, with a test designed for the top three to six at each phase (Bland & Osterwalder, Testing Business Ideas). Persists into the 100-Day Plan.

The Bet Card is one page per surviving concept, updated at each phase. It replaces the slide-deck-per-bet pattern and is what the sponsor reads in the steering meeting. It captures problem, user, shape, DVF+ score, evidence labels per claim, AI risk note, top assumptions and owner.

The evidence labels — Strong / Moderate / Weak, where Strong means observed behaviour with money or time behind it — are applied to every claim on every Bet Card. The label is what makes a Lean Business Case credible to a CFO rather than hopeful.

The Investability Scorecard is the internal ranking instrument that produces the strategic + commercial value mix at Phase 3, and is re-scored annually as part of the governance hand-off. Six dimensions: market size, strategic fit, capability + data readiness, time-to-revenue, defensibility, regulatory exposure.
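The Assumption Log's ranking step can be sketched in a few lines of Python. This is an illustration rather than the method's own tooling: the 1-to-5 scales, the field names and the sample assumptions are all hypothetical. The only parts taken from the text are the Risk × Impact ranking, the evidence label on each entry, and the idea of designing tests for the top few.

```python
from dataclasses import dataclass


@dataclass
class Assumption:
    text: str
    risk: int      # 1 (low) to 5 (high); the scale is an assumption
    impact: int    # 1 (low) to 5 (high); the scale is an assumption
    evidence: str  # "Strong", "Moderate" or "Weak"

    @property
    def score(self) -> int:
        # Rank on Risk x Impact, as the Assumption Log does.
        return self.risk * self.impact


def pick_tests(log: list[Assumption], top_n: int = 3) -> list[Assumption]:
    """Return the top_n assumptions by Risk x Impact, highest first."""
    ranked = sorted(log, key=lambda a: a.score, reverse=True)
    return ranked[:top_n]


# Hypothetical log entries for a single concept.
log = [
    Assumption("Buyers will pay per seat", risk=5, impact=5, evidence="Weak"),
    Assumption("Data is licensed for this use", risk=3, impact=5, evidence="Moderate"),
    Assumption("Ops team adopts within a quarter", risk=4, impact=3, evidence="Weak"),
    Assumption("Two-second latency is acceptable", risk=2, impact=2, evidence="Strong"),
]

for a in pick_tests(log):
    print(a.score, a.evidence, a.text)
```

Keeping the evidence label next to the score makes it easy to see which high-ranked assumptions are still resting on Weak evidence, and those are the ones that most need a test.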

Running it yourself

This is the minimum viable version for a leadership team that wants to run it without external help.

In Scope, export the current AI initiative list. Don’t filter; the universe is the artefact. Before sorting it, define territories: map value pools on both axes (external arenas and internal CX/Ops/processes/capabilities), frame each as a strategic arena with a named buyer or operator, gather JTBD evidence (deep on the top two or three territories, lighter on the rest), and classify each by tier. Then return to the inventory in two passes. Bulk triage: normalise every initiative to a one-line intake card, bulk-park the shadow ChatGPT/Copilot use, cluster the duplicates and bulk-stop the unsponsored orphans, so that hundreds reduce to a few dozen. Detailed classification: for each survivor, classify shape, tier and AI embedment locus (ingredient or tool), apply the four AI-specific failure-mode tests, and run Stop / Pause / Proceed with a written reason; anything too loose to classify is Paused with an owner and a two-week deadline. Then read the whole sorted portfolio across the four mappings before any ideation begins.
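For teams that keep the inventory in a spreadsheet export, the bulk-triage pass can be sketched roughly as follows. Everything here is an assumption made for illustration: the IntakeCard fields, the crude duplicate key (the lower-cased one-liner) and the sample rows. In practice, clustering duplicates takes human judgement.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class IntakeCard:
    name: str
    one_liner: str           # normalised one-line description
    sponsor: Optional[str]   # None means no named sponsor
    shadow_tool_use: bool    # ad hoc ChatGPT/Copilot use, not a real initiative


def bulk_triage(cards):
    """Return (survivors, parked, stopped) after the three bulk moves."""
    # 1. Bulk-park shadow tool use.
    parked = [c for c in cards if c.shadow_tool_use]
    rest = [c for c in cards if not c.shadow_tool_use]

    # 2. Cluster duplicates on a crude key (hypothetical rule).
    clusters = defaultdict(list)
    for c in rest:
        clusters[c.one_liner.strip().lower()].append(c)

    # 3. Keep one sponsored representative per cluster; stop orphans.
    survivors, stopped = [], []
    for group in clusters.values():
        sponsored = [c for c in group if c.sponsor]
        if sponsored:
            survivors.append(sponsored[0])
        else:
            stopped.extend(group)
    return survivors, parked, stopped


# Hypothetical inventory rows.
cards = [
    IntakeCard("Sales copilot", "Draft replies to customer emails", "VP Sales", False),
    IntakeCard("Service mail bot", "draft replies to customer emails", None, False),
    IntakeCard("Team ChatGPT", "General ChatGPT use in marketing", None, True),
    IntakeCard("Invoice reader", "Extract fields from supplier invoices", None, False),
]

survivors, parked, stopped = bulk_triage(cards)
print([c.name for c in survivors], [c.name for c in parked], [c.name for c in stopped])
```

In this sketch, duplicates that share a cluster with a sponsored initiative are folded into it rather than listed separately. A real pass would record that merge alongside the written reason.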

In Shape, run per-initiative problem-and-customer discovery (5–8 conversations per initiative), then write one desired-outcome statement each — direction, metric, object. Use the AI Problem Framing Canvas for each. Run a half-day convergent-divergent workshop. Complete a Concept Canvas per surviving idea. For each candidate, force the answer to: which AI unit-economic constraint going away makes this bet possible? If the honest answer is ‘none — we’re just doing this cheaper’, it’s a process improvement; label it as such. The new-line-of-business shape is held to a higher bar: name a JTBD the organisation doesn’t serve today.

In Validate, build a rough prototype per concept and run 15 conversations against it. WTP probes for paid bets are non-negotiable; saved-cost / saved-time probes for internal bets. For internal bets run a two-week baseline against pre-agreed metrics, then a two-week controlled introduction, then a written comparison. Check the data, talent and platform exist before you score feasibility. Apply DVF balance. For every bet write a one-page evaluation design (golden dataset, thresholds for accuracy, hallucination, latency, inference cost per unit). Write a one-page regulatory posture note (AI Act tier, named sectoral regulators, named data-law exposure). For anything agentic, write the autonomy boundary — what it may do unsupervised, the kill-switch, the action log — and red-team the prompt-injection surface. Name 5–8 riskiest assumptions, score Risk × Impact, design the smallest test for the top three. Apply evidence labels honestly — Strong only when behaviour was observed. Rank on strategic + commercial value mix. Close with a leadership alignment session.
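The one-page evaluation design reduces to a small set of thresholds that a bet either meets or misses on the golden dataset. A minimal sketch follows, assuming hypothetical field names and threshold values; the four measures (accuracy, hallucination, latency, inference cost per unit) are the ones named above, and the numbers are placeholders to be agreed per bet.

```python
from dataclasses import dataclass


@dataclass
class EvalThresholds:
    min_accuracy: float
    max_hallucination_rate: float
    max_p95_latency_s: float
    max_cost_per_unit: float  # inference cost per unit of work, any currency


@dataclass
class EvalResult:
    accuracy: float
    hallucination_rate: float
    p95_latency_s: float
    cost_per_unit: float


def failed_checks(result: EvalResult, t: EvalThresholds) -> list[str]:
    """Return the names of the thresholds this run missed."""
    failures = []
    if result.accuracy < t.min_accuracy:
        failures.append("accuracy")
    if result.hallucination_rate > t.max_hallucination_rate:
        failures.append("hallucination_rate")
    if result.p95_latency_s > t.max_p95_latency_s:
        failures.append("p95_latency_s")
    if result.cost_per_unit > t.max_cost_per_unit:
        failures.append("cost_per_unit")
    return failures


# Placeholder thresholds and one hypothetical run on the golden dataset.
thresholds = EvalThresholds(
    min_accuracy=0.90,
    max_hallucination_rate=0.02,
    max_p95_latency_s=3.0,
    max_cost_per_unit=0.05,
)
run = EvalResult(accuracy=0.93, hallucination_rate=0.04,
                 p95_latency_s=2.1, cost_per_unit=0.03)

print(failed_checks(run, thresholds))
```

Returning the list of missed thresholds, rather than a single pass/fail flag, keeps the result usable as a written reason in a Stop / Pause / Proceed decision.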

In Back, settle the best approach per top-3 concept — and default to buying the model, owning the data, evals and orchestration, and designing for model swap. Write a one-page Lean Business Case V1 per backed bet with one outcome target. Stage the capital to the next gate only. Write the 100-Day Plan for the AI business shifter — eval harness and monitoring as day-one deliverables — and a lighter commitment card for the other top-3 bets. Publish the kill list with reasons. Name an accountable business owner per backed bet. Write the governance hand-off note.

Three things the method depends on, however it’s run. First: the candidate list leaves Scope in customer or operator language; initiatives in internal language can’t be commercially validated honestly. Second: the AI business shifter is scored under a different evidentiary standard from the operational shapes; applying one screen to both kills the wrong bets. Third: every claim carries an evidence label; the label is the difference between a credible portfolio and a hopeful one.

Where this method sits next to the strategy houses

The tier-1 firms have done useful diagnostic work, and the leadership teams I work with have usually read most of it. The method isn’t designed to replace any of them. It sits upstream of them, in the place where most AI engagements assume the decision has already been made.

McKinsey QuantumBlack, BCG and Bain Vector are built for multi-month delivery engagements with large embedded teams. QuantumBlack’s Turo product, for instance, is performance management for AI portfolios already in production, not a selection mechanism. Operating-model and reinvention specialists like Deloitte (whose December 2025 research names shared ownership between the CIO or CTO, the CFO and the CSO as a predictor of higher AI returns) and Board of Innovation (whose AI Reinvention Blueprint integrates strategy, ways-of-working, human-centred design, operating-model redesign and venture-building) then build the organisation around an agreed strategy. BCG’s ‘Widening AI Value Gap’ report (September 2025) names value-based prioritisation as a core element of one of its five ‘future-built’ strategies (specifically ‘reshape and invent the business with value-based prioritisation of AI initiatives and rigorous tracking of results’) without publicly packaging the work that produces the prioritisation.

The method described here sits earlier in the cycle. It’s the upstream decision that makes the strategy houses’ work focused and the resulting investment proportionate. If your binding constraint is operating-model redesign, you want Deloitte or Board of Innovation. If your binding constraint is multi-year delivery at scale, you want a tier-1 firm. If your binding constraint is which three or five bets to actually back, this is the method for that.

What this method isn’t

It isn’t a technology assessment. The Concept Canvas surfaces the build/buy/partner question for each bet, but the actual technology choices come after the bet has been scoped and decided.

It isn’t an audit of last year’s pilots. The work pays attention to what’s been tried, because past pilots usually contain useful information about what’s plausible inside this company, but it doesn’t retro-score them.

It isn’t a workshop series. The cadence concentrates time with a small group of senior people and the people closest to the existing initiatives.

It isn’t a replacement for the longer Discovery and Incubation work that needs to happen afterwards, for the two or three bets that earn it. Those bets need real customer development, real prototyping, real commercial validation at scale, and real MLOps and evaluation work. What this method does is make sure that investment, attention and execution effort go into the right two or three, rather than being spread across a portfolio that wasn’t designed.

From AI chaos to focused, strategically aligned action

The board doesn’t usually ask the leadership team for three AI bets. The leadership team arrives at the question itself, around the moment it becomes obvious that the activity is consuming capital without producing return — and that the fix isn’t more capital but sharper focus.

What that produces is a rebalanced portfolio. The top two or three bets get disproportionate resource, time and capital so they can scale. The others get repositioned — paused for re-evaluation, folded into adjacent bets, or stopped — with reasons recorded so the same idea doesn’t return six months later under a different name. Momentum follows focus. Returns follow momentum.

Three questions a leadership team should be ready to put on the table:

What big bets are you placing over the next 6, 12 and 36 months?
How will you translate these into clear priorities for your teams?
What will you stop, pause or reposition to make room for them?

This piece is a practical guide for that work: how to turn twenty or fifty unsorted AI initiatives into a smaller, strategically aligned set the leadership team can act on with momentum. Not a longer list, ranked. A shorter one — capital, attention and authority concentrated where they matter.

Sources and further reading

· Johnson & Johnson AI portfolio rationalisation (CIO Jim Swanson, April 2025, 10–15% of ~900 projects = 80% of value): pymnts.com / deeplearning.ai

· Dell GenAI rationalisation (CTO John Roese, January 2025, 800 ideas → 132 use cases → 8 priority projects): fortune.com

· Gartner, “AI Investment Framework for CxOs” (May 2025 CIO survey, n=506; 72% breaking even or losing money): gartner.com

· IBM Institute for Business Value with Oxford Economics, 2025 CEO Study (n=2,000): newsroom.ibm.com

· MIT NANDA, “The GenAI Divide: State of AI in Business 2025”: fortune.com summary

· HBR, “How to Move from AI Experimentation to AI Transformation” (Bain & OpenAI co-authors, April 2026; the ‘micro-productivity trap’): hbr.org

· BCG, The Widening AI Value Gap: Build for the Future 2025, September 2025 (the five ‘future-built’ strategies): bcg.com

· Deloitte, State of AI in the Enterprise 2026 (n=3,235 leaders, 24 countries, fieldwork August–September 2025): deloitte.com

· Deloitte Insights, “How the right mix of C-suite leadership can drive outsized AI returns”, December 2025: deloitte.com

· Tim Brown, Change by Design (2009), for the IDEO desirability/viability/feasibility triad: ideo.com

· Strategyzer, Business Model Canvas and Value Proposition Canvas (Alex Osterwalder), as the lineage of the Concept Canvas: strategyzer.com

· Tony Ulwick, Outcome-Driven Innovation, for the desired-outcome statement format: strategyn.com

· Rita McGrath & Ian MacMillan, “Discovery-Driven Planning” (HBR, 1995), for assumption-led validation: hbr.org

· George Day, “Is It Real? Can We Win? Is It Worth Doing?” (HBR, 2007), for the portfolio screening logic: hbr.org

· Bansi Nagji & Geoff Tuff, “Managing Your Innovation Portfolio” (HBR, 2012), for the Core / Adjacent / Transformational ambition tiers: hbr.org

· NIST, AI Risk Management Framework (Govern / Map / Measure / Manage), for the AI-risk lifecycle: nist.gov

· OWASP, Top 10 for LLM Applications, for the prompt-injection and agent security checks: owasp.org

· Marco Iansiti & Karim Lakhani, “Competing in the Age of AI” (HBR, 2020), on AI value as an operating-model question: hbr.org

· Frank Kumli, “The State of New Business Building” (June 2026), on strategic alignment, search fields and opportunity discipline in corporate venturing: linkedin.com

· McKinsey QuantumBlack, “Turo” (AI portfolio performance management): mckinsey.com

· Board of Innovation, AI Reinvention Blueprint: boardofinnovation.com

· Board of Innovation, Innovation Portfolio Management (Strategic Alignment, Innovation Ambition, Resource Allocation, Blocker Identification mappings): boardofinnovation.com

· Accenture, Innovation Portfolio Management and Governance (Portfolio Innovation Model; Innovation for Longevity vs Innovation for Balance): accenture.com

· The AI Growth Decisions Programme — one-pager: dianaspiridon.com/services