
Change without revolution: how to move 40 product teams off a legacy stack without a freeze

How to migrate 40 product teams off a legacy stack without a freeze, a reorg, or a leap of faith. The technical patterns and coordination model that make it work.

Paul Utr 19 June 2026

You have a migration to run and forty product teams whose output cannot pause. Strategy decisions (which phases, which timeline, which systems first) are settled. The open question is how engineers from forty independently functioning teams execute a migration without freezing production or breaking shared systems, and still arrive at consistent implementations across the organisation.

Five patterns make a migration executable across forty independently functioning teams: the strangler fig, branch by abstraction, feature flags, dark launches, and decision logs.

Why the coordination problem is harder than the technical problem

A single product team with ten engineers can migrate on their own timeline. The blast radius of a mistake is contained to one team’s output and one recovery window. At forty teams, coordination becomes harder than the technical work itself.

The technical patterns for incremental migration are well understood. Migrations at scale still fail because teams implement them inconsistently, discover incompatibilities late, and have no mechanism for making shared decisions without scheduling meetings that take two to three weeks to produce a resolution.

Technical approaches let each team migrate independently without breaking the shared system. Coordination mechanics keep forty teams consistent without requiring everyone to align before anyone can move.

The strangler fig: the core pattern

Martin Fowler named the strangler fig pattern in 2004. The name comes from the Queensland strangler fig, a tree that grows around a host, gradually takes its nutrients, and eventually replaces it. The new system and the old coexist throughout the transition, with traffic gradually routed from old to new as each component is validated.

The migration runs in three phases.

Transform. Build new components alongside the existing system. Each new component receives the same inputs as the old and produces the same outputs. It does not yet serve production traffic.

Coexist. Route a portion of production traffic to the new component, keeping the old as fallback. Both systems produce output; the results are compared. This is the longest phase and the most valuable: it runs the new system under production load before it carries production responsibility.

Eliminate. Once the new component is validated at full production load, the old component is decommissioned and traffic routing is updated.

The key property of this pattern is that each component migration is independent. Team A can be in the Coexist phase while Team B is still in Transform and Team C is in Eliminate. The system continues to serve traffic throughout.
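The per-component independence described above can be sketched as a routing facade. This is a minimal illustration, not a reference implementation: the names `StranglerFacade`, `Component`, and `Phase` are hypothetical, and percentage-based routing during Coexist is left out to keep the phase logic visible.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict

Handler = Callable[[str], str]


class Phase(Enum):
    TRANSFORM = "transform"  # new component exists but serves no production traffic
    COEXIST = "coexist"      # new component serves traffic, old one is the fallback
    ELIMINATE = "eliminate"  # old component has been decommissioned


@dataclass
class Component:
    legacy: Handler
    new: Handler
    phase: Phase = Phase.TRANSFORM


class StranglerFacade:
    """Single entry point that routes each component according to its own phase."""

    def __init__(self) -> None:
        self._components: Dict[str, Component] = {}

    def register(self, name: str, component: Component) -> None:
        self._components[name] = component

    def set_phase(self, name: str, phase: Phase) -> None:
        self._components[name].phase = phase

    def handle(self, name: str, request: str) -> str:
        component = self._components[name]
        if component.phase is Phase.TRANSFORM:
            return component.legacy(request)
        if component.phase is Phase.COEXIST:
            try:
                return component.new(request)
            except Exception:
                # Assumption: any failure in the new path falls back to the old one.
                return component.legacy(request)
        return component.new(request)  # ELIMINATE: only the new component remains


# Each team moves its own component without touching anyone else's phase.
facade = StranglerFacade()
facade.register("search", Component(lambda r: f"legacy:{r}", lambda r: f"new:{r}"))
facade.register("billing", Component(lambda r: f"legacy:{r}", lambda r: f"new:{r}"))
facade.set_phase("search", Phase.COEXIST)
```

Because phase is stored per component, changing `search` to Coexist leaves `billing` on the legacy path.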

AWS’s prescriptive guidance on cloud migration lists the strangler fig as a recommended pattern because it maintains operational continuity. The pattern was originally described for monolith-to-microservices transitions but applies equally to content platform migrations, data platform migrations, and any architectural change where the existing system must continue operating during the transition.

Branch by abstraction: the code-level pattern

The strangler fig operates at the system and traffic-routing level. Branch by abstraction is the corresponding code-level pattern: it handles the case where the old system is embedded in the codebase rather than isolated behind a clean interface.

The pattern, described by Martin Fowler and named by Paul Hammant, works in four steps.

1. Create an interface or adapter that captures all the ways the codebase interacts with the component being replaced. The existing code continues to use the existing component, but now does so through the abstraction layer.
2. Build the new component behind the same abstraction. Both implementations exist simultaneously. The codebase uses the abstraction; the abstraction can be configured to point to either implementation.
3. Migrate call sites gradually. Move parts of the codebase from the old implementation to the new through the abstraction. Teams can progress at their own pace because the abstraction layer insulates unmigrated code from changes in the new implementation.
4. Remove the old implementation once no code paths use it. The abstraction layer may remain for future flexibility or be collapsed if the indirection is no longer warranted.

For a CMS migration, the abstraction is the content provider layer that sits between the frontend and the CMS.
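As a rough sketch of what that content provider layer might look like, assuming a hypothetical `ContentProvider` interface with a single `get_article` method (real CMS clients will expose far more), the four steps reduce to something like this:

```python
from typing import Dict, Protocol, Set


class ContentProvider(Protocol):
    """Step 1: the abstraction every call site talks to."""

    def get_article(self, slug: str) -> Dict[str, str]: ...


class LegacyCmsProvider:
    """Wraps the existing CMS; behaviour is unchanged, only the access path is."""

    def get_article(self, slug: str) -> Dict[str, str]:
        return {"slug": slug, "source": "legacy-cms"}


class NewCmsProvider:
    """Step 2: the replacement, built behind the same abstraction."""

    def get_article(self, slug: str) -> Dict[str, str]:
        return {"slug": slug, "source": "new-cms"}


def provider_for(call_site: str, migrated_call_sites: Set[str]) -> ContentProvider:
    """Step 3: call sites move one at a time; the rest stay on the legacy provider."""
    if call_site in migrated_call_sites:
        return NewCmsProvider()
    return LegacyCmsProvider()


def render_article_page(slug: str, provider: ContentProvider) -> str:
    # Frontend code depends only on the abstraction, never on a concrete CMS.
    article = provider.get_article(slug)
    return f"<h1>{article['slug']}</h1><!-- served by {article['source']} -->"


# Hypothetical state: only the article page has been migrated so far.
migrated = {"article_page"}
html = render_article_page("hello-world", provider_for("article_page", migrated))
```

Step 4 is then a deletion: once every call site is in `migrated_call_sites`, `LegacyCmsProvider` and the selection logic can go.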

Feature flags and dark launches

The strangler fig and branch by abstraction handle structural separation of old and new. Feature flags work differently: they control the exposure of new behaviour to production traffic without requiring a deployment.

A feature flag is a conditional in the code that enables or disables a code path based on a configuration value rather than a deployment. In a large-scale migration, flags serve three distinct purposes.

Gradual rollout. Route a percentage of requests to the new implementation (1%, then 5%, then 25%, then 100%) rather than switching all traffic at once. If problems surface at 5%, the flag rolls back to 0% without a deployment, and the old implementation absorbs full traffic while the issue is addressed.
Team-level independence. Individual teams can enable the new implementation for their specific services or user segments independently of what other teams are doing. Team A’s flag state does not affect Team B’s. Each team migrates on its own timeline with its own validation window.
Dark launch. Run the new implementation in parallel with the old, processing the same production requests but not exposing the new system’s outputs to users. Both produce responses; only the old system’s response is returned; the new system’s response is logged and compared. Discrepancies surface bugs before users are affected. This is the most effective risk-reduction technique in a large-scale migration: the new system earns production traffic by proving it produces identical results on production load, not on test data.
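The parallel-run comparison described above can be sketched in a few lines. This is a simplified, synchronous illustration under stated assumptions: `dark_launch` and `on_divergence` are hypothetical names, and a production version would more likely run the candidate off the request path so it cannot add latency.

```python
import logging
from typing import Callable, Optional, TypeVar

Request = TypeVar("Request")
Response = TypeVar("Response")

logger = logging.getLogger("migration.dark_launch")


def dark_launch(
    request: Request,
    legacy: Callable[[Request], Response],
    candidate: Callable[[Request], Response],
    on_divergence: Optional[Callable[[Request, Response, Response], None]] = None,
) -> Response:
    """Serve the legacy response; run the candidate only to compare outputs."""
    primary = legacy(request)
    try:
        shadow = candidate(request)
    except Exception:
        # A failing candidate must never affect what the user receives.
        logger.exception("candidate failed for request %r", request)
        return primary
    if shadow != primary:
        logger.warning(
            "divergence for %r: legacy=%r candidate=%r", request, primary, shadow
        )
        if on_divergence is not None:
            on_divergence(request, primary, shadow)
    return primary
```

The caller always receives the legacy response; divergences and candidate failures only ever reach the log and the optional callback.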

For organisations operating at 40-team scale, feature flag infrastructure should be a precondition of the migration. The coordination cost of managing migration state across forty teams without flag-driven traffic control is prohibitive. With it, each team’s migration state is visible in the flag configuration and individually controllable without requiring coordination with other teams.
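One way to picture per-team flag state with percentage rollout is below. It is a sketch under assumptions: the `ROLLOUT_PERCENT` mapping and team names are invented for illustration, and a real flag service would hold this state outside the codebase so it can change without a deployment.

```python
import hashlib
from typing import Dict

# Hypothetical flag state: percentage of each team's traffic on the new path.
ROLLOUT_PERCENT: Dict[str, int] = {"team-a": 25, "team-b": 0}


def bucket(flag: str, request_id: str) -> int:
    """Map a request to a stable bucket in 0..99 so decisions are repeatable."""
    digest = hashlib.sha256(f"{flag}:{request_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def use_new_implementation(
    team: str, request_id: str, rollout: Dict[str, int] = ROLLOUT_PERCENT
) -> bool:
    # Teams with no entry stay on the old implementation by default.
    percent = rollout.get(team, 0)
    return bucket(team, request_id) < percent
```

Hashing rather than random sampling means a request that was in the 5% cohort is still in the cohort at 25%, and setting a team back to 0 returns all of its traffic to the old path without affecting any other team.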

Decision logs: the coordination layer

Technical patterns handle execution. Consistent decision-making across forty teams over 12 to 18 months is a different kind of problem.

Decision logs are a structured record of the decisions that constrain future choices, not sprint notes or meeting summaries. For every significant architectural or process decision made during the migration, the log records: what was decided, why it was decided, what alternatives were considered, who made the decision, and when it should be revisited if new information changes the context.
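The fields listed above map naturally onto a small record type. The sketch below is one possible shape, not a prescribed schema; the field names and the example values are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class DecisionLogEntry:
    number: int
    decision: str                       # what was decided
    rationale: str                      # why it was decided
    alternatives_considered: List[str]  # what else was on the table
    decided_by: str                     # who made the decision
    decided_on: date
    revisit_when: str                   # condition that should trigger a revisit

    def missing_fields(self) -> List[str]:
        """Return the names of fields that were left empty."""
        text_fields = {
            "decision": self.decision,
            "rationale": self.rationale,
            "decided_by": self.decided_by,
            "revisit_when": self.revisit_when,
        }
        missing = [name for name, value in text_fields.items() if not value.strip()]
        if not self.alternatives_considered:
            missing.append("alternatives_considered")
        return missing


# Illustrative values only; not taken from a real log.
entry = DecisionLogEntry(
    number=1,
    decision="Serve all content through the content provider abstraction",
    rationale="Keeps unmigrated call sites insulated from the new CMS",
    alternatives_considered=["Direct CMS access per team"],
    decided_by="migration authority",
    decided_on=date(2026, 1, 15),
    revisit_when="A team shows the abstraction adds unacceptable latency",
)
```

A check like `missing_fields` is a cheap way to stop entries that record the decision but not the reasoning, which is the part later teams actually need.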

A team encountering an architectural question in month eight that was settled in month two does not need to reconstruct the reasoning from meeting notes. The decision log entry contains the reasoning, the alternatives considered, and the constraints that drove the choice.

Decision logs also serve as the mechanism for escalation. When a team finds that a previous decision produces a suboptimal outcome in their specific context, the log provides the baseline for a structured reconsideration: “Decision #47 specified that all content should be served through the abstraction layer. Team Bravo has a use case where direct database access would reduce latency by 80ms. Requesting reconsideration with the following context.”

Without this, teams make local exceptions to established patterns without documentation. The result is a codebase where migration decisions are implemented inconsistently and the reasoning for each team’s approach is irretrievable.

Dual-running: operating both systems in production

During the Coexist phase of a strangler fig migration, both the old and new systems run in production simultaneously. Dual-running is a designed state rather than an accident of the transition, and it should be operated as one.

Three operational commitments make it workable at scale.

Observability parity. The new system needs monitoring, alerting, and logging as comprehensive as the old system’s from the moment it receives production traffic. Deploying new components with development-quality observability and then discovering that production failures cannot be diagnosed is a common and avoidable failure. Observability for the new system should be a definition-of-done requirement for the Transform phase.

Clear ownership during the overlap. Ownership during dual-running should be explicit: whoever maintained the old system owns incidents there, whoever built the new system owns incidents there. Incidents that span both need a named escalation path. Without it, dual-running incidents become everyone’s problem, which in practice means nobody’s until they escalate.

Defined exit criteria. Dual-running ends when the new system has earned readiness to absorb full production load through evidence rather than assertion: it has run at 100% traffic volume for a defined period without incident, the output comparison from the dark launch phase showed zero or acceptable divergence, and rollback has been explicitly removed as an option in the flag configuration. Dual-running without exit criteria becomes permanent, and permanent dual-running is the most expensive outcome of a migration that does not complete.
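Exit criteria are easier to hold to when they are written down as a check rather than a judgement. The sketch below assumes hypothetical thresholds (14 days at full traffic, zero divergence by default); the actual values are for each organisation to define.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DualRunEvidence:
    days_at_full_traffic_without_incident: int
    divergence_rate: float             # fraction of compared responses that differed
    rollback_removed_from_flags: bool  # old path no longer selectable via the flag


def unmet_exit_criteria(
    evidence: DualRunEvidence,
    required_days: int = 14,           # assumed threshold, not a recommendation
    max_divergence_rate: float = 0.0,  # "zero or acceptable" divergence
) -> List[str]:
    """Return the criteria still blocking the end of dual-running."""
    unmet: List[str] = []
    if evidence.days_at_full_traffic_without_incident < required_days:
        unmet.append("full-traffic period without incident is too short")
    if evidence.divergence_rate > max_divergence_rate:
        unmet.append("output divergence exceeds the accepted rate")
    if not evidence.rollback_removed_from_flags:
        unmet.append("rollback is still available in the flag configuration")
    return unmet
```

An empty list means the old component can be decommissioned; anything else names exactly what evidence is still missing.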

The team coordination model

A 40-team migration coordination model rests on three components, each addressing a different failure mode.

A migration platform team. The platform team’s job is to make every other team’s migration easier, measured by migration velocity across the organisation rather than by the number of services they migrate themselves. They build and maintain the abstraction layers, feature flag infrastructure, traffic routing configuration, and decision log tooling. They are an enablement function.

Migration state visibility. Every team’s migration state should be visible to the organisation: which components are in Transform, Coexist, or Eliminate, and which have not started. A simple status board is sufficient. The purpose is coordination, not oversight: teams can see who has solved similar problems and ask for guidance, and the organisation can identify teams that are blocked and need platform team support.
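A status board of this kind needs very little machinery. As a minimal sketch, assuming a hypothetical mapping of team to component to phase (the team and component names are made up):

```python
from collections import Counter
from typing import Dict, List

PHASES = ["not started", "transform", "coexist", "eliminate"]

# Hypothetical board: team -> component -> current phase.
Board = Dict[str, Dict[str, str]]


def summarise(board: Board) -> Dict[str, int]:
    """Count components in each phase across the whole organisation."""
    counts = Counter(
        phase for components in board.values() for phase in components.values()
    )
    return {phase: counts.get(phase, 0) for phase in PHASES}


def teams_in_phase(board: Board, phase: str) -> List[str]:
    """List teams with at least one component in the given phase."""
    return sorted(
        team for team, components in board.items() if phase in components.values()
    )


board: Board = {
    "team-a": {"search": "coexist", "cart": "transform"},
    "team-b": {"billing": "not started"},
}
```

`teams_in_phase(board, "coexist")` answers the question a blocked team is most likely to ask: who has already been through this phase.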

An escalation protocol that does not require a meeting. In a 40-team organisation, every decision that requires a scheduled meeting takes two to three weeks to resolve. A designated migration authority (a principal engineer, a migration architect, or a fractional technical lead) provides written resolution instead. The protocol: open a decision log entry, tag the migration authority, expect a written response within 48 hours.
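The 48-hour commitment only works if overdue escalations are visible. A small sketch of how that might be tracked follows; the `Escalation` type and its fields are assumptions for illustration, not part of any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

RESPONSE_WINDOW = timedelta(hours=48)


@dataclass
class Escalation:
    decision_entry: int                    # decision log entry being escalated
    opened_at: datetime
    resolved_at: Optional[datetime] = None

    @property
    def due_at(self) -> datetime:
        return self.opened_at + RESPONSE_WINDOW

    def is_overdue(self, now: datetime) -> bool:
        # Only unresolved escalations can be overdue.
        return self.resolved_at is None and now > self.due_at


opened = datetime(2026, 6, 1, 9, 0, tzinfo=timezone.utc)
escalation = Escalation(decision_entry=47, opened_at=opened)
```

Surfacing `is_overdue` on the same board as migration state keeps the written-response commitment honest without adding a meeting.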

Closing

A team without clean system interfaces, feature flag infrastructure, and clear decision-making authority should build those first. They are the preconditions that make the migration executable, and the investment pays back quickly once it is running.


FAQ

What is the strangler fig pattern in software engineering?

The strangler fig pattern is an incremental migration approach where new system components are built alongside the existing system, production traffic is gradually routed from old to new, and old components are retired as new ones are validated. Named by Martin Fowler in 2004, it is recommended in AWS's prescriptive guidance as a standard approach for migrating large systems without a production freeze.

What is branch by abstraction?

Branch by abstraction is a code-level technique for replacing a component embedded in a codebase without branching the entire codebase. An abstraction layer captures all interactions with the component. The new component is built behind the same abstraction. Parts of the codebase are migrated from the old implementation to the new through the abstraction, and the old implementation is removed once no code paths use it. Different parts of the codebase can be at different stages of migration simultaneously.

What is a dark launch in the context of a migration?

A dark launch runs the new system in parallel with the old, processing the same production requests but not exposing the new system's outputs to users. Both systems produce responses; only the old system's response is returned to the user; the new system's response is logged and compared. Discrepancies reveal bugs in the new implementation before users are affected.

How do you coordinate a migration across 40 engineering teams?

Three structural elements are required: a dedicated migration platform team whose job is to make other teams' migrations easier, a visible migration state for all teams, and a decision escalation protocol that does not require a scheduled meeting. Decision log entries reviewed by a designated migration authority within 48 hours replace the two-to-three-week meeting cycle.

What is a decision log and why does it matter for large-scale migrations?

A decision log records significant architectural and process decisions made during a migration: what was decided, why, what alternatives were considered, who decided, and when to revisit. At 40-team scale over 12 to 18 months, the alternative is a codebase where the same decisions are made differently by different teams and the reasoning for each team's approach is lost.

When is an incremental migration not appropriate?

When the existing system is so internally coupled that isolating components behind clean interfaces would cost more than rebuilding from scratch. A second condition: when the team is small enough that a single planned cutover is simpler than running two systems in parallel. For a five-person team migrating a single product, a planned cutover with a short production freeze may be the right call. The incremental approach earns its overhead at scale, roughly ten or more teams with shared infrastructure.

Author

Paul Utr

Co-founder & Co-CEO

Paul has been launching online platforms since his teens, picking up UX and product design by building them. He led the Mailgun redesign at Netguru and was Principal Designer at Ramp Network through its seed-to-Series-B run. At WAYF he leads design and organisational alignment, and watches how language carries through every product we ship.
