Engineering Practitioner Brief / 18 May 2026
Strangler Fig Migration Cost
The strangler-fig pattern is one of the most widely used patterns for long-running migrations in software engineering. It avoids the big-bang failure mode by shipping value continuously while replacing the old system one piece at a time. The cost is real and the duration is long, but the published case histories show it works at scale. This page breaks down the per-component cost so a team can size a strangler-fig migration before committing.
The Pattern
Martin Fowler described the pattern in 2004, drawing the name from the strangler fig (a group of fig species common in tropical forests that grow around an existing tree and eventually replace it). The software analog: build a routing layer in front of the existing system, then progressively implement functionality in a new system, routing traffic to the new implementations as they prove out. Over time, the new system grows until it has replaced the old; the old system is then turned off.
The key insight is that the old and new systems coexist for the duration of the migration. There is no big-bang cutover. There is no period of frozen development. Each new piece of functionality goes through a self-contained cycle: ship, verify, then incrementally route traffic. The risk of any one cutover is small because only one piece is changing at a time.
The Cost Components
The Routing Layer
The strangler-fig pattern requires an upstream component that can route requests to either the old or the new system based on rules. The choices in 2026 are well-developed: an API gateway (Kong, AWS API Gateway, GCP Apigee, Azure API Management), a service mesh sidecar (Istio, Linkerd), or a simple reverse proxy (Nginx, Envoy, HAProxy) with feature-flag-driven routing.
Cost ranges: $20,000 to $300,000 per year in infrastructure and licensing for the routing layer, depending on traffic volume and vendor choice. The lower end is a self-hosted Envoy or Nginx; the higher end is a fully managed enterprise API gateway with high throughput. This cost is paid for the duration of the migration and usually persists afterward, since the routing layer is generally retained for ongoing canary deploys and traffic management.
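The routing decision itself is small. The following is a minimal sketch of rule-based routing with a per-path traffic percentage, not the configuration of any gateway named above. The names `RouteRule`, `bucket`, and `choose_backend` are hypothetical, and hashing the user id into 100 buckets is an assumption made so that a given user stays on the same side across requests.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteRule:
    """Hypothetical routing rule: one path prefix, one rollout percentage."""
    path_prefix: str   # e.g. "/checkout"
    new_percent: int   # share of users sent to the new system, 0..100


def bucket(user_id: str) -> int:
    """Map a user id to a stable bucket in 0..99 (assumed sticky-routing scheme)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def choose_backend(path: str, user_id: str, rules: list[RouteRule]) -> str:
    """Return "new" or "old" for a request; the first matching rule wins."""
    for rule in rules:
        if path.startswith(rule.path_prefix):
            return "new" if bucket(user_id) < rule.new_percent else "old"
    # Paths with no rule have not been extracted yet, so they stay on the old system.
    return "old"


rules = [
    RouteRule("/search", 100),   # fully migrated
    RouteRule("/cart", 10),      # 10 percent rollout
    RouteRule("/checkout", 0),   # built, but not yet taking traffic
]
print(choose_backend("/search/items", "user-1", rules))  # new
print(choose_backend("/admin/reports", "user-1", rules))  # old
```

In practice the same decision is usually expressed as gateway or proxy configuration rather than application code; the sketch only shows the shape of the rule.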
Per-Service Extraction
Each piece of functionality extracted from the old system into the new system has its own cost. The range is wide, dominated by domain complexity and data-ownership shape:
| Service Profile | Engineer-Hours | Typical Examples |
|---|---|---|
| Stateless / single API surface | 200 to 600 | Notification dispatcher, search index, simple cache |
| Owns dedicated database table | 600 to 1,500 | Per-tenant data, product catalog, ratings |
| Reads from shared transactional DB | 1,200 to 2,500 | Reporting service, billing readout, dashboards |
| Writes to shared transactional DB | 2,000 to 4,500 | Order placement, payment, inventory |
| User-facing path with strict latency | 1,500 to 3,500 | Cart, checkout, login |
A typical monolith decomposition into 15 services lands in the 15,000 to 45,000 engineer-hour range for the extraction work alone. See monolith decomposition cost for the full cost arithmetic.
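To size a specific decomposition, the per-profile ranges in the table can be summed over a service mix. The sketch below does that arithmetic; the profile keys and the 15-service example mix are hypothetical and chosen only for illustration.

```python
# Engineer-hour ranges (low, high) per service profile, copied from the table above.
HOURS_BY_PROFILE = {
    "stateless": (200, 600),
    "owns_table": (600, 1500),
    "reads_shared_db": (1200, 2500),
    "writes_shared_db": (2000, 4500),
    "latency_critical": (1500, 3500),
}


def extraction_range(service_counts: dict[str, int]) -> tuple[int, int]:
    """Sum the low and high estimates over a mix of service profiles."""
    low = sum(HOURS_BY_PROFILE[p][0] * n for p, n in service_counts.items())
    high = sum(HOURS_BY_PROFILE[p][1] * n for p, n in service_counts.items())
    return low, high


# Hypothetical 15-service mix, for illustration only.
example_mix = {
    "stateless": 3,
    "owns_table": 4,
    "reads_shared_db": 3,
    "writes_shared_db": 3,
    "latency_critical": 2,
}
low, high = extraction_range(example_mix)
print(f"{low:,} to {high:,} engineer-hours")  # 15,600 to 35,800 engineer-hours
```

This mix lands inside the 15,000 to 45,000 hour range quoted above; a mix weighted toward services that write to the shared database pushes the upper bound higher.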
Parallel-Run Verification
For services where correctness is critical (payment, pricing, regulatory reporting), the strangler-fig pattern is typically combined with a parallel-run period. The new implementation runs alongside the old, both produce results, and a comparator flags divergences. Once the divergence rate is below an agreed threshold, the new implementation takes over.
Parallel-run infrastructure costs 100 to 500 engineer-hours per service to set up, plus the runtime cost of running the new service before it carries traffic. For services on the critical path the parallel-run period is non-negotiable; for low-stakes services it can be skipped in favour of a faster canary deploy. See parallel run refactor cost for the deeper treatment.
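The comparator can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: `ParallelRun` is a hypothetical name, results are compared with plain equality, the old result is always the one returned to the caller, and the threshold and minimum sample size are whatever the team agrees on.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ParallelRun:
    """Run old and new implementations side by side and record divergences."""
    old_impl: Callable[[Any], Any]
    new_impl: Callable[[Any], Any]
    total: int = 0
    divergences: list = field(default_factory=list)

    def handle(self, request: Any) -> Any:
        old_result = self.old_impl(request)
        self.total += 1
        try:
            new_result = self.new_impl(request)
            if new_result != old_result:
                self.divergences.append((request, old_result, new_result))
        except Exception as exc:
            # A crash in the new path counts as a divergence and never reaches the caller.
            self.divergences.append((request, old_result, exc))
        return old_result  # the old system stays authoritative during the run

    def divergence_rate(self) -> float:
        return len(self.divergences) / self.total if self.total else 0.0

    def ready_for_cutover(self, threshold: float, min_samples: int) -> bool:
        """True once enough traffic has been seen and divergence is below the threshold."""
        return self.total >= min_samples and self.divergence_rate() < threshold


# Toy example: the new implementation is wrong for exactly one input.
run = ParallelRun(old_impl=lambda x: x * 2,
                  new_impl=lambda x: 0 if x == 7 else x * 2)
for request in range(100):
    run.handle(request)
print(run.divergence_rate())                                   # 0.01
print(run.ready_for_cutover(threshold=0.05, min_samples=50))   # True
```

Real comparators usually need domain-aware equality (timestamps, ordering, rounding) and must keep the new path from producing side effects such as duplicate writes; both are omitted here.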
Coordination and Operational Tax
During the migration, both the old and new systems are in production. Cross-cutting changes (security patches, schema changes, feature flags) have to be applied to both. On-call rotations cover both. The duplication cost is typically 15 to 30 percent of engineering capacity for the duration of the migration. McKinsey 2023 research on engineering productivity confirms this range for transition projects.
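The duplication tax can be converted into engineer-hours for planning. The sketch below applies the 15 to 30 percent range to a team's capacity; the figure of 140 productive hours per engineer per month is an assumption, not a number from this brief, and should be replaced with the team's own.

```python
def duplication_tax_hours(
    engineers: int,
    months: int,
    hours_per_engineer_month: int = 140,        # assumption, not from the brief
    tax_percent: tuple[int, int] = (15, 30),    # range quoted above
) -> tuple[int, int]:
    """Return the (low, high) engineer-hours spent keeping both systems running."""
    capacity = engineers * months * hours_per_engineer_month
    low = capacity * tax_percent[0] // 100
    high = capacity * tax_percent[1] // 100
    return low, high


# Illustrative team: 20 engineers over a 24-month migration.
low, high = duplication_tax_hours(engineers=20, months=24)
print(f"{low:,} to {high:,} engineer-hours")  # 10,080 to 20,160 engineer-hours
```

Under these assumptions the coordination tax alone is of the same order as the extraction work for a mid-sized decomposition, which is why it belongs in the estimate from the start.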
The Cutover Decision per Service
For each service, the decision to fully cut over from old to new follows a predictable sequence:
- Build the new implementation. Tests pass against the documented requirements.
- Shadow mode. The new implementation runs on production traffic but its responses are discarded. The comparator measures divergence from the old. Run for 2 to 8 weeks depending on traffic patterns.
- 1 percent traffic. Route a small share of real users to the new implementation. Monitor error rates, latency, business metrics.
- 10 percent traffic. Confirm the new path scales and does not have failure modes that only emerge at volume.
- 50 percent traffic. The new path is the primary implementation for half of users. The old path still serves the other half and remains the rollback target.
- 100 percent traffic. All users on the new implementation. The old path remains warm for a few weeks in case rollback is needed.
- Decommission. The old code path is removed. The routing rule for this service is simplified.
Shadow mode runs 2 to 8 weeks; each of the other steps typically takes 1 to 4 weeks of calendar time, with engineer attention concentrated at the shadow-mode setup and at the final decommission. Total per-service calendar time from build start to decommission is typically 3 to 9 months.
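The cutover sequence above can be modelled as a small state machine that only moves forward when the checks for the current stage pass. This is a sketch: `Rollout` and the stage names are hypothetical, and returning to shadow mode on rollback is an assumption rather than something the sequence prescribes.

```python
from dataclasses import dataclass

# (stage name, percent of live traffic served by the new implementation)
STAGES = [
    ("build", 0),
    ("shadow", 0),            # sees production traffic, responses discarded
    ("1_percent", 1),
    ("10_percent", 10),
    ("50_percent", 50),
    ("100_percent", 100),     # old path kept warm for rollback
    ("decommissioned", 100),  # old path removed
]


@dataclass
class Rollout:
    service: str
    stage_index: int = 0

    @property
    def stage(self) -> str:
        return STAGES[self.stage_index][0]

    @property
    def new_traffic_percent(self) -> int:
        return STAGES[self.stage_index][1]

    def advance(self, checks_passed: bool) -> str:
        """Move to the next stage only if this stage's checks passed."""
        if checks_passed and self.stage != "decommissioned":
            self.stage_index += 1
        return self.stage

    def rollback(self) -> str:
        """Send all live traffic back to the old path (assumed target: shadow mode)."""
        if self.stage == "decommissioned":
            raise RuntimeError("old path has been removed; rollback is no longer possible")
        self.stage_index = min(self.stage_index, 1)
        return self.stage


rollout = Rollout("cart")
rollout.advance(checks_passed=True)    # build -> shadow
rollout.advance(checks_passed=False)   # divergence too high: stays in shadow
print(rollout.stage, rollout.new_traffic_percent)  # shadow 0
```

The one-way step into `decommissioned` is the point of the model: every earlier stage can be undone, so the expensive verification belongs before that step.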
The Indefinite Middle State Failure Mode
The most common strangler-fig failure pattern is the indefinite middle state. A team migrates 60 to 80 percent of traffic to the new system, then stops. The remaining functionality is the hardest (legacy admin paths, batch jobs, rare endpoints). The team moves on to other priorities. Both systems continue to exist indefinitely. Both have to be maintained, covered by on-call, security-patched, and dependency-upgraded. The operational cost is roughly double that of the original monolith.
The mitigations are organisational rather than technical. Commit to a hard end-date for the old system before starting the migration. Treat the last-mile services as non-negotiable rather than optional. Assign sunset ownership to a specific team. Track the migration completion percentage in executive-level dashboards so the indefinite middle state becomes visible.
The cost of falling into this trap is high. A migration that stops at 70 percent and then runs both systems indefinitely costs more than either a successful full migration or a deliberate decision to stay on the monolith. The decision to start should include the discipline to finish.
Case Pattern Reference
Three published case histories that apply strangler-fig logic, with rough cost and duration references where they were reported:
- Shopify modular monolith (2016-2020): Reorganized a Rails monolith into bounded contexts using component isolation rather than service extraction. Reported in Shopify engineering posts as a multi-year, multi-million-dollar effort.
- Etsy services extraction (2014-2018): Extracted performance-critical components from a PHP monolith into Scala services. The strangler-fig pattern was explicit; the routing layer was custom. Full migration ran approximately 4 years.
- Stripe API versioning (ongoing): Not a system decomposition but uses strangler-fig logic at the request level: old API versions and new API versions both supported indefinitely, with internal translation, allowing gradual customer migration without forced upgrades. The cost is real (the API translation layer is non-trivial) and the benefit (customers never break) has paid for it many times over.
Related Reading
- Big-bang rewrite cost
- Parallel run refactor cost
- Monolith decomposition cost
- Feature flag refactor cost
- Legacy code refactoring cost
Frequently Asked Questions
What is the strangler fig pattern?
A migration pattern named by Martin Fowler in 2004 after the strangler-fig tree, which grows around an existing tree until it eventually replaces the host. In software: build a proxy or API gateway in front of the old system, then incrementally implement endpoints in a new system, routing traffic to the new implementations as they become ready, until eventually all traffic is on the new system and the old one can be turned off.
How much does a strangler fig migration cost per service?
Per service extracted from a monolith, plan 600 to 3,000 engineer-hours for most services, depending on domain complexity and data ownership; the per-service table above gives the full range, from 200 hours for the simplest stateless services to 4,500 for services that write to a shared transactional database. A clean-domain service (notifications, search index) lands near the bottom. A service that owns part of a shared transactional database lands near the top. A typical decomposition involves 8 to 30 services, so the engineer-time alone runs from about 5,000 to 90,000 hours.
What is the proxy or gateway tax?
Every request now goes through an additional hop. Latency typically increases by 1 to 5 ms. Cost: a service mesh (Istio, Linkerd) or API gateway (Kong, AWS API Gateway, GCP Apigee) adds $20K to $300K per year in compute and licensing. This is paid for the entire duration of the migration and usually indefinitely afterward, since the proxy usually stays as the request router.
How long does a strangler fig migration take?
Comparable in elapsed time to a big-bang rewrite but with continuous shipping. For a 200K-line monolith with 20 engineers, 24 to 48 months from start to most-of-traffic-on-services. Some published cases (Shopify, Etsy) ran 4 to 7 years to reach a steady state. The duration is longer than a big-bang plan would have estimated up front; the delivered value is greater because shipping never stops.
When do you cut the old branch?
When the percentage of traffic to the new implementation reaches a stable plateau (usually above 95 percent), the new implementation has been in production long enough that edge cases have surfaced (3 to 12 months typically), and the old branch is not the source of truth for any data the new branch cannot reproduce. Removing the old branch too early causes regressions; leaving it too long doubles maintenance cost.
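The three conditions in this answer can be written down as an explicit check, which makes the decision reviewable rather than a judgment call. In this sketch `ServiceStatus` and `ready_to_remove_old_branch` are hypothetical names, and the defaults take the lower bounds quoted above (95 percent, 3 months) as assumptions a team would tune.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceStatus:
    new_traffic_percent: float            # stable share of traffic on the new path
    months_in_production: float           # time the new path has served real traffic
    old_holds_unreproducible_data: bool   # old branch is source of truth for data the new one lacks


def ready_to_remove_old_branch(
    status: ServiceStatus,
    min_traffic_percent: float = 95.0,   # "usually above 95 percent"
    min_months: float = 3.0,             # lower bound of "3 to 12 months"
) -> bool:
    """All three conditions must hold before the old branch is removed."""
    return (
        status.new_traffic_percent > min_traffic_percent
        and status.months_in_production >= min_months
        and not status.old_holds_unreproducible_data
    )


print(ready_to_remove_old_branch(ServiceStatus(99.5, 6, False)))  # True
print(ready_to_remove_old_branch(ServiceStatus(99.5, 6, True)))   # False
```

The data condition is the one most often missed: traffic share and soak time are visible on a dashboard, while data ownership usually has to be confirmed by hand.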
What is the biggest risk of strangler fig?
Indefinite middle state. A team migrates 60 percent of traffic, then stops because the remaining 40 percent is hard. Both old and new systems exist forever, both have to be maintained, and both need on-call coverage. The cost is then roughly double that of the original monolith. The mitigation is committing in advance to a hard cutoff for the old system, even if it forces difficult last-mile work.