Engineering Practitioner Brief / 18 May 2026

Monolith Decomposition Cost

The published case histories (Shopify, Etsy, Amazon's 2002 service mandate, Twitter's post-Rails re-architecture, Uber's 2020 reversal back toward fewer services) all agree on one thing: the cost was higher than the original estimate. This page lays out the structural cost components so a 2026 estimate starts from the right base rather than the optimistic one.

Small monolith: $250K to $700K (25K to 100K LOC, 5 to 10 engineers, 9 to 18 months)

Medium monolith: $700K to $2M (100K to 500K LOC, 15 to 40 engineers, 18 to 36 months)

Large monolith: $2M to $4M+ (500K+ LOC, 40+ engineers, 24 to 60 months)
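For a first-pass estimate, the three bands above can be encoded as a simple lookup. The sketch below is illustrative only: the name `estimate_band`, the choice to classify on LOC alone, and the half-open boundary handling are assumptions, not something the brief specifies. The dollar, headcount, and duration figures are copied from the bands.

```python
from typing import NamedTuple, Optional


class Band(NamedTuple):
    name: str
    cost_usd: str
    engineers: str
    months: str


# (low LOC, high LOC or None for open-ended, band). Figures copied from the brief.
BANDS = [
    (25_000, 100_000, Band("small", "$250K to $700K", "5 to 10", "9 to 18")),
    (100_000, 500_000, Band("medium", "$700K to $2M", "15 to 40", "18 to 36")),
    (500_000, None, Band("large", "$2M to $4M+", "40+", "24 to 60")),
]


def estimate_band(loc: int) -> Optional[Band]:
    """Return the band whose LOC range contains `loc`, or None below 25K LOC.

    Assumption: ranges are half-open [low, high), so exactly 100K LOC is
    treated as medium. The brief does not say which side a boundary falls on.
    """
    for low, high, band in BANDS:
        if loc >= low and (high is None or loc < high):
            return band
    return None


print(estimate_band(200_000))
```

LOC is only one of the three dimensions in each band; a codebase whose team size or timeline points at a different band than its LOC is a signal to estimate by hand rather than by lookup.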

Why Big-Bang Almost Always Loses

The reference text on this question is Joel Spolsky's 2000 essay Things You Should Never Do, Part I, which documented Netscape's decision to rewrite the browser from scratch and the years of market share it cost them. The argument generalises beyond browsers. A working monolith contains thousands of bug fixes, edge-case handlers, and quiet defenses against weird user behaviour. A rewrite starts at zero on all of those and has to relearn them. Meanwhile the rewrite ships nothing user-visible for the duration of the work, while competitors keep shipping. This is the single biggest economic argument against big-bang.

The strangler-fig pattern, named by Martin Fowler after the strangler-fig tree, ships a single new service at a time, routes a small fraction of traffic to it, validates correctness, then gradually increases the traffic share until the monolith's old code path can be deleted. The advantages are operational: every step is independently shippable, reversible, and observable. The disadvantage is that the cumulative timeline is longer than the optimistic big-bang estimate, even though the actual delivered value lands sooner. See strangler-fig migration cost for the per-service economics of this pattern and big-bang rewrite cost for the failure-mode arithmetic.
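The traffic-shifting step can be sketched as a deterministic percentage router. This is a minimal illustration, not a production routing layer: the function name `routes_to_new_service` and the choice to hash a caller key (for example a user ID) are assumptions made for the example. In practice this decision usually lives in a gateway or mesh rather than in application code.

```python
import hashlib


def routes_to_new_service(request_key: str, rollout_percent: int) -> bool:
    """Decide whether a request goes to the new service.

    Hashing the key keeps each caller on the same side of the split while
    the percentage is unchanged, and raising the percentage only ever moves
    callers from the old path to the new one, never back.
    """
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    return bucket < rollout_percent


def handle(request_key: str, rollout_percent: int) -> str:
    # The return values are hypothetical stand-ins for the two code paths.
    if routes_to_new_service(request_key, rollout_percent):
        return "new-service"
    return "monolith"


# Simulate a staged ramp over a synthetic population of callers.
sample_users = [f"user-{i}" for i in range(10_000)]
for percent in (0, 5, 25, 100):
    hits = sum(routes_to_new_service(u, percent) for u in sample_users)
    print(f"rollout {percent:>3}% -> observed share {hits / len(sample_users):.1%}")
```

Setting the percentage back to 0 is the rollback path, which is what makes each step reversible.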

The Six Cost Components

1. Engineer-time on extraction

The primary cost. Per service extracted from a monolith, expect 800 to 4,000 engineer-hours depending on how entangled the service's data is with the rest of the monolith. A clean-domain service (say, a notification dispatcher with one outbound interface) lands near the bottom. A service that owns part of a shared transactional table lands near the top. A typical decomposition involves 8 to 30 services, so the engineer-time alone runs from roughly 6,400 hours (8 services at 800 hours each) to 120,000 hours (30 services at 4,000 hours each).
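The bounds come from multiplying the service count by the per-service range. A small sketch of that arithmetic follows; the function name is hypothetical, and the 1,800 productive hours per engineer-year used for the conversion is an assumption for illustration, not a figure from the brief.

```python
def extraction_hours(services: int, hours_per_service: int) -> int:
    """Total engineer-hours to extract `services` services."""
    return services * hours_per_service


# Per-service and service-count ranges are taken from the brief.
low = extraction_hours(services=8, hours_per_service=800)
high = extraction_hours(services=30, hours_per_service=4_000)
print(f"{low:,} to {high:,} engineer-hours")  # 6,400 to 120,000 engineer-hours

# ASSUMPTION: 1,800 productive hours per engineer-year (not from the brief).
HOURS_PER_ENGINEER_YEAR = 1_800
print(f"about {low / HOURS_PER_ENGINEER_YEAR:.1f} to "
      f"{high / HOURS_PER_ENGINEER_YEAR:.1f} engineer-years")
```

The spread between the two ends is almost 19x, which is why the data-entanglement question matters more to the estimate than the service count does.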

2. Inter-service infrastructure

New components that did not exist in the monolith world: a service mesh (Istio, Linkerd, Consul Connect, or AWS App Mesh) at $50K to $300K per year in cluster overhead; an API gateway (Kong, AWS API Gateway, GCP Apigee) at $20K to $200K per year; distributed tracing (Honeycomb, Datadog APM, Tempo) at $50K to $500K per year depending on data volume; service registry and configuration (Consul, etcd, AWS Cloud Map). For a mid-sized decomposition the annual ongoing infrastructure premium runs $200K to $800K.
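Summing the priced components gives a sense of the outer bounds. The sketch below uses only the three ranges stated above; the service registry is left out because the brief gives no price for it, and the dictionary and function names are illustrative.

```python
from typing import Dict, Tuple

# Annual cost ranges in USD (low, high), copied from the brief.
# The service registry is omitted: the brief lists it without a price.
COMPONENTS: Dict[str, Tuple[int, int]] = {
    "service mesh": (50_000, 300_000),
    "API gateway": (20_000, 200_000),
    "distributed tracing": (50_000, 500_000),
}


def annual_premium(components: Dict[str, Tuple[int, int]]) -> Tuple[int, int]:
    """Sum the low ends and the high ends of every component range."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return low, high


low, high = annual_premium(COMPONENTS)
print(f"${low:,} to ${high:,} per year")  # $120,000 to $1,000,000 per year
```

The summed range is wider than the $200K to $800K quoted for a mid-sized decomposition. One plausible reading is that few teams sit at the low end or the high end of every component at once, but the brief does not state how its mid-sized figure was derived.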

3. Observability data volume

A monolith produces one log stream, one metric set, and one trace per request. The same request through 15 services produces 15 log streams, 15 metric sets, and 15 trace spans. Most observability vendors price on ingest volume, so the bill scales roughly with the number of services. Datadog public pricing puts logs at $0.10 per GB ingested; a service producing 1 TB per month adds about $100 per month to the bill, and 15 services at that volume add about $1,500 per month. Annual observability cost growth of 5x to 10x during decomposition is normal.
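The log-ingest arithmetic can be written out as a short sketch. It assumes 1 TB means 1,000 GB (which is what the $100 figure implies) and covers ingest charges only; the function name is hypothetical, and indexing or retention fees are deliberately out of scope.

```python
from decimal import Decimal

# Ingest price quoted in the brief.
USD_PER_GB_INGESTED = Decimal("0.10")


def monthly_log_ingest_cost(gb_per_service: int, services: int) -> Decimal:
    """Monthly ingest cost when every service emits the same log volume."""
    return Decimal(gb_per_service) * services * USD_PER_GB_INGESTED


# ASSUMPTION: 1 TB = 1,000 GB, matching the brief's $100 figure.
one_service = monthly_log_ingest_cost(gb_per_service=1_000, services=1)
fifteen_services = monthly_log_ingest_cost(gb_per_service=1_000, services=15)

print(f"1 service:   ${one_service:,.2f} per month")       # $100.00
print(f"15 services: ${fifteen_services:,.2f} per month")  # $1,500.00
```

The linear model is a simplification: in practice services rarely emit equal volumes, so the per-service figure is better treated as an average.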

4. Coordination tax during the transition

During decomposition the team operates both the monolith and the new services. Every cross-cutting change (a new feature flag, a security patch, a database schema change) has to be applied in both places. Industry consultancy estimates put this dual-operation tax at 15 to 30 percent of engineering capacity for the duration of the transition. McKinsey's 2023 engineering productivity research is consistent with this range, attributing it to coordination overhead rather than incompetence.
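To see what that percentage means in capacity terms, the tax can be converted into engineer-months. The sketch below is a rough illustration; the 20-engineer, 24-month scenario is an assumed example, not a case from the brief, and the function name is hypothetical.

```python
def dual_operation_tax(engineers: int, months: int, tax_percent: int) -> float:
    """Engineer-months absorbed by running monolith and services in parallel."""
    if not 0 <= tax_percent <= 100:
        raise ValueError("tax_percent must be between 0 and 100")
    return engineers * months * tax_percent / 100


# ASSUMED scenario for illustration: 20 engineers over a 24-month transition.
ENGINEERS, MONTHS = 20, 24
low = dual_operation_tax(ENGINEERS, MONTHS, tax_percent=15)
high = dual_operation_tax(ENGINEERS, MONTHS, tax_percent=30)

print(f"capacity: {ENGINEERS * MONTHS} engineer-months")
print(f"dual-operation tax: {low:.0f} to {high:.0f} engineer-months")  # 72 to 144
```

Under these assumptions the tax is equivalent to three to six engineers doing nothing but keeping the two worlds in sync for the whole transition.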

5. New on-call patterns

A monolith has one or two on-call rotations. Fifteen services typically have three to eight rotations, plus a platform-team rotation for the shared infrastructure. The hiring, training, and rotation-compensation cost rises proportionally. The flip side is that on-call load per individual engineer can drop because each rotation covers a narrower surface area. Net cost effect: usually a 20 to 50 percent increase in on-call burden initially, dropping back near baseline after 12 to 18 months as services stabilise.

6. Re-platform of the data layer

The single hardest part, and for most decompositions 30 to 50 percent of total cost. A monolith typically owns one database (or a small handful). A service-oriented architecture wants each service to own its own data. Splitting a shared transactional database with referential integrity into per-service databases requires either coordinated migrations, distributed transactions (rare and expensive), or eventual-consistency reads with reconciliation jobs. See database schema debt cost for the dollar arithmetic on per-service data migrations.
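A reconciliation job, in its simplest form, compares the monolith's rows against the copy a new service holds and reports the differences. The sketch below uses in-memory dictionaries as stand-ins for the two data stores; the names, the row shape, and the decision to treat the monolith as the source of truth are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReconciliationReport:
    missing_in_service: List[str] = field(default_factory=list)
    mismatched: List[str] = field(default_factory=list)
    orphaned_in_service: List[str] = field(default_factory=list)

    @property
    def clean(self) -> bool:
        """True when the two stores agree on every key and value."""
        return not (self.missing_in_service or self.mismatched
                    or self.orphaned_in_service)


def reconcile(monolith_rows: Dict[str, dict],
              service_rows: Dict[str, dict]) -> ReconciliationReport:
    """Compare rows by primary key, treating the monolith as source of truth."""
    report = ReconciliationReport()
    for key, expected in monolith_rows.items():
        if key not in service_rows:
            report.missing_in_service.append(key)   # not yet replicated
        elif service_rows[key] != expected:
            report.mismatched.append(key)           # replicated but stale
    report.orphaned_in_service = [k for k in service_rows
                                  if k not in monolith_rows]
    return report


# Hypothetical order rows keyed by order ID.
monolith = {"o1": {"total": 40}, "o2": {"total": 15}, "o3": {"total": 99}}
service = {"o1": {"total": 40}, "o2": {"total": 10}, "o4": {"total": 7}}

report = reconcile(monolith, service)
print(report)
```

A real job would page through both stores and tolerate rows that are legitimately in flight; this version only shows the three kinds of drift such a job has to distinguish.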

The Distributed Monolith Failure Mode

The worst outcome of a decomposition is the distributed monolith. The team has paid all the cost of microservices (network hops, observability bill, on-call rotation expansion, service mesh) but has not earned the benefit (independent deployability, isolated failure domains, per-team ownership). Every deployment still requires coordinating across multiple services. Every incident still requires a war room. The system is now both expensive to operate and complex to understand.

The cause is almost always cutting boundaries along the wrong seams. Splitting by technology layer (presentation, business logic, data) instead of by business capability (orders, payments, inventory) guarantees that every user-facing feature touches every service. Sam Newman's Monolith to Microservices is the standard reference for getting the seams right; it argues that the boundaries should follow the business's bounded contexts as defined by Domain Driven Design.
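The difference between the two ways of cutting can be made concrete by counting how many services a single feature touches under each. In the sketch below a feature is modelled as a set of (capability, layer) changes; the feature, the capability names, and the function names are hypothetical.

```python
from typing import Iterable, Set, Tuple

# One unit of work: (business capability, technology layer).
Change = Tuple[str, str]


def services_touched_by_layer_split(changes: Iterable[Change]) -> Set[str]:
    """With one service per technology layer, the layer picks the service."""
    return {layer for _, layer in changes}


def services_touched_by_capability_split(changes: Iterable[Change]) -> Set[str]:
    """With one service per business capability, the capability picks it."""
    return {capability for capability, _ in changes}


# Hypothetical feature: add a gift-wrap option to orders.
gift_wrap = [
    ("orders", "presentation"),
    ("orders", "business logic"),
    ("orders", "data"),
]

print(sorted(services_touched_by_layer_split(gift_wrap)))
# ['business logic', 'data', 'presentation']
print(sorted(services_touched_by_capability_split(gift_wrap)))
# ['orders']
```

Under the layer split this one feature needs three coordinated deployments; under the capability split it needs one. Running the same count over a real backlog is a cheap way to test candidate boundaries before committing to them.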

The repair cost of a distributed monolith, if discovered after the fact, is roughly equivalent to doing the decomposition over again. This is why a small spike of architectural diligence (event-storming workshops, context-mapping, talking to product about how the business reasons about its own work) at the start of the project is the single highest-leverage investment a decomposition team can make.

The Velocity Dip in the Transition

Every published case history reports a velocity dip during decomposition. Etsy's 2014-2017 service-extraction project, Shopify's 2016 modular-monolith transition, and Uber's 2016-2019 microservices expansion (subsequently partly reversed in 2020) all show 12 to 24 month periods where feature delivery slowed even as engineering headcount grew. The dip is usually 20 to 40 percent of feature velocity, recovering only after the transition completes.

This is the opportunity cost that does not appear on any spreadsheet. A team shipping 30 percent less for two years is, in revenue terms, the most expensive single component of decomposition for any company whose product is still growing. Mature, stable products feel this less because the marginal feature has less revenue impact. Growth-stage products feel it most because the lost shipping speed compounds against competitors.
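One way to put the dip on a spreadsheet is to express it as months of feature delivery lost. The sketch below only converts a percentage and a duration into lost delivery-months; it does not attempt a revenue figure, since the brief gives none, and the function name is hypothetical.

```python
def lost_delivery_months(dip_percent: int, duration_months: int) -> float:
    """Months of normal-pace feature delivery forgone during the dip."""
    if not 0 <= dip_percent <= 100:
        raise ValueError("dip_percent must be between 0 and 100")
    return dip_percent * duration_months / 100


# The brief's example: shipping 30 percent less for two years.
print(f"{lost_delivery_months(30, 24):.1f} months")  # 7.2 months

# The brief's range: a 20 to 40 percent dip lasting 12 to 24 months.
best = lost_delivery_months(20, 12)
worst = lost_delivery_months(40, 24)
print(f"{best:.1f} to {worst:.1f} months")  # 2.4 to 9.6 months
```

This treats the dip as constant over the period, which is a simplifying assumption; a dip that is deepest early and recovers gradually would cost less than the flat figure suggests.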

When Decomposition Pays Back

The benefit case is real but specific. A successful decomposition recovers cost through three mechanisms: independent team scaling (teams ship without coordinating cross-monolith deploys), failure isolation (one service down does not bring the whole system down), and per-service technology choice (the data team can use Python, the payment team can use Go, the front-end team can use TypeScript). The first one is the largest. McKinsey research on engineering organisational design consistently shows that team-level deployment independence is the single biggest driver of large-org velocity.

The break-even point depends on team size and deployment culture. A 15-engineer team in a single monolith with a healthy CI / CD pipeline and trunk-based development may never benefit from decomposition. A 75-engineer team with multiple product lines and varying release cadences almost certainly will. The inflection point is roughly 30 to 50 engineers for most products. Below it, monolith is cheaper. Above it, services are cheaper. The decomposition itself is the bridge across the inflection.

Frequently Asked Questions

How much does it cost to decompose a monolith?

For a small monolith (one service, 25K-100K LOC, 5-10 engineers), $250K to $700K across 9 to 18 months. For a medium monolith (100K to 500K LOC, 15 to 40 engineers), $700K to $2M across 18 to 36 months. For a large monolith (500K+ LOC, 40+ engineers), $2M to $4M+ across 24 to 60 months. Numbers include engineer-time plus the new infra footprint.

Should I do a big-bang rewrite or strangler-fig?

Strangler-fig in nearly every case. Joel Spolsky's 2000 essay on Netscape's big-bang rewrite remains the canonical warning. The approach that works best is what Martin Fowler calls 'monolith-first': resist decomposition until module boundaries are stable, then carve out one service at a time using strangler-fig.

How long does monolith decomposition take?

Almost always longer than the original estimate. Reference cases (Shopify, Etsy, Amazon, Twitter) ran 3 to 8 years from start of decomposition to majority of traffic on services. For a 200K LOC monolith with 20 engineers, 24 to 36 months to a healthy state is realistic; 12 months is optimistic.

Does monolith decomposition pay back?

Only when the monolith is genuinely blocking team scaling. Below 20 engineers, microservices often cost more than they save. Above 50 engineers, the team coordination tax of a monolith usually exceeds the operational tax of services. The break-even is somewhere in between and depends heavily on deployment culture and on-call discipline.

What is the infrastructure cost premium of microservices?

Typically 1.5x to 3x the monolith's infrastructure spend for a comparable workload, mostly from inter-service network overhead, service-mesh proxy resource usage, increased observability data volume, and the redundancy that each service has to provision separately. Some of this is recovered through better autoscaling per service, but not all.

What goes wrong most often?

Cutting service boundaries along the wrong seams. Teams often split by technology layer (presentation service, business-logic service, data service) when domain-driven design says to split by business capability. The result is what practitioners call a distributed monolith: every change still touches every service, and the system costs more to run than the monolith it replaced.