Rebuilt Code Can Preserve the Old System
The cleanest rewrite in Git can still be a failed modernization. Legacy risk lives in behavior, data ownership, batch side effects, and operational habit, not only in the old language or UI framework.
The Mistake Is Treating Syntax as Architecture
Teams assume modernization means rebuilding the same screens on a newer stack. They interview stakeholders, copy visible workflows, and underestimate the rules encoded in reports, jobs, overrides, and support scripts.
For legacy redesign instead of code rebuild, a senior review should ask which coexistence decision is being made, which evidence proves it, and what signal would force the team to pause.
A New UI Around the Same Constraint
A manufacturing team rebuilt order entry while the old ERP kept calculating lead times and reservations. The new UI passed isolated tests, but planners refused cutover because promised dates differed for partial orders and a recall label rule lived in an old stored procedure. The program recovered only after order promising became a bounded wave with as-is evidence, shadow comparison, signed parity tolerances, and a single writer for the migrated plant cohort.
Redesign the Operating Boundary
Where Migration Plans Hide Ownership
Engineering metrics reward lines of code migrated and microservices created. Business outcomes need correct orders, balances, and compliance reports on the day after cutover.
Behavior that was never written down
Legacy systems encode rules in UI validations, nightly jobs, exception handlers, and "ask Susan" tribal knowledge. A rewrite team that interviews stakeholders for two weeks captures a fraction of what production does.
When the new service ships, edge cases appear in production:
- Discount rules that only applied to one region.
- Status transitions that legacy allowed but the new API rejects.
- Reports that matched the old ledger until month-end close.
- Fee waivers triggered only when a CSR used a specific override code.
- Inventory holds that expired on a calendar rule buried in a nightly job.
The gap is not "missing features." It is undocumented behavior that the old system still performs. Interview notes and product briefs describe the happy path. Production encodes twenty years of exceptions, workarounds, and regulatory patches. A rewrite team that treats discovery as a two-week workshop is building against a spec that was never true.
Ownership stays vague
Rewrite projects often mirror the old org chart: a platform team owns "the new stack," product owns "requirements," operations owns "both until further notice."
Without named owners per domain (orders, pricing, inventory, policy), parity disputes become meetings. The loudest stakeholder wins. The legacy app stays because nobody will sign the retirement form.
The parallel system tax
While rewrite runs, legacy still gets regulatory patches, sales promises, and bug fixes. The new codebase diverges from the truth users still touch daily. Integration tests against a frozen snapshot of legacy go stale the week after they pass.
| Symptom | What teams say | What is actually happening |
|---|---|---|
| Dual run past year two | "Almost ready for cutover" | Parity never scoped per domain |
| Rising reconciliation headcount | "Temporary during migration" | Two systems both authoritative |
| Rewrite team morale drop | "Scope creep" | Legacy kept changing under them |
Route Shape Versus Ownership Shape
Bad: greenfield rewrite in parallel
Legacy monolith (still changing)
|
| undocumented gaps
v
New platform (frozen spec from month 2)
|
+----> both run in production
Typical failures:
- Dual operations teams reconciling mismatched totals.
- "Almost done" for years because scope was never bounded.
- Rollback plan is "keep legacy" because parity was never proven.
Good: redesign waves with as-is truth
Wave 1: document + move Orders (parity evidence)
Wave 2: Pricing (single writer, legacy read-only)
Wave 3: retire legacy order path
During each wave:
- As-is behavior is captured for that bounded context (rules, jobs, exceptions).
- To-be design states who owns writes and what "equivalent" means.
- Parity checks compare legacy and modern on sampled production inputs.
- Legacy write paths for that context are disabled when evidence holds.
What Changes When Writers Are Named
Code is the easy part to replace. The system is behavior plus data ownership plus operational habit.
Redesign means asking:
- What must stay true for regulators and customers?
- Which modules are really one domain?
- What can we stop doing instead of reimplementing?
Rewrite without those questions produces modern syntax around the same accidental complexity.
Worked Example: manufacturing ERP order module
A plant software team rewrote the order entry UI in a modern framework while the legacy ERP still calculated lead times and reservations.
| Phase | Legacy ERP | New UI | Outcome |
|---|---|---|---|
| Rewrite-only | still authoritative | calls legacy APIs sometimes | UI refresh, same core |
| Wrong parallel | still changing rules | reimplemented rules from memory | divergent lead times |
| Wave redesign | owns writes until parity | shadow compare, then owns UI + API to new service | single path to retire |
After wave redesign, planners stopped maintaining two spreadsheets for promised ship dates. The wave took longer in calendar time than the UI rewrite alone, but operations ran one authoritative path for lead times before legacy order entry was retired.
Where This Shows Up: financial services and manufacturing
Financial services: policy and fee logic often lives in legacy cores and advisor workflows. A rewrite that misses accrual edge cases breaks examiner trust faster than a slow Strangler wave with parity diffs. Examiner questions are not about framework choice. They are about whether balances and disclosures match prior periods. A parallel rewrite that cannot show sampled parity against the legacy ledger keeps both systems in scope for audits.
Manufacturing: ERP modules tie to shop floor scanners and supplier EDI. Rebuilding screens without lot traceability rules stops the line when labels no longer match what shipping expects. Plant supervisors tolerate slow software. They do not tolerate wrong lot numbers on a recall. Wave redesign that captures reservation and traceability rules before UI polish prevents a go-live that halts production.
Same lesson: capture behavior and ownership per wave, do not assume a new repository implies a new trustworthy system.
Where Control Moves During Redesign
Rewrite-only:
Stakeholder notes --> New app --> isolated green tests
Legacy ERP -------> keeps changing ------> operations still depend on it
Wave redesign:
As-is behavior --> bounded context --> shadow compare --> single writer --> retire old path
The Rebuild Versus Redesign Tradeoff
The tradeoff is visible progress versus risk reduction. New screens impress quickly. Retired paths and parity evidence reduce risk slowly, but they are the only milestones operations can trust.
For legacy redesign instead of code rebuild, the useful review is not a generic architecture checklist. It should inspect writer ownership, parity evidence, wave boundaries, rollback, and retirement. If those fields are missing, the team may still be busy, but leadership does not yet have a decision-quality artifact.
Review Packet Before Release
For legacy redesign, the release review should not start with percentage of traffic shifted or number of services created. It should start with the ownership table. For every entity touched by the wave, name the current writer, the target writer, the read paths that remain on legacy, the jobs that still mutate data, and the condition that disables those jobs. If a batch process, admin SQL script, or vendor feed can still update the entity, the wave is not ready.
For legacy redesign, parity evidence should sample production-shaped cases by business segment, not only random identifiers. Compare the values that users, auditors, reports, and downstream integrations care about. In financial services, that means balances, statuses, fee calculations, disclosures, and event timestamps. In manufacturing, it means lead times, lot traceability, reservation state, and label outputs. HTTP 200 on old and new paths is not parity. Matching business outcomes under realistic edge cases is parity.
For legacy redesign, rollback design is simple when the old writer never stopped writing. It is difficult when writes have moved. Decide whether rollback means routing traffic back to legacy, replaying events, freezing writes, or operating a manual exception queue. If rollback would reintroduce two writers, the safer response may be fail-closed and human handling rather than automatic reversal.
For legacy redesign, require a retirement ticket in the same wave plan as the build ticket. Modernization without retirement is accumulation. The old path may remain read-only for audit or reporting, but writable legacy behavior should have a deletion or disablement date tied to evidence.
Failure Modes to Rehearse
Legacy redesign test plans should include concurrent writes, stale reads, replay after partial failure, and rollback after ownership moves. Test a legacy batch job running while the modern service writes the same entity. Test a support-user override, not only a customer-facing request. Test what happens when CDC pauses for an hour and then resumes. Test deletion, merge, and correction flows because those are where hidden writers often appear.
For legacy redesign, sample cases should include the ugly tail: old product codes, inactive customers, manually corrected orders, regional exceptions, and records created before the current schema existed. Those cases are not edge noise. They are often the reason the legacy system survived for years.
Modernization Paths Worth Comparing
Lift-and-shift is honest when hosting is the problem. Refactor in place is honest when behavior is sound but code structure hurts change. Wave redesign is the safer path when the system is operationally critical and parity must be proven per capability. A big-bang rewrite is defensible only when the domain is small enough to freeze and prove.
In legacy redesign instead of code rebuild, the alternative paths are not steps on a ladder. Each one carries a different mix of risk, cost, and learning. The weak choice is the one that hides the tradeoff until users, operators, or auditors discover it for you.
The Rule After the Rebuild Debate
The cleanest rewrite in Git can still be a failed modernization. Legacy risk lives in behavior, data ownership, batch side effects, and operational habit, not only in the old language or UI framework. The practical lesson is to demand evidence that fits legacy redesign instead of code rebuild, not a universal checklist. The artifact should expose writer ownership, parity evidence, wave boundaries, rollback, and retirement clearly enough for another team to challenge the decision.
If legacy redesign instead of code rebuild is the decision in front of your team, use the Modernization Snapshot to pressure-test the boundary before it hardens.