
Rationalization & Decommissioning

A practitioner's reference for IT portfolio rationalization, application assessment, migration planning, and systematic decommissioning of legacy systems.

01

Quick Reference

Decision frameworks, key acronyms, and common triggers at a glance. Use this section for fast lookups during planning and assessment meetings.

Rationalization Decision Tree

Walk through this flowchart to arrive at a TIME quadrant classification for any application in your portfolio.

                        +---------------------------+
                        |   Is the application      |
                        |   actively used?          |
                        +---------------------------+
                           /                  \
                         YES                   NO
                         /                      \
              +-----------------+        +-----------------+
              | Does it provide |        | Are there legal |
              | business value? |        | or compliance   |
              +-----------------+        | retention reqs? |
                /           \            +-----------------+
              YES            NO            /           \
              /               \          YES            NO
     +-------------+    +------------+     |        +-----------+
     | Is it       |    | Can users  |     |        | ELIMINATE |
     | technically |    | migrate to |  RETAIN      | (Retire)  |
     | sound?      |    | another    |  w/ minimal  +-----------+
     +-------------+    | system?    |  investment
      /        \       +------------+
    YES        NO        /       \
    /           \      YES       NO
+---------+ +----------+ +----------+
| INVEST  | | MIGRATE  | | TOLERATE |
| (Keep & | | (Move to | | (Maintain|
|  grow)  | |  modern) | |  as-is)  |
+---------+ +----------+ +----------+
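For teams that script their triage, the tree above can be encoded as a small function. This is a sketch: the boolean parameters are illustrative names, not fields from any particular inventory or CMDB.

```python
def classify(actively_used: bool,
             provides_value: bool = False,
             technically_sound: bool = False,
             users_can_migrate: bool = False,
             retention_required: bool = False) -> str:
    """Walk the decision tree above to an outcome label."""
    if not actively_used:
        # Unused: keep only if legal/compliance retention applies
        return "RETAIN" if retention_required else "ELIMINATE"
    if not provides_value:
        # Used but low-value: move users off if an alternative exists
        return "MIGRATE" if users_can_migrate else "TOLERATE"
    # Used and valuable: invest if healthy, otherwise modernize
    return "INVEST" if technically_sound else "MIGRATE"

print(classify(actively_used=False))                     # ELIMINATE
print(classify(actively_used=True, provides_value=True,
               technically_sound=False))                 # MIGRATE
```

Encoding the tree this way also makes the classification auditable: the same inputs always yield the same quadrant.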

Key Acronyms

TCO Total Cost of Ownership
Full lifecycle cost: licensing, infrastructure, support, labor, and opportunity cost
CMDB Configuration Management Database
Authoritative repository of IT asset records and their relationships
ITAM IT Asset Management
Discipline for tracking hardware, software, and cloud asset lifecycle
APM Application Portfolio Management
Governance framework for evaluating and rationalizing the application estate
EoL End of Life
Vendor-announced date when a product will no longer be sold or actively developed
EoS End of Support
Date when vendor ceases security patches and technical support
TIME Tolerate, Invest, Migrate, Eliminate
Gartner-originated model for categorizing applications by business value vs. technical fitness
6Rs Rehost, Replatform, Refactor, Repurchase, Retain, Retire
AWS migration strategy framework adopted industry-wide
CAB Change Advisory Board
Governance body that reviews and approves changes to production systems
RACI Responsible, Accountable, Consulted, Informed
Matrix for clarifying roles and decision rights in cross-functional work
SLA Service Level Agreement
Contract defining uptime, response time, and support commitments

Common Rationalization Triggers

Rationalization initiatives are typically catalyzed by one or more of these organizational events:

Mergers & Acquisitions
Duplicate systems from combined portfolios create redundancy. Post-M&A integration demands rapid assessment of overlapping capabilities.
Cloud Migration
Lift-and-shift deadlines force evaluation of every workload. Many apps are retired rather than migrated when their true usage is measured.
Cost Reduction
Budget pressure from leadership or economic downturn. Eliminating shelfware, unused licenses, and redundant infrastructure yields quick wins.
Compliance & Regulation
GDPR, HIPAA, SOX, or sector-specific mandates require knowing where data lives. Unsupported systems become compliance liabilities.
End of Life / End of Support
Vendor discontinues patches or support. Running EoS software increases security risk and may violate audit requirements.
Digital Transformation
Modernization initiatives replace legacy monoliths with microservices, SaaS, or low-code platforms. Old systems must be retired cleanly.
Technical Debt Reduction
Accumulated complexity from decades of customization. Rationalization breaks the cycle by eliminating systems that cost more to maintain than replace.

TIME Model Quadrants

Each application in your portfolio maps to one of four quadrants based on its business value and technical quality:

Tolerate Low Value / High Quality
The app works and is technically sound, but it is not strategic and not worth the disruption of replacement right now. Action: Minimize investment. Limit to break-fix maintenance only. Set a sunset date and communicate it.
Invest High Value / High Quality
Strategic, well-architected applications that drive competitive advantage. Action: Fund enhancements, scale the platform, and treat it as a core asset. Protect its health with monitoring and regular upgrades.
Migrate High Value / Low Quality
Critical business capability trapped in aging technology. Action: Plan migration to a modern platform. Evaluate the 6Rs to choose the right migration strategy. Prioritize based on risk and EoS dates.
Eliminate Low Value / Low Quality
Little business value and poor technical fitness. Action: Decommission. Archive data per retention policy, redirect users to alternatives, and reclaim infrastructure. Often the quickest cost savings.

The 6Rs at a Glance

When an application survives rationalization, choose the right migration path:

R1
Rehost (Lift & Shift)
Move the application as-is to new infrastructure (typically cloud IaaS). No code changes. Fastest path but does not modernize. Best for time-pressured data center exits.
R2
Replatform (Lift & Reshape)
Make targeted optimizations during migration: swap the database to a managed service, containerize the runtime, or update the OS. Moderate effort with meaningful gains.
R3
Refactor (Re-architect)
Redesign the application to take full advantage of cloud-native capabilities. Break monoliths into microservices, adopt serverless, or rebuild with modern frameworks. Highest effort but greatest long-term benefit.
R4
Repurchase (Replace with SaaS)
Drop the custom or on-prem solution and buy a commercial SaaS equivalent. Common for CRM, ITSM, HR, and ERP workloads. Requires data migration and user retraining.
R5
Retain (Revisit Later)
Keep the application where it is for now. Used when there are dependencies, compliance constraints, or insufficient business case to act. Set a review date.
R6
Retire (Decommission)
Switch off the application entirely. Archive data, notify stakeholders, reclaim licenses and infrastructure. This guide focuses heavily on executing this path correctly.
02

Portfolio Discovery

Before you can rationalize, you must know what you have. Discovery builds the authoritative application inventory that every downstream decision depends on.

Application Inventory Techniques

No single method catches everything. Combine automated and manual approaches for completeness:

Automated Discovery
Network scanning, agent-based discovery, and cloud API enumeration. Tools interrogate infrastructure to find running workloads, installed software, and open ports. Fast coverage but misses business context.
Manual Audit
Surveys, interviews with business unit owners, and review of existing documentation. Captures application purpose, business criticality, and user counts that automated tools cannot infer. Slow but essential for context.
Financial Analysis
Pull licensing costs, cloud spend, vendor contracts, and support invoices from procurement and finance systems. Cross-reference with discovered assets to build TCO. Often reveals forgotten subscriptions and shelfware.

CMDB: Key Attributes to Capture

Your Configuration Management Database becomes the single source of truth. At minimum, capture these attributes for every application:

Attribute           | Type       | Why It Matters
--------------------|------------|------------------------------------------------------------------
Application Name    | String     | Canonical name for unambiguous reference across teams
App ID              | Unique Key | Machine-readable identifier linking to all related CIs
Business Owner      | Contact    | Person accountable for business decisions about the app
Technical Owner     | Contact    | Person responsible for uptime, patching, and architecture
Technology Stack    | Tags       | Languages, frameworks, databases, and middleware in use
Annual Cost (TCO)   | Currency   | Infra + licenses + support + labor to run the application
Active Users        | Integer    | Monthly active users; zero signals an elimination candidate
Integrations        | List       | Upstream/downstream systems; determines decommission blast radius
Lifecycle Stage     | Enum       | Active / Sunset / Decommissioned / Under Review
EoL / EoS Dates     | Date       | Vendor end-of-life and end-of-support deadlines
Data Classification | Enum       | Public / Internal / Confidential / Restricted; governs retention rules
Hosting Environment | Enum       | On-prem / IaaS / PaaS / SaaS / Hybrid; affects migration path

Discovery Tools Comparison

Tool                 | Type                | Best For                          | Discovery Method          | Cost Model
---------------------|---------------------|-----------------------------------|---------------------------|-------------------------
ServiceNow Discovery | ITSM + CMDB         | Enterprises already on ServiceNow | Agent + agentless probes  | Per-node subscription
LeanIX               | EA / APM            | Application portfolio management  | Integration APIs + manual | SaaS per-user
AWS Migration Hub    | Cloud migration     | AWS-bound migrations              | Agent (ADS) + connector   | Free (AWS native)
Azure Migrate        | Cloud migration     | Azure-bound migrations            | Appliance-based scan      | Free (Azure native)
Flexera One          | ITAM / SAM          | License optimization, SaaS mgmt   | Agent + beacon + API      | Per-device subscription
Cloudockit           | Cloud documentation | Multi-cloud architecture diagrams | API read-only access      | SaaS subscription
Tip: multi-source correlation. No single tool captures the full picture. Best practice is to run automated discovery, then enrich results with financial data from procurement and business context from owner interviews.

Shadow IT Detection Methods

Up to 40% of enterprise IT spend occurs outside official channels. Finding shadow IT is critical for an accurate inventory.

CASB Analysis
Cloud Access Security Brokers (Netskope, Zscaler, McAfee MVISION) intercept traffic to identify unsanctioned SaaS usage. They provide risk scores and user counts per app.
SSO / IdP Logs
Review Okta, Azure AD, or Ping Identity logs for SAML/OIDC integrations. Apps appearing in SSO that are not in the CMDB are shadow IT candidates.
Expense Reports
Search corporate credit card and expense reimbursement data for recurring SaaS charges. Departments often procure tools independently through expense accounts.
Network Traffic Analysis
DNS logs and firewall records reveal external services being accessed. Unusual outbound connections to SaaS endpoints indicate unsanctioned tools.
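A minimal way to operationalize the SSO-vs-CMDB comparison is a set difference over normalized app names. This is a sketch assuming you have already exported the two name lists; the normalization rule is a deliberate simplification and will miss some vendor-name variants.

```python
import re

def find_shadow_it(sso_apps, cmdb_apps):
    """Return apps seen in SSO/CASB logs with no CMDB record."""
    def norm(name: str) -> str:
        # Crude normalization: lowercase, strip punctuation/whitespace
        return re.sub(r"[^a-z0-9]", "", name.lower())
    known = {norm(a) for a in cmdb_apps}
    return sorted({a for a in sso_apps if norm(a) not in known})

# Invented sample data: two SSO-visible apps are missing from the CMDB
sso_apps  = ["Salesforce", "Notion", "Trello", "Workday"]
cmdb_apps = ["Salesforce", "Workday"]
print(find_shadow_it(sso_apps, cmdb_apps))  # ['Notion', 'Trello']
```

Every name this diff surfaces is a shadow IT candidate to triage, not an automatic violation; some will simply be missing CMDB records.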

AWS Discovery CLI Commands

Use the AWS Application Discovery Service to enumerate workloads prior to migration:

aws-cli / discovery
# Export agent-collected discovery data continuously to S3 for analysis
aws discovery start-continuous-export

# List discovered servers with key attributes
aws discovery list-configurations \
    --configuration-type SERVER

# Export server details for offline analysis
aws discovery start-export-task \
    --export-data-format CSV

# Check export status
aws discovery describe-export-tasks

# List discovered applications (agent-grouped)
aws discovery list-configurations \
    --configuration-type APPLICATION

# Tag servers with rationalization decisions
aws discovery create-tags \
    --configuration-ids "d-server-01a2b3c4d5" \
    --tags key=RationalizationDecision,value=RETIRE

# Fetch network connection data between servers
aws discovery list-configurations \
    --configuration-type CONNECTION \
    --filters name=sourceServerId,values=d-server-01a2b3c4d5,condition=EQUALS

CMDB Schema Snippet

A minimal relational schema for tracking applications and their dependencies:

SQL / PostgreSQL
CREATE TABLE applications (
    app_id          VARCHAR(20) PRIMARY KEY,
    app_name        VARCHAR(255) NOT NULL,
    description     TEXT,
    business_owner  VARCHAR(255),
    technical_owner VARCHAR(255),
    business_unit   VARCHAR(100),
    criticality     VARCHAR(20) CHECK (criticality IN
                        ('Critical','High','Medium','Low')),
    lifecycle_stage VARCHAR(30) CHECK (lifecycle_stage IN
                        ('Active','Sunset','Decommissioned','Under Review')),
    hosting_env     VARCHAR(30) CHECK (hosting_env IN
                        ('On-Prem','IaaS','PaaS','SaaS','Hybrid')),
    tech_stack      TEXT[],           -- Array of technology tags
    annual_tco      NUMERIC(12,2),
    active_users    INTEGER DEFAULT 0,
    data_class      VARCHAR(30),
    eol_date        DATE,
    eos_date        DATE,
    time_category   VARCHAR(20),      -- Tolerate/Invest/Migrate/Eliminate
    six_r_strategy  VARCHAR(20),      -- Rehost/Replatform/Refactor/etc.
    created_at      TIMESTAMPTZ DEFAULT NOW(),
    updated_at      TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE app_dependencies (
    id              SERIAL PRIMARY KEY,
    source_app_id   VARCHAR(20) REFERENCES applications(app_id),
    target_app_id   VARCHAR(20) REFERENCES applications(app_id),
    dependency_type VARCHAR(50),      -- API, Database, File, Message Queue
    protocol        VARCHAR(50),      -- REST, SOAP, JDBC, SFTP, AMQP
    data_flow       VARCHAR(20) CHECK (data_flow IN
                        ('Inbound','Outbound','Bidirectional')),
    criticality     VARCHAR(20),
    description     TEXT,
    UNIQUE(source_app_id, target_app_id, dependency_type)
);

CREATE TABLE app_costs (
    id              SERIAL PRIMARY KEY,
    app_id          VARCHAR(20) REFERENCES applications(app_id),
    cost_category   VARCHAR(50),      -- License, Infrastructure, Support, Labor
    annual_amount   NUMERIC(12,2),
    vendor          VARCHAR(255),
    contract_end    DATE,
    notes           TEXT
);

-- Useful indexes for rationalization queries
CREATE INDEX idx_apps_lifecycle ON applications(lifecycle_stage);
CREATE INDEX idx_apps_time ON applications(time_category);
CREATE INDEX idx_apps_eol ON applications(eol_date);
CREATE INDEX idx_deps_source ON app_dependencies(source_app_id);
CREATE INDEX idx_deps_target ON app_dependencies(target_app_id);

Phase 1 Discovery Checklist (Weeks 1-4)

  • Identify executive sponsor and secure mandate for discovery. Without top-down authority, business units may not cooperate with data requests.
  • Deploy automated discovery agents across all network segments. Cover production, staging, DR, and DMZ environments.
  • Extract software asset data from SCCM / Intune / Jamf. Desktop and endpoint software is often overlooked.
  • Pull cloud resource inventories from AWS, Azure, and GCP accounts. Use native tools: AWS Config, Azure Resource Graph, GCP Asset Inventory.
  • Request vendor contract and licensing data from Procurement. Include renewal dates, termination clauses, and per-seat costs.
  • Distribute application owner survey to all business units. Keep surveys short: app name, purpose, users, criticality, alternatives.
  • Correlate CASB and SSO logs to detect shadow IT. Flag any discovered app not present in the CMDB.
  • Establish the CMDB schema and begin populating records. Start with automated data; enrich with manual survey results in weeks 3-4.
  • Validate discovered inventory with infrastructure and security teams. Cross-check against firewall rules, DNS records, and load balancer configs.
  • Publish Phase 1 Discovery Report with app count and coverage metrics. The report should include confidence level and known gaps.
03

Assessment Frameworks

Structured scoring models that transform subjective opinions into defensible, data-driven rationalization decisions. These frameworks create consistency across hundreds of applications.

TIME Model Deep Dive

The TIME model plots every application on a 2x2 matrix. The Y-axis represents business value (strategic alignment, revenue contribution, user satisfaction). The X-axis represents technical fitness (architecture quality, security posture, maintainability, performance).

                        TECHNICAL FITNESS
                   Low                    High
               +------------------+------------------+
               |                  |                  |
   High        |    MIGRATE       |    INVEST        |
               |                  |                  |
               |  High value but  |  Strategic and   |
  BUSINESS     |  poor tech.      |  well-built.     |
  VALUE        |  Modernize or    |  Fund growth.    |
               |  replatform.     |                  |
               +------------------+------------------+
               |                  |                  |
   Low         |    ELIMINATE     |    TOLERATE      |
               |                  |                  |
               |  No value, bad   |  Works fine but  |
               |  tech. Retire    |  not strategic.  |
               |  immediately.    |  Minimal invest. |
               |                  |                  |
               +------------------+------------------+
Common mistake: skipping the Tolerate quadrant. Many teams want to act on every application, but Tolerate is a valid strategy. Some apps cost little, serve a niche function, and would cause more disruption to replace than to maintain. Assign a review date and move on.
Typical portfolio distribution after assessment:

  • Eliminate: ~30% of apps in large portfolios
  • Tolerate: ~25% (minimal investment needed)
  • Migrate: ~25% (require modernization)
  • Invest: ~20% (strategic growth targets)

Business Value Scoring Criteria

Score each application from 1 (lowest) to 5 (highest) across these dimensions, then compute a weighted average:

Criterion            | Weight | 1 (Low)                                  | 5 (High)
---------------------|--------|------------------------------------------|------------------------------------------------
Strategic Alignment  | 30%    | No link to any strategic initiative      | Directly enables top-3 business objective
Business Criticality | 25%    | Failure has zero operational impact      | Outage stops revenue or violates regulations
Utilization          | 20%    | 0 active users in past 90 days           | Used daily by 500+ users across units
Revenue Impact       | 15%    | No revenue link; pure cost center        | Directly generates or enables revenue stream
User Satisfaction    | 10%    | Frequent complaints; workarounds common  | High NPS; users actively advocate for the tool

Technical Fitness Scoring Criteria

Evaluate the technical health of each application across these dimensions:

Criterion                 | Weight | 1 (Low)                                     | 5 (High)
--------------------------|--------|---------------------------------------------|--------------------------------------------------
Architecture Quality      | 25%    | Monolith, no API, tightly coupled           | Modular, well-documented APIs, loosely coupled
Security Posture          | 25%    | Known CVEs, no patching, EoS components     | Current patches, pen-tested, compliant
Performance & Reliability | 20%    | Frequent outages, slow response times       | 99.9%+ uptime, sub-second response
Maintainability           | 15%    | No docs, no tests, single expert dependency | Well-documented, CI/CD, multiple maintainers
Scalability               | 15%    | Cannot handle 2x current load               | Auto-scales horizontally, elastic infrastructure

Scoring Formulas

Compute composite scores and map them to TIME quadrants using threshold values:

Formula
# Business Value Score (BVS)
BVS = (Strategic_Alignment * 0.30)
    + (Business_Criticality * 0.25)
    + (Utilization * 0.20)
    + (Revenue_Impact * 0.15)
    + (User_Satisfaction * 0.10)

# Technical Fitness Score (TFS)
TFS = (Architecture_Quality * 0.25)
    + (Security_Posture * 0.25)
    + (Performance_Reliability * 0.20)
    + (Maintainability * 0.15)
    + (Scalability * 0.15)

# TIME Quadrant Assignment (threshold = 3.0)
if BVS >= 3.0 and TFS >= 3.0:  INVEST
if BVS >= 3.0 and TFS <  3.0:  MIGRATE
if BVS <  3.0 and TFS >= 3.0:  TOLERATE
if BVS <  3.0 and TFS <  3.0:  ELIMINATE

# Composite Score (for ranked prioritization)
Composite = (BVS * 0.55) + (TFS * 0.45)
Calibration tip: Run a calibration session with 10 representative apps across the portfolio. Score them independently, then compare results as a group. Adjust weight percentages if scores cluster too tightly or if a criterion does not differentiate well.

Assessment Questionnaire: Key Questions

Use these questions during stakeholder interviews to gather the data needed for scoring:

Business Context
  1. What business process does this application support?
  2. What happens if this application is unavailable for 4 hours? 24 hours? 1 week?
  3. How many unique users accessed this application in the last 90 days?
  4. Does this application directly generate or enable revenue? If so, estimate the annual amount.
  5. Which strategic initiatives (if any) depend on this application?
  6. Is there an alternative system that provides similar or overlapping capability?
Technical Health
  1. What is the primary technology stack (language, framework, database, OS)?
  2. When was the last security patch applied? Are there known unpatched CVEs?
  3. Is the application vendor-supported? If so, what is the EoL/EoS date?
  4. How many people can maintain this application? Is there a single point of expertise?
  5. Does the application have automated tests and a CI/CD pipeline?
  6. What is the average monthly uptime over the past 12 months?
Cost & Dependencies
  1. What is the total annual cost to run this application (infrastructure + licenses + support + labor)?
  2. List all systems that send data to or receive data from this application.
  3. What data does this application store? What is the data classification level?
  4. Are there regulatory or contractual requirements that mandate keeping this data for a specific period?
  5. What is the vendor contract renewal date and cancellation notice period?

Python Assessment Scoring Script

Automate TIME quadrant assignment from CSV survey data:

Python 3
#!/usr/bin/env python3
"""
Rationalization Scoring Engine
Reads application survey data and assigns TIME quadrants.
Input:  CSV with columns for each scoring criterion (1-5 scale)
Output: CSV with BVS, TFS, TIME quadrant, and composite score
"""
import csv
import sys
from dataclasses import dataclass
from typing import List

# --- Weight Configuration ---
BV_WEIGHTS = {
    "strategic_alignment": 0.30,
    "business_criticality": 0.25,
    "utilization": 0.20,
    "revenue_impact": 0.15,
    "user_satisfaction": 0.10,
}

TF_WEIGHTS = {
    "architecture_quality": 0.25,
    "security_posture": 0.25,
    "performance_reliability": 0.20,
    "maintainability": 0.15,
    "scalability": 0.15,
}

QUADRANT_THRESHOLD = 3.0


@dataclass
class AppAssessment:
    app_id: str
    app_name: str
    bvs: float = 0.0
    tfs: float = 0.0
    quadrant: str = ""
    composite: float = 0.0


def compute_weighted_score(row: dict, weights: dict) -> float:
    """Compute weighted average from survey responses."""
    score = 0.0
    for criterion, weight in weights.items():
        value = float(row.get(criterion, 0))
        value = max(1.0, min(5.0, value))  # Clamp to 1-5
        score += value * weight
    return round(score, 2)


def assign_quadrant(bvs: float, tfs: float) -> str:
    """Map scores to TIME quadrant."""
    if bvs >= QUADRANT_THRESHOLD and tfs >= QUADRANT_THRESHOLD:
        return "INVEST"
    elif bvs >= QUADRANT_THRESHOLD and tfs < QUADRANT_THRESHOLD:
        return "MIGRATE"
    elif bvs < QUADRANT_THRESHOLD and tfs >= QUADRANT_THRESHOLD:
        return "TOLERATE"
    else:
        return "ELIMINATE"


def process_portfolio(input_file: str, output_file: str) -> List[AppAssessment]:
    """Process all applications and write scored output."""
    results = []

    with open(input_file, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            app = AppAssessment(
                app_id=row["app_id"],
                app_name=row["app_name"],
            )
            app.bvs = compute_weighted_score(row, BV_WEIGHTS)
            app.tfs = compute_weighted_score(row, TF_WEIGHTS)
            app.quadrant = assign_quadrant(app.bvs, app.tfs)
            app.composite = round(app.bvs * 0.55 + app.tfs * 0.45, 2)
            results.append(app)

    # Sort by composite score (lowest first = highest priority to act)
    results.sort(key=lambda a: a.composite)

    # Write output
    with open(output_file, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["app_id", "app_name", "bvs", "tfs",
                          "quadrant", "composite"])
        for app in results:
            writer.writerow([app.app_id, app.app_name, app.bvs,
                             app.tfs, app.quadrant, app.composite])

    # Print summary
    quadrant_counts = {}
    for app in results:
        quadrant_counts[app.quadrant] = quadrant_counts.get(
            app.quadrant, 0) + 1

    print(f"\n{'='*50}")
    print(f"  Portfolio Assessment Summary")
    print(f"  Total Applications: {len(results)}")
    print(f"{'='*50}")
    for q in ["INVEST", "MIGRATE", "TOLERATE", "ELIMINATE"]:
        count = quadrant_counts.get(q, 0)
        pct = (count / len(results) * 100) if results else 0
        bar = "#" * int(pct / 2)
        print(f"  {q:10s}  {count:4d}  ({pct:5.1f}%)  {bar}")
    print(f"{'='*50}\n")

    return results


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python assess.py input.csv output.csv")
        sys.exit(1)
    process_portfolio(sys.argv[1], sys.argv[2])

Assessment Best Practices

Score in cohorts, not isolation: Assess applications in groups of 10-15 from the same business domain. This provides natural comparison points and helps calibrate scoring consistency across assessors.
Watch for ownership bias: Application owners tend to overstate business value and understate technical debt. Use data (usage metrics, incident counts, CVE scans) to validate subjective responses. A second assessor improves objectivity.
Never skip dependency mapping before eliminating: An app that looks unused may still provide a critical data feed, authentication service, or file-transfer function to another system. Section 05 covers dependency analysis in detail.
Document the "why" for every decision: Months later, someone will ask why App X was tagged for elimination. Capture the assessment rationale in the CMDB record alongside the scores. Include who was interviewed, what data was examined, and the date of assessment.
04

Business Case & Cost Analysis

Rationalization only happens when leadership sees the numbers. This section covers how to build a defensible financial case with TCO models, ROI projections, and industry benchmarks that survive executive scrutiny.

TCO Calculation Methodology

Total Cost of Ownership extends far beyond the license fee. A rigorous TCO model accounts for every dollar spent to keep an application running, including costs that never appear on the application's budget line.

Cost Category           | Components                                                                    | Example Annual Cost
------------------------|-------------------------------------------------------------------------------|--------------------
Infrastructure          | Servers (physical/virtual), storage, network, load balancers, DR replication | $120,000 - $400,000
Licensing               | Software licenses, SaaS subscriptions, database licenses, middleware          | $50,000 - $250,000
Support Contracts       | Vendor premium support, extended support (post-EoS), managed services         | $30,000 - $150,000
Internal Labor          | FTEs for operations, development, DBA, security, project management           | $180,000 - $500,000
Training                | Onboarding new staff, vendor certification, ongoing skills development        | $10,000 - $40,000
Opportunity Cost        | Engineering time spent on legacy maintenance instead of innovation            | $80,000 - $300,000
Technical Debt Interest | Escalating cost of workarounds, security patches, compatibility shims         | $25,000 - $100,000
Integration Maintenance | Keeping point-to-point integrations alive as connected systems change         | $15,000 - $60,000
Hidden costs of parallel running: During any migration or decommission, you will run old and new systems simultaneously for weeks or months. This overlap adds 10-20% to your total migration budget. Budget for dual licensing, dual infrastructure, dual support, and the labor to keep both environments synchronized.

Additional hidden costs that are frequently underestimated:

Data Migration Tooling
ETL scripts, data transformation tools, validation frameworks, and reconciliation reports. Often requires specialized consultants for legacy formats.
Extended License Overlap
Vendor contracts rarely align with migration timelines. You may pay for 6-12 months of licenses on a system that is being decommissioned while the contract runs out.
User Retraining
Migrating users to a replacement system requires training materials, sessions, help desk surge capacity, and productivity loss during the transition curve.
Consultant & Contractor Fees
Specialized knowledge for legacy platforms (mainframe, COBOL, proprietary ERP) often requires external consultants at premium rates during decommission.
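Putting the cost table and the parallel-running warning together, a budget calculation might be sketched as follows. The category amounts, uplift, and contingency values here are illustrative assumptions, not benchmarks:

```python
def total_tco(costs: dict) -> float:
    """Annual TCO: sum every cost category, visible and hidden."""
    return sum(costs.values())

def migration_budget(base_cost: float,
                     parallel_uplift: float = 0.15,
                     contingency: float = 0.25) -> float:
    """Apply a parallel-running uplift (10-20% range) and a contingency
    buffer (20-30% range) on top of the estimated one-time cost."""
    return base_cost * (1 + parallel_uplift) * (1 + contingency)

# Illustrative category amounts for a single mid-size application
costs = {
    "infrastructure": 180_000, "licensing": 90_000,
    "support_contracts": 45_000, "internal_labor": 260_000,
    "training": 15_000, "opportunity_cost": 120_000,
    "tech_debt_interest": 40_000, "integration_maint": 25_000,
}
print(total_tco(costs))                  # 775000
print(round(migration_budget(400_000)))  # 575000
```

The compounding of uplift and contingency is deliberate: a $400K estimate becomes a $575K budget, which is why hidden costs so often cause 20-30% overruns when they are not modeled up front.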

ROI Modeling

The return on investment formula for rationalization is straightforward, but the inputs require careful estimation. Conservative assumptions build credibility with finance teams.

ROI Formula
# ================================================
# ROI Calculation for Application Rationalization
# ================================================

# Step 1: Calculate Annual Savings
#   Sum of all costs eliminated when the app is retired
annual_savings = (
    infrastructure_cost       # Servers, storage, network freed
  + license_cost              # Licenses terminated or reallocated
  + support_contract_cost     # Vendor support cancelled
  + labor_cost_reduction      # FTE hours redirected to other work
  + integration_maintenance   # Point-to-point integrations removed
)

# Step 2: Calculate One-Time Migration Cost
migration_cost = (
    data_migration_labor      # ETL development, validation, testing
  + parallel_running_cost     # Dual-environment period (10-20% budget)
  + user_retraining           # Training materials, sessions, downtime
  + consultant_fees           # External expertise for legacy platforms
  + tooling_and_automation    # Migration scripts, reconciliation tools
  + project_management        # PM, change management, communications
  + contingency_buffer        # 20-30% of above for unknowns
)

# Step 3: Compute ROI
roi_percent = ((annual_savings - migration_cost) / migration_cost) * 100

# Step 4: Break-Even Analysis
break_even_months = migration_cost / (annual_savings / 12)

# ================================================
# Example: Legacy CRM Decommission
# ================================================
annual_savings   = 340_000   # $340K/year total cost eliminated
migration_cost   = 480_000   # $480K one-time migration cost

roi_year_1 = ((340_000 - 480_000) / 480_000) * 100  # -29.2% (still paying off)
roi_year_2 = ((680_000 - 480_000) / 480_000) * 100  # +41.7% (cumulative savings)
roi_year_3 = ((1_020_000 - 480_000) / 480_000) * 100 # +112.5%

break_even = 480_000 / (340_000 / 12)  # 16.9 months
Break-even benchmark: Major rationalization programs typically break even in 3-4 years. Individual application retirements with low migration complexity can break even in 6-12 months. Present the 3-year and 5-year cumulative savings to show the compounding value.
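The worked CRM example generalizes directly into two small functions; this sketch reproduces the same figures as the formula block above:

```python
def cumulative_roi(annual_savings: float, migration_cost: float,
                   years: int) -> float:
    """Cumulative ROI % after N years of recurring savings."""
    return (annual_savings * years - migration_cost) / migration_cost * 100

def break_even_months(annual_savings: float, migration_cost: float) -> float:
    """Months until cumulative savings cover the one-time cost."""
    return migration_cost / (annual_savings / 12)

# Legacy CRM figures: $340K/yr savings against a $480K one-time cost
for yr in (1, 2, 3):
    print(f"Year {yr}: {cumulative_roi(340_000, 480_000, yr):+.1f}%")
print(f"Break-even: {break_even_months(340_000, 480_000):.1f} months")
```

Running this prints the -29.2% / +41.7% / +112.5% trajectory and the 16.9-month break-even from the example.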

Real-World Case Studies

Documented results from organizations that have executed rationalization programs at scale:

Trinity Health
Retired 740+ applications across a multi-hospital system. Achieved $68M in recurring annual savings by eliminating redundant clinical and administrative systems post-merger. Multi-year program driven by M&A integration.
Fortune 100 Financial Services
Targeted 450 applications for rationalization as part of a cloud migration initiative. Delivered $7M+ in annual savings in the first wave by retiring redundant trading and reporting platforms.
Real Estate Firm
Rationalized 120 applications across property management, leasing, and back-office functions. Achieved $1.4M annual savings and reduced integration complexity by 40%.
Global Bank — License Discovery
License audit discovered 66,000 unused software licenses across the enterprise. Reclamation and termination yielded $4.8M in immediate savings without decommissioning a single application.
SaaS Rationalization Initiative
Mid-size enterprise audited SaaS subscriptions across all departments. Found 53% of licenses unused. Consolidated overlapping tools and achieved $462K annual savings (33% reduction in SaaS spend).

Industry Benchmarks

Reference data points for benchmarking your rationalization program against industry norms:

  • 15-30% average IT cost reduction from portfolio rationalization programs
  • 53% of SaaS licenses unused; $21M average waste per enterprise
  • 20-30% budget overrun risk from hidden costs; add contingency accordingly
  • 80% of CIOs see rationalization as critical, but only 20% have fully implemented it

Cost-Benefit Template

Use this template to structure the financial case for each rationalization candidate. Fill in actual values from discovery and assessment data:

Line Item | Category | Formula / Source | Annual Value
Infrastructure savings | Benefit | Server + storage + network costs from CMDB | $___,___
License termination | Benefit | Contract value from Procurement records | $___,___
Support contract savings | Benefit | Vendor support fees (check cancellation terms) | $___,___
Labor reallocation | Benefit | FTE hours x blended rate (ops + dev + DBA) | $___,___
Risk reduction value | Benefit | P(breach) x impact estimate for EoS systems | $___,___
Data migration cost | Cost | ETL development + validation + testing labor | ($___,___)
Parallel running cost | Cost | Dual environment x estimated months x 1.15 | ($___,___)
Retraining cost | Cost | User count x hours x blended rate + materials | ($___,___)
Project management | Cost | PM FTE allocation x duration + change mgmt | ($___,___)
Contingency (25%) | Cost | Sum of costs above x 0.25 | ($___,___)
Net annual benefit | Net | Total benefits - amortized costs | $___,___
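To make the template concrete, here is a minimal sketch that rolls line items up to a net annual benefit. The figures, category names, and 3-year amortization horizon are illustrative assumptions, not values from the guide:

```python
def net_annual_benefit(benefits: dict, one_time_costs: dict,
                       contingency_rate: float = 0.25,
                       amortization_years: int = 3) -> float:
    """Total annual benefits minus one-time costs (with contingency),
    amortized over an assumed payback horizon."""
    total_benefit = sum(benefits.values())
    total_cost = sum(one_time_costs.values()) * (1 + contingency_rate)
    return total_benefit - total_cost / amortization_years

# Placeholder figures for a single rationalization candidate
example = net_annual_benefit(
    benefits={"infrastructure": 120_000, "licenses": 90_000, "support": 60_000,
              "labor": 55_000, "risk_reduction": 15_000},
    one_time_costs={"data_migration": 150_000, "parallel_run": 60_000,
                    "retraining": 40_000, "project_mgmt": 50_000},
)
```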

Tracking & Reporting Savings

Track actual vs. projected savings: Create a savings tracker that records projected savings at decision time, then validates actual savings at 6-month and 12-month milestones. Finance teams respect programs that can demonstrate accountability. Report variance with explanations.
Beware of "paper savings": Eliminating an application does not save money unless the underlying resources are actually deprovisioned. If the server still runs, the license still renews, or the FTE is not redeployed, the saving is illusory. Mandate resource deprovisioning as a required step in every decommission runbook.
05

Dependency Mapping

Understanding application interconnections is the difference between a clean decommission and a cascading failure. Map dependencies before making any changes to the production portfolio.

Why Dependencies Matter

Every application exists in a web of connections. Removing one node without understanding its edges creates unpredictable failures downstream. Dependency mapping is the single most important risk-mitigation activity in any rationalization program.

Blast Radius
Decommissioning an application affects every system that depends on it. Without a dependency map, you cannot estimate the blast radius. A seemingly low-value application may be a critical data provider to five high-value systems.
Cascading Failures
System A calls System B, which calls System C. Retiring System C does not just break B; it breaks A too, and potentially everything upstream of A. Indirect dependencies are invisible without systematic analysis.
Hidden Integrations
Tribal knowledge integrations, undocumented batch jobs, SFTP file drops scheduled at 2 AM, and shared database views. These shadow integrations rarely appear in architecture diagrams but will surface loudly on decommission day.

Dependency Discovery Techniques

No single method finds all dependencies. Use a layered approach that combines automated detection with human knowledge:

Network Traffic Analysis
Capture and analyze network flows for 30+ days to establish a baseline of all connections to and from the target application. Use NetFlow, packet captures, or firewall logs. This catches runtime dependencies that no documentation or code scan will reveal, including monthly and quarterly batch processes that only run periodically.
APM Instrumentation
Application Performance Monitoring tools like Dynatrace, AppDynamics, and New Relic automatically discover service-to-service calls, database connections, and external API dependencies. They provide call frequency, latency, and error rates. Best for applications that already have agents deployed.
Code Analysis & Static Scanning
Scan source code and configuration files for connection strings, API endpoint URLs, queue names, and service references. Tools like SonarQube, Semgrep, or custom regex scans can extract hard-coded dependencies. Effective for applications where source code is accessible and version-controlled.
Stakeholder Interviews & Tribal Knowledge
Interview application owners, developers, and operations staff. Ask specifically: "What breaks if this goes away?" Long-tenured staff often know about integrations that predate current documentation. Capture this knowledge formally before institutional memory is lost to attrition.
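The network traffic analysis technique above can be sketched as a simple aggregation over flow records. The (day, src, dst, port) tuple schema and the 30-day threshold check are assumptions for illustration, not the output format of any particular NetFlow tool:

```python
def baseline_connections(flows, target_ip, min_days=30):
    """Aggregate flow records into a map of (peer, port) -> days seen.
    Each flow is an assumed (day, src_ip, dst_ip, dst_port) tuple."""
    days_seen = {}
    observed_days = set()
    for day, src, dst, port in flows:
        observed_days.add(day)
        if target_ip in (src, dst):
            peer = dst if src == target_ip else src
            days_seen.setdefault((peer, port), set()).add(day)
    if len(observed_days) < min_days:
        raise ValueError(f"Only {len(observed_days)} days captured; need {min_days}+")
    # Peers seen on only a few days may be monthly or quarterly batch
    # processes -- they are dependencies, not noise.
    return {edge: len(days) for edge, days in days_seen.items()}
```

An edge seen once in 30 days deserves as much scrutiny as one seen daily.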

Dependency Types

Classify each discovered dependency by its integration pattern. The type determines the mitigation strategy required before decommission:

Type | Protocol | Detection Method | Decommission Risk
API Calls | REST, gRPC, SOAP, GraphQL | Network analysis, APM traces, API gateway logs | High — Synchronous; callers fail immediately
Database Shared Access | JDBC, ODBC, direct SQL | DB connection logs, audit trails, code scan | High — Other apps reading/writing same tables
File Transfers | SFTP, S3, NFS, SMB | File server logs, cron schedules, S3 access logs | Medium — Asynchronous; may fail silently
Message Queues | Kafka, RabbitMQ, SQS, JMS | Broker admin consoles, consumer group listings | Medium — Messages queue up; delayed impact
Batch Jobs | Cron, Autosys, Control-M, Airflow | Scheduler job definitions, ops runbooks | Medium — Fails on next scheduled run
Shared Authentication | SSO, LDAP, SAML, OIDC, Kerberos | IdP configuration, federation metadata | High — Users locked out across systems
Shared Libraries / SDKs | JAR, NuGet, npm, pip | Dependency manifests, build scripts | Low — Versioned; local copies persist
DNS / Load Balancer | DNS CNAME, VIP, reverse proxy | DNS zone files, LB configs, certificate SANs | Medium — Traffic routes to dead endpoint

Upstream vs. Downstream Impact

For every application being evaluated, analyze impact in both directions:

Upstream: Who Depends on This?

Upstream dependencies are the consumers of this application's data or services. These are the systems that break if this application goes away.

  • Which systems call this application's APIs?
  • Which systems read from this application's database?
  • Which dashboards or reports pull data from here?
  • Which downstream processes consume files this app produces?
  • Are there SSO/auth dependencies where this app is the IdP?

Impact: Upstream dependency count determines the blast radius. High upstream count = high decommission risk.

Downstream: What Does This Depend On?

Downstream dependencies are the services and data sources this application relies on. These are the systems that must stay running if this app is retained or migrated.

  • Which databases, APIs, or services does this app call?
  • What authentication systems does it use to verify users?
  • What message queues or event streams does it subscribe to?
  • Are there shared infrastructure dependencies (DNS, LB, certs)?
  • What third-party services or SaaS APIs are integrated?

Impact: Downstream dependencies constrain the migration order. You cannot migrate this app before its downstream services are ready.
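The two directional queries can be sketched with a small adjacency structure, following the guide's convention that upstream means consumers. The class and method names are hypothetical:

```python
from collections import defaultdict

class DependencyGraph:
    def __init__(self):
        self._calls = defaultdict(set)      # source -> targets it depends on
        self._callers = defaultdict(set)    # target -> sources that depend on it

    def add_edge(self, source, target):
        """Record that `source` depends on `target`."""
        self._calls[source].add(target)
        self._callers[target].add(source)

    def upstream(self, app):
        """Consumers of `app` -- systems that break if it goes away."""
        return self._callers[app]

    def downstream(self, app):
        """Providers `app` relies on -- must stay up if it is retained."""
        return self._calls[app]

g = DependencyGraph()
g.add_edge("OrderService", "PricingEngine")
g.add_edge("Dashboard", "PricingEngine")
```

Maintaining both directions on every insert keeps each query O(1) instead of scanning all edges.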

Dependency Mapping Tools

Tool | Focus | Discovery Method | Strengths | Limitations
ServiceNow CSDM | Operations | Agent + CMDB relationships | Deep ITSM integration, change impact analysis | Requires mature CMDB; manual relationship entry
Ardoq | Architecture | API integrations + manual modeling | Visual dependency graphs, what-if analysis | Relies on imported data quality; no runtime discovery
Dynatrace | Runtime | OneAgent auto-discovery | Real-time topology, AI-powered baselining | Agent deployment required; cost scales with hosts
AppDynamics | Runtime | Agent-based instrumentation | Business transaction tracing, flow maps | Java/.NET focus; limited for non-standard stacks
Faddom | Infrastructure | Agentless network analysis | No agents needed, fast deployment, hybrid cloud | Network-level only; no code-level dependency detail

Blast Radius Calculation

Quantify the potential impact of decommissioning an application by computing its blast radius score. This combines the number of dependent systems with their criticality:

Pseudocode / Blast Radius Analysis
# ================================================
# Blast Radius Scoring for Decommission Candidates
# ================================================

def calculate_blast_radius(app_id, dependency_graph):
    """
    Walk the dependency graph to find all systems
    that would be impacted by removing this application.
    Returns a risk score and list of affected systems.
    """
    affected = set()
    queue = [app_id]

    # BFS traversal of upstream dependencies
    while queue:
        current = queue.pop(0)
        upstream = dependency_graph.get_upstream(current)
        for dep in upstream:
            if dep.target_id not in affected:
                affected.add(dep.target_id)
                queue.append(dep.target_id)  # Follow chain

    # Score each affected system by criticality weight
    CRITICALITY_WEIGHTS = {
        "Critical": 10,   # Revenue-generating, regulatory
        "High":      5,   # Core business process
        "Medium":    2,   # Departmental tool
        "Low":       1,   # Convenience / reporting
    }

    blast_score = 0
    for system_id in affected:
        system = get_application(system_id)
        weight = CRITICALITY_WEIGHTS.get(system.criticality, 1)
        blast_score += weight

    return {
        "app_id":          app_id,
        "affected_count":  len(affected),
        "blast_score":     blast_score,
        "affected_systems": list(affected),
        "risk_level":      classify_risk(blast_score),
    }

def classify_risk(score):
    """Map blast score to risk tier."""
    if score >= 20:  return "CRITICAL"   # Requires exec approval
    if score >= 10:  return "HIGH"       # Requires architect review
    if score >= 5:   return "MEDIUM"     # Standard CAB approval
    return "LOW"                         # Team-level decision

# --- Risk Matrix ---
# Likelihood: How likely is the dependency to cause failure?
#   HIGH   = Synchronous call, no fallback
#   MEDIUM = Async/batch, partial fallback exists
#   LOW    = Shared library, cached data, redundant path
#
# Impact: What is the business impact if this dependency breaks?
#   CRITICAL = Revenue loss, regulatory violation
#   HIGH     = Major business process disruption
#   MEDIUM   = Departmental impact, workaround available
#   LOW      = Inconvenience, cosmetic, reporting delay
#
#              Impact
#            LOW  MED  HIGH CRIT
# Likelihood +----|----|----|----+
# HIGH       | M  | H  | C  | C  |
# MEDIUM     | L  | M  | H  | C  |
# LOW        | L  | L  | M  | H  |
#            +----|----|----|----+
# L=Low, M=Medium, H=High, C=Critical
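The likelihood x impact matrix in the comments above can be encoded directly as a lookup table. A minimal sketch:

```python
# Risk matrix from the comments above: likelihood (rows) x impact (columns)
RISK_MATRIX = {
    "HIGH":   {"LOW": "MEDIUM", "MEDIUM": "HIGH",   "HIGH": "CRITICAL", "CRITICAL": "CRITICAL"},
    "MEDIUM": {"LOW": "LOW",    "MEDIUM": "MEDIUM", "HIGH": "HIGH",     "CRITICAL": "CRITICAL"},
    "LOW":    {"LOW": "LOW",    "MEDIUM": "LOW",    "HIGH": "MEDIUM",   "CRITICAL": "HIGH"},
}

def dependency_risk(likelihood: str, impact: str) -> str:
    """Risk tier for a single dependency edge."""
    return RISK_MATRIX[likelihood.upper()][impact.upper()]
```

Feeding each edge's assessed likelihood and impact through this lookup yields the per-edge risk tier used to prioritize mitigation work.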

Dependency Graph Notation

For each edge in your dependency graph, document these attributes to enable accurate impact analysis:

Attribute | Description | Example
Source | The calling/consuming application | APP-0042 (Order Service)
Target | The called/providing application | APP-0108 (Pricing Engine)
Protocol | Communication method and format | REST/JSON over HTTPS
Direction | Data flow direction (inbound, outbound, bidirectional) | Outbound (Order → Pricing)
Volume | Calls per day, messages per hour, or data volume | ~12,000 calls/day
Latency Sensitivity | How quickly does the caller need a response? | Synchronous, <200ms SLA
Criticality | Business impact if this edge is severed | Critical — orders cannot be priced
Fallback Behavior | What happens when the dependency is unavailable? | Circuit breaker; uses cached prices for 15 min
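As one way to record these edge attributes, here is a hedged sketch using a Python dataclass; the field names, and the example values taken from the table, are illustrative:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DependencyEdge:
    source: str                   # calling/consuming application
    target: str                   # called/providing application
    protocol: str                 # e.g. "REST/JSON over HTTPS"
    direction: str                # "inbound" | "outbound" | "bidirectional"
    volume_per_day: int           # calls or messages per day
    latency_sla_ms: Optional[int] # None for async edges
    criticality: str              # "Low" | "Medium" | "High" | "Critical"
    fallback: str                 # behavior when the target is unavailable

# Example edge from the table above
edge = DependencyEdge(
    source="APP-0042", target="APP-0108",
    protocol="REST/JSON over HTTPS", direction="outbound",
    volume_per_day=12_000, latency_sla_ms=200,
    criticality="Critical", fallback="circuit breaker; cached prices for 15 min",
)
```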
Never decommission without 30+ days of traffic analysis: Monthly batch jobs, quarterly reporting processes, and annual reconciliation tasks will not appear in a one-week observation window. A minimum of 30 days of network traffic capture is required before declaring that an application has no active dependencies. For financial systems, extend this to 90 days to cover quarter-end processing.
The "dark launch" validation pattern: Before decommissioning, route all traffic through a proxy that logs every inbound request and returns errors instead of live responses. This "dark mode" surfaces callers you missed during discovery, without the risk of silently serving stale data.
06

Migration Strategies

The 6Rs framework provides a structured vocabulary for choosing the right migration path for every application in the portfolio. Wrong strategy selection is the leading cause of migration program failure.

The 6Rs Framework

Originally developed by Gartner (as the 5Rs) and expanded by AWS, the 6Rs framework categorizes every possible migration outcome. Each application should be assigned exactly one R based on its business value, technical fitness, dependencies, and organizational constraints. The right choice balances speed, cost, risk, and long-term strategic value.

Wrong strategy selection is the #1 cause of migration failure: Choosing to rehost an application that needs refactoring creates cloud-hosted legacy debt. Choosing to refactor an application that should be retired wastes months of engineering effort. Invest time in assessment (Section 03) and dependency mapping (Section 05) before assigning strategies. A wrong R costs more than a slow R.

R1: Rehost (Lift & Shift)

When to Rehost
  • Data center exit deadline is imminent (1-3 months)
  • Budget is limited and modernization is not funded
  • Application is stable and does not require optimization
  • Organization lacks cloud-native engineering capacity
  • First phase of a "migrate then modernize" strategy
Characteristics

Effort: LOW Cost: $ Timeline: 1-3 months

Cloud Optimization: LOW

Pros: Fastest migration path, minimal risk, no code changes, predictable timeline.

Cons: No cloud-native benefits (auto-scaling, managed services), same operational overhead in the cloud, potential for higher cloud costs than on-prem.

R2: Replatform (Lift & Reshape)

When to Replatform
  • Want managed services without full re-architecture
  • Moderate timeline available (3-6 months)
  • Database migration to managed service is the main goal
  • Containerization is feasible without code rewrites
  • OS or runtime upgrade is needed regardless of migration
Characteristics

Effort: MODERATE Cost: $$ Timeline: 3-6 months

Cloud Optimization: MODERATE

Pros: Meaningful cost and operational gains, reduced DBA/ops overhead, improved reliability through managed services.

Cons: Requires testing for compatibility, some code changes may be needed, does not fully leverage cloud-native architecture.

Common replatform optimizations:

Self-managed DB → RDS / Cloud SQL
Migrate from self-hosted PostgreSQL, MySQL, or SQL Server to a managed database service. Eliminates patching, backup management, and HA configuration overhead.
Web Server → App Service / Elastic Beanstalk
Move from self-managed Apache/Nginx on VMs to a managed application platform. Auto-scaling, TLS termination, and deployment slots included.
VMs → Containers (ECS, AKS, GKE)
Containerize the application without rewriting. Use existing Dockerfiles or create them from the current deployment. Gains portability and density.
On-prem Queue → SQS / Azure Service Bus
Replace self-managed RabbitMQ or ActiveMQ with a cloud-managed message queue. Eliminates cluster management and provides built-in dead-letter handling.

R3: Refactor (Re-architect)

When to Refactor
  • Innovation and agility are critical for the business
  • Long-term strategy justifies the investment (6-18+ months)
  • Current architecture cannot scale to meet projected demand
  • Competitive pressure demands faster release cycles
  • Application is strategic and will be invested in for 5+ years
Characteristics

Effort: HIGH Cost: $$$ Timeline: 6-18+ months

Cloud Optimization: EXCELLENT

Pros: Full cloud-native benefits, auto-scaling, serverless options, independent deployability, dramatically lower long-term operational costs.

Cons: Highest risk and cost, requires skilled cloud engineers, long timeline, scope creep danger, feature parity challenges.

Common refactoring patterns:

Monolith → Microservices
Decompose a monolithic application into independently deployable services organized around business domains. Use the Strangler Fig pattern to migrate incrementally rather than as a big-bang rewrite.
Synchronous → Event-Driven
Replace synchronous request-response patterns with asynchronous event streaming (Kafka, EventBridge, Pub/Sub). Improves resilience, decouples services, and enables real-time data processing.
Server-Based → Serverless
Move discrete functions to Lambda, Azure Functions, or Cloud Functions. Best suited for event-triggered, bursty workloads. Eliminates idle compute cost and simplifies operations.
Batch → Stream Processing
Convert overnight batch ETL jobs to real-time streaming pipelines using Kinesis, Dataflow, or Flink. Reduces data latency from hours to seconds and eliminates batch-window constraints.

R4: Repurchase (Replace with SaaS)

When to Repurchase
  • The function is a commodity (CRM, email, HR, ITSM)
  • A mature commercial SaaS solution exists
  • Customization requirements are low to moderate
  • Total SaaS subscription cost is less than on-prem TCO
  • Vendor innovation pace exceeds internal capacity
Characteristics

Effort: MODERATE Cost: $$ Timeline: 2-4 months

Cloud Optimization: HIGH

Pros: No infrastructure to manage, vendor handles upgrades and security, fast deployment, predictable subscription cost.

Cons: Data migration complexity, user retraining, vendor lock-in, customization limits, ongoing subscription cost.

Common repurchase targets:

CRM → Salesforce / HubSpot
Replace custom or legacy CRM systems. Data migration of contacts, accounts, and opportunity history is the primary challenge.
Email → Microsoft 365 / Google Workspace
Retire on-prem Exchange or Lotus Notes. Includes calendar, contacts, and archive migration.
HR → Workday / BambooHR
Replace custom HR and payroll systems with cloud HCM platforms. Regulatory compliance is a key selection criterion.
ITSM → ServiceNow / Jira SM
Consolidate disparate help desk and ticketing tools to a single ITSM platform with workflow automation.

R5: Retain (Keep As-Is)

When to Retain
  • Application has a planned sunset date within 12-18 months
  • Migration cost exceeds the remaining lifetime value
  • Regulatory or compliance constraints prevent changes
  • Deep dependencies make migration sequence complex
  • No suitable replacement or migration path exists today
  • Application is under active vendor development (wait for next version)
Characteristics

Effort: NONE Cost: $0 migration Timeline: N/A

Cloud Optimization: NONE

Pros: Zero disruption, zero migration cost, avoids risk of change.

Cons: Ongoing maintenance and technical debt accumulation, deferred risk, may become more expensive to migrate later.

Requirement: Set a mandatory review date (6-12 months) to reassess. Retain is a deferral, not a permanent decision.

R6: Retire (Decommission)

When to Retire
  • Application has zero active users (validated by 30+ days of monitoring)
  • Functionality has been replaced by another system
  • Application is redundant due to M&A consolidation
  • Security liability exceeds the cost of decommission
  • Vendor support has ended and no extended support is available
  • Cost to maintain exceeds the value delivered
Characteristics

Effort: LOW-MODERATE Cost: $ (one-time) Timeline: 1-3 months

Pros: Eliminates all ongoing costs, reduces attack surface, simplifies the portfolio, frees infrastructure and staff capacity.

Cons: Requires data archival planning, stakeholder communication, and dependency verification. Section 07 covers the full decommission runbook.

Savings timeline: Typically the fastest path to realized cost savings because there is no new system to build or buy.

Strategy Comparison Matrix

Strategy | Effort | Timeline | Cost | Cloud Optimization | Long-Term Value
Rehost | Low | 1-3 months | $ | Low | Low-Medium
Replatform | Moderate | 3-6 months | $$ | Moderate | Medium
Refactor | High | 6-18+ months | $$$ | Excellent | High
Repurchase | Moderate | 2-4 months | $$ | High | High
Retain | None | N/A | $0 | None | Low
Retire | Low | 1-3 months | $ | N/A | High (savings)

Migration Strategy Decision Tree

Walk through this decision tree for each application to arrive at the optimal R:

                    +---------------------------+
                    |   Is the application      |
                    |   actively used?          |
                    +---------------------------+
                       /                  \
                     YES                   NO
                     /                      \
           +------------------+       +-------------------+
           | Does it need to  |       | Is data retention |
           | exist in-house?  |       | required?         |
           +------------------+       +-------------------+
            /           \               /           \
          YES            NO           YES            NO
          /               \           /               \
  +--------------+  +-----------+  RETAIN         +---------+
  | Is it cloud- |  | SaaS alt  |  (archive       | RETIRE  |
  | ready as-is? |  | available?|  data, then     +---------+
  +--------------+  +-----------+  retire later)
    /        \        /       \
  YES        NO     YES       NO
  /           \     /          \
REHOST    +--------+ REPURCHASE  RETAIN
          | Worth  |             (revisit)
          | re-    |
          | arch?  |
          +--------+
           /      \
         YES      NO
         /         \
    REFACTOR    REPLATFORM
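The decision tree above can be encoded as a short function for classifying applications in bulk. The attribute keys are assumptions; a sketch:

```python
def choose_strategy(app: dict) -> str:
    """Walk the migration decision tree for one application.
    Keys are hypothetical assessment attributes (booleans)."""
    if not app["actively_used"]:
        # Archive data first, then retire later
        return "RETAIN" if app["data_retention_required"] else "RETIRE"
    if not app["needs_in_house"]:
        return "REPURCHASE" if app["saas_alternative"] else "RETAIN"  # revisit
    if app["cloud_ready"]:
        return "REHOST"
    return "REFACTOR" if app["worth_rearchitecting"] else "REPLATFORM"
```

Mapping the tree to code makes the branch order explicit and easy to audit against the diagram.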

Real-World Portfolio Mix Patterns

In practice, organizations apply a mix of strategies across their portfolio. The ideal mix depends on organizational goals, timeline, and budget. Here are three common patterns:

Balanced

A measured approach that optimizes across all dimensions:

  • 40% Rehost
  • 30% Replatform
  • 10% Refactor
  • 10% Repurchase
  • 10% Retire

Best for organizations with moderate timelines and a mix of strategic and commodity applications. Balances quick wins with long-term modernization.

Speed-First

Prioritizes migration velocity over optimization:

  • 70% Rehost
  • 20% Replatform
  • 10% Retire

Best for data center exit deadlines, lease expirations, or compliance mandates. Accepts technical debt in exchange for speed. Plan a second wave of optimization after migration.

Optimization-First

Maximizes cloud-native value at higher cost and timeline:

  • 40% Refactor
  • 30% Replatform
  • 20% Repurchase
  • 10% Retire

Best for organizations with strong engineering teams, generous timelines, and a strategic mandate for digital transformation. Delivers highest long-term ROI.
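To compare mix patterns, here is a sketch that applies mix percentages to a portfolio and computes a relative effort score. The per-strategy effort weights are illustrative assumptions loosely derived from the comparison matrix, not calibrated estimates:

```python
# Assumed relative effort per application by strategy (illustrative only)
EFFORT_WEIGHTS = {"REHOST": 1, "REPLATFORM": 3, "REFACTOR": 8,
                  "REPURCHASE": 3, "RETAIN": 0, "RETIRE": 1}

def mix_effort(total_apps: int, mix: dict) -> dict:
    """App counts and a relative effort score for a given strategy mix."""
    counts = {s: round(total_apps * pct) for s, pct in mix.items()}
    effort = sum(EFFORT_WEIGHTS[s] * n for s, n in counts.items())
    return {"counts": counts, "relative_effort": effort}

balanced = mix_effort(100, {"REHOST": 0.40, "REPLATFORM": 0.30, "REFACTOR": 0.10,
                            "REPURCHASE": 0.10, "RETIRE": 0.10})
speed_first = mix_effort(100, {"REHOST": 0.70, "REPLATFORM": 0.20, "RETIRE": 0.10})
```

On these assumed weights, the Speed-First mix carries well under half the engineering effort of the Balanced mix for the same portfolio size, which is the trade it makes against long-term optimization.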

Strategy Assignment Logic

Use assessment scores and dependency data to recommend the optimal migration strategy programmatically:

Pseudocode / Strategy Engine
# ================================================
# 6R Strategy Assignment Engine
# ================================================
# Inputs: TIME quadrant, technical fitness, business value,
#         dependency count, active users, annual TCO

def assign_strategy(app):
    """
    Recommend a 6R strategy based on assessment data.
    Returns primary recommendation and alternatives.
    """

    # ELIMINATE quadrant -> Retire, unless data retention blocks it
    if app.time_quadrant == "ELIMINATE":
        if app.data_retention_required:
            return "RETAIN"   # Archive data first, then retire
        return "RETIRE"       # Validate zero users / low deps before executing

    # TOLERATE quadrant -> Retain or Repurchase
    if app.time_quadrant == "TOLERATE":
        if app.saas_alternative_exists and app.tco > app.saas_cost:
            return "REPURCHASE"
        else:
            return "RETAIN"

    # MIGRATE quadrant -> Replatform or Refactor
    if app.time_quadrant == "MIGRATE":
        if app.technical_fitness < 2.0:
            return "REFACTOR"          # Too degraded to just move
        elif app.can_containerize and app.managed_db_compatible:
            return "REPLATFORM"
        else:
            return "REFACTOR"

    # INVEST quadrant -> Rehost or Replatform
    if app.time_quadrant == "INVEST":
        if app.cloud_ready and app.deadline_months <= 3:
            return "REHOST"            # Speed priority
        return "REPLATFORM"            # Default for INVEST: optimize via managed services

    return "RETAIN"  # Default: revisit later
Automated recommendations require human validation: Strategy assignment logic provides a starting point, not a final answer. Every recommendation should be reviewed by an architect and the application owner before committing resources. Edge cases, political factors, and strategic pivots cannot be captured in a scoring algorithm.
07

Decommissioning Runbook

The step-by-step operational guide for shutting down systems safely. A disciplined, phased approach prevents cascading failures, data loss, and compliance violations. Every decommission should follow this runbook.

Phased Approach Overview

Decommissioning is not a single event but a multi-phase process spanning weeks or months. Each phase has defined gates that must be passed before proceeding.

1. Pre-Decommission Planning (T-90 to T-30 days)
Identify all stakeholders, document dependencies, confirm replacement systems are operational, and build the detailed decommission plan. This phase consumes the most calendar time but prevents costly surprises later.
2. Stakeholder Notification (T-30 days)
Formally notify all affected users, business owners, vendor contacts, and downstream system owners. Provide migration paths, training schedules, and support contacts. Document acknowledgments.
3. Data Handling & Archival (T-14 days)
Execute the data retention plan: archive records per regulatory requirements, migrate active data to replacement systems, verify backup integrity with test restores, and document chain of custody.
4. Decommission Day Execution (T-0)
Execute the shutdown sequence: disable user access, stop application services, revoke credentials, remove DNS records, tear down infrastructure, and update the CMDB. Send status updates to stakeholders every 2 hours.
5. Post-Shutdown Monitoring (T+1 to T+90 days)
Monitor for unexpected failures in dependent systems, track support tickets related to the decommissioned application, validate that cost savings materialize, and conduct lessons-learned reviews.
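The phase offsets above can be turned into a concrete calendar. A minimal sketch using Python's datetime:

```python
from datetime import date, timedelta

# Milestone offsets in days relative to decommission day (T-0), from the phases above
PHASE_OFFSETS_DAYS = {
    "planning_start":     -90,
    "stakeholder_notice": -30,
    "data_archival":      -14,
    "decommission_day":     0,
    "monitoring_end":      90,
}

def decommission_schedule(t_zero: date) -> dict:
    """Map each phase milestone to a calendar date relative to T-0."""
    return {phase: t_zero + timedelta(days=offset)
            for phase, offset in PHASE_OFFSETS_DAYS.items()}

schedule = decommission_schedule(date(2025, 9, 1))
```

Adjust the offsets per program; the 30-day notice in particular may be dictated by vendor contract terms.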

Pre-Decommission Checklist

Complete every item before scheduling the Go/No-Go meeting. Skipping items is the leading cause of decommission rollbacks.

  • Stakeholder sign-off obtained (business owner, IT, legal, compliance). All four groups must sign; a missing legal sign-off can halt the entire process.
  • All dependencies verified and removed or rerouted. Cross-reference the dependency map from Section 05 with current traffic analysis.
  • Data backup completed and verified with test restore. A backup that has never been restored is not a backup.
  • Data archival plan executed per retention policy. Confirm retention periods meet regulatory requirements (see Section 08).
  • Replacement system confirmed operational and load-tested. Run parallel operations for a minimum of 2 weeks before cutover.
  • User migration and retraining complete. Track completion rates; aim for 95%+ before scheduling decommission.
  • Vendor notifications sent per contract timeline. Most contracts require 30-90 days written notice; check early termination clauses.
  • License termination dates scheduled. Align with contract renewal dates to avoid paying for unused periods.
  • DNS TTL lowered 24-48 hours before decommission. Allows rapid propagation of record removals on decommission day.
  • Monitoring and alerting reconfigured. Remove old checks; add new alerts for dependent systems that may break.
  • Rollback plan documented and tested. Include snapshot locations, restore procedures, and decision criteria.
  • Communication sent to all affected users. Include decommission date, replacement system URL, and support contact.
  • Go/No-Go meeting scheduled with all required attendees. Schedule at least 48 hours before the decommission window; prepare a decision matrix.
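The checklist maps naturally to an automated Go/No-Go gate. A sketch; the item keys are shorthand assumptions for the bullets above:

```python
# Shorthand keys for the pre-decommission checklist items above (assumed names)
REQUIRED_ITEMS = [
    "stakeholder_signoff", "dependencies_cleared", "backup_verified",
    "archival_complete", "replacement_operational", "users_migrated",
    "vendor_notified", "license_termination_scheduled", "dns_ttl_lowered",
    "monitoring_updated", "rollback_tested", "users_notified",
]

def go_no_go(checklist: dict) -> tuple:
    """Return ("GO", []) only when every required item is complete;
    otherwise ("NO-GO", [missing items])."""
    missing = [item for item in REQUIRED_ITEMS if not checklist.get(item)]
    return ("GO" if not missing else "NO-GO", missing)
```

A single incomplete item forces NO-GO, which is exactly the discipline the runbook calls for.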

DNS & Certificate Cleanup

Stale DNS records and expired certificates are security risks and operational debt. Clean them up as part of the decommission.

AWS Route53 — DNS Record Deletion
# List hosted zones to find the target zone
aws route53 list-hosted-zones --query "HostedZones[*].[Id,Name]" --output table

# Export current records for audit trail before deletion
aws route53 list-resource-record-sets \
  --hosted-zone-id Z0123456789ABCDEF \
  --output json > /backup/dns-records-$(date +%Y%m%d).json

# Delete A record for the decommissioned application
aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789ABCDEF \
  --change-batch '{
    "Changes": [{
      "Action": "DELETE",
      "ResourceRecordSet": {
        "Name": "legacy-app.example.com",
        "Type": "A",
        "TTL": 300,
        "ResourceRecords": [{"Value": "10.0.1.50"}]
      }
    }]
  }'

# Delete CNAME records pointing to the decommissioned host
aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789ABCDEF \
  --change-batch '{
    "Changes": [{
      "Action": "DELETE",
      "ResourceRecordSet": {
        "Name": "app.example.com",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [{"Value": "legacy-app.example.com"}]
      }
    }]
  }'

# Verify deletion
aws route53 list-resource-record-sets \
  --hosted-zone-id Z0123456789ABCDEF \
  --query "ResourceRecordSets[?Name=='legacy-app.example.com.']"
Certificate Revocation
# Let's Encrypt — revoke and delete certificate
certbot revoke --cert-name legacy-app.example.com --reason cessationofoperation
certbot delete --cert-name legacy-app.example.com

# AWS ACM — delete certificate (must be disassociated from all resources first)
aws acm list-certificates --query "CertificateSummaryList[?DomainName=='legacy-app.example.com']"
aws acm delete-certificate --certificate-arn arn:aws:acm:us-east-1:123456789012:certificate/abc-def-123

# Internal PKI — revoke via OpenSSL
openssl ca -revoke /etc/pki/CA/newcerts/legacy-app.pem -config /etc/pki/CA/openssl.cnf
openssl ca -gencrl -out /etc/pki/CA/crl/ca.crl -config /etc/pki/CA/openssl.cnf

# Verify certificate is no longer valid
openssl s_client -connect legacy-app.example.com:443 -servername legacy-app.example.com 2>/dev/null | \
  openssl x509 -noout -dates 2>/dev/null || echo "Certificate no longer served"

Credential Revocation

Every credential associated with the decommissioned system must be revoked. Orphaned credentials are a top attack vector.

AWS IAM — Role & Policy Cleanup
# Identify IAM roles used by the application
aws iam list-roles --query "Roles[?contains(RoleName, 'legacy-app')].[RoleName,Arn]" --output table

# Detach all managed policies from the role
aws iam list-attached-role-policies --role-name legacy-app-role \
  --query "AttachedPolicies[*].PolicyArn" --output text | \
  xargs -I {} aws iam detach-role-policy --role-name legacy-app-role --policy-arn {}

# Delete inline policies
aws iam list-role-policies --role-name legacy-app-role \
  --query "PolicyNames[]" --output text | \
  xargs -I {} aws iam delete-role-policy --role-name legacy-app-role --policy-name {}

# Remove instance profiles
aws iam remove-role-from-instance-profile \
  --instance-profile-name legacy-app-profile --role-name legacy-app-role
aws iam delete-instance-profile --instance-profile-name legacy-app-profile

# Delete the role
aws iam delete-role --role-name legacy-app-role
Kubernetes — Service Account Removal
# List service accounts in the application namespace
kubectl get serviceaccounts -n legacy-app -o wide

# Remove RBAC bindings
kubectl delete clusterrolebinding legacy-app-binding
kubectl delete rolebinding legacy-app-binding -n legacy-app

# Delete the service account
kubectl delete serviceaccount legacy-app-sa -n legacy-app

# Delete the namespace (removes all remaining resources)
kubectl delete namespace legacy-app --grace-period=60
Database User Cleanup
-- PostgreSQL: revoke and drop application user
REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA public FROM legacy_app_user;
REVOKE ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public FROM legacy_app_user;
REVOKE CONNECT ON DATABASE appdb FROM legacy_app_user;
DROP USER IF EXISTS legacy_app_user;

-- MySQL: revoke and drop
REVOKE ALL PRIVILEGES, GRANT OPTION FROM 'legacy_app_user'@'%';
DROP USER IF EXISTS 'legacy_app_user'@'%';
FLUSH PRIVILEGES;
Secrets Manager Cleanup
# AWS Secrets Manager — schedule deletion (7-30 day recovery window)
aws secretsmanager delete-secret \
  --secret-id legacy-app/db-credentials \
  --recovery-window-in-days 30

aws secretsmanager delete-secret \
  --secret-id legacy-app/api-keys \
  --recovery-window-in-days 30

# HashiCorp Vault — revoke and delete
vault lease revoke -prefix secret/legacy-app/
vault kv metadata delete secret/legacy-app/db-credentials
vault kv metadata delete secret/legacy-app/api-keys
vault policy delete legacy-app-policy

Infrastructure Teardown

Remove infrastructure in reverse dependency order: load balancers first, then compute, then storage. Always deregister before deleting.

Load Balancer — Target Deregistration
# AWS ALB — deregister targets and delete target group
aws elbv2 describe-target-groups \
  --query "TargetGroups[?contains(TargetGroupName, 'legacy-app')].[TargetGroupArn]" --output text

aws elbv2 deregister-targets \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/legacy-app/abc123 \
  --targets Id=i-0abc123def456,Port=8080

# Remove listener rules pointing to the target group
aws elbv2 delete-rule --rule-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:listener-rule/app/main-alb/abc/def/rule123

# Delete target group after deregistration
aws elbv2 delete-target-group \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/legacy-app/abc123

# Azure LB — remove backend pool member
az network lb address-pool address remove \
  --resource-group rg-legacy --lb-name lb-legacy \
  --pool-name legacy-app-pool --name legacy-vm-1
Firewall Rule Cleanup
# AWS Security Group — revoke ingress/egress and delete
aws ec2 describe-security-groups \
  --filters "Name=group-name,Values=legacy-app-sg" \
  --query "SecurityGroups[*].[GroupId,GroupName]" --output table

aws ec2 revoke-security-group-ingress --group-id sg-0abc123 \
  --ip-permissions '[{"IpProtocol":"tcp","FromPort":443,"ToPort":443,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}]'

aws ec2 delete-security-group --group-id sg-0abc123

# Azure NSG — remove rules and delete
az network nsg rule delete --resource-group rg-legacy --nsg-name nsg-legacy --name allow-legacy-app
az network nsg delete --resource-group rg-legacy --name nsg-legacy

# iptables — remove application-specific rules (document before removing)
iptables -L -n --line-numbers | grep "legacy-app"
iptables -D INPUT 12    # Remove by line number after verification
Compute Instance Termination
# AWS EC2 — create final AMI snapshot, then terminate
aws ec2 create-image --instance-id i-0abc123def456 \
  --name "legacy-app-final-$(date +%Y%m%d)" --no-reboot

# Wait for AMI to be available
aws ec2 wait image-available --image-ids ami-0xyz789

# Terminate instances
aws ec2 terminate-instances --instance-ids i-0abc123def456 i-0def789abc012

# Azure VM — deallocate and delete
az vm deallocate --resource-group rg-legacy --name legacy-vm-1
az vm delete --resource-group rg-legacy --name legacy-vm-1 --yes
Storage Volume Cleanup
# AWS EBS — snapshot before deletion
aws ec2 create-snapshot --volume-id vol-0abc123 \
  --description "legacy-app-final-$(date +%Y%m%d)" \
  --tag-specifications 'ResourceType=snapshot,Tags=[{Key=Retention,Value=90days}]'

aws ec2 wait snapshot-completed --snapshot-ids snap-0xyz789

# Delete the volume after snapshot confirmation
aws ec2 delete-volume --volume-id vol-0abc123

# AWS S3 — empty and delete application bucket
aws s3 rm s3://legacy-app-data --recursive
aws s3 rb s3://legacy-app-data

# Azure Managed Disk — snapshot and delete
az snapshot create --resource-group rg-legacy --name legacy-disk-snap \
  --source /subscriptions/.../disks/legacy-data-disk
az disk delete --resource-group rg-legacy --name legacy-data-disk --yes

Rollback Plan Requirements

Every decommission must have a documented rollback plan. Hope is not a strategy.

Snapshot & Backup Retention
  • Retain VM snapshots and AMIs for minimum 30 days post-decommission
  • Keep database backups for 90 days (covers quarter-end processing)
  • Store configuration exports (Terraform state, Ansible playbooks) indefinitely in version control
  • Tag all retained artifacts with decommission date and retention expiry
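The tagging step above can be sketched with the AWS CLI. The tag keys (DecommissionDate, RetentionExpiry), the 90-day window, and the snapshot ID snap-0xyz789 are illustrative conventions, not a standard:

```shell
# Compute and apply retention tags for a retained snapshot.
# Tag keys and the 90-day window are illustrative conventions;
# snap-0xyz789 is a placeholder snapshot ID.
DECOM_DATE=$(date -u +%Y-%m-%d)
EXPIRY_DATE=$(date -u -d "$DECOM_DATE + 90 days" +%Y-%m-%d)
echo "decommissioned=$DECOM_DATE retain_until=$EXPIRY_DATE"

command -v aws >/dev/null && aws ec2 create-tags --resources snap-0xyz789 \
  --tags Key=DecommissionDate,Value="$DECOM_DATE" \
         Key=RetentionExpiry,Value="$EXPIRY_DATE"
```

Computing the expiry at tag time (rather than recording only the decommission date) lets a later cleanup job filter on a single tag value.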
Rollback Decision Criteria
  • Rollback window: within 72 hours of decommission for full restore
  • Partial rollback (data only): available for 30 days
  • Trigger conditions: critical dependency failure, data loss discovery, regulatory finding
  • Approval required from: IT Lead + Business Owner (2-person rule)

Document re-provisioning procedures with estimated restore times:

Full Infrastructure Restore
Re-provision from AMI/snapshot + restore database from backup + reconfigure DNS + reinstate firewall rules. Estimated time: 4-8 hours.
Data-Only Restore
Restore database backup to a new instance + reconnect to replacement system or temporary read-only interface. Estimated time: 1-3 hours.
Configuration Restore
Re-apply Terraform/Ansible configs from version control + redeploy application containers. Estimated time: 2-4 hours.

Go/No-Go Decision

The Go/No-Go meeting is the final gate before execution. All four criteria must be met. A single failure triggers a postponement, not a partial proceed.

Dependency Proof

All application interfaces have been validated. No active connections remain. Traffic analysis (minimum 30 days) confirms zero inbound requests from production systems.
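The zero-inbound claim can be spot-checked by parsing exported flow log records. A sketch assuming the default VPC flow log field layout, with sample records standing in for a real export (10.0.1.5 plays the application's IP):

```shell
# Sample records in the default VPC flow log format
# (field 5 = dstaddr, field 13 = action); 10.0.1.5 stands in for the app's IP.
cat > flowlog.txt <<'EOF'
2 123456789012 eni-0abc 10.0.2.9 10.0.1.5 44321 8080 6 10 840 1711200000 1711200060 ACCEPT OK
2 123456789012 eni-0abc 10.0.3.7 10.0.1.5 44322 8080 6 4 300 1711200000 1711200060 REJECT OK
EOF

# Count ACCEPTed flows destined for the app; any nonzero count blocks the Go decision
awk '$5 == "10.0.1.5" && $13 == "ACCEPT" { n++ } END { print n+0 }' flowlog.txt
# → 1
```

Run the same count over the full 30-day export; only a result of 0 satisfies the Dependency Proof.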

Data Proof

Archives verified and accessible. Retention periods meet or exceed regulatory requirements. Test restores completed successfully. Chain of custody documented.

Communication Proof

All stakeholders have been notified with acknowledgment. Users have been migrated to replacement systems. Vendor termination notices sent per contract requirements.

Rollback Proof

Recovery plan documented with assigned owners. Snapshots and backups verified. Rollback procedures tested in staging environment. Restore time estimates validated.

Never decommission on a Friday or before a holiday. Decommission failures surface within 24-72 hours. If the team is unavailable for rapid response, a minor issue becomes a major incident. Schedule decommissions for Tuesday or Wednesday, with the full operations team available through end-of-week.
08

Data Retention & Compliance

Regulatory requirements and data handling during decommissioning. Getting retention wrong exposes the organization to fines, litigation risk, and audit failures. This section maps frameworks to actionable procedures.

Regulatory Framework Overview

Every application handles data subject to one or more regulatory frameworks. Identify applicable frameworks before planning any data destruction.

Framework | Retention Period | Key Requirements | Penalty
GDPR | Varies (minimum necessary) | Right to erasure (Art. 17), data minimization, lawful basis required for retention | Up to 4% global revenue
HIPAA | 6 years | PHI protection, audit trails, BAA requirements, breach notification | $100–$50K per violation
SOX | 7 years | Financial records, audit trails, internal controls, whistleblower protections | Criminal penalties
PCI DSS | 1 year (logs), varies (data) | Cardholder data protection, secure deletion, key management | Fines + loss of processing
FedRAMP | 90d online, 12mo active, 18mo cold | Audit log retention, secure decommissioning, media sanitization | Loss of authorization
CCPA/CPRA | Varies | Consumer deletion rights, data inventory, opt-out mechanisms | $2,500–$7,500 per violation

Data Classification During Decommission

Classify all data in the decommissioned system before choosing archival or destruction methods. Different data classes have different handling requirements.

PII — Personally Identifiable Information

Names, addresses, SSNs, email addresses, phone numbers, financial account numbers. Subject to GDPR, CCPA/CPRA, and state privacy laws. Requires encryption at rest and in transit, access logging, and certified destruction.

PHI — Protected Health Information

Medical records, diagnoses, treatment plans, insurance IDs, lab results. Subject to HIPAA with mandatory 6-year retention. Requires BAA with any vendor handling PHI. Breach notification within 60 days.

Financial Records

Transaction logs, general ledgers, tax records, audit trails, invoices. Subject to SOX (7 years) and IRS requirements. Must maintain audit trail integrity through the entire retention period. Immutable storage recommended.

General Business Data

Internal communications, project documents, configuration data, non-sensitive logs. Standard retention policies apply (typically 3-5 years). Lower handling requirements but still subject to legal hold and e-discovery obligations.

Archival Strategies

Choose the storage tier based on access frequency requirements and cost constraints. Data that must be queryable needs hot or warm storage; data retained only for compliance can go cold or offline.

Hot Storage
Active access with full query capability. AWS S3 Standard, Azure Hot, GCS Standard. Use for data that may be needed within minutes. Highest cost but lowest latency. Typical use: first 90 days post-decommission.
Warm Storage
Occasional access with slightly slower retrieval. AWS S3 Infrequent Access, Azure Cool, GCS Nearline. Use for data accessed monthly or less. 30-day minimum storage charge. Typical use: 90 days to 1 year.
Cold Storage
Rare access with hours for retrieval. AWS S3 Glacier, Azure Archive, GCS Coldline/Archive. Retrieval takes 1-12 hours. Lowest cloud cost. Typical use: 1-7 year regulatory retention.
Offline Storage
Physical media stored in secure vaults. Tape (LTO), offline disk arrays, AWS Snowball exports. Air-gapped from network. Use for maximum security or very long retention (7+ years). Requires physical retrieval procedures.
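One way to implement the cloud tiers above is an S3 lifecycle rule. A sketch assuming an archive bucket named legacy-app-archive; the day counts mirror the typical-use notes above (90 days hot, 1 year warm, expiry at year 7):

```shell
# Lifecycle rule: Standard for the first 90 days, Infrequent Access to
# 1 year, Glacier to year 7, then expiry. Bucket name and day counts
# are illustrative.
cat > lifecycle.json <<'EOF'
{
  "Rules": [{
    "ID": "decommission-archive",
    "Status": "Enabled",
    "Filter": {"Prefix": ""},
    "Transitions": [
      {"Days": 90,  "StorageClass": "STANDARD_IA"},
      {"Days": 365, "StorageClass": "GLACIER"}
    ],
    "Expiration": {"Days": 2555}
  }]
}
EOF

command -v aws >/dev/null && aws s3api put-bucket-lifecycle-configuration \
  --bucket legacy-app-archive --lifecycle-configuration file://lifecycle.json
```

A lifecycle rule moves data between tiers automatically, so nobody has to remember the 90-day and 1-year handoffs by hand.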

Legal Hold Considerations

A legal hold (litigation hold) is a directive to preserve all potentially relevant data when litigation is reasonably anticipated. Legal holds override normal retention and deletion policies.

What Triggers a Legal Hold
  • Pending or threatened litigation
  • Government investigation or subpoena
  • Regulatory audit or inquiry
  • Internal investigation (fraud, harassment, compliance)
  • Contractual dispute with active negotiation
Legal Hold Requirements
  • Check BEFORE any data deletion — consult Legal for active holds
  • Litigation freeze overrides normal retention policies
  • Preserve data in its original format and location when possible
  • Document all preservation actions with timestamps
  • Release procedures: Legal must issue written release before any destruction
Always check for legal holds before destroying data. Destroying data subject to a legal hold constitutes spoliation of evidence. Consequences include adverse inference instructions (the court assumes the destroyed data was harmful to your case), monetary sanctions, and in extreme cases, criminal contempt charges. When in doubt, preserve.
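If archives live in S3 with Object Lock enabled, a hold can be checked programmatically before any deletion. A sketch; the bucket and key names are placeholders, and the helper function is an illustration, not standard tooling:

```shell
# Check S3 Object Lock legal hold status before destroying an archived object.
# Bucket and key are placeholders; assumes Object Lock is enabled on the bucket.
hold_status() {
  # Pull the Status value out of the get-object-legal-hold JSON response
  sed -n 's/.*"Status": *"\([A-Z]*\)".*/\1/p' <<<"$1"
}

if command -v aws >/dev/null; then
  resp=$(aws s3api get-object-legal-hold \
    --bucket legacy-app-archive --key exports/appdb-final.dump 2>/dev/null)
  [ "$(hold_status "$resp")" = "ON" ] && echo "LEGAL HOLD ACTIVE - do not delete"
fi
```

An automated check complements, not replaces, the written release from Legal.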

Secure Data Destruction (NIST 800-88)

NIST Special Publication 800-88 defines three levels of media sanitization. Choose the level based on data classification and the media's next destination.

1. Clear — Software Overwrite
Protects against simple, non-invasive data recovery. Suitable for media being reused within the same security domain. Uses standard read/write commands.
2. Purge — Cryptographic or Physical Techniques
Protects against laboratory-level data recovery. Required for media leaving the organization's control. Includes cryptographic erasure, SSD secure erase, and degaussing.
3. Destroy — Physical Destruction
Renders media completely unusable. Required for highest-classification data. Includes physical shredding, disintegration, and incineration. Third-party destruction must include certificate of destruction.
Clear — Software Overwrite Commands
# Linux — overwrite with random data (3-pass)
shred -vzn 3 /dev/sdX

# Linux — single-pass zero fill (faster, acceptable for most use cases)
dd if=/dev/zero of=/dev/sdX bs=4M status=progress

# Windows — SDelete (Sysinternals): 3-pass overwrite, recursing subdirectories
sdelete -p 3 -s D:\legacy-app-data

# Verify overwrite (should show only zeros)
hexdump -C /dev/sdX | head -20
Purge — Cryptographic Erasure & Secure Erase
# Cryptographic erasure — destroy the encryption key
# (Requires data to have been encrypted at rest)
# Note: schedule-key-deletion takes a key ID or ARN, not an alias,
# so resolve the alias first
KEY_ID=$(aws kms describe-key --key-id alias/legacy-app-key \
  --query 'KeyMetadata.KeyId' --output text)
aws kms schedule-key-deletion --key-id "$KEY_ID" --pending-window-in-days 7

# SSD Secure Erase via hdparm
hdparm --user-master u --security-set-pass DecomPW /dev/sdX
hdparm --user-master u --security-erase DecomPW /dev/sdX

# NVMe Secure Erase
nvme format /dev/nvme0n1 --ses=1    # User Data Erase
nvme format /dev/nvme0n1 --ses=2    # Cryptographic Erase

# Degaussing: use NSA/CSS EPL-listed degausser for magnetic media
# (No software command — physical device required)
Destroy — Physical Destruction Methods
# Physical destruction is performed by certified vendors
# Document the following for each piece of media:

# 1. Media serial number and type (HDD, SSD, tape, etc.)
# 2. Destruction method (shredding, disintegration, incineration)
# 3. Destruction standard met (NIST 800-88, DoD 5220.22-M)
# 4. Vendor name and certification number
# 5. Date and time of destruction
# 6. Witness name(s) and signature(s)
# 7. Certificate of destruction reference number

# Approved destruction methods by media type:
# HDD:  Shredding (cross-cut to <2mm), degaussing + shredding, incineration
# SSD:  Shredding (cross-cut to <2mm), disintegration
# Tape: Degaussing + shredding, incineration
# Optical: Shredding, incineration

Chain of Custody Documentation

Maintain an auditable chain of custody for every piece of data handled during decommissioning. This record is your proof of compliance during audits.

Record Element | Description | Example
Data Inventory | Complete listing of all data assets with classification level | appdb: 2.3TB, PII (GDPR), 847 tables
Sanitization Method | NIST 800-88 level and specific technique used | Purge: cryptographic erasure via AWS KMS key deletion
Tool & Version | Software or device used for sanitization | shred (GNU coreutils 8.32), Garner HD-3WXL degausser
Personnel | Names and roles of individuals who performed and witnessed | Performed: J. Smith (SRE), Witnessed: M. Jones (Security)
Timestamp | Date and time of each action in UTC | 2025-03-15T14:30:00Z
Certificate of Destruction | Formal document from vendor (physical) or system log (digital) | CoD-2025-0315-001, signed by Iron Mountain
Verification Results | Confirmation that sanitization was successful | Post-wipe hex dump verified: all zeros, no recoverable data
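The record elements above can be captured as one structured record per asset, which keeps the chain of custody machine-searchable during audits. A sketch in JSON; every field value is illustrative:

```shell
# One chain-of-custody record per sanitized asset; field names mirror
# the table above, and all values are illustrative examples.
cat > custody-record.json <<'EOF'
{
  "asset": "appdb",
  "classification": "PII (GDPR)",
  "sanitization_method": "Purge: cryptographic erasure via KMS key deletion",
  "tool": "aws-cli 2.15.0",
  "performed_by": "J. Smith (SRE)",
  "witnessed_by": "M. Jones (Security)",
  "timestamp_utc": "2025-03-15T14:30:00Z",
  "certificate_ref": "CoD-2025-0315-001",
  "verification": "KMS key deletion confirmed in CloudTrail"
}
EOF
```

Store these records alongside the certificates of destruction, not on the system being decommissioned.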

Retention Schedule Template

Use this template to document retention commitments for each data category in the decommissioned system. Fill in during the T-14 data handling phase.

Data Category | Classification | Regulatory Basis | Retention Period | Storage Tier | Destruction Date
User account records | PII | GDPR Art. 17, CCPA | Delete at decommission | N/A | T-0
Transaction history | Financial | SOX, IRS | 7 years | Cold (Glacier) | T+7 years
Audit logs | Operational | PCI DSS, SOX | 7 years | Cold (Glacier) | T+7 years
Application logs | Operational | Internal policy | 1 year | Warm (S3 IA) | T+1 year
Medical records | PHI | HIPAA | 6 years | Cold (encrypted) | T+6 years
Cardholder data | PCI | PCI DSS | Delete at decommission | N/A (purge) | T-0
Configuration backups | Internal | Internal policy | 90 days | Hot (S3 Standard) | T+90 days
Retention is a floor, not a ceiling. Regulatory retention periods are minimums. Legal holds, contractual obligations, and business needs may extend retention beyond regulatory requirements. Always check with Legal before setting destruction dates.
09

Stakeholder Management

Communication, governance, and change management for decommissioning programs. Decommissioning fails when treated as a purely technical exercise. Success requires cross-functional alignment, clear communication, and executive sponsorship.

RACI Matrix

Define clear accountability for each phase of the decommissioning lifecycle. R = Responsible (does the work), A = Accountable (makes the decision), C = Consulted (provides input), I = Informed (kept in the loop).

Activity | Executive Sponsor | IT Lead | Business Owner | Legal/Compliance | Operations
Portfolio Assessment | I | R | A | C | C
Business Case | A | R | C | C | I
Data Archival | I | R | C | A | R
Decommission Execution | I | A | C | C | R
Post-Validation | I | R | A | C | R

Communication Plan

Structured, timeline-driven communication prevents surprises and builds organizational buy-in. Every milestone has a defined audience, message, and channel.

T-90
Internal Announcement
Notify IT leadership and business unit leads. Share rationale, timeline, and expected impact. Solicit early concerns and identify potential blockers. Channel: email + leadership meeting.
T-60
Impact Assessment Distribution
Share detailed impact assessment with all stakeholders. Include dependency analysis, user migration plan, and training schedule. Channel: shared document + stakeholder review meeting.
T-30
User Notification
Notify all end users with specific decommission date, alternative solutions, training schedule, and support contacts. Channel: email blast + in-app banner + intranet announcement.
T-14
Final Reminder
Send final reminder with confirmed decommission date, support surge plan, and escalation contacts. Confirm user migration completion rates. Channel: email + Slack/Teams announcement.
T-7
Go/No-Go Meeting
Final confirmation or escalation. Review all four Go/No-Go criteria (Section 07). Document decision and any conditional approvals. Channel: video conference with all required attendees.
T-0
Decommission Execution Updates
Status updates every 2 hours during execution window. Report progress, issues, and rollback decisions in real time. Channel: war room (virtual or physical) + status email.
T+1
Confirmation & Support Surge
Send confirmation email to all stakeholders. Activate support surge team for increased ticket volume. Monitor dependent systems for unexpected failures. Channel: email + help desk escalation.
T+30
Lessons Learned & Savings Report
Conduct retrospective review. Document what worked, what failed, and improvements for the next decommission. Publish realized cost savings vs. projections. Channel: report + all-hands presentation.

Change Advisory Board (CAB) Process

Decommissions are changes and must go through your organization's change management process. Here is what CAB typically requires for a decommission request.

Submission Requirements
  • Impact assessment: systems affected, user count, business processes disrupted
  • Rollback plan: documented and tested with estimated restore time
  • Approvals: business owner, IT lead, legal/compliance sign-off
  • Implementation plan: step-by-step execution with assigned owners
  • Communication plan: who was notified, when, and through which channel
Review Criteria & Risk Scoring
  • Scope: number of users, interfaces, and data volumes affected
  • Reversibility: can it be rolled back? How quickly?
  • Timing: does it conflict with change freezes, quarter-end, or peak periods?
  • Dependencies: are all upstream/downstream systems accounted for?
  • Risk score: Low (auto-approve), Medium (CAB review), High (executive approval)
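The risk-score tiers above can be sketched as a small scoring function. The 1-3 input scales, the equal weighting, and the tier thresholds are illustrative assumptions, not part of any CAB standard:

```shell
# Toy risk-scoring sketch for the CAB criteria above. Scales, weights,
# and thresholds are illustrative, not a standard.
risk_tier() {
  # Each input scored 1 (low) to 3 (high): scope, irreversibility, timing conflict
  local score=$(( $1 + $2 + $3 ))
  if   [ "$score" -le 4 ]; then echo "Low (auto-approve)"
  elif [ "$score" -le 7 ]; then echo "Medium (CAB review)"
  else echo "High (executive approval)"
  fi
}

risk_tier 1 1 1   # → Low (auto-approve)
risk_tier 3 2 3   # → High (executive approval)
```

Even a toy model like this forces submitters to score scope, reversibility, and timing explicitly rather than arguing from gut feel.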

Vendor Notification

Vendor relationships require careful unwinding. Contract terms, data export obligations, and early termination penalties must all be addressed before the decommission date.

Contract Termination Clauses
Review notice period requirements (typically 30-90 days written notice). Identify auto-renewal dates and cancellation windows. Flag early termination fees and negotiate where possible.
Data Export Requirements
Request data export in standard formats (CSV, JSON, SQL dump). Verify completeness of export against source record counts. Confirm vendor's data destruction timeline post-export.
Early Termination Assessment
Calculate remaining contract value vs. early termination fee. Compare against ongoing costs of maintaining the system. Present cost-benefit analysis to finance for approval.
Support Wind-Down
Negotiate reduced support tier during transition period. Ensure vendor support remains available through T+30 for data questions. Get written confirmation of service end date.
Vendor Notification Template Structure
# Vendor Notification — Formal Termination Letter
# ================================================

Subject: Notice of Service Termination — [Application Name]
Date:    [Current Date]
To:      [Vendor Account Manager]
From:    [Organization Procurement Lead]
CC:      [IT Lead], [Legal Counsel], [Business Owner]

# Section 1: Termination Notice
- Reference contract number and effective date
- State intent to terminate per Section [X] of the agreement
- Specify requested termination date (minimum notice period)

# Section 2: Data Export Request
- Request complete data export in [format] by [date]
- Specify data categories to be exported
- Request written confirmation of export completeness
- Request vendor data destruction certificate post-export

# Section 3: Financial Settlement
- Request final invoice and pro-rated credits
- Address any early termination fees per contract terms
- Confirm return of any prepaid amounts

# Section 4: Transition Support
- Request continued access for [X] days post-termination
- Specify support level needed during transition
- Identify key vendor contacts for transition questions

# Section 5: Confirmation
- Request written acknowledgment within 10 business days
- Provide point of contact for vendor questions

Resistance Management

Resistance to decommissioning is natural and predictable. Users fear change, teams fear loss of control, and leaders fear disruption. Address resistance proactively with these strategies.

Identify Champions Early

Find power users and team leads who understand the rationale for decommissioning. Involve them in replacement system design and testing. Their endorsement carries more weight than any executive mandate. Offer them early access to new tools and recognition for their role in the transition.

Provide Clear Migration Paths

Never decommission without a concrete alternative. Document step-by-step migration guides for every user workflow. Provide hands-on training sessions (not just documentation). Offer office hours and dedicated support during the first 30 days post-migration.

Show the Business Case

Share the financial rationale in terms users understand: cost savings reinvested in better tools, reduced security risk, fewer outages. Quantify the pain of the status quo (maintenance hours, security vulnerabilities, user complaints about the old system).

Establish Feedback Channels

Create dedicated channels (Slack/Teams, feedback forms, regular town halls) for users to voice concerns. Respond to every piece of feedback, even if the answer is "we understand but the decision stands." Unheard resistance goes underground and becomes sabotage.

Executive Reporting Template

Provide leadership with a consistent, concise view of the decommissioning program. Update this report monthly or at each program milestone.

Decommissioning Program Status Report
Report Element | Description | Format
Program Status | Overall health of the decommissioning program | Green / Amber / Red
Applications Decommissioned | Count completed this reporting period | Number + cumulative total
Cost Savings Realized | Actual savings vs. projected savings | Dollar amount + variance %
Upcoming Decommissions | Applications scheduled for next 30/60/90 days | Application list with target dates
Risks & Escalations | Blockers, delays, and issues requiring executive attention | Risk description + mitigation + owner
Key Decisions Required | Decisions that require executive sponsor approval | Decision description + options + recommendation
Decommissioning fails when treated as IT-only. Programs that lack executive sponsorship and cross-functional governance have a 60%+ failure rate. Business owners must be accountable for their applications. Legal must validate data handling. Finance must track savings realization. IT executes, but governance is everyone's responsibility.
10

Post-Decommission

Validation, measurement, and continuous improvement after systems are retired. Decommissioning is not complete when the server is shut down — it is complete when every trace is cleaned up, savings are verified, and lessons feed back into the program.

Post-Shutdown Validation Checklist

Run through every item within 5 business days of shutdown. Incomplete validation leads to orphaned resources, phantom alerts, and cost leakage that erodes the savings you just earned.

  • System confirmed unreachable (ping, HTTP, DNS). Test from outside the network — internal DNS caches can mask stale records for hours.
  • No orphaned cloud resources (EBS volumes, snapshots, S3 buckets, IP addresses). Unattached resources continue billing silently — check every region, not just the primary.
  • No active network connections to decommissioned IPs. Review flow logs and firewall connection tables for 48+ hours post-shutdown.
  • All DNS records removed and verified via external DNS. Use dig or nslookup against 8.8.8.8 and 1.1.1.1 to confirm propagation.
  • Certificates revoked or expired. Revoke immediately — do not wait for natural expiration if the system held sensitive data.
  • Firewall rules cleaned up. Remove allow-rules referencing decommissioned IPs to reduce attack surface.
  • Load balancer targets deregistered. Stale targets cause health check failures that pollute monitoring dashboards.
  • CI/CD pipelines disabled or removed. Orphaned pipelines can trigger builds to non-existent infrastructure.
  • Monitoring and alerting removed (no phantom alerts). Phantom alerts cause alert fatigue and erode trust in the monitoring system.
  • Service mesh configuration cleaned (Istio VirtualServices, etc.). Stale mesh routes cause connection timeouts for upstream callers.
  • CMDB updated (status: Retired, retirement date recorded). If the CMDB still shows Active, the next audit will flag it as a discrepancy.
  • Documentation archived with DEPRECATED labels. Move to an archive folder in Confluence/SharePoint — do not delete; future teams may need context.
  • License cancellations confirmed with vendors. Contact procurement to verify cancellation — auto-renewals can silently re-activate.
  • Cost savings appearing in financial reports. Verify in the next billing cycle — lag between shutdown and cost disappearance varies by provider.
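The external-DNS check from the list can be scripted. A sketch; the hostname is a placeholder, the resolvers are the 8.8.8.8 and 1.1.1.1 named in the checklist, and the helper function is an illustration:

```shell
# Confirm the retired hostname no longer resolves at public resolvers.
# legacy-app.example.com is a placeholder hostname.
check_gone() {
  # Empty dig +short output means the record is gone
  if [ -z "$1" ]; then echo "REMOVED"; else echo "STILL RESOLVING: $1"; fi
}

if command -v dig >/dev/null; then
  for ns in 8.8.8.8 1.1.1.1; do
    answer=$(dig +short +time=2 +tries=1 legacy-app.example.com A @"$ns")
    echo "$ns: $(check_gone "$answer")"
  done
fi
```

Querying multiple public resolvers catches the case where one cache still holds the stale record after the authoritative zone was cleaned.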

Orphaned Resource Detection

Cloud providers do not automatically clean up dependent resources when you terminate an instance. These orphans accumulate cost silently. Run these checks immediately after decommission and again 30 days later.

AWS — Find Unattached EBS Volumes

# List all unattached EBS volumes (available = not attached to any instance)
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[*].{ID:VolumeId,Size:Size,Created:CreateTime,AZ:AvailabilityZone}' \
  --output table

# Find unused Elastic IPs (not associated with any ENI)
aws ec2 describe-addresses \
  --query 'Addresses[?AssociationId==null].{IP:PublicIp,AllocID:AllocationId}' \
  --output table

# List EBS snapshots created before a cutoff date (set the date ~90 days back);
# confirm each snapshot's source volume is gone before deleting it
aws ec2 describe-snapshots \
  --owner-ids self \
  --query 'Snapshots[?StartTime<=`2025-01-01`].{ID:SnapshotId,Size:VolumeSize,Start:StartTime,Desc:Description}' \
  --output table

# Find empty S3 buckets (this loop checks emptiness only; review bucket age
# and last-modified times separately before deleting)
for bucket in $(aws s3api list-buckets --query 'Buckets[].Name' --output text); do
  count=$(aws s3api list-objects-v2 --bucket "$bucket" --max-items 1 \
    --query 'KeyCount' --output text 2>/dev/null)
  if [ "$count" = "0" ] || [ "$count" = "None" ]; then
    echo "EMPTY: $bucket"
  fi
done

Azure — Find Orphaned Resources

# Find orphaned managed disks (not attached to any VM)
az disk list \
  --query "[?managedBy==null].{Name:name,RG:resourceGroup,Size:diskSizeGb,State:diskState}" \
  --output table

# Find unused public IP addresses
az network public-ip list \
  --query "[?ipConfiguration==null].{Name:name,RG:resourceGroup,IP:ipAddress,SKU:sku.name}" \
  --output table

# Find empty resource groups (no resources inside)
for rg in $(az group list --query "[].name" --output tsv); do
  count=$(az resource list --resource-group "$rg" --query "length([])" --output tsv)
  if [ "$count" = "0" ]; then
    echo "EMPTY RG: $rg"
  fi
done

# Find orphaned network interfaces
az network nic list \
  --query "[?virtualMachine==null].{Name:name,RG:resourceGroup,PrivateIP:ipConfigurations[0].privateIpAddress}" \
  --output table

GCP — Find Orphaned Resources

# Find unattached persistent disks
gcloud compute disks list \
  --filter="NOT users:*" \
  --format="table(name,zone,sizeGb,status,lastAttachTimestamp)"

# Find unused static external IPs
gcloud compute addresses list \
  --filter="status=RESERVED" \
  --format="table(name,region,address,status)"

# List old snapshots created before a cutoff date
# (confirm the source disk no longer exists before deleting any snapshot)
gcloud compute snapshots list \
  --format="table(name,sourceDisk,diskSizeGb,creationTimestamp)" \
  --filter="creationTimestamp<'2025-01-01'"
Automate orphan detection. Do not rely on manual checks. Schedule these scripts as weekly cron jobs or Lambda/Azure Functions. Pipe results to Slack or email. One missed orphaned volume costs more per year than the time to set up automation.
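A minimal weekly sweep might look like the sketch below. The SLACK_WEBHOOK variable, the payload format, the script path, and the schedule are all assumptions to adapt to your environment:

```shell
# orphan-sweep.sh — weekly unattached-volume report (sketch).
# SLACK_WEBHOOK, the payload shape, and the schedule are assumptions.
build_alert() {
  # Format a whitespace-separated volume list as a Slack webhook payload
  printf '{"text":"Orphaned volumes: %s"}' "$1"
}

if command -v aws >/dev/null; then
  vols=$(aws ec2 describe-volumes --filters Name=status,Values=available \
    --query 'Volumes[*].VolumeId' --output text 2>/dev/null)
  if [ -n "$vols" ] && [ "$vols" != "None" ]; then
    curl -s -X POST -H 'Content-Type: application/json' \
      -d "$(build_alert "$vols")" "$SLACK_WEBHOOK"
  fi
fi

# Cron entry to run this weekly (Monday 06:00 UTC):
# 0 6 * * 1 /usr/local/bin/orphan-sweep.sh
```

Alerting only when the list is non-empty keeps the channel quiet until there is something to act on.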

Cost Savings Tracking

Projected savings are promises. Realized savings are results. Track both rigorously to maintain executive trust and justify future rationalization investment.

  • Projected Savings: total estimated annual savings from the business case
  • Actual Savings: verified reduction appearing in financial reports
  • License Reduction: software license costs eliminated or renegotiated
  • Infra Reduction: compute, storage, and network costs eliminated
  • Support Reduction: vendor support contracts terminated or reduced
  • FTE Reallocation: staff hours redirected to higher-value work

Monthly Savings Tracker

Maintain this table for at least 12 months post-decommission. Finance should validate actuals against projections every quarter.

Month | Category | Projected | Actual | Variance
Month 1 | Infrastructure | $12,000 | $11,400 | -5%
Month 1 | Licensing | $8,500 | $8,500 | 0%
Month 1 | Support | $3,200 | $0 | -100%
Month 2 | Infrastructure | $12,000 | $12,100 | +1%
Month 2 | Licensing | $8,500 | $8,500 | 0%
Month 2 | Support | $3,200 | $3,200 | 0%
Month 3 | Infrastructure | $12,000 | $11,800 | -2%
Month 3 | Licensing | $8,500 | $8,500 | 0%
Month 3 | Support | $3,200 | $3,200 | 0%
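The variance column is (actual − projected) / projected × 100. A one-liner for filling in tracker rows; the figures are the Month 1 examples from the table:

```shell
# Variance for one tracker row: (actual - projected) / projected * 100.
# awk handles the floating-point division and rounding.
variance_pct() {
  awk -v p="$1" -v a="$2" 'BEGIN { printf "%+.0f%%", (a - p) / p * 100 }'
}

variance_pct 12000 11400   # Month 1 Infrastructure → -5%
echo
variance_pct 3200 0        # Month 1 Support → -100%
echo
```

Computing variance the same way every month avoids the sign and rounding inconsistencies that creep into hand-filled spreadsheets.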
Paper savings are not real savings — verify every line item. A decommission that removes infrastructure but forgets to cancel the support contract has not saved money. A license that auto-renews because nobody notified procurement is a direct cost leak. Validate every projected saving against actual invoices, cloud bills, and vendor statements. If the number does not appear in a financial report, it is not a realized saving.

Lessons Learned Framework

Conduct a structured retrospective within 2 weeks of completing each decommission. Document findings in a shared knowledge base so future programs start from a stronger baseline.

Capture What Went Well
Identify practices, tools, and decisions that should be repeated. Was the timeline accurate? Did stakeholder communication prevent surprises? Were rollback procedures adequate? Document specific actions, not vague sentiments. Add successful patterns to the decommissioning runbook as standard steps.
Analyze What Went Wrong
Document failures with root cause analysis, not blame. Did a dependency get missed? Was data migration incomplete? Did a vendor refuse to cooperate? For each failure, identify the systemic cause (missing check, inadequate tooling, unclear ownership) and propose a specific process improvement to prevent recurrence.
Update What Surprised Us
Surprises reveal blind spots in the planning process. An unexpected dependency, a regulatory requirement discovered mid-execution, or a cost that appeared after shutdown all indicate gaps in the assessment framework. Feed these back into discovery templates and risk checklists for future projects.
Improve What to Change Next Time
Translate lessons into concrete runbook updates. Add new checklist items, adjust timeline estimates, revise communication templates, or introduce new validation steps. Every decommission should make the next one faster and more reliable. Assign owners and deadlines for each improvement action.

Continuous Rationalization Program

One-time rationalization projects deliver short-term wins. Sustainable portfolio health requires an ongoing program with regular cadence, governance controls, and measurable outcomes.

Review Cadence

Quarterly Portfolio Review
Review application health scores, usage trends, and cost anomalies. Identify new retirement candidates. Update the rationalization backlog with prioritized targets. Duration: 2-hour steering committee meeting with data pre-read.
Annual Deep Assessment
Full portfolio re-scoring against business strategy. Re-evaluate all Tolerate decisions. Refresh TCO models with current pricing. Align rationalization roadmap with enterprise architecture targets. Duration: 2-4 week assessment cycle.

New Application Intake Governance

Rationalization is futile if new applications enter the portfolio without controls. Every new application request must pass through an approval workflow before procurement or development begins.

01
Business Justification
Requestor submits business case with use case, expected users, alternatives evaluated, and estimated TCO. Must demonstrate that no existing application can meet the need.
02
Duplicate Check
Architecture review board validates against the application catalog. If overlapping capabilities exist, requestor must justify why existing tools are insufficient.
03
Security & Compliance Review
InfoSec evaluates data handling, authentication requirements, and regulatory implications. Legal reviews vendor contracts and data residency obligations.
04
Approval & Cataloging
If approved, the application is registered in the CMDB with a designated owner, support tier, review schedule, and exit criteria defined from day one.
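Step 04's "exit criteria defined from day one" is easy to enforce mechanically. A minimal sketch that rejects an intake record missing mandatory fields; the record shape and field names are hypothetical, not a CMDB standard:

```shell
# Sketch: minimal intake record written at approval time.
# All field names and values are hypothetical examples.
cat > /tmp/app-record.json <<'EOF'
{
  "name": "expense-tracker",
  "owner": "finance-ops",
  "support_tier": "bronze",
  "review_cycle": "annual",
  "exit_criteria": "usage < 10 MAU for 2 consecutive quarters"
}
EOF

# Fail fast if any mandatory field is missing from the record
for field in name owner support_tier review_cycle exit_criteria; do
  grep -q "\"$field\"" /tmp/app-record.json || { echo "missing: $field"; exit 1; }
done
echo "record valid"
```

Registering the exit criteria at intake means the quarterly review can query for applications that have already met their own retirement conditions.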

Shadow IT Monitoring

Shadow IT undermines every rationalization effort. Applications purchased on corporate cards, SaaS tools signed up with work email, and departmental servers running under desks all expand the portfolio invisibly. Continuous detection is essential.

# SaaS Discovery via DNS/Proxy Logs
# Aggregate unique SaaS domains from proxy logs. Assumes Squid's native
# access.log format: field 6 is the HTTP method, field 7 is host:port
# for CONNECT requests.
awk '$6 == "CONNECT" {print $7}' /var/log/squid/access.log \
  | cut -d: -f1 \
  | sort -u \
  | grep -E '\.(io|com|app|cloud|dev|net)$' \
  > /tmp/saas-domains.txt

# Cross-reference discovered domains against the approved catalog.
# comm requires sorted input; -23 keeps lines unique to the first file.
comm -23 \
  <(sort /tmp/saas-domains.txt) \
  <(sort /etc/approved-saas-catalog.txt) \
  > /tmp/shadow-it-candidates.txt

echo "Shadow IT candidates found: $(wc -l < /tmp/shadow-it-candidates.txt)"

# Cloud Account Discovery
# List all active accounts in the AWS Organization, then compare the
# output against your known account inventory to spot unfamiliar accounts
aws organizations list-accounts \
  --query 'Accounts[?Status==`ACTIVE`].{ID:Id,Name:Name,Email:Email}' \
  --output table
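The listing above only enumerates accounts the Organization already knows about; catching truly rogue accounts means diffing an external signal (consolidated billing data, SSO logs) against an approved inventory. A minimal sketch of that diff, using hypothetical sample data in place of real exports:

```shell
# Sketch: flag account IDs seen in billing data but absent from the
# approved inventory. Both files are hypothetical sample data.
sort > /tmp/billed-accounts.txt <<'EOF'
111111111111
222222222222
333333333333
EOF

sort > /tmp/approved-accounts.txt <<'EOF'
111111111111
222222222222
EOF

# comm -23: lines unique to the first (billed) file
comm -23 /tmp/billed-accounts.txt /tmp/approved-accounts.txt \
  > /tmp/unknown-accounts.txt
cat /tmp/unknown-accounts.txt
# prints the one account ID with no approval record: 333333333333
```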

# CASB (Cloud Access Security Broker) Integration
# Most enterprises should deploy a CASB for automated shadow IT discovery:
# - Microsoft Defender for Cloud Apps
# - Netskope
# - Palo Alto Prisma SaaS
# These integrate with SSO, proxy, and endpoint agents for real-time visibility.

KPI Dashboard

Track these metrics at the program level to demonstrate value and identify areas needing attention.

0
Portfolio Size
Total active applications (trend: decreasing)
0
Retired This Quarter
Applications successfully decommissioned
$0
Annual Savings
Realized cost savings (verified by finance)
0 days
Avg Cycle Time
Mean time from decision to completed decommission
0%
Tech Debt Ratio
Improvement in technical debt score (trending down)
0%
CMDB Accuracy
Percentage of applications with current, validated records
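Most of these KPIs reduce to simple aggregations over a decommission log. A sketch of the average-cycle-time calculation, assuming a hypothetical CSV export of completed decommissions:

```shell
# Sketch: mean time from decision to completed decommission.
# File name and columns are hypothetical examples.
cat > /tmp/decomm-log.csv <<'EOF'
app,cycle_days
crm-legacy,84
hr-portal,112
fax-gateway,61
EOF

awk -F, 'NR>1 {sum+=$2; n++} END {printf "Avg cycle time: %.0f days\n", sum/n}' \
  /tmp/decomm-log.csv
# prints "Avg cycle time: 86 days"
```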

Program Maturity Model

Assess where your organization sits today and define a roadmap to the next level. Most enterprises begin at Level 1 or 2. Reaching Level 3 typically takes 12-18 months of sustained effort.

Level Name Characteristics
1 Ad Hoc No formal process. Decommissioning happens reactively when hardware fails or contracts expire. No portfolio visibility. Savings are accidental, not measured. Shadow IT is rampant and undetected.
2 Defined Documented rationalization process exists. Manual data collection and scoring. Decommissioning follows a checklist but execution is inconsistent. Basic cost tracking in spreadsheets. CMDB exists but accuracy is below 70%.
3 Managed Regular quarterly cadence with executive sponsorship. Automated discovery tools deployed. Metrics tracked in dashboards. Intake governance prevents uncontrolled portfolio growth. CMDB accuracy above 85%. Average decommission cycle under 90 days.
4 Optimized Fully automated discovery and continuous rationalization. Predictive analytics identify retirement candidates before they become problems. Real-time cost attribution. Shadow IT detected within 24 hours. CMDB accuracy above 95%. Decommissioning is a standard, low-friction operational process.
80% of CIOs recognize rationalization as critical, but only 20% have fully implemented a program: be in the 20%. The gap between recognition and execution is where most organizations stall. Start with a single quarterly review cycle, retire 5 applications, measure the savings, and publish the results. Quick wins build the political capital to formalize the program. Perfection is the enemy of progress: a Level 2 program running consistently beats a Level 4 program stuck in planning.