
Rationalization & Decommissioning

A practitioner's reference for IT portfolio rationalization, application assessment, migration planning, and systematic decommissioning of legacy systems.

01

Quick Reference

Decision frameworks, key acronyms, and common triggers at a glance. Use this section for fast lookups during planning and assessment meetings.

Rationalization Decision Tree

Walk through this flowchart to arrive at a TIME quadrant classification for any application in your portfolio.

                        +---------------------------+
                        |   Is the application      |
                        |   actively used?          |
                        +---------------------------+
                           /                  \
                         YES                   NO
                         /                      \
              +-----------------+        +-----------------+
              | Does it provide |        | Are there legal |
              | business value? |        | or compliance   |
              +-----------------+        | retention reqs? |
                /           \            +-----------------+
              YES            NO            /           \
              /               \          YES            NO
     +-------------+    +------------+     |        +-----------+
     | Is it       |    | Can users  |     |        | ELIMINATE |
     | technically |    | migrate to |  RETAIN      | (Retire)  |
     | sound?      |    | another    |  w/ minimal  +-----------+
     +-------------+    | system?    |  investment
      /        \       +------------+
    YES        NO        /       \
    /           \      YES       NO
+---------+ +----------+ +----------+
| INVEST  | | MIGRATE  | | TOLERATE |
| (Keep & | | (Move to | | (Maintain|
|  grow)  | |  modern) | |  as-is)  |
+---------+ +----------+ +----------+
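For teams that script their triage, the tree above can be encoded as a small function. This is a sketch: the boolean parameters are illustrative names, not fields from any particular inventory or CMDB.

```python
def classify(actively_used: bool,
             provides_value: bool = False,
             technically_sound: bool = False,
             users_can_migrate: bool = False,
             retention_required: bool = False) -> str:
    """Walk the decision tree above to an outcome label."""
    if not actively_used:
        # Unused: keep only if legal/compliance retention applies
        return "RETAIN" if retention_required else "ELIMINATE"
    if not provides_value:
        # Used but low-value: move users off if an alternative exists
        return "MIGRATE" if users_can_migrate else "TOLERATE"
    # Used and valuable: invest if healthy, otherwise modernize
    return "INVEST" if technically_sound else "MIGRATE"

print(classify(actively_used=False))                     # ELIMINATE
print(classify(actively_used=True, provides_value=True,
               technically_sound=False))                 # MIGRATE
```

Encoding the tree this way also makes the classification auditable: the same inputs always yield the same quadrant.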

Key Acronyms

TCO Total Cost of Ownership
Full lifecycle cost: licensing, infrastructure, support, labor, and opportunity cost
CMDB Configuration Management Database
Authoritative repository of IT asset records and their relationships
ITAM IT Asset Management
Discipline for tracking hardware, software, and cloud asset lifecycle
APM Application Portfolio Management
Governance framework for evaluating and rationalizing the application estate
EoL End of Life
Vendor-announced date when a product will no longer be sold or actively developed
EoS End of Support
Date when vendor ceases security patches and technical support
TIME Tolerate, Invest, Migrate, Eliminate
Gartner-originated model for categorizing applications by business value vs. technical fitness
6Rs Rehost, Replatform, Refactor, Repurchase, Retain, Retire
AWS migration strategy framework adopted industry-wide
CAB Change Advisory Board
Governance body that reviews and approves changes to production systems
RACI Responsible, Accountable, Consulted, Informed
Matrix for clarifying roles and decision rights in cross-functional work
SLA Service Level Agreement
Contract defining uptime, response time, and support commitments

Common Rationalization Triggers

Rationalization initiatives are typically catalyzed by one or more of these organizational events:

Mergers & Acquisitions
Duplicate systems from combined portfolios create redundancy. Post-M&A integration demands rapid assessment of overlapping capabilities.
Cloud Migration
Lift-and-shift deadlines force evaluation of every workload. Many apps are retired rather than migrated when their true usage is measured.
Cost Reduction
Budget pressure from leadership or economic downturn. Eliminating shelfware, unused licenses, and redundant infrastructure yields quick wins.
Compliance & Regulation
GDPR, HIPAA, SOX, or sector-specific mandates require knowing where data lives. Unsupported systems become compliance liabilities.
End of Life / End of Support
Vendor discontinues patches or support. Running EoS software increases security risk and may violate audit requirements.
Digital Transformation
Modernization initiatives replace legacy monoliths with microservices, SaaS, or low-code platforms. Old systems must be retired cleanly.
Technical Debt Reduction
Accumulated complexity from decades of customization. Rationalization breaks the cycle by eliminating systems that cost more to maintain than replace.

TIME Model Quadrants

Each application in your portfolio maps to one of four quadrants based on its business value and technical quality:

Tolerate Low Value / High Quality
The app works and is technically sound, but it is not strategic and not worth the disruption of replacement right now. Action: Minimize investment. Limit to break-fix maintenance only. Set a sunset date and communicate it.
Invest High Value / High Quality
Strategic, well-architected applications that drive competitive advantage. Action: Fund enhancements, scale the platform, and treat it as a core asset. Protect its health with monitoring and regular upgrades.
Migrate High Value / Low Quality
Critical business capability trapped in aging technology. Action: Plan migration to a modern platform. Evaluate the 6Rs to choose the right migration strategy. Prioritize based on risk and EoS dates.
Eliminate Low Value / Low Quality
Little business value and poor technical fitness. Action: Decommission. Archive data per retention policy, redirect users to alternatives, and reclaim infrastructure. Often the quickest cost savings.

The 6Rs at a Glance

When an application survives rationalization, choose the right migration path:

R1
Rehost (Lift & Shift)
Move the application as-is to new infrastructure (typically cloud IaaS). No code changes. Fastest path but does not modernize. Best for time-pressured data center exits.
R2
Replatform (Lift & Reshape)
Make targeted optimizations during migration: swap the database to a managed service, containerize the runtime, or update the OS. Moderate effort with meaningful gains.
R3
Refactor (Re-architect)
Redesign the application to take full advantage of cloud-native capabilities. Break monoliths into microservices, adopt serverless, or rebuild with modern frameworks. Highest effort but greatest long-term benefit.
R4
Repurchase (Replace with SaaS)
Drop the custom or on-prem solution and buy a commercial SaaS equivalent. Common for CRM, ITSM, HR, and ERP workloads. Requires data migration and user retraining.
R5
Retain (Revisit Later)
Keep the application where it is for now. Used when there are dependencies, compliance constraints, or insufficient business case to act. Set a review date.
R6
Retire (Decommission)
Switch off the application entirely. Archive data, notify stakeholders, reclaim licenses and infrastructure. This guide focuses heavily on executing this path correctly.
02

Portfolio Discovery

Before you can rationalize, you must know what you have. Discovery builds the authoritative application inventory that every downstream decision depends on.

Application Inventory Techniques

No single method catches everything. Combine automated and manual approaches for completeness:

Automated Discovery
Network scanning, agent-based discovery, and cloud API enumeration. Tools interrogate infrastructure to find running workloads, installed software, and open ports. Fast coverage but misses business context.
Manual Audit
Surveys, interviews with business unit owners, and review of existing documentation. Captures application purpose, business criticality, and user counts that automated tools cannot infer. Slow but essential for context.
Financial Analysis
Pull licensing costs, cloud spend, vendor contracts, and support invoices from procurement and finance systems. Cross-reference with discovered assets to build TCO. Often reveals forgotten subscriptions and shelfware.

CMDB: Key Attributes to Capture

Your Configuration Management Database becomes the single source of truth. At minimum, capture these attributes for every application:

Attribute           | Type       | Why It Matters
--------------------|------------|------------------------------------------------------------------
Application Name    | String     | Canonical name for unambiguous reference across teams
App ID              | Unique Key | Machine-readable identifier linking to all related CIs
Business Owner      | Contact    | Person accountable for business decisions about the app
Technical Owner     | Contact    | Person responsible for uptime, patching, and architecture
Technology Stack    | Tags       | Languages, frameworks, databases, and middleware in use
Annual Cost (TCO)   | Currency   | Infra + licenses + support + labor to run the application
Active Users        | Integer    | Monthly active users; zero signals an elimination candidate
Integrations        | List       | Upstream/downstream systems; determines decommission blast radius
Lifecycle Stage     | Enum       | Active / Sunset / Decommissioned / Under Review
EoL / EoS Dates     | Date       | Vendor end-of-life and end-of-support deadlines
Data Classification | Enum       | Public / Internal / Confidential / Restricted; governs retention rules
Hosting Environment | Enum       | On-prem / IaaS / PaaS / SaaS / Hybrid; affects migration path

Discovery Tools Comparison

Tool                 | Type                | Best For                          | Discovery Method          | Cost Model
---------------------|---------------------|-----------------------------------|---------------------------|-------------------------
ServiceNow Discovery | ITSM + CMDB         | Enterprises already on ServiceNow | Agent + agentless probes  | Per-node subscription
LeanIX               | EA / APM            | Application portfolio management  | Integration APIs + manual | SaaS per-user
AWS Migration Hub    | Cloud migration     | AWS-bound migrations              | Agent (ADS) + connector   | Free (AWS native)
Azure Migrate        | Cloud migration     | Azure-bound migrations            | Appliance-based scan      | Free (Azure native)
Flexera One          | ITAM / SAM          | License optimization, SaaS mgmt   | Agent + beacon + API      | Per-device subscription
Cloudockit           | Cloud documentation | Multi-cloud architecture diagrams | API read-only access      | SaaS subscription
Tip: multi-source correlation. No single tool captures the full picture. Best practice is to run automated discovery, then enrich results with financial data from procurement and business context from owner interviews.

Shadow IT Detection Methods

Up to 40% of enterprise IT spend occurs outside official channels. Finding shadow IT is critical for an accurate inventory.

CASB Analysis
Cloud Access Security Brokers (Netskope, Zscaler, McAfee MVISION) intercept traffic to identify unsanctioned SaaS usage. They provide risk scores and user counts per app.
SSO / IdP Logs
Review Okta, Azure AD, or Ping Identity logs for SAML/OIDC integrations. Apps appearing in SSO that are not in the CMDB are shadow IT candidates.
Expense Reports
Search corporate credit card and expense reimbursement data for recurring SaaS charges. Departments often procure tools independently through expense accounts.
Network Traffic Analysis
DNS logs and firewall records reveal external services being accessed. Unusual outbound connections to SaaS endpoints indicate unsanctioned tools.
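A minimal way to operationalize the SSO-vs-CMDB comparison is a set difference over normalized app names. This is a sketch assuming you have already exported the two name lists; the normalization rule is a deliberate simplification and will miss some vendor-name variants.

```python
import re

def find_shadow_it(sso_apps, cmdb_apps):
    """Return apps seen in SSO/CASB logs with no CMDB record."""
    def norm(name: str) -> str:
        # Crude normalization: lowercase, strip punctuation/whitespace
        return re.sub(r"[^a-z0-9]", "", name.lower())
    known = {norm(a) for a in cmdb_apps}
    return sorted({a for a in sso_apps if norm(a) not in known})

# Invented sample data: two SSO-visible apps are missing from the CMDB
sso_apps  = ["Salesforce", "Notion", "Trello", "Workday"]
cmdb_apps = ["Salesforce", "Workday"]
print(find_shadow_it(sso_apps, cmdb_apps))  # ['Notion', 'Trello']
```

Every name this diff surfaces is a shadow IT candidate to triage, not an automatic violation; some will simply be missing CMDB records.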

AWS Discovery CLI Commands

Use the AWS Application Discovery Service to enumerate workloads prior to migration:

aws-cli / discovery
# Export agent-collected discovery data continuously to S3 for analysis
aws discovery start-continuous-export

# List discovered servers with key attributes
aws discovery list-configurations \
    --configuration-type SERVER

# Export server details for offline analysis
aws discovery start-export-task \
    --export-data-format CSV

# Check export status
aws discovery describe-export-tasks

# List discovered applications (agent-grouped)
aws discovery list-configurations \
    --configuration-type APPLICATION

# Tag servers with rationalization decisions
aws discovery create-tags \
    --configuration-ids "d-server-01a2b3c4d5" \
    --tags key=RationalizationDecision,value=RETIRE

# Fetch network connection data between servers
aws discovery list-configurations \
    --configuration-type CONNECTION \
    --filters name=sourceServerId,values=d-server-01a2b3c4d5,condition=EQUALS

CMDB Schema Snippet

A minimal relational schema for tracking applications and their dependencies:

SQL / PostgreSQL
CREATE TABLE applications (
    app_id          VARCHAR(20) PRIMARY KEY,
    app_name        VARCHAR(255) NOT NULL,
    description     TEXT,
    business_owner  VARCHAR(255),
    technical_owner VARCHAR(255),
    business_unit   VARCHAR(100),
    criticality     VARCHAR(20) CHECK (criticality IN
                        ('Critical','High','Medium','Low')),
    lifecycle_stage VARCHAR(30) CHECK (lifecycle_stage IN
                        ('Active','Sunset','Decommissioned','Under Review')),
    hosting_env     VARCHAR(30) CHECK (hosting_env IN
                        ('On-Prem','IaaS','PaaS','SaaS','Hybrid')),
    tech_stack      TEXT[],           -- Array of technology tags
    annual_tco      NUMERIC(12,2),
    active_users    INTEGER DEFAULT 0,
    data_class      VARCHAR(30),
    eol_date        DATE,
    eos_date        DATE,
    time_category   VARCHAR(20),      -- Tolerate/Invest/Migrate/Eliminate
    six_r_strategy  VARCHAR(20),      -- Rehost/Replatform/Refactor/etc.
    created_at      TIMESTAMPTZ DEFAULT NOW(),
    updated_at      TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE app_dependencies (
    id              SERIAL PRIMARY KEY,
    source_app_id   VARCHAR(20) REFERENCES applications(app_id),
    target_app_id   VARCHAR(20) REFERENCES applications(app_id),
    dependency_type VARCHAR(50),      -- API, Database, File, Message Queue
    protocol        VARCHAR(50),      -- REST, SOAP, JDBC, SFTP, AMQP
    data_flow       VARCHAR(20) CHECK (data_flow IN
                        ('Inbound','Outbound','Bidirectional')),
    criticality     VARCHAR(20),
    description     TEXT,
    UNIQUE(source_app_id, target_app_id, dependency_type)
);

CREATE TABLE app_costs (
    id              SERIAL PRIMARY KEY,
    app_id          VARCHAR(20) REFERENCES applications(app_id),
    cost_category   VARCHAR(50),      -- License, Infrastructure, Support, Labor
    annual_amount   NUMERIC(12,2),
    vendor          VARCHAR(255),
    contract_end    DATE,
    notes           TEXT
);

-- Useful indexes for rationalization queries
CREATE INDEX idx_apps_lifecycle ON applications(lifecycle_stage);
CREATE INDEX idx_apps_time ON applications(time_category);
CREATE INDEX idx_apps_eol ON applications(eol_date);
CREATE INDEX idx_deps_source ON app_dependencies(source_app_id);
CREATE INDEX idx_deps_target ON app_dependencies(target_app_id);

Phase 1 Discovery Checklist (Weeks 1-4)

  • Identify executive sponsor and secure mandate for discovery. Without top-down authority, business units may not cooperate with data requests.
  • Deploy automated discovery agents across all network segments. Cover production, staging, DR, and DMZ environments.
  • Extract software asset data from SCCM / Intune / Jamf. Desktop and endpoint software is often overlooked.
  • Pull cloud resource inventories from AWS, Azure, and GCP accounts. Use native tools: AWS Config, Azure Resource Graph, GCP Asset Inventory.
  • Request vendor contract and licensing data from Procurement. Include renewal dates, termination clauses, and per-seat costs.
  • Distribute application owner survey to all business units. Keep surveys short: app name, purpose, users, criticality, alternatives.
  • Correlate CASB and SSO logs to detect shadow IT. Flag any discovered app not present in the CMDB.
  • Establish the CMDB schema and begin populating records. Start with automated data; enrich with manual survey results in weeks 3-4.
  • Validate discovered inventory with infrastructure and security teams. Cross-check against firewall rules, DNS records, and load balancer configs.
  • Publish Phase 1 Discovery Report with app count and coverage metrics. The report should include confidence level and known gaps.
03

Assessment Frameworks

Structured scoring models that transform subjective opinions into defensible, data-driven rationalization decisions. These frameworks create consistency across hundreds of applications.

TIME Model Deep Dive

The TIME model plots every application on a 2x2 matrix. The Y-axis represents business value (strategic alignment, revenue contribution, user satisfaction). The X-axis represents technical fitness (architecture quality, security posture, maintainability, performance).

                        TECHNICAL FITNESS
                   Low                    High
               +------------------+------------------+
               |                  |                  |
   High        |    MIGRATE       |    INVEST        |
               |                  |                  |
               |  High value but  |  Strategic and   |
  BUSINESS     |  poor tech.      |  well-built.     |
  VALUE        |  Modernize or    |  Fund growth.    |
               |  replatform.     |                  |
               +------------------+------------------+
               |                  |                  |
   Low         |    ELIMINATE     |    TOLERATE      |
               |                  |                  |
               |  No value, bad   |  Works fine but  |
               |  tech. Retire    |  not strategic.  |
               |  immediately.    |  Minimal invest. |
               |                  |                  |
               +------------------+------------------+
Common mistake: skipping the Tolerate quadrant. Many teams want to act on every application, but Tolerate is a valid strategy. Some apps cost little, serve a niche function, and would cause more disruption to replace than to maintain. Assign a review date and move on.
Typical portfolio distribution after assessment:

  • Eliminate: ~30% of apps in large portfolios
  • Tolerate: ~25% (minimal investment needed)
  • Migrate: ~25% (require modernization)
  • Invest: ~20% (strategic growth targets)

Business Value Scoring Criteria

Score each application from 1 (lowest) to 5 (highest) across these dimensions, then compute a weighted average:

Criterion            | Weight | 1 (Low)                                  | 5 (High)
---------------------|--------|------------------------------------------|------------------------------------------------
Strategic Alignment  | 30%    | No link to any strategic initiative      | Directly enables top-3 business objective
Business Criticality | 25%    | Failure has zero operational impact      | Outage stops revenue or violates regulations
Utilization          | 20%    | 0 active users in past 90 days           | Used daily by 500+ users across units
Revenue Impact       | 15%    | No revenue link; pure cost center        | Directly generates or enables revenue stream
User Satisfaction    | 10%    | Frequent complaints; workarounds common  | High NPS; users actively advocate for the tool

Technical Fitness Scoring Criteria

Evaluate the technical health of each application across these dimensions:

Criterion                 | Weight | 1 (Low)                                     | 5 (High)
--------------------------|--------|---------------------------------------------|--------------------------------------------------
Architecture Quality      | 25%    | Monolith, no API, tightly coupled           | Modular, well-documented APIs, loosely coupled
Security Posture          | 25%    | Known CVEs, no patching, EoS components     | Current patches, pen-tested, compliant
Performance & Reliability | 20%    | Frequent outages, slow response times       | 99.9%+ uptime, sub-second response
Maintainability           | 15%    | No docs, no tests, single expert dependency | Well-documented, CI/CD, multiple maintainers
Scalability               | 15%    | Cannot handle 2x current load               | Auto-scales horizontally, elastic infrastructure

Scoring Formulas

Compute composite scores and map them to TIME quadrants using threshold values:

Formula
# Business Value Score (BVS)
BVS = (Strategic_Alignment * 0.30)
    + (Business_Criticality * 0.25)
    + (Utilization * 0.20)
    + (Revenue_Impact * 0.15)
    + (User_Satisfaction * 0.10)

# Technical Fitness Score (TFS)
TFS = (Architecture_Quality * 0.25)
    + (Security_Posture * 0.25)
    + (Performance_Reliability * 0.20)
    + (Maintainability * 0.15)
    + (Scalability * 0.15)

# TIME Quadrant Assignment (threshold = 3.0)
if BVS >= 3.0 and TFS >= 3.0:  INVEST
if BVS >= 3.0 and TFS <  3.0:  MIGRATE
if BVS <  3.0 and TFS >= 3.0:  TOLERATE
if BVS <  3.0 and TFS <  3.0:  ELIMINATE

# Composite Score (for ranked prioritization)
Composite = (BVS * 0.55) + (TFS * 0.45)
Calibration tip: Run a calibration session with 10 representative apps across the portfolio. Score them independently, then compare results as a group. Adjust weight percentages if scores cluster too tightly or if a criterion does not differentiate well.

Assessment Questionnaire: Key Questions

Use these questions during stakeholder interviews to gather the data needed for scoring:

Business Context
  1. What business process does this application support?
  2. What happens if this application is unavailable for 4 hours? 24 hours? 1 week?
  3. How many unique users accessed this application in the last 90 days?
  4. Does this application directly generate or enable revenue? If so, estimate the annual amount.
  5. Which strategic initiatives (if any) depend on this application?
  6. Is there an alternative system that provides similar or overlapping capability?
Technical Health
  1. What is the primary technology stack (language, framework, database, OS)?
  2. When was the last security patch applied? Are there known unpatched CVEs?
  3. Is the application vendor-supported? If so, what is the EoL/EoS date?
  4. How many people can maintain this application? Is there a single point of expertise?
  5. Does the application have automated tests and a CI/CD pipeline?
  6. What is the average monthly uptime over the past 12 months?
Cost & Dependencies
  1. What is the total annual cost to run this application (infrastructure + licenses + support + labor)?
  2. List all systems that send data to or receive data from this application.
  3. What data does this application store? What is the data classification level?
  4. Are there regulatory or contractual requirements that mandate keeping this data for a specific period?
  5. What is the vendor contract renewal date and cancellation notice period?

Python Assessment Scoring Script

Automate TIME quadrant assignment from CSV survey data:

Python 3
#!/usr/bin/env python3
"""
Rationalization Scoring Engine
Reads application survey data and assigns TIME quadrants.
Input:  CSV with columns for each scoring criterion (1-5 scale)
Output: CSV with BVS, TFS, TIME quadrant, and composite score
"""
import csv
import sys
from dataclasses import dataclass
from typing import List

# --- Weight Configuration ---
BV_WEIGHTS = {
    "strategic_alignment": 0.30,
    "business_criticality": 0.25,
    "utilization": 0.20,
    "revenue_impact": 0.15,
    "user_satisfaction": 0.10,
}

TF_WEIGHTS = {
    "architecture_quality": 0.25,
    "security_posture": 0.25,
    "performance_reliability": 0.20,
    "maintainability": 0.15,
    "scalability": 0.15,
}

QUADRANT_THRESHOLD = 3.0


@dataclass
class AppAssessment:
    app_id: str
    app_name: str
    bvs: float = 0.0
    tfs: float = 0.0
    quadrant: str = ""
    composite: float = 0.0


def compute_weighted_score(row: dict, weights: dict) -> float:
    """Compute weighted average from survey responses."""
    score = 0.0
    for criterion, weight in weights.items():
        value = float(row.get(criterion, 0))
        value = max(1.0, min(5.0, value))  # Clamp to 1-5
        score += value * weight
    return round(score, 2)


def assign_quadrant(bvs: float, tfs: float) -> str:
    """Map scores to TIME quadrant."""
    if bvs >= QUADRANT_THRESHOLD and tfs >= QUADRANT_THRESHOLD:
        return "INVEST"
    elif bvs >= QUADRANT_THRESHOLD and tfs < QUADRANT_THRESHOLD:
        return "MIGRATE"
    elif bvs < QUADRANT_THRESHOLD and tfs >= QUADRANT_THRESHOLD:
        return "TOLERATE"
    else:
        return "ELIMINATE"


def process_portfolio(input_file: str, output_file: str) -> List[AppAssessment]:
    """Process all applications and write scored output."""
    results = []

    with open(input_file, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            app = AppAssessment(
                app_id=row["app_id"],
                app_name=row["app_name"],
            )
            app.bvs = compute_weighted_score(row, BV_WEIGHTS)
            app.tfs = compute_weighted_score(row, TF_WEIGHTS)
            app.quadrant = assign_quadrant(app.bvs, app.tfs)
            app.composite = round(app.bvs * 0.55 + app.tfs * 0.45, 2)
            results.append(app)

    # Sort by composite score (lowest first = highest priority to act)
    results.sort(key=lambda a: a.composite)

    # Write output
    with open(output_file, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["app_id", "app_name", "bvs", "tfs",
                          "quadrant", "composite"])
        for app in results:
            writer.writerow([app.app_id, app.app_name, app.bvs,
                             app.tfs, app.quadrant, app.composite])

    # Print summary
    quadrant_counts = {}
    for app in results:
        quadrant_counts[app.quadrant] = quadrant_counts.get(
            app.quadrant, 0) + 1

    print(f"\n{'='*50}")
    print(f"  Portfolio Assessment Summary")
    print(f"  Total Applications: {len(results)}")
    print(f"{'='*50}")
    for q in ["INVEST", "MIGRATE", "TOLERATE", "ELIMINATE"]:
        count = quadrant_counts.get(q, 0)
        pct = (count / len(results) * 100) if results else 0
        bar = "#" * int(pct / 2)
        print(f"  {q:10s}  {count:4d}  ({pct:5.1f}%)  {bar}")
    print(f"{'='*50}\n")

    return results


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python assess.py input.csv output.csv")
        sys.exit(1)
    process_portfolio(sys.argv[1], sys.argv[2])

Assessment Best Practices

Score in cohorts, not isolation: Assess applications in groups of 10-15 from the same business domain. This provides natural comparison points and helps calibrate scoring consistency across assessors.
Watch for ownership bias: Application owners tend to overstate business value and understate technical debt. Use data (usage metrics, incident counts, CVE scans) to validate subjective responses. A second assessor improves objectivity.
Never skip dependency mapping before eliminating: An app that looks unused may still provide a critical data feed, authentication service, or file-transfer function to another system. Section 05 covers dependency analysis in detail.
Document the "why" for every decision: Months later, someone will ask why App X was tagged for elimination. Capture the assessment rationale in the CMDB record alongside the scores. Include who was interviewed, what data was examined, and the date of assessment.
04

Business Case & Cost Analysis

Rationalization only happens when leadership sees the numbers. This section covers how to build a defensible financial case with TCO models, ROI projections, and industry benchmarks that survive executive scrutiny.

TCO Calculation Methodology

Total Cost of Ownership extends far beyond the license fee. A rigorous TCO model accounts for every dollar spent to keep an application running, including costs that never appear on the application's budget line.

Cost Category           | Components                                                                    | Example Annual Cost
------------------------|-------------------------------------------------------------------------------|--------------------
Infrastructure          | Servers (physical/virtual), storage, network, load balancers, DR replication | $120,000 - $400,000
Licensing               | Software licenses, SaaS subscriptions, database licenses, middleware          | $50,000 - $250,000
Support Contracts       | Vendor premium support, extended support (post-EoS), managed services         | $30,000 - $150,000
Internal Labor          | FTEs for operations, development, DBA, security, project management           | $180,000 - $500,000
Training                | Onboarding new staff, vendor certification, ongoing skills development        | $10,000 - $40,000
Opportunity Cost        | Engineering time spent on legacy maintenance instead of innovation            | $80,000 - $300,000
Technical Debt Interest | Escalating cost of workarounds, security patches, compatibility shims         | $25,000 - $100,000
Integration Maintenance | Keeping point-to-point integrations alive as connected systems change         | $15,000 - $60,000
Hidden costs of parallel running: During any migration or decommission, you will run old and new systems simultaneously for weeks or months. This overlap adds 10-20% to your total migration budget. Budget for dual licensing, dual infrastructure, dual support, and the labor to keep both environments synchronized.

Additional hidden costs that are frequently underestimated:

Data Migration Tooling
ETL scripts, data transformation tools, validation frameworks, and reconciliation reports. Often requires specialized consultants for legacy formats.
Extended License Overlap
Vendor contracts rarely align with migration timelines. You may pay for 6-12 months of licenses on a system that is being decommissioned while the contract runs out.
User Retraining
Migrating users to a replacement system requires training materials, sessions, help desk surge capacity, and productivity loss during the transition curve.
Consultant & Contractor Fees
Specialized knowledge for legacy platforms (mainframe, COBOL, proprietary ERP) often requires external consultants at premium rates during decommission.
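Putting the cost table and the parallel-running warning together, a budget calculation might be sketched as follows. The category amounts, uplift, and contingency values here are illustrative assumptions, not benchmarks:

```python
def total_tco(costs: dict) -> float:
    """Annual TCO: sum every cost category, visible and hidden."""
    return sum(costs.values())

def migration_budget(base_cost: float,
                     parallel_uplift: float = 0.15,
                     contingency: float = 0.25) -> float:
    """Apply a parallel-running uplift (10-20% range) and a contingency
    buffer (20-30% range) on top of the estimated one-time cost."""
    return base_cost * (1 + parallel_uplift) * (1 + contingency)

# Illustrative category amounts for a single mid-size application
costs = {
    "infrastructure": 180_000, "licensing": 90_000,
    "support_contracts": 45_000, "internal_labor": 260_000,
    "training": 15_000, "opportunity_cost": 120_000,
    "tech_debt_interest": 40_000, "integration_maint": 25_000,
}
print(total_tco(costs))                  # 775000
print(round(migration_budget(400_000)))  # 575000
```

The compounding of uplift and contingency is deliberate: a $400K estimate becomes a $575K budget, which is why hidden costs so often cause 20-30% overruns when they are not modeled up front.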

ROI Modeling

The return on investment formula for rationalization is straightforward, but the inputs require careful estimation. Conservative assumptions build credibility with finance teams.

ROI Formula
# ================================================
# ROI Calculation for Application Rationalization
# ================================================

# Step 1: Calculate Annual Savings
#   Sum of all costs eliminated when the app is retired
annual_savings = (
    infrastructure_cost       # Servers, storage, network freed
  + license_cost              # Licenses terminated or reallocated
  + support_contract_cost     # Vendor support cancelled
  + labor_cost_reduction      # FTE hours redirected to other work
  + integration_maintenance   # Point-to-point integrations removed
)

# Step 2: Calculate One-Time Migration Cost
migration_cost = (
    data_migration_labor      # ETL development, validation, testing
  + parallel_running_cost     # Dual-environment period (10-20% budget)
  + user_retraining           # Training materials, sessions, downtime
  + consultant_fees           # External expertise for legacy platforms
  + tooling_and_automation    # Migration scripts, reconciliation tools
  + project_management        # PM, change management, communications
  + contingency_buffer        # 20-30% of above for unknowns
)

# Step 3: Compute ROI
roi_percent = ((annual_savings - migration_cost) / migration_cost) * 100

# Step 4: Break-Even Analysis
break_even_months = migration_cost / (annual_savings / 12)

# ================================================
# Example: Legacy CRM Decommission
# ================================================
annual_savings   = 340_000   # $340K/year total cost eliminated
migration_cost   = 480_000   # $480K one-time migration cost

roi_year_1 = ((340_000 - 480_000) / 480_000) * 100  # -29.2% (still paying off)
roi_year_2 = ((680_000 - 480_000) / 480_000) * 100  # +41.7% (cumulative savings)
roi_year_3 = ((1_020_000 - 480_000) / 480_000) * 100 # +112.5%

break_even = 480_000 / (340_000 / 12)  # 16.9 months
Break-even benchmark: Major rationalization programs typically break even in 3-4 years. Individual application retirements with low migration complexity can break even in 6-12 months. Present the 3-year and 5-year cumulative savings to show the compounding value.
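The worked CRM example generalizes directly into two small functions; this sketch reproduces the same figures as the formula block above:

```python
def cumulative_roi(annual_savings: float, migration_cost: float,
                   years: int) -> float:
    """Cumulative ROI % after N years of recurring savings."""
    return (annual_savings * years - migration_cost) / migration_cost * 100

def break_even_months(annual_savings: float, migration_cost: float) -> float:
    """Months until cumulative savings cover the one-time cost."""
    return migration_cost / (annual_savings / 12)

# Legacy CRM figures: $340K/yr savings against a $480K one-time cost
for yr in (1, 2, 3):
    print(f"Year {yr}: {cumulative_roi(340_000, 480_000, yr):+.1f}%")
print(f"Break-even: {break_even_months(340_000, 480_000):.1f} months")
```

Running this prints the -29.2% / +41.7% / +112.5% trajectory and the 16.9-month break-even from the example.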

Real-World Case Studies

Documented results from organizations that have executed rationalization programs at scale:

Trinity Health
Retired 740+ applications across a multi-hospital system. Achieved $68M in recurring annual savings by eliminating redundant clinical and administrative systems post-merger. Multi-year program driven by M&A integration.
Fortune 100 Financial Services
Targeted 450 applications for rationalization as part of a cloud migration initiative. Delivered $7M+ in annual savings in the first wave by retiring redundant trading and reporting platforms.
Real Estate Firm
Rationalized 120 applications across property management, leasing, and back-office functions. Achieved $1.4M annual savings and reduced integration complexity by 40%.
Global Bank — License Discovery
License audit discovered 66,000 unused software licenses across the enterprise. Reclamation and termination yielded $4.8M in immediate savings without decommissioning a single application.
SaaS Rationalization Initiative
Mid-size enterprise audited SaaS subscriptions across all departments. Found 53% of licenses unused. Consolidated overlapping tools and achieved $462K annual savings (33% reduction in SaaS spend).

Industry Benchmarks

Reference data points for benchmarking your rationalization program against industry norms:

  • 15-30% average IT cost reduction from portfolio rationalization programs
  • 53% of SaaS licenses unused; $21M average waste per enterprise
  • 20-30% budget overrun risk from hidden costs; add contingency accordingly
  • 80% of CIOs see rationalization as critical, but only 20% have fully implemented it

Cost-Benefit Template

Use this template to structure the financial case for each rationalization candidate. Fill in actual values from discovery and assessment data:

Line Item | Category | Formula / Source | Annual Value
Infrastructure savings | Benefit | Server + storage + network costs from CMDB | $___,___
License termination | Benefit | Contract value from Procurement records | $___,___
Support contract savings | Benefit | Vendor support fees (check cancellation terms) | $___,___
Labor reallocation | Benefit | FTE hours x blended rate (ops + dev + DBA) | $___,___
Risk reduction value | Benefit | P(breach) x impact estimate for EoS systems | $___,___
Data migration cost | Cost | ETL development + validation + testing labor | ($___,___)
Parallel running cost | Cost | Dual environment x estimated months x 1.15 | ($___,___)
Retraining cost | Cost | User count x hours x blended rate + materials | ($___,___)
Project management | Cost | PM FTE allocation x duration + change mgmt | ($___,___)
Contingency (25%) | Cost | Sum of costs above x 0.25 | ($___,___)
Net annual benefit | Net | Total benefits - amortized costs | $___,___
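To make the template concrete, here is a minimal sketch that rolls line items up to a net annual benefit. The figures, category names, and 3-year amortization horizon are illustrative assumptions, not values from the guide:

```python
def net_annual_benefit(benefits: dict, one_time_costs: dict,
                       contingency_rate: float = 0.25,
                       amortization_years: int = 3) -> float:
    """Total annual benefits minus one-time costs (with contingency),
    amortized over an assumed payback horizon."""
    total_benefit = sum(benefits.values())
    total_cost = sum(one_time_costs.values()) * (1 + contingency_rate)
    return total_benefit - total_cost / amortization_years

# Placeholder figures for a single rationalization candidate
example = net_annual_benefit(
    benefits={"infrastructure": 120_000, "licenses": 90_000, "support": 60_000,
              "labor": 55_000, "risk_reduction": 15_000},
    one_time_costs={"data_migration": 150_000, "parallel_run": 60_000,
                    "retraining": 40_000, "project_mgmt": 50_000},
)
```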

Tracking & Reporting Savings

Track actual vs. projected savings: Create a savings tracker that records projected savings at decision time, then validates actual savings at 6-month and 12-month milestones. Finance teams respect programs that can demonstrate accountability. Report variance with explanations.
Beware of "paper savings": Eliminating an application does not save money unless the underlying resources are actually deprovisioned. If the server still runs, the license still renews, or the FTE is not redeployed, the saving is illusory. Mandate resource deprovisioning as a required step in every decommission runbook.
05

Dependency Mapping

Understanding application interconnections is the difference between a clean decommission and a cascading failure. Map dependencies before making any changes to the production portfolio.

Why Dependencies Matter

Every application exists in a web of connections. Removing one node without understanding its edges creates unpredictable failures downstream. Dependency mapping is the single most important risk-mitigation activity in any rationalization program.

Blast Radius
Decommissioning an application affects every system that depends on it. Without a dependency map, you cannot estimate the blast radius. A seemingly low-value application may be a critical data provider to five high-value systems.
Cascading Failures
System A calls System B, which calls System C. Retiring System C does not just break B; it breaks A too, and potentially everything upstream of A. Indirect dependencies are invisible without systematic analysis.
Hidden Integrations
Tribal knowledge integrations, undocumented batch jobs, SFTP file drops scheduled at 2 AM, and shared database views. These shadow integrations rarely appear in architecture diagrams but will surface loudly on decommission day.

Dependency Discovery Techniques

No single method finds all dependencies. Use a layered approach that combines automated detection with human knowledge:

Network Traffic Analysis
Capture and analyze network flows for 30+ days to establish a baseline of all connections to and from the target application. Use NetFlow, packet captures, or firewall logs. This catches runtime dependencies that no documentation or code scan will reveal, including monthly and quarterly batch processes that only run periodically.
APM Instrumentation
Application Performance Monitoring tools like Dynatrace, AppDynamics, and New Relic automatically discover service-to-service calls, database connections, and external API dependencies. They provide call frequency, latency, and error rates. Best for applications that already have agents deployed.
Code Analysis & Static Scanning
Scan source code and configuration files for connection strings, API endpoint URLs, queue names, and service references. Tools like SonarQube, Semgrep, or custom regex scans can extract hard-coded dependencies. Effective for applications where source code is accessible and version-controlled.
Stakeholder Interviews & Tribal Knowledge
Interview application owners, developers, and operations staff. Ask specifically: "What breaks if this goes away?" Long-tenured staff often know about integrations that predate current documentation. Capture this knowledge formally before institutional memory is lost to attrition.
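The network traffic analysis technique above can be sketched as a simple aggregation over flow records. The (day, src, dst, port) tuple schema and the 30-day threshold check are assumptions for illustration, not the output format of any particular NetFlow tool:

```python
def baseline_connections(flows, target_ip, min_days=30):
    """Aggregate flow records into a map of (peer, port) -> days seen.
    Each flow is an assumed (day, src_ip, dst_ip, dst_port) tuple."""
    days_seen = {}
    observed_days = set()
    for day, src, dst, port in flows:
        observed_days.add(day)
        if target_ip in (src, dst):
            peer = dst if src == target_ip else src
            days_seen.setdefault((peer, port), set()).add(day)
    if len(observed_days) < min_days:
        raise ValueError(f"Only {len(observed_days)} days captured; need {min_days}+")
    # Peers seen on only a few days may be monthly or quarterly batch
    # processes -- they are dependencies, not noise.
    return {edge: len(days) for edge, days in days_seen.items()}
```

An edge seen once in 30 days deserves as much scrutiny as one seen daily.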

Dependency Types

Classify each discovered dependency by its integration pattern. The type determines the mitigation strategy required before decommission:

Type | Protocol | Detection Method | Decommission Risk
API Calls | REST, gRPC, SOAP, GraphQL | Network analysis, APM traces, API gateway logs | High — Synchronous; callers fail immediately
Database Shared Access | JDBC, ODBC, direct SQL | DB connection logs, audit trails, code scan | High — Other apps reading/writing same tables
File Transfers | SFTP, S3, NFS, SMB | File server logs, cron schedules, S3 access logs | Medium — Asynchronous; may fail silently
Message Queues | Kafka, RabbitMQ, SQS, JMS | Broker admin consoles, consumer group listings | Medium — Messages queue up; delayed impact
Batch Jobs | Cron, Autosys, Control-M, Airflow | Scheduler job definitions, ops runbooks | Medium — Fails on next scheduled run
Shared Authentication | SSO, LDAP, SAML, OIDC, Kerberos | IdP configuration, federation metadata | High — Users locked out across systems
Shared Libraries / SDKs | JAR, NuGet, npm, pip | Dependency manifests, build scripts | Low — Versioned; local copies persist
DNS / Load Balancer | DNS CNAME, VIP, reverse proxy | DNS zone files, LB configs, certificate SANs | Medium — Traffic routes to dead endpoint

Upstream vs. Downstream Impact

For every application being evaluated, analyze impact in both directions:

Upstream: Who Depends on This?

Upstream dependencies are the consumers of this application's data or services. These are the systems that break if this application goes away.

  • Which systems call this application's APIs?
  • Which systems read from this application's database?
  • Which dashboards or reports pull data from here?
  • Which downstream processes consume files this app produces?
  • Are there SSO/auth dependencies where this app is the IdP?

Impact: Upstream dependency count determines the blast radius. High upstream count = high decommission risk.

Downstream: What Does This Depend On?

Downstream dependencies are the services and data sources this application relies on. These are the systems that must stay running if this app is retained or migrated.

  • Which databases, APIs, or services does this app call?
  • What authentication systems does it use to verify users?
  • What message queues or event streams does it subscribe to?
  • Are there shared infrastructure dependencies (DNS, LB, certs)?
  • What third-party services or SaaS APIs are integrated?

Impact: Downstream dependencies constrain the migration order. You cannot migrate this app before its downstream services are ready.
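The two directional queries can be sketched with a small adjacency structure, following the guide's convention that upstream means consumers. The class and method names are hypothetical:

```python
from collections import defaultdict

class DependencyGraph:
    def __init__(self):
        self._calls = defaultdict(set)      # source -> targets it depends on
        self._callers = defaultdict(set)    # target -> sources that depend on it

    def add_edge(self, source, target):
        """Record that `source` depends on `target`."""
        self._calls[source].add(target)
        self._callers[target].add(source)

    def upstream(self, app):
        """Consumers of `app` -- systems that break if it goes away."""
        return self._callers[app]

    def downstream(self, app):
        """Providers `app` relies on -- must stay up if it is retained."""
        return self._calls[app]

g = DependencyGraph()
g.add_edge("OrderService", "PricingEngine")
g.add_edge("Dashboard", "PricingEngine")
```

Maintaining both directions on every insert keeps each query O(1) instead of scanning all edges.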

Dependency Mapping Tools

Tool | Focus | Discovery Method | Strengths | Limitations
ServiceNow CSDM | Operations | Agent + CMDB relationships | Deep ITSM integration, change impact analysis | Requires mature CMDB; manual relationship entry
Ardoq | Architecture | API integrations + manual modeling | Visual dependency graphs, what-if analysis | Relies on imported data quality; no runtime discovery
Dynatrace | Runtime | OneAgent auto-discovery | Real-time topology, AI-powered baselining | Agent deployment required; cost scales with hosts
AppDynamics | Runtime | Agent-based instrumentation | Business transaction tracing, flow maps | Java/.NET focus; limited for non-standard stacks
Faddom | Infrastructure | Agentless network analysis | No agents needed, fast deployment, hybrid cloud | Network-level only; no code-level dependency detail

Blast Radius Calculation

Quantify the potential impact of decommissioning an application by computing its blast radius score. This combines the number of dependent systems with their criticality:

Pseudocode / Blast Radius Analysis
# ================================================
# Blast Radius Scoring for Decommission Candidates
# ================================================

def calculate_blast_radius(app_id, dependency_graph):
    """
    Walk the dependency graph to find all systems
    that would be impacted by removing this application.
    Returns a risk score and list of affected systems.
    """
    affected = set()
    queue = [app_id]

    # BFS traversal of upstream dependencies
    while queue:
        current = queue.pop(0)
        upstream = dependency_graph.get_upstream(current)
        for dep in upstream:
            if dep.target_id not in affected:
                affected.add(dep.target_id)
                queue.append(dep.target_id)  # Follow chain

    # Score each affected system by criticality weight
    CRITICALITY_WEIGHTS = {
        "Critical": 10,   # Revenue-generating, regulatory
        "High":      5,   # Core business process
        "Medium":    2,   # Departmental tool
        "Low":       1,   # Convenience / reporting
    }

    blast_score = 0
    for system_id in affected:
        system = get_application(system_id)
        weight = CRITICALITY_WEIGHTS.get(system.criticality, 1)
        blast_score += weight

    return {
        "app_id":          app_id,
        "affected_count":  len(affected),
        "blast_score":     blast_score,
        "affected_systems": list(affected),
        "risk_level":      classify_risk(blast_score),
    }

def classify_risk(score):
    """Map blast score to risk tier."""
    if score >= 20:  return "CRITICAL"   # Requires exec approval
    if score >= 10:  return "HIGH"       # Requires architect review
    if score >= 5:   return "MEDIUM"     # Standard CAB approval
    return "LOW"                         # Team-level decision

# --- Risk Matrix ---
# Likelihood: How likely is the dependency to cause failure?
#   HIGH   = Synchronous call, no fallback
#   MEDIUM = Async/batch, partial fallback exists
#   LOW    = Shared library, cached data, redundant path
#
# Impact: What is the business impact if this dependency breaks?
#   CRITICAL = Revenue loss, regulatory violation
#   HIGH     = Major business process disruption
#   MEDIUM   = Departmental impact, workaround available
#   LOW      = Inconvenience, cosmetic, reporting delay
#
#              Impact
#            LOW  MED  HIGH CRIT
# Likelihood +----|----|----|----+
# HIGH       | M  | H  | C  | C  |
# MEDIUM     | L  | M  | H  | C  |
# LOW        | L  | L  | M  | H  |
#            +----|----|----|----+
# L=Low, M=Medium, H=High, C=Critical
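The likelihood x impact matrix in the comments above can be encoded directly as a lookup table. A minimal sketch:

```python
# Risk matrix from the comments above: likelihood (rows) x impact (columns)
RISK_MATRIX = {
    "HIGH":   {"LOW": "MEDIUM", "MEDIUM": "HIGH",   "HIGH": "CRITICAL", "CRITICAL": "CRITICAL"},
    "MEDIUM": {"LOW": "LOW",    "MEDIUM": "MEDIUM", "HIGH": "HIGH",     "CRITICAL": "CRITICAL"},
    "LOW":    {"LOW": "LOW",    "MEDIUM": "LOW",    "HIGH": "MEDIUM",   "CRITICAL": "HIGH"},
}

def dependency_risk(likelihood: str, impact: str) -> str:
    """Risk tier for a single dependency edge."""
    return RISK_MATRIX[likelihood.upper()][impact.upper()]
```

Feeding each edge's assessed likelihood and impact through this lookup yields the per-edge risk tier used to prioritize mitigation work.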

Dependency Graph Notation

For each edge in your dependency graph, document these attributes to enable accurate impact analysis:

Attribute | Description | Example
Source | The calling/consuming application | APP-0042 (Order Service)
Target | The called/providing application | APP-0108 (Pricing Engine)
Protocol | Communication method and format | REST/JSON over HTTPS
Direction | Data flow direction (inbound, outbound, bidirectional) | Outbound (Order → Pricing)
Volume | Calls per day, messages per hour, or data volume | ~12,000 calls/day
Latency Sensitivity | How quickly does the caller need a response? | Synchronous, <200ms SLA
Criticality | Business impact if this edge is severed | Critical — orders cannot be priced
Fallback Behavior | What happens when the dependency is unavailable? | Circuit breaker; uses cached prices for 15 min
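As one way to record these edge attributes, here is a hedged sketch using a Python dataclass; the field names, and the example values taken from the table, are illustrative:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DependencyEdge:
    source: str                   # calling/consuming application
    target: str                   # called/providing application
    protocol: str                 # e.g. "REST/JSON over HTTPS"
    direction: str                # "inbound" | "outbound" | "bidirectional"
    volume_per_day: int           # calls or messages per day
    latency_sla_ms: Optional[int] # None for async edges
    criticality: str              # "Low" | "Medium" | "High" | "Critical"
    fallback: str                 # behavior when the target is unavailable

# Example edge from the table above
edge = DependencyEdge(
    source="APP-0042", target="APP-0108",
    protocol="REST/JSON over HTTPS", direction="outbound",
    volume_per_day=12_000, latency_sla_ms=200,
    criticality="Critical", fallback="circuit breaker; cached prices for 15 min",
)
```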
Never decommission without 30+ days of traffic analysis: Monthly batch jobs, quarterly reporting processes, and annual reconciliation tasks will not appear in a one-week observation window. A minimum of 30 days of network traffic capture is required before declaring that an application has no active dependencies. For financial systems, extend this to 90 days to cover quarter-end processing.
The "dark launch" validation pattern: Before decommissioning, route all traffic through a proxy that logs every inbound request and returns errors instead of live responses. This "dark mode" surfaces callers you missed during discovery, without the risk of silently serving stale data.
06

Migration Strategies

The 6Rs framework provides a structured vocabulary for choosing the right migration path for every application in the portfolio. Wrong strategy selection is the leading cause of migration program failure.

The 6Rs Framework

Originally developed by Gartner (as the 5Rs) and expanded by AWS, the 6Rs framework categorizes every possible migration outcome. Each application should be assigned exactly one R based on its business value, technical fitness, dependencies, and organizational constraints. The right choice balances speed, cost, risk, and long-term strategic value.

Wrong strategy selection is the #1 cause of migration failure: Choosing to rehost an application that needs refactoring creates cloud-hosted legacy debt. Choosing to refactor an application that should be retired wastes months of engineering effort. Invest time in assessment (Section 03) and dependency mapping (Section 05) before assigning strategies. A wrong R costs more than a slow R.

R1: Rehost (Lift & Shift)

When to Rehost
  • Data center exit deadline is imminent (1-3 months)
  • Budget is limited and modernization is not funded
  • Application is stable and does not require optimization
  • Organization lacks cloud-native engineering capacity
  • First phase of a "migrate then modernize" strategy
Characteristics

Effort: LOW Cost: $ Timeline: 1-3 months

Cloud Optimization: LOW

Pros: Fastest migration path, minimal risk, no code changes, predictable timeline.

Cons: No cloud-native benefits (auto-scaling, managed services), same operational overhead in the cloud, potential for higher cloud costs than on-prem.

R2: Replatform (Lift & Reshape)

When to Replatform
  • Want managed services without full re-architecture
  • Moderate timeline available (3-6 months)
  • Database migration to managed service is the main goal
  • Containerization is feasible without code rewrites
  • OS or runtime upgrade is needed regardless of migration
Characteristics

Effort: MODERATE Cost: $$ Timeline: 3-6 months

Cloud Optimization: MODERATE

Pros: Meaningful cost and operational gains, reduced DBA/ops overhead, improved reliability through managed services.

Cons: Requires testing for compatibility, some code changes may be needed, does not fully leverage cloud-native architecture.

Common replatform optimizations:

Self-managed DB → RDS / Cloud SQL
Migrate from self-hosted PostgreSQL, MySQL, or SQL Server to a managed database service. Eliminates patching, backup management, and HA configuration overhead.
Web Server → App Service / Elastic Beanstalk
Move from self-managed Apache/Nginx on VMs to a managed application platform. Auto-scaling, TLS termination, and deployment slots included.
VMs → Containers (ECS, AKS, GKE)
Containerize the application without rewriting. Use existing Dockerfiles or create them from the current deployment. Gains portability and density.
On-prem Queue → SQS / Azure Service Bus
Replace self-managed RabbitMQ or ActiveMQ with a cloud-managed message queue. Eliminates cluster management and provides built-in dead-letter handling.

R3: Refactor (Re-architect)

When to Refactor
  • Innovation and agility are critical for the business
  • Long-term strategy justifies the investment (6-18+ months)
  • Current architecture cannot scale to meet projected demand
  • Competitive pressure demands faster release cycles
  • Application is strategic and will be invested in for 5+ years
Characteristics

Effort: HIGH Cost: $$$ Timeline: 6-18+ months

Cloud Optimization: EXCELLENT

Pros: Full cloud-native benefits, auto-scaling, serverless options, independent deployability, dramatically lower long-term operational costs.

Cons: Highest risk and cost, requires skilled cloud engineers, long timeline, scope creep danger, feature parity challenges.

Common refactoring patterns:

Monolith → Microservices
Decompose a monolithic application into independently deployable services organized around business domains. Use the Strangler Fig pattern to migrate incrementally rather than as a big-bang rewrite.
Synchronous → Event-Driven
Replace synchronous request-response patterns with asynchronous event streaming (Kafka, EventBridge, Pub/Sub). Improves resilience, decouples services, and enables real-time data processing.
Server-Based → Serverless
Move discrete functions to Lambda, Azure Functions, or Cloud Functions. Best suited for event-triggered, bursty workloads. Eliminates idle compute cost and simplifies operations.
Batch → Stream Processing
Convert overnight batch ETL jobs to real-time streaming pipelines using Kinesis, Dataflow, or Flink. Reduces data latency from hours to seconds and eliminates batch-window constraints.

R4: Repurchase (Replace with SaaS)

When to Repurchase
  • The function is a commodity (CRM, email, HR, ITSM)
  • A mature commercial SaaS solution exists
  • Customization requirements are low to moderate
  • Total SaaS subscription cost is less than on-prem TCO
  • Vendor innovation pace exceeds internal capacity
Characteristics

Effort: MODERATE Cost: $$ Timeline: 2-4 months

Cloud Optimization: HIGH

Pros: No infrastructure to manage, vendor handles upgrades and security, fast deployment, predictable subscription cost.

Cons: Data migration complexity, user retraining, vendor lock-in, customization limits, ongoing subscription cost.

Common repurchase targets:

CRM → Salesforce / HubSpot
Replace custom or legacy CRM systems. Data migration of contacts, accounts, and opportunity history is the primary challenge.
Email → Microsoft 365 / Google Workspace
Retire on-prem Exchange or Lotus Notes. Includes calendar, contacts, and archive migration.
HR → Workday / BambooHR
Replace custom HR and payroll systems with cloud HCM platforms. Regulatory compliance is a key selection criterion.
ITSM → ServiceNow / Jira SM
Consolidate disparate help desk and ticketing tools to a single ITSM platform with workflow automation.

R5: Retain (Keep As-Is)

When to Retain
  • Application has a planned sunset date within 12-18 months
  • Migration cost exceeds the remaining lifetime value
  • Regulatory or compliance constraints prevent changes
  • Deep dependencies make migration sequence complex
  • No suitable replacement or migration path exists today
  • Application is under active vendor development (wait for next version)
Characteristics

Effort: NONE Cost: $0 migration Timeline: N/A

Cloud Optimization: NONE

Pros: Zero disruption, zero migration cost, avoids risk of change.

Cons: Ongoing maintenance and technical debt accumulation, deferred risk, may become more expensive to migrate later.

Requirement: Set a mandatory review date (6-12 months) to reassess. Retain is a deferral, not a permanent decision.

R6: Retire (Decommission)

When to Retire
  • Application has zero active users (validated by 30+ days of monitoring)
  • Functionality has been replaced by another system
  • Application is redundant due to M&A consolidation
  • Security liability exceeds the cost of decommission
  • Vendor support has ended and no extended support is available
  • Cost to maintain exceeds the value delivered
Characteristics

Effort: LOW-MODERATE Cost: $ (one-time) Timeline: 1-3 months

Pros: Eliminates all ongoing costs, reduces attack surface, simplifies the portfolio, frees infrastructure and staff capacity.

Cons: Requires data archival planning, stakeholder communication, and dependency verification. Section 07 covers the full decommission runbook.

Savings timeline: Typically the fastest path to realized cost savings because there is no new system to build or buy.

Strategy Comparison Matrix

Strategy | Effort | Timeline | Cost | Cloud Optimization | Long-Term Value
Rehost | Low | 1-3 months | $ | Low | Low-Medium
Replatform | Moderate | 3-6 months | $$ | Moderate | Medium
Refactor | High | 6-18+ months | $$$ | Excellent | High
Repurchase | Moderate | 2-4 months | $$ | High | High
Retain | None | N/A | $0 | None | Low
Retire | Low | 1-3 months | $ | N/A | High (savings)

Migration Strategy Decision Tree

Walk through this decision tree for each application to arrive at the optimal R:

                    +---------------------------+
                    |   Is the application      |
                    |   actively used?          |
                    +---------------------------+
                       /                  \
                     YES                   NO
                     /                      \
           +------------------+       +-------------------+
           | Does it need to  |       | Is data retention |
           | exist in-house?  |       | required?         |
           +------------------+       +-------------------+
            /           \               /           \
          YES            NO           YES            NO
          /               \           /               \
  +--------------+  +-----------+  RETAIN         +---------+
  | Is it cloud- |  | SaaS alt  |  (archive       | RETIRE  |
  | ready as-is? |  | available?|  data, then     +---------+
  +--------------+  +-----------+  retire later)
    /        \        /       \
  YES        NO     YES       NO
  /           \     /          \
REHOST    +--------+ REPURCHASE  RETAIN
          | Worth  |             (revisit)
          | re-    |
          | arch?  |
          +--------+
           /      \
         YES      NO
         /         \
    REFACTOR    REPLATFORM
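The decision tree above can be encoded as a short function for classifying applications in bulk. The attribute keys are assumptions; a sketch:

```python
def choose_strategy(app: dict) -> str:
    """Walk the migration decision tree for one application.
    Keys are hypothetical assessment attributes (booleans)."""
    if not app["actively_used"]:
        # Archive data first, then retire later
        return "RETAIN" if app["data_retention_required"] else "RETIRE"
    if not app["needs_in_house"]:
        return "REPURCHASE" if app["saas_alternative"] else "RETAIN"  # revisit
    if app["cloud_ready"]:
        return "REHOST"
    return "REFACTOR" if app["worth_rearchitecting"] else "REPLATFORM"
```

Mapping the tree to code makes the branch order explicit and easy to audit against the diagram.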

Real-World Portfolio Mix Patterns

In practice, organizations apply a mix of strategies across their portfolio. The ideal mix depends on organizational goals, timeline, and budget. Here are three common patterns:

Balanced

A measured approach that optimizes across all dimensions:

  • 40% Rehost
  • 30% Replatform
  • 10% Refactor
  • 10% Repurchase
  • 10% Retire

Best for organizations with moderate timelines and a mix of strategic and commodity applications. Balances quick wins with long-term modernization.

Speed-First

Prioritizes migration velocity over optimization:

  • 70% Rehost
  • 20% Replatform
  • 10% Retire

Best for data center exit deadlines, lease expirations, or compliance mandates. Accepts technical debt in exchange for speed. Plan a second wave of optimization after migration.

Optimization-First

Maximizes cloud-native value at higher cost and timeline:

  • 40% Refactor
  • 30% Replatform
  • 20% Repurchase
  • 10% Retire

Best for organizations with strong engineering teams, generous timelines, and a strategic mandate for digital transformation. Delivers highest long-term ROI.
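To compare mix patterns, here is a sketch that applies mix percentages to a portfolio and computes a relative effort score. The per-strategy effort weights are illustrative assumptions loosely derived from the comparison matrix, not calibrated estimates:

```python
# Assumed relative effort per application by strategy (illustrative only)
EFFORT_WEIGHTS = {"REHOST": 1, "REPLATFORM": 3, "REFACTOR": 8,
                  "REPURCHASE": 3, "RETAIN": 0, "RETIRE": 1}

def mix_effort(total_apps: int, mix: dict) -> dict:
    """App counts and a relative effort score for a given strategy mix."""
    counts = {s: round(total_apps * pct) for s, pct in mix.items()}
    effort = sum(EFFORT_WEIGHTS[s] * n for s, n in counts.items())
    return {"counts": counts, "relative_effort": effort}

balanced = mix_effort(100, {"REHOST": 0.40, "REPLATFORM": 0.30, "REFACTOR": 0.10,
                            "REPURCHASE": 0.10, "RETIRE": 0.10})
speed_first = mix_effort(100, {"REHOST": 0.70, "REPLATFORM": 0.20, "RETIRE": 0.10})
```

On these assumed weights, the Speed-First mix carries well under half the engineering effort of the Balanced mix for the same portfolio size, which is the trade it makes against long-term optimization.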

Strategy Assignment Logic

Use assessment scores and dependency data to recommend the optimal migration strategy programmatically:

Pseudocode / Strategy Engine
# ================================================
# 6R Strategy Assignment Engine
# ================================================
# Inputs: TIME quadrant, technical fitness, business value,
#         dependency count, active users, annual TCO

def assign_strategy(app):
    """
    Recommend a 6R strategy based on assessment data.
    Returns primary recommendation and alternatives.
    """

    # ELIMINATE quadrant -> Retire, unless data retention blocks it
    if app.time_quadrant == "ELIMINATE":
        if app.data_retention_required:
            return "RETAIN"   # Archive data first, then retire
        return "RETIRE"       # Validate zero users / low deps before executing

    # TOLERATE quadrant -> Retain or Repurchase
    if app.time_quadrant == "TOLERATE":
        if app.saas_alternative_exists and app.tco > app.saas_cost:
            return "REPURCHASE"
        else:
            return "RETAIN"

    # MIGRATE quadrant -> Replatform or Refactor
    if app.time_quadrant == "MIGRATE":
        if app.technical_fitness < 2.0:
            return "REFACTOR"          # Too degraded to just move
        elif app.can_containerize and app.managed_db_compatible:
            return "REPLATFORM"
        else:
            return "REFACTOR"

    # INVEST quadrant -> Rehost or Replatform
    if app.time_quadrant == "INVEST":
        if app.cloud_ready and app.deadline_months <= 3:
            return "REHOST"            # Speed priority
        return "REPLATFORM"            # Default for INVEST: optimize via managed services

    return "RETAIN"  # Default: revisit later
Automated recommendations require human validation: Strategy assignment logic provides a starting point, not a final answer. Every recommendation should be reviewed by an architect and the application owner before committing resources. Edge cases, political factors, and strategic pivots cannot be captured in a scoring algorithm.
07

Decommissioning Runbook

The step-by-step operational guide for shutting down systems safely. A disciplined, phased approach prevents cascading failures, data loss, and compliance violations. Every decommission should follow this runbook.

Phased Approach Overview

Decommissioning is not a single event but a multi-phase process spanning weeks or months. Each phase has defined gates that must be passed before proceeding.

1. Pre-Decommission Planning (T-90 to T-30 days)
Identify all stakeholders, document dependencies, confirm replacement systems are operational, and build the detailed decommission plan. This phase consumes the most calendar time but prevents costly surprises later.
2. Stakeholder Notification (T-30 days)
Formally notify all affected users, business owners, vendor contacts, and downstream system owners. Provide migration paths, training schedules, and support contacts. Document acknowledgments.
3. Data Handling & Archival (T-14 days)
Execute the data retention plan: archive records per regulatory requirements, migrate active data to replacement systems, verify backup integrity with test restores, and document chain of custody.
4. Decommission Day Execution (T-0)
Execute the shutdown sequence: disable user access, stop application services, revoke credentials, remove DNS records, tear down infrastructure, and update the CMDB. Send status updates to stakeholders every 2 hours.
5. Post-Shutdown Monitoring (T+1 to T+90 days)
Monitor for unexpected failures in dependent systems, track support tickets related to the decommissioned application, validate that cost savings materialize, and conduct lessons-learned reviews.
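The phase offsets above can be turned into a concrete calendar. A minimal sketch using Python's datetime:

```python
from datetime import date, timedelta

# Milestone offsets in days relative to decommission day (T-0), from the phases above
PHASE_OFFSETS_DAYS = {
    "planning_start":     -90,
    "stakeholder_notice": -30,
    "data_archival":      -14,
    "decommission_day":     0,
    "monitoring_end":      90,
}

def decommission_schedule(t_zero: date) -> dict:
    """Map each phase milestone to a calendar date relative to T-0."""
    return {phase: t_zero + timedelta(days=offset)
            for phase, offset in PHASE_OFFSETS_DAYS.items()}

schedule = decommission_schedule(date(2025, 9, 1))
```

Adjust the offsets per program; the 30-day notice in particular may be dictated by vendor contract terms.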

Pre-Decommission Checklist

Complete every item before scheduling the Go/No-Go meeting. Skipping items is the leading cause of decommission rollbacks.

  • Stakeholder sign-off obtained (business owner, IT, legal, compliance). All four groups must sign; a missing legal sign-off can halt the entire process.
  • All dependencies verified and removed or rerouted. Cross-reference the dependency map from Section 05 with current traffic analysis.
  • Data backup completed and verified with test restore. A backup that has never been restored is not a backup.
  • Data archival plan executed per retention policy. Confirm retention periods meet regulatory requirements (see Section 08).
  • Replacement system confirmed operational and load-tested. Run parallel operations for a minimum of 2 weeks before cutover.
  • User migration and retraining complete. Track completion rates; aim for 95%+ before scheduling decommission.
  • Vendor notifications sent per contract timeline. Most contracts require 30-90 days written notice; check early termination clauses.
  • License termination dates scheduled. Align with contract renewal dates to avoid paying for unused periods.
  • DNS TTL lowered 24-48 hours before decommission. Allows rapid propagation of record removals on decommission day.
  • Monitoring and alerting reconfigured. Remove old checks; add new alerts for dependent systems that may break.
  • Rollback plan documented and tested. Include snapshot locations, restore procedures, and decision criteria.
  • Communication sent to all affected users. Include decommission date, replacement system URL, and support contact.
  • Go/No-Go meeting scheduled with all required attendees. Schedule at least 48 hours before the decommission window; prepare a decision matrix.
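The checklist maps naturally to an automated Go/No-Go gate. A sketch; the item keys are shorthand assumptions for the bullets above:

```python
# Shorthand keys for the pre-decommission checklist items above (assumed names)
REQUIRED_ITEMS = [
    "stakeholder_signoff", "dependencies_cleared", "backup_verified",
    "archival_complete", "replacement_operational", "users_migrated",
    "vendor_notified", "license_termination_scheduled", "dns_ttl_lowered",
    "monitoring_updated", "rollback_tested", "users_notified",
]

def go_no_go(checklist: dict) -> tuple:
    """Return ("GO", []) only when every required item is complete;
    otherwise ("NO-GO", [missing items])."""
    missing = [item for item in REQUIRED_ITEMS if not checklist.get(item)]
    return ("GO" if not missing else "NO-GO", missing)
```

A single incomplete item forces NO-GO, which is exactly the discipline the runbook calls for.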

DNS & Certificate Cleanup

Stale DNS records and expired certificates are security risks and operational debt. Clean them up as part of the decommission.

AWS Route53 — DNS Record Deletion
# List hosted zones to find the target zone
aws route53 list-hosted-zones --query "HostedZones[*].[Id,Name]" --output table

# Export current records for audit trail before deletion
aws route53 list-resource-record-sets \
  --hosted-zone-id Z0123456789ABCDEF \
  --output json > /backup/dns-records-$(date +%Y%m%d).json

# Delete A record for the decommissioned application
aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789ABCDEF \
  --change-batch '{
    "Changes": [{
      "Action": "DELETE",
      "ResourceRecordSet": {
        "Name": "legacy-app.example.com",
        "Type": "A",
        "TTL": 300,
        "ResourceRecords": [{"Value": "10.0.1.50"}]
      }
    }]
  }'

# Delete CNAME records pointing to the decommissioned host
aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789ABCDEF \
  --change-batch '{
    "Changes": [{
      "Action": "DELETE",
      "ResourceRecordSet": {
        "Name": "app.example.com",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [{"Value": "legacy-app.example.com"}]
      }
    }]
  }'

# Verify deletion
aws route53 list-resource-record-sets \
  --hosted-zone-id Z0123456789ABCDEF \
  --query "ResourceRecordSets[?Name=='legacy-app.example.com.']"
Certificate Revocation
# Let's Encrypt — revoke and delete certificate
certbot revoke --cert-name legacy-app.example.com --reason cessationofoperation
certbot delete --cert-name legacy-app.example.com

# AWS ACM — delete certificate (must be disassociated from all resources first)
aws acm list-certificates --query "CertificateSummaryList[?DomainName=='legacy-app.example.com']"
aws acm delete-certificate --certificate-arn arn:aws:acm:us-east-1:123456789012:certificate/abc-def-123

# Internal PKI — revoke via OpenSSL
openssl ca -revoke /etc/pki/CA/newcerts/legacy-app.pem -config /etc/pki/CA/openssl.cnf
openssl ca -gencrl -out /etc/pki/CA/crl/ca.crl -config /etc/pki/CA/openssl.cnf

# Verify certificate is no longer valid
openssl s_client -connect legacy-app.example.com:443 -servername legacy-app.example.com 2>/dev/null | \
  openssl x509 -noout -dates 2>/dev/null || echo "Certificate no longer served"

Credential Revocation

Every credential associated with the decommissioned system must be revoked. Orphaned credentials are a top attack vector.

AWS IAM — Role & Policy Cleanup
# Identify IAM roles used by the application
aws iam list-roles --query "Roles[?contains(RoleName, 'legacy-app')].[RoleName,Arn]" --output table

# Detach all managed policies from the role
aws iam list-attached-role-policies --role-name legacy-app-role \
  --query "AttachedPolicies[*].PolicyArn" --output text | \
  xargs -I {} aws iam detach-role-policy --role-name legacy-app-role --policy-arn {}

# Delete inline policies
aws iam list-role-policies --role-name legacy-app-role \
  --query "PolicyNames[]" --output text | \
  xargs -I {} aws iam delete-role-policy --role-name legacy-app-role --policy-name {}

# Remove instance profiles
aws iam remove-role-from-instance-profile \
  --instance-profile-name legacy-app-profile --role-name legacy-app-role
aws iam delete-instance-profile --instance-profile-name legacy-app-profile

# Delete the role
aws iam delete-role --role-name legacy-app-role
Kubernetes — Service Account Removal
# List service accounts in the application namespace
kubectl get serviceaccounts -n legacy-app -o wide

# Remove RBAC bindings
kubectl delete clusterrolebinding legacy-app-binding
kubectl delete rolebinding legacy-app-binding -n legacy-app

# Delete the service account
kubectl delete serviceaccount legacy-app-sa -n legacy-app

# Delete the namespace (removes all remaining resources)
kubectl delete namespace legacy-app --grace-period=60
Database User Cleanup
-- PostgreSQL: revoke and drop application user
REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA public FROM legacy_app_user;
REVOKE ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public FROM legacy_app_user;
REVOKE CONNECT ON DATABASE appdb FROM legacy_app_user;
DROP USER IF EXISTS legacy_app_user;

-- MySQL: revoke and drop
REVOKE ALL PRIVILEGES, GRANT OPTION FROM 'legacy_app_user'@'%';
DROP USER IF EXISTS 'legacy_app_user'@'%';
FLUSH PRIVILEGES;
Secrets Manager Cleanup
# AWS Secrets Manager — schedule deletion (7-30 day recovery window)
aws secretsmanager delete-secret \
  --secret-id legacy-app/db-credentials \
  --recovery-window-in-days 30

aws secretsmanager delete-secret \
  --secret-id legacy-app/api-keys \
  --recovery-window-in-days 30

# HashiCorp Vault — revoke and delete
vault lease revoke -prefix secret/legacy-app/
vault kv metadata delete secret/legacy-app/db-credentials
vault kv metadata delete secret/legacy-app/api-keys
vault policy delete legacy-app-policy

Infrastructure Teardown

Remove infrastructure in reverse dependency order: load balancers first, then compute, then storage. Always deregister before deleting.

Load Balancer — Target Deregistration
# AWS ALB — deregister targets and delete target group
aws elbv2 describe-target-groups \
  --query "TargetGroups[?contains(TargetGroupName, 'legacy-app')].[TargetGroupArn]" --output text

aws elbv2 deregister-targets \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/legacy-app/abc123 \
  --targets Id=i-0abc123def456,Port=8080

# Remove listener rules pointing to the target group
aws elbv2 delete-rule --rule-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:listener-rule/app/main-alb/abc/def/rule123

# Delete target group after deregistration
aws elbv2 delete-target-group \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/legacy-app/abc123

# Azure LB — remove backend pool member
az network lb address-pool address remove \
  --resource-group rg-legacy --lb-name lb-legacy \
  --pool-name legacy-app-pool --name legacy-vm-1
Firewall Rule Cleanup
# AWS Security Group — revoke ingress/egress and delete
aws ec2 describe-security-groups \
  --filters "Name=group-name,Values=legacy-app-sg" \
  --query "SecurityGroups[*].[GroupId,GroupName]" --output table

aws ec2 revoke-security-group-ingress --group-id sg-0abc123 \
  --ip-permissions '[{"IpProtocol":"tcp","FromPort":443,"ToPort":443,"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}]'

aws ec2 delete-security-group --group-id sg-0abc123

# Azure NSG — remove rules and delete
az network nsg rule delete --resource-group rg-legacy --nsg-name nsg-legacy --name allow-legacy-app
az network nsg delete --resource-group rg-legacy --name nsg-legacy

# iptables — remove application-specific rules (document before removing)
iptables -L -n --line-numbers | grep "legacy-app"
iptables -D INPUT 12    # Remove by line number after verification
Compute Instance Termination
# AWS EC2 — create final AMI snapshot, then terminate
aws ec2 create-image --instance-id i-0abc123def456 \
  --name "legacy-app-final-$(date +%Y%m%d)" --no-reboot

# Wait for AMI to be available
aws ec2 wait image-available --image-ids ami-0xyz789

# Terminate instances
aws ec2 terminate-instances --instance-ids i-0abc123def456 i-0def789abc012

# Azure VM — deallocate and delete
az vm deallocate --resource-group rg-legacy --name legacy-vm-1
az vm delete --resource-group rg-legacy --name legacy-vm-1 --yes
Storage Volume Cleanup
# AWS EBS — snapshot before deletion
aws ec2 create-snapshot --volume-id vol-0abc123 \
  --description "legacy-app-final-$(date +%Y%m%d)" \
  --tag-specifications 'ResourceType=snapshot,Tags=[{Key=Retention,Value=90days}]'

aws ec2 wait snapshot-completed --snapshot-ids snap-0xyz789

# Delete the volume after snapshot confirmation
aws ec2 delete-volume --volume-id vol-0abc123

# AWS S3 — empty and delete application bucket
aws s3 rm s3://legacy-app-data --recursive
aws s3 rb s3://legacy-app-data

# Azure Managed Disk — snapshot and delete
az snapshot create --resource-group rg-legacy --name legacy-disk-snap \
  --source /subscriptions/.../disks/legacy-data-disk
az disk delete --resource-group rg-legacy --name legacy-data-disk --yes

Rollback Plan Requirements

Every decommission must have a documented rollback plan. Hope is not a strategy.

Snapshot & Backup Retention
  • Retain VM snapshots and AMIs for minimum 30 days post-decommission
  • Keep database backups for 90 days (covers quarter-end processing)
  • Store configuration exports (Terraform state, Ansible playbooks) indefinitely in version control
  • Tag all retained artifacts with decommission date and retention expiry
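The tagging step above can be sketched with the AWS CLI. The tag keys (DecommissionDate, RetentionExpiry), the 90-day window, and the snapshot ID snap-0xyz789 are illustrative conventions, not a standard:

```shell
# Compute and apply retention tags for a retained snapshot.
# Tag keys and the 90-day window are illustrative conventions;
# snap-0xyz789 is a placeholder snapshot ID.
DECOM_DATE=$(date -u +%Y-%m-%d)
EXPIRY_DATE=$(date -u -d "$DECOM_DATE + 90 days" +%Y-%m-%d)
echo "decommissioned=$DECOM_DATE retain_until=$EXPIRY_DATE"

command -v aws >/dev/null && aws ec2 create-tags --resources snap-0xyz789 \
  --tags Key=DecommissionDate,Value="$DECOM_DATE" \
         Key=RetentionExpiry,Value="$EXPIRY_DATE"
```

Computing the expiry at tag time (rather than recording only the decommission date) lets a later cleanup job filter on a single tag value.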
Rollback Decision Criteria
  • Rollback window: within 72 hours of decommission for full restore
  • Partial rollback (data only): available for 30 days
  • Trigger conditions: critical dependency failure, data loss discovery, regulatory finding
  • Approval required from: IT Lead + Business Owner (2-person rule)

Document re-provisioning procedures with estimated restore times:

Full Infrastructure Restore
Re-provision from AMI/snapshot + restore database from backup + reconfigure DNS + reinstate firewall rules. Estimated time: 4-8 hours.
Data-Only Restore
Restore database backup to a new instance + reconnect to replacement system or temporary read-only interface. Estimated time: 1-3 hours.
Configuration Restore
Re-apply Terraform/Ansible configs from version control + redeploy application containers. Estimated time: 2-4 hours.

Go/No-Go Decision

The Go/No-Go meeting is the final gate before execution. All four criteria must be met. A single failure triggers a postponement, not a partial proceed.

Dependency Proof

All application interfaces have been validated. No active connections remain. Traffic analysis (minimum 30 days) confirms zero inbound requests from production systems.
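The zero-inbound claim can be spot-checked by parsing exported flow log records. A sketch assuming the default VPC flow log field layout, with sample records standing in for a real export (10.0.1.5 plays the application's IP):

```shell
# Sample records in the default VPC flow log format
# (field 5 = dstaddr, field 13 = action); 10.0.1.5 stands in for the app's IP.
cat > flowlog.txt <<'EOF'
2 123456789012 eni-0abc 10.0.2.9 10.0.1.5 44321 8080 6 10 840 1711200000 1711200060 ACCEPT OK
2 123456789012 eni-0abc 10.0.3.7 10.0.1.5 44322 8080 6 4 300 1711200000 1711200060 REJECT OK
EOF

# Count ACCEPTed flows destined for the app; any nonzero count blocks the Go decision
awk '$5 == "10.0.1.5" && $13 == "ACCEPT" { n++ } END { print n+0 }' flowlog.txt
# → 1
```

Run the same count over the full 30-day export; only a result of 0 satisfies the Dependency Proof.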

Data Proof

Archives verified and accessible. Retention periods meet or exceed regulatory requirements. Test restores completed successfully. Chain of custody documented.

Communication Proof

All stakeholders have been notified with acknowledgment. Users have been migrated to replacement systems. Vendor termination notices sent per contract requirements.

Rollback Proof

Recovery plan documented with assigned owners. Snapshots and backups verified. Rollback procedures tested in staging environment. Restore time estimates validated.

Never decommission on a Friday or before a holiday. Decommission failures surface within 24-72 hours. If the team is unavailable for rapid response, a minor issue becomes a major incident. Schedule decommissions for Tuesday or Wednesday, with the full operations team available through end-of-week.
08

Data Retention & Compliance

Regulatory requirements and data handling during decommissioning. Getting retention wrong exposes the organization to fines, litigation risk, and audit failures. This section maps frameworks to actionable procedures.

Regulatory Framework Overview

Every application handles data subject to one or more regulatory frameworks. Identify applicable frameworks before planning any data destruction.

Framework | Retention Period | Key Requirements | Penalty
GDPR | Varies (minimum necessary) | Right to erasure (Art. 17), data minimization, lawful basis required for retention | Up to 4% global revenue
HIPAA | 6 years | PHI protection, audit trails, BAA requirements, breach notification | $100–$50K per violation
SOX | 7 years | Financial records, audit trails, internal controls, whistleblower protections | Criminal penalties
PCI DSS | 1 year (logs), varies (data) | Cardholder data protection, secure deletion, key management | Fines + loss of processing
FedRAMP | 90d online, 12mo active, 18mo cold | Audit log retention, secure decommissioning, media sanitization | Loss of authorization
CCPA/CPRA | Varies | Consumer deletion rights, data inventory, opt-out mechanisms | $2,500–$7,500 per violation

Data Classification During Decommission

Classify all data in the decommissioned system before choosing archival or destruction methods. Different data classes have different handling requirements.

PII — Personally Identifiable Information

Names, addresses, SSNs, email addresses, phone numbers, financial account numbers. Subject to GDPR, CCPA/CPRA, and state privacy laws. Requires encryption at rest and in transit, access logging, and certified destruction.

PHI — Protected Health Information

Medical records, diagnoses, treatment plans, insurance IDs, lab results. Subject to HIPAA with mandatory 6-year retention. Requires BAA with any vendor handling PHI. Breach notification within 60 days.

Financial Records

Transaction logs, general ledgers, tax records, audit trails, invoices. Subject to SOX (7 years) and IRS requirements. Must maintain audit trail integrity through the entire retention period. Immutable storage recommended.

General Business Data

Internal communications, project documents, configuration data, non-sensitive logs. Standard retention policies apply (typically 3-5 years). Lower handling requirements but still subject to legal hold and e-discovery obligations.

Archival Strategies

Choose the storage tier based on access frequency requirements and cost constraints. Data that must be queryable needs hot or warm storage; data retained only for compliance can go cold or offline.

Hot Storage
Active access with full query capability. AWS S3 Standard, Azure Hot, GCS Standard. Use for data that may be needed within minutes. Highest cost but lowest latency. Typical use: first 90 days post-decommission.
Warm Storage
Occasional access with slightly slower retrieval. AWS S3 Infrequent Access, Azure Cool, GCS Nearline. Use for data accessed monthly or less. 30-day minimum storage charge. Typical use: 90 days to 1 year.
Cold Storage
Rare access with hours for retrieval. AWS S3 Glacier, Azure Archive, GCS Coldline/Archive. Retrieval takes 1-12 hours. Lowest cloud cost. Typical use: 1-7 year regulatory retention.
Offline Storage
Physical media stored in secure vaults. Tape (LTO), offline disk arrays, AWS Snowball exports. Air-gapped from network. Use for maximum security or very long retention (7+ years). Requires physical retrieval procedures.
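One way to implement the cloud tiers above is an S3 lifecycle rule. A sketch assuming an archive bucket named legacy-app-archive; the day counts mirror the typical-use notes above (90 days hot, 1 year warm, expiry at year 7):

```shell
# Lifecycle rule: Standard for the first 90 days, Infrequent Access to
# 1 year, Glacier to year 7, then expiry. Bucket name and day counts
# are illustrative.
cat > lifecycle.json <<'EOF'
{
  "Rules": [{
    "ID": "decommission-archive",
    "Status": "Enabled",
    "Filter": {"Prefix": ""},
    "Transitions": [
      {"Days": 90,  "StorageClass": "STANDARD_IA"},
      {"Days": 365, "StorageClass": "GLACIER"}
    ],
    "Expiration": {"Days": 2555}
  }]
}
EOF

command -v aws >/dev/null && aws s3api put-bucket-lifecycle-configuration \
  --bucket legacy-app-archive --lifecycle-configuration file://lifecycle.json
```

A lifecycle rule moves data between tiers automatically, so nobody has to remember the 90-day and 1-year handoffs by hand.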

Legal Hold Considerations

A legal hold (litigation hold) is a directive to preserve all potentially relevant data when litigation is reasonably anticipated. Legal holds override normal retention and deletion policies.

What Triggers a Legal Hold
  • Pending or threatened litigation
  • Government investigation or subpoena
  • Regulatory audit or inquiry
  • Internal investigation (fraud, harassment, compliance)
  • Contractual dispute with active negotiation
Legal Hold Requirements
  • Check BEFORE any data deletion — consult Legal for active holds
  • Litigation freeze overrides normal retention policies
  • Preserve data in its original format and location when possible
  • Document all preservation actions with timestamps
  • Release procedures: Legal must issue written release before any destruction
Always check for legal holds before destroying data. Destroying data subject to a legal hold constitutes spoliation of evidence. Consequences include adverse inference instructions (the court assumes the destroyed data was harmful to your case), monetary sanctions, and in extreme cases, criminal contempt charges. When in doubt, preserve.
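If archives live in S3 with Object Lock enabled, a hold can be checked programmatically before any deletion. A sketch; the bucket and key names are placeholders, and the helper function is an illustration, not standard tooling:

```shell
# Check S3 Object Lock legal hold status before destroying an archived object.
# Bucket and key are placeholders; assumes Object Lock is enabled on the bucket.
hold_status() {
  # Pull the Status value out of the get-object-legal-hold JSON response
  sed -n 's/.*"Status": *"\([A-Z]*\)".*/\1/p' <<<"$1"
}

if command -v aws >/dev/null; then
  resp=$(aws s3api get-object-legal-hold \
    --bucket legacy-app-archive --key exports/appdb-final.dump 2>/dev/null)
  [ "$(hold_status "$resp")" = "ON" ] && echo "LEGAL HOLD ACTIVE - do not delete"
fi
```

An automated check complements, not replaces, the written release from Legal.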

Secure Data Destruction (NIST 800-88)

NIST Special Publication 800-88 defines three levels of media sanitization. Choose the level based on data classification and the media's next destination.

1. Clear — Software Overwrite
Protects against simple, non-invasive data recovery. Suitable for media being reused within the same security domain. Uses standard read/write commands.
2. Purge — Cryptographic or Physical Techniques
Protects against laboratory-level data recovery. Required for media leaving the organization's control. Includes cryptographic erasure, SSD secure erase, and degaussing.
3. Destroy — Physical Destruction
Renders media completely unusable. Required for highest-classification data. Includes physical shredding, disintegration, and incineration. Third-party destruction must include certificate of destruction.
Clear — Software Overwrite Commands
# Linux — overwrite with random data (3-pass)
shred -vzn 3 /dev/sdX

# Linux — single-pass zero fill (faster, acceptable for most use cases)
dd if=/dev/zero of=/dev/sdX bs=4M status=progress

# Windows — SDelete (Sysinternals): 3-pass overwrite, recursing subdirectories
sdelete -p 3 -s D:\legacy-app-data

# Verify overwrite (should show only zeros)
hexdump -C /dev/sdX | head -20
Purge — Cryptographic Erasure & Secure Erase
# Cryptographic erasure — destroy the encryption key
# (Requires data to have been encrypted at rest)
# Note: schedule-key-deletion takes a key ID or ARN, not an alias,
# so resolve the alias first
KEY_ID=$(aws kms describe-key --key-id alias/legacy-app-key \
  --query 'KeyMetadata.KeyId' --output text)
aws kms schedule-key-deletion --key-id "$KEY_ID" --pending-window-in-days 7

# SSD Secure Erase via hdparm
hdparm --user-master u --security-set-pass DecomPW /dev/sdX
hdparm --user-master u --security-erase DecomPW /dev/sdX

# NVMe Secure Erase
nvme format /dev/nvme0n1 --ses=1    # User Data Erase
nvme format /dev/nvme0n1 --ses=2    # Cryptographic Erase

# Degaussing: use NSA/CSS EPL-listed degausser for magnetic media
# (No software command — physical device required)
Destroy — Physical Destruction Methods
# Physical destruction is performed by certified vendors
# Document the following for each piece of media:

# 1. Media serial number and type (HDD, SSD, tape, etc.)
# 2. Destruction method (shredding, disintegration, incineration)
# 3. Destruction standard met (NIST 800-88, DoD 5220.22-M)
# 4. Vendor name and certification number
# 5. Date and time of destruction
# 6. Witness name(s) and signature(s)
# 7. Certificate of destruction reference number

# Approved destruction methods by media type:
# HDD:  Shredding (cross-cut to <2mm), degaussing + shredding, incineration
# SSD:  Shredding (cross-cut to <2mm), disintegration
# Tape: Degaussing + shredding, incineration
# Optical: Shredding, incineration

Chain of Custody Documentation

Maintain an auditable chain of custody for every piece of data handled during decommissioning. This record is your proof of compliance during audits.

Record Element | Description | Example
Data Inventory | Complete listing of all data assets with classification level | appdb: 2.3TB, PII (GDPR), 847 tables
Sanitization Method | NIST 800-88 level and specific technique used | Purge: cryptographic erasure via AWS KMS key deletion
Tool & Version | Software or device used for sanitization | shred (GNU coreutils 8.32), Garner HD-3WXL degausser
Personnel | Names and roles of individuals who performed and witnessed | Performed: J. Smith (SRE), Witnessed: M. Jones (Security)
Timestamp | Date and time of each action in UTC | 2025-03-15T14:30:00Z
Certificate of Destruction | Formal document from vendor (physical) or system log (digital) | CoD-2025-0315-001, signed by Iron Mountain
Verification Results | Confirmation that sanitization was successful | Post-wipe hex dump verified: all zeros, no recoverable data
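The record elements above can be captured as one structured record per asset, which keeps the chain of custody machine-searchable during audits. A sketch in JSON; every field value is illustrative:

```shell
# One chain-of-custody record per sanitized asset; field names mirror
# the table above, and all values are illustrative examples.
cat > custody-record.json <<'EOF'
{
  "asset": "appdb",
  "classification": "PII (GDPR)",
  "sanitization_method": "Purge: cryptographic erasure via KMS key deletion",
  "tool": "aws-cli 2.15.0",
  "performed_by": "J. Smith (SRE)",
  "witnessed_by": "M. Jones (Security)",
  "timestamp_utc": "2025-03-15T14:30:00Z",
  "certificate_ref": "CoD-2025-0315-001",
  "verification": "KMS key deletion confirmed in CloudTrail"
}
EOF
```

Store these records alongside the certificates of destruction, not on the system being decommissioned.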

Retention Schedule Template

Use this template to document retention commitments for each data category in the decommissioned system. Fill in during the T-14 data handling phase.

Data Category | Classification | Regulatory Basis | Retention Period | Storage Tier | Destruction Date
User account records | PII | GDPR Art. 17, CCPA | Delete at decommission | N/A | T-0
Transaction history | Financial | SOX, IRS | 7 years | Cold (Glacier) | T+7 years
Audit logs | Operational | PCI DSS, SOX | 7 years | Cold (Glacier) | T+7 years
Application logs | Operational | Internal policy | 1 year | Warm (S3 IA) | T+1 year
Medical records | PHI | HIPAA | 6 years | Cold (encrypted) | T+6 years
Cardholder data | PCI | PCI DSS | Delete at decommission | N/A (purge) | T-0
Configuration backups | Internal | Internal policy | 90 days | Hot (S3 Standard) | T+90 days
Retention is a floor, not a ceiling. Regulatory retention periods are minimums. Legal holds, contractual obligations, and business needs may extend retention beyond regulatory requirements. Always check with Legal before setting destruction dates.
09

Stakeholder Management

Communication, governance, and change management for decommissioning programs. Decommissioning fails when treated as a purely technical exercise. Success requires cross-functional alignment, clear communication, and executive sponsorship.

RACI Matrix

Define clear accountability for each phase of the decommissioning lifecycle. R = Responsible (does the work), A = Accountable (makes the decision), C = Consulted (provides input), I = Informed (kept in the loop).

Activity | Executive Sponsor | IT Lead | Business Owner | Legal/Compliance | Operations
Portfolio Assessment | I | R | A | C | C
Business Case | A | R | C | C | I
Data Archival | I | R | C | A | R
Decommission Execution | I | A | C | C | R
Post-Validation | I | R | A | C | R

Communication Plan

Structured, timeline-driven communication prevents surprises and builds organizational buy-in. Every milestone has a defined audience, message, and channel.

T-90
Internal Announcement
Notify IT leadership and business unit leads. Share rationale, timeline, and expected impact. Solicit early concerns and identify potential blockers. Channel: email + leadership meeting.
T-60
Impact Assessment Distribution
Share detailed impact assessment with all stakeholders. Include dependency analysis, user migration plan, and training schedule. Channel: shared document + stakeholder review meeting.
T-30
User Notification
Notify all end users with specific decommission date, alternative solutions, training schedule, and support contacts. Channel: email blast + in-app banner + intranet announcement.
T-14
Final Reminder
Send final reminder with confirmed decommission date, support surge plan, and escalation contacts. Confirm user migration completion rates. Channel: email + Slack/Teams announcement.
T-7
Go/No-Go Meeting
Final confirmation or escalation. Review all four Go/No-Go criteria (Section 07). Document decision and any conditional approvals. Channel: video conference with all required attendees.
T-0
Decommission Execution Updates
Status updates every 2 hours during execution window. Report progress, issues, and rollback decisions in real time. Channel: war room (virtual or physical) + status email.
T+1
Confirmation & Support Surge
Send confirmation email to all stakeholders. Activate support surge team for increased ticket volume. Monitor dependent systems for unexpected failures. Channel: email + help desk escalation.
T+30
Lessons Learned & Savings Report
Conduct retrospective review. Document what worked, what failed, and improvements for the next decommission. Publish realized cost savings vs. projections. Channel: report + all-hands presentation.

Change Advisory Board (CAB) Process

Decommissions are changes and must go through your organization's change management process. Here is what CAB typically requires for a decommission request.

Submission Requirements
  • Impact assessment: systems affected, user count, business processes disrupted
  • Rollback plan: documented and tested with estimated restore time
  • Approvals: business owner, IT lead, legal/compliance sign-off
  • Implementation plan: step-by-step execution with assigned owners
  • Communication plan: who was notified, when, and through which channel
Review Criteria & Risk Scoring
  • Scope: number of users, interfaces, and data volumes affected
  • Reversibility: can it be rolled back? How quickly?
  • Timing: does it conflict with change freezes, quarter-end, or peak periods?
  • Dependencies: are all upstream/downstream systems accounted for?
  • Risk score: Low (auto-approve), Medium (CAB review), High (executive approval)
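The risk-score tiers above can be sketched as a small scoring function. The 1-3 input scales, the equal weighting, and the tier thresholds are illustrative assumptions, not part of any CAB standard:

```shell
# Toy risk-scoring sketch for the CAB criteria above. Scales, weights,
# and thresholds are illustrative, not a standard.
risk_tier() {
  # Each input scored 1 (low) to 3 (high): scope, irreversibility, timing conflict
  local score=$(( $1 + $2 + $3 ))
  if   [ "$score" -le 4 ]; then echo "Low (auto-approve)"
  elif [ "$score" -le 7 ]; then echo "Medium (CAB review)"
  else echo "High (executive approval)"
  fi
}

risk_tier 1 1 1   # → Low (auto-approve)
risk_tier 3 2 3   # → High (executive approval)
```

Even a toy model like this forces submitters to score scope, reversibility, and timing explicitly rather than arguing from gut feel.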

Vendor Notification

Vendor relationships require careful unwinding. Contract terms, data export obligations, and early termination penalties must all be addressed before the decommission date.

Contract Termination Clauses
Review notice period requirements (typically 30-90 days written notice). Identify auto-renewal dates and cancellation windows. Flag early termination fees and negotiate where possible.
Data Export Requirements
Request data export in standard formats (CSV, JSON, SQL dump). Verify completeness of export against source record counts. Confirm vendor's data destruction timeline post-export.
Early Termination Assessment
Calculate remaining contract value vs. early termination fee. Compare against ongoing costs of maintaining the system. Present cost-benefit analysis to finance for approval.
Support Wind-Down
Negotiate reduced support tier during transition period. Ensure vendor support remains available through T+30 for data questions. Get written confirmation of service end date.
Vendor Notification Template Structure
# Vendor Notification — Formal Termination Letter
# ================================================

Subject: Notice of Service Termination — [Application Name]
Date:    [Current Date]
To:      [Vendor Account Manager]
From:    [Organization Procurement Lead]
CC:      [IT Lead], [Legal Counsel], [Business Owner]

# Section 1: Termination Notice
- Reference contract number and effective date
- State intent to terminate per Section [X] of the agreement
- Specify requested termination date (minimum notice period)

# Section 2: Data Export Request
- Request complete data export in [format] by [date]
- Specify data categories to be exported
- Request written confirmation of export completeness
- Request vendor data destruction certificate post-export

# Section 3: Financial Settlement
- Request final invoice and pro-rated credits
- Address any early termination fees per contract terms
- Confirm return of any prepaid amounts

# Section 4: Transition Support
- Request continued access for [X] days post-termination
- Specify support level needed during transition
- Identify key vendor contacts for transition questions

# Section 5: Confirmation
- Request written acknowledgment within 10 business days
- Provide point of contact for vendor questions

Resistance Management

Resistance to decommissioning is natural and predictable. Users fear change, teams fear loss of control, and leaders fear disruption. Address resistance proactively with these strategies.

Identify Champions Early

Find power users and team leads who understand the rationale for decommissioning. Involve them in replacement system design and testing. Their endorsement carries more weight than any executive mandate. Offer them early access to new tools and recognition for their role in the transition.

Provide Clear Migration Paths

Never decommission without a concrete alternative. Document step-by-step migration guides for every user workflow. Provide hands-on training sessions (not just documentation). Offer office hours and dedicated support during the first 30 days post-migration.

Show the Business Case

Share the financial rationale in terms users understand: cost savings reinvested in better tools, reduced security risk, fewer outages. Quantify the pain of the status quo (maintenance hours, security vulnerabilities, user complaints about the old system).

Establish Feedback Channels

Create dedicated channels (Slack/Teams, feedback forms, regular town halls) for users to voice concerns. Respond to every piece of feedback, even if the answer is "we understand but the decision stands." Unheard resistance goes underground and becomes sabotage.

Executive Reporting Template

Provide leadership with a consistent, concise view of the decommissioning program. Update this report monthly or at each program milestone.

Decommissioning Program Status Report
Report Element | Description | Format
Program Status | Overall health of the decommissioning program | Green / Amber / Red
Applications Decommissioned | Count completed this reporting period | Number + cumulative total
Cost Savings Realized | Actual savings vs. projected savings | Dollar amount + variance %
Upcoming Decommissions | Applications scheduled for next 30/60/90 days | Application list with target dates
Risks & Escalations | Blockers, delays, and issues requiring executive attention | Risk description + mitigation + owner
Key Decisions Required | Decisions that require executive sponsor approval | Decision description + options + recommendation
Decommissioning fails when treated as IT-only. Programs that lack executive sponsorship and cross-functional governance have a 60%+ failure rate. Business owners must be accountable for their applications. Legal must validate data handling. Finance must track savings realization. IT executes, but governance is everyone's responsibility.
10

Post-Decommission

Validation, measurement, and continuous improvement after systems are retired. Decommissioning is not complete when the server is shut down — it is complete when every trace is cleaned up, savings are verified, and lessons feed back into the program.

Post-Shutdown Validation Checklist

Run through every item within 5 business days of shutdown. Incomplete validation leads to orphaned resources, phantom alerts, and cost leakage that erodes the savings you just earned.

  • System confirmed unreachable (ping, HTTP, DNS). Test from outside the network — internal DNS caches can mask stale records for hours.
  • No orphaned cloud resources (EBS volumes, snapshots, S3 buckets, IP addresses). Unattached resources continue billing silently — check every region, not just the primary.
  • No active network connections to decommissioned IPs. Review flow logs and firewall connection tables for 48+ hours post-shutdown.
  • All DNS records removed and verified via external DNS. Use dig or nslookup against 8.8.8.8 and 1.1.1.1 to confirm propagation.
  • Certificates revoked or expired. Revoke immediately — do not wait for natural expiration if the system held sensitive data.
  • Firewall rules cleaned up. Remove allow-rules referencing decommissioned IPs to reduce attack surface.
  • Load balancer targets deregistered. Stale targets cause health check failures that pollute monitoring dashboards.
  • CI/CD pipelines disabled or removed. Orphaned pipelines can trigger builds to non-existent infrastructure.
  • Monitoring and alerting removed (no phantom alerts). Phantom alerts cause alert fatigue and erode trust in the monitoring system.
  • Service mesh configuration cleaned (Istio VirtualServices, etc.). Stale mesh routes cause connection timeouts for upstream callers.
  • CMDB updated (status: Retired, retirement date recorded). If the CMDB still shows Active, the next audit will flag it as a discrepancy.
  • Documentation archived with DEPRECATED labels. Move to an archive folder in Confluence/SharePoint — do not delete; future teams may need context.
  • License cancellations confirmed with vendors. Contact procurement to verify cancellation — auto-renewals can silently re-activate.
  • Cost savings appearing in financial reports. Verify in the next billing cycle — lag between shutdown and cost disappearance varies by provider.
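The external-DNS check from the list can be scripted. A sketch; the hostname is a placeholder, the resolvers are the 8.8.8.8 and 1.1.1.1 named in the checklist, and the helper function is an illustration:

```shell
# Confirm the retired hostname no longer resolves at public resolvers.
# legacy-app.example.com is a placeholder hostname.
check_gone() {
  # Empty dig +short output means the record is gone
  if [ -z "$1" ]; then echo "REMOVED"; else echo "STILL RESOLVING: $1"; fi
}

if command -v dig >/dev/null; then
  for ns in 8.8.8.8 1.1.1.1; do
    answer=$(dig +short +time=2 +tries=1 legacy-app.example.com A @"$ns")
    echo "$ns: $(check_gone "$answer")"
  done
fi
```

Querying multiple public resolvers catches the case where one cache still holds the stale record after the authoritative zone was cleaned.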

Orphaned Resource Detection

Cloud providers do not automatically clean up dependent resources when you terminate an instance. These orphans accumulate cost silently. Run these checks immediately after decommission and again 30 days later.

AWS — Find Unattached EBS Volumes

# List all unattached EBS volumes (available = not attached to any instance)
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[*].{ID:VolumeId,Size:Size,Created:CreateTime,AZ:AvailabilityZone}' \
  --output table

# Find unused Elastic IPs (not associated with any ENI)
aws ec2 describe-addresses \
  --query 'Addresses[?AssociationId==null].{IP:PublicIp,AllocID:AllocationId}' \
  --output table

# List EBS snapshots created before a cutoff date (set the date ~90 days back);
# confirm each snapshot's source volume is gone before deleting it
aws ec2 describe-snapshots \
  --owner-ids self \
  --query 'Snapshots[?StartTime<=`2025-01-01`].{ID:SnapshotId,Size:VolumeSize,Start:StartTime,Desc:Description}' \
  --output table

# Find empty S3 buckets (this loop checks emptiness only; review bucket age
# and last-modified times separately before deleting)
for bucket in $(aws s3api list-buckets --query 'Buckets[].Name' --output text); do
  count=$(aws s3api list-objects-v2 --bucket "$bucket" --max-items 1 \
    --query 'KeyCount' --output text 2>/dev/null)
  if [ "$count" = "0" ] || [ "$count" = "None" ]; then
    echo "EMPTY: $bucket"
  fi
done

Azure — Find Orphaned Resources

# Find orphaned managed disks (not attached to any VM)
az disk list \
  --query "[?managedBy==null].{Name:name,RG:resourceGroup,Size:diskSizeGb,State:diskState}" \
  --output table

# Find unused public IP addresses
az network public-ip list \
  --query "[?ipConfiguration==null].{Name:name,RG:resourceGroup,IP:ipAddress,SKU:sku.name}" \
  --output table

# Find empty resource groups (no resources inside)
for rg in $(az group list --query "[].name" --output tsv); do
  count=$(az resource list --resource-group "$rg" --query "length([])" --output tsv)
  if [ "$count" = "0" ]; then
    echo "EMPTY RG: $rg"
  fi
done

# Find orphaned network interfaces
az network nic list \
  --query "[?virtualMachine==null].{Name:name,RG:resourceGroup,PrivateIP:ipConfigurations[0].privateIpAddress}" \
  --output table

GCP — Find Orphaned Resources

# Find unattached persistent disks
gcloud compute disks list \
  --filter="NOT users:*" \
  --format="table(name,zone,sizeGb,status,lastAttachTimestamp)"

# Find unused static external IPs
gcloud compute addresses list \
  --filter="status=RESERVED" \
  --format="table(name,region,address,status)"

# List old snapshots created before a cutoff date
# (confirm the source disk no longer exists before deleting any snapshot)
gcloud compute snapshots list \
  --format="table(name,sourceDisk,diskSizeGb,creationTimestamp)" \
  --filter="creationTimestamp<'2025-01-01'"
Automate orphan detection. Do not rely on manual checks. Schedule these scripts as weekly cron jobs or Lambda/Azure Functions. Pipe results to Slack or email. One missed orphaned volume costs more per year than the time to set up automation.
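A minimal weekly sweep might look like the sketch below. The SLACK_WEBHOOK variable, the payload format, the script path, and the schedule are all assumptions to adapt to your environment:

```shell
# orphan-sweep.sh — weekly unattached-volume report (sketch).
# SLACK_WEBHOOK, the payload shape, and the schedule are assumptions.
build_alert() {
  # Format a whitespace-separated volume list as a Slack webhook payload
  printf '{"text":"Orphaned volumes: %s"}' "$1"
}

if command -v aws >/dev/null; then
  vols=$(aws ec2 describe-volumes --filters Name=status,Values=available \
    --query 'Volumes[*].VolumeId' --output text 2>/dev/null)
  if [ -n "$vols" ] && [ "$vols" != "None" ]; then
    curl -s -X POST -H 'Content-Type: application/json' \
      -d "$(build_alert "$vols")" "$SLACK_WEBHOOK"
  fi
fi

# Cron entry to run this weekly (Monday 06:00 UTC):
# 0 6 * * 1 /usr/local/bin/orphan-sweep.sh
```

Alerting only when the list is non-empty keeps the channel quiet until there is something to act on.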

Cost Savings Tracking

Projected savings are promises. Realized savings are results. Track both rigorously to maintain executive trust and justify future rationalization investment.

  • Projected Savings: total estimated annual savings from the business case
  • Actual Savings: verified reduction appearing in financial reports
  • License Reduction: software license costs eliminated or renegotiated
  • Infra Reduction: compute, storage, and network costs eliminated
  • Support Reduction: vendor support contracts terminated or reduced
  • FTE Reallocation: staff hours redirected to higher-value work

Monthly Savings Tracker

Maintain this table for at least 12 months post-decommission. Finance should validate actuals against projections every quarter.

Month | Category | Projected | Actual | Variance
Month 1 | Infrastructure | $12,000 | $11,400 | -5%
Month 1 | Licensing | $8,500 | $8,500 | 0%
Month 1 | Support | $3,200 | $0 | -100%
Month 2 | Infrastructure | $12,000 | $12,100 | +1%
Month 2 | Licensing | $8,500 | $8,500 | 0%
Month 2 | Support | $3,200 | $3,200 | 0%
Month 3 | Infrastructure | $12,000 | $11,800 | -2%
Month 3 | Licensing | $8,500 | $8,500 | 0%
Month 3 | Support | $3,200 | $3,200 | 0%
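The variance column is (actual − projected) / projected × 100. A one-liner for filling in tracker rows; the figures are the Month 1 examples from the table:

```shell
# Variance for one tracker row: (actual - projected) / projected * 100.
# awk handles the floating-point division and rounding.
variance_pct() {
  awk -v p="$1" -v a="$2" 'BEGIN { printf "%+.0f%%", (a - p) / p * 100 }'
}

variance_pct 12000 11400   # Month 1 Infrastructure → -5%
echo
variance_pct 3200 0        # Month 1 Support → -100%
echo
```

Computing variance the same way every month avoids the sign and rounding inconsistencies that creep into hand-filled spreadsheets.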
Paper savings are not real savings — verify every line item. A decommission that removes infrastructure but forgets to cancel the support contract has not saved money. A license that auto-renews because nobody notified procurement is a direct cost leak. Validate every projected saving against actual invoices, cloud bills, and vendor statements. If the number does not appear in a financial report, it is not a realized saving.

Lessons Learned Framework

Conduct a structured retrospective within 2 weeks of completing each decommission. Document findings in a shared knowledge base so future programs start from a stronger baseline.

Capture What Went Well
Identify practices, tools, and decisions that should be repeated. Was the timeline accurate? Did stakeholder communication prevent surprises? Were rollback procedures adequate? Document specific actions, not vague sentiments. Add successful patterns to the decommissioning runbook as standard steps.
Analyze What Went Wrong
Document failures with root cause analysis, not blame. Did a dependency get missed? Was data migration incomplete? Did a vendor refuse to cooperate? For each failure, identify the systemic cause (missing check, inadequate tooling, unclear ownership) and propose a specific process improvement to prevent recurrence.
Update What Surprised Us
Surprises reveal blind spots in the planning process. An unexpected dependency, a regulatory requirement discovered mid-execution, or a cost that appeared after shutdown all indicate gaps in the assessment framework. Feed these back into discovery templates and risk checklists for future projects.
Improve What to Change Next Time
Translate lessons into concrete runbook updates. Add new checklist items, adjust timeline estimates, revise communication templates, or introduce new validation steps. Every decommission should make the next one faster and more reliable. Assign owners and deadlines for each improvement action.

Continuous Rationalization Program

One-time rationalization projects deliver short-term wins. Sustainable portfolio health requires an ongoing program with regular cadence, governance controls, and measurable outcomes.

Review Cadence

Quarterly Portfolio Review
Review application health scores, usage trends, and cost anomalies. Identify new retirement candidates. Update the rationalization backlog with prioritized targets. Duration: 2-hour steering committee meeting with data pre-read.
Annual Deep Assessment
Full portfolio re-scoring against business strategy. Re-evaluate all Tolerate decisions. Refresh TCO models with current pricing. Align rationalization roadmap with enterprise architecture targets. Duration: 2-4 week assessment cycle.

New Application Intake Governance

Rationalization is futile if new applications enter the portfolio without controls. Every new application request must pass through an approval workflow before procurement or development begins.

01
Business Justification
Requestor submits business case with use case, expected users, alternatives evaluated, and estimated TCO. Must demonstrate that no existing application can meet the need.
02
Duplicate Check
Architecture review board validates against the application catalog. If overlapping capabilities exist, requestor must justify why existing tools are insufficient.
03
Security & Compliance Review
InfoSec evaluates data handling, authentication requirements, and regulatory implications. Legal reviews vendor contracts and data residency obligations.
04
Approval & Cataloging
If approved, the application is registered in the CMDB with a designated owner, support tier, review schedule, and exit criteria defined from day one.
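Step 04's "exit criteria defined from day one" is easy to enforce mechanically. A minimal sketch that rejects an intake record missing mandatory fields; the record shape and field names are hypothetical, not a CMDB standard:

```shell
# Sketch: minimal intake record written at approval time.
# All field names and values are hypothetical examples.
cat > /tmp/app-record.json <<'EOF'
{
  "name": "expense-tracker",
  "owner": "finance-ops",
  "support_tier": "bronze",
  "review_cycle": "annual",
  "exit_criteria": "usage < 10 MAU for 2 consecutive quarters"
}
EOF

# Fail fast if any mandatory field is missing from the record
for field in name owner support_tier review_cycle exit_criteria; do
  grep -q "\"$field\"" /tmp/app-record.json || { echo "missing: $field"; exit 1; }
done
echo "record valid"
```

Registering the exit criteria at intake means the quarterly review can query for applications that have already met their own retirement conditions.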

Shadow IT Monitoring

Shadow IT undermines every rationalization effort. Applications purchased on corporate cards, SaaS tools signed up with work email, and departmental servers running under desks all expand the portfolio invisibly. Continuous detection is essential.

# SaaS Discovery via DNS/Proxy Logs
# Aggregate unique SaaS domains from proxy logs. Assumes Squid's native
# access.log format: field 6 is the HTTP method, field 7 is host:port
# for CONNECT requests.
awk '$6 == "CONNECT" {print $7}' /var/log/squid/access.log \
  | cut -d: -f1 \
  | sort -u \
  | grep -E '\.(io|com|app|cloud|dev|net)$' \
  > /tmp/saas-domains.txt

# Cross-reference discovered domains against the approved catalog.
# comm requires sorted input; -23 keeps lines unique to the first file.
comm -23 \
  <(sort /tmp/saas-domains.txt) \
  <(sort /etc/approved-saas-catalog.txt) \
  > /tmp/shadow-it-candidates.txt

echo "Shadow IT candidates found: $(wc -l < /tmp/shadow-it-candidates.txt)"

# Cloud Account Discovery
# List all active accounts in the AWS Organization, then compare the
# output against your known account inventory to spot unfamiliar accounts
aws organizations list-accounts \
  --query 'Accounts[?Status==`ACTIVE`].{ID:Id,Name:Name,Email:Email}' \
  --output table
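The listing above only enumerates accounts the Organization already knows about; catching truly rogue accounts means diffing an external signal (consolidated billing data, SSO logs) against an approved inventory. A minimal sketch of that diff, using hypothetical sample data in place of real exports:

```shell
# Sketch: flag account IDs seen in billing data but absent from the
# approved inventory. Both files are hypothetical sample data.
sort > /tmp/billed-accounts.txt <<'EOF'
111111111111
222222222222
333333333333
EOF

sort > /tmp/approved-accounts.txt <<'EOF'
111111111111
222222222222
EOF

# comm -23: lines unique to the first (billed) file
comm -23 /tmp/billed-accounts.txt /tmp/approved-accounts.txt \
  > /tmp/unknown-accounts.txt
cat /tmp/unknown-accounts.txt
# prints the one account ID with no approval record: 333333333333
```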

# CASB (Cloud Access Security Broker) Integration
# Most enterprises should deploy a CASB for automated shadow IT discovery:
# - Microsoft Defender for Cloud Apps
# - Netskope
# - Palo Alto Prisma SaaS
# These integrate with SSO, proxy, and endpoint agents for real-time visibility.

KPI Dashboard

Track these metrics at the program level to demonstrate value and identify areas needing attention.

0
Portfolio Size
Total active applications (trend: decreasing)
0
Retired This Quarter
Applications successfully decommissioned
$0
Annual Savings
Realized cost savings (verified by finance)
0 days
Avg Cycle Time
Mean time from decision to completed decommission
0%
Tech Debt Ratio
Improvement in technical debt score (trending down)
0%
CMDB Accuracy
Percentage of applications with current, validated records
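Most of these KPIs reduce to simple aggregations over a decommission log. A sketch of the average-cycle-time calculation, assuming a hypothetical CSV export of completed decommissions:

```shell
# Sketch: mean time from decision to completed decommission.
# File name and columns are hypothetical examples.
cat > /tmp/decomm-log.csv <<'EOF'
app,cycle_days
crm-legacy,84
hr-portal,112
fax-gateway,61
EOF

awk -F, 'NR>1 {sum+=$2; n++} END {printf "Avg cycle time: %.0f days\n", sum/n}' \
  /tmp/decomm-log.csv
# prints "Avg cycle time: 86 days"
```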

Program Maturity Model

Assess where your organization sits today and define a roadmap to the next level. Most enterprises begin at Level 1 or 2. Reaching Level 3 typically takes 12-18 months of sustained effort.

Level Name Characteristics
1 Ad Hoc No formal process. Decommissioning happens reactively when hardware fails or contracts expire. No portfolio visibility. Savings are accidental, not measured. Shadow IT is rampant and undetected.
2 Defined Documented rationalization process exists. Manual data collection and scoring. Decommissioning follows a checklist but execution is inconsistent. Basic cost tracking in spreadsheets. CMDB exists but accuracy is below 70%.
3 Managed Regular quarterly cadence with executive sponsorship. Automated discovery tools deployed. Metrics tracked in dashboards. Intake governance prevents uncontrolled portfolio growth. CMDB accuracy above 85%. Average decommission cycle under 90 days.
4 Optimized Fully automated discovery and continuous rationalization. Predictive analytics identify retirement candidates before they become problems. Real-time cost attribution. Shadow IT detected within 24 hours. CMDB accuracy above 95%. Decommissioning is a standard, low-friction operational process.
80% of CIOs recognize rationalization as critical, but only 20% have fully implemented a program: be in the 20%. The gap between recognition and execution is where most organizations stall. Start with a single quarterly review cycle, retire 5 applications, measure the savings, and publish the results. Quick wins build the political capital to formalize the program. Perfection is the enemy of progress: a Level 2 program running consistently beats a Level 4 program stuck in planning.