Hi there,
Today we will talk about how the NotPetya cyberattack brought Maersk’s global shipping operations to a halt and what business leaders can learn from the company’s costly but remarkably fast recovery.
In June 2017, Maersk was hit by the NotPetya malware, and large parts of the company went dark. Ports stalled, systems failed, and teams switched to manual work just to keep cargo moving. Maersk later estimated the impact at $250 million to $300 million in losses. The recovery, rebuilt around one surviving piece of identity infrastructure, became a blueprint for crisis operations.
Executive Summary
NotPetya spread through a compromised software update and rapidly disabled Maersk’s IT environment. The company lost core systems across offices and terminals and had to restore operations under intense pressure. Maersk ultimately reported losses in the $250 million to $300 million range.
Maersk rebuilt by re-imaging roughly 45,000 PCs and 4,000 servers and restoring critical applications at speed. A key breakthrough came from finding an uninfected domain controller in Ghana, which helped re-establish identity and access. The incident reshaped how leaders think about cyber risk, business continuity, and operational resilience.
Background
Maersk runs one of the world’s largest shipping and logistics networks, so downtime has immediate physical consequences. A cyber incident does not just affect email and files. It delays containers, stalls gate moves, and cuts terminal throughput. That tight connection between digital systems and physical flows makes recovery speed a business requirement, not just an IT preference.
NotPetya was disguised as ransomware but was designed to be destructive, and it hit global firms far beyond its initial target. Maersk’s environment was broadly affected, which forced the company to operate with degraded tools while rebuilding its core systems. The company’s leadership had to coordinate technology restoration and customer-facing continuity at the same time.
The Business Challenge
1. Operational paralysis
When core systems failed, terminals and logistics workflows slowed sharply. Staff had to improvise with manual processes to move cargo and communicate status. The business faced immediate backlog risk because every hour of disruption could compound into days of congestion.
2. Identity and access collapse
If identity systems fail, everything else becomes harder to restore safely. Teams cannot authenticate users, push policies, or rejoin machines to the domain at scale. Rebuilding required a trusted root of access to avoid restoring chaos.
3. Scale of rebuild
The environment was huge, spanning many locations and thousands of devices. Re-imaging tens of thousands of endpoints and thousands of servers is normally a long-term program, not an emergency sprint. Maersk had to compress a months-long rebuild into days while continuing operations.
4. Customer trust under uncertainty
Shipping customers need accurate ETAs and predictable handoffs. During a crisis, incomplete information can trigger cancellations and rerouting. Maersk needed a consistent message and visible progress to keep relationships stable.
5. Financial exposure beyond IT
The losses were not limited to technical costs. They also included missed revenue, delays, and remediation across operations. Maersk estimated the total impact at $250 million to $300 million, which shows how cyber incidents can become major P&L events. The company needed controls that would reduce the probability and impact of future shocks.
The Strategic Moves
1. Stabilize the business with manual continuity
Maersk prioritized keeping terminals moving, even at reduced speed. Teams used workaround communication and paper-based processes to prevent a total shutdown. This bought time for IT to rebuild the core.
2. Rebuild identity first
The company focused on restoring the ability to authenticate users and control access. A surviving domain controller in Ghana became a critical recovery asset. With identity restored, rebuilding endpoints and servers became faster and safer; a short sketch after this list shows why identity sits at the root of any sane restore order.
3. Standardize the rebuild approach
Maersk used repeatable imaging and deployment steps to scale recovery. The goal was consistency and speed, not perfect customization on day one. Standardization reduced mistakes while thousands of devices were being restored.
4. Use speed as a trust signal
Every restored system reduced customer uncertainty and employee stress. Rapid progress also helped align global teams around a shared recovery narrative. Speed became part of reputation management, not just an operational goal.
5. Treat cyber resilience as a competitive advantage
The incident reframed security investment from a cost center into business protection. Leadership attention shifted toward hardening the operating model. The priority became reducing the blast radius and improving recovery time, not chasing perfect prevention.
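To make the "identity first" logic concrete, here is a minimal Python sketch that models recovery as a dependency graph and derives a safe restore order. The service names and dependencies are illustrative assumptions, not Maersk’s actual architecture.

```python
from graphlib import TopologicalSorter

# Hypothetical recovery dependencies: each service maps to the services
# it needs before it can be restored safely. Names are illustrative only.
DEPENDS_ON = {
    "identity (domain controller)": [],
    "dns_dhcp": ["identity (domain controller)"],
    "endpoint_imaging": ["identity (domain controller)", "dns_dhcp"],
    "terminal_operations": ["endpoint_imaging"],
    "booking_system": ["identity (domain controller)", "terminal_operations"],
    "email_collaboration": ["identity (domain controller)", "dns_dhcp"],
}

# TopologicalSorter takes node -> predecessors, so every valid restore
# order places identity before anything that depends on it.
for step, service in enumerate(TopologicalSorter(DEPENDS_ON).static_order(), 1):
    print(f"{step}. restore {service}")
```

Because identity has no predecessors and nearly everything depends on it, any valid ordering puts it first, which is exactly why a single surviving domain controller was such a pivotal asset.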
Execution
1. Mass reinstallation and re-imaging
Maersk rebuilt by reinstalling about 45,000 PCs and 4,000 servers in a compressed timeframe. This required disciplined tooling, staffing, and sequencing across locations. It also required a clear minimum-viable-operations target so critical workflows could be restored first.
2. Restore core applications in priority order
Teams focused first on the systems that moved containers, managed bookings, and controlled terminal operations. Non-essential services followed once the backbone was functioning. This sequencing protected throughput and reduced congestion risk; the sketch after this list shows one way to turn that priority logic into rebuild waves.
3. Coordinate field teams and parts logistics
Recovery was not centralized in one office. It required coordinated work across ports, branches, and data centers. A clear command structure and escalation paths reduced delays and duplicate work.
4. Communicate continuously with customers and staff
Status updates kept customers informed about delays and expected recovery steps. Internal updates reduced rumor cycles and aligned priorities. Consistent messaging prevented confusion from becoming a second crisis.
5. Post-incident hardening and learning loops
Maersk used lessons from the incident to raise standards and strengthen controls. The company emphasized better segmentation, backups, and response readiness to reduce future impact. The goal was faster isolation and faster recovery if another shock occurred.
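As a rough illustration of how a minimum-viable-operations target can drive sequencing, the Python sketch below groups a device fleet into rebuild waves: critical roles first, then sorted by site so field teams can work through one location at a time. The roles, tiers, and hostnames are hypothetical, not drawn from Maersk’s environment.

```python
from dataclasses import dataclass
from itertools import groupby

@dataclass(frozen=True)
class Machine:
    hostname: str
    site: str
    role: str  # e.g. "terminal_ops", "booking", "back_office"

# Hypothetical priority tiers: lower number = restored earlier.
TIER = {"terminal_ops": 0, "booking": 1, "back_office": 2}

def rebuild_waves(fleet: list[Machine]) -> list[list[Machine]]:
    """Sort the fleet by tier, then site, and split it into waves so the
    machines that keep cargo moving are re-imaged first."""
    ranked = sorted(fleet, key=lambda m: (TIER.get(m.role, 99), m.site))
    return [list(group) for _, group in
            groupby(ranked, key=lambda m: TIER.get(m.role, 99))]

fleet = [
    Machine("gate-01", "rotterdam", "terminal_ops"),
    Machine("book-07", "copenhagen", "booking"),
    Machine("hr-12", "copenhagen", "back_office"),
    Machine("crane-03", "mumbai", "terminal_ops"),
]
for i, wave in enumerate(rebuild_waves(fleet)):
    print(f"wave {i}: {[m.hostname for m in wave]}")
```

Grouping by role before location is one defensible design choice among several; the point is that the ordering rule is decided once, in advance, instead of being renegotiated at every site during the crisis.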
Results and Impact
1. Large but bounded financial damage
Maersk estimated the incident cost $250 million to $300 million. The figure reflects both operational disruption and remediation costs. It also gave the company a concrete internal benchmark for what a single cyber incident can truly cost.
2. Proof that recovery speed is a strategy
The rebuild was unusually fast for an environment of this scale. Restoring core capability quickly limited downstream congestion and customer churn. Speed became part of the company’s resilience identity.
3. A concrete playbook for global continuity
Manual continuity plans were validated under pressure. Teams learned which processes could survive without systems and which could not. Those lessons improved future runbooks and drills.
4. Stronger executive attention to cyber risk
Cyber risk moved from being seen as an IT issue to a board-level operational risk. Leaders saw that supply-chain software and identity systems could bring down physical operations. Investment decisions became easier to justify when real numbers were attached.
5. An industry-wide warning signal
NotPetya showed how a regional trigger could create global damage through interconnected systems. It pushed more firms to treat segmentation and recovery time as core KPIs. The incident became a reference case for supply-chain cyber risk.
Lessons for Business Leaders
1. Identity is your recovery lever
If identity collapses, recovery slows everywhere. Protect it with segmentation, backups, and tested restore paths. Treat identity resilience as critical infrastructure.
2. Measure resilience like revenue
Track recovery time objectives and test them regularly. Make drills realistic and repeatable, not ceremonial. What gets measured gets funded and improved; the sketch after this list shows one simple way to score drills against targets.
3. Standardization wins in emergencies
During a crisis, consistency beats perfection. Repeatable images, scripts, and processes scale faster than heroics. Standard rebuild paths reduce error rates under pressure.
4. Manual continuity must be designed, not guessed
Workarounds should be planned and practiced. Identify the few workflows that must continue without systems. Train those workflows so teams do not have to invent them during chaos.
5. Cyber risk is operational risk
A cyber incident that takes out identity or supply-chain software takes out physical operations with it. Plan, budget, and drill for cyber shocks the way you would for any other operational disruption. Resilience comes from a smaller blast radius and a faster recovery, not from betting everything on prevention.
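One way to treat resilience as a measurable quantity, as lesson 2 suggests, is to score every drill against an explicit recovery time objective. The Python sketch below does this with made-up drill data; the services, timestamps, and targets are purely illustrative.

```python
from datetime import datetime, timedelta

# Hypothetical drill log: service -> (outage start, restored at, RTO target).
# All figures are invented for illustration.
drills = {
    "identity": (datetime(2024, 3, 1, 8, 0), datetime(2024, 3, 1, 11, 30), timedelta(hours=4)),
    "terminal_ops": (datetime(2024, 3, 1, 8, 0), datetime(2024, 3, 1, 20, 0), timedelta(hours=8)),
    "booking": (datetime(2024, 3, 1, 8, 0), datetime(2024, 3, 2, 10, 0), timedelta(hours=24)),
}

for service, (start, restored, target) in drills.items():
    actual = restored - start  # how long the drill took to restore service
    status = "MET" if actual <= target else "MISSED"
    print(f"{service:13s} actual={actual} target={target} -> {status}")
```

Reviewing numbers like these each quarter turns "are we resilient?" into a trackable metric rather than a hope.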