The Scenario
Imagine you work at a company that builds online courses. You've got an in-house learning management system (LMS) hosting proprietary course content. Videos live in S3 buckets and with various video providers, protected by a DRM service managed by a third-party vendor.
One day, someone walks into your office: "Before you joined, our team wasn't well-coordinated. It's possible the DRM provider also bundles video storage we don't use. Can you check if any of our videos ended up there?"
Simple question. Messy answer.
What Actually Happened
You dig in and discover: A founding team member uploaded videos to the vendor's storage as a temporary measure, then left. No documentation. No tracking.
Weeks later, the vendor experiences infrastructure issues. Their problems cascade downstream into your DRM service, which breaks your LMS. The chain of causation is so tangled it takes hours to figure out why everything went down.
Now you need to tell three different audiences about this. And each one needs a completely different explanation.
For Your Engineering Team: Technical Precision
Your team needs the exact failure chain. This goes in your internal KB.
INCIDENT: LMS Outage – Root Cause Analysis
TIMELINE:
- 14:32 UTC: DRM service returned 503 errors
- 14:45 UTC: LMS course playback failed for all users
- 15:10 UTC: Incident team identified DRM service logs showing upstream failures
- 15:30 UTC: Traced to [Third-Party Vendor] managed storage degradation
ROOT CAUSE:
Videos stored in two locations, one of them undocumented:
- Primary: Company S3 bucket (monitored, failover)
- Undocumented: Vendor-managed storage (no monitoring, no alerting)
IMPACT:
- 6-hour outage affecting 2,400 active learners
- ~$8K revenue loss
- 47 support escalations
RESOLUTION:
1. Consolidated all videos to primary S3 bucket
2. Deprovisioned vendor storage
3. Implemented CloudFront caching layer
4. Added automated inventory checks to CI/CD
PREVENTION:
- Infrastructure audit scheduled
- Documentation required for all storage decisions
- Monitoring alerts for upstream DRM health
This is detailed. Technical. It answers the "how" and "why" for people building the system.
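Resolution item 4, the automated inventory check, could be sketched roughly like this. It is a minimal illustration under stated assumptions, not the check from the incident: the bucket name, the vendor hostname, the sample catalog, and the `find_undocumented` helper are all hypothetical, and it assumes every video record exposes a storage URL.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: every storage location the team has documented.
DOCUMENTED_LOCATIONS = frozenset({"s3://company-course-videos"})


def storage_location(url: str) -> str:
    """Reduce a video URL to its scheme plus host or bucket."""
    parsed = urlparse(url)
    return f"{parsed.scheme}://{parsed.netloc}"


def find_undocumented(video_urls, documented=DOCUMENTED_LOCATIONS):
    """Group video URLs by any storage location missing from the allowlist."""
    undocumented = {}
    for url in video_urls:
        location = storage_location(url)
        if location not in documented:
            undocumented.setdefault(location, []).append(url)
    return undocumented


# Made-up sample data standing in for the LMS video catalog.
catalog = [
    "s3://company-course-videos/python-101/lesson-1.mp4",
    "https://storage.vendor.example/uploads/python-101/lesson-2.mp4",
]

strays = find_undocumented(catalog)
for location, urls in strays.items():
    print(f"UNDOCUMENTED {location}: {len(urls)} video(s)")
```

In a CI job, the script would exit with a non-zero status whenever `strays` is non-empty, so a video in an untracked location fails the build instead of surfacing during an outage.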
For Your Management Team: Business Impact
Your non-technical stakeholders don't care about S3 buckets or DRM protocols. They care about impact and action.
SUBJECT: LMS Outage – Summary & Next Steps
We experienced a 6-hour LMS outage this afternoon affecting ~2,400 students.
WHAT HAPPENED:
Our course videos were stored in two places instead of one. When the
secondary location had problems, it broke course delivery because our video
protection service depended on it. This secondary location was never
documented when it was originally set up.
BUSINESS IMPACT:
- 2,400 students unable to access courses (6 hours)
- ~$8K estimated revenue impact
- 47 customer support tickets
WHAT WE'RE DOING:
- Consolidated storage to a single monitored location
- Added automated checks that flag any untracked video storage
- Conducting full infrastructure review
TIMELINE:
All systems operational as of 8:45 PM UTC. No further issues expected.
Short. Factual. Focused on impact and next steps. Zero jargon.
For Your Customers: Accountability
Your customers don't care about your infrastructure. They care that their courses were down. Keep it brief, take responsibility, show you fixed it.
SUBJECT: Service Restored – Course Access
We experienced a service interruption this afternoon (2:30 PM – 8:45 PM UTC)
that prevented course access.
WHAT HAPPENED:
A configuration in our video delivery system became misaligned, affecting
course playback. We've identified and resolved the root cause.
WHAT WE DID:
- Restored full service at 8:45 PM
- Consolidated video storage to prevent similar issues
- Added monitoring to catch problems faster
FOR YOUR ACCOUNT:
You have not been charged for downtime. Your course progress is intact.
You can resume your courses right away.
We're sorry for the disruption, and thank you for your patience.
Short. Accountable. Solution-focused. No technical noise.
The Real Skill
You're dealing with the same incident. But:
- Your team needs technical details to prevent it from happening again
- Your management needs business impact to make resource decisions
- Your customers need reassurance and accountability
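One way to keep the three messages consistent is to derive them from a single incident record, so the times and numbers cannot drift between versions. The sketch below is only an illustration: the `Incident` class, the three `for_*` functions, and the calendar date are assumptions (the source gives no date), while the times and counts come from the sample reports above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Incident:
    started: datetime
    resolved: datetime
    learners_affected: int
    support_tickets: int
    technical_cause: str

    @property
    def duration_hours(self) -> float:
        return (self.resolved - self.started).total_seconds() / 3600


def for_engineers(i: Incident) -> str:
    # Engineers get the exact window and the technical cause.
    return (
        f"{i.started:%H:%M}-{i.resolved:%H:%M} UTC "
        f"({i.duration_hours:.0f}h). Root cause: {i.technical_cause}"
    )


def for_management(i: Incident) -> str:
    # Management gets impact numbers, not mechanisms.
    return (
        f"{i.duration_hours:.0f}-hour outage affecting "
        f"{i.learners_affected:,} learners; {i.support_tickets} support tickets."
    )


def for_customers(i: Incident) -> str:
    # Customers get the window and the current status only.
    return (
        f"Course access was interrupted from {i.started:%H:%M} "
        f"to {i.resolved:%H:%M} UTC. Service is fully restored."
    )


# The date is a placeholder; times and counts mirror the sample reports.
incident = Incident(
    started=datetime(2024, 1, 15, 14, 45, tzinfo=timezone.utc),
    resolved=datetime(2024, 1, 15, 20, 45, tzinfo=timezone.utc),
    learners_affected=2400,
    support_tickets=47,
    technical_cause="undocumented vendor-managed storage degraded and broke DRM",
)

for render in (for_engineers, for_management, for_customers):
    print(render(incident))
```

The facts live in one place and only the framing changes per audience, which is the same shift the three sample messages make by hand.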
Engineers who get promoted aren't always the ones who build the cleverest systems. They're the ones who can explain what they've built - and why it matters - to anyone in the room.
That's not soft-skill theater. That's infrastructure.
TL;DR
Same incident, three audiences, three completely different explanations. Master that shift and you'll communicate like an engineer who actually ships things.
Tags: engineering communication · technical writing · stakeholder management · incident communication · career development · team leadership · infrastructure · documentation