Communication Is a Skill Engineers Can't Afford to Ignore

Engineers often fail not because they can't code, but because they can't explain what broke and why. Here's how to explain the same incident differently to your team, your management, and your customers.

Atharva Uday Unde · May 17, 2026 · 8 min read

The Scenario

Imagine you work at a company that builds online courses. You've got an in-house learning management system (LMS) hosting proprietary course content. Videos live in S3 buckets and with various video providers, protected by a DRM service managed by a third-party vendor.

One day, someone walks into your office: "Before you joined, our team wasn't well-coordinated. It's possible the DRM provider also bundles video storage we don't use. Can you check if any of our videos ended up there?"

Simple question. Messy answer.


What Actually Happened

You dig in and discover that a founding team member uploaded videos to the vendor's storage as a temporary measure, then left. No documentation. No tracking.

Weeks later, the vendor experiences infrastructure issues. Their problems cascade into your DRM service, which in turn breaks your LMS. The chain of causation is so tangled that it takes hours to figure out why everything went down.

Now you need to tell three different people about this. And each one needs a completely different explanation.


For Your Engineering Team: Technical Precision

Your team needs the exact failure chain. This goes in your internal KB.

INCIDENT: LMS Outage – Root Cause Analysis

TIMELINE:
- 14:32 UTC: DRM service returned 503 errors
- 14:45 UTC: LMS course playback failed for all users
- 15:10 UTC: Incident team identified DRM service logs showing upstream failures
- 15:30 UTC: Traced to [Third-Party Vendor] managed storage degradation

ROOT CAUSE:
Videos were stored in two locations, one of them undocumented:
- Primary: Company S3 bucket (monitored, with failover)
- Undocumented: Vendor-managed storage (no monitoring, no alerting)

IMPACT:
- 6-hour outage affecting 2,400 active learners
- ~$8K revenue loss
- 47 support escalations

RESOLUTION:
1. Consolidated all videos to primary S3 bucket
2. Deprovisioned vendor storage
3. Implemented CloudFront caching layer
4. Added automated inventory checks to CI/CD

PREVENTION:
- Infrastructure audit scheduled
- Documentation required for all storage decisions
- Monitoring alerts for upstream DRM health

This is detailed. Technical. It answers the "how" and "why" for people building the system.
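
The "automated inventory checks" item in the resolution list could take many forms. Here is a minimal sketch in Python, assuming a video catalog that records where each file is stored. The `VideoRecord` type, the location strings, and the `find_undocumented_locations` helper are all hypothetical names made up for illustration, not part of any real LMS or vendor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoRecord:
    video_id: str
    storage_location: str  # hypothetical label, e.g. "s3://company-videos"


def find_undocumented_locations(records, documented_locations):
    """Group video IDs by any storage location missing from the documented list."""
    documented = set(documented_locations)
    undocumented = {}
    for record in records:
        if record.storage_location not in documented:
            undocumented.setdefault(record.storage_location, []).append(record.video_id)
    return undocumented


# Hypothetical catalog; in practice this might be exported from the LMS database.
catalog = [
    VideoRecord("intro-101", "s3://company-videos"),
    VideoRecord("advanced-204", "vendor-managed-storage"),
    VideoRecord("capstone-310", "s3://company-videos"),
]
DOCUMENTED = ["s3://company-videos"]

stray = find_undocumented_locations(catalog, DOCUMENTED)
for location, video_ids in stray.items():
    print(f"Undocumented location {location}: {', '.join(video_ids)}")
```

A CI job could fail the build whenever `stray` is non-empty, which turns "documentation required for all storage decisions" from a policy into a check.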


For Your Management Team: Business Impact

Your non-technical stakeholders don't care about S3 buckets or DRM protocols. They care about impact and action.

SUBJECT: LMS Outage – Summary & Next Steps

We experienced a 6-hour LMS outage this afternoon affecting ~2,400 students.

WHAT HAPPENED:
Our video storage system had videos in two places instead of one. When the
secondary location had problems, it broke course delivery because of how our
DRM is configured. This secondary location wasn't documented when originally set up.

BUSINESS IMPACT:
- 2,400 students unable to access courses (6 hours)
- ~$8K estimated revenue impact
- 47 customer support tickets

WHAT WE'RE DOING:
- Consolidated storage to a single monitored location
- Added safeguards against configuration drift
- Conducting full infrastructure review

TIMELINE:
All systems operational as of 8:45 PM. No further issues expected.

Short. Factual. Focused on impact and next steps. Zero jargon.


For Your Customers: Accountability

Your customers don't care about your infrastructure. They care that their courses were down. Keep it brief, take responsibility, show you fixed it.

SUBJECT: Service Restored – Course Access

We experienced a service interruption this afternoon (2:30 PM – 8:45 PM UTC)
that prevented course access.

WHAT HAPPENED:
A configuration in our video delivery system became misaligned, affecting
course playback. We've identified and resolved the root cause.

WHAT WE DID:
- Restored full service at 8:45 PM
- Consolidated video storage to prevent similar issues
- Added monitoring to catch problems faster

FOR YOUR ACCOUNT:
You have not been charged for the downtime, and your course progress is intact.
You can resume your courses right away.

Thank you for your patience.

Short. Accountable. Solution-focused. No technical noise.


The Real Skill

You're dealing with the same incident. But:

  • Your team needs technical details to prevent it from happening again
  • Your management needs business impact to make resource decisions
  • Your customers need reassurance and accountability
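
One way to keep the three messages consistent is to derive them from a single incident record, so the numbers never drift between audiences. The sketch below is illustrative only: the `Incident` fields, the renderer functions, and the incident date are assumptions, with the figures taken from the example messages above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Incident:
    started: datetime
    resolved: datetime
    users_affected: int
    revenue_loss_usd: int
    support_tickets: int
    technical_cause: str  # wording for engineers
    plain_cause: str      # jargon-free wording for management

    @property
    def duration_hours(self) -> float:
        return (self.resolved - self.started).total_seconds() / 3600


def for_engineers(i: Incident) -> str:
    # Full detail: cause plus every impact figure.
    return (
        f"ROOT CAUSE: {i.technical_cause}\n"
        f"IMPACT: {i.duration_hours:.0f}-hour outage, {i.users_affected:,} learners, "
        f"~${i.revenue_loss_usd:,} revenue loss, {i.support_tickets} escalations"
    )


def for_management(i: Incident) -> str:
    # Business impact in plain language.
    return (
        f"We had a {i.duration_hours:.0f}-hour outage affecting ~{i.users_affected:,} students. "
        f"{i.plain_cause} Estimated revenue impact: ~${i.revenue_loss_usd:,}."
    )


def for_customers(i: Incident) -> str:
    # Only what the customer experienced and what is true now.
    return (
        f"Course access was interrupted from {i.started:%H:%M} to {i.resolved:%H:%M} UTC. "
        "Service is restored and your course progress is intact."
    )


# The date is a placeholder; the times and figures mirror the example messages.
incident = Incident(
    started=datetime(2026, 1, 1, 14, 32),
    resolved=datetime(2026, 1, 1, 20, 45),
    users_affected=2400,
    revenue_loss_usd=8000,
    support_tickets=47,
    technical_cause="Videos in undocumented vendor-managed storage with no monitoring.",
    plain_cause="Our videos were stored in two places instead of one.",
)

for render in (for_engineers, for_management, for_customers):
    print(render(incident), end="\n\n")
```

The point of the design is that the facts live in one place and only the framing changes, which is the same discipline the three hand-written messages follow.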

Engineers who get promoted aren't always the ones who build the cleverest systems. They're the ones who can explain what they've built - and why it matters - to anyone in the room.

That's not soft skill theater. That's infrastructure.


TL;DR

Same incident, three audiences, three completely different explanations. Master that shift and you'll communicate like an engineer who actually ships things.


Tags: engineering communication · technical writing · stakeholder management · incident communication · career development · team leadership · infrastructure · documentation