#722: The Frugal Architect w/Werner Vogels: How Warner Bros. Discovery keeps streaming seamless

Update: 2025-05-26
Digest

This episode of the AWS Podcast (#722) features Werner Vogels in conversation with Tom Leaman, VP of Site Reliability Engineering (SRE) at Warner Bros. Discovery (WBD), on how WBD builds reliable, cost-effective streaming platforms on AWS. Leaman describes his role and stresses the importance of observability and operational intelligence in managing hundreds of microservices across nine AWS regions. WBD uses a standardized operational metadata schema to catalog its cloud resources, which improves visibility into system health, security, and cost, and so bears directly on customer experience. Cost optimization is central, with "cost per subscriber" as a key metric, and the discussion weighs frugality against short-sighted cost-cutting. The episode also stresses aligning technical decisions with business goals, particularly during post-merger integration, where WBD adopted a "best of both" approach. A "closed door" philosophy guides irreversible architectural decisions, especially those concerning databases. Finally, WBD's "celebration of error" approach to incident management favors learning and improvement over blame, encouraging knowledge sharing and process improvements.

Outlines

00:00:00
Introduction and Site Reliability Engineering at WBD

Introduction to AWS Podcast episode 722, featuring Werner Vogels and Tom Leaman, VP of SRE at Warner Bros. Discovery. The episode focuses on WBD's approach to building reliable and cost-effective streaming platforms. Leaman describes his role and the importance of observability and operational intelligence in managing hundreds of microservices across nine AWS regions.

00:05:09
Operational Metadata, Customer Experience, and Cost Optimization

Discussion on WBD's operational metadata schema, improving visibility into system health, security, and cost management. This is linked to cost optimization strategies, using "cost per subscriber" as a key metric, and balancing frugality with maintaining customer experience.

00:22:23
Business Alignment, Merger Integration, and Architectural Decisions

The importance of aligning technical decisions with business goals is emphasized, particularly regarding the successful integration of two organizations after a merger using a "best of both" approach and the role of shared operational metadata. The "closed door" philosophy for irreversible architectural decisions is also explained.

00:40:01
Incident Management and Conclusion

WBD's "celebration of error" approach to incident management is detailed, focusing on learning from incidents and improving systems, processes, and knowledge sharing.

Keywords

Operational Metadata Schema


A standardized system for cataloging and organizing cloud resources, improving visibility into system health, security, and cost. Enables efficient resource management and streamlined operations.
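As a rough illustration of how such a schema can be used, the sketch below attaches a fixed set of metadata keys to each resource, flags resources that are missing keys, and rolls up cost by owning team. The key names (`service`, `owner_team`, `environment`, `cost_center`), the `CloudResource` type, and all sample values are hypothetical; the episode summary does not list WBD's actual fields.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical required keys; WBD's real schema is not described in the summary.
REQUIRED_KEYS = ("service", "owner_team", "environment", "cost_center")


@dataclass
class CloudResource:
    resource_id: str
    region: str
    monthly_cost_usd: float
    metadata: dict = field(default_factory=dict)


def missing_keys(resource):
    """Return the required metadata keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not resource.metadata.get(key)]


def cost_by_team(resources):
    """Sum monthly cost per owning team; untagged resources are 'unattributed'."""
    totals = defaultdict(float)
    for resource in resources:
        team = resource.metadata.get("owner_team") or "unattributed"
        totals[team] += resource.monthly_cost_usd
    return dict(totals)


# Invented sample data for illustration only.
resources = [
    CloudResource("i-001", "us-east-1", 100.0, {
        "service": "playback", "owner_team": "video",
        "environment": "prod", "cost_center": "cc-1"}),
    CloudResource("db-002", "eu-west-1", 250.0, {
        "service": "catalog", "owner_team": "content",
        "environment": "prod", "cost_center": "cc-2"}),
    CloudResource("i-003", "us-east-1", 40.0, {
        "service": "playback", "environment": "dev"}),
]

for resource in resources:
    gaps = missing_keys(resource)
    if gaps:
        print(f"{resource.resource_id} is missing: {', '.join(gaps)}")

print(cost_by_team(resources))
```

Resources that fail the check show up as "unattributed" cost, which is one way a shared schema makes gaps in ownership and cost allocation visible.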

Cost Per Subscriber


A key performance indicator (KPI) used to measure the efficiency of a streaming service. It balances cost growth with subscriber acquisition, providing a business-focused metric for cost optimization.
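The arithmetic behind this metric can be sketched as follows, assuming it is simply total platform cost divided by subscriber count for a period. The function names and all figures are invented for illustration and are not WBD data.

```python
def cost_per_subscriber(total_cost_usd, subscribers):
    """Total cost for a period divided by the subscriber count."""
    if subscribers <= 0:
        raise ValueError("subscribers must be positive")
    return total_cost_usd / subscribers


def trend(periods):
    """Return (label, cost per subscriber, change vs. previous period) tuples."""
    results = []
    previous = None
    for label, cost, subscribers in periods:
        current = cost_per_subscriber(cost, subscribers)
        change = None if previous is None else current - previous
        results.append((label, current, change))
        previous = current
    return results


# Invented figures: cost rises 20% while subscribers rise 60%.
periods = [
    ("Q1", 1_000_000.0, 500_000),
    ("Q2", 1_200_000.0, 800_000),
]
report = trend(periods)

for label, value, change in report:
    suffix = "" if change is None else f" ({change:+.2f} vs. previous)"
    print(f"{label}: ${value:.2f} per subscriber{suffix}")
```

In this made-up example the absolute bill goes up, yet cost per subscriber falls from $2.00 to $1.50, which is why the ratio is a better efficiency signal than raw spend.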

Celebration of Error


A positive approach to incident management that emphasizes learning and improvement from errors. Focuses on shared learnings, process improvements, and knowledge transfer.

Frugal Architecture


Designing and building systems that are cost-effective and efficient without compromising reliability or customer experience. Prioritizes long-term value and avoids short-sighted cost-cutting.

Site Reliability Engineering (SRE)


The discipline of ensuring the reliability and scalability of systems, particularly in large-scale cloud environments.

Microservices


An architectural style that structures an application as a collection of loosely coupled, independently deployable services.

AWS


Amazon Web Services, a comprehensive cloud platform providing various services for computing, storage, databases, and more.

Cloud Cost Optimization


Strategies and techniques for reducing cloud computing expenses while maintaining performance and reliability.

Merger Integration


The process of combining two or more organizations' IT systems and infrastructure after a merger or acquisition.

Q&A

  • How does WBD's operational metadata schema improve efficiency and cost management?

    The schema provides a standardized way to track and manage cloud resources, improving visibility into resource utilization, security vulnerabilities, and cost allocation. This allows for better resource optimization and proactive cost management.

  • What is the "celebration of error" approach to incident management, and what are its benefits?

    It focuses on learning from incidents to improve systems, processes, and knowledge sharing. It shifts the focus from blame to understanding and improvement, leading to a more positive and proactive approach to reliability.

  • How does WBD balance frugality with the need for reliable and scalable systems?

    WBD uses metrics like "cost per subscriber" to track efficiency, prioritizing long-term cost optimization without compromising customer experience. They also employ a "closed door" philosophy for irreversible architectural decisions.

  • How did WBD successfully integrate two large organizations after a merger?

    A "best of both" approach was adopted, with engineers from both organizations collaborating to create a new platform. The existing operational metadata schema from Discovery+ played a key role in standardizing the new system.

Show Notes

With only nine months to launch Max, Tom Leaman, VP of Site Reliability Engineering at Warner Bros. Discovery, had to move fast to keep millions of viewers streaming smoothly. Learn about their innovative approach to measuring efficiency, managing global operations, and building resilient systems at massive scale with your hosts Simon Elisha and Dr. Werner Vogels.

Learn More: http://thefrugalarchitect.com/architects/tom-leaman-warner-bros-discovery.html
Amazon Web Services