Practice of Cloud System Administration, The
DevOps and SRE Practices for Web Services, Volume 2

Authors: Thomas A. Limoncelli, Strata R. Chalup, Christina J. Hogan

Language: English

51.64 €

In Print (Delivery period: 14 days).

Publication date:
560 p. · 18x3 cm · Paperback

“There’s an incredible amount of depth and thinking in the practices described here, and it’s impressive to see it all in one place.”

- Win Treese, coauthor of Designing Systems for Internet Commerce

 

The Practice of Cloud System Administration, Volume 2, focuses on “distributed” or “cloud” computing and brings a DevOps/SRE sensibility to the practice of system administration. Unsatisfied with books that cover either design or operations in isolation, the authors created this authoritative reference centered on a comprehensive approach.

 

Case studies and examples from Google, Etsy, Twitter, Facebook, Netflix, Amazon, and other industry giants are explained in practical ways that are useful to all enterprises. The new companion to the best-selling first volume, The Practice of System and Network Administration, Second Edition, this guide offers expert coverage of the following and many other crucial topics:

 

Designing and building modern web and distributed systems

  • Fundamentals of large system design
  • Understand the new software engineering implications of cloud administration
  • Make systems that are resilient to failure and grow and scale dynamically
  • Implement DevOps principles and cultural changes
  • IaaS/PaaS/SaaS and virtual platform selection

Operating and running systems using the latest DevOps/SRE strategies

  • Upgrade production systems with zero downtime
  • What and how to automate; how to decide what not to automate
  • On-call best practices that improve uptime
  • Why distributed systems require fundamentally different system administration techniques
  • Identify and resolve resiliency problems before they surprise you

Assessing and evaluating your team’s operational effectiveness

  • Manage the scientific process of continuous improvement
  • A forty-page, pain-free assessment system you can start using today

 

  • Part I: Design: Building It
  • Chapter 1: Designing in a Distributed World
  • Chapter 2: Designing for Operations
  • Chapter 3: Selecting a Service Platform
  • Chapter 4: Application Architectures
  • Chapter 5: Design Patterns for Scaling
  • Chapter 6: Design Patterns for Resiliency
  • Part II: Operations: Running It
  • Chapter 7: Operations in a Distributed World
  • Chapter 8: DevOps Culture
  • Chapter 9: Service Delivery: The Build Phase
  • Chapter 10: Service Delivery: The Deployment Phase
  • Chapter 11: Upgrading Live Services
  • Chapter 12: Automation
  • Chapter 13: Design Documents
  • Chapter 14: Oncall
  • Chapter 15: Disaster Preparedness
  • Chapter 16: Monitoring Fundamentals
  • Chapter 17: Monitoring Architecture and Practice
  • Chapter 18: Capacity Planning
  • Chapter 19: Creating KPIs
  • Chapter 20: Operational Excellence

Thomas A. Limoncelli is an internationally recognized author, speaker, and system administrator with more than twenty years of experience at companies like Google, Bell Labs, and StackExchange.com.

 

Strata R. Chalup has more than twenty-five years of experience in Silicon Valley, focusing on IT strategy, best practices, and scalable infrastructures at firms that include Apple, Sun, Cisco, McAfee, and Palm.

 

Christina J. Hogan has more than twenty years of experience in system administration and network engineering, from Silicon Valley to Italy and Switzerland. She holds a master’s degree in computer science and a doctorate in aeronautical engineering, and has been part of a Formula 1 racing team.

The new industry-standard reference to best-practice cloud system administration

  • Indispensable for everyone who manages cloud systems for the enterprise or service providers, or plans to do so (a market of 1,400,000+ IT pros in the US alone)
  • Covers DevOps, administrative design patterns, scalability, reliability, automation, metrics, backup/restore, provisioning, documentation, operational hygiene, and more
  • Includes a full chapter on identifying and resolving resiliency problems before they bite you
  • The perfect complement and follow-up to Volume I on computing system administration
  • By three world-class experts in cloud system administration