In the process of building a monitoring and analytics platform that ingests trillions of data points a day, Datadog has learned many lessons about scalable, distributed systems in the cloud. We'd like to share those experiences with our community in this series: "Datadog on..."
Each episode will offer a conversation with the engineers who build Datadog. They'll share real-world experiences architecting, building, operating, and monitoring modern systems, giving you actionable information you can apply at your organization. There will be plenty of time left for Q&A, and we'd like you to join the discussion.
Datadog on Building Reliable Distributed Applications Using Temporal
October 29, 2024
Allen George, Loic Minaudier and Ara Pulido
Temporal is an open source platform for building resilient and reliable distributed systems. Datadog started using Temporal in 2020 as the foundation for our internal software delivery platform. Since then, it has been widely adopted as a platform that any engineering team can use to build their systems.
Datadog on Cloud Workload Identities
November 5, 2024
Ian Ferguson, Tabitha Sable and Christophe Tafani-Dereeper
Datadog operates dozens of Kubernetes clusters, tens of thousands of hosts, and millions of containers across a multi-cloud environment, spanning AWS, Azure, and Google Cloud. With over 2,000 engineers, we needed to ensure that every developer and application could securely and efficiently access resources across these various cloud pr...