In the process of building a monitoring and analytics platform that ingests trillions of data points a day, Datadog has learned many lessons about scalable, distributed systems in the cloud. We'd like to share those experiences with our community in this series: "Datadog on..."
Each episode will offer a conversation with the engineers who build Datadog. They'll share real-world experiences architecting, building, operating, and monitoring modern systems, giving you actionable information you can apply at your organization. We'll leave plenty of time for Q&A, so we'd like you to join the discussion.
Datadog on Caching
April 27, 2023
Jessica Cordonnier, Mitch Ward and Ara Pulido
Caching (and cache invalidation!) is often mentioned as one of the hardest problems in computer science. While caching can bring substantial performance improvements, reasoning about cached data can be extremely difficult, because caching fundamentally means that you are no longer reading from your source of truth. With that in mind, many te...
Datadog on Data Engineering Pipelines: Apache Spark at Scale
March 23, 2023
Alodie Boissonnet, Anton Ippolitov and Ara Pulido
Datadog on Site Reliability Engineering
February 22, 2023
Brandon West, Laura de Vesine and Rick Mangi
Datadog on the Lifecycle of Threats and Vulnerabilities
January 12, 2023
Nick Frichette, Adam Stevko and Andrew Krug