Datadog on Stateful Workloads on Kubernetes

March 26, 2024

Ara Pulido

Edward Dale

Martin Dickson

Categories: distributed systems, reliability, storage

Container orchestration platforms like Kubernetes were, from the beginning, an ideal fit for microservice architectures that run many stateless services. This was also the case for Datadog, which runs on dozens of self-managed Kubernetes clusters in a multi-cloud environment, adding up to hundreds of thousands of pods. But what about stateful applications? What are the best practices for running and scaling them without losing data?

The team at Datadog that owns our Kafka clusters has been running large, business-critical storage workloads on our Kubernetes clusters for a long time. Over the years, they have gained experience in running this type of workload at scale and have built tooling around it.

The team at Datadog that owns our Postgres databases, on the other hand, is currently moving their workloads from managed cloud instances to Kubernetes.

In this live session from the Datadog London Summit, Ara Pulido, Staff Developer Advocate, will chat with Martin Dickson, Senior Software Engineer on the Datadog Kafka team, and Edward Dale, Engineering Manager on the Postgres team at Datadog, about their experience, their tooling, and their stories (good and bad) of running stateful workloads on Kubernetes.
