November 16, 2020
With many blog posts published and talks given on the topic, it’s no secret that Datadog runs Kubernetes at scale. We currently run dozens of clusters, some of them with thousands of nodes, across multiple clouds. How do we monitor all of that while ensuring we can scale up quickly and safely?
In this session, Ara Pulido, Technical Evangelist, will chat with Celene Chang and Charly Fontaine, both software engineers on the Container Integrations team at Datadog. This team is responsible for deploying and running the Datadog Agent in our Kubernetes clusters. We’ll cover how we run the Datadog Agent in our clusters, which metrics we care about, and the monitors we have set up. By the end of the session, you will have new ideas and best practices for monitoring Kubernetes with Datadog that you can apply in your own environment.
Datadog on Building Reliable Distributed Applications Using Temporal →
Datadog on OpenTelemetry →
Datadog on Secure Remote Updates →
Datadog on Stateful Workloads on Kubernetes →
Datadog on Data Science →
Datadog on Kubernetes Autoscaling →
Datadog on Kubernetes Node Management →
Datadog on Caching →
Datadog on Data Engineering Pipelines: Apache Spark at Scale →
Datadog on Site Reliability Engineering →
Datadog on Building an Event Storage System →
Datadog on gRPC →
Datadog on Gamedays →
Datadog on Chaos Engineering →
Datadog on Serverless →
Datadog on Software Delivery →
Datadog on Incident Management →
Datadog on Kubernetes →