Datadog on Incident Management

An episode by Léo Cavaillé, Matt Hardwick, and Ara Pulido
Datadog

About this episode

Datadog is a monitoring and analytics platform that ingests trillions of data points per day from more than 8,000 customers. With a complex distributed architecture and hundreds of deployments per day, things sometimes don't go as planned. Over the years, our teams have improved the way incidents are managed at Datadog, and they are using that knowledge to help Datadog customers manage their own incidents.

In this session, Technical Evangelist Ara Pulido will chat with Léo Cavaillé, SRE Manager, and Matt Hardwick, an engineer working on Datadog’s incident application. They will discuss how incident management evolved at Datadog, how we handle incidents today, and how the SRE team is working alongside the engineers building Datadog’s incident application to make Datadog the best place to organize, investigate, manage, and resolve your infrastructure and application incidents.