Datadog on Data Informed Product Development
July 26, 2022
Ara Pulido, Miranda Kapin and Derek Howles
Datadog is an observability and security platform. That means that our users may be in a high-stress situation: debugging an issue in production, managing an incident, or responding to a security threat. Having a good UX is particularly critical in those cases.
Datadog on Detecting Threats using Network Traffic Flows
June 11, 2022
Theo Guidoux, Andrew Krug and Anna Pauxberger
At Datadog’s scale, with over 18,000 customers sending trillions of data points per day, analyzing the volume of data coming in can be challenging. Networking logs are one of the largest internal log sources at Datadog. Being able to analyze and make sense of them is critical to keeping Datadog secure. To help with the task, we have bui...
Datadog on Web Security Standards
May 19, 2022
Jean-Baptiste Aviat, Ayaz Badouraly and Andrew Krug
Datadog on Rust
February 23, 2022
Duarte Nunes, Ara Pulido and Brian Troutwine
Rust is a programming language that has been gaining popularity over the past few years, with its adopters claiming that it helps them write faster, more memory-efficient, and more reliable software.
Datadog on Profiling in Production
January 28, 2022
Julien Danjou and Kirk Kaiser
Depending on your chosen programming language and stack, you may have never used a profiler in production. The very idea of using a profiler in production for a web service may seem unrealistic, due to the amount of overhead involved. After all, aren’t profilers extremely computationally expensive to run?
Datadog on Data Visualization
December 14, 2021
Mark Hintz, Ara Pulido and Kemper Smith
Datadog customers send trillions of data points per day. These data points are processed by Datadog and used to debug production issues in real time. But to reason about all this data, we humans need visual representations. Visualizations can help us discover connections and problem points.
Datadog on Building Responsive UX
September 30, 2021
Amy Luo, Edwin Morris and Ara Pulido
Datadog product designers and frontend developers have been working together to create a new, better UX for creating dashboards, which is one of the most important parts of using Datadog. A central part of this effort was building a new layout engine. Working on this project was a bit different from the usual feature work, so the colla...
Datadog on Gamedays
August 31, 2021
Elijah Andrews, Mike Petruzelli and Ara Pulido
As engineers scaling our applications and infrastructure, we accept that failure can and will happen. But how can we get ahead of those potential failures? Gamedays are events that aim to test the resilience of a system when facing abnormal and turbulent situations, checking whether our expectations about how it will fail (or not) ...
Datadog on Chaos Engineering
June 1, 2021
Joris Bonnefoy, Tay Nishimura and Ara Pulido
As you scale your applications, remaining resilient to underlying network failures, resource constraints introduced by other applications, or spikes in traffic can become exponentially more complex, even with very thorough testing and processes. Chaos engineering is a discipline that encourages experimenting in production and injecting...
Datadog on Security and Compliance
March 31, 2021
Kirk Kaiser and Andrew Spangler
At Datadog, customer trust and data security are of the utmost importance.
Datadog on Agent Integration Development
March 23, 2021
Christine Chen, Ara Pulido and Julia Simon
To make sure that customers are getting the most out of the platform in the least amount of time, Datadog maintains more than 400 built-in integrations. These integrations collect metrics, events, and logs from a diverse set of sources: databases, source control, bug tracking tools, cloud providers, automation tools, and more.
Datadog on eBPF
January 26, 2021
Lee Avital, Guillaume Fournier and Ara Pulido
eBPF (extended Berkeley Packet Filter) is a Linux technology that can run sandboxed programs in the kernel without changing kernel source code or loading kernel modules. While the kernel is an ideal place to implement monitoring/observability, networking, and security, it wasn't until the recent broad adoption of eBPF that it was feasib...
Datadog on Serverless
December 10, 2020
David Huie, Kirk Kaiser and Andrew Krug
The Datadog Security Platform team leverages Serverless to ingest security events across many different cloud providers, deployment platforms, and devices. These security events are then transformed and shipped to a data lake to help defend and protect the platform as a whole. Once there, these ingested events are used to drive interna...
Datadog on Kubernetes Monitoring
November 16, 2020
Celene Chang, Charly Fontaine and Ara Pulido
With many blog posts published and talks given on the topic, it’s no secret that Datadog is running Kubernetes at scale. We currently run dozens of clusters, some of them with thousands of nodes. Additionally, we have clusters running in multiple clouds. How are we monitoring all of that, ensuring we can scale up quickly and safely?
Datadog on Software Delivery
September 30, 2020
Jacob LeGrone, Ara Pulido and Benjamin Smith
Over 800 engineers at Datadog do thousands of deployments per day, to hundreds of services in different environments, regions, and cloud providers. How can we manage all those deployments in a common way and have a reliable paper trail to audit any changes?
Datadog on Incident Management
August 27, 2020
Leo Cavaille, Matt Hardwick and Ara Pulido
Datadog is a monitoring and analytics platform that ingests trillions of data points per day, coming from more than 8,000 customers. With a complex distributed architecture and hundreds of deployments per day, it goes without saying that things sometimes don't go as planned. Our teams have been improving the way incidents are managed at Datadog ov...
Datadog on RocksDB
June 30, 2020
James Bibby, Kenny House and Ara Pulido
Datadog is a monitoring and analytics platform that ingests trillions of data points per day, coming from more than 8,000 customers. Each of those is associated with metadata, mostly in the form of tags, and it can also be part of streams of related data points, which can then be explored, queried, or aggregated. RocksDB is used by man...
Datadog on Kafka
May 27, 2020
Jamie Alquiza, Kirk Kaiser and Balthazar Rouberol
In this session, we’ll speak with two engineers responsible for scaling the Kafka infrastructure within Datadog, Balthazar Rouberol and Jamie Alquiza. They'll share their strategy for scaling Kafka, explain how it’s been deployed on Kubernetes, and introduce kafka-kit, our open source toolkit for scaling Kafka clusters.
Datadog on Kubernetes
May 27, 2020
Laurent Bernaille and Ara Pulido
When Datadog decided two years ago to move its infrastructure platform to Kubernetes, we didn’t expect to find so many roadblocks, but ingesting trillions of data points per day in a reliable fashion requires pushing the limits of cloud computing.