Datadog on gRPC

September 29, 2022

Anthonin Bonnefoy

Antoine Tollenaere

Ara Pulido

Datadog, the observability platform used by thousands of companies, is made up of hundreds of services that communicate over the network using gRPC, an RPC framework. That makes gRPC a critical component of Datadog’s reliability.

As teams investigated incidents related to their services, they discovered that some of them were gRPC-related. But were there common patterns to those incidents? Could we use them to learn more about gRPC and how to use it better?

Over the past year, an engineering squad with members from different teams was formed to study gRPC-related incidents and share the lessons learned. They wrote a set of best practices for all engineering teams to follow, along with common libraries that implement them.

In this session, Ara Pulido, Staff Developer Advocate, will chat with Anthonin Bonnefoy, Senior Software Engineer on the Core Resilience team, and Antoine Tollenaere, Team Lead on the Networking team, both of whom were part of this squad. They will share their investigation of the incidents and the gRPC best practices they came up with to avoid similar incidents in the future.

By the end of the session, you will have a better understanding of the internals of gRPC and of how to implement it more effectively at your organization.