Case Study - Integrating PagerDuty with Cherwell

PagerDuty runs a SaaS platform for managing and delivering alerts related to various incidents triggered across an organization's systems.

  • Client: PagerDuty
  • Timeline: December 2016 to April 2017
  • Industry: Technology Operations

Company Overview

PagerDuty is an incident management platform that helps organizations improve their operational reliability and agility. Founded in 2009, the company provides tools for managing incidents, automating responses, and ensuring that the right teams are notified at the right time. With features such as on-call scheduling, real-time alerts, and analytics, PagerDuty helps teams respond quickly to incidents, minimizing downtime and improving service delivery. The platform integrates with a wide range of tools and services, making it a common component of modern DevOps and IT operations.

Project Overview

PagerDuty provides a range of tools for configuring the set of monitored services, the escalation policies that determine who is notified of each incident, and the notification preferences for each user on the platform. To extend the breadth of the platform, PagerDuty published a request for a bi-directional integration with the Cherwell Service Management platform. The integration would allow events within Cherwell to trigger PagerDuty incidents, and updates to a PagerDuty incident to be reflected on the corresponding event within Cherwell.

The Cherwell Service Management platform is a suite of IT service management applications. It enables the rapid development and monitoring of data-centric applications, using a mixture of built-in configuration tools and custom application development supported within the platform.

Having worked on PagerDuty implementations across multiple client engagements, Go Between had a deep understanding of the platform's capabilities. After reviewing the available documentation for Cherwell, our team identified an approach for integrating the two systems. It used the web service One-Step and REST API utilities built into the Cherwell platform, together with an AWS Lambda function that translated these requests into the format expected by the PagerDuty platform.

After our proposal was selected, Go Between produced a set of workflow and sequence diagrams to further validate and communicate the proposed design. Constructing these diagrams required deeper investigation into each system and helped ensure a thorough understanding of the implementation details. The resulting documentation clarified the requirements for the end deliverable and was a key tool for managing expectations with all project stakeholders. Although implementation details were adjusted as we continued to learn more about each system, the diagrams provided a solid foundation for building the integration.

To support the necessary communication from PagerDuty to Cherwell, Go Between implemented a Node.js-based AWS Lambda function that was triggered on each change to a PagerDuty incident and relayed these updates into the Cherwell system. Because the PagerDuty platform has built-in support for configuring these types of Lambda functions, this approach fit cleanly into its architecture. The Lambda function used the Cherwell REST API to make a series of HTTP requests, retrieving and updating the corresponding Cherwell incidents as needed. We included full automated test coverage for this functionality, which proved extremely valuable for maintaining quality as new requirements arose late in the project.
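The relay step described above can be sketched roughly as follows. This is a minimal TypeScript illustration under stated assumptions, not the code delivered in the project: the webhook message shape, the status labels, the `Status` field name, and the simplified save request are all hypothetical, and the HTTP calls to the Cherwell REST API are represented by an injected `lookup` function so the translation logic can be tested on its own.

```typescript
// Assumed, simplified shape of one message in a PagerDuty webhook payload.
interface PagerDutyMessage {
  event: string; // for example "incident.acknowledge"
  incident: { id: string; status: string };
}

// Assumed, simplified shape of a Cherwell save request (not the exact API schema).
interface CherwellSaveRequest {
  busObPublicId: string;
  fields: { name: string; value: string; dirty: boolean }[];
  persist: boolean;
}

// Hypothetical mapping from a PagerDuty incident status to a Cherwell status label.
function cherwellStatusFor(pagerDutyStatus: string): string | null {
  switch (pagerDutyStatus) {
    case "triggered":
      return "New";
    case "acknowledged":
      return "In Progress";
    case "resolved":
      return "Resolved";
    default:
      return null; // statuses we do not relay
  }
}

// Build the update for one message, or null when there is nothing to relay.
function toCherwellUpdate(
  cherwellIncidentId: string,
  message: PagerDutyMessage,
): CherwellSaveRequest | null {
  const status = cherwellStatusFor(message.incident.status);
  if (status === null) {
    return null;
  }
  return {
    busObPublicId: cherwellIncidentId,
    fields: [{ name: "Status", value: status, dirty: true }],
    persist: true,
  };
}

// Plan the Cherwell updates for one webhook delivery. `lookup` stands in for the
// HTTP request that retrieves the Cherwell incident linked to a PagerDuty incident.
function planCherwellUpdates(
  messages: PagerDutyMessage[],
  lookup: (pagerDutyIncidentId: string) => string | undefined,
): CherwellSaveRequest[] {
  const updates: CherwellSaveRequest[] = [];
  for (const message of messages) {
    const cherwellId = lookup(message.incident.id);
    if (cherwellId === undefined) continue; // no linked Cherwell incident
    const update = toCherwellUpdate(cherwellId, message);
    if (update !== null) updates.push(update);
  }
  return updates;
}
```

Keeping the translation separate from the HTTP layer is one way to make this kind of function easy to cover with automated tests: a test can pass a fixed `lookup` and compare the planned requests against expected values without contacting either system.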

To support the necessary communication from Cherwell to PagerDuty, we used the built-in Cherwell tools to construct a Blueprint that applies the required modifications to a client's Cherwell instance. These changes included updates to the Cherwell data models, additional event listeners to process updates to incidents within the Cherwell service, and custom functions to relay these updates to the PagerDuty REST APIs when appropriate.
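The decision about when to relay a Cherwell update, and what to send, can be illustrated with a short sketch. In the project this logic lived inside the Cherwell Blueprint rather than in TypeScript, and the source does not say which PagerDuty endpoint was called. The example below assumes an event in the style of the PagerDuty Events API v2 (`routing_key`, `event_action`, `dedup_key`, `payload`); the Cherwell field names, status labels, and `cherwell-` key prefix are hypothetical.

```typescript
// Hypothetical view of a Cherwell incident change; field names are illustrative.
interface CherwellIncidentChange {
  incidentId: string;
  shortDescription: string;
  previousStatus: string;
  status: string;
  pagerDutyRoutingKey?: string; // present only when linked to a PagerDuty service
}

type PagerDutyEventAction = "trigger" | "acknowledge" | "resolve";

// Event body in the style of the PagerDuty Events API v2 (assumed target endpoint).
interface PagerDutyEvent {
  routing_key: string;
  event_action: PagerDutyEventAction;
  dedup_key: string;
  payload?: { summary: string; source: string; severity: "critical" | "error" | "warning" | "info" };
}

// Hypothetical mapping from a Cherwell status label to a PagerDuty event action.
function actionForStatus(status: string): PagerDutyEventAction | null {
  switch (status) {
    case "New":
      return "trigger";
    case "In Progress":
      return "acknowledge";
    case "Resolved":
      return "resolve";
    default:
      return null;
  }
}

// Return the event to send, or null when the change should not be relayed.
function toPagerDutyEvent(change: CherwellIncidentChange): PagerDutyEvent | null {
  if (!change.pagerDutyRoutingKey) return null; // incident is not linked to PagerDuty
  if (change.status === change.previousStatus) return null; // no status transition
  const action = actionForStatus(change.status);
  if (action === null) return null; // status has no PagerDuty equivalent

  const event: PagerDutyEvent = {
    routing_key: change.pagerDutyRoutingKey,
    event_action: action,
    // A stable key lets later acknowledge/resolve events match the same incident.
    dedup_key: "cherwell-" + change.incidentId,
  };
  if (action === "trigger") {
    // Only trigger events carry a descriptive payload in this sketch.
    event.payload = { summary: change.shortDescription, source: "cherwell", severity: "error" };
  }
  return event;
}
```

Deriving the deduplication key from the Cherwell incident identifier is one simple way to keep both sides pointing at the same incident, since follow-up events can be matched without storing extra state.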

All artifacts supporting this integration were completed and delivered, along with documentation of the steps to deploy and configure the Cherwell Blueprint. In addition, we provided a set of screencasts covering the setup and validation of the integration to give a clear understanding of its operation. With this initial integration completed, the PagerDuty team was able to gather the necessary buy-in from both organizations to further refine the integration before delivering it to their customers.