Incident → Business Impact & Comms Agent

1📋

Overview & business value

When a production incident hits, this agent translates it from IT terms into business terms — and runs the communications. It identifies the affected service, queries CRM and ERP to find the customers, orders and revenue at risk, estimates the impact, decides who needs to be told and how urgently, and drafts tailored updates for each audience. It runs alongside the remediation loop, not instead of it. It reads and reasons freely; every public or customer-facing message is an Action Ticket requiring approval.

Problem

IT describes an incident in CIs and error rates. The business hears nothing until customers complain. Nobody can quickly answer "who is affected and how much revenue is at risk?", and customer comms are slow, inconsistent and reactive.

Solution approach

The agent is event-driven: a P1/P2 incident triggers it. It maps the failing service to affected customers (Salesforce) and orders/revenue (SAP), estimates impact, and drafts tiered comms (internal Teams, the public status page, and customer email), with a human approving anything public or customer-facing.

Core capabilities

Severity + affected-service identification
Customer mapping via Salesforce
Revenue-at-risk via SAP orders
Business-impact estimation
On-call engagement (PagerDuty)
Status-page update (approval)
Tiered internal + customer comms
Post-incident business summary

How it helps

Incidents are communicated in the language the business cares about. Customers are notified proactively instead of complaining inbound. Revenue and trust are protected during outages, and comms are consistent and fast.

Illustrative value model — plug in your own figures

↓ min · Time to first business update

Revenue-at-risk made visible

↑ trust · Proactive vs inbound comms

Value = faster, proactive comms → retained customers + protected revenue, plus incident-commander time saved. Placeholders — substitute your baseline.

2📖

The story (for the sales conversation)

The one-liner

"When checkout goes down, IT sees an error rate. The business sees nothing — until angry customers call. This agent instantly turns the outage into 'here's who's affected, here's the revenue at risk, here's the message' and runs the comms."

😣 Today, without the agent

The payment service degrades. Engineering is heads-down fixing it. The incident commander asks "who's affected?" and nobody knows without pulling CRM and ERP by hand. Support gets blindsided by inbound tickets. A status-page update goes out late and vague. The Tier-1 accounts find out from their own customers, not from us.

😌 The same outage, with the agent

The P1 fires. While engineering remediates, the agent maps the failing checkout service to 340 affected accounts (12 of them Tier-1) and $480k of in-flight orders, and estimates impact. It drafts an internal Teams briefing, a status-page update, and a tiered customer note — and asks the incident commander to approve the public ones. Support and the Tier-1 owners are briefed before the phones start ringing.

"It drafts every message and surfaces the revenue at risk, but it never publishes to the status page or emails customers without a human approving. Proactive comms, with a hand on the public-facing switch."

The trust point — lead with this for Support, Comms and Incident leadership

The villain: the translation gap

IT speaks in CIs; the business hears nothing until customers complain.

The hero: impact + comms in one

The agent names affected customers and revenue, and drafts the messages.

The reason to trust it

Public and customer comms are reversible Action Tickets, sent only on approval.

💡 How to use this tab: open here for Support / Comms / incident leadership, tell the before/after, land the trust line, then show "Sample questions" and "Live scenarios". Keep Tools/Governance for the technical buyer.

3📥

Input data

Source system	Field / entity	Type	Purpose
ServiceNow	incident (priority, CI, service)	object	Trigger + affected service
Monitoring	error rate, affected endpoints	metrics	Scope + severity context
Salesforce	Accounts on affected service / tier	sObject	Who is affected
SAP S/4HANA	in-flight orders / revenue	object	Revenue at risk
PagerDuty	on-call schedule	object	Who to engage
Statuspage	component, incident	record	Public status update

ℹ️ Developer note: the trigger is a P1/P2 ServiceNow incident (or a monitoring alert). This agent handles business translation and comms; remediation runs in parallel via the platform's MOP loop.
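The trigger condition in the developer note can be sketched as a small gate. This is a minimal illustration, assuming the incident arrives as a dict shaped like the ServiceNow record in the mock data (priority as the strings "1" and "2" for P1 and P2). The names `should_trigger` and `TRIGGER_PRIORITIES` are hypothetical, not part of the platform.

```python
# Hypothetical trigger gate: only P1/P2 incidents that name a target start the agent.
TRIGGER_PRIORITIES = {"1", "2"}  # assumed ServiceNow priority values for P1 and P2


def should_trigger(incident: dict) -> bool:
    """Return True when the incident is P1/P2 and names a CI or a service."""
    priority = str(incident.get("priority", "")).strip()
    has_target = bool(incident.get("cmdb_ci") or incident.get("service"))
    return priority in TRIGGER_PRIORITIES and has_target


incident = {"number": "INC0099001", "priority": "1",
            "cmdb_ci": "svc-payments-prod", "service": "Checkout"}
print(should_trigger(incident))  # True
```

Requiring a CI or service alongside the priority keeps the agent from starting on an incident it cannot map to customers or revenue.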

4⚙️

Processing flow (tools mapped per step)

ℹ️ Each step shows its path and whether it is read-only or a write via Action Ticket.

1. Detect + classify severity · deterministic

Read the incident and its CI/service; confirm P1/P2.

servicenow.incident · read

2. Identify affected service · non-deterministic

Correlate CI + monitoring to the customer-facing service (e.g. checkout).

monitoring read · CI correlation

3. Map to customers · deterministic

Query Salesforce for accounts on the affected service and their tiers.

salesforce.query · affected accounts

4. Map to revenue · deterministic

Query SAP for in-flight orders / revenue tied to the service.

sap.read · in-flight orders

5. Estimate business impact · non-deterministic

Reason over customers + revenue + error rate to size the impact and urgency.

impact estimate

6. Engage on-call · deterministic · Action Ticket

Confirm the on-call owner; add an incident note in PagerDuty.

pagerduty · on-call + note (via ticket)

7. Status-page update · deterministic · human approval · Action Ticket

Draft and (on approval) publish a public status-page incident.

statuspage create-incident (via ticket)

8. Tiered comms · non-deterministic · approval for customer · Action Ticket

Internal Teams briefing immediately; tiered customer emails on approval.

teams.post · email.sendgrid (via ticket)

9. Post-incident summary · non-deterministic

After recovery, compose a business-impact summary (duration, customers, revenue).

summary → ServiceNow + Teams
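The nine steps above can be captured as data so that the path and approval badges are checkable rather than implied. The sketch below is illustrative only: the `Step` class and `FLOW` list are hypothetical names, and the flags simply mirror the badges shown on each step (step 8 is marked approval-gated because its customer emails are, even though the internal Teams briefing goes out immediately).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    name: str
    path: str             # "deterministic" or "non-deterministic"
    action_ticket: bool   # True when the step is badged as an Action Ticket
    human_approval: bool  # True when a human must approve before it runs


# Flags copied from the badges in the processing flow.
FLOW = [
    Step(1, "Detect + classify severity", "deterministic", False, False),
    Step(2, "Identify affected service", "non-deterministic", False, False),
    Step(3, "Map to customers", "deterministic", False, False),
    Step(4, "Map to revenue", "deterministic", False, False),
    Step(5, "Estimate business impact", "non-deterministic", False, False),
    Step(6, "Engage on-call", "deterministic", True, False),
    Step(7, "Status-page update", "deterministic", True, True),
    Step(8, "Tiered comms", "non-deterministic", True, True),
    Step(9, "Post-incident summary", "non-deterministic", False, False),
]


def approval_gated(flow: list) -> list:
    """Names of the steps that must wait for a human approver."""
    return [s.name for s in flow if s.human_approval]


print(approval_gated(FLOW))  # ['Status-page update', 'Tiered comms']
```

Keeping the flags in one table makes it easy to assert an invariant such as "every approval-gated step is also an Action Ticket".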

5🔧

Skills & tools (real connectors)

ℹ️ Each tool maps to a real processorKey. Read tools are direct; writes only run inside an Action Ticket, and public/customer comms require human approval.

servicenow.incident · Table API · SOURCE

itsm.processors.ServiceNowIncident

Reads the incident + CI/service and updates the record via the Table API.

/api/now/v2/table/incident

READ direct · WRITE via ticket
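As an illustration of the direct read, the sketch below builds a Table API lookup URL for a single incident. It assumes the standard Table API query parameters `sysparm_query`, `sysparm_fields` and `sysparm_limit`; the instance URL, the field list and the function name `incident_lookup_url` are placeholders chosen for this example. No request is sent.

```python
from urllib.parse import urlencode


def incident_lookup_url(instance_url: str, number: str) -> str:
    """Build a read-only Table API URL that fetches one incident by number."""
    params = {
        "sysparm_query": f"number={number}",  # encoded query filter
        "sysparm_fields": "number,priority,short_description,cmdb_ci,state",
        "sysparm_limit": "1",                 # one record is enough
    }
    base = instance_url.rstrip("/")
    return f"{base}/api/now/v2/table/incident?{urlencode(params)}"


# Placeholder instance; substitute your own.
url = incident_lookup_url("https://example.service-now.com/", "INC0099001")
print(url)
```

Limiting the returned fields keeps the payload small and avoids pulling free-text notes into the agent's context unnecessarily.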

pagerduty · REST + Events · SOURCE

incident.processors.PagerDutyConnector

Triggers/manages PagerDuty incidents, queries on-call schedules and adds notes via the REST + Events APIs.

on-call schedule · add note

READ direct · WRITE via ticket

salesforce.query · REST / Bulk · SOURCE

salesforce.QuerySalesforceObject / crm.processors.SalesforceBulkAPI

Finds accounts on the affected service and their tiers via SOQL / Bulk.

SELECT Account, Tier FROM … WHERE Service__c = …

READ · direct
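Once the query returns account rows, the tier breakdown is a simple tally. The sketch below assumes each row carries the standard `Id` field plus a hypothetical custom field `Tier__c` holding 1, 2 or 3; the real object and field names depend on your org. The output shape follows the mock data used in this document.

```python
from collections import Counter


def summarize_accounts(records: list) -> dict:
    """Count affected accounts per tier, de-duplicating on the account Id."""
    unique = {r["Id"]: r for r in records}  # a repeated Id is counted once
    tiers = Counter(r["Tier__c"] for r in unique.values())  # Tier__c is hypothetical
    return {
        "affectedAccounts": len(unique),
        "tier1": tiers[1],  # Counter returns 0 for a tier with no accounts
        "tier2": tiers[2],
        "tier3": tiers[3],
    }


rows = [
    {"Id": "001A", "Tier__c": 1},
    {"Id": "001B", "Tier__c": 2},
    {"Id": "001A", "Tier__c": 1},  # duplicate row, e.g. from a join
    {"Id": "001C", "Tier__c": 3},
]
print(summarize_accounts(rows))
```

De-duplicating before counting matters because a query that joins accounts to services can return the same account more than once, which would inflate the customer count the agent is told never to invent.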

sap.read · OData v4 · SOURCE

erp.processors.SAPS4HanaPull

Pulls in-flight orders / revenue tied to the affected service.

{baseUrl}/sap/opu/odata4/sap/{serviceName}/{entitySet}

READ · direct
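The revenue-at-risk figure is an aggregation over the order rows the pull returns. The sketch below is one way to do that under stated assumptions: the row keys `orderId`, `netAmount` and `currency` are hypothetical, and the function refuses to sum mixed currencies rather than guess an exchange rate.

```python
from decimal import Decimal


def revenue_at_risk(orders: list) -> dict:
    """Aggregate in-flight order rows into a revenue-at-risk summary."""
    currencies = {o["currency"] for o in orders}
    if len(currencies) > 1:
        # Summing across currencies would produce a meaningless number.
        raise ValueError(f"mixed currencies, convert first: {sorted(currencies)}")
    # Decimal(str(...)) avoids binary float rounding on monetary amounts.
    total = sum((Decimal(str(o["netAmount"])) for o in orders), Decimal("0"))
    return {
        "inFlightOrders": len(orders),
        "revenueAtRisk": total,
        "currency": next(iter(currencies), None),
    }


orders = [
    {"orderId": "SO-1", "netAmount": "1200.50", "currency": "USD"},
    {"orderId": "SO-2", "netAmount": 799.5, "currency": "USD"},
]
summary = revenue_at_risk(orders)
print(summary["inFlightOrders"], summary["revenueAtRisk"], summary["currency"])
```

Failing loudly on mixed currencies fits the prompt rule that every number must be traceable to a record rather than estimated.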

monitoring · Datadog / Azure · SOURCE

observability.processors.DatadogLogForward / AzureMonitorIngest

Reads error rates and affected endpoints for scope and severity context.

metrics · logs (read)

READ · direct

statuspage · REST · SOURCE

devops.processors.StatuspageInvoke

Atlassian Statuspage — incidents, components, subscribers, scheduled maintenance.

{"operation":"create-incident","payload":{…}}

WRITE · via Action Ticket (approval)

teams.post · Graph · SINK

collaboration.processors.TeamsSendMessage

Internal stakeholder briefings (support, account owners, leadership).

TeamsSendMessage · channel/chat

WRITE · via Action Ticket

email.sendgrid · SendGrid · SINK

marketing.processors.SendGridSend

Sends tiered customer email comms (send, get_status). Approval required for customer-facing send.

send · get_status

WRITE · via Action Ticket (approval)

⚠️ Connectivity honesty: this agent handles business translation and communications; remediation runs in parallel through the platform's MOP loop, not here. Status page is Atlassian StatuspageInvoke; customer email is SendGridSend (alt: SMTP/PutEmail). Both customer-facing channels are approval-gated.

6🤖

Agent prompt (production)

🤖 System prompt

You are the Incident Business-Impact and Communications Orchestrator.

## Operating rules
1. Read/reason freely across ServiceNow, monitoring, Salesforce, SAP, PagerDuty.
2. You do NOT remediate — remediation runs in parallel via the platform MOP loop.
3. You NEVER write directly. Any incident update, status-page post, on-call note
   or customer email is an Action Ticket carrying a MOP; ProcBot executes,
   Sherlock validates.
4. Deterministic for: severity read, customer/order mapping, on-call lookup,
   incident-record update.
5. Non-deterministic for: affected-service correlation, business-impact estimate,
   deciding who to notify and how urgently, drafting the messages.
6. Human approval before: any public status-page post and any customer email.
7. Cite the tool and record for every number. Never invent customer counts or
   revenue figures.

## Tools
READ (direct): servicenow.incident, pagerduty (read), salesforce.query, sap.read,
  monitoring
WRITE (Action Ticket only): servicenow.incident (update), pagerduty (note),
  statuspage (approval), teams.post, email.sendgrid (approval)

## Output (every step)
{ "decision":"...", "path":"deterministic | non-deterministic", "confidence":0.0,
  "action_ticket": { "mop":"...", "scope":"...", "approval":"auto | human" } | null,
  "evidence":[ { "tool":"...", "record":"..." } ], "message_to_user":"..." }
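A consumer of the agent's output may want to check each step envelope before acting on it. The following is a minimal sketch of such a check, not the platform's validator: the function name `validate_step_output` is hypothetical, and the assumption that `confidence` lies in the range 0 to 1 is inferred from the `0.0` placeholder in the schema.

```python
import json

REQUIRED_KEYS = ("decision", "path", "confidence", "action_ticket",
                 "evidence", "message_to_user")


def validate_step_output(raw: str) -> list:
    """Return a list of problems; an empty list means the envelope is well-formed."""
    out = json.loads(raw)
    if not isinstance(out, dict):
        return ["output must be a JSON object"]
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in out]
    if out.get("path") not in ("deterministic", "non-deterministic"):
        problems.append("path must be deterministic or non-deterministic")
    conf = out.get("confidence")
    if (isinstance(conf, bool) or not isinstance(conf, (int, float))
            or not 0.0 <= conf <= 1.0):
        problems.append("confidence must be a number in [0, 1]")  # assumed range
    ticket = out.get("action_ticket")
    if ticket is not None:  # null is allowed for read-only steps
        if not isinstance(ticket, dict) or ticket.get("approval") not in ("auto", "human"):
            problems.append("action_ticket.approval must be auto or human")
        elif not ticket.get("mop"):
            problems.append("action_ticket.mop is required")
    for item in out.get("evidence") or []:
        if not isinstance(item, dict) or not {"tool", "record"} <= set(item):
            problems.append("each evidence item needs tool and record")
            break
    return problems


good = json.dumps({
    "decision": "map customers", "path": "deterministic", "confidence": 0.9,
    "action_ticket": None,
    "evidence": [{"tool": "salesforce.query", "record": "Account"}],
    "message_to_user": "340 accounts affected.",
})
print(validate_step_output(good))  # []
```

Returning a list of problems rather than raising on the first one makes it easier to log everything wrong with a malformed step in a single pass.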

7🛡️

Execution governance — Action Ticket → MOP

The model never sends a public post or customer email itself. The agent drafts and reasons; an Action Ticket carries the MOP; ProcBot executes; Sherlock validates; the ticket is reversible. Status-page posts and customer emails require human approval first.

1 · Reason

Agent sizes impact and drafts the comms.

→

2 · Action Ticket

MOP id, audience, message, approval level.

→

3 · ProcBot executes

Posts to status page / sends email. Model never writes.

→

4 · Sherlock validates

Confirms delivery; logs the comms; can post a correction.

{
  "action_ticket": "AT-9002",
  "mop": "MOP-STATUSPAGE-UPDATE",
  "approval": "human",   // public-facing
  "target": { "tool": "statuspage", "processorKey": "devops.processors.StatuspageInvoke" },
  "procedure": { "operation": "create-incident",
    "payload": { "name": "Checkout degraded", "status": "investigating",
                 "components": { "checkout": "degraded_performance" },
                 "body": "We are investigating elevated checkout errors." } },
  "validation": { "owner": "sherlock", "expect": "incident.id present" },
  "reversible": true
}
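The gate and the validation step can be sketched in a few lines. This is an illustration under stated assumptions, not how ProcBot or Sherlock are actually implemented: `can_execute` and `validate_result` are hypothetical names, and the only expectation format handled is the `<dotted.path> present` form seen in the ticket above.

```python
from typing import Optional

# Channels this document marks as approval-gated.
APPROVAL_GATED_TOOLS = {"statuspage", "email.sendgrid"}


def can_execute(ticket: dict, approved_by: Optional[str] = None) -> bool:
    """Executor-side gate: human-approval tickets need a named approver."""
    tool = ticket["target"]["tool"]
    needs_human = ticket.get("approval") == "human" or tool in APPROVAL_GATED_TOOLS
    return bool(approved_by) if needs_human else True


def validate_result(ticket: dict, result: dict) -> bool:
    """Validator-side check for an expectation such as 'incident.id present'."""
    path, _, check = ticket["validation"]["expect"].partition(" ")
    if check != "present":
        raise ValueError(f"unsupported expectation: {check!r}")
    node = result
    for part in path.split("."):  # walk the dotted path through nested dicts
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return node not in (None, "")


ticket = {
    "action_ticket": "AT-9002",
    "mop": "MOP-STATUSPAGE-UPDATE",
    "approval": "human",
    "target": {"tool": "statuspage"},
    "validation": {"owner": "sherlock", "expect": "incident.id present"},
}
print(can_execute(ticket))                        # False
print(can_execute(ticket, "incident-commander"))  # True
print(validate_result(ticket, {"incident": {"id": "abc123"}}))  # True
```

Checking the tool as well as the `approval` field means a ticket that targets a public channel is still held for approval even if it was mislabeled `auto`.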

8🔀

Data flow

flowchart TD
    A([P1 incident: payment service]) --> B[Read incident + CI servicenow.incident]
    B --> C[Correlate affected service monitoring]
    C --> D[Map to customers salesforce.query]
    C --> E[Map to revenue sap.read]
    D --> F[Estimate business impact]
    E --> F
    F --> G{{MOP-ONCALL-ENGAGE}}
    G --> H[PagerDuty note + Teams brief]
    F --> I{Public / customer comms?}
    I -->|Yes| J[Human approval]
    J --> K{{MOP-STATUSPAGE-UPDATE + MOP-CUSTOMER-COMMS}}
    K --> L[Statuspage + SendGrid]
    L --> M[Post-incident business summary]
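In the flow, customer mapping and revenue mapping both branch from the service correlation and both feed the impact estimate, so they do not depend on each other. One way to exploit that, sketched below, is to run the two reads concurrently. Running them in parallel is an assumption of this example, and the two coroutines are stand-ins that return the seeded demo figures instead of calling Salesforce or SAP.

```python
import asyncio


async def map_customers(service: str) -> dict:
    """Stand-in for the salesforce.query read; returns the seeded demo counts."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"affectedAccounts": 340, "tier1": 12, "tier2": 74, "tier3": 254}


async def map_revenue(service: str) -> dict:
    """Stand-in for the sap.read pull; returns the seeded demo figures."""
    await asyncio.sleep(0)
    return {"inFlightOrders": 880, "revenueAtRisk": 480000, "currency": "USD"}


async def gather_impact_inputs(service: str) -> dict:
    """Run both independent reads concurrently and join them for the estimate."""
    customers, revenue = await asyncio.gather(
        map_customers(service), map_revenue(service)
    )
    return {"service": service, "customers": customers, "revenue": revenue}


inputs = asyncio.run(gather_impact_inputs("checkout"))
print(inputs["customers"]["tier1"], inputs["revenue"]["revenueAtRisk"])  # 12 480000
```

During a P1 the time to first business update is the headline metric, so overlapping the two slowest reads is a natural place to save seconds.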

9🏗️

Systems touched

🛠️ ServiceNow — incident + CI/service

📟 PagerDuty — on-call engagement

☁️ Salesforce — affected customers + tiers

🟡 SAP S/4HANA — in-flight orders / revenue

📡 Datadog / Azure Monitor — error rate context

🌐 Statuspage — public status update

💬 Teams + SendGrid — internal + customer comms

10🗄️

Mock data (seed to demo)

Incident · Monitoring · Affected customers · Revenue at risk · Status page

// servicenow.incident INC0099001
{ "number":"INC0099001", "priority":"1", "short_description":"Payment service degraded",
  "cmdb_ci":"svc-payments-prod", "service":"Checkout", "state":"In Progress" }

// monitoring (Datadog)
{ "service":"checkout", "errorRatePct":22, "p95LatencyMs":4800, "since":"2026-06-01T14:02:00Z" }

// salesforce.query accounts on Checkout service
{ "affectedAccounts":340, "tier1":12, "tier2":74, "tier3":254 }

// sap.read in-flight orders tied to Checkout
{ "inFlightOrders":880, "revenueAtRisk":480000, "currency":"USD" }

// statuspage create-incident (draft, pending approval)
{ "name":"Checkout degraded", "status":"investigating",
  "components":{ "checkout":"degraded_performance" },
  "body":"We are investigating elevated checkout errors and working on a fix." }
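Before a demo it can help to sanity-check the seed data and preview the kind of internal briefing it supports. The sketch below uses the figures from the mock records above; `seed_is_consistent` and `draft_internal_brief` are hypothetical helpers, and the fixed template is only a stand-in for the agent's own (non-deterministic) drafting.

```python
incident = {"number": "INC0099001", "priority": "1", "service": "Checkout"}
monitoring = {"service": "checkout", "errorRatePct": 22, "p95LatencyMs": 4800}
accounts = {"affectedAccounts": 340, "tier1": 12, "tier2": 74, "tier3": 254}
revenue = {"inFlightOrders": 880, "revenueAtRisk": 480000, "currency": "USD"}


def seed_is_consistent(acc: dict) -> bool:
    """The tier counts should add up to the total number of affected accounts."""
    return acc["tier1"] + acc["tier2"] + acc["tier3"] == acc["affectedAccounts"]


def draft_internal_brief(inc: dict, mon: dict, acc: dict, rev: dict) -> str:
    """Template briefing that cites the source tool next to each number."""
    return (
        f"P{inc['priority']} {inc['number']}: {inc['service']} degraded "
        f"({mon['errorRatePct']}% errors) [monitoring]. "
        f"{acc['affectedAccounts']} accounts affected ({acc['tier1']} Tier-1) "
        f"[salesforce.query]; {rev['inFlightOrders']} in-flight orders, "
        f"{rev['revenueAtRisk']:,} {rev['currency']} at risk [sap.read]."
    )


print(seed_is_consistent(accounts))  # True
print(draft_internal_brief(incident, monitoring, accounts, revenue))
```

Tagging each figure with its tool mirrors the prompt rule that every number must cite the tool and record it came from.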

11💬

Sample questions (conversational triggers)

The natural-language prompts this agent is built to answer. Each maps to the live scenarios below.

💥 "What's the business impact of INC0099001?" → Scenarios 1–5

👥 "Which customers are affected by the checkout outage?" → Scenario 3

💸 "How much revenue is at risk?" → Scenario 4

📣 "Update the status page." → Scenario 7 (approval)

✉️ "Draft customer comms for the Tier-1 accounts." → Scenario 8 (approval)

📝 "Give me the post-incident business summary." → Scenario 9

🎯 Recommended demo run

Seed the P1 incident + customer/revenue data, then ask "What's the business impact of INC0099001?" — the agent classifies (1), correlates the service (2), maps customers (3) and revenue (4), estimates impact (5), engages on-call (6), and drafts the status-page + customer comms for approval (7–8). After recovery, ask for the post-incident summary (9). Scenario 10 is a false-alarm de-escalation.

12🎬

Live scenarios (tool-execution traces)

ℹ️ Ten end-to-end runs. Each step is a read (SOURCE), reasoning, an Action Ticket with a MOP, or an agent message.