When Encryption Breaks Your Slack Notifications: A Tale of KMS, SNS, and AWS Chatbot
It started innocently enough - a Trivy security scan flagged 9 high-severity vulnerabilities in our Terraform configuration. The issue? Unencrypted SNS topics. The fix seemed straightforward: add a KMS key, encrypt the topics, deploy to dev for validation. What could go wrong?
The Setup
# Before: Unencrypted SNS topic
resource "aws_sns_topic" "application_alarms" {
  name = "application-alarms-${var.environment}"
  # No encryption - Trivy security vulnerability
}

# After: Encrypted SNS topic
resource "aws_sns_topic" "application_alarms" {
  name              = "application-alarms-${var.environment}"
  kms_master_key_id = module.sns_kms.key_arn # ✅ Encrypted!
}
The PR passed code review, tests passed, and we deployed to the development environment. The Trivy scan went green. Victory!
But before promoting to production, I wanted to validate the change properly. Good thing I did - after waiting a day to observe the dev environment, I noticed something concerning: no CloudWatch alarm notifications were appearing in our Slack channel. The infrastructure looked fine, but the silence was suspicious.
The Investigation
Phase 1: Everything Looks Fine
Initial checks showed no obvious issues:
- ✅ CloudWatch alarms were triggering
- ✅ SNS topics existed and were properly configured
- ✅ AWS Chatbot was connected to our Slack workspace
- ✅ SNS subscriptions were active
But notifications weren’t reaching Slack. Time to dig deeper.
Phase 2: The SNS Metrics
Checking SNS metrics revealed something interesting:
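# --start-time, --end-time and --period are required; use ISO 8601 timestamps
aws cloudwatch get-metric-statistics \
  --namespace AWS/SNS \
  --metric-name NumberOfNotificationsDelivered \
  --dimensions Name=TopicName,Value=application-alarms-dev \
  --start-time <start-time> \
  --end-time <end-time> \
  --period 300 \
  --statistics Sum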
Result: 0 deliveries when CloudWatch alarms triggered. But when we manually published a test message:
aws sns publish \
  --topic-arn arn:aws:sns:us-west-2:123456789:application-alarms-dev \
  --message "Test message"
Result: Message delivered successfully! SNS metrics showed 1 successful delivery.
So SNS could deliver messages, but CloudWatch alarms couldn’t reach SNS. The plot thickens.
Phase 3: CloudWatch Logs Tell the Truth
We enabled CloudWatch Logs for AWS Chatbot and published another manual test message:
{
  "message": "Event received is not supported",
  "eventType": "CloudWatchAlarm"
}
So Chatbot was receiving the message but rejecting it. That is plausible for a hand-typed test string, since it isn’t in one of the event formats Chatbot understands. Next, we triggered the actual CloudWatch alarm:
Result: No logs at all. The messages never reached Chatbot.
This narrowed it down: CloudWatch couldn’t publish to the encrypted SNS topic.
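The elimination process across the three phases can be written down as a small decision function. This is only a sketch of the reasoning in this post, not an AWS tool: the `Observations` fields and the hop names are made up for illustration, and the inputs are things you check by hand (SNS delivery metrics and Chatbot logs).

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """What was seen in dev. Field names are illustrative, not an AWS API."""
    manual_publish_delivered: bool  # `aws sns publish` showed up in SNS delivery metrics
    alarm_publish_delivered: bool   # an alarm transition showed up in SNS delivery metrics
    chatbot_logged_event: bool      # Chatbot wrote any log entry for the alarm


def likely_break(obs: Observations) -> str:
    """Return the hop most likely at fault, following the post's reasoning."""
    if obs.alarm_publish_delivered and obs.chatbot_logged_event:
        # The alarm reached Chatbot, so look at the Slack side or message format.
        return "chatbot-to-slack"
    if obs.alarm_publish_delivered:
        # SNS delivered something, but Chatbot never logged it.
        return "sns-to-chatbot"
    if obs.manual_publish_delivered:
        # The topic delivers when we publish by hand, so the alarm is failing to publish.
        return "cloudwatch-to-sns"
    return "sns-topic-or-subscription"


# Our situation: manual publish worked, alarm deliveries were 0, no Chatbot logs.
print(likely_break(Observations(True, False, False)))  # prints "cloudwatch-to-sns"
```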
The Root Cause: Three Missing Permissions
The issue wasn’t just one missing permission - it was three separate problems:
Problem 1: CloudWatch Can’t Publish to Encrypted SNS
When you encrypt an SNS topic with KMS, CloudWatch Alarms needs explicit permission to use that key. This is documented, but easy to miss:
# What we had (wrong):
services = [{
  name    = "sns.amazonaws.com"
  actions = ["kms:Decrypt", "kms:GenerateDataKey"]
}]

# What we needed:
services = [
  {
    name    = "sns.amazonaws.com"
    actions = ["kms:Decrypt", "kms:GenerateDataKey"]
  },
  {
    name    = "cloudwatch.amazonaws.com" # ← Missing!
    actions = ["kms:Decrypt", "kms:GenerateDataKey"]
  }
]
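Why? SNS performs the encryption server-side, but whoever publishes to an encrypted topic must be allowed to use the key. When the publisher is CloudWatch Alarms, the key policy has to let cloudwatch.amazonaws.com generate data keys and decrypt. Without that, the publish fails and the alarm never reaches the topic.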
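Our KMS module is internal, so its `services` input hides what ends up in the key policy. As a rough illustration, here is a Python sketch of the raw key policy statement that one `services` entry presumably turns into. The function name and the `Sid` format are my own choices, not something the module is known to produce.

```python
import json


def service_key_statement(service: str, actions: list[str]) -> dict:
    """Build one KMS key policy statement for an AWS service principal."""
    return {
        # Hypothetical Sid naming, e.g. "AllowCloudwatch".
        "Sid": "Allow" + service.split(".")[0].capitalize(),
        "Effect": "Allow",
        "Principal": {"Service": service},
        "Action": actions,
        # Inside a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
    }


statement = service_key_statement(
    "cloudwatch.amazonaws.com", ["kms:Decrypt", "kms:GenerateDataKey"]
)
print(json.dumps(statement, indent=2))
```

If a statement like this for cloudwatch.amazonaws.com is absent from the output of `aws kms get-key-policy`, alarm actions targeting the encrypted topic are the first thing to suspect.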
Problem 2: AWS Chatbot Uses Two Different Roles
This one caught us off guard. AWS Chatbot actually uses two separate IAM roles:
Channel Role - Configured in the Chatbot console
- Used for: Querying CloudWatch, describing resources
- Example: aws-chatbot-notifications-{env}
Service-Linked Role - Auto-created by AWS
- Used for: SNS subscription and message decryption
- Always: AWSServiceRoleForAWSChatbot
We had granted KMS permissions to the channel role, but SNS subscriptions use the service-linked role!
# Check who's actually subscribed:
aws sns list-subscriptions-by-topic \
  --topic-arn arn:aws:sns:us-west-2:123456789:application-alarms-dev
# Result:
{
  "SubscriptionArn": "...",
  "Principal": "arn:aws:iam::123456789:role/aws-service-role/management.chatbot.amazonaws.com/AWSServiceRoleForAWSChatbot"
}
Not the role we granted permissions to!
Problem 3: Over-Privileged Permissions
While fixing the first two issues, we noticed we’d granted kms:GenerateDataKey to both Chatbot roles. But Chatbot only decrypts messages - it never encrypts anything. This violates the principle of least privilege.
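A least-privilege check like this is easy to automate against the key policy JSON. The sketch below is a simplified illustration, not a full IAM evaluator: it only looks at Allow statements that name the role ARN explicitly under `Principal.AWS`, and it ignores wildcards and conditions. The function name and the example ARN are hypothetical.

```python
def _as_list(value):
    """IAM JSON allows a single string or a list; normalise to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def excess_actions(policy: dict, principal_arn: str, needed: set[str]) -> list[str]:
    """Return actions allowed to principal_arn that are not in `needed`."""
    extra = set()
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if not isinstance(principal, dict):
            continue  # e.g. "*"; out of scope for this sketch
        if principal_arn not in _as_list(principal.get("AWS")):
            continue
        extra.update(a for a in _as_list(stmt.get("Action")) if a not in needed)
    return sorted(extra)


# Example policy resembling our first attempt (ARN is a placeholder).
role = "arn:aws:iam::111122223333:role/aws-chatbot-notifications-dev"
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": [role]},
        "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
        "Resource": "*",
    }],
}
print(excess_actions(policy, role, {"kms:Decrypt", "kms:DescribeKey"}))
# prints ['kms:GenerateDataKey']
```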
The Solution
Step 1: Create the KMS Key Policy
module "sns_kms" {
  source      = "./modules/kms"
  alias_name  = "/alias/${var.project}/sns/${var.environment}"
  description = "KMS Key used to encrypt/decrypt SNS topics"

  # Service principals that can use this key
  services = [
    {
      # SNS needs to encrypt messages at rest
      name    = "sns.amazonaws.com"
      actions = ["kms:Decrypt", "kms:GenerateDataKey"]
    },
    {
      # CloudWatch needs key access to publish alarm messages
      name    = "cloudwatch.amazonaws.com"
      actions = ["kms:Decrypt", "kms:GenerateDataKey"]
    }
  ]

  # AWS principals (IAM roles) that can use this key
  additional_principals = [{
    type = "AWS"
    identifiers = [
      # Chatbot channel role (for CloudWatch queries)
      aws_iam_role.chatbot_notifications.arn,
      # Chatbot service-linked role (for SNS subscriptions)
      "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/aws-service-role/management.chatbot.amazonaws.com/AWSServiceRoleForAWSChatbot"
    ]
    actions = [
      "kms:Decrypt",    # Required
      "kms:DescribeKey" # Optional but useful
      # NOT kms:GenerateDataKey - Chatbot doesn't encrypt!
    ]
  }]
}
Step 2: Create the Chatbot IAM Role in Terraform
Previously, we’d created this manually in the console. Time to codify it:
resource "aws_iam_role" "chatbot_notifications" {
  name               = "aws-chatbot-notifications-${var.environment}"
  assume_role_policy = data.aws_iam_policy_document.chatbot_assume_role.json
}

data "aws_iam_policy_document" "chatbot_assume_role" {
  statement {
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["chatbot.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
}

# CloudWatch read-only access
# (the chatbot_cloudwatch_readonly policy document is not shown here)
resource "aws_iam_role_policy" "chatbot_cloudwatch_readonly" {
  name   = "CloudWatchReadOnlyAccess"
  role   = aws_iam_role.chatbot_notifications.id
  policy = data.aws_iam_policy_document.chatbot_cloudwatch_readonly.json
}

# KMS decrypt for SNS messages
resource "aws_iam_role_policy" "chatbot_kms_decrypt" {
  name   = "SNSKMSDecryptAccess"
  role   = aws_iam_role.chatbot_notifications.id
  policy = data.aws_iam_policy_document.chatbot_kms_decrypt.json
}

data "aws_iam_policy_document" "chatbot_kms_decrypt" {
  statement {
    sid    = "AllowDecryptSNSMessages"
    effect = "Allow"
    actions = [
      "kms:Decrypt",
      "kms:DescribeKey"
    ]
    resources = [module.sns_kms.key_arn]
  }
}
Step 3: Manual Configuration (The Catch)
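Here’s where it gets frustrating - our AWS Chatbot channel configuration wasn’t managed by Terraform, so this part was manual. (Newer releases of the AWS provider ship an aws_chatbot_slack_channel_configuration resource, so check your provider version before assuming the same; authorizing the Slack workspace itself is still a console step.) We updated the configuration in the console: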
1. Navigate to AWS Chatbot → Slack → Your Workspace
2. Click on your channel configuration
3. Update the IAM Role to use the Terraform-managed role
4. Critical: Make sure the AWS Chatbot Slack app is installed in your workspace
5. Critical: Add the @AWS Chatbot bot to your Slack channel
Missing step 4 or 5? Silent failures. No errors, no logs, just… nothing.
The Message Flow
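When everything is configured correctly, an alarm takes this path:
CloudWatch alarm → SNS topic (encrypted with the KMS key) → AWS Chatbot → Slack channel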
Testing and Verification
After deploying the fix:
# 1. Verify KMS policy includes all principals
aws kms get-key-policy \
  --key-id <key-id> \
  --policy-name default

# 2. Trigger a test alarm
aws cloudwatch set-alarm-state \
  --alarm-name "your-alarm-name" \
  --state-value ALARM \
  --state-reason "Testing encryption fix"

# 3. Check Chatbot logs for processing
aws logs tail /aws/chatbot/your-config-name --follow

# 4. Verify Slack notification received
Success criteria:
- CloudWatch Logs show: “Sending message to Slack”
- SNS metrics show successful delivery
- Slack channel receives the notification
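Eyeballing the key policy in step 1 is error-prone, so that check can be scripted. Here is a minimal Python sketch that reads the JSON printed by `aws kms get-key-policy` and reports which expected principals are missing. The function names, the file name in the usage note, and the example ARNs are assumptions for illustration; it only checks that a principal is named in an Allow statement, not which actions it gets.

```python
import json


def principals_in_key_policy(cli_output: str) -> set[str]:
    """Collect every Service and AWS principal named in Allow statements.

    `cli_output` is the JSON printed by `aws kms get-key-policy`; its
    "Policy" field is itself a JSON-encoded string, hence the double parse.
    """
    policy = json.loads(json.loads(cli_output)["Policy"])
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    found = set()
    for stmt in statements:
        principal = stmt.get("Principal")
        if stmt.get("Effect") != "Allow" or not isinstance(principal, dict):
            continue
        for kind in ("Service", "AWS"):
            value = principal.get(kind, [])
            found.update([value] if isinstance(value, str) else value)
    return found


def missing_principals(cli_output: str, expected: set[str]) -> set[str]:
    """Return the expected principals that the key policy never mentions."""
    return expected - principals_in_key_policy(cli_output)


# Stand-in for real CLI output: a policy that forgot CloudWatch.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "sns.amazonaws.com"},
        "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
        "Resource": "*",
    }],
}
sample_output = json.dumps({"Policy": json.dumps(sample_policy)})
expected = {"sns.amazonaws.com", "cloudwatch.amazonaws.com"}
print(missing_principals(sample_output, expected))  # prints {'cloudwatch.amazonaws.com'}
```

In practice you would redirect the output of the step 1 command to a file (say, key-policy.json), read it with `open(...).read()`, and add both Chatbot role ARNs to `expected`.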
Key Takeaways
Encrypted SNS requires THREE sets of key permissions:
- sns.amazonaws.com - to encrypt messages at rest
- cloudwatch.amazonaws.com - to publish encrypted alarms
- Both Chatbot roles - to decrypt messages
AWS Chatbot uses two different roles:
- Channel role (configured in console)
- Service-linked role (used by SNS subscriptions)
- Both need KMS decrypt permissions
Least privilege matters:
- Chatbot only needs kms:Decrypt
- Not kms:GenerateDataKey (it doesn’t encrypt)
- Over-privileging increases attack surface
Manual steps are unavoidable:
- Chatbot configurations not in Terraform
- Slack app must be installed
- Bot must be added to channels
- Document these steps for your team
Test thoroughly in non-production:
- Manual SNS publish ≠ CloudWatch alarm
- Different code paths, different permissions
- Always test with actual triggers before promoting
- Wait to observe behavior, don’t rush to prod
CloudWatch Logs are your friend:
- Enable logging for Chatbot in dev
- Reveals message format issues early
- Shows actual errors (not just “no notifications”)
- Critical for debugging encryption issues
The Aftermath
After deploying this fix to development and thoroughly validating:
- ✅ Slack notifications working correctly
- ✅ Security compliance achieved (encrypted SNS)
- ✅ Infrastructure as code (Chatbot IAM role)
- ✅ Least privilege permissions (removed unnecessary GenerateDataKey)
- ✅ Validated in dev before production deployment
For Future Reference
If you’re adding KMS encryption to SNS topics used with AWS Chatbot:
Checklist:
- Add cloudwatch.amazonaws.com to KMS policy
- Add sns.amazonaws.com to KMS policy
- Add Chatbot channel role to KMS policy
- Add AWSServiceRoleForAWSChatbot to KMS policy
- Grant only kms:Decrypt to Chatbot roles
- Create Chatbot IAM role in Terraform
- Update Chatbot console config to use new role
- Verify Slack app installed in workspace
- Verify bot added to Slack channels
- Test with actual CloudWatch alarm
- Check CloudWatch Logs for Chatbot
- Verify SNS metrics show delivery
- Confirm Slack notifications received
Wrapping Up
This experience reinforced that adding encryption isn’t just about flipping a switch - it’s about understanding the entire message flow and all the services involved. AWS Chatbot’s dual-role architecture is a particular gotcha that isn’t well documented.
If you’re managing Slack notifications via AWS Chatbot and planning to encrypt your SNS topics, hopefully this post saves you the debugging time. Always test with actual CloudWatch alarms in a non-production environment, not just manual SNS publishes.