tf-aws-lambda-imageprocessing/INCIDENT_RESPONSE.md
2026-02-22 05:37:03 +00:00

Incident Response Runbook

Classification: Confidential
Version: 1.0


Quick Reference

Incident Type            First Step               Escalation
Compromised credentials  Rotate IAM keys          Security team
Data breach              Isolate S3 bucket        Legal + Security
DoS attack               Enable WAF               AWS Support
Malware in images        Quarantine bucket        Security team
KMS key compromised      Disable key, create new  AWS Support

1. Security Alert Response

1.1 Lambda Error Alarm

Trigger: lambda-errors > 5 in 5 minutes

Steps:

  1. Check CloudWatch Logs: /aws/lambda/image-processor-proc
  2. Identify error pattern (input validation, timeout, permissions)
  3. If input validation failures: possible attack vector
  4. If permissions errors: check IAM role changes
  5. Document findings in incident ticket
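
To speed up step 2, log lines can be bucketed into the three patterns named above. The following is a minimal sketch, not the project's tooling: "Task timed out" and "AccessDenied" are standard AWS strings, while the `ValidationError` and `Invalid input` patterns and both function names are assumptions about how this function logs.

```shell
#!/bin/sh
# Classify one Lambda log line into the runbook's error categories.
# The input-validation patterns are assumed; adjust them to the real log format.
classify_error() {
  case "$1" in
    *"Task timed out"*)                   echo timeout ;;
    *AccessDenied*|*"not authorized"*)    echo permissions ;;
    *ValidationError*|*"Invalid input"*)  echo input-validation ;;
    *)                                    echo other ;;
  esac
}

# Count categories across log lines read from stdin, most frequent first.
summarize_errors() {
  while IFS= read -r line; do
    classify_error "$line"
  done | sort | uniq -c | sort -rn
}
```

Log lines saved from the /aws/lambda/image-processor-proc log group can be piped into `summarize_errors`; a sudden dominance of one category points to which branch of steps 3 and 4 applies.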

Recovery:

  • Deploy fix if code-related
  • Update input validation if attack-related
  • Notify users if service impacted

2. Data Breach Response

2.1 S3 Bucket Compromise

Trigger: GuardDuty finding, unusual access patterns

Immediate Actions (0-15 min):

# 1. Block all access to objects in the affected bucket (this replaces any existing bucket policy)
aws s3api put-bucket-policy --bucket image-processor-ACCOUNT \
  --policy '{"Version":"2012-10-17","Statement":[{"Effect":"Deny","Principal":"*","Action":"s3:*","Resource":["arn:aws:s3:::image-processor-ACCOUNT/*"]}]}'

# 2. Enable S3 Object Lock (requires versioning; objects are protected only once a retention period or legal hold is set)
aws s3api put-object-lock-configuration --bucket image-processor-ACCOUNT \
  --object-lock-configuration '{"ObjectLockEnabled":"Enabled"}'

# 3. Capture access logs
aws s3 cp s3://image-processor-logs-ACCOUNT/s3-access-logs/ ./forensics/s3-logs/ --recursive

Investigation (15-60 min):

  1. Review S3 access logs for unauthorized IPs
  2. Check CloudTrail for API call anomalies
  3. Identify compromised credentials
  4. Scope data exposure (list affected objects)
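
For step 1, the downloaded access logs can be summarized per source IP. This is a rough sketch under two assumptions: the logs use the standard space-delimited S3 server access log format (where the timestamp spans fields 3 and 4, making the remote IP field 5), and a file of known-good IPs exists. The function names and the allowlist file are hypothetical.

```shell
#!/bin/sh
# Count requests per remote IP in S3 server access logs read from stdin.
# Output: "<count> <ip>", busiest IP first.
requests_by_ip() {
  awk '{ count[$5]++ } END { for (ip in count) print count[ip], ip }' | sort -rn
}

# Print IPs seen in the logs (stdin) that are not listed in the allowlist
# file given as $1 (one IP per line). The allowlist is a hypothetical input.
unknown_ips() {
  awk '{ print $5 }' | sort -u | grep -v -x -F -f "$1"
}
```

For example, `cat ./forensics/s3-logs/* | requests_by_ip | head` lists the busiest sources, and `unknown_ips` narrows the list to addresses nobody has vouched for.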

Containment (1-4 hours):

  1. Rotate all IAM credentials
  2. Revoke suspicious sessions
  3. Enable CloudTrail log file validation
  4. Notify AWS Security

Recovery (4-24 hours):

  1. Create new bucket with hardened policy
  2. Restore from backup if needed
  3. Re-enable services incrementally
  4. Post-incident review

3. KMS Key Compromise

Trigger: KMS key state alarm, unauthorized KeyUsage events

Immediate Actions:

# 1. Disable the key (blocks all cryptographic operations, including decryption of existing data)
aws kms disable-key --key-id <key-id>

# 2. Create new key
aws kms create-key --description "Emergency replacement key"

# 3. Update Lambda environment
aws lambda update-function-configuration \
  --function-name image-processor-proc \
  --environment "Variables={...,KMS_KEY_ID=<new-key-id>}"

Recovery:

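  1. Re-encrypt all S3 objects with new key (the old key must be temporarily re-enabled, since objects cannot be decrypted while it is disabled)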
  2. Update all references to old key
  3. Schedule old key for deletion (30-day window)
  4. Audit all KeyUsage CloudTrail events
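
Step 1 can be approached with an in-place copy that rewrites each object under the new key. The sketch below only prints the command so it can be reviewed before running; the function name is hypothetical, and it assumes the objects use SSE-KMS and that the old key is enabled for the duration of the copy.

```shell
#!/bin/sh
# Print (do not run) the in-place copy that re-encrypts every object under
# a prefix with a new KMS key. Arguments: bucket, prefix, new key ID.
reencrypt_command() {
  bucket=$1 prefix=$2 key_id=$3
  printf 'aws s3 cp s3://%s/%s s3://%s/%s --recursive --sse aws:kms --sse-kms-key-id %s\n' \
    "$bucket" "$prefix" "$bucket" "$prefix" "$key_id"
}
```

Running `reencrypt_command image-processor-ACCOUNT uploads/ <new-key-id>` prints the command for one prefix. Printing first keeps a record of exactly what was executed for the incident timeline.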

4. DoS Attack Response

Trigger: Lambda throttles, CloudWatch spike

Immediate Actions:

# 1. Reduce Lambda concurrency to limit blast radius
aws lambda put-function-concurrency \
  --function-name image-processor-proc \
  --reserved-concurrent-executions 1

# 2. Enable S3 Requester Pays (deter attackers)
aws s3api put-bucket-request-payment \
  --bucket image-processor-ACCOUNT \
  --request-payment-configuration '{"Payer":"Requester"}'

Mitigation:

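  1. Enable AWS Shield Advanced (if escalated); Shield Standard is already active by default
  2. Add WAF rules on the CloudFront distribution in front of S3 (WAF cannot attach to S3 directly)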
  3. Implement request rate limiting
  4. Block suspicious IP ranges
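
One way to carry out step 4 at the bucket level is a deny statement keyed on `aws:SourceIp`. The sketch below only generates the policy JSON; the function name and the `DenySuspiciousIPs` Sid are assumptions, and because `put-bucket-policy` replaces the whole policy, the statement would need to be merged into whatever policy the bucket already has.

```shell
#!/bin/sh
# Print a bucket policy that denies all S3 actions from the given CIDR ranges.
# Arguments: bucket name, then one or more CIDRs.
deny_ip_policy() {
  bucket=$1
  shift
  # Build a comma-separated list of quoted CIDRs, then drop the trailing comma.
  cidrs=$(printf '"%s",' "$@")
  cidrs=${cidrs%,}
  printf '{"Version":"2012-10-17","Statement":[{"Sid":"DenySuspiciousIPs","Effect":"Deny","Principal":"*","Action":"s3:*","Resource":["arn:aws:s3:::%s","arn:aws:s3:::%s/*"],"Condition":{"IpAddress":{"aws:SourceIp":[%s]}}}]}\n' \
    "$bucket" "$bucket" "$cidrs"
}
```

For example, `deny_ip_policy image-processor-ACCOUNT 203.0.113.0/24 > deny.json` produces a file that can be reviewed and then passed to `aws s3api put-bucket-policy --policy file://deny.json`.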

5. Malware Detection

Trigger: GuardDuty S3 finding, unusual file patterns

Immediate Actions:

# 1. Quarantine affected objects
aws s3 cp s3://image-processor-ACCOUNT/uploads/suspicious.jpg \
  s3://image-processor-ACCOUNT/quarantine/suspicious.jpg

# 2. Remove from uploads
aws s3 rm s3://image-processor-ACCOUNT/uploads/suspicious.jpg

# 3. Tag for investigation
aws s3api put-object-tagging \
  --bucket image-processor-ACCOUNT \
  --key quarantine/suspicious.jpg \
  --tagging "TagSet=[{Key=Status,Value=Quarantined},{Key=Date,Value=$(date -u +%F)}]"

Analysis:

  1. Download the quarantined file to an isolated environment
  2. Scan with ClamAV or VirusTotal API
  3. Check file metadata for origin
  4. Review upload source IP in access logs
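
As a quick first check alongside step 3, a file's leading bytes can be compared with its claimed type. This is a minimal sketch for the JPEG case only (magic bytes FF D8 FF); the function name is hypothetical, and a match does not prove the file is safe, so it supplements the scan in step 2 rather than replacing it.

```shell
#!/bin/sh
# Succeed if the file starts with the JPEG magic bytes FF D8 FF.
is_jpeg() {
  # od prints the first 3 bytes as hex; tr strips spaces and the newline.
  magic=$(od -An -tx1 -N 3 "$1" | tr -d ' \n')
  [ "$magic" = "ffd8ff" ]
}
```

In the isolated environment, `is_jpeg suspicious.jpg || echo "extension does not match content"` flags files whose `.jpg` name hides another format.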

6. Credential Compromise

Trigger: CloudTrail unusual API calls, GuardDuty finding

Immediate Actions:

# 1. List all access keys for affected user
aws iam list-access-keys --user-name <username>

# 2. Deactivate compromised keys (without --user-name, IAM acts on the calling user's own keys)
aws iam update-access-key --user-name <username> --access-key-id <key-id> --status Inactive

# 3. Delete compromised keys
aws iam delete-access-key --user-name <username> --access-key-id <key-id>

# 4. Create new keys
aws iam create-access-key --user-name <username>

Recovery:

  1. Audit all API calls made with compromised credentials
  2. Check for unauthorized resource creation
  3. Rotate all secrets that may have been exposed
  4. Enable MFA if not already enabled

7. Forensics Data Collection

7.1 Preserve Evidence

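# CloudTrail management events for S3 (last 24 hours)
# Note: lookup-events returns management events only; data events such as GetObject
# must be read from the trail's log bucket, if data event logging is enabled
aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventSource,AttributeValue=s3.amazonaws.com \
  --start-time $(date -d '24 hours ago' -Iseconds) > forensics/cloudtrail.json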

# CloudWatch Logs
aws logs create-export-task --log-group-name /aws/lambda/image-processor-proc \
  --from $(date -d '24 hours ago' +%s)000 --to $(date +%s)000 \
  --destination forensics-bucket --destination-prefix logs

# S3 access logs
aws s3 cp s3://image-processor-logs-ACCOUNT/s3-access-logs/ ./forensics/s3-logs/ --recursive

7.2 Chain of Custody

Document:

  • Time of incident detection
  • Personnel involved
  • Actions taken (with timestamps)
  • Evidence collected (with hashes)
  • Systems affected
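
The "evidence collected (with hashes)" item can be produced mechanically. Below is a small sketch, assuming GNU `sha256sum` is available; the function name and the manifest layout (collection time, SHA-256, path) are illustrative choices, not a format the runbook prescribes.

```shell
#!/bin/sh
# Write one manifest line per evidence file: UTC collection time, SHA-256, path.
# Arguments: evidence directory, manifest path (keep it outside the directory).
write_manifest() {
  dir=$1 manifest=$2
  collected=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  find "$dir" -type f | sort | while IFS= read -r f; do
    hash=$(sha256sum "$f" | cut -d ' ' -f 1)
    printf '%s  %s  %s\n' "$collected" "$hash" "$f"
  done > "$manifest"
}
```

For example, `write_manifest ./forensics ./forensics-manifest.txt` after collection gives a record that can be re-checked later to show the evidence was not altered.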

8. Communication Templates

8.1 Internal Notification

SECURITY INCIDENT NOTIFICATION

Incident ID: INC-YYYY-XXXX
Severity: [Critical/High/Medium/Low]
Status: [Investigating/Contained/Resolved]

Summary: [Brief description]

Impact: [Systems/data affected]

Actions Taken: [List of containment steps]

Next Update: [Time]

Contact: [Incident commander]

8.2 External Notification (if required)

SECURITY ADVISORY

Date: [Date]
Affected Service: AWS Image Processing

Description: [Factual, non-technical summary]

Customer Action: [If customers need to take action]

Status: [Investigating/Resolved]

Contact: security@company.com

9. Post-Incident

9.1 Required Documentation

  1. Incident timeline (minute-by-minute)
  2. Root cause analysis
  3. Impact assessment
  4. Remediation actions
  5. Lessons learned

9.2 Follow-up Actions

Timeframe  Action
24 hours   Initial incident report
72 hours   Root cause analysis
1 week     Remediation complete
2 weeks    Post-incident review
30 days    Security control updates
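
The timeframes above can be turned into calendar deadlines for the incident ticket. This is a sketch assuming GNU `date` (which the forensics commands in section 7.1 also rely on) and counting whole days from midnight UTC on the detection date; the function name is hypothetical.

```shell
#!/bin/sh
# Print the follow-up deadlines for an incident detected on the given UTC date.
# Argument: detection date as YYYY-MM-DD.
followup_deadlines() {
  start=$(date -u -d "$1" +%s)
  for item in "1:Initial incident report" "3:Root cause analysis" \
              "7:Remediation complete" "14:Post-incident review" \
              "30:Security control updates"; do
    days=${item%%:*}     # number before the colon
    action=${item#*:}    # text after the colon
    printf '%s  %s\n' "$(date -u -d "@$((start + days * 86400))" +%F)" "$action"
  done
}
```

Calling `followup_deadlines 2026-02-22` prints one dated line per follow-up action, ready to paste into the ticket.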

Review Schedule: This runbook must be tested quarterly via tabletop exercise and updated after each incident.