# Incident Response Runbook

**Classification:** Confidential
**Version:** 1.0

---

## Quick Reference

| Incident Type | First Step | Escalation |
|---------------|------------|------------|
| Compromised credentials | Rotate IAM keys | Security team |
| Data breach | Isolate S3 bucket | Legal + Security |
| DoS attack | Enable WAF | AWS Support |
| Malware in images | Quarantine bucket | Security team |
| KMS key compromised | Disable key, create new | AWS Support |

---

## 1. Security Alert Response

### 1.1 Lambda Error Alarm

**Trigger:** `lambda-errors > 5 in 5 minutes`

**Steps:**

1. Check CloudWatch Logs: `/aws/lambda/image-processor-proc`
2. Identify the error pattern (input validation, timeout, permissions)
3. If input validation failures dominate, treat them as a possible attack
4. If permissions errors dominate, check for recent IAM role changes
5. Document findings in the incident ticket

**Recovery:**

- Deploy a fix if the cause is code-related
- Update input validation if the cause is attack-related
- Notify users if the service was impacted

---

## 2. Data Breach Response

### 2.1 S3 Bucket Compromise

**Trigger:** GuardDuty finding, unusual access patterns

**Immediate Actions (0-15 min):**

```bash
# 1. Deny all object-level access to the affected bucket
#    (the Resource covers objects only, so the bucket-level call in step 2 still works)
aws s3api put-bucket-policy --bucket image-processor-ACCOUNT \
  --policy '{"Version":"2012-10-17","Statement":[{"Effect":"Deny","Principal":"*","Action":"s3:*","Resource":["arn:aws:s3:::image-processor-ACCOUNT/*"]}]}'

# 2. Enable S3 Object Lock so retention settings can protect objects from deletion
#    (requires versioning to be enabled on the bucket)
aws s3api put-object-lock-configuration --bucket image-processor-ACCOUNT \
  --object-lock-configuration '{"ObjectLockEnabled":"Enabled"}'

# 3. Capture access logs (--recursive is needed to copy a whole prefix)
aws s3 cp s3://image-processor-logs-ACCOUNT/s3-access-logs/ ./forensics/s3-logs/ --recursive
```

**Investigation (15-60 min):**

1. Review S3 access logs for unauthorized IPs
2. Check CloudTrail for API call anomalies
3. Identify compromised credentials
4. Scope data exposure (list affected objects)

**Containment (1-4 hours):**

1. Rotate all IAM credentials
2. Revoke suspicious sessions
3. Enable CloudTrail log file validation
4. Notify AWS Security

**Recovery (4-24 hours):**

1. Create a new bucket with a hardened policy
2. Restore from backup if needed
3. Re-enable services incrementally
4. Hold the post-incident review

---

## 3. KMS Key Compromise

**Trigger:** KMS key state alarm, unauthorized KeyUsage events

**Immediate Actions:**

```bash
# 1. Disable the key (prevents new encryption/decryption)
aws kms disable-key --key-id <KEY_ID>

# 2. Create new key
aws kms create-key --description "Emergency replacement key"

# 3. Update Lambda environment
aws lambda update-function-configuration \
  --function-name image-processor-proc \
  --environment "Variables={...,KMS_KEY_ID=<NEW_KEY_ID>}"
```

**Recovery:**

1. Re-encrypt all S3 objects with the new key
2. Update all references to the old key
3. Schedule the old key for deletion (30-day window)
4. Audit all KeyUsage CloudTrail events

---

## 4. DoS Attack Response

**Trigger:** Lambda throttles, CloudWatch spike

**Immediate Actions:**

```bash
# 1. Reduce Lambda concurrency to limit blast radius
aws lambda put-function-concurrency \
  --function-name image-processor-proc \
  --reserved-concurrent-executions 1

# 2. Enable S3 Requester Pays (deter attackers)
aws s3api put-bucket-request-payment \
  --bucket image-processor-ACCOUNT \
  --request-payment-configuration '{"Payer":"Requester"}'
```

**Mitigation:**

1. Enable AWS Shield (if escalated)
2. Add WAF rules on the CloudFront distribution in front of S3
3. Implement request rate limiting
4. Block suspicious IP ranges

---

## 5. Malware Detection

**Trigger:** GuardDuty S3 finding, unusual file patterns

**Immediate Actions:**

```bash
# 1. Quarantine affected objects
aws s3 cp s3://image-processor-ACCOUNT/uploads/suspicious.jpg \
  s3://image-processor-ACCOUNT/quarantine/suspicious.jpg

# 2. Remove from uploads
aws s3 rm s3://image-processor-ACCOUNT/uploads/suspicious.jpg

# 3. Tag for investigation
aws s3api put-object-tagging \
  --bucket image-processor-ACCOUNT \
  --key quarantine/suspicious.jpg \
  --tagging 'TagSet=[{Key=Status,Value=Quarantined},{Key=Date,Value=2026-02-22}]'
```

**Analysis:**

1. Download the quarantined file to an isolated environment
2. Scan with ClamAV or the VirusTotal API
3. Check file metadata for origin
4. Review the upload source IP in access logs

---

## 6. Credential Compromise

**Trigger:** CloudTrail unusual API calls, GuardDuty finding

**Immediate Actions:**

```bash
# 1. List all access keys for the affected user
aws iam list-access-keys --user-name <USER_NAME>

# 2. Deactivate compromised keys
aws iam update-access-key --user-name <USER_NAME> --access-key-id <ACCESS_KEY_ID> --status Inactive

# 3. Delete compromised keys
aws iam delete-access-key --user-name <USER_NAME> --access-key-id <ACCESS_KEY_ID>

# 4. Create new keys
aws iam create-access-key --user-name <USER_NAME>
```

**Recovery:**

1. Audit all API calls made with the compromised credentials
2. Check for unauthorized resource creation
3. Rotate all secrets that may have been exposed
4. Enable MFA if not already enabled

---

## 7. Forensics Data Collection

### 7.1 Preserve Evidence

```bash
mkdir -p forensics

# CloudTrail logs (last 24 hours)
# Note: lookup-events returns management events only; S3 data events such as
# GetObject have to be read from the trail's log files instead.
aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=GetObject \
  --start-time "$(date -d '24 hours ago' -Iseconds)" > forensics/cloudtrail.json

# CloudWatch Logs (--destination takes a bucket name, the prefix goes in --destination-prefix)
aws logs create-export-task --log-group-name /aws/lambda/image-processor-proc \
  --from "$(date -d '24 hours ago' +%s)000" --to "$(date +%s)000" \
  --destination forensics-bucket --destination-prefix logs

# S3 access logs
aws s3 cp s3://image-processor-logs-ACCOUNT/s3-access-logs/ ./forensics/s3-logs/ --recursive
```

### 7.2 Chain of Custody

Document:

- [ ] Time of incident detection
- [ ] Personnel involved
- [ ] Actions taken (with timestamps)
- [ ] Evidence collected (with hashes)
- [ ] Systems affected

---

## 8. Communication Templates

### 8.1 Internal Notification

```
SECURITY INCIDENT NOTIFICATION

Incident ID: INC-YYYY-XXXX
Severity: [Critical/High/Medium/Low]
Status: [Investigating/Contained/Resolved]

Summary: [Brief description]
Impact: [Systems/data affected]
Actions Taken: [List of containment steps]
Next Update: [Time]
Contact: [Incident commander]
```

### 8.2 External Notification (if required)

```
SECURITY ADVISORY

Date: [Date]
Affected Service: AWS Image Processing
Description: [Factual, non-technical summary]
Customer Action: [If customers need to take action]
Status: [Investigating/Resolved]
Contact: security@company.com
```

---

## 9. Post-Incident

### 9.1 Required Documentation

1. Incident timeline (minute-by-minute)
2. Root cause analysis
3. Impact assessment
4. Remediation actions
5. Lessons learned

### 9.2 Follow-up Actions

| Timeframe | Action |
|-----------|--------|
| 24 hours | Initial incident report |
| 72 hours | Root cause analysis |
| 1 week | Remediation complete |
| 2 weeks | Post-incident review |
| 30 days | Security control updates |

**Review Schedule:** This runbook must be tested quarterly via tabletop exercise and updated after each incident.
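
The chain-of-custody checklist in section 7.2 asks for evidence to be recorded with hashes. One way this could be done is sketched below, assuming GNU coreutils (`sha256sum`, consistent with the `date -d` usage in section 7.1). The function names `hash_evidence` and `verify_evidence` and the manifest path are illustrative assumptions, not an established part of this runbook.

```bash
# Sketch: SHA-256 manifest for collected evidence (assumes GNU coreutils/findutils).
# Keep the manifest OUTSIDE the evidence directory so it is not hashed itself.

# hash_evidence DIR MANIFEST
# Writes one "<sha256>  ./relative/path" line per file under DIR to MANIFEST.
hash_evidence() {
  dir="$1"
  manifest="$2"
  # The redirection is resolved before the subshell changes directory,
  # so a relative MANIFEST path refers to the caller's working directory.
  ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# verify_evidence DIR MANIFEST
# Re-hashes the files listed in MANIFEST; returns non-zero if any file changed.
verify_evidence() {
  dir="$1"
  manifest="$2"
  # With no file argument, `sha256sum --check` reads the manifest from stdin.
  ( cd "$dir" && sha256sum --check --quiet ) < "$manifest"
}

# Example usage (paths are illustrative):
#   hash_evidence ./forensics ./forensics-manifest.sha256
#   verify_evidence ./forensics ./forensics-manifest.sha256
```

Recording the manifest's own hash in the incident ticket, alongside the collection timestamp, gives later reviewers a single value to check before trusting the per-file entries.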