AWS S3 via SQS
End-to-end guide for ingesting AWS logs from S3 buckets using SQS notifications
This guide walks through ingesting AWS logs that are delivered to S3 buckets — CloudTrail, VPC Flow Logs, GuardDuty findings, ALB/NLB access logs, WAF logs, and any other AWS service that writes JSON or gzipped logs to S3.
nano uses the S3 → SQS notification pattern: when new objects land in your S3 bucket, S3 sends a notification to an SQS queue, and nano's Vector pipeline polls that queue, downloads the objects, and processes them.
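Each SQS message carries a standard S3 event notification; the consumer reads the bucket name and object key out of it before downloading the object. As an illustration, here is a minimal `jq` extraction against a trimmed sample notification (the values are made up, and note that real notifications URL-encode the object key):

```shell
# A trimmed example of the S3 event notification an SQS message carries
cat > /tmp/s3-event.json << 'EOF'
{
  "Records": [
    {
      "eventName": "ObjectCreated:Put",
      "s3": {
        "bucket": {"name": "my-log-bucket"},
        "object": {"key": "AWSLogs/123456789012/CloudTrail/us-east-1/log.json.gz", "size": 1024}
      }
    }
  ]
}
EOF

# Extract the bucket and key — the two fields a consumer needs to fetch the object
jq -r '.Records[].s3 | "\(.bucket.name)/\(.object.key)"' /tmp/s3-event.json
```

This prints `my-log-bucket/AWSLogs/123456789012/CloudTrail/us-east-1/log.json.gz` — the object to download and process.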
Prerequisites
- An AWS account with permissions to create IAM users/roles, SQS queues, and S3 bucket notifications
- AWS CLI installed and configured (`aws configure`)
- A running nano instance
Step 1: Create the SQS Queue
Create an SQS queue that will receive S3 object notifications. Use a standard queue (not FIFO — S3 notifications don't support FIFO queues).
```bash
# Create the queue
aws sqs create-queue \
  --queue-name nanosiem-cloudtrail-logs \
  --attributes '{
    "MessageRetentionPeriod": "86400",
    "VisibilityTimeout": "300"
  }'
```

Save the queue URL and ARN — you'll need both:
```bash
# Get queue URL
QUEUE_URL=$(aws sqs get-queue-url \
  --queue-name nanosiem-cloudtrail-logs \
  --query 'QueueUrl' --output text)
echo "Queue URL: $QUEUE_URL"

# Get queue ARN
QUEUE_ARN=$(aws sqs get-queue-attributes \
  --queue-url "$QUEUE_URL" \
  --attribute-names QueueArn \
  --query 'Attributes.QueueArn' --output text)
echo "Queue ARN: $QUEUE_ARN"
```

Naming convention: Use a descriptive queue name like `nanosiem-cloudtrail-logs` or `nanosiem-vpc-flowlogs`. You'll create one queue per log type (or one shared queue if you prefer, with separate nano log sources filtering by S3 prefix).
Set the SQS Queue Policy
S3 needs permission to send messages to your SQS queue. Without this policy, S3 notifications will silently fail.
Replace ACCOUNT_ID, BUCKET_NAME, and the queue ARN with your values:
```bash
cat > /tmp/sqs-policy.json << 'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3Notifications",
      "Effect": "Allow",
      "Principal": {
        "Service": "s3.amazonaws.com"
      },
      "Action": "sqs:SendMessage",
      "Resource": "arn:aws:sqs:us-east-1:ACCOUNT_ID:nanosiem-cloudtrail-logs",
      "Condition": {
        "ArnLike": {
          "aws:SourceArn": "arn:aws:s3:::BUCKET_NAME"
        }
      }
    }
  ]
}
EOF

# Apply the policy (edit the file first to replace placeholders).
# jq -n embeds the policy document as a properly escaped JSON string —
# pasting the raw policy inside the attributes JSON would break on the nested quotes.
aws sqs set-queue-attributes \
  --queue-url "$QUEUE_URL" \
  --attributes "$(jq -n --arg p "$(jq -c . < /tmp/sqs-policy.json)" '{Policy: $p}')"
```

Common gotcha: If you skip the queue policy, S3 bucket notifications will appear to configure successfully, but no messages will arrive in the queue. Always set this policy before configuring S3 notifications.
Step 2: Configure S3 Bucket Notifications
Tell your S3 bucket to send notifications to the SQS queue when new objects are created.
```bash
cat > /tmp/notification.json << 'EOF'
{
  "QueueConfigurations": [
    {
      "Id": "nanosiem-log-notification",
      "QueueArn": "arn:aws:sqs:us-east-1:ACCOUNT_ID:nanosiem-cloudtrail-logs",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            {"Name": "suffix", "Value": ".json.gz"}
          ]
        }
      }
    }
  ]
}
EOF

# Apply notification configuration (edit the file first to replace placeholders)
aws s3api put-bucket-notification-configuration \
  --bucket YOUR_BUCKET_NAME \
  --notification-configuration file:///tmp/notification.json
```

Suffix Filters by Log Type
Different AWS services use different file extensions. Adjust the FilterRules accordingly:
| AWS Service | Typical Suffix | Prefix Pattern |
|---|---|---|
| CloudTrail | .json.gz | AWSLogs/ACCOUNT_ID/CloudTrail/ |
| VPC Flow Logs | .log.gz | AWSLogs/ACCOUNT_ID/vpcflowlogs/ |
| GuardDuty | .jsonl.gz | AWSLogs/ACCOUNT_ID/GuardDuty/ |
| ALB Access Logs | .log.gz | AWSLogs/ACCOUNT_ID/elasticloadbalancing/ |
| NLB Access Logs | .log.gz | AWSLogs/ACCOUNT_ID/elasticloadbalancing/ |
| WAF Logs | .log.gz | aws-waf-logs-* |
| S3 Access Logs | none | depends on your config |
You can also filter by prefix to narrow which objects trigger notifications:
```json
{
  "FilterRules": [
    {"Name": "prefix", "Value": "AWSLogs/123456789012/CloudTrail/"},
    {"Name": "suffix", "Value": ".json.gz"}
  ]
}
```

Verify Notifications Are Working
Upload a test file and check the queue:
```bash
# Upload a test object that matches your filter
echo '{"test": true}' | gzip | \
  aws s3 cp - s3://YOUR_BUCKET_NAME/test-notification.json.gz

# Check the queue for messages (don't delete them)
aws sqs receive-message \
  --queue-url "$QUEUE_URL" \
  --max-number-of-messages 1 \
  --wait-time-seconds 5

# Clean up the test file
aws s3 rm s3://YOUR_BUCKET_NAME/test-notification.json.gz
```

You should see an SQS message containing an s3:ObjectCreated:Put event. If the response is empty, double-check your queue policy and notification configuration.
Step 3: Create an IAM User for nano
Create a dedicated IAM user with minimal permissions. nano needs to read from the SQS queue and download objects from the S3 bucket.
```bash
# Create the IAM user
aws iam create-user --user-name nanosiem-log-reader

# Create the access key
aws iam create-access-key --user-name nanosiem-log-reader
```

Save the AccessKeyId and SecretAccessKey from the output — you'll enter these in nano.
IAM Policy
Attach a policy with the minimum required permissions:
```bash
cat > /tmp/nanosiem-policy.json << 'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "SQSRead",
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage",
        "sqs:GetQueueAttributes"
      ],
      "Resource": "arn:aws:sqs:us-east-1:ACCOUNT_ID:nanosiem-*"
    },
    {
      "Sid": "S3Read",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*"
    }
  ]
}
EOF

aws iam put-user-policy \
  --user-name nanosiem-log-reader \
  --policy-name nanosiem-s3-sqs-read \
  --policy-document file:///tmp/nanosiem-policy.json
```

Least privilege: This policy grants read access only. nano deletes SQS messages after processing (to prevent re-ingestion) but never modifies S3 objects. If you have multiple buckets, add each bucket ARN to the S3Read statement's Resource array.
Cross-Account Access (Optional)
If your S3 bucket is in a different AWS account than nano, use an IAM role with sts:AssumeRole instead of direct access keys:
- In the log account (where S3 lives), create an IAM role with the S3/SQS permissions above and a trust policy allowing the nano account to assume it
- In the nano account, give the nano IAM user `sts:AssumeRole` permission for that role
- In nano, set the Assume Role ARN field when creating the credential (e.g., `arn:aws:iam::LOG_ACCOUNT:role/nanosiem-reader`)
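For reference, the trust policy on the role in the log account follows the standard `sts:AssumeRole` shape — a sketch, where `NANO_ACCOUNT_ID` and the user name are placeholders for your own values:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::NANO_ACCOUNT_ID:user/nanosiem-log-reader"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```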
Step 4: Store Credentials in nano
- Navigate to Settings → Cloud Credentials
- Click Add Credential
- Fill in the form:
| Field | Value |
|---|---|
| Provider | AWS S3 |
| Name | A descriptive name, e.g. AWS Production - CloudTrail |
| Region | The region of your SQS queue, e.g. us-east-1 |
| Access Key ID | The AccessKeyId from Step 3 |
| Secret Access Key | The SecretAccessKey from Step 3 |
- Click Save
The credential is encrypted at rest and the secret key will never be displayed again.
One credential, multiple log sources: You can reuse the same credential across multiple log sources if they share the same IAM user/role. For example, one credential for all your CloudTrail, VPC Flow Log, and GuardDuty feeds in the same account.
Step 5: Create a Log Source
- Navigate to Feeds → New Feed (or use the Log Source Wizard)
- Select "I have sample logs" and paste a representative log entry for your source (see examples below)
- The AI will detect the format and generate a VRL parser
- Configure the source connection:
| Field | Value |
|---|---|
| Source Type | AWS S3 |
| SQS Queue URL | The full queue URL, e.g. https://sqs.us-east-1.amazonaws.com/123456789012/nanosiem-cloudtrail-logs |
| Region | Must match the SQS queue region, e.g. us-east-1 |
| Compression | Auto (recommended) — nano detects gzip/zstd automatically |
| Credential | Select the credential you created in Step 4 |
- Set the feed metadata (name, category, vendor, product)
- Publish the parser to create a version and deploy to Vector
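The Compression "Auto" setting above relies on content detection rather than file extensions: gzip objects begin with the magic bytes `1f 8b`, zstd frames with `28 b5 2f fd`. A standalone sketch of that check (an illustration of the technique, not nano's actual implementation):

```shell
# Identify compression by magic bytes instead of trusting the file extension
detect_compression() {
  local magic
  # Dump the first 4 bytes as hex, no addresses, no whitespace
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
  case "$magic" in
    1f8b*)    echo "gzip" ;;
    28b52ffd) echo "zstd" ;;
    *)        echo "none" ;;
  esac
}

# Example: a gzipped file is detected regardless of its name
echo '{"test": true}' | gzip > /tmp/sample.log
detect_compression /tmp/sample.log   # prints "gzip"
```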
Sample Log Events by AWS Service
CloudTrail
```json
{
  "eventVersion": "1.09",
  "userIdentity": {
    "type": "IAMUser",
    "principalId": "AIDACKCEVSQ6C2EXAMPLE",
    "arn": "arn:aws:iam::123456789012:user/alice",
    "accountId": "123456789012",
    "userName": "alice"
  },
  "eventTime": "2025-01-15T14:23:45Z",
  "eventSource": "iam.amazonaws.com",
  "eventName": "CreateUser",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.50",
  "userAgent": "aws-cli/2.15.0",
  "requestParameters": {
    "userName": "new-service-account"
  },
  "responseElements": {
    "user": {
      "userName": "new-service-account",
      "userId": "AIDACKCEVSQ6C3EXAMPLE",
      "arn": "arn:aws:iam::123456789012:user/new-service-account"
    }
  }
}
```

VPC Flow Logs (JSON format)
```json
{
  "version": 2,
  "account-id": "123456789012",
  "interface-id": "eni-0a1b2c3d4e5f6g7h8",
  "srcaddr": "10.0.1.5",
  "dstaddr": "203.0.113.50",
  "srcport": 49152,
  "dstport": 443,
  "protocol": 6,
  "packets": 25,
  "bytes": 5000,
  "start": 1705312800,
  "end": 1705312860,
  "action": "ACCEPT",
  "log-status": "OK"
}
```

GuardDuty
```json
{
  "schemaVersion": "2.0",
  "accountId": "123456789012",
  "region": "us-east-1",
  "type": "Recon:EC2/PortProbeUnprotectedPort",
  "severity": 2,
  "createdAt": "2025-01-15T14:23:45.000Z",
  "updatedAt": "2025-01-15T14:23:45.000Z",
  "title": "Unprotected port on EC2 instance is being probed",
  "description": "EC2 instance i-0abc123 has an unprotected port which is being probed by a known malicious host.",
  "resource": {
    "resourceType": "Instance",
    "instanceDetails": {
      "instanceId": "i-0abc123def456",
      "instanceType": "t3.medium"
    }
  },
  "service": {
    "action": {
      "portProbeAction": {
        "portProbeDetails": [
          {
            "localPortDetails": {"port": 22, "portName": "SSH"},
            "remoteIpDetails": {"ipAddressV4": "198.51.100.0"}
          }
        ]
      }
    }
  }
}
```

Step 6: Verify Ingestion
After publishing, allow a few minutes for Vector to start polling the SQS queue.
Check Feed Health
- Go to Feeds → select your new log source
- On the Overview tab, check:
- Status: Should show "Healthy"
- Event Volume chart: Should show events arriving
- Last Event: Should show a recent timestamp
Search Your Data
Navigate to Search and run a query for your source type:
```
source_type="aws_cloudtrail"
```

Or search for specific activity:

```
source_type="aws_cloudtrail" eventName="CreateUser"
| table timestamp, user, src_ip, eventName
```

Check for Errors
If no data appears:
- Check SQS queue depth — are messages accumulating?

  ```bash
  aws sqs get-queue-attributes \
    --queue-url "$QUEUE_URL" \
    --attribute-names ApproximateNumberOfMessages \
    --query 'Attributes.ApproximateNumberOfMessages' --output text
  ```

  - Messages accumulating: nano isn't polling. Check credentials and the log source deployment status.
  - Zero messages: Either no new S3 objects are being written, or the S3 notification isn't configured correctly. Go back to Step 2 and verify.
- Check Vector logs for connection errors:

  ```bash
  docker logs nanosiem-vector 2>&1 | grep -i "aws\|s3\|sqs\|error"
  ```

- Check ingestion errors in nano at System → Ingestion Errors
AWS Service-Specific Setup Notes
CloudTrail
If you don't already have CloudTrail logging to S3:
```bash
# Create a trail (if you don't have one)
aws cloudtrail create-trail \
  --name nanosiem-trail \
  --s3-bucket-name YOUR_BUCKET_NAME \
  --is-multi-region-trail \
  --enable-log-file-validation

# Start logging
aws cloudtrail start-logging --name nanosiem-trail
```

CloudTrail delivers logs as gzipped JSON files containing a Records array; each file holds multiple events. nano's built-in CloudTrail parser unwraps the Records array automatically.
VPC Flow Logs
To deliver VPC Flow Logs to S3 with an explicit field list (space-delimited text):
```bash
aws ec2 create-flow-logs \
  --resource-type VPC \
  --resource-ids vpc-0abc123def456 \
  --traffic-type ALL \
  --log-destination-type s3 \
  --log-destination arn:aws:s3:::YOUR_BUCKET_NAME/vpcflowlogs/ \
  --log-format '${version} ${account-id} ${interface-id} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${protocol} ${packets} ${bytes} ${start} ${end} ${action} ${log-status}' \
  --max-aggregation-interval 60
```

JSON vs. text format: VPC Flow Logs default to a space-delimited text format. For easier parsing, use the --log-format flag to select specific fields, or enable Parquet format for better performance. nano parsers handle both text and JSON formats.
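As a quick illustration of the text format, the following pulls three fields out of one record using the same 14-field order the --log-format flag above defines (the sample line is fabricated):

```shell
# One flow record in the 14-field text layout defined by the --log-format flag
line='2 123456789012 eni-0a1b2c3d 10.0.1.5 203.0.113.50 49152 443 6 25 5000 1705312800 1705312860 ACCEPT OK'

# In that layout, fields 4, 5, and 13 are srcaddr, dstaddr, and action
echo "$line" | awk '{printf "src=%s dst=%s action=%s\n", $4, $5, $13}'
```

This prints `src=10.0.1.5 dst=203.0.113.50 action=ACCEPT` — the same positional mapping a text-format parser has to encode.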
GuardDuty
GuardDuty findings can be exported to S3 via an export configuration:
- In the GuardDuty console, go to Settings → Findings export options
- Set the S3 bucket and a KMS key for encryption
- Choose export frequency (every 15 minutes for updated findings)
Alternatively, use EventBridge to route GuardDuty findings to S3 for real-time delivery:
```bash
# Create an EventBridge rule for GuardDuty findings
aws events put-rule \
  --name nanosiem-guardduty-findings \
  --event-pattern '{"source": ["aws.guardduty"], "detail-type": ["GuardDuty Finding"]}'

# Set S3 as the target (via Firehose for batching)
# Or send directly to nano's HTTP endpoint for real-time ingestion
```

ALB/NLB Access Logs
Enable access logging on your load balancer:
```bash
aws elbv2 modify-load-balancer-attributes \
  --load-balancer-arn arn:aws:elasticloadbalancing:us-east-1:ACCOUNT_ID:loadbalancer/app/my-alb/abc123 \
  --attributes Key=access_logs.s3.enabled,Value=true \
    Key=access_logs.s3.bucket,Value=YOUR_BUCKET_NAME \
    Key=access_logs.s3.prefix,Value=alb-logs
```

Multiple Log Types from One Bucket
If multiple AWS services write to the same S3 bucket (common with centralized logging), you have two options:
Option A: One Queue per Log Type (Recommended)
Create separate SQS queues with S3 prefix filters:
```json
{
  "QueueConfigurations": [
    {
      "Id": "cloudtrail-notifications",
      "QueueArn": "arn:aws:sqs:us-east-1:ACCOUNT_ID:nanosiem-cloudtrail",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            {"Name": "prefix", "Value": "AWSLogs/123456789012/CloudTrail/"}
          ]
        }
      }
    },
    {
      "Id": "vpcflow-notifications",
      "QueueArn": "arn:aws:sqs:us-east-1:ACCOUNT_ID:nanosiem-vpcflow",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [
            {"Name": "prefix", "Value": "AWSLogs/123456789012/vpcflowlogs/"}
          ]
        }
      }
    }
  ]
}
```

Then create one nano log source per queue, each with its own parser.
Option B: One Shared Queue
Use a single SQS queue for all S3 notifications and let nano route based on the S3 key prefix in the notification message. This is simpler to set up but requires more careful parser routing.
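With a shared queue, the routing boils down to mapping the S3 key prefix in each notification to a log type. A standalone sketch of that mapping (an illustration of the idea, not nano's actual routing logic; the prefixes follow the standard AWSLogs layout):

```shell
# Map an S3 object key to a log type by its prefix
route_by_prefix() {
  case "$1" in
    AWSLogs/*/CloudTrail/*)  echo "aws_cloudtrail" ;;
    AWSLogs/*/vpcflowlogs/*) echo "aws_vpcflow" ;;
    AWSLogs/*/GuardDuty/*)   echo "aws_guardduty" ;;
    *)                       echo "unknown" ;;
  esac
}

route_by_prefix "AWSLogs/123456789012/CloudTrail/us-east-1/log.json.gz"   # prints "aws_cloudtrail"
route_by_prefix "AWSLogs/123456789012/vpcflowlogs/flow.log.gz"            # prints "aws_vpcflow"
```

Keys that match no known prefix fall through to "unknown" — in practice those should be surfaced as ingestion errors rather than silently dropped.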
Troubleshooting
"Access Denied" in Vector logs
- Verify the IAM user has `sqs:ReceiveMessage`, `sqs:DeleteMessage`, and `s3:GetObject` permissions
- Check that the resource ARNs in the policy match your actual queue and bucket
- If using cross-account access, verify the trust policy on the assumed role
Messages in SQS but no events in nano
- Check the log source is deployed (published) — look for "Unpublished changes" banner
- Verify the SQS Queue URL in the log source config matches exactly (no trailing slash)
- Check the region matches between the credential, log source config, and actual SQS queue
- Look for parse errors in System → Ingestion Errors
S3 notifications not arriving in SQS
- Verify the SQS queue policy allows `s3.amazonaws.com` to send messages (Step 1)
- Check that the S3 bucket notification configuration is applied: `aws s3api get-bucket-notification-configuration --bucket YOUR_BUCKET_NAME`
- Ensure the suffix/prefix filter matches the actual object keys
- S3 notifications can take a few seconds to propagate after configuration
"KMS Decrypt" errors
If your S3 objects are encrypted with a KMS key, add kms:Decrypt permission to the IAM policy:
```json
{
  "Sid": "KMSDecrypt",
  "Effect": "Allow",
  "Action": "kms:Decrypt",
  "Resource": "arn:aws:kms:us-east-1:ACCOUNT_ID:key/KEY_ID"
}
```

Recommended Log Sources for Security
If you're starting from scratch, prioritize these AWS log sources:
| Priority | Source | Why |
|---|---|---|
| 1 | CloudTrail | All API activity — IAM changes, resource creation, authentication events |
| 2 | GuardDuty | AWS-native threat detection findings |
| 3 | VPC Flow Logs | Network traffic visibility for lateral movement and exfiltration detection |
| 4 | ALB/NLB Access Logs | Web application activity and potential attack patterns |
| 5 | WAF Logs | Web attack attempts blocked/allowed by AWS WAF |
| 6 | S3 Access Logs | Data access auditing for sensitive buckets |
Next Steps
- Create detection rules for your AWS logs
- Configure enrichment to add GeoIP and threat intel to AWS source IPs
- Set up the GCP integration if you also use Google Cloud