# Building effective SIEM dashboards and telemetry pipelines

## Core objectives

- Correlate attacker tactics, techniques and procedures (TTPs) with event data across hybrid environments
- Identify visibility gaps through systematic mapping against the MITRE ATT&CK framework
- Empower security teams with actionable indicators and historical context for investigations
## Dashboard development principles

### Essential components for all platforms

- TTP correlation views - map detection rules to ATT&CK techniques
- Process lineage visualisation - parent/child relationships with command-line context
- Network activity overlays - GeoIP mapping and protocol analysis
- Time-bound analysis - sliding windows for incident timeframes
### Platform-specific implementations

#### Elastic stack (Kibana) dashboards

Recommended visualisations:

- ATT&CK technique matrix with detection coverage gaps highlighted
- Process creation events (Sysmon Event ID 1) with parent process context
- Network connection maps enriched with threat intelligence feeds
Sample event schema:

```yaml
event:
  category: ["process", "network"]
  module: "powershell"
process:
  name: "powershell.exe"
  parent:
    name: "explorer.exe"
network:
  direction: "outbound"
destination:
  ip: "185.234.219.112"
```
#### Splunk dashboards

Key search patterns:

```
index=sysmon (process_name=powershell.exe OR command_line="*Invoke*")
| stats count by user, host, parent_process_name, command_line
| lookup mitre_technique_lookup command_line OUTPUT technique_id, tactic
| table technique_id, tactic, count
```
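To drive the same search programmatically (for scheduled efficacy reports or dashboard refreshes), a rough sketch using the splunk-sdk Python package; the hostname, credentials and index are placeholders.

```python
# Sketch: run the PowerShell correlation search via the Splunk REST API
# using the splunk-sdk package. Connection details are placeholders.
import time
import splunklib.client as client
import splunklib.results as results

service = client.connect(host="splunk.example.local", port=8089,
                         username="svc_siem", password="CHANGE_ME")

spl = ('search index=sysmon (process_name=powershell.exe OR command_line="*Invoke*") '
       '| stats count by user, host, parent_process_name, command_line')

job = service.jobs.create(spl, earliest_time="-24h", latest_time="now")
while not job.is_done():        # poll until the search job completes
    time.sleep(2)

for row in results.JSONResultsReader(job.results(output_mode="json")):
    if isinstance(row, dict):   # the reader also yields diagnostic Message objects
        print(row.get("user"), row.get("host"), row.get("count"))
```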
Effective visualisations:

- Kill chain phase progression charts
- Notable events by severity/frequency heatmaps
- Detection rule efficacy over time
## Telemetry pipeline architecture

### Recommended data flows

| Source | Collection method | Destination | Use case |
|---|---|---|---|
| Endpoints (Osquery) | Fleet Manager | Elastic via Logstash | Host state interrogation |
| Windows (Sysmon) | Windows Event Forwarding | Splunk Enterprise | Process/network monitoring |
| Cloud (CloudTrail) | Kinesis Firehose | S3 + Athena | Cloud API activity logging |
| Network (Zeek) | Direct log shipping | Grafana Loki | Protocol-level inspection |
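For the Zeek to Grafana Loki flow, "direct log shipping" can be as simple as posting JSON log lines to Loki's HTTP push API. A rough sketch follows; the Loki endpoint, stream labels and log path are assumptions, and production deployments would normally use Promtail/Alloy or a Logstash pipeline instead.

```python
# Sketch: ship Zeek JSON logs to Grafana Loki via its HTTP push API.
# Endpoint, labels and file path are illustrative placeholders.
import time
import requests

LOKI_URL = "http://loki.example.local:3100/loki/api/v1/push"

def push_lines(lines, labels):
    ts = str(time.time_ns())  # Loki expects nanosecond-precision string timestamps
    payload = {
        "streams": [{
            "stream": labels,                        # e.g. {"job": "zeek", "log": "conn"}
            "values": [[ts, line] for line in lines],
        }]
    }
    resp = requests.post(LOKI_URL, json=payload, timeout=10)
    resp.raise_for_status()

# Zeek configured for JSON output (LogAscii::use_json=T); path is illustrative
with open("/opt/zeek/logs/current/conn.log") as fh:
    batch = [line.rstrip("\n") for line in fh if line.strip()]

push_lines(batch, {"job": "zeek", "log": "conn"})
```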
### Critical enrichment steps

1. Field normalisation (ECS/CIM standards)
2. Threat intelligence lookups (MISP, Sigma)
3. ATT&CK technique tagging (combined with field normalisation in the sketch after this list)
4. Business context mapping (asset criticality)
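A simplified enrichment sketch combining steps 1 and 3, assuming events arrive as flat dictionaries; the field map and technique rules are illustrative and would normally live in the pipeline itself (a Logstash filter, Splunk props/transforms, or an ingest processor).

```python
# Sketch: normalise Sysmon-style field names to ECS and tag likely ATT&CK techniques.
# The field map and technique rules below are trimmed illustrations only.
ECS_FIELD_MAP = {
    "Image": "process.executable",
    "CommandLine": "process.command_line",
    "ParentImage": "process.parent.executable",
    "DestinationIp": "destination.ip",
}

TECHNIQUE_RULES = [
    # (substring to look for in the command line, ATT&CK technique ID)
    ("powershell", "T1059.001"),
    ("rundll32", "T1218.011"),
]

def enrich(raw_event: dict) -> dict:
    event = {ECS_FIELD_MAP.get(k, k): v for k, v in raw_event.items()}
    cmdline = str(event.get("process.command_line", "")).lower()
    event["threat.technique.id"] = [
        tid for needle, tid in TECHNIQUE_RULES if needle in cmdline
    ]
    return event

print(enrich({"Image": "C:\\Windows\\System32\\rundll32.exe",
              "CommandLine": "rundll32.exe suspicious.dll,EntryPoint"}))
```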
## Threat hunting enablement

### Data retention strategy

- High-fidelity logs: 90 days minimum for endpoints/network
- Alert metadata: 12 months for trend analysis
- Enriched events: 30 days in hot storage
### Correlation improvements

```python
# Example Sigma-to-Splunk conversion logic (simplified; real rules warrant sigma-cli/pySigma)
def convert_sigma_rule(rule):
    # Sigma 'logsource' is a mapping, e.g. {"product": "windows", "service": "sysmon"}
    index = rule["logsource"].get("service", rule["logsource"].get("product", "main"))
    # Flatten the detection selection (field: value pairs) into SPL conditions
    conditions = [f'{field}="{value}"' for field, value in rule["detection"]["selection"].items()]
    return {
        "search": f"index={index} " + " AND ".join(conditions),
        # Sigma tags look like "attack.t1059.001"; normalise to underscore form
        "tags": [t.replace("attack.", "attack_") for t in rule.get("tags", [])],
    }
```
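For orientation, a usage sketch with a trimmed Sigma-style rule; the rule content is illustrative only.

```python
# Usage sketch: convert a trimmed Sigma-style rule (content illustrative)
rule = {
    "logsource": {"product": "windows", "service": "sysmon"},
    "detection": {"selection": {"EventID": 1, "Image": "*\\powershell.exe"}},
    "tags": ["attack.execution", "attack.t1059.001"],
}
print(convert_sigma_rule(rule))
# -> {'search': 'index=sysmon EventID="1" AND Image="*\\powershell.exe"',
#     'tags': ['attack_execution', 'attack_t1059.001']}
```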
## Tooling matrix

| Tool | Primary strength | ATT&CK coverage focus | Integration example |
|---|---|---|---|
| Kibana | Custom visualisations | Full framework | Elastic Agent → Beats → Dashboard |
| Splunk | Complex correlation | Execution, persistence, exfil | UF → Heavy Forwarder → Search Head |
| Grafana | Cloud/metrics visualisation | Discovery, collection | CloudWatch → Kinesis → Dashboard |
| Osquery | Endpoint state queries | Defense evasion, discovery | Fleet → TLS-encrypted log shipping |
| Zeek | Network protocol analysis | Command and control | Sensor → Logstash pipeline → Storage |
## Common challenges and mitigation

### Visibility gaps

- Problem: Alert-centric monitoring misses stealthy activity
- Solution: Baseline normal activity and hunt for deviations (one approach is sketched below)
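One way to operationalise the baseline: count process executions per host over a historical window and flag processes whose daily count deviates sharply from the norm. The data shapes and threshold below are assumptions for illustration.

```python
# Sketch: flag processes whose execution count deviates sharply from a per-host baseline.
# `history` maps (host, process) -> list of daily counts; the threshold is illustrative.
from statistics import mean, pstdev

def find_deviations(history: dict, today: dict, z_threshold: float = 3.0):
    outliers = []
    for key, counts in history.items():
        mu, sigma = mean(counts), pstdev(counts)
        observed = today.get(key, 0)
        # Anything more than z_threshold standard deviations above baseline is suspect
        if sigma > 0 and (observed - mu) / sigma > z_threshold:
            outliers.append((key, observed, round(mu, 1)))
        elif sigma == 0 and observed > mu:
            outliers.append((key, observed, mu))  # process count never varied before
    return outliers

history = {("ws-042", "rundll32.exe"): [2, 1, 3, 2, 2]}
print(find_deviations(history, {("ws-042", "rundll32.exe"): 40}))
```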
### Schema issues

- Problem: Field name mismatches break correlations
- Solution: Enforce ECS/CIM standards with validation checks (a minimal check is sketched below)
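A lightweight validation check, sketched under the assumption that events are flattened ECS-keyed dictionaries; in practice this sits in the ingest pipeline or a CI job run over sample events, and the required-field list is an illustration.

```python
# Sketch: reject events missing the ECS fields that downstream correlations rely on.
REQUIRED_ECS_FIELDS = {"@timestamp", "event.category", "host.name", "process.name"}

def validate_event(event: dict) -> list:
    """Return the required ECS fields missing from a flattened event."""
    return sorted(REQUIRED_ECS_FIELDS - event.keys())

missing = validate_event({"@timestamp": "2024-05-01T10:00:00Z", "host.name": "ws-042"})
print(missing)   # ['event.category', 'process.name'] -> fails validation
```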
### Dashboard relevance

- Problem: Static views become outdated
- Solution: Quarterly reviews against current threat intel
## Recommended improvements

- Develop YAML-based detection rule converters
- Integrate purple team exercise results
- Build ATT&CK coverage scorecards (see the sketch after this list)
- Implement automated detection validation
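A coverage scorecard can start as simply as comparing the technique tags on deployed rules against the techniques you care about per tactic. The sketch below uses trimmed, illustrative technique lists rather than the full framework.

```python
# Sketch: per-tactic ATT&CK coverage scorecard from deployed rule tags.
# The tactic -> technique sets are trimmed illustrations, not the full framework.
TACTIC_TECHNIQUES = {
    "execution": {"T1059", "T1204", "T1047"},
    "persistence": {"T1053", "T1547", "T1136"},
}

def coverage_scorecard(deployed_rule_tags: set) -> dict:
    scorecard = {}
    for tactic, techniques in TACTIC_TECHNIQUES.items():
        covered = techniques & deployed_rule_tags
        scorecard[tactic] = {
            "covered": sorted(covered),
            "gaps": sorted(techniques - covered),
            "percent": round(100 * len(covered) / len(techniques)),
        }
    return scorecard

print(coverage_scorecard({"T1059", "T1547"}))
```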
## Detection validation workflows

### Automated rule testing framework

```python
# Pseudocode for detection validation.
# execute_siem_query() and run_against_benign_logs() are platform-specific
# helpers (e.g. wrappers around the Splunk or Elasticsearch search APIs).
def test_detection_rule(rule, test_cases):
    alerts_triggered = 0
    for case in test_cases:
        # Replay each crafted log entry and check whether the rule fires
        if execute_siem_query(rule.query, case.log_entry):
            alerts_triggered += 1
    efficacy_score = (alerts_triggered / len(test_cases)) * 100 if test_cases else 0
    return {
        'rule_id': rule.id,
        'efficacy': f"{efficacy_score:.0f}%",
        # Alerts raised against known-benign activity (false positives)
        'false_positives': run_against_benign_logs(rule.query),
    }
```
Implementation steps:

1. Atomic test case generation
   - Create log entries matching ATT&CK techniques using tools like Caldera or Atomic Red Team
   - Example test case for T1059 (PowerShell):

     ```yaml
     log_entry:
       process.name: "powershell.exe"
       command_line: 'Invoke-Mimikatz -Command "sekurlsa::pth /user:admin /domain:corp /ntlm:hash"'
     ```

2. Benign activity profiling
   - Collect 30 days of normal business activity logs
   - Use as control group for false positive measurement

3. Validation schedule
   - Weekly: Automated atomic tests
   - Quarterly: Purple team exercises with new TTPs
## Storage tier cost-benefit analysis

### Financial services data retention model

| Tier | Retention | Cost/month (£ per GB) | Use case | Example data |
|---|---|---|---|---|
| Hot | 30 days | £0.50 | Active investigations | Raw EDR logs, alert queues |
| Warm | 90 days | £0.20 | Threat hunting | Enriched events, Sysmon logs |
| Cold | 1-3 years | £0.05 | Compliance/audit | Compressed alert metadata |
Cost optimisation strategies (combined into a rough cost model after this list):

1. Selective retention
   - Keep full-fidelity logs only for critical assets (SWIFT terminals, DCs)
   - Sample non-critical endpoints at 20% (reduces storage by ~65%)
2. Compression benchmarks
   - Zeek logs: 80% reduction with Zstandard
   - Windows events: 70% reduction with LZ4
3. Cloud cost controls
   - AWS S3 lifecycle policies to auto-transition after 30 days
   - Elasticsearch cold tier with frozen indices
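These levers can be folded into a rough steady-state monthly cost model. The sketch below reuses the per-GB tier prices from the table; the tier dwell times, ingest volume, sampling rate and compression ratio are assumptions to replace with your own figures.

```python
# Sketch: rough steady-state monthly storage cost combining the tier prices above
# with sampling and compression assumptions. TIER_DAYS is the time data spends in
# each tier (hot 0-30 days, warm 30-90, cold out to one year); all inputs illustrative.
TIER_PRICE_PER_GB = {"hot": 0.50, "warm": 0.20, "cold": 0.05}   # GBP per GB per month
TIER_DAYS = {"hot": 30, "warm": 60, "cold": 275}

def monthly_cost(daily_ingest_gb, sampling_rate=1.0, compression_ratio=1.0):
    """sampling_rate: fraction of endpoints kept at full fidelity (e.g. 0.2);
    compression_ratio: stored size / raw size (e.g. 0.2 for an 80% reduction)."""
    stored_per_day = daily_ingest_gb * sampling_rate * compression_ratio
    return sum(stored_per_day * days * TIER_PRICE_PER_GB[tier]
               for tier, days in TIER_DAYS.items())

# 500 GB/day raw, 20% endpoint sampling, Zstandard-style 80% size reduction
print(f"~£{monthly_cost(500, sampling_rate=0.2, compression_ratio=0.2):,.0f} per month")
```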
## Team competency requirements

### Cross-functional skills matrix

| Role | SIEM skills | Cloud knowledge | ATT&CK proficiency | Typical FTE ratio |
|---|---|---|---|---|
| L1 Analyst | Splunk/ELK query syntax | Basic AWS/GCP concepts | TTP recognition | 4:1 (per shift) |
| Threat Hunter | Advanced correlation searches | Log pipeline management | Technique emulation | 2 per 10k endpoints |
| SOC Engineer | Dashboard development | Infrastructure-as-code | Detection engineering | 1 per cloud account |

Staffing consideration: a 24/5 SOC covering 50k endpoints typically requires:

- 6 L1 analysts (4 shifts)
- 3 threat hunters
- 2 SOC engineers
- €85k/year in training budget
## Implementation checklist

### Phase 1: Foundation (Weeks 1-4)

- Deploy log collection infrastructure
- Normalise critical field names (ECS/CIM)
- Build 5-10 core ATT&CK-aligned dashboards

### Phase 2: Validation (Weeks 5-8)

- Establish detection testing framework
- Profile benign activity baselines
- Conduct first purple team exercise

### Phase 3: Optimisation (Ongoing)

- Implement storage tiering policies
- Monthly detection efficacy reports
- Quarterly skills gap analysis