Normalization
Normalization is the pipeline stage between ingestion from sources and forwarding to targets. It coalesces log data from diverse sources into consistent formats, enabling unified handling across different logging systems.
Log Formats
The processor supports several widely used log formats:
Generic
| Format | Notation | Key Identifier | Layout Characteristics | Example Fields |
|---|---|---|---|---|
| Elastic Common Schema (ECS) | Dot notation with lowercase | @timestamp | Hierarchical structure | source.ip, network.direction |
| Splunk Common Information Model (CIM) | Underscore with lowercase | _time | Flat structure | src_ip, network_direction |
| Advanced Security Information Model (ASIM) | PascalCase | TimeGenerated | Explicit names | SourceIp, NetworkDirection |
| Google SecOps Unified Data Model (UDM) | Nested structure | metadata.event_timestamp | Entity-based hierarchy | principal.ip, target.ip |
| Open Cybersecurity Schema Framework (OCSF) | Nested structure | time | Class-based hierarchy | src_endpoint.ip, dst_endpoint.ip |
Security-specific
| Format | Description | Key Identifier | Example Fields |
|---|---|---|---|
| Common Event Format (CEF) | ArcSight's standard format | rt (receiptTime) | networkUser, sourceAddress |
| Log Event Extended Format (LEEF) | IBM QRadar's format | devTime | networkUser, srcAddr |
| Common Security Log (CSL) | Microsoft Sentinel's format | TimeGenerated | NetworkUser, SourceAddress |
Format Detection
The source format can be detected automatically from characteristic fields, for example:
| Context | Field | Format |
|---|---|---|
| Timestamp | @timestamp | ECS |
| Timestamp | _time | CIM |
| Timestamp | TimeGenerated | ASIM/CSL |
| Timestamp | metadata.event_timestamp | UDM |
| Timestamp | time | OCSF |
| Security | rt | CEF |
| Security | devTime | LEEF |
| CSL detection | TimeGenerated + LogSeverity | CSL |
| CSL detection | TimeGenerated only | ASIM |
| UDM detection | metadata.event_type | UDM |
| OCSF detection | class_uid | OCSF |
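
The detection rules in the table can be sketched in Python as follows. This is a minimal illustration, not the processor's actual implementation: the function names `get_path` and `detect_format`, the order in which the checks run, and the choice to treat events as plain dictionaries are all assumptions made for the example.

```python
def get_path(event, path):
    """Return the value at a dotted path, or None if any segment is missing."""
    if path in event:  # flat key such as "@timestamp" or a literal dotted key
        return event[path]
    current = event
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def detect_format(event):
    """Guess the source format from characteristic fields (hypothetical order)."""
    def has(path):
        return get_path(event, path) is not None

    # Assumption: the most specific markers are checked first, so that a
    # generic field such as "time" only decides the format as a last resort.
    if has("class_uid"):
        return "ocsf"
    if has("metadata.event_type") or has("metadata.event_timestamp"):
        return "udm"
    if has("TimeGenerated"):
        # TimeGenerated is shared by ASIM and CSL; LogSeverity marks CSL.
        return "csl" if has("LogSeverity") else "asim"
    if has("@timestamp"):
        return "ecs"
    if has("_time"):
        return "cim"
    if has("rt"):
        return "cef"
    if has("devTime"):
        return "leef"
    if has("time"):
        return "ocsf"
    return None  # no characteristic field found
```

For example, `detect_format({"TimeGenerated": "2024-01-01T00:00:00Z"})` yields `"asim"`, while adding a `LogSeverity` key to the same event yields `"csl"`.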
Conversion
Casing and Delimiters
Each format follows specific naming conventions:
| Format | Example Fields |
|---|---|
| ECS | source.ip, event.severity |
| CIM | src_ip, event_severity |
| ASIM | SourceIp, EventSeverity |
| CEF | sourceAddress, eventSeverity |
| LEEF | srcAddr, evtSev |
| CSL | SourceIP, EventSeverity |
| UDM | principal.ip, security_result.severity |
| OCSF | src_endpoint.ip, severity_id |
Complex format conversions may impact performance.
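
The casing and delimiter styles listed above can be illustrated with a small Python sketch. The helper names (`split_words`, `to_dotted`, `to_snake`, `to_pascal`, `to_camel`) are hypothetical and only show the mechanical part of a conversion. Renames such as `source.ip` to `src_ip` or `srcAddr` cannot be derived from casing alone and need an explicit field mapping.

```python
import re
from typing import List


def split_words(name: str) -> List[str]:
    """Split a field name on dots, underscores, and camel/Pascal-case boundaries."""
    words: List[str] = []
    for chunk in re.split(r"[._]", name):
        # An uppercase run (e.g. "IP") or an optionally capitalized lowercase word.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z0-9]+", chunk))
    return [w.lower() for w in words if w]


def to_dotted(name: str) -> str:
    """Lowercase dot notation, e.g. event.severity."""
    return ".".join(split_words(name))


def to_snake(name: str) -> str:
    """Lowercase with underscores, e.g. event_severity."""
    return "_".join(split_words(name))


def to_pascal(name: str) -> str:
    """PascalCase, e.g. EventSeverity."""
    return "".join(w.capitalize() for w in split_words(name))


def to_camel(name: str) -> str:
    """camelCase, e.g. eventSeverity."""
    words = split_words(name)
    return words[0] + "".join(w.capitalize() for w in words[1:]) if words else ""
```

With these helpers, `to_pascal("event.severity")` gives `EventSeverity` and `to_snake("EventSeverity")` gives `event_severity`.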
Field Mapping
Common network fields map across the supported formats as follows:
| Format | Source IP | Destination IP | Direction |
|---|---|---|---|
| ecs | source.ip | destination.ip | network.direction |
| cim | src | dest | direction |
| asim | SrcIp | DstIp | NetworkDirection |
| cef | src | dst | networkDirection |
| leef | srcAddr | dstAddr | netDir |
| csl | SourceIp | DestinationIp | NetworkDirection |
| udm | principal.ip | target.ip | network.direction |
| ocsf | src_endpoint.ip | dst_endpoint.ip | direction_id |
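
A lookup table like the one above can drive a simple key rename. The Python sketch below is illustrative only: `FIELD_MAP` and `convert_fields` are hypothetical names, events are assumed to be flat dictionaries whose keys are the literal field names (dotted names included), and only keys are renamed. Values are passed through unchanged, so a field such as `direction_id` would still need its value translated separately.

```python
# Field names per format, taken from the mapping table above.
FIELD_MAP = {
    "ecs":  {"source_ip": "source.ip", "destination_ip": "destination.ip", "direction": "network.direction"},
    "cim":  {"source_ip": "src", "destination_ip": "dest", "direction": "direction"},
    "asim": {"source_ip": "SrcIp", "destination_ip": "DstIp", "direction": "NetworkDirection"},
    "cef":  {"source_ip": "src", "destination_ip": "dst", "direction": "networkDirection"},
    "leef": {"source_ip": "srcAddr", "destination_ip": "dstAddr", "direction": "netDir"},
    "csl":  {"source_ip": "SourceIp", "destination_ip": "DestinationIp", "direction": "NetworkDirection"},
    "udm":  {"source_ip": "principal.ip", "destination_ip": "target.ip", "direction": "network.direction"},
    "ocsf": {"source_ip": "src_endpoint.ip", "destination_ip": "dst_endpoint.ip", "direction": "direction_id"},
}


def convert_fields(event, source_format, target_format):
    """Rename known fields from source_format to target_format; keep the rest."""
    source = FIELD_MAP[source_format]
    target = FIELD_MAP[target_format]
    # Map each source field name to its counterpart in the target format.
    rename = {source[concept]: target[concept] for concept in source}
    return {rename.get(key, key): value for key, value in event.items()}


cef_event = {"src": "10.0.0.1", "dst": "10.0.0.2", "networkDirection": "inbound", "msg": "login"}
ecs_event = convert_fields(cef_event, "cef", "ecs")
```

Here `ecs_event` carries the keys `source.ip`, `destination.ip`, and `network.direction`, while the unmapped `msg` field is left as-is.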
Configuration
Basic
Convert from ECS to ASIM format:
normalize:
  source_format: ecs
  target_format: asim
Field-specific
Convert a specific network field:
normalize:
  field: network_data
  source_format: cef
  target_format: ecs
Auto-detection
Let the processor detect the source format:
normalize:
  target_format: cim
UDM Conversion
Convert ECS to Google SecOps UDM format:
normalize:
  source_format: ecs
  target_format: udm
OCSF Conversion
Convert to Amazon Security Lake OCSF format:
normalize:
  source_format: ecs
  target_format: ocsf
Preprocessing
Fields are standardized with normalize for conversion between the ECS, CIM, ASIM, CEF, LEEF, CSL, OCSF, and UDM formats (see the Log Formats and Conversion sections above). Values are formatted for uniform casing with uppercase and lowercase processors when required by the target format's naming conventions.
Postprocessing
Fields are optimized for storage and queries using format conversion with the normalize processor (see the Conversion and Field Mapping sections above). For Microsoft Sentinel integration, data is prepared by converting to the ASIM format with normalize. For Google SecOps, convert to UDM format. For Amazon Security Lake, convert to OCSF format (see Log Formats table).
When converting to UDM or OCSF, schema enforcement rules are applied by default. These rules normalize timestamps, validate event types, and ensure field values conform to the target schema specification.
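
As an illustration of the timestamp rule, the Python sketch below parses a few common inputs into a UTC time and renders it two ways. It is not the processor's implementation. The function names, the heuristic that treats numbers above 1e11 as milliseconds, and the output encodings are assumptions made for the example: RFC 3339 text is used here for UDM and epoch milliseconds for OCSF.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse epoch seconds, epoch milliseconds, or ISO 8601 text into UTC."""
    if isinstance(value, (int, float)):
        # Assumption: values this large are milliseconds, not seconds.
        seconds = value / 1000 if value > 1e11 else value
        return datetime.fromtimestamp(seconds, tz=timezone.utc)
    text = value.strip()
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"  # accepted by fromisoformat on older Pythons
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def to_rfc3339(moment):
    """Render as RFC 3339 text with second precision."""
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def to_epoch_ms(moment):
    """Render as integer milliseconds since the Unix epoch."""
    return round(moment.timestamp() * 1000)
```

For instance, `to_rfc3339(parse_timestamp("2023-11-15T00:13:20+02:00"))` converts the local offset to UTC before formatting.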
Complex format conversions may impact processing performance and delivery latency.