rex
Extract fields from text using regular expressions with named capture groups.
Description
The rex command extracts structured data from unstructured text fields using regular expressions. It creates new fields based on named capture groups in your regex pattern, making it possible to parse custom log formats, extract values from messages, and structure free-form text.
This is particularly useful when dealing with logs that don't have structured fields or when you need to extract specific patterns from existing fields.
Syntax
... | rex [field=<field>] "<regex_pattern>"
... | rex mode=sed [field=<field>] "s/<pattern>/<replacement>/[flags]"
Optional Arguments
field
Syntax: field=<field>
Description: Source field to extract from
Default: message
regex_pattern
A regular expression with named capture groups using the syntax (?<field_name>pattern). Each named group creates a new field.
mode
Syntax: mode=sed
Description: Switches to sed substitution mode. Instead of extracting fields, rex replaces text that matches a pattern with a replacement string, using the s/pattern/replacement/flags syntax. The g flag replaces all occurrences.
Named Capture Groups
Use (?<name>pattern) syntax to create fields:
(?<username>\w+) # Creates field "username"
(?<ip>\d+\.\d+\.\d+\.\d+) # Creates field "ip"
(?<status>\d{3}) # Creates field "status"
Examples
Extract username from message
* | rex "user=(?<username>\w+)"
| table username, message
Extracts the username from patterns like "user=john".
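The extraction behavior described so far can be approximated outside the query engine for quick experiments. The following is a minimal Python sketch, not the engine's implementation: rex_extract, to_python_regex, and the sample event are hypothetical names. Python's re module spells named groups (?P<name>...), so the helper rewrites the (?<name>...) syntax first.

```python
import re

def to_python_regex(pattern: str) -> str:
    """Rewrite (?<name>...) groups as Python's (?P<name>...) form.

    Lookbehinds such as (?<=...) and (?<!...) are left untouched.
    """
    return re.sub(r"\(\?<(?![=!])", "(?P<", pattern)

def rex_extract(event: dict, pattern: str, field: str = "message") -> dict:
    """Return a copy of event with one new field per named capture group.

    Assumption: mirrors the documented behavior (first match only,
    event unchanged when nothing matches).
    """
    match = re.search(to_python_regex(pattern), str(event.get(field, "")))
    if match is None:
        return dict(event)
    fields = {k: v for k, v in match.groupdict().items() if v is not None}
    return {**event, **fields}

# Hypothetical sample event, for illustration only.
event = {"message": "login ok user=john src=10.0.0.5"}
result = rex_extract(event, r"user=(?<username>\w+)")
print(result["username"])  # john
```

The helper defaults to the message field, matching the documented default for the field argument.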
Extract IP address
* | rex "IP: (?<ip_address>\d+\.\d+\.\d+\.\d+)"
| table ip_address, message
Extracts IP addresses from text.
Multiple captures
* | rex "user=(?<user>\w+) action=(?<action>\w+) status=(?<status>\d+)"
| table user, action, status
Extracts multiple fields from a single pattern.
Extract from custom field
* | rex field=url "https?://(?<domain>[^/]+)/(?<path>.*)"
| table domain, path
Parses a URL into domain and path components.
Parse custom log format
* | rex "\[(?<timestamp>[^\]]+)\] (?<level>\w+): (?<message>.*)"
| table timestamp, level, message
Structures a custom log format.
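Because this pattern names one of its groups message, the captured text replaces the original message field (see the "Existing fields" usage note). A rough Python illustration of that effect, assuming a hypothetical sample event and using Python's (?P<name>...) spelling for the same pattern:

```python
import re

# Same pattern as the example, written with Python's (?P<name>...) spelling.
LOG_PATTERN = re.compile(
    r"\[(?P<timestamp>[^\]]+)\] (?P<level>\w+): (?P<message>.*)"
)

# Hypothetical sample event; the raw line lives in the message field.
event = {"message": "[2024-01-15 10:30:00] ERROR: disk quota exceeded"}

match = LOG_PATTERN.search(event["message"])
if match:
    # Captured groups are merged over the event, so the "message" group
    # replaces the original raw line.
    event = {**event, **match.groupdict()}

print(event["timestamp"])  # 2024-01-15 10:30:00
print(event["level"])      # ERROR
print(event["message"])    # disk quota exceeded
```

If the raw line is still needed later in the pipeline, a different group name (for example a hypothetical msg_text) would avoid the overwrite.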
Extract error codes
severity=error
| rex "error code: (?<error_code>\d+)"
| stats count() by error_code
Extracts and counts error codes.
Parse Windows event logs
source_type="windows"
| rex "EventID=(?<event_id>\d+).*User=(?<user>[^\s]+)"
| table event_id, user
Extracts structured data from Windows logs.
Extract file paths
* | rex "file: (?<file_path>/[^\s]+)"
| dedup file_path
| table file_path
Finds unique file paths mentioned in logs.
Parse authentication logs
action=login
| rex "user (?<username>\w+) from (?<source_ip>\d+\.\d+\.\d+\.\d+)"
| stats count() by username, source_ip
Structures authentication data.
Extract command line arguments
process_name="powershell.exe"
| rex field=command_line "-(?<flag>\w+)\s+(?<value>[^\s-]+)"
| table process_name, flag, value
Parses command line flags and values.
Parse HTTP logs
source_type="apache"
| rex "(?<method>\w+) (?<url>[^\s]+) HTTP/(?<version>[\d.]+)\" (?<status>\d{3})"
| stats count() by method, status
Extracts HTTP request details.
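To check a pattern like this against sample lines before running it on real data, the extract-then-count flow can be mimicked in Python. This is only a sketch under assumptions: the access-log lines are invented, and the Counter stands in loosely for the stats count() by step.

```python
import re
from collections import Counter

# The example's pattern in Python's (?P<name>...) spelling.
HTTP_PATTERN = re.compile(
    r'(?P<method>\w+) (?P<url>[^\s]+) HTTP/(?P<version>[\d.]+)" (?P<status>\d{3})'
)

# Hypothetical access-log lines, for illustration only.
lines = [
    '10.0.0.1 - - [15/Jan/2024:10:30:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [15/Jan/2024:10:30:01 +0000] "GET /missing HTTP/1.1" 404 0',
    '10.0.0.1 - - [15/Jan/2024:10:30:02 +0000] "GET /about.html HTTP/1.1" 200 734',
    "malformed line without a request",
]

counts = Counter()
for line in lines:
    match = HTTP_PATTERN.search(line)
    if match:  # non-matching lines yield no fields and are skipped here
        counts[(match["method"], match["status"])] += 1

for (method, status), count in sorted(counts.items()):
    print(method, status, count)
```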
Extract email addresses
* | rex "(?<email>[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})"
| dedup email
| table email
Finds the unique email addresses in logs, taking the first address in each event.
Parse JSON-like strings
* | rex "\"user\":\"(?<user>[^\"]+)\",\"action\":\"(?<action>[^\"]+)\""
| table user, action
Extracts values from JSON strings (when not properly parsed).
Extract version numbers
* | rex "version (?<version>\d+\.\d+\.\d+)"
| stats count() by version
Identifies software versions.
Parse firewall logs
source_type="firewall"
| rex "src=(?<src_ip>[\d.]+):(?<src_port>\d+) dst=(?<dst_ip>[\d.]+):(?<dst_port>\d+)"
| table src_ip, src_port, dst_ip, dst_port
Structures firewall connection data.
Extract hash values
* | rex "hash: (?<file_hash>[a-fA-F0-9]{32,64})"
| dedup file_hash
| table file_hash, message
Finds file hashes in logs.
Parse key-value pairs
* | rex "(?<key>\w+)=(?<value>[^\s]+)"
| table key, value
Note: This captures only the first key=value pair in each event. To extract several pairs, use multiple rex commands, each with a pattern for a specific key.
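The first-match limitation is easy to see with the same pattern in Python. The sketch below uses a hypothetical sample message and Python's (?P<name>...) spelling; the finditer loop is plain Python for comparison, not something rex is documented to do.

```python
import re

KV_PATTERN = re.compile(r"(?P<key>\w+)=(?P<value>[^\s]+)")

message = "user=alice action=login status=200"  # hypothetical sample

# First match only, which is what the note above describes.
first = KV_PATTERN.search(message)
print(first["key"], first["value"])  # user alice

# Every match, collected into a dict (plain Python, for comparison).
pairs = {m["key"]: m["value"] for m in KV_PATTERN.finditer(message)}
print(pairs)  # {'user': 'alice', 'action': 'login', 'status': '200'}
```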
Chain multiple rex commands
* | rex "user=(?<username>\w+)"
| rex "ip=(?<ip_address>[\d.]+)"
| rex "action=(?<action>\w+)"
| table username, ip_address, action
Extracts multiple patterns with separate commands.
Redact sensitive data with sed mode
* | rex mode=sed field=message "s/password=[^ ]*/password=REDACTED/g"
Replaces password values with REDACTED in the message field.
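The substitution above can be tried out in Python with re.sub. This is a sketch under assumptions: the sample message is invented, and the "first match only" behavior without g is how classic sed works, which may or may not match rex exactly.

```python
import re

# Hypothetical sample message, for illustration only.
message = "login password=hunter2 retry password=abc123 ok"

# Comparable to s/password=[^ ]*/password=REDACTED/g:
# count=0 replaces every match.
redact_all = re.sub(r"password=[^ ]*", "password=REDACTED", message, count=0)

# Comparable to the same expression without the g flag in classic sed
# (an assumption for rex): only the first match is replaced.
redact_first = re.sub(r"password=[^ ]*", "password=REDACTED", message, count=1)

print(redact_all)    # login password=REDACTED retry password=REDACTED ok
print(redact_first)  # login password=REDACTED retry password=abc123 ok
```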
Normalize file paths
* | rex mode=sed field=file_path "s/\\\//-/g"
Clean up log formatting
* | rex mode=sed "s/\s+/ /g"
| table message
Collapses multiple whitespace characters into a single space.
Redact email addresses
* | rex mode=sed field=message "s/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+/REDACTED_EMAIL/g"
Usage Notes
Default field: If no field parameter is specified, rex operates on the message field.
Named groups required: Only named capture groups (?<name>pattern) create fields. Unnamed groups (pattern) still take part in matching but do not create fields.
Existing fields: If a named group matches an existing field name, it overwrites that field.
No match: If the pattern doesn't match, no fields are created and the event is unchanged.
Multiple matches: By default, only the first match is captured. For multiple matches, use multiple rex commands or different patterns.
Performance: Complex regex patterns on large datasets can be slow. Filter data first when possible.
Regex syntax: Uses standard regex syntax. Special characters must be escaped: \. for literal dot, \[ for literal bracket.
Case sensitivity: Regex matching is case-sensitive by default. Use (?i) flag for case-insensitive: (?i)(?<user>\w+).
Testing: Test your regex patterns on sample data before running on large datasets.
Alternatives: If your logs are in a standard format (JSON, CSV, etc.), use proper parsing at ingestion time instead of rex.
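Two of the notes above, case sensitivity and escaping, can be checked quickly in Python before a pattern goes into a query. The sample strings are hypothetical, and Python's re module is only a stand-in for the engine's regex implementation, so edge cases may differ.

```python
import re

message = "USER=Alice logged in"  # hypothetical sample

# Case-sensitive by default: the lowercase literal "user=" does not match.
sensitive = re.search(r"user=(?P<user>\w+)", message)
print(sensitive)  # None

# A leading (?i) makes the whole pattern case-insensitive.
insensitive = re.search(r"(?i)user=(?P<user>\w+)", message)
print(insensitive["user"])  # Alice

# An unescaped dot matches any character; \. matches only a literal dot.
loose = re.search(r"v1.2", "build v1x2")
strict = re.search(r"v1\.2", "build v1x2")
print(loose is not None, strict is not None)  # True False
```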
Common Regex Patterns
IP Address: \d+\.\d+\.\d+\.\d+
Email: [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
URL: https?://[^\s]+
Hash (MD5): [a-fA-F0-9]{32}
Hash (SHA256): [a-fA-F0-9]{64}
Date (YYYY-MM-DD): \d{4}-\d{2}-\d{2}
Time (HH:MM:SS): \d{2}:\d{2}:\d{2}
Word: \w+
Number: \d+
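These patterns are deliberately loose. The IP address pattern, for instance, also matches strings that are not valid IPv4 addresses. The sketch below, using invented sample text and a hypothetical is_valid_ipv4 helper, shows one way to post-filter matches with Python's standard ipaddress module when stricter validation matters.

```python
import ipaddress
import re

IP_PATTERN = re.compile(r"\d+\.\d+\.\d+\.\d+")

def is_valid_ipv4(candidate: str) -> bool:
    """Return True only if candidate parses as a real IPv4 address."""
    try:
        ipaddress.IPv4Address(candidate)
        return True
    except ValueError:
        return False

# Hypothetical sample text, for illustration only.
text = "ok 192.168.1.10, bogus 999.300.1.1, build 1.2.3.4567"

candidates = IP_PATTERN.findall(text)
valid = [c for c in candidates if is_valid_ipv4(c)]

print(candidates)  # ['192.168.1.10', '999.300.1.1', '1.2.3.4567']
print(valid)       # ['192.168.1.10']
```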