Datasets and Profiles

Datasets and Profiles provide reusable data collection rule templates that standardize how telemetry is collected across device fleets. Instead of configuring each device's data collection individually, you define a dataset once and assign it to multiple devices.

Definitions

A Dataset is a reusable data collection rule template that defines what data to collect and how to collect it. Each dataset specifies a collection type (Windows Event Logs, DNS Logs, etc.) and the configuration parameters for that type. Datasets can optionally reference a preprocessing pipeline for inline processing of collected data.

A Profile is a grouping layer that composes multiple datasets into a single assignable unit. Profiles allow you to bundle related collection rules and apply them to devices as a set.

Type hierarchy: Each dataset has a type (category) and a definition type (specific collector). The type groups datasets into categories — windows, wec, or linux — while the definition type identifies the exact collector implementation (e.g., windows_security_log_collector). This two-level classification drives device compatibility and determines which configuration interface is presented.
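
The two-level classification can be pictured as a lookup from category to collectors. The sketch below is illustrative only: the table is assembled from the collectors listed under Step 2 of dataset creation, and the names `DEFINITION_TYPES_BY_TYPE`, `dataset_type_for`, and `is_compatible` are hypothetical, not part of the product.

```python
# Hypothetical lookup table built from the collectors named in this document.
DEFINITION_TYPES_BY_TYPE = {
    "windows": {
        "windows_security_log_collector",
        "windows_event_log_collector",
        "data_collection_rule_collector",
        "windows_firewall_log_collector",
        "windows_dns_log_collector",
    },
    "wec": {"windows_event_collector_subscription"},
    "linux": {
        "linux_host_log_collector",
        "linux_audit_report_log_collector",
        "linux_firewall_log_collector",
    },
}


def dataset_type_for(definition_type: str) -> str:
    """Return the category (type) that a definition type belongs to."""
    for dataset_type, definition_types in DEFINITION_TYPES_BY_TYPE.items():
        if definition_type in definition_types:
            return dataset_type
    raise KeyError(f"unknown definition type: {definition_type}")


def is_compatible(definition_type: str, device_type: str) -> bool:
    """Assumption: a dataset fits a device when both share the same category."""
    return dataset_type_for(definition_type) == device_type
```

For example, `is_compatible("windows_event_collector_subscription", "windows")` is false under this assumption, because that collector belongs to the wec category.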

Status lifecycle: Datasets and profiles have a status of active, passive, or deleted. Active items are applied to their assigned devices. Passive items remain configured but are not actively applied. Deleted items are soft-deleted and no longer visible in the UI.
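
As a minimal sketch of the three statuses, the enum below encodes the two rules stated above: only active items are applied, and deleted items are hidden from the UI. The class and function names are hypothetical and chosen for illustration.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"
    DELETED = "deleted"


def is_applied(status: Status) -> bool:
    """Only active items are applied to their assigned devices."""
    return status is Status.ACTIVE


def is_visible_in_ui(status: Status) -> bool:
    """Soft-deleted items are retained but no longer shown in the UI."""
    return status is not Status.DELETED
```

A passive dataset is therefore visible but not applied: `is_visible_in_ui(Status.PASSIVE)` is true while `is_applied(Status.PASSIVE)` is false.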

Relationship to Devices: Datasets and profiles have a many-to-many relationship with devices. A single dataset can be assigned to multiple devices, and a single device can have multiple datasets assigned to it. This eliminates repetitive per-device configuration and ensures consistent data collection across your fleet.
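
One way to picture the many-to-many relationship is a pair of indexes that are kept in sync, so an assignment can be looked up from either side. This is a hypothetical in-memory model, not the product's storage format; `AssignmentIndex` and the identifiers in the usage note are invented for the example.

```python
from collections import defaultdict


class AssignmentIndex:
    """Hypothetical in-memory index of dataset-to-device assignments."""

    def __init__(self):
        self._devices_by_dataset = defaultdict(set)
        self._datasets_by_device = defaultdict(set)

    def assign(self, dataset_id: str, device_id: str) -> None:
        # Record the link in both directions so either side can be queried.
        self._devices_by_dataset[dataset_id].add(device_id)
        self._datasets_by_device[device_id].add(dataset_id)

    def unassign(self, dataset_id: str, device_id: str) -> None:
        self._devices_by_dataset[dataset_id].discard(device_id)
        self._datasets_by_device[device_id].discard(dataset_id)

    def devices_for(self, dataset_id: str) -> set:
        return set(self._devices_by_dataset.get(dataset_id, ()))

    def datasets_for(self, device_id: str) -> set:
        return set(self._datasets_by_device.get(device_id, ()))
```

Assigning a dataset `ds-security` to devices `host-a` and `host-b`, then assigning `ds-dns` to `host-a`, gives `devices_for("ds-security") == {"host-a", "host-b"}` and `datasets_for("host-a") == {"ds-security", "ds-dns"}`.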

Processing flow context: Datasets and profiles operate at the device layer of the DataStream processing flow. They govern what data a device collects before it enters preprocessing and pipeline stages.

Provider → Device (dataset rules applied here) → Preprocessing → Pipeline → Postprocessing → Target → Consumer

Management

Deletion Constraints

A dataset cannot be deleted if it is assigned to a device or included in a profile. Likewise, a profile cannot be deleted if it is assigned to a device. Remove all associations before deleting. When a deletion is blocked, the UI displays the specific conflicting devices or profiles that must be unassigned first.
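
The deletion rules can be sketched as a pre-check that returns the blocking associations, which mirrors what the UI reports when a deletion is blocked. The function and field names here are assumptions made for illustration; the actual backend logic is not documented in this section.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DeletionCheck:
    allowed: bool
    blocking_devices: list[str] = field(default_factory=list)
    blocking_profiles: list[str] = field(default_factory=list)


def check_dataset_deletion(assigned_devices, containing_profiles) -> DeletionCheck:
    """A dataset is deletable only when no device or profile references it."""
    devices = sorted(assigned_devices)
    profiles = sorted(containing_profiles)
    return DeletionCheck(
        allowed=not devices and not profiles,
        blocking_devices=devices,
        blocking_profiles=profiles,
    )


def check_profile_deletion(assigned_devices) -> DeletionCheck:
    """A profile is deletable only when no device is assigned to it."""
    devices = sorted(assigned_devices)
    return DeletionCheck(allowed=not devices, blocking_devices=devices)
```

A caller would unassign everything listed in `blocking_devices` and `blocking_profiles`, then repeat the check before deleting.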

Creating a Dataset

Dataset creation uses a multi-step wizard:

Step 1 — Define Dataset

Enter the dataset name and description.

Step 2 — Configure Dataset

Configure the type-specific collection rules. The configuration interface adapts based on the dataset type. Each type also supports an optional preprocessing pipeline assignment.

Windows (compatible with Windows devices):

  • Windows Security Events (windows_security_log_collector): Event category selector with four modes — ALL, MINIMAL, COMMON, or CUSTOM. Custom mode opens an XML editor for XPath filter expressions.
  • Windows Event Logs (windows_event_log_collector): Basic mode selects predefined channels (Application, System) with severity level filters. Custom mode provides an XPath expression editor with optional DCR config import.
  • Data Collection Rule Collector (data_collection_rule_collector): Custom-only XPath editor for Data Collection Rule queries. Supports importing DCR configuration that is automatically converted to XPath format.
  • Windows Firewall Logs (windows_firewall_log_collector): Profile selection for firewall log collection — Domain, Private, and/or Public.
  • Windows DNS Logs (windows_dns_log_collector): Include/exclude filter system with configurable conditions for DNS query fields (event ID, response code, question type, IP addresses, question name).
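
To make the include/exclude idea in the DNS collector concrete, here is a minimal sketch of how such a filter could be evaluated. The semantics are an assumption: an event is kept when it matches at least one include condition (or no include conditions exist) and matches no exclude condition. The field names `event_id`, `question_type`, and `question_name` are hypothetical stand-ins for the DNS query fields listed above.

```python
def matches(event: dict, condition: dict) -> bool:
    """A condition matches when every field it names has one of the allowed values."""
    return all(event.get(name) in allowed for name, allowed in condition.items())


def should_collect(event: dict, include: list, exclude: list) -> bool:
    """Assumed semantics: include filters select events, exclude filters veto them."""
    included = not include or any(matches(event, cond) for cond in include)
    excluded = any(matches(event, cond) for cond in exclude)
    return included and not excluded
```

With `include = [{"event_id": {256, 257}}]` and `exclude = [{"question_name": {"example.com"}}]`, an event with ID 256 for `example.com` is dropped, while the same event for another name is collected.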

WEC (compatible with WEC devices):

  • Windows Event Collector Subscription (windows_event_collector_subscription): Custom-only XPath editor for Event Collector Subscription queries. Shares the same XPath editing interface as Windows Event Logs custom mode but without the DCR import option.
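
Several collectors above take XPath expressions in an XML editor. The sketch below builds a standard Windows Event Log `QueryList` document from (channel, XPath) pairs, and splits the `Channel!XPath` form used in Data Collection Rule configurations. Whether the product's editor expects exactly this XML shape, and how its DCR import performs the conversion, are assumptions; `build_query_list` and `split_dcr_entry` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET


def split_dcr_entry(entry: str) -> tuple:
    """Split a DCR-style 'Channel!XPath' string into (channel, xpath)."""
    channel, separator, xpath = entry.partition("!")
    if not separator or not channel or not xpath:
        raise ValueError(f"expected 'Channel!XPath', got: {entry!r}")
    return channel, xpath


def build_query_list(selectors) -> str:
    """Build a QueryList XML string from (channel, xpath) pairs."""
    root = ET.Element("QueryList")
    for index, (channel, xpath) in enumerate(selectors):
        query = ET.SubElement(root, "Query", Id=str(index), Path=channel)
        select = ET.SubElement(query, "Select", Path=channel)
        select.text = xpath
    return ET.tostring(root, encoding="unicode")
```

For instance, `build_query_list([split_dcr_entry("Security!*[System[(EventID=4624)]]")])` yields a single `Query` element whose `Select` targets the Security channel with that XPath.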

Linux (compatible with Linux devices):

  • Linux System Events (linux_host_log_collector): File path input for the system log source.
  • Linux Audit Events (linux_audit_report_log_collector): File path input for the audit log source.
  • Linux Firewall Events (linux_firewall_log_collector): File path input for the firewall log source.

Advanced Dataset Types

The backend supports additional dataset types that are not exposed in the UI: windows_main_log_collector, windows_file_log_collector, windows_system_log_collector, windows_application_log_collector, windows_object_access_log_collector, and windows_security_threat_analyzer. These are used internally and may appear in API responses or configuration exports.
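
When reading API responses or configuration exports, it can help to separate these internal types from the ones the UI exposes. The following is a small sketch under that assumption; the set is copied from the list above, and `split_by_ui_exposure` is a hypothetical helper.

```python
# Definition types named above as backend-only (not exposed in the UI).
INTERNAL_DEFINITION_TYPES = frozenset({
    "windows_main_log_collector",
    "windows_file_log_collector",
    "windows_system_log_collector",
    "windows_application_log_collector",
    "windows_object_access_log_collector",
    "windows_security_threat_analyzer",
})


def split_by_ui_exposure(definition_types):
    """Return (ui_exposed, internal_only) lists, preserving input order."""
    exposed, internal = [], []
    for name in definition_types:
        if name in INTERNAL_DEFINITION_TYPES:
            internal.append(name)
        else:
            exposed.append(name)
    return exposed, internal
```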

Step 3 — Assign Devices

Select one or more devices to assign this dataset to. The device list supports multi-select with search filtering.

Step 4 — Review

Review the complete dataset configuration summary before creation. Verify assigned devices and collection rules.

Dataset Detail View

After creation, each dataset has a detail page with three tabs:

General Settings Tab

View and edit the dataset name, description, type, and status (active or passive).

Assigned Devices Tab

View and manage the list of devices assigned to this dataset. Add or remove device assignments.

Dataset Configuration Tab

View and edit the type-specific collection rules for this dataset.

Dataset Operations

  • Clone: Create a copy of an existing dataset with all its configuration. The cloned dataset requires a new name and can be modified independently.
  • Delete: Remove a dataset. A confirmation modal displays before deletion to prevent accidental removal.
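
The clone behaviour can be sketched as a deep copy with a mandatory rename, so that later edits to the clone do not affect the original. The dictionary layout is hypothetical, and this section does not state whether device assignments carry over to the clone, so the example copies only the fields it is given.

```python
import copy


def clone_dataset(dataset: dict, new_name: str) -> dict:
    """Return an independent copy of a dataset under a new name."""
    if not new_name or new_name == dataset["name"]:
        raise ValueError("a cloned dataset requires a new name")
    clone = copy.deepcopy(dataset)  # nested configuration is copied, not shared
    clone["name"] = new_name
    return clone
```

Because the copy is deep, changing a nested setting on the clone leaves the source dataset untouched.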

Creating a Profile

Profile creation uses a multi-step wizard. Profiles are created with active status by default.

Step 1 — Define Profile

Enter the profile name and description.

Step 2 — Select Datasets

Select one or more existing datasets to include in this profile. The dataset list supports multi-select with filtering.

Step 3 — Assign Devices

Select one or more devices to assign this profile to. Device assignment is optional and can be configured later.

Step 4 — Review

Review the profile summary including selected datasets and assigned devices before creation.

Profile Detail View

The profile detail page provides access to the profile's general settings, assigned datasets, and assigned devices.

Permissions

Access to datasets and profiles is controlled by the following permission scopes:

Scope            Description
DATASET_READ     View datasets and their configurations
DATASET_CREATE   Create new datasets
DATASET_EDIT     Modify existing datasets and device assignments
DATASET_DELETE   Delete datasets
PROFILE_READ     View profiles and their configurations
PROFILE_CREATE   Create new profiles
PROFILE_EDIT     Modify existing profiles, dataset selection, and device assignments
PROFILE_DELETE   Delete profiles
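
As a sketch of how these scopes might gate an operation, the helper below maps a resource and action to the scope name and checks it against a user's granted scopes. The scope names come from the table above; the functions and the way scopes are granted are assumptions.

```python
KNOWN_SCOPES = frozenset({
    "DATASET_READ", "DATASET_CREATE", "DATASET_EDIT", "DATASET_DELETE",
    "PROFILE_READ", "PROFILE_CREATE", "PROFILE_EDIT", "PROFILE_DELETE",
})


def required_scope(resource: str, action: str) -> str:
    """Map e.g. ('dataset', 'edit') to 'DATASET_EDIT'."""
    scope = f"{resource}_{action}".upper()
    if scope not in KNOWN_SCOPES:
        raise ValueError(f"no such scope: {scope}")
    return scope


def is_allowed(granted_scopes, resource: str, action: str) -> bool:
    """True when the scope required for the operation has been granted."""
    return required_scope(resource, action) in set(granted_scopes)
```

Note that changing a dataset's device assignments falls under `DATASET_EDIT` according to the table, so a read-only user would be refused: `is_allowed({"DATASET_READ"}, "dataset", "edit")` is false.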

Device Integration

Datasets connect to devices through the Configure Data Collection workflow. When configuring a device's data collection:

  1. A selection drawer displays available datasets and profiles
  2. Select either one or more datasets or a profile to assign
  3. A confirmation modal with a switch control confirms the assignment change
  4. The device begins collecting data according to the assigned dataset rules

Exclusive Assignment

A device can be assigned either datasets or a profile, not both. Assigning one type replaces any existing assignment of the other type.

Each device tracks its configuration mode (dataset or profile), determining whether it receives collection rules from individual datasets or from a profile.
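
The exclusive-assignment rule and the per-device configuration mode can be modelled as a small state holder. This is a hypothetical model: the class and attribute names are invented, and the choice to add to existing datasets (rather than replace them) when the device is already in dataset mode is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class DeviceCollectionConfig:
    mode: Optional[str] = None  # "dataset", "profile", or None when unconfigured
    dataset_ids: Set[str] = field(default_factory=set)
    profile_id: Optional[str] = None

    def assign_datasets(self, dataset_ids) -> None:
        # Assigning datasets replaces any existing profile assignment.
        self.profile_id = None
        self.dataset_ids |= set(dataset_ids)
        self.mode = "dataset"

    def assign_profile(self, profile_id: str) -> None:
        # Assigning a profile replaces any existing dataset assignments.
        self.dataset_ids = set()
        self.profile_id = profile_id
        self.mode = "profile"
```

Switching a device from datasets to a profile clears `dataset_ids` and sets `mode` to `"profile"`; switching back clears `profile_id`.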

Assigned datasets appear in the device's detail view under the Data Configuration tab (see Devices Management) and can be managed from either the device or dataset side of the relationship.