
Google Cloud Pub/Sub

Google Cloud Message Queue

Synopsis

Creates a target that publishes messages to Google Cloud Pub/Sub topics with support for batch processing, message ordering, and service account authentication. Provides reliable message delivery to Google Cloud Pub/Sub for event-driven architectures and distributed systems.

Schema

```yaml
- name: <string>
  description: <string>
  type: gcppubsub
  pipelines: <pipeline[]>
  status: <boolean>
  properties:
    project_id: <string>
    topic_id: <string>
    credentials_file: <string>
    credentials_json: <string>
    ordering_key: <string>
    max_messages: <numeric>
    max_bytes: <numeric>
    field_format: <string>
    interval: <string|numeric>
    cron: <string>
    debug:
      status: <boolean>
      dont_send_logs: <boolean>
```

Configuration

The following fields are used to define the target:

|Field|Required|Default|Description|
|---|---|---|---|
|`name`|Y|-|Target name|
|`description`|N|-|Optional description|
|`type`|Y|-|Must be `gcppubsub`|
|`pipelines`|N|-|Optional post-processor pipelines|
|`status`|N|`true`|Enable/disable the target|

Connection

|Field|Required|Default|Description|
|---|---|---|---|
|`project_id`|Y|-|Google Cloud project ID|
|`topic_id`|Y|-|Pub/Sub topic ID|
|`credentials_file`|N*|-|Path to service account JSON key file|
|`credentials_json`|N*|-|Service account JSON key as string|

\* = Either `credentials_file` or `credentials_json` must be provided.

> **Note:** The service account must have the `roles/pubsub.publisher` role on the topic.

Message Configuration

|Field|Required|Default|Description|
|---|---|---|---|
|`ordering_key`|N|-|Ordering key for message ordering within the topic|
|`max_messages`|N|`1000`|Maximum number of messages per batch|
|`max_bytes`|N|`10485760`|Maximum batch size in bytes (10 MB)|
|`field_format`|N|-|Data normalization format. See applicable Normalization section|

Scheduler

|Field|Required|Default|Description|
|---|---|---|---|
|`interval`|N|`realtime`|Execution frequency. See Interval for details|
|`cron`|N|-|Cron expression for scheduled execution. See Cron for details|

Debug Options

|Field|Required|Default|Description|
|---|---|---|---|
|`debug.status`|N|`false`|Enable debug logging|
|`debug.dont_send_logs`|N|`false`|Process logs but don't send to target (testing)|

Details

The Google Cloud Pub/Sub target publishes messages to Pub/Sub topics for asynchronous, scalable message delivery. It supports batch processing for optimal performance and message ordering for maintaining event sequence.

Authentication

Authentication uses Google Cloud service account credentials. You can provide credentials either as a file path or as a JSON string.
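When credentials come in as a JSON string (for example, injected from a secrets manager as shown in the examples below), it can be worth validating the payload before handing it to the client library. The following is an illustrative stdlib-only sketch, not part of the target itself; the environment variable name matches the `${GCP_SERVICE_ACCOUNT_JSON}` placeholder used later in this page, and the helper name is hypothetical.

```python
import json
import os

def load_service_account_info(env_var: str = "GCP_SERVICE_ACCOUNT_JSON") -> dict:
    """Parse and sanity-check service account credentials from an env var."""
    raw = os.environ.get(env_var)
    if raw is None:
        raise RuntimeError(f"{env_var} is not set")
    info = json.loads(raw)
    # A service account key always carries these fields.
    for key in ("type", "project_id", "private_key", "client_email"):
        if key not in info:
            raise ValueError(f"credentials missing field: {key}")
    return info
```

Failing fast here produces a clearer error than a deferred authentication failure at publish time.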

IAM Permissions

The service account requires the following IAM role:

|IAM Role|Role ID|Purpose|
|---|---|---|
|Pub/Sub Editor|`roles/pubsub.editor`|Check and create topics, and publish messages|

If the topic is pre-created and auto-creation is not needed, a narrower set of roles can be used:

|IAM Role|Role ID|Purpose|
|---|---|---|
|Pub/Sub Publisher|`roles/pubsub.publisher`|Publish messages to an existing topic|
|Pub/Sub Viewer|`roles/pubsub.viewer`|Check topic existence|

Minimum permissions: `pubsub.topics.get`, `pubsub.topics.create` (optional), `pubsub.topics.publish`

Topic Management

The target automatically checks if the specified topic exists and creates it if needed. Topics are created in the specified Google Cloud project.

Message Ordering

When an ordering_key is set, Pub/Sub ensures that messages with the same ordering key are delivered in the order they were published. This is useful for maintaining event sequences for specific entities.

Example: Use device ID as ordering key to ensure events from the same device are processed in order.
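To illustrate the effect, here is a minimal sketch (not the target's implementation) of grouping events by device ID. Messages sharing an ordering key form one sequence; different keys can still be delivered in parallel.

```python
from collections import defaultdict

def assign_ordering_keys(events: list[dict]) -> dict[str, list[dict]]:
    """Group events by device ID, the value used as the ordering key."""
    sequences: dict[str, list[dict]] = defaultdict(list)
    for event in events:
        # Events under the same key keep their relative publish order.
        sequences[event["device_id"]].append(event)
    return sequences

events = [
    {"device_id": "device-001", "seq": 1},
    {"device_id": "device-002", "seq": 1},
    {"device_id": "device-001", "seq": 2},
]
ordered = assign_ordering_keys(events)
```

With `device-001` as the ordering key, its two events are guaranteed to arrive in `seq` order, while `device-002` traffic is unaffected.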

Message Attributes

The target automatically adds the following attributes to each message:

  • device_id - Source device identifier
  • device_type - Type of source device
  • device_name - Name of source device

These attributes can be used for message filtering and routing in subscriptions.
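As a sketch of how a consumer might rely on these attributes: Pub/Sub attribute values must be strings, and subscription filters reference them with the `attributes.<name>` syntax. The helper below is hypothetical; only the three attribute names come from the list above.

```python
def build_attributes(event: dict) -> dict[str, str]:
    """Build the string-valued attribute map Pub/Sub expects."""
    return {
        "device_id": str(event["device_id"]),
        "device_type": str(event["device_type"]),
        "device_name": str(event["device_name"]),
    }

# A subscription filter (Pub/Sub filter syntax) matching only firewall events:
FIREWALL_FILTER = 'attributes.device_type = "firewall"'
```

A subscription created with such a filter receives only matching messages, so routing happens server-side instead of in the consumer.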

Batch Processing

Messages are published in batches for optimal throughput. The target accumulates messages until the batch size limit is reached or during finalization. Google Cloud Pub/Sub supports up to 10 MB per batch.
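The accumulate-then-flush behavior can be sketched as follows. This is a simplified model of the `max_messages`/`max_bytes` interplay, not the target's actual code; the flush stands in for a publish call.

```python
class BatchBuffer:
    """Accumulate payloads until max_messages or max_bytes would be exceeded."""

    def __init__(self, max_messages: int = 1000, max_bytes: int = 10 * 1024 * 1024):
        self.max_messages = max_messages
        self.max_bytes = max_bytes
        self.messages: list[bytes] = []
        self.size = 0
        self.flushed: list[list[bytes]] = []  # stand-in for published batches

    def add(self, payload: bytes) -> None:
        # Flush first if adding this payload would break either limit.
        if self.messages and (
            len(self.messages) >= self.max_messages
            or self.size + len(payload) > self.max_bytes
        ):
            self.flush()
        self.messages.append(payload)
        self.size += len(payload)

    def flush(self) -> None:
        """Publish the pending batch; also called at finalization."""
        if self.messages:
            self.flushed.append(self.messages)
            self.messages = []
            self.size = 0
```

Tuning `max_messages` up improves throughput at the cost of latency; `max_bytes` keeps each batch under the service's 10 MB ceiling.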

At-Least-Once Delivery

Pub/Sub guarantees at-least-once delivery. Messages may be delivered more than once in case of network issues or subscriber failures. Design your message handlers to be idempotent.
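One common idempotency pattern on the subscriber side is to deduplicate on a message identifier. A minimal in-memory sketch (a production handler would persist the seen-ID set, e.g. in a database keyed by message ID):

```python
def make_idempotent(handler):
    """Wrap a handler so duplicate deliveries of the same message are no-ops."""
    seen: set[str] = set()

    def wrapped(message_id: str, payload: dict) -> bool:
        if message_id in seen:
            return False  # duplicate delivery; already processed
        seen.add(message_id)
        handler(payload)
        return True

    return wrapped
```

With this wrapper, a redelivered message acknowledges cleanly without repeating its side effects.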

Message Retention

Messages are retained for 7 days by default. You can configure retention periods from 10 minutes to 7 days in the topic settings.

Dead Letter Topics

Google Cloud Pub/Sub supports dead letter topics for messages that cannot be processed after a configured number of delivery attempts. Configure this in the subscription settings.

Examples

The following are commonly used configuration types.

Basic with File Credentials

Creating a basic Pub/Sub target with credentials file...

```yaml
- name: basic_pubsub
  type: gcppubsub
  properties:
    project_id: "my-project-id"
    topic_id: "application-logs"
    credentials_file: "/path/to/service-account-key.json"
    max_messages: 1000
```

The target publishes JSON messages to the Pub/Sub topic...

```json
{
  "timestamp": "2024-01-15T10:30:00Z",
  "host": "server01",
  "message": "Application started successfully",
  "severity": "info"
}
```

With JSON Credentials

Using credentials as JSON string (useful for secrets management)...

```yaml
- name: json_creds_pubsub
  type: gcppubsub
  properties:
    project_id: "my-project-id"
    topic_id: "security-events"
    credentials_json: "${GCP_SERVICE_ACCOUNT_JSON}"
    max_messages: 500
```

With Message Ordering

Using ordering key to maintain message sequence...

```yaml
- name: ordered_pubsub
  type: gcppubsub
  properties:
    project_id: "my-project-id"
    topic_id: "ordered-events"
    credentials_file: "/path/to/service-account-key.json"
    ordering_key: "device-001"
    max_messages: 250
```

High-Throughput Configuration

Optimizing for high-volume message publishing...

```yaml
- name: high_volume_pubsub
  type: gcppubsub
  properties:
    project_id: "my-project-id"
    topic_id: "high-volume-logs"
    credentials_file: "/path/to/service-account-key.json"
    max_messages: 2000
    max_bytes: 20971520
```

Multiple Topics

Publishing to different topics for different log types...

```yaml
- name: application_logs_pubsub
  type: gcppubsub
  properties:
    project_id: "my-project-id"
    topic_id: "application-logs"
    credentials_file: "/path/to/service-account-key.json"

- name: error_logs_pubsub
  type: gcppubsub
  properties:
    project_id: "my-project-id"
    topic_id: "error-logs"
    credentials_file: "/path/to/service-account-key.json"
```

Field Normalization

Using field normalization for standard format...

```yaml
- name: normalized_pubsub
  type: gcppubsub
  properties:
    project_id: "my-project-id"
    topic_id: "normalized-logs"
    credentials_file: "/path/to/service-account-key.json"
    field_format: "ecs"
    max_messages: 1000
```

Pipeline Processing

Applying post-processing pipelines before publishing...

```yaml
- name: pipeline_pubsub
  type: gcppubsub
  pipelines:
    - format_timestamp
    - add_metadata
    - validate_schema
  properties:
    project_id: "my-project-id"
    topic_id: "processed-events"
    credentials_file: "/path/to/service-account-key.json"
    max_messages: 750
```

Scheduled Batching

Configuration with scheduled batch delivery...

```yaml
- name: scheduled_pubsub
  type: gcppubsub
  properties:
    project_id: "my-project-id"
    topic_id: "scheduled-logs"
    credentials_file: "/path/to/service-account-key.json"
    max_messages: 1000
    interval: "5m"
```

Debug Configuration

Configuration with debugging enabled...

```yaml
- name: debug_pubsub
  type: gcppubsub
  properties:
    project_id: "my-project-id"
    topic_id: "test-logs"
    credentials_file: "/path/to/service-account-key.json"
    debug:
      status: true
      dont_send_logs: true
```

Production Configuration

Configuration for production with optimal settings...

```yaml
- name: production_pubsub
  type: gcppubsub
  pipelines:
    - checkpoint
  properties:
    project_id: "production-project"
    topic_id: "production-logs"
    credentials_json: "${GCP_SERVICE_ACCOUNT_JSON}"
    ordering_key: "production-cluster"
    max_messages: 1500
    max_bytes: 15728640
    field_format: "ecs"
```