Service Placement
Where protector-api, protector-engine, MariaDB, and RabbitMQ run
This page explains where each Trilio Site Recovery component (protector-api, protector-engine, MariaDB, and RabbitMQ) runs relative to your OpenStack infrastructure. Because Trilio Site Recovery requires two fully independent OpenStack clouds, you must deploy and configure these services on both sites: a primary site and a secondary (DR) site. Understanding service placement is foundational to every other deployment decision, from firewall rules to failover behavior, because there is no direct service-to-service communication between sites: the CLI plugin and Horizon dashboard are the sole coordination layer.
Before planning your service placement, confirm the following:
- Two independent OpenStack clouds are available, each with its own Nova, Cinder, Neutron, and Keystone endpoints. The sites may be in separate physical datacenters or in the same cluster using different regions.
- OpenStack Victoria or later is running on both sites.
- MariaDB (or MySQL-compatible) is available on each site to back the local Protector database. A shared database between sites is explicitly not supported; each site must have its own database instance.
- RabbitMQ is accessible from the protector-engine process on each site.
- Python 3.8 or later is installed on every host where you will run Protector services.
- Pure Storage FlashArray replication is configured between the two sites before you begin (async or sync, depending on your RPO requirements).
- You have admin credentials on both OpenStack clouds to register service users and create Keystone endpoints.
- Network connectivity exists between the two sites at the API plane: each site's Keystone and protector-api endpoints (default port 8788) must be reachable from the OSC CLI host and from the other site's management network.
Perform every step below on both sites unless a step is explicitly marked as site-specific.
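The local prerequisites can be sanity-checked with a short script before you begin. A sketch; it only inspects what is visible from the current host, and the WARN lines are informational rather than fatal:

```shell
#!/usr/bin/env bash
# Preflight sketch: check local prerequisites on a host that will run Protector.
fail=0

# Python 3.8 or later is required for the Protector services.
if python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 8) else 1)'; then
    echo "OK   python3 >= 3.8"
else
    echo "FAIL python3 >= 3.8 required"
    fail=1
fi

# The openstack CLI is needed for the Keystone registration steps.
if command -v openstack > /dev/null; then
    echo "OK   openstack CLI found"
else
    echo "WARN openstack CLI not found on this host"
fi

# The mysql client is needed for the database creation step.
if command -v mysql > /dev/null; then
    echo "OK   mysql client found"
else
    echo "WARN mysql client not found on this host"
fi

echo "preflight finished (fail=${fail})"
```

Run the script on every host that will run a Protector service, on both sites.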
Step 1: Create the Protector system user and directories
# Create a non-login system user to run the services
useradd --system --shell /bin/false protector
# Create required directories
mkdir -p /var/log/protector
mkdir -p /var/lib/protector
mkdir -p /etc/protector
# Set ownership
chown -R protector:protector /var/log/protector
chown -R protector:protector /var/lib/protector
chown -R protector:protector /etc/protector
Step 2: Create the Protector database (per site)
Each site needs its own database. The database must be reachable from the host running protector-engine and protector-api.
mysql -u root -p << EOF
CREATE DATABASE protector CHARACTER SET utf8;
GRANT ALL PRIVILEGES ON protector.* TO 'protector'@'localhost' IDENTIFIED BY 'PROTECTOR_DBPASS';
GRANT ALL PRIVILEGES ON protector.* TO 'protector'@'%' IDENTIFIED BY 'PROTECTOR_DBPASS';
FLUSH PRIVILEGES;
EOF
Replace PROTECTOR_DBPASS with a strong password. Record it; you will need it in protector.conf.
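One way to generate a suitable value, sketched with openssl (commonly present on controller nodes); a 32-character alphanumeric password avoids SQL-quoting and DSN-escaping surprises:

```shell
# Generate a 32-character alphanumeric password for PROTECTOR_DBPASS.
# tr strips the base64 symbols (+ / =) that would need escaping in the DSN.
PROTECTOR_DBPASS=$(openssl rand -base64 48 | tr -dc 'A-Za-z0-9' | head -c 32)
echo "${PROTECTOR_DBPASS}"
```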
Step 3: Register the Protector service in Keystone (per site)
Run the following on each site using that site's admin credentials.
# Source site admin credentials
source ~/admin-openrc
# Create the service user
openstack user create --domain default --password-prompt protector
# Grant admin role in the service project
openstack role add --project service --user protector admin
# Register the service catalog entry
openstack service create \
--name protector \
--description "OpenStack Disaster Recovery Service" \
protector
# Create the three endpoint types
openstack endpoint create --region RegionOne \
protector public http://controller:8788/v1/%\(tenant_id\)s
openstack endpoint create --region RegionOne \
protector internal http://controller:8788/v1/%\(tenant_id\)s
openstack endpoint create --region RegionOne \
protector admin http://controller:8788/v1/%\(tenant_id\)s
Replace controller with the hostname or IP of the node that will run protector-api on that site.
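Since the three endpoint create calls differ only in the interface, they can be scripted. A sketch that echoes the commands for review (drop the echo to execute them); controller and RegionOne are the placeholder values used above:

```shell
# Build the three endpoint-create commands from a single template.
CONTROLLER=controller   # placeholder: the host that will run protector-api
REGION=RegionOne
URL="http://${CONTROLLER}:8788/v1/%(tenant_id)s"

for iface in public internal admin; do
    # Remove the leading "echo" to actually create the endpoints.
    echo openstack endpoint create --region "$REGION" protector "$iface" "$URL"
done
```

Quoting the URL removes the need to backslash-escape the parentheses as in the commands above.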
Step 4: Install the Protector package (per site)
git clone https://github.com/your-org/openstack-protector.git
cd openstack-protector
pip install -r requirements.txt
pip install .
Step 5: Initialize the database schema (per site)
protector-manage db sync
This applies all Alembic migrations against the local database. Run it after every upgrade.
Step 6: Install systemd unit files (per site)
Create /etc/systemd/system/protector-api.service:
[Unit]
Description=OpenStack Protector API Service
After=network.target
[Service]
Type=simple
User=protector
Group=protector
ExecStart=/usr/local/bin/protector-api --config-file /etc/protector/protector.conf
Restart=on-failure
RestartSec=10
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
Create /etc/systemd/system/protector-engine.service:
[Unit]
Description=OpenStack Protector Engine Service
After=network.target
[Service]
Type=simple
User=protector
Group=protector
ExecStart=/usr/local/bin/protector-engine --config-file /etc/protector/protector.conf
Restart=on-failure
RestartSec=10
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
Step 7: Enable and start services (per site)
systemctl daemon-reload
systemctl enable protector-api protector-engine
systemctl start protector-api protector-engine
# Verify both are running
systemctl status protector-api
systemctl status protector-engine
The primary configuration file is /etc/protector/protector.conf. You must create this file on each site with values appropriate to that site's infrastructure. The configuration on Site A and Site B will differ in their database connection strings, Keystone auth URLs, and bound addresses, but the structure is identical.
Minimal working configuration
[DEFAULT]
debug = False
log_dir = /var/log/protector
state_path = /var/lib/protector
[api]
bind_host = 0.0.0.0
bind_port = 8788
workers = 4
[database]
connection = mysql+pymysql://protector:PROTECTOR_DBPASS@controller/protector
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = protector
password = PROTECTOR_PASS
[oslo_policy]
policy_file = /etc/protector/policy.yaml
[service_credentials]
default_trust_roles = member,_member_
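One way to keep the two sites' files structurally identical is to generate them from a single template with per-site values substituted. A sketch of the minimal configuration above as a heredoc; it writes to /tmp for preview (install the result as /etc/protector/protector.conf), and the three variables at the top are the per-site values:

```shell
# Generate protector.conf from per-site values.
CONTROLLER=controller          # this site's controller host
DBPASS=PROTECTOR_DBPASS        # from the database creation step
SVCPASS=PROTECTOR_PASS         # the protector Keystone service user's password
CONF=/tmp/protector.conf       # preview path; install to /etc/protector/protector.conf

cat > "$CONF" << EOF
[DEFAULT]
debug = False
log_dir = /var/log/protector
state_path = /var/lib/protector

[api]
bind_host = 0.0.0.0
bind_port = 8788
workers = 4

[database]
connection = mysql+pymysql://protector:${DBPASS}@${CONTROLLER}/protector

[keystone_authtoken]
www_authenticate_uri = http://${CONTROLLER}:5000
auth_url = http://${CONTROLLER}:5000
memcached_servers = ${CONTROLLER}:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = protector
password = ${SVCPASS}

[oslo_policy]
policy_file = /etc/protector/policy.yaml

[service_credentials]
default_trust_roles = member,_member_
EOF

echo "wrote ${CONF}"
```

Run the same script on each site with that site's values; only the three variables change.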
Key options explained
| Option | Section | Default | Effect |
|---|---|---|---|
| bind_host | [api] | 0.0.0.0 | Interface protector-api listens on. Set to a specific IP to restrict access. |
| bind_port | [api] | 8788 | Port for the REST API. Must match the Keystone endpoint URLs you registered. |
| workers | [api] | 4 | Number of API worker processes. Increase for high-request-rate deployments. |
| connection | [database] | (none) | SQLAlchemy DSN for the local site's MariaDB instance. Each site points to its own database. |
| debug | [DEFAULT] | False | Set to True to emit verbose logs. Do not use in production; logs include sensitive metadata. |
| log_dir | [DEFAULT] | /var/log/protector | Directory for API and engine log files. |
| state_path | [DEFAULT] | /var/lib/protector | Working directory for runtime state files. |
| default_trust_roles | [service_credentials] | member,_member_ | Keystone roles delegated via trust when the service acts on behalf of a tenant. Both member and _member_ are listed for compatibility across OpenStack releases. |
| policy_file | [oslo_policy] | /etc/protector/policy.yaml | Path to the RBAC policy file. |
Why each site has its own database
protector-engine writes DR operation state, protection group metadata, and site registration records to the local database. Metadata synchronization between sites happens at the API layer through explicit sync calls, not through a shared database. This design means each site remains independently operable, which is critical: if Site A's database were shared with Site B, a network partition between sites would prevent both sites from recording state.
RBAC policy file
Create /etc/protector/policy.yaml on each site:
"context_is_admin": "role:admin"
"admin_or_owner": "is_admin:True or project_id:%(project_id)s"
"default": "rule:admin_or_owner"
"protector:protection_groups:index": "rule:default"
"protector:protection_groups:show": "rule:default"
"protector:protection_groups:create": "rule:default"
"protector:protection_groups:update": "rule:default"
"protector:protection_groups:delete": "rule:default"
"protector:members:index": "rule:default"
"protector:members:create": "rule:default"
"protector:members:delete": "rule:default"
"protector:operations:index": "rule:default"
"protector:operations:show": "rule:default"
"protector:operations:action": "rule:default"
"protector:policies:show": "rule:default"
"protector:policies:create": "rule:default"
Cinder policy adjustments (both sites)
The protector service needs permissions beyond the default member role for two Cinder operations used during failover. Add the following to /etc/cinder/policy.yaml on both sites:
# Required for importing replicated volumes into Cinder after failover
"volume_extension:volume_manage": "rule:admin_or_owner"
# Required for unmanaging volumes during failback
"volume_extension:volume_unmanage": "rule:admin_or_owner"
# Required to discover the correct Cinder volume service host
"volume_extension:services:index": "rule:admin_or_owner"
For Kolla-Ansible deployments, place these in /etc/kolla/config/cinder/policy.yaml and then run:
kolla-ansible -i inventory reconfigure -t cinder
Once both sites are running protector-api and protector-engine with their own databases, the two deployments operate independently; they do not communicate with each other directly. You interact with both sites through the openstack CLI (using the protectorclient plugin) or the Horizon dashboard, which authenticates to whichever site you target and pushes metadata sync calls to the peer site when needed.
Confirming services are reachable
Verify the health endpoint on each site before proceeding:
# Site A
curl http://site-a-controller:8788/
# Site B
curl http://site-b-controller:8788/
A successful response returns the API version discovery document.
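Immediately after a (re)start the API can take a few seconds to bind, so a retry loop is more robust than a single curl. A sketch; the URLs in the trailing comment are the placeholder hostnames used above:

```shell
# Poll a URL until it answers or the attempt budget is exhausted.
wait_for_api() {
    local url=$1 tries=${2:-30} i
    for ((i = 1; i <= tries; i++)); do
        if curl -fsS --max-time 2 "$url" > /dev/null 2>&1; then
            echo "up: ${url}"
            return 0
        fi
        sleep 1
    done
    echo "timed out waiting for ${url}" >&2
    return 1
}

# Example (placeholders):
#   wait_for_api http://site-a-controller:8788/
#   wait_for_api http://site-b-controller:8788/
```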
Where to run CLI commands
The protectorclient CLI plugin can be run from any host that has network access to both sites' Keystone and protector-api endpoints. It does not need to run on the controller nodes themselves. A typical operator workstation with ~/.config/openstack/clouds.yaml configured for both sites is the recommended pattern:
clouds:
site-a:
auth:
auth_url: http://site-a-controller:5000/v3
project_name: admin
username: admin
password: YOUR_PASSWORD
user_domain_name: Default
project_domain_name: Default
region_name: RegionOne
site-b:
auth:
auth_url: http://site-b-controller:5000/v3
project_name: admin
username: admin
password: YOUR_PASSWORD
user_domain_name: Default
project_domain_name: Default
region_name: RegionOne
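With both clouds defined, commands that must be issued against each site can be wrapped in a small helper. A sketch; run_on_both is a hypothetical helper name, and it assumes the site-a and site-b cloud names from the example above:

```shell
# Run one openstack CLI command against both sites in sequence.
run_on_both() {
    local cloud
    for cloud in site-a site-b; do
        echo "### ${cloud}"
        openstack --os-cloud "$cloud" "$@"
    done
}

# Example:
#   run_on_both endpoint list --service protector
```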
Understanding the active/standby split
Protector services run on both sites at all times; there is no concept of "the primary site runs the service and the secondary site does not." The site where VMs are currently running is considered authoritative for metadata, but the protector-api and protector-engine processes on the standby site are fully active and ready to receive failover commands. This symmetrical design means that after a failover, the site roles swap and no service restarts are required.
Checking service logs
# API service log
tail -f /var/log/protector/protector-api.log
# Engine service log
tail -f /var/log/protector/protector-engine.log
# Systemd journal (live)
journalctl -u protector-api -f
journalctl -u protector-engine -f
Example 1: Verify service placement after installation
Confirm that both services are listening on the expected port on each controller node.
# On the Site A controller
netstat -tlnp | grep 8788
Expected output:
tcp 0 0 0.0.0.0:8788 0.0.0.0:* LISTEN <pid>/protector-api
# Confirm systemd reports both services active
systemctl status protector-api protector-engine
Expected output (truncated):
● protector-api.service - OpenStack Protector API Service
Loaded: loaded (/etc/systemd/system/protector-api.service; enabled)
Active: active (running) since ...
● protector-engine.service - OpenStack Protector Engine Service
Loaded: loaded (/etc/systemd/system/protector-engine.service; enabled)
Active: active (running) since ...
Repeat this verification on the Site B controller.
Example 2: Confirm database connectivity from the service
Before registering sites, verify the protector-engine can reach its local database.
# Test the database credentials from the controller
mysql -h controller -u protector -p protector -e "SHOW TABLES;"
Expected output after db sync has run:
+----------------------+
| Tables_in_protector |
+----------------------+
| alembic_version |
| consistency_groups |
| cg_volumes |
| dr_operations |
| pg_members |
| protection_groups |
| replication_policies |
| sites |
+----------------------+
If the table list is empty, re-run protector-manage db sync.
Example 3: Validate API endpoint registration in Keystone
After completing the Keystone registration steps on both sites, confirm the endpoint is discoverable.
# On Site A
source ~/admin-openrc
openstack endpoint list --service protector
Expected output:
+------------------+-----------+--------------+--------------+---------+-----------+------------------------------------------+
| ID | Region | Service Name | Service Type | Enabled | Interface | URL |
+------------------+-----------+--------------+--------------+---------+-----------+------------------------------------------+
| <id> | RegionOne | protector | protector | True | public | http://site-a-controller:8788/v1/%(tenant_id)s |
| <id> | RegionOne | protector | protector | True | internal | http://site-a-controller:8788/v1/%(tenant_id)s |
| <id> | RegionOne | protector | protector | True | admin | http://site-a-controller:8788/v1/%(tenant_id)s |
+------------------+-----------+--------------+--------------+---------+-----------+------------------------------------------+
Repeat on Site B using ~/site-b-openrc.
Example 4: Confirm the Cinder policy changes are in effect
After updating /etc/cinder/policy.yaml, verify that the protector service user can list volume services (a proxy check for the policy changes).
# Authenticate as the protector service user
export OS_USERNAME=protector
export OS_PASSWORD=PROTECTOR_PASS
export OS_PROJECT_NAME=service
# ... (remaining auth env vars for Site A)
openstack volume service list
If the command returns the list of Cinder volume services without a 403 Forbidden error, the policy change is active. If it fails, restart Cinder after applying the policy file:
systemctl restart openstack-cinder-api
protector-api fails to start: database connection error
Symptom: systemctl status protector-api shows Active: failed and the journal contains OperationalError: (pymysql.err.OperationalError) Can't connect to MySQL server.
Likely cause: The connection DSN in protector.conf is wrong, the database does not exist, or the MariaDB service is not running.
Fix:
- Verify MariaDB is running: systemctl status mariadb
- Test the credentials directly: mysql -h <host> -u protector -p protector -e "SELECT 1;"
- Check that the protector database exists: mysql -u root -p -e "SHOW DATABASES;"
- If the database is missing, re-run the CREATE DATABASE and GRANT statements from the installation steps.
- If credentials are wrong, update protector.conf and restart: systemctl restart protector-api
protector-api returns 401 Unauthorized for all requests
Symptom: Every API call returns HTTP 401, even with a valid token.
Likely cause: The [keystone_authtoken] section in protector.conf is pointing to the wrong Keystone URL, or the protector service user does not exist or has the wrong password.
Fix:
- Confirm the service user exists and can authenticate: openstack token issue --os-username protector --os-project-name service
- Verify auth_url and www_authenticate_uri in [keystone_authtoken] match the Keystone v3 endpoint for that site.
- Confirm the password in protector.conf matches what was set in Keystone.
- Restart after any changes: systemctl restart protector-api
protector-engine starts but DR operations hang indefinitely
Symptom: DR operations are created (status: running) but never progress beyond 0% and never complete.
Likely cause: protector-engine cannot reach RabbitMQ, or RabbitMQ is not running.
Fix:
- Confirm RabbitMQ is running on the expected host: systemctl status rabbitmq-server
- Check the protector-engine log for AMQP connection errors: journalctl -u protector-engine | grep -i rabbit
- Verify RabbitMQ is accessible from the engine host on the expected port (default 5672): telnet <rabbitmq-host> 5672
- Review the [oslo_messaging_rabbit] section of protector.conf if you have customized the RabbitMQ connection.
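telnet is often not installed on modern hosts; the same reachability test can be sketched with bash's built-in /dev/tcp (the rabbitmq-host argument is a placeholder):

```shell
# Return 0 if the AMQP port accepts a TCP connection, non-zero otherwise.
amqp_reachable() {
    local host=$1 port=${2:-5672}
    timeout 3 bash -c "cat < /dev/null > /dev/tcp/${host}/${port}" 2>/dev/null
}

# Example (placeholder):
#   amqp_reachable rabbitmq-host && echo "RabbitMQ port open"
```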
Keystone endpoint registered but service not discoverable from CLI
Symptom: openstack endpoint list --service protector returns no rows, or the CLI reports EndpointNotFound.
Likely cause: The service type used during openstack service create does not match the type the client looks up, or the endpoint was created against the wrong region.
Fix:
- List all services and confirm the entry: openstack service list | grep protector
- Confirm the endpoint region matches your clouds.yaml region_name: openstack endpoint list --service protector
- If the service or endpoint is missing, re-run the Keystone registration steps from the installation section.
Metadata sync blocked: "Cannot modify protection group ā remote site unreachable"
Symptom: Any attempt to modify a Protection Group (add a member, update a policy) fails with a message indicating the peer site is unreachable.
Likely cause: The protector-api on the peer site is down, or network connectivity between the sites on port 8788 is blocked.
Fix:
- Confirm the peer site's protector-api is running: systemctl status protector-api on the remote controller.
- Test connectivity from the local site to the remote API: curl http://<remote-controller>:8788/
- Check that firewall rules on both sites allow TCP port 8788 between site management networks.
- Once connectivity is restored, check the sync status and push any pending changes: openstack protector protection-group sync-status <pg-name> followed by openstack protector protection-group sync-force <pg-name>.
This behavior is by design: modifications are blocked when the peer is unreachable to prevent metadata divergence between sites.
Cinder volume_manage call fails with 403 Forbidden during failover
Symptom: A failover DR operation fails at the "Manage volumes into Cinder" step with a 403 error in the engine log.
Likely cause: The Cinder policy changes from the prerequisites have not been applied on the target site, or Cinder was not restarted after the policy file was updated.
Fix:
- On the target site, verify /etc/cinder/policy.yaml contains the three required rules (volume_manage, volume_unmanage, services:index).
- Restart the Cinder API service: systemctl restart openstack-cinder-api
- For Kolla-Ansible deployments, re-run: kolla-ansible -i inventory reconfigure -t cinder
- Retry the failover operation.